Implementing the Deutsch-Jozsa algorithm with macroscopic ensembles
NASA Astrophysics Data System (ADS)
Semenenko, Henry; Byrnes, Tim
2016-05-01
Quantum computing implementations under consideration today typically deal with systems with microscopic degrees of freedom such as photons, ions, cold atoms, and superconducting circuits. The quantum information is stored typically in low-dimensional Hilbert spaces such as qubits, as quantum effects are strongest in such systems. It has, however, been demonstrated that quantum effects can be observed in mesoscopic and macroscopic systems, such as nanomechanical systems and gas ensembles. While few-qubit quantum information demonstrations have been performed with such macroscopic systems, a quantum algorithm showing exponential speedup over classical algorithms has yet to be demonstrated. Here, we show that the Deutsch-Jozsa algorithm can be implemented with macroscopic ensembles. The encoding that we use avoids the detrimental effects of decoherence that normally plague macroscopic implementations. We discuss two mapping procedures which can be chosen depending upon the constraints of the oracle and the experiment. Both methods have an exponential speedup over the classical case, and only require control of the ensembles at the level of the total spin of the ensembles. It is shown that both approaches reproduce the qubit Deutsch-Jozsa algorithm, and are robust under decoherence.
Graphene-based room-temperature implementation of a modified Deutsch-Jozsa quantum algorithm.
Dragoman, Daniela; Dragoman, Mircea
2015-12-04
We present an implementation of a one-qubit and two-qubit modified Deutsch-Jozsa quantum algorithm based on graphene ballistic devices working at room temperature. The modified Deutsch-Jozsa algorithm decides whether a function, equivalent to the effect of an energy potential distribution on the wave function of ballistic charge carriers, is constant or not, without measuring the output wave function. The function need not be Boolean. Simulations confirm that the algorithm works properly, opening the way toward quantum computing at room temperature based on the same clean-room technologies as those used for fabrication of very-large-scale integrated circuits.
Quantum computation with classical light: Implementation of the Deutsch-Jozsa algorithm
NASA Astrophysics Data System (ADS)
Perez-Garcia, Benjamin; McLaren, Melanie; Goyal, Sandeep K.; Hernandez-Aranda, Raul I.; Forbes, Andrew; Konrad, Thomas
2016-05-01
We propose an optical implementation of the Deutsch-Jozsa Algorithm using classical light in a binary decision-tree scheme. Our approach uses a ring cavity and linear optical devices in order to efficiently query the oracle functional values. In addition, we take advantage of the intrinsic Fourier transforming properties of a lens to read out whether the function given by the oracle is balanced or constant.
Scheme for implementing the Deutsch-Jozsa algorithm in cavity QED
Zheng Shibiao
2004-09-01
We propose a scheme for realizing the Deutsch-Jozsa algorithm in cavity QED. The scheme is based on the resonant interaction of atoms with a cavity mode. The required experimental techniques are within the scope of what can be obtained in the microwave cavity QED setup. The experimental implementation of the scheme would be an important step toward more complex quantum computation in cavity QED.
Implementing Deutsch-Jozsa algorithm using light shifts and atomic ensembles
Dasgupta, Shubhrangshu; Biswas, Asoka; Agarwal, G.S.
2005-01-01
We present an optical scheme to implement the Deutsch-Jozsa algorithm using ac Stark shifts. The scheme uses an atomic ensemble consisting of four-level atoms interacting dispersively with a field. This leads to a Hamiltonian in the atom-field basis which is quite suitable for quantum computation. We show how one can implement the algorithm by performing proper one- and two-qubit operations. We emphasize that in our model the decoherence is expected to be minimal due to our use of atomic ground states and a freely propagating photon.
NASA Astrophysics Data System (ADS)
Kraus, Christina V.; Zoller, P.; Baranov, Mikhail A.
2013-11-01
We propose an efficient protocol for braiding Majorana fermions realized as edge states in atomic wire networks, and demonstrate its robustness against experimentally relevant errors. The braiding of two Majorana fermions located on one side of two adjacent wires requires only a few local operations on this side which can be implemented using local site addressing available in current experiments with cold atoms and molecules. Based on this protocol we provide an experimentally feasible implementation of the Deutsch-Jozsa algorithm for two qubits in a topologically protected way.
Kessel, Alexander R.; Yakovleva, Natalia M.
2002-12-01
Schemes of experimental realization of the main two-qubit processors for quantum computers and of the Deutsch-Jozsa algorithm are derived in virtual spin representation. The results are applicable to any four quantum states that possess the properties required for quantum-processor implementation, provided the virtual spin representation is used for qubit encoding. A four-dimensional Hilbert space of nuclear spin 3/2 is considered in detail for this purpose.
NASA Astrophysics Data System (ADS)
Wei, Hai-Rui; Liu, Ji-Zhen
2017-02-01
It is very important to seek efficient and robust quantum algorithms demanding fewer quantum resources. We propose one-photon three-qubit original and refined Deutsch-Jozsa algorithms with polarization and two linear-momentum degrees of freedom (DOFs). Our schemes are constructed solely from linear optics. Compared to the traditional ones with one DOF, our schemes are more economical and robust because the number of necessary photons is reduced from three to one. Our linear-optic schemes operate deterministically, and they are feasible with current experimental technology.
Unifying parameter estimation and the Deutsch-Jozsa algorithm for continuous variables
Zwierz, Marcin; Perez-Delgado, Carlos A.; Kok, Pieter
2010-10-15
We reveal a close relationship between quantum metrology and the Deutsch-Jozsa algorithm on continuous-variable quantum systems. We develop a general procedure, characterized by two parameters, that unifies parameter estimation and the Deutsch-Jozsa algorithm. Depending on which parameter we keep constant, the procedure implements either the parameter-estimation protocol or the Deutsch-Jozsa algorithm. The parameter-estimation part of the procedure attains the Heisenberg limit and is therefore optimal. Due to the use of approximate normalizable continuous-variable eigenstates, the Deutsch-Jozsa algorithm is probabilistic. The procedure estimates a value of an unknown parameter and solves the Deutsch-Jozsa problem without the use of any entanglement.
Collins, David
2010-05-15
A general framework for regarding oracle-assisted quantum algorithms as tools for discriminating among unitary transformations is described. This framework is applied to the Deutsch-Jozsa problem and all possible quantum algorithms which solve the problem with certainty using oracle unitaries in a particular form are derived. It is also used to show that any quantum algorithm that solves the Deutsch-Jozsa problem starting with a quantum system in a particular class of initial, thermal equilibrium-based states of the type encountered in solution-state NMR can only succeed with greater probability than a classical algorithm when the problem size n exceeds approximately 10^5.
Vala, Jiri; Kosloff, Ronnie; Amitay, Zohar; Zhang Bo; Leone, Stephen R.
2002-12-01
The Deutsch-Jozsa algorithm is experimentally demonstrated for three-qubit functions using pure coherent superpositions of Li2 rovibrational eigenstates. The function's character, either constant or balanced, is evaluated by first imprinting the function, using a phase-shaped femtosecond pulse, on a coherent superposition of the molecular states, and then projecting the superposition onto an ionic final state, using a second femtosecond pulse at a specific time delay.
Quantum Cryptography Based on the Deutsch-Jozsa Algorithm
NASA Astrophysics Data System (ADS)
Nagata, Koji; Nakamura, Tadao; Farouk, Ahmed
2017-06-01
Recently, secure quantum key distribution based on Deutsch's algorithm using the Bell state was reported (Nagata and Nakamura, Int. J. Theor. Phys. doi: 10.1007/s10773-017-3352-4, 2017). Our aim is to extend the result to a multipartite system. In this paper, we propose a highly speedy key distribution protocol. We present secure quantum key distribution based on a special Deutsch-Jozsa algorithm using Greenberger-Horne-Zeilinger states. Bob has promised to use a function f which is one of two kinds: either the value of f(x) is constant for all values of x, or else the value of f(x) is balanced, that is, equal to 1 for exactly half of the possible x and 0 for the other half. Here, we introduce an additional condition on the function when it is balanced. Our quantum key distribution outperforms its classical counterpart by a factor O(2^N).
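The constant/balanced promise invoked above has a compact state-vector picture. As an illustrative sketch (our own, not the protocol of the paper; the function name is ours), the n-qubit Deutsch-Jozsa circuit can be simulated classically with NumPy:

```python
import numpy as np

def deutsch_jozsa_constant(f, n):
    """State-vector simulation of the n-qubit Deutsch-Jozsa circuit.

    f maps integers in [0, 2**n) to {0, 1} and is promised to be
    either constant or balanced; returns True iff f is constant.
    """
    N = 2 ** n
    # Hadamards on |0...0>: the uniform superposition over all inputs.
    state = np.full(N, 1.0 / np.sqrt(N))
    # Phase oracle: |x> -> (-1)^f(x) |x>.
    state = state * np.array([(-1) ** f(x) for x in range(N)])
    # After the final Hadamard layer, the amplitude of |0...0> is the
    # mean of the oracle phases: magnitude 1 for constant f, 0 for balanced f.
    amp0 = state.sum() / np.sqrt(N)
    return bool(np.isclose(abs(amp0) ** 2, 1.0))
```

A single oracle query decides the promise with certainty, versus 2^(n-1)+1 queries for a deterministic classical algorithm.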
Experimental demonstration of the Deutsch-Jozsa algorithm in homonuclear multispin systems
Wu Zhen; Luo Jun; Feng Mang; Li Jun; Zheng Wenqiang; Peng Xinhua
2011-10-15
Despite early experimental tests of the Deutsch-Jozsa (DJ) algorithm, very few nontrivial balanced functions have been tested for register number n>3. In this paper, we experimentally demonstrate the DJ algorithm in four- and five-qubit homonuclear spin systems using the nuclear-magnetic-resonance technique, encoding each function evaluation into a single long shaped pulse designed with the gradient ascent algorithm. Our work, which dramatically reduces the accumulated errors due to gate imperfections and relaxation, demonstrates an improved implementation of the DJ algorithm.
Experimental realization of the Deutsch-Jozsa algorithm with a six-qubit cluster state
Vallone, Giuseppe; Donati, Gaia; Bruno, Natalia; Chiuri, Andrea; Mataloni, Paolo
2010-05-15
We describe an experimental realization of the Deutsch-Jozsa quantum algorithm to evaluate the properties of a two-bit Boolean function in the framework of one-way quantum computation. For this purpose, a two-photon six-qubit cluster state was engineered. Its peculiar topological structure is the basis of the original measurement pattern allowing the algorithm realization. The good agreement of the experimental results with the theoretical predictions, obtained at an approximately 1 kHz success rate, demonstrates the correct implementation of the algorithm.
Tame, M. S.; Kim, M. S.
2010-09-15
We show that fundamental versions of the Deutsch-Jozsa and Bernstein-Vazirani quantum algorithms can be performed using a small entangled cluster state resource of only six qubits. We then investigate the minimal resource states needed to demonstrate general n-qubit versions and a scalable method to produce them. For this purpose, we propose a versatile photonic on-chip setup.
NMR tomography of the three-qubit Deutsch-Jozsa algorithm
NASA Astrophysics Data System (ADS)
Mangold, Oliver; Heidebrecht, Andreas; Mehring, Michael
2004-10-01
The optimized version of the Deutsch-Jozsa algorithm proposed by Collins was implemented using the three 19F nuclear spins of 2,3,4-trifluoroaniline as qubits. To emulate the behavior of pure quantum-mechanical states, pseudopure states of the ensemble were prepared prior to execution of the algorithm. Full tomography of the density matrix was employed to obtain detailed information about initial, intermediate, and final states. The information thus obtained was used to optimize the pulse sequences. It is shown that substantial improvement of the fidelity of the preparation may be achieved by compensating for the effects caused by the different relaxation behavior of the different substates of the density matrix. All manipulations of the quantum states were performed under conditions of unresolved spin-spin interactions.
Efficient classical simulation of the Deutsch-Jozsa and Simon's algorithms
NASA Astrophysics Data System (ADS)
Johansson, Niklas; Larsson, Jan-Åke
2017-09-01
A long-standing aim of quantum information research is to understand what gives quantum computers their advantage. This requires separating problems that need genuinely quantum resources from those for which classical resources are enough. Two examples of quantum speed-up are the Deutsch-Jozsa and Simon's problems, both efficiently solvable on a quantum Turing machine, and both believed to lack efficient classical solutions. Here we present a framework that can simulate both quantum algorithms efficiently, solving the Deutsch-Jozsa problem with probability 1 using only one oracle query, and Simon's problem using linearly many oracle queries, just as expected of an ideal quantum computer. The presented simulation framework is in turn efficiently simulatable in a classical probabilistic Turing machine. This shows that the Deutsch-Jozsa and Simon's problems do not require any genuinely quantum resources, and that the quantum algorithms show no speed-up when compared with their corresponding classical simulation. Finally, this gives insight into what properties are needed in the two algorithms and calls for further study of oracle separation between quantum and classical computation.
NASA Astrophysics Data System (ADS)
Bera, Debajyoti
2015-06-01
One of the early achievements of quantum computing was demonstrated by Deutsch and Jozsa (Proc R Soc Lond A Math Phys Sci 439(1907):553, 1992) regarding classification of a particular type of Boolean functions. Their solution demonstrated an exponential speedup compared to classical approaches to the same problem; however, theirs has remained the only known quantum algorithm for that specific problem. This paper demonstrates another quantum algorithm for the same problem, with the same exponential advantage compared to classical algorithms. The novelty of this algorithm is the use of quantum amplitude amplification, a technique that is the key component of another celebrated quantum algorithm developed by Grover (Proceedings of the twenty-eighth annual ACM symposium on theory of computing, ACM Press, New York, 1996). A lower bound for randomized (classical) algorithms is also presented which establishes a sound gap between the effectiveness of our quantum algorithm and that of any randomized algorithm with similar efficiency.
Multipartite entanglement in quantum algorithms
Bruss, D.; Macchiavello, C.
2011-05-15
We investigate the entanglement features of the quantum states employed in quantum algorithms. In particular, we analyze the multipartite entanglement properties in the Deutsch-Jozsa, Grover, and Simon algorithms. Our results show that for these algorithms most instances involve multipartite entanglement.
The Deutsch-Jozsa algorithm as a suitable framework for MapReduce in a quantum computer
NASA Astrophysics Data System (ADS)
Lipovaca, Samir
The essence of the MapReduce paradigm is a parallel, distributed algorithm running across hundreds or thousands of machines. In a crude fashion, this parallelism is reminiscent of computation by quantum parallelism, which is possible only with quantum computers. Deutsch and Jozsa showed that there is a class of problems which can be solved more efficiently by a quantum computer than by any classical or stochastic method. The method of computation by quantum parallelism solves the problem with certainty in exponentially less time than any classical computation. This leads to the question of whether it would be possible to implement the MapReduce paradigm on a quantum computer and harness this incredible speedup over the classical computation performed by current computers. Although present quantum computers are not robust enough for code writing and execution, the question is worth exploring from a theoretical point of view. We show from a theoretical point of view that the Deutsch-Jozsa algorithm is a suitable framework for implementing the MapReduce paradigm on a quantum computer.
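For reference, the classical dataflow the paragraph refers to can be sketched on a single machine (a minimal illustration, not the distributed runtime; all names are ours):

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Minimal sequential sketch of the MapReduce dataflow:
    map -> shuffle (group values by key) -> reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}

# Word count, the canonical MapReduce example.
counts = map_reduce(
    ["to be or not to be"],
    mapper=lambda line: [(word, 1) for word in line.split()],
    reducer=lambda key, values: sum(values),
)
```

In a real deployment the map and reduce calls run on different machines and the shuffle moves data between them; the algorithmic skeleton is the same.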
Demonstration of two-qubit algorithms with a superconducting quantum processor.
DiCarlo, L; Chow, J M; Gambetta, J M; Bishop, Lev S; Johnson, B R; Schuster, D I; Majer, J; Blais, A; Frunzio, L; Girvin, S M; Schoelkopf, R J
2009-07-09
Quantum computers, which harness the superposition and entanglement of physical states, could outperform their classical counterparts in solving problems with technological impact, such as factoring large numbers and searching databases. A quantum processor executes algorithms by applying a programmable sequence of gates to an initialized register of qubits, which coherently evolves into a final state containing the result of the computation. Building a quantum processor is challenging because of the need to meet simultaneously requirements that are in conflict: state preparation, long coherence times, universal gate operations and qubit readout. Processors based on a few qubits have been demonstrated using nuclear magnetic resonance, cold ion trap and optical systems, but a solid-state realization has remained an outstanding challenge. Here we demonstrate a two-qubit superconducting processor and the implementation of the Grover search and Deutsch-Jozsa quantum algorithms. We use a two-qubit interaction, tunable in strength by two orders of magnitude on nanosecond timescales, which is mediated by a cavity bus in a circuit quantum electrodynamics architecture. This interaction allows the generation of highly entangled states with concurrence up to 94 per cent. Although this processor constitutes an important step in quantum computing with integrated circuits, continuing efforts to increase qubit coherence times, gate performance and register size will be required to fulfil the promise of a scalable technology.
A strategy for quantum algorithm design assisted by machine learning
NASA Astrophysics Data System (ADS)
Bang, Jeongho; Ryu, Junghee; Yoo, Seokwon; Pawłowski, Marcin; Lee, Jinhyoung
2014-07-01
We propose a method for quantum algorithm design assisted by machine learning. The method uses a quantum-classical hybrid simulator, where a ‘quantum student’ is being taught by a ‘classical teacher’. In other words, in our method, the learning system is supposed to evolve into a quantum algorithm for a given problem, assisted by a classical main-feedback system. Our method is applicable for designing quantum oracle-based algorithms. We chose, as a case study, an oracle decision problem, called a Deutsch-Jozsa problem. We showed by using Monte Carlo simulations that our simulator can faithfully learn a quantum algorithm for solving the problem for a given oracle. Remarkably, the learning time is proportional to the square root of the total number of parameters, rather than showing the exponential dependence found in the classical machine learning-based method.
Implementation of Parallel Algorithms
1993-06-30
their social relations or to achieve some goals. For example, we define a pair-wise force law of repulsion and attraction for a group of identical...quantization-based compression schemes. Photo-refractive crystals, which provide high-density recording in real time, are used as our holographic media. The...of Parallel Algorithms (J. Reif, ed.). Kluwer Academic Publishers, 1993. (4) "A Dynamic Separator Algorithm", D. Armon and J. Reif. To appear in
Interfacing external quantum devices to a universal quantum computer.
Lagana, Antonio A; Lohe, Max A; von Smekal, Lorenz
2011-01-01
We present a scheme for using external quantum devices with the previously constructed universal quantum computer. We thereby show how the universal quantum computer can utilize networked quantum information resources to carry out local computations. Such information may come from specialized quantum devices or even from remote universal quantum computers. We show how to accomplish this by devising universal quantum computer programs that implement well-known oracle-based quantum algorithms, namely the Deutsch, Deutsch-Jozsa, and Grover algorithms, using external black-box quantum oracle devices. In the process, we demonstrate a method to map existing quantum algorithms onto the universal quantum computer.
Linear Bregman algorithm implemented in parallel GPU
NASA Astrophysics Data System (ADS)
Li, Pengyan; Ke, Jue; Sui, Dong; Wei, Ping
2015-08-01
At present, most compressed sensing (CS) algorithms converge slowly and are thus difficult to run on a PC. To address this issue, we use a parallel GPU to implement a widely used compressed sensing algorithm, the linear Bregman algorithm. The linear iterative Bregman algorithm is a reconstruction algorithm proposed by Osher and Cai. Compared with other CS reconstruction algorithms, the linear Bregman algorithm involves only vector-matrix multiplication and a thresholding operation, and is simpler and more efficient to program. We use C as the development language and adopt CUDA (Compute Unified Device Architecture) as the parallel computing architecture. In this paper, we compare the parallel Bregman algorithm with a traditional CPU implementation, and also with other CS reconstruction algorithms such as OMP and TwIST. Compared with these two algorithms, the results show that the parallel Bregman algorithm needs less time, and is thus more suitable for real-time object reconstruction, which is important given the fast-growing demands of information technology.
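The per-iteration structure mentioned above (one pair of matrix-vector products plus a soft-threshold) is what makes the method GPU-friendly. A serial NumPy sketch of the linearized Bregman iteration for basis pursuit (our own illustration; parameter choices are ours, and we assume a step size with delta*||A||^2 < 2):

```python
import numpy as np

def shrink(v, mu):
    """Soft-thresholding, the only nonlinear step in linearized Bregman."""
    return np.sign(v) * np.maximum(np.abs(v) - mu, 0.0)

def linearized_bregman(A, b, mu=1.0, delta=0.1, iters=2000):
    """Linearized Bregman iteration for min ||x||_1 subject to Ax = b.

    Each step costs one A @ x, one A.T @ r, and one shrink -- exactly the
    kernels a GPU implementation would parallelize.
    """
    v = np.zeros(A.shape[1])
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        v = v + A.T @ (b - A @ x)   # dual (gradient-style) update
        x = delta * shrink(v, mu)   # primal update via soft-thresholding
    return x
```

On a tiny underdetermined system such as A = [[2, 1]], b = [2], the iterate converges to a feasible point with small residual ||Ax - b||.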
Deterministic quantum computation with one photonic qubit
NASA Astrophysics Data System (ADS)
Hor-Meyll, M.; Tasca, D. S.; Walborn, S. P.; Ribeiro, P. H. Souto; Santos, M. M.; Duzzioni, E. I.
2015-07-01
We show that deterministic quantum computing with one qubit (DQC1) can be experimentally implemented with a spatial light modulator, using the polarization and the transverse spatial degrees of freedom of light. The scheme allows the computation of the trace of a high-dimensional matrix, limited only by the resolution of the modulator panel and technical imperfections. In order to illustrate the method, we compute the normalized trace of unitary matrices and implement the Deutsch-Jozsa algorithm. The largest matrix that can be manipulated with our setup is 1080 × 1920, which can represent a system with approximately 21 qubits.
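The normalized-trace computation at the heart of DQC1 can be checked against a direct density-matrix simulation. A small sketch (our own, not the optical setup of the paper): a |+> ancilla controls U on a maximally mixed d-dimensional register, and the ancilla's off-diagonal element afterwards encodes tr(U)/d:

```python
import numpy as np

def dqc1_trace(U):
    """Simulate the DQC1 circuit and return the normalized trace tr(U)/d."""
    d = U.shape[0]
    # |+><+| ancilla tensored with the maximally mixed register.
    plus = np.array([[0.5, 0.5], [0.5, 0.5]])
    rho = np.kron(plus, np.eye(d) / d)
    # Controlled-U with the ancilla as control (block-diagonal form).
    cu = np.block([[np.eye(d), np.zeros((d, d))],
                   [np.zeros((d, d)), U]])
    rho = cu @ rho @ cu.conj().T
    # Partial trace over the register leaves the 2x2 ancilla state;
    # its lower-left element equals tr(U) / (2d).
    anc10 = np.trace(rho[d:, :d])
    return 2 * anc10
```

Measuring the ancilla in the X and Y bases recovers the real and imaginary parts of this quantity, which is how the experiment reads it out.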
Java implementation of Class Association Rule algorithms
Tamura, Makio
2007-08-30
Java implementation of three Class Association Rule mining algorithms: NETCAR, CARapriori, and clustering-based rule mining. NETCAR is a novel algorithm developed by Makio Tamura. The algorithm is discussed in a paper, UCRL-JRNL-232466-DRAFT, and is to be published in a peer-reviewed scientific journal. The software is used to extract combinations of genes relevant to a phenotype from a phylogenetic profile and a phenotype profile. The phylogenetic profile is represented by a binary matrix, and the phenotype profile by a binary vector. The present application of this software is in genome analysis; however, it could be applied more generally.
A Fast Implementation of the ISOCLUS Algorithm
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.; Netanyahu, Nathan S.; LeMoigne, Jacqueline
2003-01-01
Unsupervised clustering is a fundamental tool in numerous image processing and remote sensing applications. For example, unsupervised clustering is often used to obtain vegetation maps of an area of interest. This approach is useful when reliable training data are either scarce or expensive, and when relatively little a priori information about the data is available. Unsupervised clustering methods play a significant role in the pursuit of unsupervised classification. One of the most popular and widely used clustering schemes for remote sensing applications is the ISOCLUS algorithm, which is based on the ISODATA method. The algorithm is given a set of n data points (or samples) in d-dimensional space, an integer k indicating the initial number of clusters, and a number of additional parameters. The general goal is to compute a set of cluster centers in d-space. Although there is no specific optimization criterion, the algorithm is similar in spirit to the well-known k-means clustering method, in which the objective is to minimize the average squared distance of each point to its nearest center, called the average distortion. One significant feature of ISOCLUS over k-means is that clusters may be merged or split, and so the final number of clusters may be different from the number k supplied as part of the input. This algorithm is described later in this paper. The ISOCLUS algorithm can run very slowly, particularly on large data sets. Given its wide use in remote sensing, its efficient computation is an important goal. We have developed a fast implementation of the ISOCLUS algorithm. Our improvement is based on a recent acceleration to the k-means algorithm, the filtering algorithm, by Kanungo et al. They showed that, by storing the data in a kd-tree, it was possible to significantly reduce the running time of k-means. We have adapted this method for the ISOCLUS algorithm. For technical reasons, which are explained later, it is necessary to make a minor
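ISOCLUS adds merge/split heuristics on top of the same inner loop as k-means. As a minimal sketch of that inner loop (our own Lloyd-style illustration, without the kd-tree filtering acceleration or the merge/split logic), with naive seeding from the first k points:

```python
import numpy as np

def kmeans(points, k, iters=20):
    """Plain Lloyd k-means. The nearest-center assignment below is the
    O(nk) per-iteration step that the kd-tree filtering algorithm of
    Kanungo et al. accelerates; ISOCLUS wraps it with merge/split rules."""
    centers = points[:k].astype(float).copy()   # naive seeding
    for _ in range(iters):
        # Assign every point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center as the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return centers, labels
```

On two well-separated clusters this converges in a couple of iterations; the average distortion mentioned above is the quantity these updates monotonically decrease.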
Implementation details of the coupled QMR algorithm
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Nachtigal, Noel M.
1992-01-01
The original quasi-minimal residual method (QMR) relies on the three-term look-ahead Lanczos process to generate basis vectors for the underlying Krylov subspaces. However, empirical observations indicate that, in finite precision arithmetic, three-term vector recurrences are less robust than mathematically equivalent coupled two-term recurrences. Therefore, we recently proposed a new implementation of the QMR method based on a coupled two-term look-ahead Lanczos procedure. In this paper, we describe implementation details of this coupled QMR algorithm, and we present results of numerical experiments.
Terascale spectral element algorithms and implementations.
Fischer, P. F.; Tufo, H. M.
1999-08-17
We describe the development and implementation of an efficient spectral element code for multimillion gridpoint simulations of incompressible flows in general two- and three-dimensional domains. We review basic and recently developed algorithmic underpinnings that have resulted in good parallel and vector performance on a broad range of architectures, including the terascale computing systems now coming online at the DOE labs. Sustained performance of 219 GFLOPS has been recently achieved on 2048 nodes of the Intel ASCI-Red machine at Sandia.
A Fast Implementation of the ISOCLUS Algorithm
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.; Netanyahu, Nathan S.; LeMoigne, Jacqueline
2003-01-01
Unsupervised clustering is a fundamental building block in numerous image processing applications. One of the most popular and widely used clustering schemes for remote sensing applications is the ISOCLUS algorithm, which is based on the ISODATA method. The algorithm is given a set of n data points in d-dimensional space, an integer k indicating the initial number of clusters, and a number of additional parameters. The general goal is to compute the coordinates of a set of cluster centers in d-space, such that those centers minimize the mean squared distance from each data point to its nearest center. This clustering algorithm is similar to another well-known clustering method, called k-means. One significant feature of ISOCLUS over k-means is that the actual number of clusters reported might be fewer or more than the number supplied as part of the input. The algorithm uses different heuristics to determine whether to merge or split clusters. As ISOCLUS can run very slowly, particularly on large data sets, there has been a growing interest in the remote sensing community in computing it efficiently. We have developed a faster implementation of the ISOCLUS algorithm. Our improvement is based on a recent acceleration to the k-means algorithm of Kanungo et al. They showed that, by using a kd-tree data structure for storing the data, it is possible to reduce the running time of k-means. We have adapted this method for the ISOCLUS algorithm, and we show that it is possible to achieve essentially the same results as ISOCLUS on large data sets, but with significantly lower running times. This adaptation involves computing a number of cluster statistics that are needed for ISOCLUS but not for k-means. Both the k-means and ISOCLUS algorithms are based on iterative schemes, in which nearest neighbors are calculated until some convergence criterion is satisfied. Each iteration requires that the nearest center for each data point be computed. Naively, this requires O
Categorizing Variations of Student-Implemented Sorting Algorithms
ERIC Educational Resources Information Center
Taherkhani, Ahmad; Korhonen, Ari; Malmi, Lauri
2012-01-01
In this study, we examined freshmen students' sorting algorithm implementations in data structures and algorithms' course in two phases: at the beginning of the course before the students received any instruction on sorting algorithms, and after taking a lecture on sorting algorithms. The analysis revealed that many students have insufficient…
Machine Learning Algorithms Implemented in Image Analysis
Chen, J.; Renner, L.; Neuringer, M.; Cornea, A.
2014-01-01
A typical core facility is faced with a wide variety of experimental paradigms, samples, and images to be analyzed. They typically have one thing in common: a need to segment features of interest from the rest of the image. In many cases, for example fluorescence images with good contrast and signal-to-noise ratio, intensity segmentation may be successful. Often, however, images may not be acquired under optimal conditions, or features of interest are not distinguished by intensity alone. Examples we encountered are retina fundus photographs, histological stains, DAB immunohistochemistry, etc. We used machine learning algorithms as implemented in FIJI to isolate specific features in longitudinal retinal photographs of non-human primates. Images acquired over several years with different technologies, cameras, and skill levels were analyzed to evaluate small changes with precision. The protocol used includes Scale-Invariant Feature Transform (SIFT) registration, Contrast Limited Adaptive Histogram Equalization (CLAHE), and Weka training. Variance of results for different images of the same time point and for different raters of the same images was less than 10% in most cases.
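When intensity alone does separate the features, a standard baseline is Otsu's threshold. A hedged NumPy sketch of that baseline (our own illustration; the workflow above relies on FIJI/Weka rather than this code):

```python
import numpy as np

def otsu_threshold(image, bins=256):
    """Otsu's method: choose the intensity threshold that maximizes the
    between-class variance of the two resulting pixel classes."""
    hist, edges = np.histogram(image, bins=bins)
    p = hist / hist.sum()
    w0 = np.cumsum(p)                         # class-0 probability
    mu = np.cumsum(p * np.arange(bins))       # cumulative mean (bin index)
    mu_total = mu[-1]
    # Between-class variance for every candidate split; 0/0 at the ends
    # becomes NaN and is skipped by nanargmax.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * w0 - mu) ** 2 / (w0 * (1 - w0))
    t = np.nanargmax(sigma_b)
    return edges[t + 1]
```

For a clearly bimodal intensity distribution the returned threshold falls between the two modes, which is exactly the "good contrast" regime the abstract describes.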
Implementation of a Wavefront-Sensing Algorithm
NASA Technical Reports Server (NTRS)
Smith, Jeffrey S.; Dean, Bruce; Aronstein, David
2013-01-01
A computer program has been written as a unique implementation of an image-based wavefront-sensing algorithm reported in "Iterative-Transform Phase Retrieval Using Adaptive Diversity" (GSC-14879-1), NASA Tech Briefs, Vol. 31, No. 4 (April 2007), page 32. This software was originally intended for application to the James Webb Space Telescope, but is also applicable to other segmented-mirror telescopes. The software is capable of determining optical-wavefront information using, as input, a variable number of irradiance measurements collected in defocus planes about the best focal position. The software also uses input of the geometrical definition of the telescope exit pupil (otherwise denoted the pupil mask) to identify the locations of the segments of the primary telescope mirror. From the irradiance data and mask information, the software calculates an estimate of the optical wavefront (a measure of performance) of the telescope generally and across each primary mirror segment specifically. The software is capable of generating irradiance data, wavefront estimates, and basis functions for the full telescope and for each primary-mirror segment. Optionally, each of these pieces of information can be measured or computed outside of the software and incorporated during execution of the software.
Parallel optimization algorithms and their implementation in VLSI design
NASA Technical Reports Server (NTRS)
Lee, G.; Feeley, J. J.
1991-01-01
Two new parallel optimization algorithms based on the simplex method are described. They may be executed by a SIMD parallel processor architecture and be implemented in VLSI design. Several VLSI design implementations are introduced. An application example is reported to demonstrate that the algorithms are effective.
Categorizing variations of student-implemented sorting algorithms
NASA Astrophysics Data System (ADS)
Taherkhani, Ahmad; Korhonen, Ari; Malmi, Lauri
2012-06-01
In this study, we examined freshman students' sorting algorithm implementations in a data structures and algorithms course in two phases: at the beginning of the course, before the students received any instruction on sorting algorithms, and after a lecture on sorting algorithms. The analysis revealed that many students have an insufficient understanding of how to implement sorting algorithms. For example, they include unnecessary swaps in their Insertion sort or Selection sort implementations, resulting in more complicated and inefficient code. Based on the data, we present a categorization of these types of variations and discuss the implications of the results. In addition, we introduce an instrument to recognize these algorithms automatically, implemented in terms of white-box testing. Our aim is to develop an automatic assessment system that relieves teachers of some of the burden of marking students' assignments and gives students feedback on their algorithmic solutions. We outline how the presented results can be used to develop the instrument further.
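The swap-based versus shift-based variation described above can be illustrated with a short sketch (hypothetical code for illustration, not taken from the students' submissions):

```python
def insertion_sort_swaps(a):
    """Swap-based variant: repeatedly swaps adjacent elements (extra writes)."""
    a = list(a)
    for i in range(1, len(a)):
        j = i
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]  # unnecessary swap per comparison
            j -= 1
    return a

def insertion_sort_shifts(a):
    """Shift-based variant: holds the key and shifts larger elements right."""
    a = list(a)
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]  # single write per comparison
            j -= 1
        a[j + 1] = key
    return a
```

Both produce a sorted list, but the swap-based version performs roughly three times as many element writes in the inner loop, which is the kind of inefficiency the categorization captures.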
Parallel implementation of an algorithm for Delaunay triangulation
NASA Technical Reports Server (NTRS)
Merriam, Marshal L.
1992-01-01
The theory and practice of implementing Tanemura's algorithm for 3D Delaunay triangulation on Intel's Gamma prototype, a 128-processor MIMD computer, is described. Efficient implementation of Tanemura's algorithm on a conventional vector-processing supercomputer is problematic: it does not vectorize to any significant degree and requires indirect addressing. Efficient implementation on a parallel architecture is possible, however. Speeds in excess of 20 times a single-processor Cray Y-MP are realized on 128 processors of the Intel Gamma prototype.
Parallel implementation of an algorithm for Delaunay triangulation
NASA Technical Reports Server (NTRS)
Merriam, Marshall L.
1992-01-01
This work concerns the theory and practice of implementing Tanemura's algorithm for 3D Delaunay triangulation on Intel's Gamma prototype, a 128-processor MIMD computer. Tanemura's algorithm does not vectorize to any significant degree and requires indirect addressing, so efficient implementation on a conventional vector-processing supercomputer is problematic. Efficient implementation on a parallel architecture is possible, however. In this work, speeds in excess of 8 times a single-processor Cray Y-MP are realized on 128 processors of the Intel Gamma prototype.
An adaptive, lossless data compression algorithm and VLSI implementations
NASA Technical Reports Server (NTRS)
Venbrux, Jack; Zweigle, Greg; Gambles, Jody; Wiseman, Don; Miller, Warner H.; Yeh, Pen-Shu
1993-01-01
This paper first provides an overview of an adaptive, lossless data compression algorithm originally devised by Rice in the early 1970s. It then reports the development of a VLSI encoder/decoder chip set that implements this algorithm. A recent effort in making a space-qualified version of the encoder is described, along with several enhancements to the algorithm. The performance of the enhanced algorithm is compared with that of other currently available lossless compression techniques on multiple sets of test data. The results favor our implemented technique in many applications.
New algorithms for phase unwrapping: implementation and testing
NASA Astrophysics Data System (ADS)
Kotlicki, Krzysztof
1998-11-01
In this paper it is shown how regularization theory was used for a new noise-immune algorithm for phase unwrapping. The algorithms were developed by M. Servin, J.L. Marroquin and F.J. Cuevas at Centro de Investigaciones en Optica A.C. and Centro de Investigacion en Matematicas A.C. in Mexico. The theory is presented. The objective of the work was to implement the algorithms in software able to perform off-line unwrapping of the fringe pattern. The algorithms are presented, as well as the results and the software developed for the implementation.
Fixed Point Implementations of Fast Kalman Algorithms.
1983-11-01
In this paper we study scaling rules and round-off noise for fixed-point implementations of fast Kalman algorithms. The filter is realized in a fast form that uses the so-called fast Kalman gain algorithm; the algorithm for the gain is fixed point. Scaling rules and expressions for the resulting round-off noise are given. (Remainder of abstract garbled in source.)
An Agent Inspired Reconfigurable Computing Implementation of a Genetic Algorithm
NASA Technical Reports Server (NTRS)
Weir, John M.; Wells, B. Earl
2003-01-01
Many software systems have been successfully implemented using an agent paradigm which employs a number of independent entities that communicate with one another to achieve a common goal. The distributed nature of such a paradigm makes it an excellent candidate for use in high speed reconfigurable computing hardware environments such as those present in modem FPGA's. In this paper, a distributed genetic algorithm that can be applied to the agent based reconfigurable hardware model is introduced. The effectiveness of this new algorithm is evaluated by comparing the quality of the solutions found by the new algorithm with those found by traditional genetic algorithms. The performance of a reconfigurable hardware implementation of the new algorithm on an FPGA is compared to traditional single processor implementations.
Multiple Lookup Table-Based AES Encryption Algorithm Implementation
NASA Astrophysics Data System (ADS)
Gong, Jin; Liu, Wenyi; Zhang, Huixin
A new AES (Advanced Encryption Standard) encryption algorithm implementation is proposed in this paper. It is based on five lookup tables, which are generated from the S-box (the substitution table in AES). The obvious advantages are reduced code size, improved implementation efficiency, and helping new learners understand the AES encryption algorithm and the GF(2^8) multiplication that is necessary to correctly implement AES [1]. This method can be applied on processors with a word length of 32 bits or above, on FPGAs, and on other platforms, and can correspondingly be implemented in VHDL, Verilog, VB, and other languages.
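The GF(2^8) multiplication that such lookup tables encode can be sketched directly. This is the standard AES field multiplication with reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B) as defined in FIPS-197, not the paper's table generator:

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using the AES reduction polynomial 0x11B."""
    p = 0
    for _ in range(8):
        if b & 1:          # if the low bit of b is set, accumulate a
            p ^= a
        carry = a & 0x80   # does doubling a overflow the byte?
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B      # reduce modulo x^8 + x^4 + x^3 + x + 1
        b >>= 1
    return p
```

For example, gf_mul(0x57, 0x83) reproduces the worked example {57}·{83} = {c1} given in the AES specification.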
PURGATORIO—a new implementation of the INFERNO algorithm
NASA Astrophysics Data System (ADS)
Wilson, B.; Sonnad, V.; Sterne, P.; Isaacs, W.
2006-05-01
An overview of PURGATORIO, a new implementation of the INFERNO [Liberman, Phys Rev B 1979;20:4981-9] equation of state model, is presented. The new algorithm emphasizes a novel subdivision scheme for automatically resolving the structure of the continuum density of states, circumventing limitations of the pseudo-R-matrix algorithm previously utilized.
A fast portable implementation of the Secure Hash Algorithm, III.
McCurley, Kevin S.
1992-10-01
In 1992, NIST announced a proposed standard for a collision-free hash function. The algorithm for producing the hash value is known as the Secure Hash Algorithm (SHA), and the standard using the algorithm is known as the Secure Hash Standard (SHS). Later, an announcement was made that a scientist at NSA had discovered a weakness in the original algorithm. A revision to this standard was then announced as FIPS 180-1, and includes a slight change to the algorithm that eliminates the weakness. This new algorithm is called SHA-1. In this report we describe a portable and efficient implementation of SHA-1 in the C language. Performance information is given, as well as tips for porting the code to other architectures. We conclude with some observations on the efficiency of the algorithm, and a discussion of how the efficiency of SHA might be improved.
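As a reference point for such a C implementation, the same SHA-1 algorithm is available in Python's standard `hashlib`; the `"abc"` input below is the standard FIPS 180-1 test vector (note that SHA-1 is no longer considered collision-resistant and should not be used for new security designs):

```python
import hashlib

# SHA-1 digest of the standard "abc" test vector from FIPS 180-1
digest = hashlib.sha1(b"abc").hexdigest()
# -> a9993e364706816aba3e25717850c26c9cd0d89d
```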
Algorithm implementation on the Navier-Stokes computer
NASA Technical Reports Server (NTRS)
Krist, Steven E.; Zang, Thomas A.
1987-01-01
The Navier-Stokes Computer is a multi-purpose parallel-processing supercomputer which is currently under development at Princeton University. It consists of multiple local memory parallel processors, called Nodes, which are interconnected in a hypercube network. Details of the procedures involved in implementing an algorithm on the Navier-Stokes computer are presented. The particular finite difference algorithm considered in this analysis was developed for simulation of laminar-turbulent transition in wall bounded shear flows. Projected timing results for implementing this algorithm indicate that operation rates in excess of 42 GFLOPS are feasible on a 128 Node machine.
Implementing a self-structuring data learning algorithm
NASA Astrophysics Data System (ADS)
Graham, James; Carson, Daniel; Ternovskiy, Igor
2016-05-01
In this paper, we elaborate on what we did to implement our self-structuring data learning algorithm. To recap, we are working to develop a data learning algorithm that will eventually be capable of goal driven pattern learning and extrapolation of more complex patterns from less complex ones. At this point we have developed a conceptual framework for the algorithm, but have yet to discuss our actual implementation and the consideration and shortcuts we needed to take to create said implementation. We will elaborate on our initial setup of the algorithm and the scenarios we used to test our early stage algorithm. While we want this to be a general algorithm, it is necessary to start with a simple scenario or two to provide a viable development and testing environment. To that end, our discussion will be geared toward what we include in our initial implementation and why, as well as what concerns we may have. In the future, we expect to be able to apply our algorithm to a more general approach, but to do so within a reasonable time, we needed to pick a place to start.
Implementing Agglomerative Hierarchic Clustering Algorithms for Use in Document Retrieval.
ERIC Educational Resources Information Center
Voorhees, Ellen M.
1986-01-01
Describes a computerized information retrieval system that uses three agglomerative hierarchic clustering algorithms--single link, complete link, and group average link--and explains their implementations. It is noted that these implementations have been used to cluster a collection of 12,000 documents. (LRW)
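Of the three linkage criteria mentioned, single link merges the pair of clusters with the smallest minimum inter-cluster distance. A naive O(n^3) sketch follows (illustrative only; the paper's implementations are engineered for collections of thousands of documents):

```python
def single_link(points, dist, k):
    """Naive single-link agglomerative clustering down to k clusters."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single link: distance between closest members
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))  # merge the closest pair
    return clusters
```

Complete link would replace `min` with `max`, and group average link with the mean pairwise distance.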
Implementation and performance evaluation of acoustic denoising algorithms for UAV
NASA Astrophysics Data System (ADS)
Chowdhury, Ahmed Sony Kamal
Unmanned Aerial Vehicles (UAVs) have become a popular alternative for wildlife monitoring and border surveillance applications. Eliminating the UAV's background noise and effectively classifying the target audio signal are still major challenges. The main goal of this thesis is to remove the UAV's background noise by means of acoustic denoising techniques. Existing denoising algorithms, such as Adaptive Least Mean Square (LMS), Wavelet Denoising, Time-Frequency Block Thresholding, and Wiener Filter, were implemented and their performance evaluated. The denoising algorithms were evaluated using average Signal to Noise Ratio (SNR), Segmental SNR (SSNR), Log Likelihood Ratio (LLR), and Log Spectral Distance (LSD) metrics. To evaluate the effectiveness of the denoising algorithms on classification of target audio, we implemented Support Vector Machine (SVM) and Naive Bayes classification algorithms. Simulation results demonstrate that the LMS and Discrete Wavelet Transform (DWT) denoising algorithms offered superior performance compared to the other algorithms. Finally, we implemented the LMS and DWT algorithms on a DSP board for hardware evaluation. Experimental results showed that the LMS algorithm's performance is robust compared to DWT across various noise types when classifying target audio signals.
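The adaptive LMS filter evaluated above follows the standard textbook update w ← w + μ·e·u; a minimal sketch (the filter length and step size are illustrative, not the thesis's settings):

```python
def lms_filter(d, x, n_taps=4, mu=0.01):
    """Adaptive LMS noise canceller: estimate desired signal d from reference x.
    Returns the error signal e, which is the denoised output in noise cancellation."""
    w = [0.0] * n_taps
    e = []
    for n in range(len(d)):
        # most recent n_taps samples of the reference, zero-padded at the start
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        y = sum(wi * ui for wi, ui in zip(w, u))   # filter output
        err = d[n] - y                             # estimation error
        w = [wi + mu * err * ui for wi, ui in zip(w, u)]  # LMS weight update
        e.append(err)
    return e
```

With a reference input correlated with the noise, the error signal converges toward the noise-free component; the step size μ trades convergence speed against steady-state misadjustment.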
Rapid algorithm prototyping and implementation for power quality measurement
NASA Astrophysics Data System (ADS)
Kołek, Krzysztof; Piątek, Krzysztof
2015-12-01
This article presents a Model-Based Design (MBD) approach to rapidly implement power quality (PQ) metering algorithms. Power supply quality is a very important aspect of modern power systems and will become even more important in future smart grids. In this case, maintaining the PQ parameters at the desired level will require efficient implementation methods for the metering algorithms. Currently, the development of new, advanced PQ metering algorithms requires new hardware with adequate computational capability and time-intensive, cost-ineffective manual implementations. An alternative, considered here, is an MBD approach. The MBD approach focuses on modelling and validation of the model by simulation, which is well supported by Computer-Aided Engineering (CAE) packages. This paper presents two algorithms utilized in modern PQ meters: a phase-locked loop based on an Enhanced Phase Locked Loop (EPLL), and flicker measurement according to the IEC 61000-4-15 standard. The algorithms were chosen because of their complexity and non-trivial development. They were first modelled in the MATLAB/Simulink package, then tested and validated in a simulation environment. The models, in the form of Simulink diagrams, were next used to automatically generate C code. The code was compiled and executed in real time on the Zynq Xilinx platform, which combines a reconfigurable Field Programmable Gate Array (FPGA) with a dual-core processor. The MBD development of PQ algorithms, automatic code generation, and compilation form a rapid algorithm prototyping and implementation path for PQ measurements. The main advantage of this approach is the ability to focus on the design, validation, and testing stages while skipping over implementation issues. The code generation process renders production-ready code that can be easily used on the target hardware. This is especially important as standards for PQ measurement are in constant development, as are the PQ issues in emerging smart grids.
Testing of hardware implementation of infrared image enhancing algorithm
NASA Astrophysics Data System (ADS)
Dulski, R.; Sosnowski, T.; Piątkowski, T.; Trzaskawka, P.; Kastek, M.; Kucharz, J.
2012-10-01
The interpretation of IR images depends on the radiative properties of the observed objects and the surrounding scenery. The skills and experience of the observer are also of great importance. A solution to improve the effectiveness of observation is an image-enhancing algorithm capable of improving image quality and thus the effectiveness of object detection. The paper presents results of testing a hardware implementation of an IR image-enhancing algorithm based on histogram processing. The main issue in hardware implementation of complex image-enhancing procedures is their high computational cost. As a result, implementations of complex algorithms using general-purpose processors and software usually do not bring satisfactory results. Because of the high efficiency requirements and the need for parallel operation, ALTERA's EP2C35F672 FPGA device was used. It provides sufficient processing speed combined with relatively low power consumption. A digital image processing and control module was designed and constructed around two main integrated circuits: an FPGA device and a microcontroller. The programmable FPGA device performs the image data processing operations that require considerable computing power. It also generates the control signals for array readout, performs NUC correction and bad pixel mapping, generates the control signals for the display module, and finally executes the complex image processing algorithms. The implemented adaptive algorithm is based on plateau histogram equalization. Tests were performed on real IR images of different types of objects registered in different spectral bands. The simulations and laboratory experiments confirmed the correct operation of the designed system in executing the sophisticated image enhancement.
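Plateau histogram equalization clips each histogram bin at a plateau level before computing the cumulative mapping, which limits the over-amplification of large uniform backgrounds typical of IR imagery. A scalar sketch (the plateau value here is illustrative, not the paper's adaptive choice):

```python
def plateau_equalize(pixels, levels=256, plateau=50):
    """Clip histogram bins at `plateau`, then remap via the cumulative distribution."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    clipped = [min(h, plateau) for h in hist]   # plateau clipping step
    total = sum(clipped)
    cdf, acc = [], 0
    for h in clipped:
        acc += h
        cdf.append(acc)
    # map each pixel through the normalized cumulative histogram
    return [round((levels - 1) * cdf[p] / total) for p in pixels]
```

With plateau set to infinity this reduces to ordinary histogram equalization; lower plateau values preserve more of the original grey-level ordering in dominant background regions.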
Bomble, Laeetitia; Pellegrini, Philippe; Ghesquiere, Pierre; Desouter-Lecomte, Michele
2010-12-15
We numerically investigate the possibilities of driving quantum algorithms with laser pulses in a register of ultracold NaCs polar molecules in a static electric field. We focus on the possibility of performing scalable logical operations by considering circuits that involve intermolecular gates (implemented on adjacent interacting molecules) to enable the transfer of information from one molecule to another during conditional laser-driven population inversions. We study the implementation of an arithmetic operation (the addition of 0 or 1 to a binary digit and a carry-in), which requires population inversions only, and the Deutsch-Jozsa algorithm, which requires control of the phases. Under typical experimental conditions, our simulations show that high-fidelity logical operations involving several qubits can be performed on a time scale of a few hundred microseconds, opening promising perspectives for the manipulation of a large number of qubits in these systems.
Design technologies for DSP algorithm implementation on heterogeneous architectures
NASA Astrophysics Data System (ADS)
McAllister, John; Yi, Ying; Woods, Roger F.; Walke, Richard L.; Reilly, Darren; Colgan, Kevin
2003-12-01
Computationally intensive digital signal processing (DSP) systems sometimes have real-time requirements beyond what programmable processor platform solutions, consisting of RISC and DSP processors, can achieve. The addition of Field Programmable Gate Array (FPGA) components to these platforms provides a configurable hardware resource where increased parallelism levels allow very high computational rates. Techniques to implement circuit architectures from a signal flow graph (SFG) algorithm expression can produce highly efficient processor implementations. Applying folding transformations produces implementations where hardware resource usage is reduced at the possible expense of throughput. In this paper a new development methodology is presented which analyses the SFG algorithm to suggest appropriate folding techniques. By characterizing different folding techniques, a template circuit architecture can be created early in the design process which does not change throughout the remainder of the implementation process. Retiming techniques applied to the algorithm SFG produce the properly timed implementation from the template. By applying this methodology, architectural exploration can be quickly and efficiently performed to generate a set of implementations (an "implementation space") to best meet the constraints of the system. When applied to a Normalised Lattice Filter design example, the results demonstrate high savings in FPGA resource usage with little reduction in real-time performance, demonstrating the implementation advantage of employing this methodology.
Outline of a fast hardware implementation of Winograd's DFT algorithm
NASA Technical Reports Server (NTRS)
Zohar, S.
1980-01-01
The main characteristic of the discrete Fourier transform (DFT) algorithm considered by Winograd (1976) is a significant reduction in the number of multiplications. Its primary disadvantage is higher structural complexity. It is therefore difficult to translate the reduced number of multiplications into faster execution of the DFT by means of a software implementation of the algorithm. For this reason, a hardware implementation is considered in the current study, based on the algorithm prescription discussed by Zohar (1979). A hardware implementation of a FORTRAN subroutine is proposed, with attention given to a pipelining scheme in which five consecutive data batches are operated on simultaneously, each batch undergoing one of five processing phases.
Highly parallel consistent labeling algorithm suitable for optoelectronic implementation.
Marsden, G C; Kiamilev, F; Esener, S; Lee, S H
1991-01-10
Constraint satisfaction problems require a search through a large set of possibilities. Consistent labeling is a method by which search spaces can be drastically reduced. We present a highly parallel consistent labeling algorithm, which achieves strong k-consistency for any value of k and which can include higher-order constraints. The algorithm uses vector outer product, matrix summation, and matrix intersection operations. These operations require local computation with global communication and, therefore, are well suited to an optoelectronic implementation.
Modification of MSDR algorithm and its implementation on graph clustering
NASA Astrophysics Data System (ADS)
Prastiwi, D.; Sugeng, K. A.; Siswantining, T.
2017-07-01
Maximum Standard Deviation Reduction (MSDR) is a graph clustering algorithm that minimizes the distance variation within a cluster. In this paper we propose a modified MSDR, replacing one technical step of MSDR that uses polynomial regression with a new and simpler step. This leads to our new algorithm, called Modified MSDR (MMSDR). We implement the new algorithm to separate the domestic flight network of an Indonesian airline into two large clusters. Further analysis allows us to discover a weak link in the network, which should be improved by adding more flights.
Study of hardware implementations of fast tracking algorithms
NASA Astrophysics Data System (ADS)
Song, Z.; De Lentdecker, G.; Dong, J.; Huang, G.; Léonard, A.; Robert, F.; Wang, D.; Yang, Y.
2017-02-01
Real-time track reconstruction at high event rates is a major challenge for future experiments in high energy physics. To perform pattern recognition and track fitting, artificial retina and Hough transform methods have been introduced in the field; these have to be implemented in FPGA firmware. In this note we report on a case study of a possible FPGA hardware implementation of the retina algorithm based on a Floating-Point core. Detailed measurements of this algorithm are presented. Retina performance and the capabilities of the FPGA are discussed, along with perspectives for further optimization and applications.
Some Computer Algorithms to Implement a Reliability Shorthand.
1982-10-01
Some Computer Algorithms to Implement a Reliability Shorthand (AD-A123 781). Naval Postgraduate School, Monterey, California. Thesis by Sadan Gursel, October 1982. Thesis Advisor: J. D. Esary. (Remainder of record garbled in source.)
Efficient implementation of the adaptive scale pixel decomposition algorithm
NASA Astrophysics Data System (ADS)
Zhang, L.; Bhatnagar, S.; Rau, U.; Zhang, M.
2016-08-01
Context. Most popular algorithms in use to remove the effects of a telescope's point spread function (PSF) in radio astronomy are variants of the CLEAN algorithm. Most of these algorithms model the sky brightness using the delta-function basis, which results in undesired artefacts when used to image extended emission. The adaptive scale pixel decomposition (Asp-Clean) algorithm models the sky brightness on a scale-sensitive basis and thus gives a significantly better imaging performance when imaging fields that contain both resolved and unresolved emission. Aims: However, the runtime cost of Asp-Clean is higher than that of scale-insensitive algorithms. In this paper, we identify the most expensive step in the original Asp-Clean algorithm and present an efficient implementation of it, which significantly reduces the computational cost while keeping the imaging performance comparable to the original algorithm. The PSF sidelobe levels of modern wide-band telescopes are significantly reduced, allowing us to make approximations to reduce the computational cost, which in turn allows for the deconvolution of larger images on reasonable timescales. Methods: As in the original algorithm, scales in the image are estimated through function fitting. Here we introduce an analytical method to model extended emission, and a modified method for estimating the initial values used for the fitting procedure, which ultimately leads to a lower computational cost. Results: The new implementation was tested with simulated EVLA data and the imaging performance compared well with the original Asp-Clean algorithm. Tests show that the current algorithm can recover features at different scales with lower computational cost.
Error-Detection Codes: Algorithms and Fast Implementation
2005-01-01
In this paper, we focus on error-detection codes, their algorithms, and fast implementation. Theorem proofs and code segments implemented in the C programming language are provided in Appendix C, available on the Computer Society Digital Library at http://computer.org/tc/archives.htm.
Implementing Linear Algebra Related Algorithms on the TI-92+ Calculator.
ERIC Educational Resources Information Center
Alexopoulos, John; Abraham, Paul
2001-01-01
Demonstrates a less utilized feature of the TI-92+: its natural and powerful programming language. Shows how to implement several linear algebra related algorithms including the Gram-Schmidt process, Least Squares Approximations, Wronskians, Cholesky Decompositions, and Generalized Linear Least Square Approximations with QR Decompositions.…
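The Gram-Schmidt process listed above is compact enough to sketch outside the calculator as well; a classical (unstabilized) version for lists of linearly independent vectors:

```python
def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors via classical Gram-Schmidt."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            # subtract the projection of w onto each existing basis vector
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = sum(wi * wi for wi in w) ** 0.5
        basis.append([wi / norm for wi in w])
    return basis
```

For numerical work on near-dependent vectors, the modified Gram-Schmidt variant or a QR decomposition is preferable; this classical form matches the textbook process the article implements.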
A novel pipeline based FPGA implementation of a genetic algorithm
NASA Astrophysics Data System (ADS)
Thirer, Nonel
2014-05-01
To solve problems for which an analytical solution is not available, more and more bio-inspired computation techniques have been applied in recent years. One efficient algorithm is the Genetic Algorithm (GA), which imitates the biological evolution process, finding the solution through the mechanism of "natural selection", where the strong have higher chances to survive. A genetic algorithm is an iterative procedure which operates on a population of individuals called "chromosomes" or "possible solutions" (usually represented by a binary code). The GA performs several processes on the population individuals to produce a new population, as in biological evolution. To provide a high-speed solution, pipelined FPGA hardware implementations are used, with an n-stage pipeline for an n-phase genetic algorithm. FPGA pipeline implementations are constrained by the different execution times of each stage and by the FPGA chip resources. To mitigate these difficulties, we propose a bio-inspired technique that modifies the crossover step by using non-identical twins: two chosen chromosomes (parents) build two new chromosomes (children), not only one as in the classical GA. We analyze the contribution of this method to reducing the execution time in asynchronous and synchronous pipelines, and also the possibility of a cheaper FPGA implementation by using smaller populations. The full hardware architecture of an FPGA implementation for our target ALTERA development card is presented and analyzed.
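The "non-identical twins" crossover, in which one crossover operation yields two complementary children rather than one, can be sketched on bit strings (a minimal software illustration, not the paper's FPGA pipeline):

```python
import random

def twin_crossover(parent_a, parent_b, rng=random):
    """Single-point crossover returning two complementary children ('twins')."""
    point = rng.randrange(1, len(parent_a))   # crossover point, never at the ends
    child1 = parent_a[:point] + parent_b[point:]
    child2 = parent_b[:point] + parent_a[point:]
    return child1, child2
```

Because the two children partition the parents' genes between them, each crossover doubles the offspring yield, which is what allows the smaller populations discussed in the paper.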
Efficient Implementation of Nested-Loop Multimedia Algorithms
NASA Astrophysics Data System (ADS)
Kittitornkun, Surin; Hu, Yu Hen
2001-12-01
A novel dependence graph representation called the multiple-order dependence graph is proposed for nested-loop formulated multimedia signal processing algorithms. It allows a concise representation of an entire family of dependence graphs. This powerful representation facilitates the development of an innovative implementation approach for nested-loop formulated multimedia algorithms such as motion estimation, matrix-matrix product, 2D linear transforms, and others. In particular, algebraic linear mapping (assignment and scheduling) methodology can be applied to implement such algorithms on an array of simple processing elements. The feasibility of this new approach is demonstrated on three major target architectures: application-specific integrated circuit (ASIC), field programmable gate array (FPGA), and a programmable clustered VLIW processor.
Implementation of pattern recognition algorithm based on RBF neural network
NASA Astrophysics Data System (ADS)
Bouchoux, Sophie; Brost, Vincent; Yang, Fan; Grapin, Jean Claude; Paindavoine, Michel
2002-12-01
In this paper, we present implementations of a pattern recognition algorithm based on an RBF (Radial Basis Function) neural network. Our aim is to elaborate an efficient system which performs real-time face tracking and identity verification in natural video sequences. Hardware implementations have been realized on an embedded system developed by our laboratory, based on a DSP (Digital Signal Processor) TMS320C6x. Optimization of the implementations allowed us to obtain a processing speed of 4.8 images (240x320 pixels) per second, with a correct rate of 95% for face tracking and identity verification.
Efficient implementations of hyperspectral chemical-detection algorithms
NASA Astrophysics Data System (ADS)
Brett, Cory J. C.; DiPietro, Robert S.; Manolakis, Dimitris G.; Ingle, Vinay K.
2013-10-01
Many military and civilian applications depend on the ability to remotely sense chemical clouds using hyperspectral imagers, from detecting small but lethal concentrations of chemical warfare agents to mapping plumes in the aftermath of natural disasters. Real-time operation is critical in these applications but becomes difficult to achieve as the number of chemicals we search for increases. In this paper, we present efficient CPU and GPU implementations of matched-filter-based algorithms so that real-time operation can be maintained with higher chemical-signature counts. The optimized C++ implementations show between 3x and 9x speedup over vectorized MATLAB implementations.
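The matched filter underlying such detectors scores each pixel spectrum against the target signature, whitened by the background covariance. A small pure-Python sketch restricted to a diagonal covariance for clarity (inputs and normalization here are illustrative conventions, not the paper's optimized kernels):

```python
def matched_filter_scores(X, target, mean, var):
    """Matched filter with diagonal background covariance C = diag(var):
    score(x) = s^T C^-1 (x - m) / (s^T C^-1 s), with s = target - mean."""
    s = [t - m for t, m in zip(target, mean)]
    denom = sum(si * si / v for si, v in zip(s, var))
    scores = []
    for x in X:
        num = sum(si * (xi - mi) / v
                  for si, xi, mi, v in zip(s, x, mean, var))
        scores.append(num / denom)
    return scores
```

With this normalization a pixel equal to the target signature scores 1 and a pixel equal to the background mean scores 0, so a fixed threshold can be applied per signature; the per-pixel dot products are what the CPU/GPU implementations vectorize.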
A distributed Canny edge detector: algorithm and FPGA implementation.
Xu, Qian; Varadarajan, Srenivas; Chakrabarti, Chaitali; Karam, Lina J
2014-07-01
The Canny edge detector is one of the most widely used edge detection algorithms due to its superior performance. Unfortunately, not only is it computationally more intensive than other edge detection algorithms, but it also has a higher latency because it is based on frame-level statistics. In this paper, we propose a mechanism to implement the Canny algorithm at the block level without any loss in edge detection performance compared with the original frame-level Canny algorithm. Directly applying the original Canny algorithm at the block level leads to excessive edges in smooth regions and to loss of significant edges in highly detailed regions, since the original Canny computes the high and low thresholds from frame-level statistics. To solve this problem, we present a distributed Canny edge detection algorithm that adaptively computes the edge detection thresholds based on the block type and the local distribution of the gradients in the image block. In addition, the new algorithm uses a nonuniform gradient magnitude histogram to compute block-based hysteresis thresholds. The resulting block-based algorithm has a significantly reduced latency and can be easily integrated with other block-based image codecs. It is capable of supporting fast edge detection of images and videos with high resolutions, including full-HD, since the latency is now a function of the block size instead of the frame size. In addition, quantitative conformance evaluations and subjective tests show that the edge detection performance of the proposed algorithm is better than that of the original frame-based algorithm, especially when noise is present in the images. Finally, the algorithm is implemented on a 32-computing-engine architecture and synthesized on the Xilinx Virtex-5 FPGA. The synthesized architecture takes only 0.721 ms (including the SRAM read/write time and the computation time) to detect edges of 512 × 512 images in the USC SIPI database when clocked at 100
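The block-adaptive threshold idea can be sketched as follows. A plain quantile over sorted magnitudes stands in for the paper's nonuniform-histogram computation, and the two fractions are hypothetical, not the paper's block-type-dependent values:

```python
def block_thresholds(grad_mags, high_fraction=0.2, low_ratio=0.4):
    """Per-block hysteresis thresholds from the local gradient-magnitude
    distribution, rather than frame-level statistics.

    grad_mags     : gradient magnitudes of one image block
    high_fraction : fraction of strongest gradients kept as edge seeds
    low_ratio     : low threshold as a fraction of the high threshold
    """
    mags = sorted(grad_mags)
    # High threshold: the magnitude below which (1 - high_fraction) of
    # this block's gradients fall.
    idx = min(int(len(mags) * (1.0 - high_fraction)), len(mags) - 1)
    high = mags[idx]
    return high, low_ratio * high

hi_t, lo_t = block_thresholds(list(range(100)))
```

Because the thresholds depend only on the block's own statistics, each block can be processed independently, which is what removes the frame-level latency.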
Implementation of the TRL Algorithm for Improved Impedance Measurements
Mane, Vibha; Shea, Tom
1993-05-03
The thru-reflect-line (TRL) algorithm for deembedding the scattering parameters, and hence the impedance, of a device under test (DUT) has been implemented in LabVIEW. This algorithm helps obtain the correct impedance of a device placed between mismatched ports. The nonideal port at each end of the two-port DUT is modeled as an ideal port in cascade with an error box. The scattering parameters are measured for three known conditions between the measurement planes M1 and M2, using a network analyzer.
FPGA implementation of digital down converter using CORDIC algorithm
NASA Astrophysics Data System (ADS)
Agarwal, Ashok; Lakshmi, Boppana
2013-01-01
In radio receivers, a Digital Down Converter (DDC) translates the signal from an intermediate frequency (IF) to baseband. It also decimates the oversampled signal to a lower sample rate, eliminating the need for a high-end digital signal processor. In this paper we implement an architecture for a DDC employing the CORDIC algorithm, which down-converts a 70 MHz (3G) IF signal to a 200 kHz baseband GSM signal with an SFDR greater than 100 dB. The implemented architecture reduces the hardware resource requirements by 15 percent compared with other architectures available in the literature, owing to the elimination of explicit multipliers and of a quadrature phase shifter for mixing.
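The multiplier-free mixing credited for the resource savings rests on rotation-mode CORDIC, which generates the sine and cosine of the local-oscillator phase using only shifts and adds. A floating-point model of the standard algorithm (the hardware would use fixed-point arithmetic and precomputed constants):

```python
import math

def cordic_cos_sin(theta, iterations=32):
    """Rotation-mode CORDIC: compute (cos, sin) of theta, |theta| <= pi/2.

    Each step rotates the vector by +/- atan(2^-i), which in fixed-point
    hardware is just a shift and an add; the constant gain is undone once
    at the end by the scale factor K.
    """
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    K = 1.0
    for i in range(iterations):
        K /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0          # rotate toward z = 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x * K, y * K

c, s = cordic_cos_sin(math.pi / 6)
```

In a DDC the pair (c, s) is the quadrature local oscillator, so no explicit multiplier-based mixer or 90-degree phase shifter is needed.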
Implementing wide baseline matching algorithms on a graphics processing unit.
Rothganger, Fredrick H.; Larson, Kurt W.; Gonzales, Antonio Ignacio; Myers, Daniel S.
2007-10-01
Wide baseline matching is the state of the art for object recognition and image registration problems in computer vision. Though effective, the computational expense of these algorithms limits their application to many real-world problems. The performance of wide baseline matching algorithms may be improved by using a graphics processing unit as a fast multithreaded co-processor. In this paper, we present an implementation of the difference-of-Gaussian feature extractor, based on the CUDA system of GPU programming developed by NVIDIA, and implemented on their hardware. For a 2000x2000 pixel image, the GPU-based method executes nearly thirteen times faster than a comparable CPU-based method, with no significant loss of accuracy.
Implementation and Optimization of Image Processing Algorithms on Embedded GPU
NASA Astrophysics Data System (ADS)
Singhal, Nitin; Yoo, Jin Woo; Choi, Ho Yeol; Park, In Kyu
In this paper, we analyze the key factors underlying the implementation, evaluation, and optimization of image processing and computer vision algorithms on embedded GPU using OpenGL ES 2.0 shader model. First, we present the characteristics of the embedded GPU and its inherent advantage when compared to embedded CPU. Additionally, we propose techniques to achieve increased performance with optimized shader design. To show the effectiveness of the proposed techniques, we employ cartoon-style non-photorealistic rendering (NPR), speeded-up robust feature (SURF) detection, and stereo matching as our example algorithms. Performance is evaluated in terms of the execution time and speed-up achieved in comparison with the implementation on embedded CPU.
Control algorithm implementation for a redundant degree of freedom manipulator
NASA Technical Reports Server (NTRS)
Cohan, Steve
1991-01-01
This project's purpose is to develop and implement control algorithms for a kinematically redundant robotic manipulator. The manipulator is being developed concurrently by Odetics Inc., under internal research and development funding. This SBIR contract supports algorithm conception, development, and simulation, as well as software implementation and integration with the manipulator hardware. The Odetics Dexterous Manipulator is a lightweight, high strength, modular manipulator being developed for space and commercial applications. It has seven fully active degrees of freedom, is electrically powered, and is fully operational in 1 G. The manipulator consists of five self-contained modules. These modules join via simple quick-disconnect couplings and self-mating connectors which allow rapid assembly/disassembly for reconfiguration, transport, or servicing. Each joint incorporates a unique drive train design which provides zero backlash operation, is insensitive to wear, and is single fault tolerant to motor or servo amplifier failure. The sensing system is also designed to be single fault tolerant. Although the initial prototype is not space qualified, the design is well-suited to meeting space qualification requirements. The control algorithm design approach is to develop a hierarchical system with well defined access and interfaces at each level. The high level endpoint/configuration control algorithm transforms manipulator endpoint position/orientation commands to joint angle commands, providing task space motion. At the same time, the kinematic redundancy is resolved by controlling the configuration (pose) of the manipulator, using several different optimizing criteria. The center level of the hierarchy servos the joints to their commanded trajectories using both linear feedback and model-based nonlinear control techniques. The lowest control level uses sensed joint torque to close torque servo loops, with the goal of improving the manipulator dynamic behavior
Multiplatform GPGPU implementation of the active contours without edges algorithm
NASA Astrophysics Data System (ADS)
Zavala-Romero, Olmo; Meyer-Baese, Anke; Meyer-Baese, Uwe
2012-05-01
An OpenCL implementation of the Active Contours Without Edges algorithm is presented. The proposed algorithm uses General Purpose Computing on Graphics Processing Units (GPGPU) to accelerate the original model by parallelizing the two main steps of the segmentation process: the computation of the Signed Distance Function (SDF) and the evolution of the segmented curve. The proposed scheme for the computation of the SDF is based on the iterative construction of partial Voronoi diagrams of a reduced dimension and obtains the exact Euclidean distance in a time of order O(N/p), where N is the number of pixels and p the number of processors. With high-resolution images the segmentation algorithm runs 10 times faster than its equivalent sequential implementation. This work is being developed as open-source software that, being programmed in OpenCL, can be used on different platforms, reaching a broad number of final users, and can be applied in different areas of computer vision, such as medical imaging, tracking, and robotics. This work uses OpenGL to visualize the algorithm results in real time.
Decoding the brain's algorithm for categorization from its neural implementation.
Mack, Michael L; Preston, Alison R; Love, Bradley C
2013-10-21
Acts of cognition can be described at different levels of analysis: what behavior should characterize the act, what algorithms and representations underlie the behavior, and how the algorithms are physically realized in neural activity [1]. Theories that bridge levels of analysis offer more complete explanations by leveraging the constraints present at each level [2-4]. Despite the great potential for theoretical advances, few studies of cognition bridge levels of analysis. For example, formal cognitive models of category decisions accurately predict human decision making [5, 6], but whether model algorithms and representations supporting category decisions are consistent with underlying neural implementation remains unknown. This uncertainty is largely due to the hurdle of forging links between theory and brain [7-9]. Here, we tackle this critical problem by using brain response to characterize the nature of mental computations that support category decisions to evaluate two dominant, and opposing, models of categorization. We found that brain states during category decisions were significantly more consistent with latent model representations from exemplar [5] rather than prototype theory [10, 11]. Representations of individual experiences, not the abstraction of experiences, are critical for category decision making. Holding models accountable for behavior and neural implementation provides a means for advancing more complete descriptions of the algorithms of cognition.
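The exemplar account that the brain data favored can be sketched with a minimal generalized-context-model-style choice rule. The distance metric, scaling parameter, and category structure below are illustrative, not the fitted model from the study:

```python
import math

def gcm_choice_prob(probe, exemplars_a, exemplars_b, c=1.0):
    """Exemplar-based category decision: the probe is compared to every
    stored exemplar (not to a single averaged prototype), and the choice
    probability for category A is its summed-similarity share.

    c : similarity scaling (sensitivity) parameter
    """
    def sim(x, y):
        # exponential similarity over city-block distance
        dist = sum(abs(xi - yi) for xi, yi in zip(x, y))
        return math.exp(-c * dist)

    sum_a = sum(sim(probe, e) for e in exemplars_a)
    sum_b = sum(sim(probe, e) for e in exemplars_b)
    return sum_a / (sum_a + sum_b)

# A probe matching a stored A exemplar is classified as A with high probability:
p_a = gcm_choice_prob([0.0, 0.0], [[0.0, 0.0]], [[1.0, 1.0]])
```

A prototype model would instead average each category's exemplars first and compare the probe only to the two means; the contrast between these two summed terms is what the latent model representations were tested against.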
A bioinspired collision detection algorithm for VLSI implementation
NASA Astrophysics Data System (ADS)
Cuadri, J.; Linan, G.; Stafford, R.; Keil, M. S.; Roca, E.
2005-06-01
In this paper a bioinspired algorithm for collision detection is proposed, based on previous models of the locust (Locusta migratoria) visual system reported by F.C. Rind and her group, in the University of Newcastle-upon-Tyne. The algorithm is suitable for VLSI implementation in standard CMOS technologies as a system-on-chip for automotive applications. The working principle of the algorithm is to process a video stream that represents the current scenario, and to fire an alarm whenever an object approaches on a collision course. Moreover, it establishes a scale of warning states, from no danger to collision alarm, depending on the activity detected in the current scenario. In the worst case, the minimum time before collision at which the model fires the collision alarm is 40 msec (1 frame before, at 25 frames per second). Since the average time to successfully fire an airbag system is 2 msec, even in the worst case, this algorithm would be very helpful to more efficiently arm the airbag system, or even take some kind of collision avoidance countermeasures. Furthermore, two additional modules have been included: a "Topological Feature Estimator" and an "Attention Focusing Algorithm". The former takes into account the shape of the approaching object to decide whether it is a person, a road line or a car. This helps to take more adequate countermeasures and to filter false alarms. The latter centres the processing power into the most active zones of the input frame, thus saving memory and processing time resources.
Implementation of several mathematical algorithms to breast tissue density classification
NASA Astrophysics Data System (ADS)
Quintana, C.; Redondo, M.; Tirao, G.
2014-02-01
The accuracy of mammographic abnormality detection methods is strongly dependent on breast tissue characteristics, where dense breast tissue can hide lesions, causing cancer to be detected at later stages. In addition, breast tissue density is widely accepted to be an important risk indicator for the development of breast cancer. This paper presents the implementation and performance of different mathematical algorithms designed to standardize the categorization of mammographic images according to the American College of Radiology classifications. These mathematical techniques are based on calculations of intrinsic properties and on comparison with an ideal homogeneous image (joint entropy, mutual information, normalized cross-correlation and index Q) as categorization parameters. The algorithms were evaluated on 100 cases of the mammographic data sets provided by the Ministerio de Salud de la Provincia de Córdoba, Argentina—Programa de Prevención del Cáncer de Mama (Department of Public Health, Córdoba, Argentina, Breast Cancer Prevention Program). The obtained breast classifications were compared with expert medical diagnoses, showing good performance. The implemented algorithms showed high potential for classifying breasts into tissue density categories.
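Of the listed categorization parameters, mutual information is representative. For two equal-size grey-level images it can be computed from the joint histogram of paired pixel values; a minimal sketch over flattened pixel sequences:

```python
import math
from collections import Counter

def mutual_information(img_a, img_b):
    """Mutual information (in bits) between two equal-length grey-level
    sequences, estimated from their marginal and joint histograms."""
    n = len(img_a)
    count_a = Counter(img_a)                  # marginal histogram of A
    count_b = Counter(img_b)                  # marginal histogram of B
    count_ab = Counter(zip(img_a, img_b))     # joint histogram
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p_ab * log2( p_ab / (p_a * p_b) ), with the n factors folded in
        mi += p_ab * math.log2(p_ab * n * n / (count_a[a] * count_b[b]))
    return mi
```

Identical images give the full entropy of the grey-level distribution, while a statistically independent pair gives zero, which is what makes the measure usable as a similarity score against a reference image.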
Experience with a Genetic Algorithm Implemented on a Multiprocessor Computer
NASA Technical Reports Server (NTRS)
Plassman, Gerald E.; Sobieszczanski-Sobieski, Jaroslaw
2000-01-01
Numerical experiments were conducted to find out the extent to which a Genetic Algorithm (GA) may benefit from a multiprocessor implementation, considering, on one hand, that analyses of individual designs in a population are independent of each other so that they may be executed concurrently on separate processors, and, on the other hand, that there are some operations in a GA that cannot be so distributed. The algorithm experimented with was based on a Gaussian distribution rather than bit exchange in the GA reproductive mechanism, and the test case was a hub frame structure of up to 1080 design variables. The experimentation, engaging up to 128 processors, confirmed expectations of radical elapsed-time reductions compared to a conventional single-processor implementation. It also demonstrated that the time spent in the non-distributable parts of the algorithm and the attendant cross-processor communication may have a very detrimental effect on the efficient utilization of the multiprocessor machine and on the number of processors that can be used effectively in a concurrent manner. Three techniques were devised and tested to mitigate that effect, resulting in efficiency increasing to exceed 99 percent.
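The Gaussian-perturbation reproduction mechanism can be sketched in a few lines. This is a serial toy version on a sphere function; the population size, mutation width, and truncation selection are illustrative, and none of the multiprocessor machinery is shown:

```python
import random

def gaussian_ga(fitness, dim, pop_size=20, sigma=0.3, generations=200, seed=1):
    """Minimal real-coded GA that reproduces by Gaussian perturbation of
    parent genes instead of bit exchange. Minimizes `fitness`."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        elite = pop[: pop_size // 2]          # truncation selection
        # Each child is a randomly chosen elite parent with every gene
        # perturbed by N(0, sigma) -- the Gaussian reproductive mechanism.
        children = [[g + rng.gauss(0.0, sigma) for g in rng.choice(elite)]
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return min(pop, key=fitness)

best = gaussian_ga(lambda v: sum(g * g for g in v), dim=3)
```

In the multiprocessor setting the fitness evaluations inside each generation are the independent, distributable part; the sort and selection are the serial bottleneck the abstract discusses.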
A hardware implementation of a relaxation algorithm to segment images
NASA Technical Reports Server (NTRS)
Loda, Antonio G.; Ranganath, Heggere S.
1988-01-01
Relaxation labelling is a mathematical technique frequently applied in image processing algorithms. In particular, it is extensively used for the purpose of segmenting images. The paper presents a hardware implementation of a segmentation algorithm, for images consisting of two regions, based on relaxation labelling. The algorithm determines, for each pixel, the probability that it should be labelled as belonging to a particular region, for all regions in the image. The label probabilities (labellings) of every pixel are iteratively updated, based on those of the pixel's neighbors, until they converge. The pixel is then assigned to the region corresponding to the maximum label probability. The system consists of a control unit and of a pipeline of segmentation stages. Each segmentation stage emulates in hardware one iteration of the relaxation algorithm. The design of the segmentation stage is based on commercially available digital signal processing integrated circuits. Multiple iterations are accomplished by stringing stages together or by looping the output of a stage, or string of stages, to its input. The system interfaces with a generic host computer. Given the modularity of the architecture, performance can be enhanced by merely adding segmentation stages.
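The per-pixel probability update that each pipeline stage emulates can be sketched as follows. This uses a simplified neighbour-averaging support function for a two-region image; a full relaxation-labelling scheme would weight the neighbours by compatibility coefficients:

```python
def relax_step(prob, width, height, alpha=0.5):
    """One relaxation iteration for a two-region image.

    prob holds, per pixel, the probability of belonging to region 1
    (region 0's probability is the complement). Each pixel is pulled
    toward the consensus of its 4-neighbourhood with weight alpha.
    """
    new = prob[:]
    for y in range(height):
        for x in range(width):
            nbrs = [prob[ny * width + nx]
                    for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1))
                    if 0 <= nx < width and 0 <= ny < height]
            support = sum(nbrs) / len(nbrs)
            new[y * width + x] = (1 - alpha) * prob[y * width + x] + alpha * support
    return new

# A noisy pixel inside a uniform region is pulled toward its neighbours:
p = [1.0, 1.0, 1.0, 1.0, 0.2, 1.0, 1.0, 1.0, 1.0]  # 3x3 image, noisy centre
for _ in range(10):
    p = relax_step(p, 3, 3)
```

Because each iteration reads only the previous iteration's values, one hardware stage per iteration can be chained into a pipeline exactly as the abstract describes.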
Multi-Angle Implementation of Atmospheric Correction (MAIAC) Algorithm
NASA Technical Reports Server (NTRS)
Lyapustin, A.; Wang, Y.
2012-01-01
Multi-Angle Implementation of Atmospheric Correction (MAIAC) is a new algorithm developed for MODIS. MAIAC uses a time series analysis and processing of groups of pixels to perform simultaneous retrievals of aerosol properties and surface bidirectional reflectance without typical assumptions about the surface. It is a generic algorithm which works over both dark and bright land surfaces, including deserts. MAIAC has an internal cloud mask, a dynamic land-water-snow classification, and a surface change mask, which allow it to flexibly choose a processing path over different surfaces. A distinct feature of MAIAC is the high 1 km resolution of aerosol retrievals, which is required in different applications, including air quality analysis. The novel features of MAIAC include the high-quality cloud mask, discrimination of aerosol type, including biomass burning smoke and dust, and detection of surface change - all required for high-quality aerosol retrievals. An overview of the algorithm, results of AERONET validation, and examples of comparison with the MODIS Collection 5 aerosol product and the Deep Blue algorithm for different parts of the world will be presented.
A Modified ART 1 Algorithm more Suitable for VLSI Implementations.
Linares-Barranco, Bernabe; Serrano-Gotarredona, Teresa
1996-08-01
This paper presents a modification to the original ART 1 algorithm ([Carpenter and Grossberg, 1987a], A massively parallel architecture for a self-organizing neural pattern recognition machine, Computer Vision, Graphics, and Image Processing, 37, 54-115) that is conceptually similar, can be implemented in hardware with less sophisticated building blocks, and maintains the computational capabilities of the originally proposed algorithm. This modified ART 1 algorithm (which we will call here ART 1(m)) is the result of hardware motivated simplifications investigated during the design of an actual ART 1 chip [Serrano-Gotarredona et al., 1994, Proc. 1994 IEEE Int. Conf. Neural Networks (Vol. 3, pp. 1912-1916); [Serrano-Gotarredona and Linares-Barranco, 1996], IEEE Trans. VLSI Systems, (in press)]. The purpose of this paper is simply to justify theoretically that the modified algorithm preserves the computational properties of the original one and to study the difference in behavior between the two approaches. Copyright 1996 Elsevier Science Ltd.
Developing and Implementing the Data Mining Algorithms in RAVEN
Sen, Ramazan Sonat; Maljovec, Daniel Patrick; Alfonsi, Andrea; Rabiti, Cristian
2015-09-01
The RAVEN code is becoming a comprehensive tool to perform probabilistic risk assessment, uncertainty quantification, and verification and validation. The RAVEN code is being developed to support many programs and to provide a set of methodologies and algorithms for advanced analysis. Scientific computer codes can generate enormous amounts of data. To post-process and analyze such data might, in some cases, take longer than the initial software runtime. Data mining algorithms/methods help in recognizing and understanding patterns in the data, and thus discover knowledge in databases. The methodologies used in dynamic probabilistic risk assessment or in uncertainty and error quantification analysis couple system/physics codes with simulation controller codes, such as RAVEN. RAVEN introduces both deterministic and stochastic elements into the simulation while the system/physics codes model the dynamics deterministically. A typical analysis is performed by sampling values of a set of parameters. A major challenge in using dynamic probabilistic risk assessment or uncertainty and error quantification analysis for a complex system is to analyze the large number of scenarios generated. Data mining techniques are typically used to better organize and understand data, i.e., recognizing patterns in the data. This report focuses on the development and implementation of Application Programming Interfaces (APIs) for different data mining algorithms, and the application of these algorithms to different databases.
Implementation and performance evaluation of reconstruction algorithms on graphics processors.
Castaño Díez, Daniel; Mueller, Hannes; Frangakis, Achilleas S
2007-01-01
The high-throughput needs in electron tomography and in single particle analysis have driven the parallel implementation of several reconstruction algorithms and software packages on computing clusters. Here, we report on the implementation of popular reconstruction algorithms such as weighted backprojection, the simultaneous iterative reconstruction technique (SIRT) and the simultaneous algebraic reconstruction technique (SART) on common graphics processors (GPUs). The speed gain achieved on the GPUs is on the order of sixty (60x) to eighty (80x) times, compared to the performance of a single central processing unit (CPU), which is comparable to the acceleration achieved on a medium-range computing cluster. This acceleration of the reconstruction is caused by the highly specialized architecture of the GPU. Further, we show that the quality of the reconstruction on the GPU is comparable to the CPU. We present detailed flow-chart diagrams of the implementation. The reconstruction software does not require special hardware apart from the commercially available graphics cards and could be easily integrated in software packages like SPIDER, XMIPP, TOM-Package and others.
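Of the ported algorithms, SIRT is the easiest to state compactly. A textbook serial version (not the GPU kernels, and on a generic linear system rather than a projection geometry) is:

```python
def sirt(A, b, iterations=50, relax=1.0):
    """Plain SIRT for A x = b: update every unknown from all rays
    simultaneously, x += relax * C A^T R (b - A x), where R and C are the
    inverse row and column sums of A."""
    rows, cols = len(A), len(A[0])
    row_sum = [sum(r) or 1.0 for r in A]
    col_sum = [sum(A[i][j] for i in range(rows)) or 1.0 for j in range(cols)]
    x = [0.0] * cols
    for _ in range(iterations):
        # row-normalized residuals  R (b - A x)
        resid = [(b[i] - sum(A[i][j] * x[j] for j in range(cols))) / row_sum[i]
                 for i in range(rows)]
        # column-normalized backprojection  C A^T resid
        for j in range(cols):
            x[j] += relax * sum(A[i][j] * resid[i] for i in range(rows)) / col_sum[j]
    return x

# Tiny consistent system with solution x = [1, 2]:
x = sirt([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 2.0, 3.0])
```

The simultaneous update over all rows is exactly what maps well onto a GPU: each voxel update is independent within an iteration.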
Algorithms and implementations of APT resonant control system
Wang, Yi-Ming; Regan, A.
1997-08-01
A digital signal processor is used to control the resonant frequency of the RFQ prototype in APT/LEDA. Two schemes are implemented to calculate the resonant frequency of the RFQ. One uses the measurement of the forward and reflected fields; the other uses the measurement of the forward and transmitted fields. The former is sensitive and accurate when the operation frequency is relatively far from the resonant frequency. The latter gives accurate results when the operation frequency is close to the resonant frequency. Linearized algorithms are derived to calculate the resonant frequency of the RFQ efficiently using a fixed-point DSP. The control frequency range is about 100 kHz for a 350 MHz operation frequency. A frequency-agile scheme employs a dual direct digital synthesizer to drive the klystron at the cavity's resonant frequency (not necessarily the required beam resonant frequency) in power-up mode, in order to quickly bring the cavity to the desired resonant frequency. This paper addresses the algorithm implementation and error analysis, as well as related hardware design issues.
Neural network implementations of data association algorithms for sensor fusion
NASA Technical Reports Server (NTRS)
Brown, Donald E.; Pittard, Clarence L.; Martin, Worthy N.
1989-01-01
The paper is concerned with locating a time varying set of entities in a fixed field when the entities are sensed at discrete time instances. At a given time instant a collection of bivariate Gaussian sensor reports is produced, and these reports estimate the location of a subset of the entities present in the field. A database of reports is maintained, which ideally should contain one report for each entity sensed. Whenever a collection of sensor reports is received, the database must be updated to reflect the new information. This updating requires association processing between the database reports and the new sensor reports to determine which pairs of sensor and database reports correspond to the same entity. Algorithms for performing this association processing are presented. Neural network implementation of the algorithms, along with simulation results comparing the approaches are provided.
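As a point of comparison for the neural approach, the classical baseline for this association step is gated nearest-neighbour matching under a Mahalanobis distance. A sketch with diagonal report covariances (the data structures and gate value are illustrative):

```python
def associate(db_reports, new_reports, gate=9.21):
    """Greedy nearest-neighbour association between database reports and
    new sensor reports. Each report is ((x, y), (var_x, var_y)); `gate`
    is a chi-square threshold (here the 99% point for 2 dof).
    Returns (db_index, new_index) pairs."""
    def d2(a, b):
        (ax, ay), (avx, avy) = a
        (bx, by), (bvx, bvy) = b
        # Mahalanobis distance with summed diagonal covariances
        return ((ax - bx) ** 2 / (avx + bvx)
                + (ay - by) ** 2 / (avy + bvy))

    pairs, used = [], set()
    for i, db in enumerate(db_reports):
        best = min(((d2(db, nr), j) for j, nr in enumerate(new_reports)
                    if j not in used), default=None)
        if best is not None and best[0] <= gate:
            pairs.append((i, best[1]))
            used.add(best[1])
    return pairs
```

Reports falling outside every gate would spawn new database entries; the neural implementations in the paper replace this combinatorial matching with a network relaxation.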
Infrared Jitter Imaging Data Reduction: Algorithms and Implementation
NASA Astrophysics Data System (ADS)
Devillard, Nicolas
Jitter imaging (also known as microscanning) is probably one of the most efficient ways to perform astronomical observations in the infrared. It requires very efficient filtering and recentering methods to produce the best possible output from raw data. This paper discusses issues attached to Poisson offset generation, efficient infrared sky filtering, offset recovery between planes through cross-correlation and/or point pattern recognition techniques, plane shifting with subpixel resolution through various kernel-based interpolation schemes, and 3D filtering for plane accumulation. Several algorithms are described for each step, with automatic data processing in pipeline mode (i.e., without user interaction), as intended for the Very Large Telescope, in mind. Implementation of these algorithms in optimized ANSI C (the eclipse library) is also described here.
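Offset recovery through cross-correlation reduces, in one dimension and at integer precision, to maximizing a lag-dependent inner product; the 2-D case and the subpixel refinement via interpolation are omitted in this sketch:

```python
def offset_by_xcorr(ref, shifted, max_lag):
    """Recover the integer shift between two 1-D signals by maximising
    their cross-correlation over a window of candidate lags."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # correlate ref against shifted displaced by `lag` samples
        score = sum(ref[i] * shifted[i + lag]
                    for i in range(len(ref))
                    if 0 <= i + lag < len(shifted))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

sig = [0, 0, 1, 5, 1, 0, 0, 0]
moved = [0, 0, 0, 0, 1, 5, 1, 0]   # sig delayed by 2 samples
lag = offset_by_xcorr(sig, moved, max_lag=3)
```

In the pipeline, the recovered offsets are then applied with kernel-based interpolation so that planes can be accumulated with subpixel registration.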
Kodiak: An Implementation Framework for Branch and Bound Algorithms
NASA Technical Reports Server (NTRS)
Smith, Andrew P.; Munoz, Cesar A.; Narkawicz, Anthony J.; Markevicius, Mantas
2015-01-01
Recursive branch and bound algorithms are often used to refine and isolate solutions to several classes of global optimization problems. A rigorous computation framework for the solution of systems of equations and inequalities involving nonlinear real arithmetic over hyper-rectangular variable and parameter domains is presented. It is derived from a generic branch and bound algorithm that has been formally verified, and utilizes self-validating enclosure methods, namely interval arithmetic and, for polynomials and rational functions, Bernstein expansion. Since bounds computed by these enclosure methods are sound, this approach may be used reliably in software verification tools. Advantage is taken of the partial derivatives of the constraint functions involved in the system, firstly to reduce the branching factor by the use of bisection heuristics and secondly to permit the computation of bifurcation sets for systems of ordinary differential equations. The associated software development, Kodiak, is presented, along with examples of three different branch and bound problem types it implements.
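The generic branch-and-bound scheme can be sketched as follows, assuming the caller supplies a sound interval enclosure of the objective. This is a 1-D, bisection-only sketch; Kodiak's Bernstein expansion and derivative-based bisection heuristics are omitted:

```python
def branch_and_bound_min(f_range, lo, hi, tol=1e-6):
    """Enclose the global minimum of f on [lo, hi].

    f_range(a, b) must return sound (lower, upper) bounds of f on [a, b],
    e.g. from interval arithmetic. Returns (lower, upper) bounds on the
    global minimum; soundness of the enclosure makes the result rigorous.
    """
    best_upper = float("inf")
    result_lower = float("inf")
    boxes = [(lo, hi)]
    while boxes:
        a, b = boxes.pop()
        f_lo, f_hi = f_range(a, b)
        if f_lo > best_upper:          # box cannot contain the minimum
            continue
        best_upper = min(best_upper, f_hi)
        if b - a < tol:                # small enough: record its bound
            result_lower = min(result_lower, f_lo)
            continue
        m = 0.5 * (a + b)              # bisect and recurse
        boxes += [(a, m), (m, b)]
    return result_lower, best_upper

# Hand-written sound enclosure of f(x) = x^2 - 2x on [a, b], for the demo:
def enclose(a, b):
    xs = (a * a, a * b, b * b)
    return min(xs) - 2 * b, max(xs) - 2 * a

m_lo, m_hi = branch_and_bound_min(enclose, -2.0, 3.0)   # true minimum is -1 at x = 1
```

Because the pruning test only ever discards boxes whose lower bound exceeds a valid upper bound on the minimum, the returned interval is guaranteed to contain the true global minimum.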
A comparative analysis of GPU implementations of spectral unmixing algorithms
NASA Astrophysics Data System (ADS)
Sanchez, Sergio; Plaza, Antonio
2011-11-01
Spectral unmixing is a very important task for remotely sensed hyperspectral data exploitation. It involves the separation of a mixed pixel spectrum into its pure component spectra (called endmembers) and the estimation of the proportion (abundance) of each endmember in the pixel. Over recent years, several algorithms have been proposed for: i) automatic extraction of endmembers, and ii) estimation of the abundance of endmembers in each pixel of the hyperspectral image. The latter step usually imposes two constraints on abundance estimation: the non-negativity constraint (meaning that the estimated abundances cannot be negative) and the sum-to-one constraint (meaning that the sum of endmember fractional abundances for a given pixel must be unity). These two steps comprise a hyperspectral unmixing chain, which can be very time-consuming (particularly for high-dimensional hyperspectral images). Parallel computing architectures have offered an attractive solution for fast unmixing of hyperspectral data sets, but these systems are expensive and difficult to adapt to on-board data processing scenarios, in which low-weight and low-power integrated components are essential to reduce mission payload and obtain analysis results in (near) real-time. In this paper, we perform an inter-comparison of parallel algorithms for automatic extraction of pure spectral signatures or endmembers and for estimation of the abundance of endmembers in each pixel of the scene. The compared techniques are implemented in graphics processing units (GPUs). These hardware accelerators can bridge the gap towards on-board processing of this kind of data. The considered algorithms comprise the orthogonal subspace projection (OSP), iterative error analysis (IEA) and N-FINDR algorithms for endmember extraction, as well as unconstrained, partially constrained and fully constrained abundance estimation. The considered implementations are inter-compared using different GPU architectures and hyperspectral
DSP Implementation of the Multiscale Retinex Image Enhancement Algorithm
NASA Technical Reports Server (NTRS)
Hines, Glenn D.; Rahman, Zia-ur; Jobson, Daniel J.; Woodell, Glenn A.
2004-01-01
The Retinex is a general-purpose image enhancement algorithm that is used to produce good visual representations of scenes. It performs a non-linear spatial/spectral transform that synthesizes strong local contrast enhancement and color constancy. A real-time, video frame rate implementation of the Retinex is required to meet the needs of various potential users. Retinex processing contains a relatively large number of complex computations, thus to achieve real-time performance using current technologies requires specialized hardware and software. In this paper we discuss the design and development of a digital signal processor (DSP) implementation of the Retinex. The target processor is a Texas Instruments TMS320C6711 floating point DSP. NTSC video is captured using a dedicated frame grabber card, Retinex processed, and displayed on a standard monitor. We discuss the optimizations used to achieve real-time performance of the Retinex and also describe our future plans on using alternative architectures.
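The non-linear spatial/spectral transform has a compact single-scale core: the log of the input minus the log of a Gaussian-weighted surround. A 1-D sketch follows (the multiscale version sums several such outputs at different sigmas, and 2-D images apply the blur along both axes):

```python
import math

def single_scale_retinex(signal, sigma=2.0):
    """Single-scale Retinex on a 1-D positive signal:
    output = log(signal) - log(Gaussian-blurred surround)."""
    radius = int(3 * sigma)
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in offsets]
    norm = sum(kernel)
    out = []
    for i, v in enumerate(signal):
        # clamped-index Gaussian surround estimate
        surround = sum(w * signal[min(max(i + k, 0), len(signal) - 1)]
                       for w, k in zip(kernel, offsets))
        out.append(math.log(v) - math.log(surround / norm))
    return out

flat = single_scale_retinex([10.0] * 20)   # uniform input -> zero response
```

The log-ratio form is what yields simultaneous dynamic-range compression and local contrast enhancement; it is also where most of the DSP cycle budget goes, since every pixel needs a blur and two logarithms.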
DSP Implementation of the Retinex Image Enhancement Algorithm
NASA Technical Reports Server (NTRS)
Hines, Glenn; Rahman, Zia-Ur; Jobson, Daniel; Woodell, Glenn
2004-01-01
The Retinex is a general-purpose image enhancement algorithm that is used to produce good visual representations of scenes. It performs a non-linear spatial/spectral transform that synthesizes strong local contrast enhancement and color constancy. A real-time, video frame rate implementation of the Retinex is required to meet the needs of various potential users. Retinex processing contains a relatively large number of complex computations, thus to achieve real-time performance using current technologies requires specialized hardware and software. In this paper we discuss the design and development of a digital signal processor (DSP) implementation of the Retinex. The target processor is a Texas Instruments TMS320C6711 floating point DSP. NTSC video is captured using a dedicated frame-grabber card, Retinex processed, and displayed on a standard monitor. We discuss the optimizations used to achieve real-time performance of the Retinex and also describe our future plans on using alternative architectures.
Purgatorio - A new implementation of the Inferno algorithm
Wilson, B; Sonnad, V; Sterne, P; Isaacs, W
2005-03-29
For astrophysical applications, as well as modeling laser-produced plasmas, there is a continual need for equation-of-state data over a wide domain of physical conditions. This paper presents algorithmic aspects of computing the Helmholtz free energy of plasma electrons for temperatures spanning from a few Kelvin to several keV, and densities ranging from essentially isolated-ion conditions to compressions so large that most bound orbitals become delocalized. The objective is high-precision results, in order to compute pressure and other thermodynamic quantities by numerical differentiation. This approach has the advantage that internal thermodynamic self-consistency is ensured, regardless of the specific physical model, but at the cost of very stringent numerical tolerances for each operation. The computational aspects we address in this paper are faced by any model that relies on input from the quantum mechanical spectrum of a spherically symmetric Hamiltonian operator. The particular physical model we employ is that of INFERNO: a spherically averaged ion embedded in jellium. An overview of PURGATORIO, a new implementation of the INFERNO equation-of-state model, is presented. The new algorithm emphasizes a novel decimation scheme for automatically resolving the structure of the continuum density of states, circumventing limitations of the pseudo-R-matrix algorithm previously utilized.
NASA Astrophysics Data System (ADS)
Ervin, Katherine; Shipman, Steven
2017-06-01
While rotational spectra can be rapidly collected, their analysis (especially for complex systems) is seldom straightforward, leading to a bottleneck. The AUTOFIT program was designed to serve that need by quickly matching rotational constants to spectra with little user input and supervision. This program can potentially be improved by incorporating an optimization algorithm in the search for a solution. The Particle Swarm Optimization Algorithm (PSO) was chosen for implementation. PSO is part of a family of optimization algorithms called heuristic algorithms, which seek approximate best answers. This is ideal for rotational spectra, where an exact match will not be found without incorporating distortion constants, etc., which would otherwise greatly increase the size of the search space. PSO was tested for robustness against five standard fitness functions and then applied to a custom fitness function created for rotational spectra. This talk will explain the Particle Swarm Optimization algorithm and how it works, describe how Autofit was modified to use PSO, discuss the fitness function developed to work with spectroscopic data, and show our current results. Seifert, N.A., Finneran, I.A., Perez, C., Zaleski, D.P., Neill, J.L., Steber, A.L., Suenram, R.D., Lesarri, A., Shipman, S.T., Pate, B.H., J. Mol. Spec. 312, 13-21 (2015)
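The canonical global-best PSO described in the abstract above can be sketched as follows. This is a generic illustration minimizing a toy sphere function, not the AUTOFIT-specific fitness function built for spectroscopic data; all hyperparameters (inertia `w`, acceleration constants `c1`, `c2`) are illustrative assumptions.

```python
import random

def pso(f, dim, bounds, n_particles=30, iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    # Global-best PSO: each particle keeps a velocity and is pulled toward
    # both its personal best position and the swarm's global best.
    rng = random.Random(seed)
    lo, hi = bounds
    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    P = [x[:] for x in X]                    # personal best positions
    pbest = [f(x) for x in X]
    g = min(range(n_particles), key=lambda i: pbest[i])
    G, gbest = P[g][:], pbest[g]             # global best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                V[i][d] = (w * V[i][d]
                           + c1 * rng.random() * (P[i][d] - X[i][d])
                           + c2 * rng.random() * (G[d] - X[i][d]))
                X[i][d] = min(hi, max(lo, X[i][d] + V[i][d]))
            fx = f(X[i])
            if fx < pbest[i]:
                P[i], pbest[i] = X[i][:], fx
                if fx < gbest:
                    G, gbest = X[i][:], fx
    return G, gbest

best, val = pso(lambda x: sum(t * t for t in x), dim=2, bounds=(-5.0, 5.0))
```

As the abstract notes, PSO only seeks an approximate best answer, which suits spectra where an exact constant match is not expected.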
Cascade Error Projection: A Learning Algorithm for Hardware Implementation
NASA Technical Reports Server (NTRS)
Duong, Tuan A.; Daud, Taher
1996-01-01
In this paper, we work out a detailed mathematical analysis for a new learning algorithm termed Cascade Error Projection (CEP) and a general learning framework. This framework can be used to obtain the cascade correlation learning algorithm by choosing a particular set of parameters. Furthermore, the CEP learning algorithm operates on only one layer, whereas the other set of weights can be calculated deterministically. In association with the dynamical stepsize-change concept used to convert the weight update from an infinite space into a finite space, the relation between the current stepsize and the previous energy level is also given, and the estimation procedure for the optimal stepsize is used to validate our proposed technique. Weight values of zero are used to start the learning for every layer, and a single hidden unit is applied instead of a pool of candidate hidden units as in the cascade correlation scheme. Therefore, simplicity in hardware implementation is also obtained. Furthermore, this analysis allows us to select from other methods (such as conjugate gradient descent or Newton's second-order method) one that will be a good candidate for the learning technique. The choice of learning technique depends on the constraints of the problem (e.g., speed, performance, and hardware implementation); one technique may be more suitable than others. Moreover, for a discrete weight space, the theoretical analysis presents the capability of learning with limited weight quantization. Finally, 5- to 8-bit parity and chaotic time-series prediction problems are investigated; the simulation results demonstrate that 4-bit or more weight quantization is sufficient for learning a neural network using CEP. In addition, it is demonstrated that this technique is able to compensate for lower-bit weight resolution by incorporating additional hidden units. However, generalization results may suffer somewhat with lower-bit weight quantization.
Parallel Implementation of the Box Counting Algorithm in Opencl
NASA Astrophysics Data System (ADS)
Mukundan, Ramakrishnan
2015-06-01
The box counting algorithm is a well-known method for the computation of the fractal dimension of an image. It is often implemented using a recursive subdivision of the image into a set of regular tiles or boxes. Parallel implementations often try to map the boxes to different compute units, and combine the results to get the total number of boxes intersecting a shape. This paper presents a novel and highly efficient method using Open Computing Language (OpenCL) kernels to perform the computation on a per-pixel basis. The mapping and reduction stages are performed in a single pass, and therefore require the enqueuing of only a single kernel. Each instance of the kernel updates the information pertaining to all the boxes containing the pixel, and simultaneously increments the box counters at multiple levels, thereby eliminating the need for another pass to perform the summation. The complete implementation and coding details of the proposed method are outlined. The performance of the method on different processors is analyzed with respect to varying image sizes.
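The sequential baseline that such GPU implementations parallelize can be sketched in a few lines: count, at each box size, how many boxes intersect the shape, then fit the slope of log N(s) against log(1/s). This serial NumPy sketch is for illustration only (the paper's contribution is the single-pass OpenCL kernel, not this loop).

```python
import numpy as np

def box_count(img, s):
    # Number of s-by-s boxes containing at least one foreground pixel.
    h, w = img.shape
    return sum(1 for i in range(0, h, s) for j in range(0, w, s)
               if img[i:i + s, j:j + s].any())

def fractal_dimension(img, sizes=(1, 2, 4, 8)):
    # Fractal (box-counting) dimension: N(s) ~ s^(-D), so D is minus the
    # least-squares slope of log N(s) versus log s.
    counts = [box_count(img, s) for s in sizes]
    slope, _ = np.polyfit(np.log(sizes), np.log(counts), 1)
    return -slope
```

On exactly self-similar test images the estimate is exact: a filled square gives dimension 2, a single pixel row gives dimension 1.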
NASA Astrophysics Data System (ADS)
López-Medina, Mario E.; Vázquez-Montiel, Sergio; Herrera-Vázquez, Joel
2008-04-01
Genetic algorithms (GAs) are a method of global optimization that we use in the optimization stage of optical system design. In the case of optical design and optimization, the efficiency and convergence speed of GAs are related to the merit function, the crossover operator, and the mutation operator. In this study we present a comparison between several genetic algorithm implementations using different optical systems, such as an achromatic cemented doublet, an air-spaced doublet, and telescopes. We perform the comparison by varying the type of design parameters and the number of parameters to be optimized. We also implement the GAs using discrete parameters with binary chains and continuous parameters using real numbers in the chromosome, analyzing the differences in the time taken to find the solution and the precision of the results between discrete and continuous parameters. Additionally, we use different merit functions to optimize the same optical system. We present the obtained results in tables, graphics and a detailed example, and from the comparison we conclude which is the best way to implement GAs for optical system design and optimization. The programs developed for this work were made using the C programming language and OSLO for the simulation of the optical systems.
FPGA implementation of Generalized Hebbian Algorithm for texture classification.
Lin, Shiow-Jyu; Hwang, Wen-Jyi; Lee, Wei-Hao
2012-01-01
This paper presents a novel hardware architecture for principal component analysis. The architecture is based on the Generalized Hebbian Algorithm (GHA) because of its simplicity and effectiveness. The architecture is separated into three portions: the weight vector updating unit, the principal computation unit and the memory unit. In the weight vector updating unit, the computation of different synaptic weight vectors shares the same circuit for reducing the area costs. To show the effectiveness of the circuit, a texture classification system based on the proposed architecture is physically implemented by Field Programmable Gate Array (FPGA). It is embedded in a System-On-Programmable-Chip (SOPC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient design for attaining both high speed performance and low area costs.
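The weight update that the hardware's weight vector updating unit implements is Sanger's rule, the core of the Generalized Hebbian Algorithm. A minimal NumPy sketch (the learning rate, data distribution, and iteration count below are illustrative assumptions, not the paper's FPGA parameters):

```python
import numpy as np

def gha_step(W, x, lr):
    # Sanger's rule / GHA: dW = lr * (y x^T - LT(y y^T) W), where LT keeps
    # the lower triangle. Row k of W converges to the k-th principal
    # component of the input distribution.
    y = W @ x
    return W + lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)

# Toy data whose principal component is the first axis (variance 4 vs 0.04).
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((2, 2))
for _ in range(5000):
    x = np.array([2.0 * rng.standard_normal(), 0.2 * rng.standard_normal()])
    W = gha_step(W, x, lr=0.005)
```

After training, the first weight row aligns (up to sign) with the dominant axis and approaches unit norm, which is the behavior the shared updating circuit must reproduce for each synaptic weight vector.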
An overview of SuperLU: Algorithms, implementation, and user interface
Li, Xiaoye S.
2003-09-30
We give an overview of the algorithms, design philosophy, and implementation techniques in the software SuperLU for solving sparse unsymmetric linear systems. In particular, we highlight the differences between the sequential SuperLU (including its multithreaded extension) and parallel SuperLU_DIST. These include the numerical pivoting strategy, the ordering strategy for preserving sparsity, the ordering in which the updating tasks are performed, the numerical kernel, and the parallelization strategy. Because of the scalability concern, the parallel code is drastically different from the sequential one. We describe the user interfaces of the libraries, and illustrate how to use the libraries most efficiently depending on some matrix characteristics. Finally, we give some examples of how the solver has been used in large-scale scientific applications, and the performance.
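SuperLU's sophistication lies in sparsity-preserving orderings, supernodal kernels, and parallelization, but its numerical core is LU factorization with pivoting followed by two triangular solves. A minimal dense sketch of that core (purely illustrative; for real sparse systems one would use SuperLU itself, e.g. through `scipy.sparse.linalg.splu`, which wraps it):

```python
def lu_solve(A, b):
    # Doolittle LU factorization with partial (row) pivoting, stored in
    # place: strictly lower triangle holds L (unit diagonal implied),
    # upper triangle holds U. Then solve Ly = Pb and Ux = y.
    n = len(A)
    A = [row[:] for row in A]
    piv = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(A[i][k]))  # pivot row
        if p != k:
            A[k], A[p] = A[p], A[k]
            piv[k], piv[p] = piv[p], piv[k]
        for i in range(k + 1, n):
            A[i][k] /= A[k][k]
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[k][j]
    y = [b[piv[i]] for i in range(n)]        # forward substitution, Ly = Pb
    for i in range(n):
        for j in range(i):
            y[i] -= A[i][j] * y[j]
    x = [0.0] * n                            # back substitution, Ux = y
    for i in range(n - 1, -1, -1):
        s = y[i]
        for j in range(i + 1, n):
            s -= A[i][j] * x[j]
        x[i] = s / A[i][i]
    return x
```

The partial-pivoting step here corresponds to the "numerical pivoting strategy" the abstract names as one axis on which the sequential and distributed codes differ.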
The implementation of Grover's algorithm in optically driven quantum dots
NASA Astrophysics Data System (ADS)
Yin, W.; Liang, J. Q.; Yan, Q. W.
2006-11-01
In this paper, we study the implementation of Grover's algorithm using the system of three identical quantum dots (QDs) coupled by a multi-frequency optical field. Our result shows that increasing the electric field strength A speeds up the oscillations of the occupations of the excited states rather than increasing the occupation probabilities of those states. The larger the detuning of the field from resonance, the fewer the states which can be used as qubits. Compared with a multi-frequency external field, a single-frequency external field will generate much lower amplitudes of the excited states under the same coupling strength A and interdot Coulomb interaction V. However, when the three quantum dots are coupled with a single-frequency external field, these amplitudes increase on increasing the coupling strength A or decreasing the interdot Coulomb interaction V.
All-Optical Implementation of the Ant Colony Optimization Algorithm
Hu, Wenchao; Wu, Kan; Shum, Perry Ping; Zheludev, Nikolay I.; Soci, Cesare
2016-01-01
We report all-optical implementation of the optimization algorithm for the famous “ant colony” problem. Ant colonies progressively optimize pathway to food discovered by one of the ants through identifying the discovered route with volatile chemicals (pheromones) secreted on the way back from the food deposit. Mathematically this is an important example of graph optimization problem with dynamically changing parameters. Using an optical network with nonlinear waveguides to represent the graph and a feedback loop, we experimentally show that photons traveling through the network behave like ants that dynamically modify the environment to find the shortest pathway to any chosen point in the graph. This proof-of-principle demonstration illustrates how transient nonlinearity in the optical system can be exploited to tackle complex optimization problems directly, on the hardware level, which may be used for self-routing of optical signals in transparent communication networks and energy flow in photonic systems. PMID:27222098
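The pheromone dynamics that the photons emulate can be sketched in software: ants build paths with probability proportional to pheromone strength times a heuristic, trails evaporate, and the best path is reinforced. This is a generic software ACO for a toy graph, offered only to illustrate the mathematics; the graph, parameters (`alpha`, `beta`, evaporation rate `rho`), and dead-end handling are all assumptions, not the optical implementation.

```python
import random

def aco_shortest_path(graph, start, goal, n_ants=20, iters=50,
                      alpha=1.0, beta=2.0, rho=0.5, seed=1):
    # Each ant extends its path with probability ~ tau^alpha * (1/d)^beta;
    # after each iteration pheromone evaporates and the best-so-far path
    # is reinforced, mimicking the trail-laying described above.
    rng = random.Random(seed)
    tau = {(u, v): 1.0 for u in graph for v in graph[u]}
    best_path, best_cost = None, float('inf')
    for _ in range(iters):
        for _ in range(n_ants):
            node, path, cost = start, [start], 0.0
            while node != goal:
                choices = [(v, d) for v, d in graph[node].items()
                           if v not in path]
                if not choices:
                    cost = float('inf')      # dead end: discard this ant
                    break
                weights = [tau[(node, v)] ** alpha * (1.0 / d) ** beta
                           for v, d in choices]
                v, d = rng.choices(choices, weights=weights)[0]
                path.append(v); cost += d; node = v
            if cost < best_cost:
                best_path, best_cost = path, cost
        for e in tau:
            tau[e] *= (1.0 - rho)            # evaporation
        if best_path:
            for u, v in zip(best_path, best_path[1:]):
                tau[(u, v)] += 1.0 / best_cost  # reinforcement
    return best_path, best_cost

toy = {'A': {'B': 1, 'C': 5, 'D': 3}, 'B': {'D': 1}, 'C': {'D': 5}, 'D': {}}
path, cost = aco_shortest_path(toy, 'A', 'D')
```

In the optical version, the transient nonlinearity of the waveguides plays the role of `tau`: light passing through an edge changes its transmission, just as the update loop above modifies the environment for later ants.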
Neuromorphic implementations of neurobiological learning algorithms for spiking neural networks.
Walter, Florian; Röhrbein, Florian; Knoll, Alois
2015-12-01
The application of biologically inspired methods in design and control has a long tradition in robotics. Unlike previous approaches in this direction, the emerging field of neurorobotics not only mimics biological mechanisms at a relatively high level of abstraction but employs highly realistic simulations of actual biological nervous systems. Even today, carrying out these simulations efficiently at appropriate timescales is challenging. Neuromorphic chip designs specially tailored to this task therefore offer an interesting perspective for neurorobotics. Unlike Von Neumann CPUs, these chips cannot be simply programmed with a standard programming language. Like real brains, their functionality is determined by the structure of neural connectivity and synaptic efficacies. Enabling higher cognitive functions for neurorobotics consequently requires the application of neurobiological learning algorithms to adjust synaptic weights in a biologically plausible way. In this paper, we therefore investigate how to program neuromorphic chips by means of learning. First, we provide an overview of selected neuromorphic chip designs and analyze them in terms of neural computation, communication systems and software infrastructure. On the theoretical side, we review neurobiological learning techniques. Based on this overview, we then examine on-die implementations of these learning algorithms on the considered neuromorphic chips. A final discussion puts the findings of this work into context and highlights how neuromorphic hardware can potentially advance the field of autonomous robot systems. The paper thus gives an in-depth overview of neuromorphic implementations of basic mechanisms of synaptic plasticity which are required to realize advanced cognitive capabilities with spiking neural networks. Copyright © 2015 Elsevier Ltd. All rights reserved.
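The best-known of the synaptic plasticity mechanisms such chips implement on-die is pair-based spike-timing-dependent plasticity (STDP). A minimal sketch of its weight-change kernel (the amplitudes and time constant below are illustrative assumptions, not values from any particular chip):

```python
import math

def stdp_dw(dt, a_plus=0.1, a_minus=0.12, tau=20.0):
    # Pair-based STDP kernel: potentiate when the presynaptic spike
    # precedes the postsynaptic one (dt = t_post - t_pre > 0), depress
    # otherwise, with exponential decay over the time constant tau (ms).
    if dt > 0:
        return a_plus * math.exp(-dt / tau)
    return -a_minus * math.exp(dt / tau)
```

Because the rule depends only on locally available spike times, it maps naturally onto per-synapse circuits, which is why it is the canonical candidate for "programming by learning" on neuromorphic hardware.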
The implementation of the Talmud property allocation algorithm based on the graphic point-segment method
NASA Astrophysics Data System (ADS)
Cen, Haifeng
2017-04-01
Guided by the theory of the Talmud allocation scheme, this paper analyzes the algorithm's implementation process from the perspective of the graphic point-segment method, and designs a point-segment Talmud property allocation algorithm. The core of the allocation algorithm is then implemented in Java, with a visual interface built using Android programming.
Border-tracing algorithm implementation for the femoral geometry reconstruction.
Testi, D; Zannoni, C; Cappello, A; Viceconti, M
2001-06-01
In some orthopaedic applications such as the design of custom-made hip prostheses, reconstruction of the bone morphology is a fundamental step. Different methods are available to extract the geometry of the femoral medullary canal from computed tomography (CT) images. In this research, an automatic procedure (border-tracing method) for the extraction of bone contours was implemented and validated. A composite replica of the human femur was scanned and the CT images processed using three different methods: a manual procedure; the border-tracing algorithm; and a threshold-based method. The resulting contours were used to estimate the accuracy of the implemented procedure. The two software techniques were more accurate than the manual procedure. Then, these two procedures were applied to an in vivo CT data set in order to determine the most critical region for repeatability. Only for the images located in this region, the repeatability measurement was carried out for six in vivo CT data sets to evaluate the inter-femur repeatability. The border-tracing method was found to achieve the highest repeatability.
An implementation of continuous genetic algorithm in parameter estimation of predator-prey model
NASA Astrophysics Data System (ADS)
Windarto
2016-03-01
The genetic algorithm is an optimization method based on the principles of genetics and natural selection in living organisms. The main components of this algorithm are a population of chromosomes (individuals), parent selection, crossover to produce new offspring, and random mutation. In this paper, a continuous genetic algorithm was implemented to estimate parameters in a predator-prey model of Lotka-Volterra type. For simplicity, all genetic algorithm parameters (selection rate and mutation rate) are held constant throughout the run. It was found that, by selecting a suitable mutation rate, the algorithm can estimate these parameters well.
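The components listed above (real-coded population, parent selection, crossover, mutation) can be sketched as follows. To keep the example self-contained it fits a straight line by least squares rather than integrating the Lotka-Volterra ODEs; the operators (tournament selection, blend crossover, Gaussian mutation, elitism) and all rates are illustrative assumptions, not the paper's settings.

```python
import random

def continuous_ga(fitness, dim, bounds, pop_size=50, gens=150,
                  mut_rate=0.2, sigma=0.3, seed=2):
    # Real-coded GA: tournament selection, blend (arithmetic) crossover,
    # Gaussian mutation, and elitism (best individual always survives).
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(gens):
        scored = sorted(pop, key=fitness)
        new_pop = [scored[0][:]]                      # elitism
        while len(new_pop) < pop_size:
            p1 = min(rng.sample(scored, 3), key=fitness)  # tournament
            p2 = min(rng.sample(scored, 3), key=fitness)
            a = rng.random()
            child = [a * u + (1 - a) * v for u, v in zip(p1, p2)]
            if rng.random() < mut_rate:
                k = rng.randrange(dim)
                child[k] = min(hi, max(lo, child[k] + rng.gauss(0, sigma)))
            new_pop.append(child)
        pop = new_pop
    return min(pop, key=fitness)

# Toy parameter estimation: recover slope and intercept of y = 2x + 1.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
sse = lambda p: sum((p[0] * x + p[1] - y) ** 2 for x, y in zip(xs, ys))
a, b = continuous_ga(sse, dim=2, bounds=(-10.0, 10.0))
```

For the predator-prey case, `sse` would instead compare a numerical ODE solution against observed population data, but the GA machinery is unchanged.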
An Object-Oriented Collection of Minimum Degree Algorithms: Design, Implementation, and Experiences
NASA Technical Reports Server (NTRS)
Kumfert, Gary; Pothen, Alex
1999-01-01
The multiple minimum degree (MMD) algorithm and its variants have enjoyed 20+ years of research and progress in generating fill-reducing orderings for sparse, symmetric positive definite matrices. Although conceptually simple, efficient implementations of these algorithms are deceptively complex and highly specialized. In this case study, we present an object-oriented library that implements several recent minimum degree-like algorithms. We discuss how object-oriented design forces us to decompose these algorithms in a different manner than earlier codes and demonstrate how this impacts the flexibility and efficiency of our C++ implementation. We compare the performance of our code against other implementations in C or Fortran.
Systolic VLSI array for implementing the Kalman filter algorithm
NASA Technical Reports Server (NTRS)
Chang, Jaw J. (Inventor); Yeh, Hen-Geul (Inventor)
1989-01-01
A method and apparatus for processing signals representative of a complex matrix/vector equation is disclosed and claimed. More particularly, signals representing an orderly sequence of the combined matrix and vector equation, known as a Kalman filter algorithm, is processed in real-time in accordance with the principles of this invention. The Kalman filter algorithm is converted into a Faddeev algorithm, which is a matrix-only algorithm. The Faddeev algorithm is modified to represent both the matrix and vector portions of the Kalman filter algorithm. The modified Faddeev algorithm is embodied into electrical signals which are applied as inputs to a systolic array processor, which performs triangulation and nullification on the input signals, and delivers an output signal to a real-time utilization circuit.
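The combined matrix/vector recursion that the patent maps onto a systolic array is the standard Kalman predict/update cycle. A minimal NumPy sketch of that cycle, applied to a hypothetical 1-D constant-velocity tracking model (this shows the equations being implemented, not the Faddeev triangulation/nullification scheme itself):

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    # Predict state and covariance forward one step.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with measurement z.
    S = H @ P @ H.T + R                     # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# Assumed toy model: state [position, velocity], position measured directly.
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[1.0]])
x, P = np.zeros(2), 10.0 * np.eye(2)
for t in range(1, 21):                      # object moving at unit velocity
    x, P = kalman_step(x, P, np.array([float(t)]), F, H, Q, R)
```

The Faddeev reformulation rewrites each such step as a single matrix operation (triangulation followed by nullification) so that the whole cycle streams through one systolic array in real time.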
An implementable algorithm for the optimal design centering, tolerancing, and tuning problem
Polak, E.
1982-05-01
An implementable master algorithm for solving optimal design centering, tolerancing, and tuning problems is presented. This master algorithm decomposes the original nondifferentiable optimization problem into a sequence of ordinary nonlinear programming problems. The master algorithm generates sequences with accumulation points that are feasible and satisfy a new optimality condition, which is shown to be stronger than the one previously used for these problems.
FPGA Implementation of Back Projection Algorithm for Radar Imaging (PREPRINT)
2014-10-09
of back projection algorithm compared to other beamforming algorithms. The raw data is generated using stepped-frequency continuous-wave radar. I... transmitter to target and back to receiver) is constant. The points that have the same TOA are on a hyperbola H with foci at the transmitter and receiver...an antenna array system of 4 transmitters and 4 receivers. There are several migration algorithms which can be used for through-the-barrier imaging
NASA Astrophysics Data System (ADS)
Klimesh, M.; Stanton, V.; Watola, D.
2000-10-01
We describe a hardware implementation of a state-of-the-art lossless image compression algorithm. The algorithm is based on the LOCO-I (low complexity lossless compression for images) algorithm developed by Weinberger, Seroussi, and Sapiro, with modifications to lower the implementation complexity. In this setup, the compression itself is performed entirely in hardware using a field programmable gate array and a small amount of random access memory. The compression speed achieved is 1.33 Mpixels/second. Our algorithm yields about 15 percent better compression than the Rice algorithm.
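The prediction stage of LOCO-I (later standardized as JPEG-LS) is the median edge detector: each pixel is predicted from its left, above, and above-left neighbors, switching between a horizontal-edge, vertical-edge, and planar guess. A sketch of that predictor (the simplest piece of the algorithm; the FPGA design also covers context modeling and Golomb coding, not shown here):

```python
def loco_i_predict(a, b, c):
    # LOCO-I median edge detector. a = left neighbor, b = above neighbor,
    # c = above-left neighbor. Picks min(a, b) or max(a, b) when an edge
    # is detected, otherwise the planar prediction a + b - c.
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c
```

The compressor then entropy-codes the small residuals between actual pixels and these predictions, which is where the roughly 15 percent advantage over the Rice algorithm comes from.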
Performance of new GPU-based scan-conversion algorithm implemented using OpenGL.
Steelman, William A; Richard, William D
2011-04-01
A new GPU-based scan-conversion algorithm implemented using OpenGL is described. The compute performance of this new algorithm running on a modern GPU is compared to the performance of three common scan-conversion algorithms (nearest-neighbor, linear interpolation and bilinear interpolation) implemented in software using a modern CPU. The quality of the images produced by the algorithm, as measured by signal-to-noise power, is also compared to the quality of the images produced using these three common scan-conversion algorithms.
Parallel implementation of the FETI-DPEM algorithm for general 3D EM simulations
NASA Astrophysics Data System (ADS)
Li, Yu-Jia; Jin, Jian-Ming
2009-05-01
A parallel implementation of the electromagnetic dual-primal finite element tearing and interconnecting algorithm (FETI-DPEM) is designed for general three-dimensional (3D) electromagnetic large-scale simulations. As a domain decomposition implementation of the finite element method, the FETI-DPEM algorithm provides fully decoupled subdomain problems and an excellent numerical scalability, and thus is well suited for parallel computation. The parallel implementation of the FETI-DPEM algorithm on a distributed-memory system using the message passing interface (MPI) is discussed in detail along with a few practical guidelines obtained from numerical experiments. Numerical examples are provided to demonstrate the efficiency of the parallel implementation.
Improvement and implementation for Canny edge detection algorithm
NASA Astrophysics Data System (ADS)
Yang, Tao; Qiu, Yue-hong
2015-07-01
Edge detection is necessary for image segmentation and pattern recognition. In this paper, an improved Canny edge detection approach is proposed to address the defects of the traditional algorithm. A modified bilateral filter with a compensation function based on pixel-intensity-similarity judgment is used to smooth the image instead of a Gaussian filter, which preserves edge features and removes noise effectively. To reduce sensitivity to noise in the gradient calculation, the algorithm uses gradient templates in four directions. Finally, the Otsu algorithm adaptively obtains the dual thresholds. The algorithm was simulated with the OpenCV 2.4.0 library in the VS2010 environment, and experimental analysis shows that the improved algorithm detects edge details more effectively and with greater adaptability.
Design and implementation of intelligent electronic warfare decision making algorithm
NASA Astrophysics Data System (ADS)
Peng, Hsin-Hsien; Chen, Chang-Kuo; Hsueh, Chi-Shun
2017-05-01
Electromagnetic signal density and the requirement for timely response have grown rapidly in modern electronic warfare. Although jammers are limited resources, the best electronic warfare efficiency can be achieved through tactical decisions. This paper proposes an intelligent electronic warfare decision support system. In this work, we develop a novel hybrid algorithm, Digital Pheromone Particle Swarm Optimization, based on Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), and the Shuffled Frog Leaping Algorithm (SFLA). We use PSO to solve the problem and combine the concept of pheromones from ACO to accumulate more useful information in the spatial solving process and speed up finding the optimal solution. The proposed algorithm finds the optimal solution in reasonable computation time by using the matrix-conversion method of SFLA. The results indicate that jammer allocation was more effective. The system based on the hybrid algorithm provides electronic warfare commanders with critical information to assist them in effectively managing the complex electromagnetic battlefield.
Alternative implementations of Monte Carlo EM algorithms for likelihood inferences
García-Cortés, Louis Alberto; Sorensen, Daniel
2001-01-01
Two methods of computing Monte Carlo estimators of variance components using restricted maximum likelihood via the expectation-maximisation algorithm are reviewed. A third approach is suggested and the performance of the methods is compared using simulated data. PMID:11559486
Efficient GPU implementation for Particle in Cell algorithm
Joseph, Rejith George; Ravunnikutty, Girish; Ranka, Sanjay; Klasky, Scott A
2011-01-01
The particle-in-cell (PIC) method is widely used in plasma physics to study the trajectories of charged particles under electromagnetic fields. The PIC algorithm is computationally intensive, and its time requirements are proportional to the number of charged particles involved in the simulation. The focus of this paper is to parallelize the PIC algorithm on a Graphics Processing Unit (GPU). We present several performance tradeoffs related to the small shared memory and atomic operations on the GPU to achieve high performance.
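The step of the PIC cycle that causes the GPU atomic-operation tradeoffs mentioned above is charge deposition: many particles scatter contributions onto shared grid cells. A serial sketch of the standard cloud-in-cell (linear) deposition on a periodic 1-D grid (illustrative only; the paper's contribution is how to do these concurrent updates efficiently on the GPU):

```python
def deposit_cic(positions, q, n_cells, dx):
    # Cloud-in-cell weighting: each particle of charge q is shared between
    # its two nearest grid points in proportion to proximity. On a GPU,
    # the two += updates below are exactly where atomics (or privatized
    # copies in shared memory) are needed.
    rho = [0.0] * n_cells
    for x in positions:
        cell = int(x / dx)
        frac = x / dx - cell
        rho[cell % n_cells] += q * (1.0 - frac) / dx
        rho[(cell + 1) % n_cells] += q * frac / dx
    return rho

rho = deposit_cic([0.5, 2.25, 7.9], q=1.0, n_cells=8, dx=1.0)
```

Linear weighting conserves charge exactly: integrating the deposited density over the grid recovers the total particle charge.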
Experimental implementation of an adiabatic quantum optimization algorithm
NASA Astrophysics Data System (ADS)
Steffen, Matthias; van Dam, Wim; Hogg, Tad; Breyta, Greg; Chuang, Isaac
2003-03-01
A novel quantum algorithm using adiabatic evolution was recently presented by Ed Farhi [1] and Tad Hogg [2]. This algorithm represents a remarkable discovery because it offers new insights into the usefulness of quantum resources. An experimental demonstration of an adiabatic algorithm has remained beyond reach because it requires an experimentally accessible Hamiltonian which encodes the problem and which must also be smoothly varied over time. We present tools to overcome these difficulties by discretizing the algorithm and extending average Hamiltonian techniques [3]. We used these techniques in the first experimental demonstration of an adiabatic optimization algorithm: solving an instance of the MAXCUT problem using three qubits and nuclear magnetic resonance techniques. We show that there exists an optimal run-time of the algorithm which can be predicted using a previously developed decoherence model. [1] E. Farhi et al., quant-ph/0001106 (2000) [2] T. Hogg, PRA, 61, 052311 (2000) [3] W. Rhim, A. Pines, J. Waugh, PRL, 24,218 (1970)
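For context, the MAXCUT instance solved adiabatically above has a simple classical baseline: exhaustively score every bipartition of the vertices, which is tractable for the three-qubit problem sizes demonstrated (and exponential in general, which is the motivation for quantum approaches). A sketch of that brute force, with an assumed edge-list representation:

```python
from itertools import product

def max_cut(n, edges):
    # Exhaustive MAXCUT: try all 2^n vertex bipartitions; an edge (u, v, w)
    # contributes its weight w when its endpoints land in different sets.
    best = 0
    for bits in product((0, 1), repeat=n):
        cut = sum(w for u, v, w in edges if bits[u] != bits[v])
        best = max(best, cut)
    return best
```

On a unit-weight triangle, any bipartition cuts at most two of the three edges, so the optimum is 2.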
NASA Astrophysics Data System (ADS)
Modgil, Dimple; La Rivière, Patrick J.
2009-07-01
Our goal is to compare and contrast various image reconstruction algorithms for optoacoustic tomography (OAT) assuming a finite linear aperture of the kind that arises when using a linear-array transducer. Because such transducers generally have tall, narrow elements, they are essentially insensitive to out-of-plane acoustic waves, and the usually 3-D OAT problem reduces to a 2-D problem. Algorithms developed for the 3-D problem may not perform optimally in 2-D. We have implemented and evaluated a number of previously described OAT algorithms, including an exact (in 3-D) Fourier-based algorithm and a synthetic-aperture-based algorithm. We have also implemented a 2-D algorithm developed by Norton for reflection-mode tomography that has not, to the best of our knowledge, been applied to OAT before. Our simulation studies of resolution, contrast, noise properties, and signal-detectability measures suggest that the algorithm based on Norton's approach has the best contrast, resolution, and signal detectability.
PDoublePop: An implementation of parallel genetic algorithm for function optimization
NASA Astrophysics Data System (ADS)
Tsoulos, Ioannis G.; Tzallas, Alexandros; Tsalikakis, Dimitris
2016-12-01
A software for the implementation of parallel genetic algorithms is presented in this article. The underlying genetic algorithm is aimed to locate the global minimum of a multidimensional function inside a rectangular hyperbox. The proposed software named PDoublePop implements a client-server model for parallel genetic algorithms with advanced features for the local genetic algorithms such as: an enhanced stopping rule, an advanced mutation scheme and periodical application of a local search procedure. The user may code the objective function either in C++ or in Fortran77. The method is tested on a series of well-known test functions and the results are reported.
Research and implementation of finger-vein recognition algorithm
NASA Astrophysics Data System (ADS)
Pang, Zengyao; Yang, Jie; Chen, Yilei; Liu, Yin
2017-06-01
In finger-vein image preprocessing, finger angle correction and ROI extraction are important parts of the system. In this paper, we propose an angle correction algorithm based on the centroid of the vein image, and extract the ROI according to a bidirectional gray-projection method. Inspired by the fact that features in vein areas have a valley-like appearance, a novel method is proposed to extract the center and width of veins based on multi-directional gradients; the method is easy to compute, fast, and stable. On this basis, an encoding method is designed to determine the gray-value distribution of the texture image. This algorithm effectively overcomes texture-edge extraction errors. Finally, the system achieves higher robustness and recognition accuracy by utilizing fuzzy threshold determination and a global gray-value matching algorithm. Experimental results on pairs of matched images show that the proposed method has an EER of 3.21% and extracts features at a speed of 27 ms per image. It can be concluded that the proposed algorithm has clear advantages in texture extraction efficiency, matching accuracy, and overall algorithmic efficiency.
Endgame implementations for the Efficient Global Optimization (EGO) algorithm
NASA Astrophysics Data System (ADS)
Southall, Hugh L.; O'Donnell, Teresa H.; Kaanta, Bryan
2009-05-01
Efficient Global Optimization (EGO) is a competent evolutionary algorithm that can be useful for problems with expensive cost functions [1,2,3,4,5]. The goal is to find the global minimum using as few function evaluations as possible. Our research indicates that EGO requires far fewer evaluations than genetic algorithms (GAs). However, neither algorithm always drills down to the absolute minimum, so the addition of a final local search technique is indicated. In this paper, we introduce three "endgame" techniques. These techniques can improve optimization efficiency (fewer cost function evaluations) and, if required, they can provide very accurate estimates of the global minimum. We also report results using a different cost function than the one previously used [2,3].
Algorithm and implementation of GPS/VRS network RTK
NASA Astrophysics Data System (ADS)
Gao, Chengfa; Yuan, Benyin; Ke, Fuyang; Pan, Shuguo
2009-06-01
This paper presents a virtual reference station (VRS) method and its application. Details of how to generate GPS virtual phase observations are discussed in depth. The developed algorithms were successfully applied to an independently developed network digital land investigation system. Experiments were carried out to investigate the system's performance; the results show that the algorithms have good availability and stability. The resulting accuracy of the VRS/RTK positioning was within +/-3.3 cm in the horizontal component and +/-7.9 cm in the vertical component, which meets the requirements of precise digital land investigation.
The design and implementation of MPI master-slave parallel genetic algorithm
NASA Astrophysics Data System (ADS)
Liu, Shuping; Cheng, Yanliu
2013-03-01
In this paper, an MPI master-slave parallel genetic algorithm is implemented by analyzing the basic genetic algorithm and parallel MPI programming, and by building a Linux cluster. The algorithm is tested on a function optimization problem (Rosenbrock's function), and the factors influencing the master-slave parallel genetic algorithm are derived from analysis of the test data. The experimental data show that a balanced hardware configuration and software design optimization can improve system performance when using master-slave parallel genetic algorithms in complex computing environments.
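The master-slave pattern the paper describes keeps the population and GA operators on the master, while slaves only evaluate fitness. The sketch below uses a thread pool to stand in for MPI slave ranks (in the paper's setting each evaluation would be shipped to a slave with MPI send/receive calls); the GA operators and all parameters are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
import random

def rosenbrock(x):
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
               for i in range(len(x) - 1))

def master_slave_ga(fitness, dim=2, pop_size=32, generations=150,
                    n_workers=4, seed=1):
    """Master holds the population; workers evaluate fitness in parallel
    (the scatter/gather phase that MPI distributes across slave processes)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-2.0, 2.0) for _ in range(dim)] for _ in range(pop_size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(generations):
            scores = list(pool.map(fitness, pop))          # parallel evaluation
            ranked = [p for _, p in sorted(zip(scores, pop), key=lambda t: t[0])]
            parents = ranked[: pop_size // 2]              # truncation selection
            pop = list(parents)
            while len(pop) < pop_size:
                a, b = rng.sample(parents, 2)
                pop.append([(x + y) / 2.0 + rng.gauss(0.0, 0.05)
                            for x, y in zip(a, b)])        # crossover + mutation
    return min(pop, key=fitness)

best = master_slave_ga(rosenbrock)
```

With real MPI, the communication cost of scattering individuals and gathering fitnesses is exactly the factor the paper's load-balancing analysis measures.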
Implementation of an efficient labeling algorithm on a pipelined architecture
NASA Astrophysics Data System (ADS)
Olsson, Olof J.; Penman, David W.
1992-11-01
This paper describes an efficient approach, developed by the authors, for labelling images using a combination of pipeline (Datacube) and host (general purpose computer) processing. The output of the algorithm is a coordinate list of labelled object pixels that facilitates further high level operations.
Implementations of back propagation algorithm in ecosystems applications
NASA Astrophysics Data System (ADS)
Ali, Khalda F.; Sulaiman, Riza; Elamir, Amir Mohamed
2015-05-01
Artificial Neural Networks (ANNs) have been applied to an increasing number of real-world problems of considerable complexity. Their most important advantage is in solving problems that are too complex for conventional technologies: problems that have no algorithmic solution, or whose algorithmic solution is too complex to be found. In general, ANNs are abstractions of the biological brain, developed from concepts that evolved from late-twentieth-century neurophysiological experiments on the cells of the human brain, and they can overcome perceived inadequacies of conventional ecological data analysis methods. ANNs have gained increasing attention in ecosystem applications because of their capacity to detect patterns in data through non-linear relationships, a characteristic that confers superior predictive ability. In this research, ANNs are applied to an ecological system analysis. The networks use the well-known Back Propagation (BP) algorithm with the delta rule for adaptation. BP uses supervised learning: the algorithm is provided with examples of the inputs and outputs the network should compute, and the error between the network's output and the target is calculated. The idea of the back propagation algorithm is to reduce this error until the network learns the training data; training begins with random weights, and the goal is to adjust them so that the error is minimal. This research evaluated the use of ANN techniques in ecological system analysis and modeling. The experimental results demonstrate that an artificial neural network system can be trained to act as an expert
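The training loop described above can be sketched in a few dozen lines. This is a minimal sigmoid network trained by plain backpropagation with the delta rule on the XOR problem; the architecture, learning rate, and toy dataset are illustrative assumptions, not the paper's ecological data.

```python
import math
import random

def train_xor(hidden=4, lr=0.5, epochs=5000, seed=42):
    """Tiny 2-input, one-hidden-layer sigmoid network trained with
    backpropagation (delta rule) on XOR. Returns per-epoch MSE losses
    and the trained forward function."""
    rng = random.Random(seed)
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    W1 = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    b2 = 0.0
    data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
            ([1.0, 0.0], 1.0), ([1.1 - 0.1, 1.0], 0.0)]

    def forward(x):
        h = [sig(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(W1, b1)]
        y = sig(sum(w * v for w, v in zip(W2, h)) + b2)
        return h, y

    losses = []
    for _ in range(epochs):
        err = 0.0
        for x, t in data:
            h, y = forward(x)
            err += (y - t) ** 2
            dy = (y - t) * y * (1.0 - y)               # output-layer delta
            for j in range(hidden):
                dh = dy * W2[j] * h[j] * (1.0 - h[j])  # hidden delta (pre-update W2)
                W2[j] -= lr * dy * h[j]
                for i in range(2):
                    W1[j][i] -= lr * dh * x[i]
                b1[j] -= lr * dh
            b2 -= lr * dy
        losses.append(err / len(data))
    return losses, forward

losses, net = train_xor()
```

The loop reduces the squared error exactly as the abstract describes: random initial weights, repeated forward passes, and gradient (delta-rule) weight updates until the training data are learned.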
A block-wise approximate parallel implementation for ART algorithm on CUDA-enabled GPU.
Fan, Zhongyin; Xie, Yaoqin
2015-01-01
Computed tomography (CT) has been widely used to acquire volumetric anatomical information in the diagnosis and treatment of illnesses in many clinics. However, the algebraic reconstruction technique (ART) for reconstruction from under-sampled and noisy projections is still time-consuming. The goal of our work is to improve a block-wise approximate parallel implementation of the ART algorithm on a CUDA-enabled GPU to make ART applicable to the clinical environment. The resulting method has several compelling features: (1) the rays are allotted into blocks, making the rays in the same block parallel; (2) the GPU implementation caters to actual industrial and medical application demands. We test the algorithm on a digital Shepp-Logan phantom, and the results indicate that our method is more efficient than the existing CPU implementation. The high computational efficiency achieved by our algorithm makes it possible for clinicians to obtain real-time 3D images.
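ART's inner loop is a Kaczmarz-style update: the current image estimate is projected onto the hyperplane of one ray equation at a time, and rays that share no pixels are independent, which is what the block-wise GPU scheme exploits. A serial sketch on a toy consistent system (the matrix, phantom, and sweep count are illustrative, not the paper's setup):

```python
import numpy as np

def art_reconstruct(A, b, sweeps=50, relax=1.0):
    """Classic ART/Kaczmarz: for each ray (row a_i of the system matrix),
    project the estimate x onto the hyperplane a_i . x = b_i."""
    x = np.zeros(A.shape[1])
    for _ in range(sweeps):
        for a_i, b_i in zip(A, b):                 # one ray at a time
            x = x + relax * (b_i - a_i @ x) / (a_i @ a_i) * a_i
    return x

# Toy "projection" system with a known 2-pixel image.
A = np.array([[1.0, 1.0],
              [1.0, -1.0],
              [2.0, 1.0]])
x_true = np.array([0.3, 0.7])
x_rec = art_reconstruct(A, A @ x_true)
```

The GPU version updates all rays of a block simultaneously (an approximation, since rays in a block are treated as independent), trading a small accuracy loss per sweep for massive parallelism.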
An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices, part 2
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Nachtigal, Noel M.
1990-01-01
It is shown how the look-ahead Lanczos process (combined with a quasi-minimal residual (QMR) approach) can be used to develop a robust black-box solver for large sparse non-Hermitian linear systems. Details of an implementation of the resulting QMR algorithm are presented. It is demonstrated that the QMR method is closely related to the biconjugate gradient (BCG) algorithm; however, unlike BCG, the QMR algorithm has smooth convergence curves and good numerical properties. We report numerical experiments with our implementation of the look-ahead Lanczos algorithm, both for eigenvalue problems and for linear systems. Program listings of FORTRAN implementations of the look-ahead algorithm and the QMR method are also included.
Hasan, Laiq; Al-Ars, Zaid
2009-01-01
In this paper, we present an efficient and high-performance linear recursive variable expansion (RVE) implementation of the Smith-Waterman (S-W) algorithm and compare it with a traditional linear systolic array implementation. The results demonstrate that the linear RVE implementation performs up to 2.33 times better than the traditional linear systolic array implementation, at the cost of utilizing twice as many resources.
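Both the systolic array and the RVE hardware compute the same underlying recurrence. For reference, this is the standard Smith-Waterman local-alignment dynamic program with linear gap penalties (the scoring parameters are common textbook defaults, not necessarily the paper's):

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """O(len(a) * len(b)) Smith-Waterman DP; returns the best local
    alignment score. Anti-diagonals of H are independent, which is what
    systolic/RVE hardware evaluates in parallel."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag,            # local alignment floors at 0
                          H[i - 1][j] + gap,  # gap in b
                          H[i][j - 1] + gap)  # gap in a
            best = max(best, H[i][j])
    return best
```

For example, `smith_waterman("GATTACA", "TTAC")` finds the exact local match "TTAC" (four matches at +2 each).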
Comparative study of fusion algorithms and implementation of new efficient solution
NASA Astrophysics Data System (ADS)
Besrour, Amine; Snoussi, Hichem; Siala, Mohamed; Abdelkefi, Fatma
2014-05-01
High Dynamic Range (HDR) imaging has been the subject of significant research over the past years, yet the goal of acquiring cinema-quality HDR images of fast-moving scenes with an efficient merging algorithm has not been achieved. Over the years many merging algorithms have been implemented and developed, but they do not handle all situations and are not fast enough to ensure rapid HDR image reconstruction. In this paper, we present a full comparative analysis and study of the available fusion algorithms. We also present our own algorithm, which is more optimized and faster than the existing ones. This merging algorithm is tied to our hardware solution, which allows us to obtain four pictures with different exposures.
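A common baseline for the kind of merging the paper surveys is a weighted average of per-frame radiance estimates: each pixel of each exposure is divided by its exposure time, and frames are blended with a hat-shaped weight that trusts mid-range pixels and down-weights under- and over-exposed ones. This is a generic sketch of that baseline (assuming a linear 8-bit response), not the authors' algorithm:

```python
import numpy as np

def merge_exposures(images, exposure_times):
    """Merge differently exposed linear 8-bit frames into a radiance map:
    weighted mean of (pixel / exposure_time) with a hat weight that is
    1 at mid-gray and 0 at the clipped extremes."""
    images = [np.asarray(im, dtype=float) for im in images]
    num = np.zeros_like(images[0])
    den = np.zeros_like(images[0])
    for im, t in zip(images, exposure_times):
        w = 1.0 - 2.0 * np.abs(im / 255.0 - 0.5)   # hat weight
        num += w * (im / t)                        # this frame's radiance estimate
        den += w
    return num / np.maximum(den, 1e-12)            # avoid divide-by-zero

# Two frames of a uniform scene, the second exposed twice as long.
frames = [np.full((2, 2), 100.0), np.full((2, 2), 200.0)]
hdr = merge_exposures(frames, [1.0, 2.0])
```

The hardware solution described in the paper supplies four such frames per shot; a real pipeline would also handle a nonlinear camera response and motion between frames.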
Design and implementation of a multi-sensor fusion algorithm on a hypercube computer architecture
Glover, C.W.
1989-01-01
A multi-sensor integration (MSI) algorithm written for sequential single processor computer architecture has been transformed into a concurrent algorithm and implemented in parallel on a multi-processor hypercube computer architecture. This paper will present the philosophy and methodologies used in the decomposition of the sequential MSI algorithm, and its transformation into a parallel MSI algorithm. The parallel MSI algorithm was implemented on a NCUBE{trademark} hypercube computer. The performance of the parallel MSI algorithm has been measured and compared against its sequential counterpart by running test case scenarios through a simulation program. The simulation program allows the user to define the trajectories of all players in the scenarios, and to pick the sensor suites of the players and their operating characteristics.
Constant-time parallel sorting algorithm and its optical implementation using smart pixels.
Louri, A; Hatch, J A; Na, J
1995-06-10
Sorting is a fundamental operation that has important implications in a vast number of areas. For instance, sorting is heavily utilized in applications such as database machines, in which hashing techniques are used to accelerate data-processing algorithms. It is also the basis for interprocessor message routing and has strong implications in video telecommunications. However, high-speed electronic sorting networks are difficult to implement with VLSI technology because of the dense, global connectivity required. Optics eliminates this bottleneck by offering global interconnects, massive parallelism, and noninterfering communications. We present a parallel sorting algorithm and its efficient optical implementation. The algorithm sorts n data elements in a fixed number of steps, independent of the number of elements to be sorted; thus it is a constant-time sorting algorithm [i.e., O(1) time]. We also estimate the system's performance to show that the proposed sorting algorithm can provide at least 2 orders of magnitude improvement in execution time over conventional electronic algorithms.
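Constant-time optical sorts of this kind are typically built on enumeration (rank) sorting: every pair of elements is compared at once, each element's rank is the count of comparisons it wins, and the rank is its output position. A serial sketch of the same logic (the optical system evaluates all n^2 comparisons in one parallel step; here they are a loop):

```python
def rank_sort(data):
    """Enumeration sort: an element's final position equals the number of
    elements smaller than it, with ties broken by original index."""
    n = len(data)
    out = [None] * n
    for i, x in enumerate(data):
        # Each (i, j) comparison is independent of all others, which is
        # what a smart-pixel array can compute in a single step.
        rank = sum(1 for j, y in enumerate(data) if y < x or (y == x and j < i))
        out[rank] = x
    return out
```

The index-based tie-break guarantees every rank is distinct, so each output slot is written exactly once.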
Holographic implementation of a learning machine based on a multicategory perceptron algorithm.
Paek, E G; Wullert, J R, II; Patel, J S
1989-12-01
An optical learning machine that has multicategory classification capability is demonstrated. The system exactly implements the single-layer perceptron algorithm and is fully parallel and analog. Experimental results on learning by example, obtained from the system, are described.
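The single-layer multicategory perceptron the system implements keeps one weight vector per class, predicts the argmax of the class scores, and on a mistake reinforces the true class while penalizing the predicted one. A minimal software sketch (the toy dataset and epoch count are illustrative assumptions):

```python
def train_multiclass_perceptron(samples, labels, n_classes, epochs=100):
    """Multicategory perceptron: predict argmax_c W[c] . [x, 1] and apply
    mistake-driven updates to the true and predicted classes."""
    dim = len(samples[0]) + 1                     # +1 for the bias input
    W = [[0.0] * dim for _ in range(n_classes)]

    def score(w, x):
        return sum(wi * xi for wi, xi in zip(w, x + [1.0]))

    def predict(x):
        return max(range(n_classes), key=lambda c: score(W[c], x))

    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = predict(x)
            if p != y:
                for i, xi in enumerate(x + [1.0]):
                    W[y][i] += xi                 # reinforce correct class
                    W[p][i] -= xi                 # penalize predicted class
    return W, predict

# Three linearly separable clusters in the plane (hypothetical data).
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
     [8.0, 0.0], [9.0, 1.0], [8.0, 1.0],
     [0.0, 8.0], [1.0, 9.0], [0.0, 9.0]]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]
W, predict = train_multiclass_perceptron(X, y, 3)
```

In the optical system the weight vectors live in analog holograms and the argmax is taken over parallel optical inner products; the update rule is the same.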
2014-07-31
Keywords: parallel cyclic convolution, parallel circular correlators, parallel one-dimensional DFT, SDR GPS. [Report body garbled in extraction; recoverable content: a MATLAB implementation of the MIT-developed Quicksynch algorithm for fast circular correlation in GPS SDR systems, based on the sparsity of the signal, evaluated alongside the proposed parallel correlator constructs.]
Implementation of Real-Time Feedback Flow Control Algorithms on a Canonical Testbed
NASA Technical Reports Server (NTRS)
Tian, Ye; Song, Qi; Cattafesta, Louis
2005-01-01
This report summarizes the activities on "Implementation of Real-Time Feedback Flow Control Algorithms on a Canonical Testbed." The work consists primarily of two parts. The first part summarizes our previous work and the extensions to adaptive identification and control algorithms. The second part concentrates on validating the adaptive algorithms by applying them to a vibrating-beam testbed. Extensions to flow control problems are discussed.
A Fast Implementation of the ISODATA Clustering Algorithm
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.; Netanyahu, Nathan S.; LeMoigne, Jacqueline
2005-01-01
Clustering is central to many image processing and remote sensing applications. ISODATA is one of the most popular and widely used clustering methods in geoscience applications, but it can run slowly, particularly with large data sets. We present a more efficient approach to ISODATA clustering, which achieves better running times by storing the points in a kd-tree and through a modification of the way in which the algorithm estimates the dispersion of each cluster. We also present an approximate version of the algorithm which allows the user to further improve the running time, at the expense of lower fidelity in computing the nearest cluster center to each point. We provide both theoretical and empirical justification that our modified approach produces clusterings that are very similar to those produced by the standard ISODATA approach. We also provide empirical studies on both synthetic data and remotely sensed Landsat and MODIS images that show that our approach has significantly lower running times.
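The part of ISODATA that dominates the running time, and that the kd-tree accelerates, is the repeated nearest-center assignment plus the per-cluster dispersion estimate. A simplified sketch of that core iteration (the split/merge heuristics of full ISODATA, and the paper's kd-tree filtering, are omitted; data and initial centers are hypothetical):

```python
import numpy as np

def isodata_core(points, init_centers, iters=10):
    """Assign points to nearest centers, recompute centers, and report
    each cluster's dispersion (mean distance to its center)."""
    points = np.asarray(points, dtype=float)
    centers = np.asarray(init_centers, dtype=float)
    for _ in range(iters):
        # Brute-force nearest-center search: the step a kd-tree speeds up.
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in range(len(centers)):
            members = points[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    disp = [float(np.linalg.norm(points[labels == k] - centers[k], axis=1).mean())
            for k in range(len(centers))]
    return centers, labels, disp

pts = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
       [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centers, labels, disp = isodata_core(pts, [[2.0, 2.0], [8.0, 8.0]])
```

Full ISODATA would then split clusters whose dispersion exceeds a threshold and merge centers that fall too close together.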
NASA Technical Reports Server (NTRS)
Delaat, J. C.; Merrill, W. C.
1983-01-01
A sensor failure detection, isolation, and accommodation algorithm was developed which incorporates analytic sensor redundancy through software. This algorithm was implemented in a high level language on a microprocessor based controls computer. Parallel processing and state-of-the-art 16-bit microprocessors are used along with efficient programming practices to achieve real-time operation.
Implementing the Continued Fraction Algorithm on the Illiac IV.
1980-01-01
From the research: "On Computing Unitary Aliquot Sequences", with R.K. Guy, Proceedings of the Tenth Manitoba Conference on Numerical Mathematics, 1979. [Remainder of the report front matter (Northern Illinois University, Dept. of Mathematical Sciences, M. Wunderlich, contract F49620-79-C-0199) garbled in extraction.]
Investigation and Implementation of Matrix Permanent Algorithms for Identity Resolution
2014-12-01
Awareness (MSA) in both tactical and operational settings. Resolving the identities of unknown targets often demands significant resources, and thus it is [...] characterization, particularly in regard to their suitability for MSA applications. The present work seeks to clarify the computational options available and [...] those algorithms most suited for MSA.
A Novel Implementation of Efficient Algorithms for Quantum Circuit Synthesis
NASA Astrophysics Data System (ADS)
Zeller, Luke
In this project, we design and develop a computer program to effectively approximate arbitrary quantum gates using the discrete set of Clifford gates together with the T gate (π/8 gate). Employing recent results from Mosca et al. and Giles and Selinger, we implement a decomposition scheme that outputs a sequence of Clifford, T, and T† gates that approximate the input to within a specified error bound ε. Specifically, the given gate is first rounded to an element of Z[1/√2, i] with a precision determined by ε, and then exact synthesis is employed to produce the resulting gate. It is known that this procedure is optimal for approximating an arbitrary single-qubit gate. Our program, written in Matlab and Python, can perform both approximate and exact synthesis of single-qubit gates. It can be used to assist in the experimental implementation of an arbitrary fault-tolerant single-qubit gate, for which direct implementation is not feasible.
Demonstration of a small programmable quantum computer with atomic qubits
NASA Astrophysics Data System (ADS)
Debnath, S.; Linke, N. M.; Figgatt, C.; Landsman, K. A.; Wright, K.; Monroe, C.
2016-08-01
Quantum computers can solve certain problems more efficiently than any possible conventional computer. Small quantum algorithms have been demonstrated on multiple quantum computing platforms, many specifically tailored in hardware to implement a particular algorithm or execute a limited number of computational paths. Here we demonstrate a five-qubit trapped-ion quantum computer that can be programmed in software to implement arbitrary quantum algorithms by executing any sequence of universal quantum logic gates. We compile algorithms into a fully connected set of gate operations that are native to the hardware and have a mean fidelity of 98 per cent. Reconfiguring these gate sequences provides the flexibility to implement a variety of algorithms without altering the hardware. As examples, we implement the Deutsch-Jozsa and Bernstein-Vazirani algorithms with average success rates of 95 and 90 per cent, respectively. We also perform a coherent quantum Fourier transform on five trapped-ion qubits for phase estimation and period finding with average fidelities of 62 and 84 per cent, respectively. This small quantum computer can be scaled to larger numbers of qubits within a single register, and can be further expanded by connecting several such modules through ion shuttling or photonic quantum channels.
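The Deutsch-Jozsa demonstration reported above can be simulated classically for small registers. The sketch below tracks the n-qubit state as a vector of 2^n amplitudes and applies the oracle as a phase kickback (-1)^f(x): after the final Hadamard layer, the amplitude of |0...0> is ±1 if f is constant and exactly 0 if f is balanced. This is a state-vector simulation for illustration, not the trapped-ion gate compilation used in the experiment.

```python
import numpy as np

def deutsch_jozsa(f, n):
    """Decide constant vs. balanced f: {0,...,2^n - 1} -> {0,1}
    with one (simulated) oracle query. Returns True iff f is constant."""
    H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    Hn = H
    for _ in range(n - 1):
        Hn = np.kron(Hn, H)                      # n-qubit Hadamard
    state = np.zeros(2 ** n)
    state[0] = 1.0                               # |0...0>
    state = Hn @ state                           # uniform superposition
    phases = np.array([(-1.0) ** f(x) for x in range(2 ** n)])
    state = Hn @ (phases * state)                # phase oracle, then interfere
    return bool(abs(state[0]) > 0.5)             # amplitude is +-1 iff constant

is_const = deutsch_jozsa(lambda x: 1, 3)         # constant oracle
is_bal = deutsch_jozsa(lambda x: x & 1, 3)       # balanced oracle
```

On hardware, the same decision is read out by measuring all input qubits and checking for the all-zeros outcome.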
High-performance spectral element algorithms and implementations.
Fischer, P. F.; Tufo, H. M.
1999-08-28
We describe the development and implementation of a spectral element code for multimillion-gridpoint simulations of incompressible flows in general two- and three-dimensional domains. Parallel performance is presented for up to 2048 nodes of the Intel ASCI-Red machine at Sandia.
Design and Implementation of VLSI Prime Factor Algorithm Processor.
1987-12-01
for A, 1i ’ 1 1 Fh Ai ,,r equjAtIInI. art, ( , 4 10 4,t’ 4 ( - /’ cr tht, (-arr% sur ma% akL, be represented as Figure 36 Carry Select Adder Blocking... Select Adder Blocking .......................................................... 81 Figure 37: ALU Adder Cell...ALU Logic Implementation............................................................ 81 viii J,.. in List of Figures (continued) Figure 36: Carry
Motion Cueing Algorithm Development: New Motion Cueing Program Implementation and Tuning
NASA Technical Reports Server (NTRS)
Houck, Jacob A. (Technical Monitor); Telban, Robert J.; Cardullo, Frank M.; Kelly, Lon C.
2005-01-01
A computer program has been developed for the purpose of driving the NASA Langley Research Center Visual Motion Simulator (VMS). This program includes two new motion cueing algorithms, the optimal algorithm and the nonlinear algorithm. A general description of the program is given along with a description and flowcharts for each cueing algorithm, and also descriptions and flowcharts for subroutines used with the algorithms. Common block variable listings and a program listing are also provided. The new cueing algorithms have a nonlinear gain algorithm implemented that scales each aircraft degree-of-freedom input with a third-order polynomial. A description of the nonlinear gain algorithm is given along with past tuning experience and procedures for tuning the gain coefficient sets for each degree-of-freedom to produce the desired piloted performance. This algorithm tuning will be needed when the nonlinear motion cueing algorithm is implemented on a new motion system in the Cockpit Motion Facility (CMF) at the NASA Langley Research Center.
FPGA implementation of vision algorithms for small autonomous robots
NASA Astrophysics Data System (ADS)
Anderson, J. D.; Lee, D. J.; Archibald, J. K.
2005-10-01
The use of on-board vision with small autonomous robots has been made possible by advances in the field of Field Programmable Gate Array (FPGA) technology. By connecting a CMOS camera to an FPGA board, on-board vision has been used to reduce the computation time inherent in vision algorithms. The FPGA board allows the user to create custom hardware in a faster, safer, and more easily verifiable manner that decreases the computation time and allows the vision to be done in real time. Real-time vision tasks for small autonomous robots include object tracking, obstacle detection and avoidance, and path planning. Competitions were created to demonstrate that our algorithms work with our small autonomous vehicles in dealing with these problems. These competitions include Mouse-Trapped-in-a-Box, where the robot has to detect the edges of a box that it is trapped in and move towards them without touching them; Obstacle Avoidance, where an obstacle is placed at an arbitrary point in front of the robot and the robot has to navigate itself around the obstacle; Canyon Following, where the robot has to move to the center of a canyon and follow the canyon walls while trying to stay in the center; the Grand Challenge, where the robot has to navigate a hallway and return to its original position in a given amount of time; and Stereo Vision, where a separate robot has to catch tennis balls launched from an air-powered cannon. Teams competed in each of these events, which were designed for a graduate-level robotic vision class, and each team had to develop its own algorithm and hardware components. This paper discusses one team's approach to each of these problems.
[Osteoarthrosis: implementation of current diagnostic and therapeutic algorithms].
Meza-Reyes, Gilberto; Aldrete-Velasco, Jorge; Espinosa-Morales, Rolando; Torres-Roldán, Fernando; Díaz-Borjón, Alejandro; Robles-San Román, Manuel
2017-01-01
In the modern world, among the different clinical presentations of osteoarthritis, gonarthrosis and coxarthrosis exhibit the highest prevalence. In this paper, the characteristics of osteoarthritis and the different scales for assessment and classification of this pathology are presented, providing an overview of current evidence on diagnostic and treatment algorithms for osteoarthritis, with emphasis on the knee and hip, as these are the most frequently affected sites; a rational procedure for monitoring patients with osteoarthritis, based on characteristic symptoms and the severity of the condition, is also outlined. Finally, reference is made to the therapeutic benefits of the recent introduction of viscosupplementation with Hylan GF-20.
Design methodology for optimal hardware implementation of wavelet transform domain algorithms
NASA Astrophysics Data System (ADS)
Johnson-Bey, Charles; Mickens, Lisa P.
2005-05-01
The work presented in this paper lays the foundation for the development of an end-to-end system design methodology for implementing wavelet domain image/video processing algorithms in hardware using Xilinx field programmable gate arrays (FPGAs). With the integration of the Xilinx System Generator toolbox, this methodology will allow algorithm developers to design and implement their code using the familiar MATLAB/Simulink development environment. By using this methodology, algorithm developers will not be required to become proficient in the intricacies of hardware design, thus reducing the design cycle and time-to-market.
A Linac Simulation Code for Macro-Particles Tracking and Steering Algorithm Implementation
Sun, Yipeng
2012-05-03
In this paper, a linac simulation code written in Fortran90 is presented and several simulation examples are given. The code is optimized to implement linac alignment and steering algorithms and to evaluate accelerator errors such as RF phase and acceleration-gradient errors and quadrupole and BPM misalignments. It can track a single particle or a bunch of particles through standard linear accelerator elements such as quadrupoles, RF cavities, dipole correctors, and drift spaces. A one-to-one steering algorithm and a global alignment (steering) algorithm are implemented in this code.
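One-to-one steering pairs each corrector with a downstream BPM and chooses the kicks that zero the measured orbit. With a known response matrix (the change in each BPM reading per unit corrector kick) and as many correctors as BPMs, this reduces to one linear solve. The sketch below uses a hypothetical 2x2 response matrix purely for illustration, not data from the paper's code:

```python
import numpy as np

def one_to_one_steering(response, orbit):
    """Corrector kicks that cancel the measured orbit:
    solve response @ kicks = -orbit (square response matrix assumed)."""
    return np.linalg.solve(response, -np.asarray(orbit, dtype=float))

# Hypothetical response matrix: corrector j moves BPM i by R[i, j] per unit kick.
R = np.array([[1.0, 0.0],
              [0.5, 1.0]])
orbit = np.array([2.0, 3.0])      # measured BPM readings
kicks = one_to_one_steering(R, orbit)
```

Global steering generalizes this to a rectangular response matrix solved in the least-squares sense over all BPMs and correctors at once.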
Clinical implementation and evaluation of the Acuros dose calculation algorithm.
Yan, Chenyu; Combine, Anthony G; Bednarz, Greg; Lalonde, Ronald J; Hu, Bin; Dickens, Kathy; Wynn, Raymond; Pavord, Daniel C; Saiful Huq, M
2017-09-01
The main aim of this study is to validate the Acuros XB dose calculation algorithm for a Varian Clinac iX linac in our clinics, and subsequently compare it with the widely used AAA algorithm. The source models for both Acuros XB and AAA were configured by importing the same measured beam data into the Eclipse treatment planning system. Both algorithms were validated by comparing calculated dose with measured dose on a homogeneous water phantom for field sizes ranging from 6 cm × 6 cm to 40 cm × 40 cm. Central-axis and off-axis points at different depths were chosen for the comparison. In addition, the accuracy of Acuros was evaluated for wedge fields with wedge angles from 15 to 60°. Similarly, variable field sizes on an inhomogeneous phantom were chosen to validate the Acuros algorithm. In addition, doses calculated by Acuros and AAA at the center of lung-equivalent tissue from three different VMAT plans were compared to ion chamber measured doses in the QUASAR phantom, and the dose distributions calculated by the two algorithms, and their differences, were compared on patients. Computation time on VMAT plans was also evaluated for Acuros and AAA. Differences between dose-to-water (calculated by AAA and Acuros XB) and dose-to-medium (calculated by Acuros XB) on patient plans were compared and evaluated. For open 6 MV photon beams on the homogeneous water phantom, both Acuros XB and AAA calculations were within 1% of measurements. For 23 MV photon beams, the calculated doses were within 1.5% of measured doses for Acuros XB and 2% for AAA. Testing on the inhomogeneous phantom demonstrated that AAA overestimated doses by up to 8.96% at a point close to the lung/solid-water interface, while Acuros XB reduced that to 1.64%. The test on the QUASAR phantom showed that Acuros achieved better agreement in lung-equivalent tissue, while AAA underestimated dose for all VMAT plans by up to 2.7%. Acuros XB computation time was about three times faster than AAA for VMAT plans.
Implementation of an institution-wide acute stroke algorithm: Improving stroke quality metrics
Zuckerman, Scott L.; Magarik, Jordan A.; Espaillat, Kiersten B.; Kumar, Nishant Ganesh; Bhatia, Ritwik; Dewan, Michael C.; Morone, Peter J.; Hermann, Lisa D.; O’Duffy, Anne E.; Riebau, Derek A.; Kirshner, Howard S.; Mocco, J.
2016-01-01
Background: In May 2012, an updated stroke algorithm was implemented at Vanderbilt University Medical Center. The current study objectives were to: (1) describe the process of implementing a new stroke algorithm and (2) compare pre- and post-algorithm quality improvement (QI) metrics, specifically door-to-computed-tomography time (DTCT), door-to-neurology time (DTN), and door-to-tPA-administration time (DTT). Methods: Our institutional stroke algorithm underwent extensive revision, with a focus on removing variability, streamlining care, and reducing time delays. The updated stroke algorithm was implemented in May 2012. Three primary stroke QI metrics were evaluated over four separate 3-month periods, one pre- and three post-algorithm. Results: The following data points improved after algorithm implementation: average DTCT decreased from 39.9 to 12.8 min (P < 0.001); average DTN decreased from 34.1 to 8.2 min (P ≤ 0.001), and average DTT decreased from 62.5 to 43.5 min (P = 0.17). Conclusion: A new stroke protocol that prioritized neurointervention at our institution resulted in significant lowering of the DTCT and DTN, with a nonsignificant improvement in DTT. PMID:28144480
Implementation and evaluation of ILLIAC 4 algorithms for multispectral image processing
NASA Technical Reports Server (NTRS)
Swain, P. H.
1974-01-01
Data concerning a multidisciplinary and multi-organizational effort to implement multispectral data analysis algorithms on a revolutionary computer, the Illiac 4, are reported. The effectiveness and efficiency of implementing the digital multispectral data analysis techniques for producing useful land use classifications from satellite collected data were demonstrated.
Implementation of the SITAN algorithm in the digital terrain management and display system
Cambron, T.M.; Snyder, F.B.; Fellerhoff, J.R.
1985-01-01
This paper describes the functional methodologies and development processes used to integrate the SITAN autonomous navigation algorithm with the Digital Terrain Management and Display System (DTMDS) for real-time demonstrations aboard the AFTI/F-16 aircraft. Heretofore, the SITAN algorithm has not been implemented for real-time operation aboard a military aircraft. The paper describes the implementation design of the DTMDS and how the elevation data base supported by the digital map generator subsystem is made available to the SITAN algorithm. The effects of aircraft motion as related to SITAN algorithm timing and processor loading are evaluated. The closed-loop implementation with the AFTI aircraft inertial navigation system (INS) was specifically selected for the initial demonstration.
A GPU-paralleled implementation of an enhanced face recognition algorithm
NASA Astrophysics Data System (ADS)
Chen, Hao; Liu, Xiyang; Shao, Shuai; Zan, Jiguo
2013-03-01
Face recognition based on compressed sensing and sparse representation has been actively studied in recent years. This scheme improves both recognition rate and noise robustness. However, its computational cost is high and has become a main restricting factor for real-world applications. In this paper, we introduce a GPU-accelerated hybrid variant of the face recognition algorithm named parallel face recognition algorithm (pFRA). We describe how to carry out the parallel optimization design to take full advantage of the many-core structure of a GPU. The pFRA is tested and compared with several other implementations under different data sample sizes. Our pFRA, implemented with an NVIDIA GPU and the Compute Unified Device Architecture (CUDA) programming model, achieves a significant speedup over traditional CPU implementations.
Design and FPGA implementation of real-time automatic image enhancement algorithm
NASA Astrophysics Data System (ADS)
Dong, GuoWei; Hou, ZuoXun; Tang, Qi; Pan, Zheng; Li, Xin
2016-11-01
In order to improve image processing quality and boost processing rate, this paper proposes a real-time automatic image enhancement algorithm. It is based on the histogram equalization algorithm and the piecewise linear enhancement algorithm: it derives the relationship between the histogram and the piecewise linear function by analyzing the histogram distribution for adaptive image enhancement. Furthermore, the corresponding FPGA processing modules are designed to implement the methods. In particular, high-performance parallel pipelining and the inner parallel processing ability of the modules are exploited to ensure the real-time processing capability of the complete system. Simulations and experiments show that the FPGA-based design has low hardware cost, high real-time performance, and good processing performance in different scenes. The algorithm can effectively improve image quality and has wide prospects in the image processing field.
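The abstract above combines histogram equalization with piecewise linear stretching. As a minimal software illustration of the histogram-equalization half only (a sketch, not the paper's FPGA design), the classic CDF-remapping step can be written as:

```python
def equalize(img, levels=256):
    """Histogram equalization of a grayscale image given as a 2D list."""
    flat = [v for row in img for v in row]
    hist = [0] * levels
    for v in flat:
        hist[v] += 1
    cdf, acc = [], 0
    for h in hist:                 # cumulative distribution of gray levels
        acc += h
        cdf.append(acc)
    cdf_min = min(c for c in cdf if c > 0)
    n = len(flat)
    # lookup table mapping each gray level to its equalized value
    lut = [round((c - cdf_min) / (n - cdf_min) * (levels - 1)) for c in cdf]
    return [[lut[v] for v in row] for row in img]

# a low-contrast 2x2 test image is stretched over the full dynamic range
out = equalize([[100, 101], [102, 103]])
```

A hardware version would replace the lookup-table pass with pipelined histogram and mapping modules operating on the pixel stream.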
de Paula, Lauro C. M.; Soares, Anderson S.; de Lima, Telma W.; Delbem, Alexandre C. B.; Coelho, Clarimar J.; Filho, Arlindo R. G.
2014-01-01
Several variable selection algorithms in multivariate calibration can be accelerated using Graphics Processing Units (GPU). Among these algorithms, the Firefly Algorithm (FA) is a recently proposed metaheuristic that may be used for variable selection. This paper presents a GPU-based FA (FA-MLR) with multiobjective formulation for variable selection in multivariate calibration problems and compares it with some traditional sequential algorithms in the literature. The advantage of the proposed implementation is demonstrated in an example involving a relatively large number of variables. The results showed that FA-MLR, in comparison with the traditional algorithms, is a more suitable choice and a relevant contribution for the variable selection problem. Additionally, the results also demonstrated that FA-MLR executed on a GPU can be five times faster than its sequential implementation. PMID:25493625
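For readers unfamiliar with the metaheuristic, a generic firefly algorithm minimizing a toy objective can be sketched as follows. This is not the paper's GPU-parallel, multiobjective FA-MLR; the objective and all parameter values are illustrative assumptions:

```python
import math, random

random.seed(0)

def sphere(x):                     # toy objective to minimize
    return sum(v * v for v in x)

n_fireflies, dim, iters = 8, 2, 40
beta0, gamma, alpha = 1.0, 1.0, 0.2      # attractiveness, absorption, random step

pop = [[random.uniform(-4, 4) for _ in range(dim)] for _ in range(n_fireflies)]
best = min(pop, key=sphere)
best_f = initial_best = sphere(best)

for _ in range(iters):
    for i in range(n_fireflies):
        for j in range(n_fireflies):
            if sphere(pop[j]) < sphere(pop[i]):      # j is brighter, so i moves toward j
                r2 = sum((a - b) ** 2 for a, b in zip(pop[i], pop[j]))
                b = beta0 * math.exp(-gamma * r2)    # attractiveness decays with distance
                pop[i] = [a + b * (c - a) + alpha * (random.random() - 0.5)
                          for a, c in zip(pop[i], pop[j])]
        f = sphere(pop[i])
        if f < best_f:
            best_f, best = f, pop[i][:]
```

The pairwise attraction loop is what makes the method embarrassingly parallel on a GPU: every firefly's move can be evaluated independently per iteration.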
NASA Technical Reports Server (NTRS)
Delaat, J. C.
1984-01-01
An advanced sensor failure detection, isolation, and accommodation algorithm has been developed by NASA for the F100 turbofan engine. The algorithm takes advantage of the analytical redundancy of the sensors to improve the reliability of the sensor set. The method enables the controls computer to determine when a sensor failure has occurred without the help of redundant hardware sensors in the control system. The controls computer provides an estimate of the correct value of the output of the failed sensor. The algorithm has been programmed in FORTRAN using a real-time microprocessor-based controls computer. A detailed description of the algorithm and its implementation on a microprocessor is given.
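The accommodation logic described above, reduced to its essentials, is a residual test against an analytically derived estimate. A minimal sketch, with hypothetical numbers and threshold rather than the actual F100 algorithm:

```python
def accommodate(measured, estimated, threshold):
    """Analytical-redundancy check: return (value_to_use, failure_detected)."""
    residual = abs(measured - estimated)
    if residual > threshold:
        return estimated, True      # declare failure, substitute the model estimate
    return measured, False

# hypothetical relation: a quantity estimated analytically from other healthy sensors
estimate = 0.5 * (9800.0 + 10200.0)                                 # e.g. 10000 rpm
value1, failed1 = accommodate(14250.0, estimate, threshold=500.0)   # failed sensor
value2, failed2 = accommodate(10120.0, estimate, threshold=500.0)   # healthy sensor
```

In the real algorithm the estimate comes from an engine model driven by the remaining sensors, so detection, isolation, and accommodation all reuse the same analytic redundancy.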
On distribution reduction and algorithm implementation in inconsistent ordered information systems.
Zhang, Yanqin
2014-01-01
As one part of our work on ordered information systems, distribution reduction is studied in inconsistent ordered information systems (OISs). Some important properties of distribution reduction are studied and discussed. The dominance matrix is restated for reduction acquisition in dominance-relation-based information systems. A matrix algorithm for distribution reduction acquisition is presented step by step, and a program implementing the algorithm is developed. The approach provides an effective tool for theoretical research on, and practical applications of, ordered information systems. For a more detailed and valid illustration, cases are employed to explain and verify the algorithm and the program, which shows the effectiveness of the algorithm in complicated information systems.
Implementation and performance of a domain decomposition algorithm in Sisal
DeBoni, T.; Feo, J.; Rodrigue, G.; Muller, J.
1993-09-23
Sisal is a general-purpose functional language that hides the complexity of parallel processing, expedites parallel program development, and guarantees determinacy. Parallelism and management of concurrent tasks are realized automatically by the compiler and runtime system. Spatial domain decomposition is a widely used method that focuses computational resources on the most active, or important, areas of a domain. Many complex programming issues are introduced in parallelizing this method, including: dynamic spatial refinement, dynamic grid partitioning and fusion, task distribution, data distribution, and load balancing. In this paper, we describe a spatial domain decomposition algorithm programmed in Sisal. We explain the compilation process, and present the execution performance of the resultant code on two different multiprocessor systems: a multiprocessor vector supercomputer, and a cache-coherent scalar multiprocessor.
Implementation of jump-diffusion algorithms for understanding FLIR scenes
NASA Astrophysics Data System (ADS)
Lanterman, Aaron D.; Miller, Michael I.; Snyder, Donald L.
1995-07-01
Our pattern-theoretic approach to the automated understanding of forward-looking infrared (FLIR) images brings the traditionally separate endeavors of detection, tracking, and recognition together into a unified jump-diffusion process. New objects are detected and object types are recognized through discrete jump moves. Between jumps, the location and orientation of objects are estimated via continuous diffusions. A hypothesized scene, simulated from the emissive characteristics of the hypothesized scene elements, is compared with the collected data by a likelihood function based on sensor statistics. This likelihood is combined with a prior distribution defined over the set of possible scenes to form a posterior distribution. The jump-diffusion process empirically generates the posterior distribution. Both the diffusion and jump operations involve the simulation of a scene produced by a hypothesized configuration. Scene simulation is most effectively accomplished by pipelined rendering engines such as Silicon Graphics hardware. We demonstrate the execution of our algorithm on a Silicon Graphics Onyx/RealityEngine.
Implementation of a partitioned algorithm for simulation of large CSI problems
NASA Technical Reports Server (NTRS)
Alvin, Kenneth F.; Park, K. C.
1991-01-01
The implementation of a partitioned numerical algorithm for determining the dynamic response of coupled structure/controller/estimator finite-dimensional systems is reviewed. The partitioned approach leads to a set of coupled first and second-order linear differential equations which are numerically integrated with extrapolation and implicit step methods. The present software implementation, ACSIS, utilizes parallel processing techniques at various levels to optimize performance on a shared-memory concurrent/vector processing system. A general procedure for the design of controller and filter gains is also implemented, which utilizes the vibration characteristics of the structure to be solved. Also presented are: example problems; a user's guide to the software; the procedures and algorithm scripts; a stability analysis for the algorithm; and the source code for the parallel implementation.
Implementation of a nonlinear concrete cracking algorithm in NASTRAN
NASA Technical Reports Server (NTRS)
Herting, D. N.; Herendeen, D. L.; Hoesly, R. L.; Chang, H.
1976-01-01
A computer code for the analysis of reinforced concrete structures was developed using NASTRAN as a basis. Nonlinear iteration procedures were developed for obtaining solutions with a wide variety of loading sequences. A direct-access file system was used to save results at each load step, permitting restarts within the solution module for further analysis. A multi-nested looping capability was implemented to control the iterations and change the loads. The basis for the analysis is a set of multi-layer plate elements which allow local definition of materials and cracking properties.
Implementation of 128 bits Camellia Algorithm for Cryptography in Digital Image
NASA Astrophysics Data System (ADS)
Satrio Waluyo Poetro, Bagus
2017-04-01
Modern information technology requires ever stronger cryptographic algorithms. The Camellia algorithm is known for its suitability for both software and hardware implementation as well as its high level of security. A digital image is an image f(x, y) with discrete spatial coordinates and brightness levels. Unlike text messages, image data have special features such as high redundancy and high correlation between neighboring pixels. This research applied the Camellia algorithm to the encryption of digital images. Comparisons were made across three digital image formats (.bmp, .jpg, .png) using the 128-bit-key block Camellia algorithm. Results show that the Camellia algorithm can successfully produce encrypted images and reproduce the original image through the decryption process.
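The abstract's point about image redundancy is easy to demonstrate: a block cipher applied independently to each block (ECB mode) leaks repeated patterns, which is why the cipher mode matters for images. The sketch below uses a toy 4-byte "cipher" purely for illustration; it is not Camellia and not secure:

```python
KEY = bytes([0x5A] * 4)
IV = bytes([1, 2, 3, 4])

def toy_block_encrypt(block, key):
    """Toy invertible 4-byte block transform (illustration only, NOT Camellia)."""
    return bytes([((b ^ k) + 7) % 256 for b, k in zip(block, key)])

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

plain = b"ABCDABCD"            # two identical 4-byte blocks: high redundancy
blocks = [plain[i:i + 4] for i in range(0, len(plain), 4)]

# ECB: each block encrypted independently
ecb = [toy_block_encrypt(b, KEY) for b in blocks]

# CBC: each block XORed with the previous ciphertext before encryption
cbc, prev = [], IV
for b in blocks:
    prev = toy_block_encrypt(xor(b, prev), KEY)
    cbc.append(prev)
```

Identical plaintext blocks produce identical ECB ciphertext blocks, revealing the image's structure, while chaining hides the repetition.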
Implementation of an Algorithm for Prosthetic Joint Infection: Deviations and Problems.
Mühlhofer, Heinrich M L; Kanz, Karl-Georg; Pohlig, Florian; Lenze, Ulrich; Lenze, Florian; Toepfer, Andreas; von Eisenhart-Rothe, Ruediger; Schauwecker, Johannes
The outcome of revision surgery in arthroplasty is based on a precise diagnosis. In addition, the treatment varies based on whether the prosthetic failure is caused by aseptic or septic loosening. Algorithms can help to identify periprosthetic joint infections (PJI) and standardize diagnostic steps; however, algorithms tend to oversimplify the treatment of complex cases. We conducted a process analysis during the implementation of a PJI algorithm to determine problems and deviations associated with the implementation of this algorithm. Fifty patients who were treated after implementation of a standardized algorithm were monitored retrospectively. Their treatment plans and diagnostic cascades were analyzed for deviations from the implemented algorithm. Each diagnostic procedure was recorded, compared with the algorithm, and evaluated statistically. We detected 52 deviations while treating 50 patients. In 25 cases, no discrepancy was observed. Synovial fluid aspiration was not performed in 31.8% of patients (95% confidence interval [CI], 18.1%-45.6%), while white blood cell counts (WBCs) and neutrophil differentiation were assessed in 54.5% of patients (95% CI, 39.8%-69.3%). We also observed that the prolonged incubation of cultures was not requested in 13.6% of patients (95% CI, 3.5%-23.8%). In seven of 13 cases (63.6%; 95% CI, 35.2%-92.1%), arthroscopic biopsy was performed; 6 arthroscopies were performed in discordance with the algorithm (12%; 95% CI, 3%-21%). Self-critical analysis of diagnostic processes and monitoring of deviations using algorithms are important and could increase the quality of treatment by revealing recurring faults.
Implementing embedded artificial intelligence rules within algorithmic programming languages
NASA Technical Reports Server (NTRS)
Feyock, Stefan
1988-01-01
Most integrations of artificial intelligence (AI) capabilities with non-AI (usually FORTRAN-based) application programs require the latter to run as a subprogram or, at best, a coroutine of the AI system. In many cases, this organization is unacceptable; instead, the requirement is for an AI facility that runs in embedded mode, i.e., is called as a subprogram by the application program. The design and implementation of a Prolog-based AI capability that can be invoked in embedded mode are described. The significance of this system is twofold. First, the provision of Prolog-based symbol-manipulation and deduction facilities makes a powerful symbolic reasoning mechanism available to application programs written in non-AI languages. Second, the power of the deductive and non-procedural descriptive capabilities of Prolog, which allow the user to describe the problem to be solved rather than the solution, is to a large extent vitiated by the absence of the standard control structures provided by other languages. Embedding invocations of Prolog rule bases in programs written in non-AI languages makes it possible to put Prolog calls inside DO loops and similar control constructs. The resulting merger of non-AI and AI languages thus yields a symbiotic system in which the advantages of both programming systems are retained, and their deficiencies largely remedied.
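The embedded-mode idea, rule-base calls placed inside ordinary control constructs, can be illustrated with a toy forward-chaining engine standing in for Prolog (the rules and fact names are hypothetical):

```python
# toy rule base: premises (a set of facts) imply a conclusion
RULES = [({"pressure_high", "temp_high"}, "alarm"),
         ({"alarm"}, "shutdown")]

def infer(facts):
    """Forward-chain to a fixpoint, like a rule base evaluated in embedded mode."""
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

facts = set()
for reading in ["pressure_high", "temp_high"]:   # ordinary procedural loop
    facts.add(reading)
    infer(facts)                                 # deduction call embedded in the loop
```

The point of the paper is precisely this shape: the procedural program keeps control of the loop, and deduction is invoked as a subprogram each iteration rather than the other way around.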
Clinical implementation of a neonatal seizure detection algorithm.
Temko, Andriy; Marnane, William; Boylan, Geraldine; Lightbody, Gordon
2015-02-01
Technologies for automated detection of neonatal seizures are gradually moving towards cot-side implementation. The aim of this paper is to present different ways to visualize the output of a neonatal seizure detection system and analyse their influence on performance in a clinical environment. Three different ways to visualize the detector output are considered: a binary output, a probabilistic trace, and a spatio-temporal colormap of seizure observability. As an alternative to visual aids, audified neonatal EEG is also considered. Additionally, a survey on the usefulness and accuracy of the presented methods has been performed among clinical personnel. The main advantages and disadvantages of the presented methods are discussed. The connection between information visualization and different methods to compute conventional metrics is established. The results of the visualization methods along with the system validation results indicate that the developed neonatal seizure detector with its current level of performance would unambiguously be of benefit to clinicians as a decision support system. The results of the survey suggest that a suitable way to visualize the output of neonatal seizure detection systems in a clinical environment is a combination of a binary output and a probabilistic trace. The main healthcare benefits of the tool are outlined. The decision support system with the chosen visualization interface is currently undergoing pre-market European multi-centre clinical investigation to support its regulatory approval and clinical adoption.
Quantum computation: algorithms and implementation in quantum dot devices
NASA Astrophysics Data System (ADS)
Gamble, John King
In this thesis, we explore several aspects of both the software and hardware of quantum computation. First, we examine the computational power of multi-particle quantum random walks in terms of distinguishing mathematical graphs. We study both interacting and non-interacting multi-particle walks on strongly regular graphs, proving some limitations on distinguishing powers and presenting extensive numerical evidence indicating that interactions provide more distinguishing power. We then study the recently proposed adiabatic quantum algorithm for Google PageRank, and show that it exhibits power-law scaling for realistic WWW-like graphs. Turning to hardware, we next analyze the thermal physics of two nearby two-dimensional electron gases (2DEGs), and show that an analogue of the Coulomb drag effect exists for heat transfer. In some distance and temperature regimes, this heat transfer is more significant than phonon dissipation channels. After that, we study the dephasing of two-electron states in a single silicon quantum dot. Specifically, we consider dephasing due to the electron-phonon coupling and charge noise, separately treating orbital and valley excitations. In an ideal system, dephasing due to charge noise is strongly suppressed due to a vanishing dipole moment. However, introduction of disorder or anharmonicity leads to large effective dipole moments, and hence possibly strong dephasing. Building on this work, we next consider more realistic, structurally disordered systems. We present experiment and theory, which demonstrate energy levels that vary with quantum dot translation, implying a structurally disordered system. Finally, we turn to the issues of valley mixing and valley-orbit hybridization, which occur due to atomic-scale disorder at quantum well interfaces. We develop a new theoretical approach to study these effects, which we name the disorder-expansion technique. We demonstrate that this method successfully reproduces atomistic tight-binding techniques.
An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices, part 1
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Gutknecht, Martin H.; Nachtigal, Noel M.
1990-01-01
The nonsymmetric Lanczos method can be used to compute eigenvalues of large sparse non-Hermitian matrices or to solve large sparse non-Hermitian linear systems. However, the original Lanczos algorithm is susceptible to possible breakdowns and potential instabilities. We present an implementation of a look-ahead version of the Lanczos algorithm which overcomes these problems by skipping over those steps in which a breakdown or near-breakdown would occur in the standard process. The proposed algorithm can handle look-ahead steps of any length and is not restricted to steps of length 2, as earlier implementations are. Also, our implementation has the feature that it requires roughly the same number of inner products as the standard Lanczos process without look-ahead.
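A sketch of the underlying two-sided Lanczos process without look-ahead may clarify what the paper improves: its contribution is precisely the look-ahead machinery that rescues the breakdown branch below. This is a generic textbook-style recurrence, not the authors' implementation:

```python
def bilanczos(A, v1, w1, m):
    """Two-sided (nonsymmetric) Lanczos without look-ahead.
    Builds bases V, W that are biorthogonal: dot(W[i], V[j]) = delta_ij."""
    n = len(A)
    At = [[A[j][i] for j in range(n)] for i in range(n)]
    mv = lambda M, x: [sum(M[i][k] * x[k] for k in range(n)) for i in range(n)]
    dot = lambda x, y: sum(a * b for a, b in zip(x, y))
    V, W = [v1], [w1]                      # assumes dot(w1, v1) == 1
    beta = delta = 0.0
    v_prev = w_prev = [0.0] * n
    for _ in range(m - 1):
        v, w = V[-1], W[-1]
        Av, Atw = mv(A, v), mv(At, w)
        alpha = dot(w, Av)
        vhat = [Av[i] - alpha * v[i] - beta * v_prev[i] for i in range(n)]
        what = [Atw[i] - alpha * w[i] - delta * w_prev[i] for i in range(n)]
        d = dot(vhat, what)
        if abs(d) < 1e-12:
            break                          # (near-)breakdown: look-ahead needed here
        delta, beta = abs(d) ** 0.5, d / abs(d) ** 0.5
        v_prev, w_prev = v, w
        V.append([x / delta for x in vhat])
        W.append([x / beta for x in what])
    return V, W

A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
V, W = bilanczos(A, [1.0, 0.0, 0.0], [1.0, 0.0, 0.0], 3)
```

When `dot(vhat, what)` vanishes while neither vector does, the plain recurrence has no way forward; the look-ahead variant skips over such steps in blocks of arbitrary length.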
NASA Astrophysics Data System (ADS)
Yerkes, Christopher R.; Webster, Eric D.
1994-06-01
Advanced algorithms for synthetic aperture radar (SAR) imaging have in the past required computing capabilities only available from high performance special purpose hardware. Such architectures have tended to have short life cycles with respect to development expense. Current generation Massively Parallel Processors (MPP) are offering high performance capabilities necessary for such applications with both a scalable architecture and a longer projected life cycle. In this paper we explore issues associated with implementation of a SAR imaging algorithm on a mesh configured MPP architecture.
AlgoRun: a Docker-based packaging system for platform-agnostic implemented algorithms.
Hosny, Abdelrahman; Vera-Licona, Paola; Laubenbacher, Reinhard; Favre, Thibauld
2016-08-01
There is a growing need in bioinformatics for easy-to-use software implementations of algorithms that are usable across platforms. At the same time, reproducibility of computational results is critical and often a challenge due to source code changes over time and dependencies. The approach introduced in this paper addresses both of these needs with AlgoRun, a dedicated packaging system for implemented algorithms, using Docker technology. Implemented algorithms, packaged with AlgoRun, can be executed through a user-friendly interface directly from a web browser or via a standardized RESTful web API to allow easy integration into more complex workflows. The packaged algorithm includes the entire software execution environment, thereby eliminating the common problem of software dependencies and the irreproducibility of computations over time. AlgoRun-packaged algorithms can be published on http://algorun.org, a centralized searchable directory of existing AlgoRun-packaged algorithms. AlgoRun is available at http://algorun.org, and the source code is available under the GPL license at https://github.com/algorun. Contact: laubenbacher@uchc.edu. Supplementary data are available at Bioinformatics online.
An improved non-uniformity correction algorithm and its hardware implementation on FPGA
NASA Astrophysics Data System (ADS)
Rong, Shenghui; Zhou, Huixin; Wen, Zhigang; Qin, Hanlin; Qian, Kun; Cheng, Kuanhong
2017-09-01
The non-uniformity of infrared focal plane arrays (IRFPA) severely degrades infrared image quality, so an effective non-uniformity correction (NUC) algorithm is necessary for an IRFPA imaging and application system. However, traditional scene-based NUC algorithms suffer from image blurring and ghosting artifacts, and few effective hardware platforms have been proposed to implement the corresponding NUC algorithms. Thus, this paper proposes an improved neural-network-based NUC algorithm built on the guided image filter and a projection-based motion detection algorithm. First, the guided image filter is utilized to obtain an accurate desired image and thereby reduce ghosting artifacts. Then the projection-based motion detection algorithm determines whether the correction coefficients should be updated, which overcomes the problem of image blurring. Finally, an FPGA-based hardware design is introduced to realize the proposed NUC algorithm. Real and simulated infrared image sequences are utilized to verify the performance of the proposed algorithm. Experimental results indicate that the proposed NUC algorithm can effectively eliminate fixed-pattern noise with less image blurring and ghosting, and that the proposed hardware design uses fewer logic elements in the FPGA and fewer clock cycles to process one frame of the image.
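The neural-network NUC family referenced above is, at its core, an LMS adjustment of per-pixel correction coefficients toward a spatially filtered "desired" image. A deliberately simplified offset-only sketch on a 3-pixel line (the paper also adapts gains and uses a guided filter rather than the plain spatial mean used here):

```python
# unknown per-pixel additive non-uniformity on a 3-pixel detector line
true_offsets = [5.0, -3.0, 0.0]
# moving scene: the same ramp shifts across the detector frame by frame
frames = [[0.0, 1.0, 2.0], [1.0, 2.0, 0.0], [2.0, 0.0, 1.0]]

o = [0.0, 0.0, 0.0]        # learned correction offsets
lr = 0.05
errs = []                  # per-frame mean |error| (fixed-pattern residual)
for t in range(300):
    scene = frames[t % 3]
    obs = [scene[i] + true_offsets[i] for i in range(3)]
    corr = [obs[i] + o[i] for i in range(3)]
    desired = sum(corr) / 3.0                  # spatial estimate of the true scene
    err = [corr[i] - desired for i in range(3)]
    errs.append(sum(abs(e) for e in err) / 3.0)
    for i in range(3):
        o[i] -= lr * err[i]                    # LMS update of correction coefficient
```

Because the scene moves while the non-uniformity is fixed, the update averages out the scene and converges on the fixed pattern; freezing the update when the scene is static (the paper's motion-detection step) is what prevents burn-in and blurring.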
NASA Astrophysics Data System (ADS)
Tsukahara, Hiroshi; Iwano, Kaoru; Mitsumata, Chiharu; Ishikawa, Tadashi; Ono, Kanta
2016-10-01
We implement low-communication-frequency three-dimensional fast Fourier transform algorithms in our micromagnetics simulator for calculation of the magnetostatic field, which occupies a significant portion of large-scale micromagnetics simulations. This fast Fourier transform algorithm reduces the frequency of all-to-all communications from six to two times. Simulation times with our simulator show high scalability in parallelization, even when the micromagnetics simulation is performed using 32 768 physical computing cores. This low-communication-frequency fast Fourier transform algorithm enables micromagnetics simulations among the largest in the world, with over one billion calculation cells, to be carried out.
Odd-graceful labeling algorithm and its implementation of generalized ring core network
NASA Astrophysics Data System (ADS)
Xie, Jianmin; Hong, Wenmei; Zhao, Tinggang; Yao, Bing
2017-08-01
The computer implementation of labeling algorithms for special networks has practical significance for the design of computer communication networks with respect to functionality, reliability, and communication cost. The generalized ring core network is an important hybrid network topology and the basis of the generalized ring network. In this paper, motivated by the addressing requirements of generalized ring networks, the author designs an odd-graceful labeling algorithm for the generalized ring core network when n1, n2, …, nm ≡ 0 (mod 4), proves that the structure is odd-graceful, implements the corresponding software, and demonstrates the practical effectiveness of the algorithm with experimental data.
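For concreteness: an odd-graceful labeling of a graph with q edges assigns distinct vertex labels from {0, …, 2q−1} so that the absolute edge differences are exactly {1, 3, …, 2q−1}. A brute-force checker for small cycles, written independently of the paper, illustrates why the n ≡ 0 (mod 4) condition appears:

```python
from itertools import permutations

def odd_graceful_labeling(n):
    """Brute-force search for an odd-graceful labeling of the cycle C_n."""
    q = n                                     # a cycle on n vertices has q = n edges
    edges = [(i, (i + 1) % n) for i in range(n)]
    target = set(range(1, 2 * q, 2))          # required edge labels {1, 3, ..., 2q-1}
    for labels in permutations(range(2 * q), n):
        if {abs(labels[u] - labels[v]) for u, v in edges} == target:
            return labels
    return None

lab4 = odd_graceful_labeling(4)   # n = 4 satisfies n ≡ 0 (mod 4): a labeling exists
lab5 = odd_graceful_labeling(5)   # an odd cycle is not bipartite, so none exists
```

Odd-graceful graphs are necessarily bipartite, which rules out odd cycles; among even cycles, only those with n ≡ 0 (mod 4) admit such labelings.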
NASA Astrophysics Data System (ADS)
Lu, Dawei; Zhu, Jing; Zou, Ping; Peng, Xinhua; Yu, Yihua; Zhang, Shanmin; Chen, Qun; Du, Jiangfeng
2010-02-01
An important quantum search algorithm based on the quantum random walk performs an oracle search on a database of N items with O(√N) calls, yielding a speedup similar to the Grover quantum search algorithm. The algorithm was implemented on a quantum information processor of three-qubit liquid-crystal nuclear magnetic resonance (NMR) in the case of finding 1 out of 4, and the diagonal-element tomography of all the final density matrices was completed with comprehensible one-dimensional NMR spectra. The experimental results agree well with the theoretical predictions.
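The quoted speedup can be made concrete with the closely related Grover iteration for finding 1 out of N = 4, where a single oracle call suffices. The sketch below is a classical simulation of the amplitudes, not the quantum-walk NMR implementation:

```python
N, marked = 4, 2
amp = [1.0 / N ** 0.5] * N            # uniform superposition over 4 database items
amp[marked] *= -1.0                   # one oracle call: phase-flip the marked item
mean = sum(amp) / N
amp = [2.0 * mean - a for a in amp]   # diffusion: inversion about the mean
prob_marked = amp[marked] ** 2
```

For N = 4 the marked-item probability reaches exactly 1 after one iteration, which is why the 1-out-of-4 case is the standard three-qubit demonstration.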
Experimental implementation of Hogg's algorithm on a three-quantum-bit NMR quantum computer
NASA Astrophysics Data System (ADS)
Peng, Xinhua; Zhu, Xiwen; Fang, Ximing; Feng, Mang; Liu, Maili; Gao, Kelin
2002-04-01
Using nuclear magnetic resonance (NMR) techniques with a three-qubit sample, we have experimentally implemented the highly structured algorithm for the satisfiability problem with one variable in each clause proposed by Hogg. A simplified temporal averaging procedure was employed to prepare the three-qubit pseudopure state. The algorithm was completed with only a single evaluation of the structure of the problem and the solutions were found theoretically with probability 100%, results that outperform both unstructured quantum and the best classical search algorithms. However, about 90% of the corresponding experimental fidelities can be attributed to the imperfections of manipulations.
Implementation of a robust 2400 b/s LPC algorithm for operation in noisy environments
NASA Astrophysics Data System (ADS)
Singer, Elliot; Tierney, Joseph
1987-04-01
A detailed description of the implementation of a robust 2400 b/s LPC algorithm is presented. The algorithm was developed to improve vocoder performance in acoustically compromised environments. Improved robustness in noise is achieved by: (1) increasing the speech bandwidth to 5 kHz; (2) increasing the LPC model order to 12; and (3) doubling the analysis rate. Frame fill techniques are used to achieve the 2400 b/s data rate. The algorithm is embodied in the Advanced Linear Predictive Coding Microprocessor which was developed as a prototype voice processor for in-flight evaluation of narrowband voice communication in the JTIDS communication system.
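LPC analysis of the kind described rests on the Levinson-Durbin recursion. A sketch at order 2 (the paper uses order 12 over a 5 kHz bandwidth) with a synthetic AR(1) autocorrelation; this is a generic recursion, not the MIT Lincoln Laboratory implementation:

```python
def levinson_durbin(r, order):
    """Solve the Toeplitz normal equations for predictor coefficients a[1..order]."""
    a, e = [], r[0]                 # coefficients and prediction error power
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - 1 - j] for j in range(len(a)))
        k = acc / e                 # reflection coefficient
        a = [a[j] - k * a[i - 2 - j] for j in range(len(a))] + [k]
        e *= 1.0 - k * k
    return a, e

# autocorrelation of an ideal AR(1) process x[n] = 0.9 x[n-1] + noise
r = [1.0, 0.9, 0.81]
a, err = levinson_durbin(r, 2)
```

For this ideal input the recursion recovers the model exactly: the first coefficient is 0.9, the second is 0, and the residual power is 1 − 0.9².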
NASA Astrophysics Data System (ADS)
Turner, Giles W.; Tedesco, Emilio; Harris, Kenneth D. M.; Johnston, Roy L.; Kariuki, Benson M.
2000-04-01
Previous implementations of Genetic Algorithms in direct-space strategies for structure solution from powder diffraction data have employed the operations of mating, mutation and natural selection, with the fitness of each structure based on comparison between calculated and experimental powder diffraction patterns (we define fitness as a function of weighted-profile R-factor Rwp). We report an extension to this method, in which each structure generated in the Genetic Algorithm is subjected to local minimization of Rwp with respect to structural variables. This approach represents an implementation of Lamarckian concepts of evolution, and is found to give significant improvements in efficiency and reliability.
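The Lamarckian twist, writing the result of each local minimization back into the genotype, can be sketched on a one-dimensional toy objective standing in for Rwp. All parameters here are illustrative, not those of the crystallographic implementation:

```python
import random

random.seed(1)

def f(x):                          # toy objective standing in for Rwp
    return (x - 3.0) ** 2

def local_search(x, steps=60):
    """Derivative-free local minimization (step-halving search)."""
    step = 1.0
    for _ in range(steps):
        if f(x + step) < f(x):
            x += step
        elif f(x - step) < f(x):
            x -= step
        else:
            step *= 0.5
    return x

pop = [local_search(random.uniform(-5, 5)) for _ in range(6)]   # minimized at birth
for _ in range(5):
    pop.sort(key=f)
    parents = pop[:3]
    children = []
    for _ in range(6):
        a, b = random.sample(parents, 2)
        child = 0.5 * (a + b) + random.gauss(0.0, 0.3)   # mating + mutation
        children.append(local_search(child))  # Lamarckian: the refined value is kept
    pop = sorted(pop + children, key=f)[:6]   # natural selection

best = min(pop, key=f)
```

The Darwinian variant would evaluate `f(local_search(child))` for fitness but store the unrefined `child`; keeping the refined genotype is what gives the reported gains in efficiency and reliability.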
Mühlbacher, Thomas; Piringer, Harald; Gratzl, Samuel; Sedlmair, Michael; Streit, Marc
2014-12-01
An increasing number of interactive visualization tools stress the integration with computational software like MATLAB and R to access a variety of proven algorithms. In many cases, however, the algorithms are used as black boxes that run to completion in isolation, which contradicts the needs of interactive data exploration. This paper structures, formalizes, and discusses possibilities to enable user involvement in ongoing computations. Based on a structured characterization of needs regarding intermediate feedback and control, the main contribution is a formalization and comparison of strategies for achieving user involvement for algorithms with different characteristics. In the context of integration, we describe considerations for implementing these strategies either as part of the visualization tool or as part of the algorithm, and we identify requirements and guidelines for the design of algorithmic APIs. To assess the practical applicability, we provide a survey of frequently used algorithm implementations within R regarding the fulfillment of these guidelines. While echoing previous calls for analysis modules which support data exploration more directly, we conclude that a range of pragmatic options for enabling user involvement in ongoing computations exists on both the visualization and algorithm side and should be used.
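One pragmatic option of the kind discussed, exposing intermediate results so the caller can monitor progress and cancel early, maps naturally onto a generator-style API. A minimal sketch (the iterative computation here is an arbitrary stand-in):

```python
def newton_sqrt(a, max_iter=50):
    """Iterative computation that yields intermediate results to the caller."""
    x = 1.0
    for _ in range(max_iter):
        x = 0.5 * (x + a / x)
        yield x                      # intermediate feedback point

history = []
for estimate in newton_sqrt(2.0):
    history.append(estimate)         # e.g. update a convergence view here
    if abs(estimate * estimate - 2.0) < 1e-10:
        break                        # user-side early termination
```

The algorithm stays a black box internally, but the yield points give the visualization tool both intermediate feedback and a control channel, without requiring a callback API inside the algorithm.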
Linear array implementation of the EM algorithm for PET image reconstruction
Rajan, K.; Patnaik, L.M.; Ramakrishna, J.
1995-08-01
PET image reconstruction based on the EM algorithm has several attractive advantages over conventional convolution backprojection algorithms. However, EM-based reconstruction is computationally burdensome for today's single-processor systems, and a large memory is required for storage of the image, projection data, and the probability matrix. Since the computations are easily divided into tasks executable in parallel, multiprocessor configurations are the ideal choice for fast execution of the EM algorithms. In this study, the authors attempt to overcome these two problems by parallelizing the EM algorithm on a multiprocessor system. The parallel EM algorithm has been implemented on a linear array topology using commercially available fast floating-point digital signal processor (DSP) chips as the processing elements (PEs). The performance of the EM algorithm on a 386/387 machine, an IBM 6000 RISC workstation, and the linear array system is discussed and compared. The results show that the computational speed of a linear array using 8 DSP chips as PEs executing the EM image reconstruction algorithm is about 15.5 times better than that of the IBM 6000 RISC workstation. The novelty of the scheme is its simplicity. The linear array topology is expandable to a larger number of PEs, the architecture is not dependent on the DSP chip chosen, and substitution of the latest DSP chip is straightforward and could yield better speed performance.
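The EM (MLEM) update parallelized in the paper has a compact serial form: each iteration forward-projects the current image, compares with the measured data, and back-projects the ratios. A tiny 2-pixel, 3-detector sketch with a hypothetical system matrix:

```python
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # 3 detectors x 2 pixels system matrix
lam_true = [2.0, 1.0]
y = [sum(A[i][j] * lam_true[j] for j in range(2)) for i in range(3)]  # noiseless data

sens = [sum(A[i][j] for i in range(3)) for j in range(2)]   # per-pixel sensitivity
lam = [1.0, 1.0]                                            # uniform initial image
for _ in range(100):
    proj = [sum(A[i][j] * lam[j] for j in range(2)) for i in range(3)]  # forward-project
    ratio = [y[i] / proj[i] for i in range(3)]
    lam = [lam[j] * sum(A[i][j] * ratio[i] for i in range(3)) / sens[j]
           for j in range(2)]                               # multiplicative EM update
```

Both the forward projection over detectors and the back projection over pixels are independent sums, which is why the workload partitions cleanly across a linear array of PEs.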
Implementation of a new segmentation algorithm using the Eye-RIS CMOS vision system
NASA Astrophysics Data System (ADS)
Karabiber, Fethullah; Arena, Paolo; De Fiore, Sebastiano; Vagliasindi, Guido; Fortuna, Luigi; Arik, Sabri
2009-05-01
Segmentation is the process of partitioning a digital image into multiple meaningful regions. Since segmentation requires considerable computational power in real-time applications, we have implemented a new segmentation algorithm using the capabilities of the Eye-RIS vision system to execute the algorithm in a very short time. The segmentation algorithm is implemented mainly in three steps. In the first, pre-processing step, the images are acquired and noise filtering through a Gaussian function is performed. In the second step, a Sobel-operator-based edge detection approach is implemented on the system. In the last step, morphologic and logic operations are used to segment the images as post-processing. Experimental results for different images show the accuracy of the proposed segmentation algorithm. Visual inspection and timing analysis (7.83 ms, 127 frames/s) show that the proposed segmentation algorithm can be executed for real-time video processing applications. These results also demonstrate the capability of the Eye-RIS vision system for real-time image processing applications.
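The edge-detection step can be illustrated with a plain Sobel gradient-magnitude computation. This NumPy sketch is a generic reference version (valid-region convolution only), not the Eye-RIS focal-plane implementation:

```python
import numpy as np

def sobel_magnitude(img):
    """Sobel gradient magnitude via explicit 3x3 convolution (valid region)."""
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)   # horizontal gradient kernel
    ky = kx.T                                   # vertical gradient kernel
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx = np.sum(patch * kx)
            gy = np.sum(patch * ky)
            out[i, j] = np.hypot(gx, gy)        # gradient magnitude
    return out

# Vertical step edge: the response is concentrated at the transition columns.
img = np.zeros((5, 6))
img[:, 3:] = 1.0
mag = sobel_magnitude(img)
```

Thresholding `mag` would give the binary edge map that the post-processing (morphologic/logic) step then cleans up.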
NASA Astrophysics Data System (ADS)
Obara, Lukasz; Żarnecki, Aleksander Filip
2015-09-01
Pi of the Sky is a system of wide-field-of-view robotic telescopes that search for short-timescale astrophysical phenomena, especially prompt optical GRB emission. The system was designed for autonomous operation, monitoring a large fraction of the sky with 12m-13m range and a time resolution of the order of 1-100 seconds. LUIZA is a dedicated framework, implemented in C++, developed for efficient off-line processing of the Pi of the Sky data. A photometric algorithm based on ASAS photometry was implemented in LUIZA and compared with an algorithm based on pixel cluster reconstruction and a simple aperture photometry algorithm. The optimized photometry algorithms were then applied to a sample of test images, which were modified to include different patterns of stellar variability (the training sample). Different statistical estimators are considered for developing the general variable-star identification algorithm. The algorithm will then be used to search for short-period variable stars in the real data.
Implementation and Performance of a Binary Lattice Gas Algorithm on Parallel Processor Systems
NASA Astrophysics Data System (ADS)
Hayot, F.; Mandal, M.; Sadayappan, P.
1989-02-01
We study the performance of a lattice gas binary algorithm on a "real arithmetic" machine, a 32 processor INTEL iPSC hypercube. The implementation is based on so-called multi-spin coding techniques. From the measured performance we extrapolate to larger and more powerful parallel systems. Comparisons are made with "bit" machines, such as the parallel Connection Machine.
Implementation and evaluation of the new wind algorithm in NASA's 50 MHz doppler radar wind profiler
NASA Technical Reports Server (NTRS)
Taylor, Gregory E.; Manobianco, John T.; Schumann, Robin S.; Wheeler, Mark M.; Yersavich, Ann M.
1993-01-01
The purpose of this report is to document the Applied Meteorology Unit's implementation and evaluation of the wind algorithm developed by Marshall Space Flight Center (MSFC) on the data analysis processor (DAP) of NASA's 50 MHz doppler radar wind profiler (DRWP). The report also includes a summary of the 50 MHz DRWP characteristics and performance and a proposed concept of operations for the DRWP.
Electromagnetic Interactions GEneRalized (EIGER): Algorithm abstraction and HPC implementation
Sharpe, R.M.; Grant, J.B.; Champagne, N.J.; Wilton, D.R.; Jackson, D.R.; Johnson, W.A.; Jorgensen, R.E.; Rockway, J.W.; Manry, C.W.
1998-06-01
Modern software development methods combined with key generalizations of standard computational algorithms enable the development of a new class of electromagnetic modeling tools. This paper describes current and anticipated capabilities of a frequency domain modeling code, EIGER, which has an extremely wide range of applicability. In addition, software implementation methods and high performance computing issues are discussed.
Electromagnetic interactions GEneRalized (EIGER): algorithm abstraction and HPC implementation
Sharpe, R.M., LLNL
1998-04-21
Modern software development methods combined with key generalizations of standard computational algorithms enable the development of a new class of electromagnetic modeling tools. This paper describes current and anticipated capabilities of a frequency domain modeling code, EIGER, which has an extremely wide range of applicability. In addition, software implementation methods and high performance computing issues are discussed.
ERIC Educational Resources Information Center
Auberry, Kathy; Cullen, Deborah
2016-01-01
Based on the results of the Surrogate Decision-Making Self Efficacy Scale (Lopez, 2009a), this study sought to determine whether nurses working in the field of intellectual disability (ID) experience increased confidence when they implemented the American Association of Neuroscience Nurses (AANN) Seizure Algorithm during telephone triage. The…
Towards the Implementation of an Autonomous Camera Algorithm on the da Vinci Platform.
Eslamian, Shahab; Reisner, Luke A; King, Brady W; Pandya, Abhilash K
2016-01-01
Camera positioning is critical for all telerobotic surgical systems. Inadequate visualization of the remote site can lead to serious errors that can jeopardize the patient. An autonomous camera algorithm has been developed on a medical robot (da Vinci) simulator. It is found to be robust in key scenarios of operation. This system behaves with predictable and expected actions for the camera arm with respect to the tool positions. The implementation of this system is described herein. The simulation closely models the methodology needed to implement autonomous camera control in a real hardware system. The camera control algorithm follows three rules: (1) keep the view centered on the tools, (2) keep the zoom level optimized such that the tools never leave the field of view, and (3) avoid unnecessary movement of the camera that may distract/disorient the surgeon. Our future work will apply this algorithm to the real da Vinci hardware.
Evaluation of mass spectral library search algorithms implemented in commercial software.
Samokhin, Andrey; Sotnezova, Ksenia; Lashin, Vitaly; Revelsky, Igor
2015-06-01
The performance of several library search algorithms (against EI mass spectral databases) implemented in commercial software products (ACD/SpecDB, ChemStation, GC/MS Solution, and MS Search) was estimated. The test set contained 1000 mass spectra randomly selected from the NIST'08 (RepLib) mass spectral database. It was shown that the composite (also known as identity) algorithm implemented in the MS Search (NIST) software gives statistically the best results: the correct compound occupied the first position in the list of possible candidates in 81% of cases, and it was within the list of top ten candidates in 98% of cases. It was found that use of the presearch option can lead to rejection of the correct answer from the list of possible candidates (therefore the presearch option should not be used, if possible). Overall performance of the library search algorithms was estimated using receiver operating characteristic curves.
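A peak-weighted dot-product similarity of the kind underlying such library searches can be sketched as follows. The weighting exponents (m/z^1, intensity^0.5) and the toy spectra are illustrative assumptions in the style of NIST-like scoring, not the exact formula used by any of the tested products:

```python
import numpy as np

def weighted_dot_score(spec_a, spec_b, mz_power=1.0, int_power=0.5):
    """Dot-product spectral similarity with peak weights w = (m/z)^a * I^b.

    spec_a, spec_b : dicts mapping integer m/z -> intensity.
    Returns a score in [0, 1]; 1 means identical weighted peak patterns.
    """
    mzs = sorted(set(spec_a) | set(spec_b))     # union of observed m/z values
    wa = np.array([(mz ** mz_power) * (spec_a.get(mz, 0.0) ** int_power)
                   for mz in mzs])
    wb = np.array([(mz ** mz_power) * (spec_b.get(mz, 0.0) ** int_power)
                   for mz in mzs])
    denom = np.linalg.norm(wa) * np.linalg.norm(wb)
    return float(wa @ wb / denom) if denom > 0 else 0.0

# Identical spectra score 1; spectra with no shared peaks score 0.
identical = weighted_dot_score({51: 100, 77: 50}, {51: 100, 77: 50})
disjoint = weighted_dot_score({51: 100}, {77: 100})
```

Ranking a library by this score against an unknown spectrum produces the candidate hit list whose top positions the study evaluates.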
Zhang, Aizhen; Wen, Ning; Nurushev, Teamour; Burmeister, Jay; Chetty, Indrin J
2013-03-04
A commercial electron Monte Carlo (eMC) dose calculation algorithm has become available in the Eclipse treatment planning system. The purpose of this work was to evaluate the eMC algorithm and investigate the clinical implementation of this system. The beam modeling of the eMC algorithm was performed for beam energies of 6, 9, 12, 16, and 20 MeV for a Varian Trilogy and all available applicator sizes in the Eclipse treatment planning system. The accuracy of the eMC algorithm was evaluated in a homogeneous water phantom, solid water phantoms containing lung and bone materials, and an anthropomorphic phantom. In addition, dose calculation accuracy was compared between pencil beam (PB) and eMC algorithms in the same treatment planning system for heterogeneous phantoms. The overall agreement between eMC calculations and measurements was within 3%/2 mm, while the PB algorithm had large errors (up to 25%) in predicting dose distributions in the presence of inhomogeneities such as bone and lung. The clinical implementation of the eMC algorithm was investigated by performing treatment planning for 15 patients with lesions in the head and neck, breast, chest wall, and sternum. The dose distributions were calculated using PB and eMC algorithms with no smoothing and all three levels of 3D Gaussian smoothing for comparison. Based on a routine electron beam therapy prescription method, the number of eMC-calculated monitor units (MUs) was found to increase with increased 3D Gaussian smoothing levels. 3D Gaussian smoothing greatly improved the visual usability of dose distributions and produced better target coverage. Differences of calculated MUs and dose distributions between eMC and PB algorithms could be significant when oblique beam incidence, surface irregularities, and heterogeneous tissues were present in the treatment plans. In our patient cases, monitor unit differences of up to 7% were observed between PB and eMC algorithms. Monitor unit calculations were also performed
NASA Astrophysics Data System (ADS)
Dai, Liyi
2016-05-01
Stochastic optimization is a fundamental problem that finds applications in many areas, including the biological and cognitive sciences. The classical stochastic approximation algorithm for iterative stochastic optimization requires gradient information of the sample objective function, which is typically difficult to obtain in practice. Recently there has been renewed interest in derivative-free approaches to stochastic optimization. In this paper, we examine the rates of convergence for the Kiefer-Wolfowitz algorithm and the mirror descent algorithm, approximating the gradient by finite differences generated through common random numbers. It is shown that the convergence of these algorithms can be accelerated by controlling the implementation of the finite differences. In particular, it is shown that the rate can be increased to n^(-2/5) in general, and to n^(-1/2), the best possible rate of stochastic approximation, in Monte Carlo optimization for a broad class of problems, where n is the iteration number.
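The mechanism behind the speedup can be seen in a minimal one-dimensional Kiefer-Wolfowitz sketch. The objective, gain sequences, and noise model below are illustrative assumptions, not the paper's setup; the point is that reusing the same random draw in both finite-difference evaluations (common random numbers) cancels much of the noise in the gradient estimate:

```python
import random

def kiefer_wolfowitz(sample_f, x0, n_iter=2000, a=0.5, c=0.5, seed=1):
    """Kiefer-Wolfowitz stochastic approximation in one dimension.

    sample_f(x, u) returns a noisy objective evaluation given the common
    random number u. The gradient is estimated by a central finite
    difference with the SAME u in both evaluations.
    """
    rng = random.Random(seed)
    x = x0
    for n in range(1, n_iter + 1):
        a_n = a / n              # step sizes: sum diverges, sum of squares converges
        c_n = c / n ** 0.25      # slowly shrinking finite-difference width
        u = rng.random()         # common random number shared by both evaluations
        g = (sample_f(x + c_n, u) - sample_f(x - c_n, u)) / (2 * c_n)
        x -= a_n * g
    return x

# Noisy quadratic with minimum at 2; the additive noise term (u - 0.5)
# cancels exactly in the common-random-number difference.
def noisy_quadratic(x, u):
    return (x - 2.0) ** 2 + (u - 0.5)

x_star = kiefer_wolfowitz(noisy_quadratic, x0=0.0)
```

With independent draws instead of a shared `u`, the difference would carry O(1/c_n) noise and convergence would be markedly slower.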
Design and Implementation of Hybrid CORDIC Algorithm Based on Phase Rotation Estimation for NCO
Zhang, Chaozhu; Han, Jinan; Li, Ke
2014-01-01
The numerically controlled oscillator (NCO) has wide application in radar, digital receivers, and software radio systems. This paper first introduces the traditional CORDIC algorithm. Then, in order to improve computing speed and save resources, it proposes a hybrid CORDIC algorithm based on phase rotation estimation for use in an NCO. By estimating the direction of part of the phase rotations, the algorithm reduces the number of phase rotations and add-subtract units, thereby decreasing delay. Furthermore, the numerically controlled oscillator is simulated and implemented using Quartus II and ModelSim software. Finally, simulation results indicate that improvement over the traditional CORDIC algorithm is achieved in terms of ease of computation, resource utilization, and computing speed/delay while maintaining precision. It is suitable for high-speed, high-precision digital modulation and demodulation. PMID:25110750
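The traditional rotation-mode CORDIC that the paper takes as its baseline can be sketched in floating point as follows. A hardware NCO would use fixed-point shift-and-add stages and fold the gain K into the initial vector; this reference version is only meant to show the iteration structure:

```python
import math

def cordic_sin_cos(angle, n_iters=24):
    """Rotation-mode CORDIC: return (cos, sin) of `angle` in radians,
    valid for |angle| < ~1.74 rad, using a precomputed arctan table."""
    atan_table = [math.atan(2.0 ** -i) for i in range(n_iters)]
    # Aggregate gain of the micro-rotations; compensated up front.
    K = 1.0
    for i in range(n_iters):
        K /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = K, 0.0, angle
    for i in range(n_iters):
        d = 1.0 if z >= 0 else -1.0          # rotate toward residual angle z
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i  # shift-add step
        z -= d * atan_table[i]
    return x, y                               # (cos(angle), sin(angle))

c, s = cordic_sin_cos(math.pi / 6)
```

The hybrid scheme in the abstract replaces some of these sequential sign decisions with an up-front phase rotation estimate, cutting the number of stages and hence the pipeline delay.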
Design and Implementation of Broadcast Algorithms for Extreme-Scale Systems
Shamis, Pavel; Graham, Richard L; Gorentla Venkata, Manjunath; Ladd, Joshua
2011-01-01
The scalability and performance of collective communication operations limit the scalability and performance of many scientific applications. This paper presents two new blocking and nonblocking Broadcast algorithms for communicators with arbitrary communication topology, and studies their performance. These algorithms benefit from increased concurrency and a reduced memory footprint, making them suitable for use on large-scale systems. Measuring small, medium, and large data Broadcasts on a Cray-XT5, using 24,576 MPI processes, the Cheetah algorithms outperform the native MPI on that system by 51%, 69%, and 9%, respectively, at the same process count. These results demonstrate an algorithmic approach to the implementation of the important class of collective communications, which is high performing, scalable, and also uses resources in a scalable manner.
Implementation of the Iterative Proportion Fitting Algorithm for Geostatistical Facies Modeling
Li, Yupeng; Deutsch, Clayton V.
2012-06-15
In geostatistics, most stochastic algorithms for the simulation of categorical variables such as facies or rock types require a conditional probability distribution. The multivariate probability distribution of all the grouped locations, including the unsampled location, permits calculation of the conditional probability directly from its definition. In this article, the iterative proportion fitting (IPF) algorithm is implemented to infer this multivariate probability. Using the IPF algorithm, the multivariate probability is obtained by iteratively modifying an initial estimate of the multivariate probability using lower-order bivariate probabilities as constraints. The imposed bivariate marginal probabilities are inferred from profiles along drill holes or wells. In the IPF process, a sparse matrix is used to calculate the marginal probabilities from the multivariate probability, which makes the iterative fitting more tractable and practical. The algorithm can be extended to higher-order marginal probability constraints, as used in multiple-point statistics. The theoretical framework is developed and illustrated with an estimation and simulation example.
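The core fitting loop can be illustrated in the simplest setting: matching a seed joint probability table to imposed marginals. This two-dimensional, univariate-constrained sketch is a simplified stand-in for the multivariate, bivariate-constrained version described above:

```python
import numpy as np

def ipf_2d(seed, row_targets, col_targets, n_iter=100):
    """Iterative proportional fitting: rescale a seed joint probability
    table so its row and column marginals match the imposed targets."""
    p = seed.astype(float).copy()
    for _ in range(n_iter):
        p *= (row_targets / p.sum(axis=1))[:, None]   # match row marginals
        p *= (col_targets / p.sum(axis=0))[None, :]   # match column marginals
    return p

# Uniform seed adjusted to non-uniform marginals.
seed = np.array([[0.25, 0.25],
                 [0.25, 0.25]])
p = ipf_2d(seed,
           row_targets=np.array([0.6, 0.4]),
           col_targets=np.array([0.7, 0.3]))
```

In the geostatistical setting, each pass instead rescales a high-dimensional table against each inferred bivariate marginal in turn; the alternating-rescaling structure is the same.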
Quantum Algorithm for Universal Implementation of the Projective Measurement of Energy
NASA Astrophysics Data System (ADS)
Nakayama, Shojun; Soeda, Akihito; Murao, Mio
2015-05-01
A projective measurement of energy (PME) on a quantum system is a quantum measurement determined by the Hamiltonian of the system. PME protocols exist when the Hamiltonian is given in advance. Unknown Hamiltonians can be identified by quantum tomography, but the time cost to achieve a given accuracy increases exponentially with the size of the quantum system. In this Letter, we improve the time cost by adapting quantum phase estimation, an algorithm designed for computational problems, to measurements on physical systems. We present a PME protocol without quantum tomography for Hamiltonians whose dimension and energy scale are given but which are otherwise unknown. Our protocol implements a PME to arbitrary accuracy without any dimension dependence on its time cost. We also show that another computational quantum algorithm may be used for efficient estimation of the energy scale. These algorithms show that computational quantum algorithms, with suitable modifications, have applications beyond their original context.
Design and implementation of hybrid CORDIC algorithm based on phase rotation estimation for NCO.
Zhang, Chaozhu; Han, Jinan; Li, Ke
2014-01-01
The numerically controlled oscillator (NCO) has wide application in radar, digital receivers, and software radio systems. This paper first introduces the traditional CORDIC algorithm. Then, in order to improve computing speed and save resources, it proposes a hybrid CORDIC algorithm based on phase rotation estimation for use in an NCO. By estimating the direction of part of the phase rotations, the algorithm reduces the number of phase rotations and add-subtract units, thereby decreasing delay. Furthermore, the numerically controlled oscillator is simulated and implemented using Quartus II and ModelSim software. Finally, simulation results indicate that improvement over the traditional CORDIC algorithm is achieved in terms of ease of computation, resource utilization, and computing speed/delay while maintaining precision. It is suitable for high-speed, high-precision digital modulation and demodulation.
Current Status of Multi-Angle Implementation of Atmospheric Correction (MAIAC) Algorithm
NASA Technical Reports Server (NTRS)
Lyapustin, A.; Wang, Y.
2011-01-01
A new Multi-Angle Implementation of Atmospheric Correction (MAIAC) algorithm has been developed for MODIS. MAIAC uses a time series of observations and image-based rather than pixel-based processing to perform simultaneous retrievals of aerosol properties and surface bidirectional reflectance. It is a generic algorithm that works over all land surface types with the exception of snow. MAIAC has an internal cloud mask, a dynamic land-water-snow classification, and a surface change mask, which allow it to flexibly choose the processing path over different surfaces. A distinct feature of MAIAC is the high 1 km resolution of its aerosol retrievals, including optical thickness and fine-mode fraction, which is required in applications such as air quality analysis. An overview of the algorithm, results of AERONET validation, and examples of comparison with the MODIS Collection 5 aerosol product, including the Deep Blue algorithm, will be presented for different parts of the world, including the continental USA, the Persian Gulf region, and India.
Quantum algorithm for universal implementation of the projective measurement of energy.
Nakayama, Shojun; Soeda, Akihito; Murao, Mio
2015-05-15
A projective measurement of energy (PME) on a quantum system is a quantum measurement determined by the Hamiltonian of the system. PME protocols exist when the Hamiltonian is given in advance. Unknown Hamiltonians can be identified by quantum tomography, but the time cost to achieve a given accuracy increases exponentially with the size of the quantum system. In this Letter, we improve the time cost by adapting quantum phase estimation, an algorithm designed for computational problems, to measurements on physical systems. We present a PME protocol without quantum tomography for Hamiltonians whose dimension and energy scale are given but which are otherwise unknown. Our protocol implements a PME to arbitrary accuracy without any dimension dependence on its time cost. We also show that another computational quantum algorithm may be used for efficient estimation of the energy scale. These algorithms show that computational quantum algorithms, with suitable modifications, have applications beyond their original context.
A complete implementation of the conjugate gradient algorithm on a reconfigurable supercomputer
Dubois, David H; Dubois, Andrew J; Connor, Carolyn M; Boorman, Thomas M; Poole, Stephen W
2008-01-01
The conjugate gradient is a prominent iterative method for solving systems of sparse linear equations. Large-scale scientific applications often utilize a conjugate gradient solver at their computational core. In this paper we present a field programmable gate array (FPGA) based implementation of a double precision, non-preconditioned, conjugate gradient solver for finite-element or finite-difference methods. Our work utilizes the SRC Computers, Inc. MAPStation hardware platform along with the 'Carte' software programming environment to ease the programming workload when working with the hybrid (CPU/FPGA) environment. The implementation is designed to handle large sparse matrices of up to order N x N where N <= 116,394, with up to 7 non-zero, 64-bit elements per sparse row. This implementation utilizes an optimized sparse matrix-vector multiply operation which is critical for obtaining high performance. Direct parallel implementations of loop unrolling and loop fusion are utilized to extract performance from the various vector/matrix operations. Rather than utilize the FPGA devices as function off-load accelerators, our implementation uses the FPGAs to implement the core conjugate gradient algorithm. Measured run-time performance data is presented comparing the FPGA implementation to a software-only version, showing that the FPGA can outperform processors running at up to 30x the clock rate. In conclusion we take a look at the new SRC-7 system and estimate the performance of this algorithm on that architecture.
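The algorithm being mapped to hardware is the standard non-preconditioned conjugate gradient iteration, sketched below as a software reference (a dense NumPy version for clarity, not the paper's sparse FPGA design). Each iteration is dominated by one matrix-vector product, which is why the sparse matrix-vector multiply is the critical operation:

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Non-preconditioned CG for a symmetric positive-definite matrix A."""
    x = np.zeros_like(b)
    r = b - A @ x                       # initial residual
    p = r.copy()                        # initial search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p                      # the dominant matrix-vector product
        alpha = rs / (p @ Ap)           # optimal step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p       # conjugate direction update
        rs = rs_new
    return x

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])             # small SPD test matrix
b = np.array([1.0, 2.0])
x = conjugate_gradient(A, b)
```

The remaining per-iteration work (AXPY updates and dot products) is exactly where the loop-unrolling and loop-fusion optimizations described above apply.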
NASA Astrophysics Data System (ADS)
Nemes, Csaba; Barcza, Gergely; Nagy, Zoltán; Legeza, Örs; Szolgay, Péter
2014-06-01
In the numerical analysis of strongly correlated quantum lattice models one of the leading algorithms developed to balance the size of the effective Hilbert space and the accuracy of the simulation is the density matrix renormalization group (DMRG) algorithm, in which the run-time is dominated by the iterative diagonalization of the Hamilton operator. As the most time-dominant step of the diagonalization can be expressed as a list of dense matrix operations, the DMRG is an appealing candidate to fully utilize the computing power residing in novel kilo-processor architectures. In the paper a smart hybrid CPU-GPU implementation is presented, which exploits the power of both CPU and GPU and tolerates problems exceeding the GPU memory size. Furthermore, a new CUDA kernel has been designed for asymmetric matrix-vector multiplication to accelerate the rest of the diagonalization. Besides the evaluation of the GPU implementation, the practical limits of an FPGA implementation are also discussed.
NASA Astrophysics Data System (ADS)
Das, Ranabir; Kumar, Anil
2004-10-01
Quantum information processing has been effectively demonstrated on a small number of qubits by nuclear magnetic resonance. An important subroutine in any computing is the readout of the output. "Spectral implementation," originally suggested by Z. L. Madi, R. Bruschweiler, and R. R. Ernst [J. Chem. Phys. 109, 10603 (1999)], provides an elegant method of readout with the use of an extra "observer" qubit. At the end of computation, detection of the observer qubit provides the output via the multiplet structure of its spectrum. In spectral implementation by a two-dimensional experiment, the observer qubit retains the memory of the input state during computation, thereby providing correlated information on input and output in the same spectrum. Spectral implementation of Grover's search algorithm, approximate quantum counting, a modified version of the Bernstein-Vazirani problem, and Hogg's algorithm are demonstrated here in three- and four-qubit systems.
On recursive least-squares filtering algorithms and implementations. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Hsieh, Shih-Fu
1990-01-01
In many real-time signal processing applications, fast and numerically stable algorithms for solving least-squares problems are necessary and important. In particular, under non-stationary conditions, these algorithms must be able to adapt themselves to reflect the changes in the system and make appropriate adjustments to achieve optimum performance. Among existing algorithms, the QR-decomposition (QRD)-based recursive least-squares (RLS) methods have been shown to be useful and effective for adaptive signal processing. In order to increase the speed of processing and achieve a high throughput rate, many algorithms are being vectorized and/or pipelined to facilitate high degrees of parallelism. A time-recursive formulation of RLS filtering employing block QRD is considered first. Several methods, including a new non-continuous windowing scheme based on selectively rejecting contaminated data, were investigated for adaptive processing. Based on systolic triarrays, many other forms of systolic arrays are shown to be capable of implementing different algorithms. Various updating and downdating systolic algorithms and architectures for RLS filtering are examined and compared in detail, including the Householder reflector, the Gram-Schmidt procedure, and Givens rotation. A unified approach encompassing existing square-root-free algorithms is also proposed. For the sinusoidal spectrum estimation problem, a judicious method of separating the noise from the signal is of great interest. Various truncated QR methods are proposed for this purpose and compared to the truncated SVD method. Computer simulations provided for detailed comparisons show the effectiveness of these methods. This thesis deals with fundamental issues of numerical stability, computational efficiency, adaptivity, and VLSI implementation for RLS filtering problems. In all, various new and modified algorithms and architectures are proposed and analyzed; the significance of any of the new methods depends
Whitney, Gina; Daves, Suanne; Hughes, Alex; Watkins, Scott; Woods, Marcella; Kreger, Michael; Marincola, Paula; Chocron, Isaac; Donahue, Brian
2013-07-01
The goal of this project is to measure the impact of standardization of transfusion practice on blood product utilization and postoperative bleeding in pediatric cardiac surgery patients. Transfusion is common following cardiopulmonary bypass (CPB) in children and is associated with increased mortality, infection, and duration of mechanical ventilation. Transfusion in pediatric cardiac surgery is often based on clinical judgment rather than objective data. Although objective transfusion algorithms have demonstrated efficacy for reducing transfusion in adult cardiac surgery, such algorithms have not been applied in the pediatric setting. This quality improvement effort was designed to reduce blood product utilization in pediatric cardiac surgery using a blood product transfusion algorithm. We implemented an evidence-based transfusion protocol in January 2011 and monitored the impact of this algorithm on blood product utilization, chest tube output during the first 12 h of intensive care unit (ICU) admission, and predischarge mortality. When compared with the 12 months preceding implementation, blood utilization per case in the operating room (OR) for the 11 months following implementation decreased by 66% for red cells (P = 0.001) and 86% for cryoprecipitate (P < 0.001). Blood utilization during the first 12 h of ICU did not increase during this time and actually decreased 56% for plasma (P = 0.006) and 41% for red cells (P = 0.031), indicating that the decrease in OR transfusion did not shift the transfusion burden to the ICU. Postoperative bleeding, as measured by chest tube output in the first 12 ICU hours, did not increase following implementation of the algorithm. Monthly surgical volume did not change significantly following implementation of the algorithm (P = 0.477). In a logistic regression model for predischarge mortality among the nontransplant patients, after accounting for surgical severity and duration of CPB, use of the transfusion
Madduri, Kamesh; Ediger, David; Jiang, Karl; Bader, David A.; Chavarría-Miranda, Daniel
2009-05-29
We present a new lock-free parallel algorithm for computing betweenness centrality of massive small-world networks. With minor changes to the data structures, our algorithm also achieves better spatial cache locality compared to previous approaches. Betweenness centrality is a key algorithm kernel in the HPCS SSCA#2 Graph Analysis benchmark, which has been extensively used to evaluate the performance of emerging high-performance computing architectures for graph-theoretic computations. We design optimized implementations of betweenness centrality and the SSCA#2 benchmark for two hardware multithreaded systems: a Cray XMT system with the ThreadStorm processor, and a single-socket Sun multicore server with the UltraSparc T2 processor. For a small-world network of 134 million vertices and 1.073 billion edges, the 16-processor XMT system and the 8-core Sun Fire T5120 server achieve TEPS scores (an algorithmic performance count for the SSCA#2 benchmark) of 160 million and 90 million respectively, which corresponds to more than a 2X performance improvement over the previous parallel implementations. To better characterize the performance of these multithreaded systems, we correlate the SSCA#2 performance results with data from the memory-intensive STREAM and RandomAccess benchmarks. Finally, we demonstrate the applicability of our implementation to analyze massive real-world datasets by computing approximate betweenness centrality for a large-scale IMDb movie-actor network.
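The betweenness-centrality kernel itself can be summarized by Brandes' sequential algorithm. This single-threaded Python sketch is a textbook reference version for unweighted graphs, not the lock-free multithreaded implementation described above:

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for unweighted betweenness centrality.
    adj: dict node -> list of neighbour nodes. Returns dict node -> score
    (unnormalized, counting each ordered source-target pair)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Forward phase: BFS from s, counting shortest paths (sigma)
        # and recording shortest-path predecessors.
        stack, pred = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0); sigma[s] = 1
        dist = dict.fromkeys(adj, -1); dist[s] = 0
        q = deque([s])
        while q:
            v = q.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    pred[w].append(v)
        # Backward phase: accumulate pair dependencies in reverse BFS order.
        delta = dict.fromkeys(adj, 0.0)
        while stack:
            w = stack.pop()
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc

# Path graph a-b-c: every shortest path between a and c passes through b.
path = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b']}
scores = betweenness(path)
```

The parallel version distributes the per-source BFS/accumulation phases across threads; the data-structure changes mentioned in the abstract concern exactly the `sigma`/`pred` bookkeeping done here.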
A dual-processor multi-frequency implementation of the FINDS algorithm
NASA Technical Reports Server (NTRS)
Godiwala, Pankaj M.; Caglayan, Alper K.
1987-01-01
This report presents a parallel processing implementation of the FINDS (Fault Inferring Nonlinear Detection System) algorithm on a dual-processor configured target flight computer. First, a filter initialization scheme is presented which allows the no-fail filter (NFF) states to be initialized using the first iteration of the flight data. A modified failure isolation strategy, compatible with the new failure detection strategy reported earlier, is discussed, and the performance of the new FDI algorithm is analyzed using flight-recorded data from the NASA ATOPS B-737 aircraft in a Microwave Landing System (MLS) environment. The results show that low-level MLS, IMU, and IAS sensor failures are detected and isolated instantaneously, while accelerometer and rate gyro failures continue to take comparatively longer to detect and isolate. The parallel implementation is accomplished by partitioning the FINDS algorithm into two parts: one based on the translational dynamics and the other based on the rotational kinematics. Finally, a multi-rate implementation of the algorithm is presented, yielding significantly lower execution times with acceptable estimation and FDI performance.
The design and hardware implementation of a low-power real-time seizure detection algorithm.
Raghunathan, Shriram; Gupta, Sumeet K; Ward, Matthew P; Worth, Robert M; Roy, Kaushik; Irazoqui, Pedro P
2009-10-01
Epilepsy affects more than 1% of the world's population. Responsive neurostimulation is emerging as an alternative therapy for the 30% of the epileptic patient population that does not benefit from pharmacological treatment. Efficient seizure detection algorithms will enable closed-loop epilepsy prostheses by stimulating the epileptogenic focus within an early onset window. Critically, this is expected to reduce neuronal desensitization over time and lead to longer-term device efficacy. This work presents a novel event-based seizure detection algorithm along with a low-power digital circuit implementation. Hippocampal depth-electrode recordings from six kainate-treated rats are used to validate the algorithm and hardware performance in this preliminary study. The design process illustrates crucial trade-offs in translating mathematical models into hardware implementations and validates statistical optimizations made with empirical data analyses on results obtained using a real-time functioning hardware prototype. Using quantitatively predicted thresholds from the depth-electrode recordings, the auto-updating algorithm performs with an average sensitivity and selectivity of 95.3 +/- 0.02% and 88.9 +/- 0.01% (mean +/- SE(alpha = 0.05)), respectively, on untrained data with a detection delay of 8.5 s [5.97, 11.04] from electrographic onset. The hardware implementation is shown feasible using CMOS circuits consuming under 350 nW of power from a 250 mV supply voltage from simulations on the MIT 180 nm SOI process.
Image coding using parallel implementations of the embedded zerotree wavelet algorithm
NASA Astrophysics Data System (ADS)
Creusere, Charles D.
1996-03-01
We explore here the implementation of Shapiro's embedded zerotree wavelet (EZW) image coding algorithms on an array of parallel processors. To this end, we first consider the problem of parallelizing the basic wavelet transform, discussing past work in this area and the compatibility of that work with the zerotree coding process. From this discussion, we present a parallel partitioning of the transform which is computationally efficient and which allows the wavelet coefficients to be coded with little or no additional inter-processor communication. The key to achieving low data dependence between the processors is to ensure that each processor contains only entire zerotrees of wavelet coefficients after the decomposition is complete. We next quantify the rate-distortion tradeoffs associated with different levels of parallelization for a few variations of the basic coding algorithm. Studying these results, we conclude that the quality of the coder decreases as the number of parallel processors used to implement it increases. Noting that the performance of the parallel algorithm might be unacceptably poor for large processor arrays, we also develop an alternate algorithm which always achieves the same rate-distortion performance as the original sequential EZW algorithm at the cost of higher complexity and reduced scalability.
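The whole-zerotree-per-processor constraint can be illustrated with the dyadic parent relation: the parent of coefficient (i, j) is (i // 2, j // 2), so assigning each coefficient to the processor that owns its root keeps every tree on one processor. This is a simplification of the full EZW parent-child structure across subbands, and the helper names are hypothetical:

```python
def tree_root(i, j, top):
    """Map a wavelet coefficient at (i, j) to its zerotree root.

    In a dyadic pyramid the parent of (i, j) is (i // 2, j // 2);
    iterating reaches the coarsest (top x top) level.
    """
    while i >= top or j >= top:
        i, j = i // 2, j // 2
    return i, j

def processor_of(i, j, top, n_proc):
    # All descendants of a root land on the same processor, so trees
    # can be coded with no inter-processor communication.
    ri, rj = tree_root(i, j, top)
    return (ri * top + rj) % n_proc
```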
Implementation and analysis of a Navier-Stokes algorithm on parallel computers
NASA Technical Reports Server (NTRS)
Fatoohi, Raad A.; Grosch, Chester E.
1988-01-01
The results of the implementation of a Navier-Stokes algorithm on three parallel/vector computers are presented. The object of this research is to determine how well, or poorly, a single numerical algorithm would map onto three different architectures. The algorithm is a compact difference scheme for the solution of the incompressible, two-dimensional, time-dependent Navier-Stokes equations. The computers were chosen so as to encompass a variety of architectures. They are the following: the MPP, an SIMD machine with 16K bit serial processors; Flex/32, an MIMD machine with 20 processors; and Cray/2. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. The basic comparison is among SIMD instruction parallelism on the MPP, MIMD process parallelism on the Flex/32, and vectorization of a serial code on the Cray/2. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally, conclusions are presented.
Co-design of software and hardware to implement remote sensing algorithms
Theiler, J. P.; Frigo, J.; Gokhale, M.; Szymanski, J. J.
2001-01-01
Both for offline searches through large data archives and for onboard computation at the sensor head, there is a growing need for ever-more rapid processing of remote sensing data. For many algorithms of use in remote sensing, the bulk of the processing takes place in an 'inner loop' with a large number of simple operations. For these algorithms, dramatic speedups can often be obtained with specialized hardware. The difficulty and expense of digital design continues to limit applicability of this approach, but the development of new design tools is making this approach more feasible, and some notable successes have been reported. On the other hand, it is often the case that processing can also be accelerated by adopting a more sophisticated algorithm design. Unfortunately, a more sophisticated algorithm is much harder to implement in hardware, so these approaches are often at odds with each other. With careful planning, however, it is sometimes possible to combine software and hardware design in such a way that each complements the other, and the final implementation achieves speedup that would not have been possible with a hardware-only or a software-only solution. We will in particular discuss the co-design of software and hardware to achieve substantial speedup of algorithms for multispectral image segmentation and for endmember identification.
Efficient parallel implementation of active appearance model fitting algorithm on GPU.
Wang, Jinwei; Ma, Xirong; Zhu, Yuanping; Sun, Jizhou
2014-01-01
The active appearance model (AAM) is one of the most powerful model-based object detecting and tracking methods which has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine grain parallelism in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on the Nvidia's GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models of different dimensional textures. The experiment results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures.
NASA Astrophysics Data System (ADS)
Ortiz, Fernando E.; Kelmelis, Eric J.; Arce, Gonzalo R.
2007-04-01
According to the Shannon-Nyquist theory, the number of samples required to reconstruct a signal is proportional to its bandwidth. Recently, it has been shown that acceptable reconstructions are possible from a reduced number of random samples, a process known as compressive sampling. Taking advantage of this realization has a radical impact on power consumption and communication bandwidth, crucial in applications based on small/mobile/unattended platforms such as UAVs and distributed sensor networks. Although the benefits of these compression techniques are self-evident, the reconstruction process requires the solution of nonlinear signal processing algorithms, limiting applicability in portable and real-time systems. In particular, (1) the power consumption associated with the difficult computations offsets the power savings afforded by compressive sampling, and (2) limited computational power prevents these algorithms from keeping pace with the data-capturing sensors, resulting in undesirable data loss. FPGA-based computers offer low power consumption and high computational capacity, providing a solution to both problems simultaneously. In this paper, we present an architecture that implements the algorithms central to compressive sampling in an FPGA environment. We start by studying the computational profile of the convex optimization algorithms used in compressive sampling. Then we present the design of a pixel pipeline suitable for FPGA implementation, able to compute these algorithms.
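The reconstruction step solves an l1-regularized least-squares problem. A minimal iterative soft-thresholding (ISTA) solver for min ||y - Ax||² + λ||x||₁ is sketched below as an illustrative stand-in for the specific convex solvers profiled in the paper:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the l1 norm.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam=0.01, n_iter=500):
    """Iterative soft-thresholding for sparse reconstruction."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)             # gradient of the LS term
        x = soft_threshold(x - step * grad, step * lam)
    return x
```

Each iteration is a matrix-vector product followed by an elementwise shrink, which is exactly the kind of regular inner loop that maps well onto an FPGA pipeline.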
NASA Astrophysics Data System (ADS)
Ríos, A. B.; Valda, A.; Somacal, H.
2007-10-01
Usually a tomographic procedure requires a set of projections around the object under study and mathematical processing of such projections through reconstruction algorithms. An accurate reconstruction requires a proper number of projections (angular sampling) and a proper number of elements in each projection (linear sampling) [1]. However, in several practical cases it is not possible to fulfill these conditions, leading to the so-called problem of few projections. In this case, iterative reconstruction algorithms are more suitable than analytic ones. In this work we present a program written in C++ that provides an environment for two iterative algorithm implementations, one algebraic and the other statistical. The software allows the user a full definition of the acquisition and reconstruction geometries used for the reconstruction algorithms and also allows projection and backprojection operations to be performed. A set of analysis tools was implemented for the characterization of the convergence process. We analyze the performance of the algorithms on numerical phantoms and present the reconstruction of experimental data with few projections coming from transmission X-ray and micro PIXE (Particle-Induced X-Ray Emission) images.
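The algebraic branch can be illustrated by the classic Kaczmarz (ART) update, which cycles through the projection equations and projects the current estimate onto each ray's hyperplane. This is a minimal sketch; the statistical algorithm and the full projector/backprojector machinery are not shown:

```python
import numpy as np

def art(A, p, n_sweeps=50, relax=1.0):
    """Algebraic reconstruction (Kaczmarz).

    A: system matrix (one row of ray weights per measured projection)
    p: measured projection values
    Update: x <- x + relax * (p_i - a_i.x) / ||a_i||^2 * a_i
    """
    x = np.zeros(A.shape[1])
    for _ in range(n_sweeps):
        for a_i, p_i in zip(A, p):
            x += relax * (p_i - a_i @ x) / (a_i @ a_i) * a_i
    return x
```

ART's usefulness with few projections comes from the fact that every single equation pulls the estimate toward consistency, with no requirement of a complete angular sampling.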
Edge detection algorithms implemented on Bi-i cellular vision system
NASA Astrophysics Data System (ADS)
Karabiber, Fethullah; Arik, Sabri
2009-02-01
The Bi-i (Bio-inspired) Cellular Vision system is built mainly on Cellular Neural/Nonlinear Network (CNN) type (ACE16k) and Digital Signal Processing (DSP) type microprocessors. CNN theory, proposed by Chua, has advanced properties for image processing applications. In this study, edge detection algorithms are implemented on the Bi-i Cellular Vision System. Extracting the edges of an image correctly and quickly is of crucial importance for image processing applications. A threshold-gradient-based edge detection algorithm is implemented using the ACE16k microprocessor. In addition, a pre-processing operation is realized by using an image enhancement technique based on the Laplacian operator. Finally, morphological operations are performed as post-processing operations. The Sobel edge detection algorithm is performed by convolving Sobel operators with the image in the DSP. The performances of the edge detection algorithms are compared using visual inspection and timing analysis. Experimental results show that the ACE16k has great computational power and that the Bi-i Cellular Vision System is well qualified to apply image processing algorithms in real time.
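The Sobel stage run in the DSP can be modeled in software as below: convolve with the two Sobel operators, take the gradient magnitude, and threshold. This is a reference sketch only; the actual implementation runs on the ACE16k/DSP hardware:

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
SOBEL_Y = SOBEL_X.T

def convolve2d_valid(img, k):
    # Direct 3x3 'valid' cross-correlation, no padding.  (Flipping the
    # kernel gives true convolution; for Sobel the gradient magnitude
    # is unaffected by the flip.)
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for dy in range(3):
        for dx in range(3):
            out += k[dy, dx] * img[dy:dy + h - 2, dx:dx + w - 2]
    return out

def sobel_edges(img, thresh):
    gx = convolve2d_valid(img, SOBEL_X)
    gy = convolve2d_valid(img, SOBEL_Y)
    mag = np.hypot(gx, gy)       # gradient magnitude
    return mag > thresh          # boolean edge map
```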
Chen, Fangyue; Chen, Guanrong Ron; He, Guolong; Xu, Xiubin; He, Qinbin
2009-10-01
Universal perceptron (UP), a generalization of Rosenblatt's perceptron, is considered in this paper, which is capable of implementing all Boolean functions (BFs). In the classification of BFs, there are: 1) linearly separable Boolean function (LSBF) class, 2) parity Boolean function (PBF) class, and 3) non-LSBF and non-PBF class. To implement these functions, UP takes different kinds of simple topological structures in which each contains at most one hidden layer along with the smallest possible number of hidden neurons. Inspired by the concept of DNA sequences in biological systems, a novel learning algorithm named DNA-like learning is developed, which is able to quickly train a network with any prescribed BF. The focus is on performing LSBF and PBF by a single-layer perceptron (SLP) with the new algorithm. Two criteria for LSBF and PBF are proposed, respectively, and a new measure for a BF, named nonlinearly separable degree (NLSD), is introduced. In the sense of this measure, the PBF is the most complex one. The new algorithm has many advantages including, in particular, fast running speed, good robustness, and no need of considering the convergence property. For example, the number of iterations and computations in implementing the basic 2-bit logic operations such as AND, OR, and XOR by using the new algorithm is far smaller than those needed by other existing algorithms such as error-correction (EC) and backpropagation (BP) algorithms. Moreover, the synaptic weights and threshold values derived from UP can be directly used in designing the template of cellular neural networks (CNNs), which has been considered as a new spatial-temporal sensory computing paradigm.
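For contrast with the DNA-like learning described above (which is not reproduced here), the classical error-correction (EC) baseline for an SLP on a linearly separable Boolean function looks like this:

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Single-layer perceptron with the classic error-correction rule.

    Converges for any linearly separable Boolean function (LSBF);
    shown as the EC baseline the paper compares against.
    """
    w = np.zeros(X.shape[1] + 1)                # weights + bias
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append constant input
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            pred = 1 if w @ xi > 0 else 0
            w += lr * (yi - pred) * xi          # update on mistakes only
    return w

def predict(w, x):
    return 1 if w @ np.append(x, 1.0) > 0 else 0
```

On the parity function XOR this rule never converges (XOR is not linearly separable), which is precisely why the paper treats the PBF class separately.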
Lewandowski, Michał; Przybylski, Andrzej; Kuźmicz, Wiesław; Szwed, Hanna
2013-09-01
The aim of the study was to analyze the value of a completely new fuzzy logic-based detection algorithm (FA) in comparison with arrhythmia classification algorithms used in existing ICDs, in order to demonstrate whether the rate of inappropriate therapies can be reduced. On the basis of an RR interval database containing arrhythmia events and control recordings from ICD memory, a diagnostic algorithm was developed and tested by a computer program. This algorithm uses the same input signals as existing ICDs: RR interval as the primary input variable and two variables derived from it, onset and stability. However, it uses 15 fuzzy rules instead of the fixed thresholds used in existing devices. The algorithm considers 6 diagnostic categories: (1) VF (ventricular fibrillation), (2) VT (ventricular tachycardia), (3) ST (sinus tachycardia), (4) DAI (artifacts and heart rhythm irregularities including extrasystoles and T-wave oversensing-TWOS), (5) ATF (atrial and supraventricular tachycardia or fibrillation), and (6) NT (sinus rhythm). This algorithm was tested on 172 RR recordings from different ICDs in the follow-up of 135 patients. All diagnostic categories of the algorithm were present in the analyzed recordings: VF (n = 35), VT (n = 48), ST (n = 14), DAI (n = 32), ATF (n = 18), NT (n = 25). Thirty-eight patients (31.4%) in the studied group received inappropriate ICD therapies. In all these cases the final diagnosis of the algorithm was correct (19 cases of artifacts, 11 of atrial fibrillation and 8 of ST) and implementation of the fuzzy rules algorithm would have withheld unnecessary therapies. Incidence of inappropriate therapies: 3 vs. 38 (the proposed algorithm vs. ICD diagnosis, respectively) differed significantly (p < 0.05). VT/VF were detected correctly in both groups. Sensitivity and specificity were 100% and 97.8% for the FA versus 100% and 72.9% for the tested ICD recordings (p < 0.05). Diagnostic performance of the proposed fuzzy logic based
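The two derived input variables, onset and stability, can be computed as below. The crisp example rules are a simplified stand-in for the paper's 15 fuzzy rules, and the window lengths and cutoffs are illustrative assumptions (RR intervals in milliseconds):

```python
import numpy as np

def onset(rr, n=4):
    """Relative shortening of the mean RR interval (sudden for VT)."""
    recent, before = np.mean(rr[-n:]), np.mean(rr[-2 * n:-n])
    return (before - recent) / before

def stability(rr, n=4):
    """Spread of the most recent RR intervals (large for AF)."""
    return np.max(rr[-n:]) - np.min(rr[-n:])

def classify(rr):
    # Crisp example rules standing in for the 15 fuzzy rules.
    rate = 60000.0 / np.mean(rr[-4:])     # beats per minute
    if rate < 150:
        return "NT"                       # slow rhythms lumped here
    if stability(rr) > 60:
        return "ATF"                      # fast and irregular
    return "VT" if onset(rr) > 0.2 else "ST"
```

A fuzzy version replaces each hard cutoff with a membership function and combines the rule activations, which is what removes the brittleness of fixed thresholds.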
Real-time implementation of a multispectral mine target detection algorithm
NASA Astrophysics Data System (ADS)
Samson, Joseph W.; Witter, Lester J.; Kenton, Arthur C.; Holloway, John H., Jr.
2003-09-01
Spatial-spectral anomaly detection (the "RX Algorithm") has been exploited on the USMC's Coastal Battlefield Reconnaissance and Analysis (COBRA) Advanced Technology Demonstration (ATD) and several associated technology base studies, and has been found to be a useful method for the automated detection of surface-emplaced antitank land mines in airborne multispectral imagery. RX is a complex image processing algorithm that involves the direct spatial convolution of a target/background mask template over each multispectral image, coupled with a spatially variant background spectral covariance matrix estimation and inversion. The RX throughput on the ATD was about 38X real time using a single Sun UltraSparc system. A goal to demonstrate RX in real-time was begun in FY01. We now report the development and demonstration of a Field Programmable Gate Array (FPGA) solution that achieves a real-time implementation of the RX algorithm at video rates using COBRA ATD data. The approach uses an Annapolis Microsystems Firebird PMC card containing a Xilinx XCV2000E FPGA with over 2,500,000 logic gates and 18MBytes of memory. A prototype system was configured using a Tek Microsystems VME board with dual-PowerPC G4 processors and two PMC slots. The RX algorithm was translated from its C programming implementation into the VHDL language and synthesized into gates that were loaded into the FPGA. The VHDL/synthesizer approach allows key RX parameters to be quickly changed and a new implementation automatically generated. Reprogramming the FPGA is done rapidly and in-circuit. Implementation of the RX algorithm in a single FPGA is a major first step toward achieving real-time land mine detection.
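The core RX computation is a Mahalanobis distance of each pixel's spectrum from the background statistics. The sketch below uses a single global mean and covariance for simplicity; the COBRA implementation estimates spatially variant background statistics under a target/background mask template:

```python
import numpy as np

def rx_scores(cube):
    """RX anomaly score per pixel of a (rows, cols, bands) image cube.

    Score = Mahalanobis distance to global background statistics --
    a simplification of the spatially variant covariance estimation
    described in the paper.
    """
    h, w, b = cube.shape
    X = cube.reshape(-1, b)
    mu = X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    d = X - mu
    # (x - mu)^T Sigma^{-1} (x - mu) for every pixel at once
    scores = np.einsum('ij,jk,ik->i', d, cov_inv, d)
    return scores.reshape(h, w)
```

The per-pixel arithmetic is identical across the image, which is why the algorithm pipelines so naturally into an FPGA at video rates.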
Implementation of The LDA Algorithm for Online Validation Based on Face Recognition
NASA Astrophysics Data System (ADS)
Zainuddin, Z.; Laswi, A. S.
2017-01-01
This paper reports work on the implementation of a computer vision application, face recognition, for on-line validation in distance learning. Face recognition is chosen among the many alternatives for validation because of its robustness. The problem with basic validation methods such as passwords is that they cannot truly validate the identity of the student in distance learning, which is unacceptable, especially in distance examinations. The face recognition algorithm used in this research is Linear Discriminant Analysis (LDA). Using this algorithm, the system is capable of recognizing authorized persons with about 93% accuracy and rejecting unauthorized persons with 100% accuracy.
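A minimal two-class Fisher LDA sketch is given below (authorized vs. unauthorized): project onto w = Sw⁻¹(μ₁ - μ₀) and threshold at the midpoint. The paper's feature extraction and multi-person handling are omitted:

```python
import numpy as np

def fisher_lda(X0, X1):
    """Two-class Fisher LDA: w = Sw^{-1}(mu1 - mu0), midpoint threshold."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
    w = np.linalg.solve(Sw, mu1 - mu0)   # direction maximizing class separation
    thresh = w @ (mu0 + mu1) / 2.0
    return w, thresh

def accept(w, thresh, x):
    # True if x is classified as the authorized class (class 1).
    return bool(w @ x > thresh)
```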
Implementation of a spiral CT backprojection algorithm on the Cell Broadband Engine processor
NASA Astrophysics Data System (ADS)
Bockenbach, Olivier; Goddard, Iain; Schuberth, Sebastian; Seebass, Martin
2006-03-01
Over the last few decades, the medical imaging community has passionately debated different approaches to implementing reconstruction algorithms for spiral CT. Numerous alternatives have been proposed. Whether they are approximate, exact, or iterative, these implementations generally include a backprojection step. Specialized compute platforms have been designed to perform this compute-intensive algorithm within a timeframe compatible with hospital-workflow requirements. Solving the performance problem in a cost-effective way has driven designers to use a combination of digital signal processor (DSP) chips, general-purpose processors, application-specific integrated circuits (ASICs), and field programmable gate arrays (FPGAs). The Cell processor by IBM offers an interesting alternative for implementing the backprojection, especially since it offers a good level of parallelism and vast I/O capabilities. In this paper, we consider the implementation of a straight backprojection algorithm on the Cell processor to design a cost-effective system that matches the performance requirements of clinically deployed systems. The effects on performance of system parameters such as pitch and detector size are also analyzed to determine the ideal system size for modern CT scanners.
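The backprojection kernel can be sketched for the simplest (unfiltered, parallel-beam) geometry: each pixel accumulates the sinogram sample its projection falls into. The spiral cone-beam geometry, filtering, and Cell-specific SPE parallelization are all omitted in this sketch:

```python
import numpy as np

def backproject(sino, angles, size):
    """Unfiltered parallel-beam backprojection onto a size x size grid.

    sino: (n_angles, n_bins) sinogram, detector centered at n_bins // 2.
    A geometric simplification of the spiral cone-beam case.
    """
    half = size // 2
    n_bins = sino.shape[1]
    img = np.zeros((size, size))
    ys, xs = np.mgrid[-half:size - half, -half:size - half]
    for a, th in enumerate(angles):
        t = xs * np.cos(th) + ys * np.sin(th)   # detector coordinate
        bins = np.clip(np.rint(t).astype(int) + n_bins // 2, 0, n_bins - 1)
        img += sino[a, bins]                    # nearest-neighbor gather
    return img
```

The inner loop is a pure gather over a precomputable address pattern, which is the property the paper exploits on the Cell's SIMD units.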
Demonstration of quantum advantage in machine learning
NASA Astrophysics Data System (ADS)
Ristè, Diego; da Silva, Marcus P.; Ryan, Colm A.; Cross, Andrew W.; Córcoles, Antonio D.; Smolin, John A.; Gambetta, Jay M.; Chow, Jerry M.; Johnson, Blake R.
2017-04-01
The main promise of quantum computing is to efficiently solve certain problems that are prohibitively expensive for a classical computer. Most problems with a proven quantum advantage involve the repeated use of a black box, or oracle, whose structure encodes the solution. One measure of the algorithmic performance is the query complexity, i.e., the scaling of the number of oracle calls needed to find the solution with a given probability. Few-qubit demonstrations of quantum algorithms, such as Deutsch-Jozsa and Grover, have been implemented across diverse physical systems such as nuclear magnetic resonance, trapped ions, optical systems, and superconducting circuits. However, at the small scale, these problems can already be solved classically with a few oracle queries, limiting the obtained advantage. Here we solve an oracle-based problem, known as learning parity with noise, on a five-qubit superconducting processor. Executing classical and quantum algorithms using the same oracle, we observe a large gap in query count in favor of quantum processing. We find that this gap grows by orders of magnitude as a function of the error rates and the problem size. This result demonstrates that, while complex fault-tolerant architectures will be required for universal quantum computing, a significant quantum advantage already emerges in existing noisy systems.
A rapid prototyping methodology to implement and optimize image processing algorithms for FPGAs
NASA Astrophysics Data System (ADS)
Akil, Mohamed; Niang, Pierre; Grandpierre, Thierry
2006-02-01
In this article we present local operations in image processing based upon spatial 2D discrete convolution. We study different implementations of such local operations. We also present the principles and the design flow of the AAA methodology and its associated CAD software tool for integrated circuits (SynDEx-IC). In this methodology, the algorithm is modeled by a Conditioned (if-then-else) and Factorized (loop) Data Dependence Graph, and the optimized implementation is obtained by graph transformations. AAA/SynDEx-IC is used to specify and optimize some digital image filters on an XC2100 FPGA board.
An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Gutknecht, Martin H.; Nachtigal, Noel M.
1991-01-01
The nonsymmetric Lanczos method can be used to compute eigenvalues of large sparse non-Hermitian matrices or to solve large sparse non-Hermitian linear systems. However, the original Lanczos algorithm is susceptible to possible breakdowns and potential instabilities. An implementation is presented of a look-ahead version of the Lanczos algorithm that, except for the very special situation of an incurable breakdown, overcomes these problems by skipping over those steps in which a breakdown or near-breakdown would occur in the standard process. The proposed algorithm can handle look-ahead steps of any length and requires the same number of matrix-vector products and inner products as the standard Lanczos process without look-ahead.
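The standard two-sided Lanczos process, without look-ahead, can be sketched as below. It builds biorthogonal bases V and W with W^T V = I from a three-term recurrence, and it aborts at exactly the (near-)breakdown, ŵ^T v̂ ≈ 0, that the paper's look-ahead version steps over:

```python
import numpy as np

def lanczos_nonsym(A, v, w, k):
    """Two-sided Lanczos WITHOUT look-ahead: biorthogonal bases V, W.

    Aborts on (near-)breakdown -- the situation look-ahead handles by
    relaxing biorthogonality over a block of steps.
    """
    n = len(v)
    V, W = np.zeros((n, k)), np.zeros((n, k))
    v = v / np.linalg.norm(v)
    w = w / (w @ v)                      # enforce w^T v = 1
    V[:, 0], W[:, 0] = v, w
    beta = delta = 0.0
    for j in range(k - 1):
        alpha = W[:, j] @ (A @ V[:, j])
        v_hat = A @ V[:, j] - alpha * V[:, j] - beta * V[:, j - 1]
        w_hat = A.T @ W[:, j] - alpha * W[:, j] - delta * W[:, j - 1]
        omega = w_hat @ v_hat
        if abs(omega) < 1e-12:
            raise RuntimeError("breakdown: a look-ahead step is needed here")
        delta = np.sqrt(abs(omega))      # scale split between the two bases
        beta = omega / delta
        V[:, j + 1] = v_hat / delta
        W[:, j + 1] = w_hat / beta
    return V, W
```

One matrix-vector product with A and one with A^T per step, as the abstract notes for the look-ahead version as well.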
Implementation of an algorithm for cylindrical object identification using range data
NASA Technical Reports Server (NTRS)
Bozeman, Sylvia T.; Martin, Benjamin J.
1989-01-01
One of the problems in 3-D object identification and localization is addressed. In robotic and navigation applications the vision system must be able to distinguish cylindrical or spherical objects as well as those of other geometric shapes. An algorithm was developed to identify cylindrical objects in an image when range data is used. The algorithm incorporates the Hough transform for line detection using edge points which emerge from a Sobel mask. Slices of the data are examined to locate arcs of circles using the normal equations of an over-determined linear system. Current efforts are devoted to testing the computer implementation of the algorithm. Refinements are expected to continue in order to accommodate cylinders in various positions. A technique is sought which is robust in the presence of noise and partial occlusions.
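The Hough voting step for line detection used by the algorithm can be sketched as follows (edge extraction by the Sobel mask is assumed to have already produced the point list):

```python
import numpy as np

def hough_lines(points, rho_max, n_theta=180):
    """Accumulate votes in (rho, theta) space: rho = x cos t + y sin t."""
    thetas = np.deg2rad(np.arange(n_theta))
    acc = np.zeros((2 * rho_max + 1, n_theta), dtype=int)
    for x, y in points:
        # one vote per theta, at the quantized rho of this point
        rho = np.rint(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rho + rho_max, np.arange(n_theta)] += 1
    return acc, thetas
```

A peak in the accumulator identifies a line; for cylinders, such lines bound the projected silhouette before the arc-fitting stage described above.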
A hardware-oriented histogram of oriented gradients algorithm and its VLSI implementation
NASA Astrophysics Data System (ADS)
Zhang, Xiangyu; An, Fengwei; Nakashima, Ikki; Luo, Aiwen; Chen, Lei; Ishii, Idaku; Jürgen Mattausch, Hans
2017-04-01
A challenging and important issue for object recognition is feature extraction on embedded systems. We report a hardware implementation of the histogram of oriented gradients (HOG) algorithm for real-time object recognition, which is known to provide high efficiency and accuracy. The developed hardware-oriented algorithm exploits the cell-based scan strategy which enables image-sensor synchronization and extraction-speed acceleration. Furthermore, buffers for image frames or integral images are avoided. An image-size scalable hardware architecture with an effective bin-decoder and a parallelized voting element (PVE) is developed and used to verify the hardware-oriented HOG implementation with the application of human detection. The fabricated test chip in 180 nm CMOS technology achieves fast processing speed and large flexibility for different image resolutions with substantially reduced hardware cost and energy consumption.
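The per-cell histogram voting at the heart of HOG can be modeled in software as below; the hardware bin-decoder and parallelized voting element are represented here by a simple per-pixel vote with unsigned gradients, and interpolation details are omitted:

```python
import numpy as np

def cell_hog(cell, n_bins=9):
    """Orientation histogram of one cell: magnitude-weighted voting.

    Unsigned gradients (0-180 deg), central differences inside the
    cell -- a simplified software model of the cell-based scan.
    """
    gx = np.zeros_like(cell, dtype=float)
    gy = np.zeros_like(cell, dtype=float)
    gx[:, 1:-1] = cell[:, 2:] - cell[:, :-2]
    gy[1:-1, :] = cell[2:, :] - cell[:-2, :]
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    hist = np.zeros(n_bins)
    bins = (ang // (180.0 // n_bins)).astype(int) % n_bins
    for b, m in zip(bins.ravel(), mag.ravel()):
        hist[b] += m                      # magnitude-weighted vote
    return hist
```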
NASA Astrophysics Data System (ADS)
Karabiber, Fethullah; Vecchio, Pietro; Grassi, Giuseppe
2011-12-01
The Bio-inspired (Bi-i) Cellular Vision System is a computing platform consisting of sensing, array sensing-processing, and digital signal processing. The platform is based on the Cellular Neural/Nonlinear Network (CNN) paradigm. This article presents the implementation of a novel CNN-based segmentation algorithm on the Bi-i system. Each part of the algorithm, along with the corresponding implementation on the hardware platform, is carefully described throughout the article. The experimental results, carried out for the Foreman and Car-phone video sequences, highlight the feasibility of the approach, which provides a frame rate of about 26 frames/s. Comparisons with existing CNN-based methods show that the conceived approach is more accurate, thus representing a good trade-off between real-time requirements and accuracy.
Design and Implementation of IIR Algorithms for Control of Longitudinal Coupled-Bunch Instabilities
Teytelman, Dmitry
2000-05-16
The recent installation of third-harmonic RF cavities at the Advanced Light Source has raised instability growth rates, and also caused tune shifts (coherent and incoherent) of more than an octave over the required range of beam currents and energies. The larger growth rates and tune shifts have rendered control by the original bandpass FIR feedback algorithms unreliable. In this paper the authors describe an implementation of an IIR feedback algorithm with more flexible response tailoring. A cascade of up to 6 second-order IIR sections (12 poles and 12 zeros) was implemented in the DSPs of the longitudinal feedback system. Filter design has been formulated as an optimization problem and solved using constrained optimization methods. These IIR filters provided 2.4 times the control bandwidth of the original FIR designs. Here the authors demonstrate the performance of the designed filters using transient diagnostic measurements from the ALS and DAΦNE.
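The cascade-of-biquads structure can be sketched as below, using Direct Form II transposed sections; the DSP fixed-point details and the optimization-based design of the coefficients themselves are omitted:

```python
import numpy as np

def sos_filter(sections, x):
    """Run signal x through a cascade of second-order IIR sections.

    Each section is (b0, b1, b2, a1, a2) in Direct Form II transposed:
        y[n] = b0*x[n] + s1
        s1   = b1*x[n] - a1*y[n] + s2
        s2   = b2*x[n] - a2*y[n]
    Six such sections give the 12-pole/12-zero response of the paper.
    """
    y = np.asarray(x, dtype=float)
    for b0, b1, b2, a1, a2 in sections:
        s1 = s2 = 0.0
        out = np.empty_like(y)
        for n, xn in enumerate(y):
            yn = b0 * xn + s1
            s1 = b1 * xn - a1 * yn + s2
            s2 = b2 * xn - a2 * yn
            out[n] = yn
        y = out           # output of one section feeds the next
    return y
```

Cascading second-order sections rather than one high-order direct form keeps the coefficient sensitivity low, which matters in fixed-point DSPs.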
A Flexible VHDL Floating Point Module for Control Algorithm Implementation in Space Applications
NASA Astrophysics Data System (ADS)
Padierna, A.; Nicoleau, C.; Sanchez, J.; Hidalgo, I.; Elvira, S.
2012-08-01
The implementation of control loops for space applications is an area with great potential. However, the characteristics of this kind of system, such as its wide dynamic range of numeric values, make fixed-point algorithms inadequate. The generic chips available for processing floating-point data are, in general, not qualified to operate in space environments, and using an IP module in a space-qualified FPGA/ASIC is not viable due to the low number of logic cells available in these devices, so an alternative is necessary. For these reasons, this paper presents a VHDL floating-point module. This proposal allows the design and execution of floating-point algorithms with acceptable occupancy for implementation in FPGAs/ASICs qualified for space environments.
VHDL implementation of feature-extraction algorithm for the PANDA electromagnetic calorimeter
NASA Astrophysics Data System (ADS)
Guliyev, E.; Kavatsyuk, M.; Lemmens, P. J. J.; Tambave, G.; Löhner, H.; Panda Collaboration
2012-02-01
A simple, efficient, and robust feature-extraction algorithm, developed for the digital front-end electronics of the electromagnetic calorimeter of the PANDA spectrometer at FAIR, Darmstadt, is implemented in VHDL for a commercial 16 bit 100 MHz sampling ADC. The source code is available as an open-source project and is adaptable for other projects and sampling ADCs. Best performance with different types of signal sources can be achieved through flexible parameter selection. The on-line data processing in the FPGA makes it possible to construct an almost dead-time-free data acquisition system, which is successfully evaluated as a first step towards building a complete trigger-less readout chain. Prototype setups are studied to determine the dead time of the implemented algorithm, the rate of false triggering, the timing performance, and event correlations.
An Optional Threshold with Svm Cloud Detection Algorithm and Dsp Implementation
NASA Astrophysics Data System (ADS)
Zhou, Guoqing; Zhou, Xiang; Yue, Tao; Liu, Yilong
2016-06-01
This paper presents a method combining the traditional threshold method and the SVM method to detect clouds in Landsat-8 images. The proposed method is implemented on a DSP for real-time cloud detection. The DSP platform connects with an emulator and a personal computer. The threshold method is first utilized to obtain a coarse cloud detection result, and then the SVM classifier is used to obtain high-accuracy cloud detection. More than 200 cloudy Landsat-8 images were used to test the proposed method. Comparing the proposed method with the SVM method demonstrates that the cloud detection accuracy of each image using the proposed algorithm is higher than that of the SVM algorithm. The results of the experiment demonstrate that the implementation of the proposed method on a DSP can effectively realize accurate real-time cloud detection.
NASA Technical Reports Server (NTRS)
Tilton, James C.; Plaza, Antonio J. (Editor); Chang, Chein-I. (Editor)
2008-01-01
The hierarchical image segmentation algorithm (referred to as HSEG) is a hybrid of hierarchical step-wise optimization (HSWO) and constrained spectral clustering that produces a hierarchical set of image segmentations. HSWO is an iterative approach to region growing segmentation in which the optimal image segmentation is found at N(sub R) regions, given a segmentation at N(sub R+1) regions. HSEG's addition of constrained spectral clustering makes it a computationally intensive algorithm for all but the smallest of images. To counteract this, a computationally efficient recursive approximation of HSEG (called RHSEG) has been devised. Further improvements in processing speed are obtained through a parallel implementation of RHSEG. This chapter describes this parallel implementation and demonstrates its computational efficiency on a Landsat Thematic Mapper test scene.
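The best-merge iteration underlying HSWO can be sketched on a 1-D signal, with dissimilarity taken as the difference of region means (an illustrative choice); HSEG's additional constrained spectral clustering of non-adjacent regions is not shown:

```python
import numpy as np

def hswo_1d(values, n_regions):
    """Best-merge region growing on a 1-D signal (HSWO-like sketch).

    Repeatedly merges the pair of ADJACENT regions whose means are
    closest, until n_regions remain.
    """
    regions = [[i] for i in range(len(values))]
    means = [float(v) for v in values]
    sizes = [1] * len(values)
    while len(regions) > n_regions:
        diffs = [abs(means[i] - means[i + 1]) for i in range(len(means) - 1)]
        i = int(np.argmin(diffs))                 # globally best merge
        tot = sizes[i] + sizes[i + 1]
        means[i] = (means[i] * sizes[i] + means[i + 1] * sizes[i + 1]) / tot
        sizes[i] = tot
        regions[i] += regions.pop(i + 1)
        means.pop(i + 1)
        sizes.pop(i + 1)
    return regions
```

Recording the segmentation after each merge yields the hierarchical set of segmentations the abstract refers to.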
NASA Technical Reports Server (NTRS)
Jaggi, S.; Quattrochi, Dale A.; Lam, Nina S.-N.
1993-01-01
Fractal geometry is increasingly becoming a useful tool for modeling natural phenomena. As an alternative to Euclidean concepts, fractals allow for a more accurate representation of the nature of complexity in natural boundaries and surfaces. The purpose of this paper is to introduce and implement three algorithms in C code for deriving fractal measurement from remotely sensed data. These three methods are: the line-divider method, the variogram method, and the triangular prism method. Remote-sensing data acquired by NASA's Calibrated Airborne Multispectral Scanner (CAMS) are used to compute the fractal dimension using each of the three methods. These data were obtained at a 30 m pixel spatial resolution over a portion of western Puerto Rico in January 1990. A description of the three methods, their implementation in a PC-compatible environment, and some results of applying these algorithms to remotely sensed image data are presented.
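The line-divider method can be sketched as below (the variogram and triangular prism methods rest on the same log-log slope idea applied to surfaces): walk the curve with each ruler length r, count the steps N(r), and read the dimension off the slope of log N versus log r:

```python
import numpy as np

def divider_dimension(points, rulers):
    """Estimate fractal dimension by the line-divider (ruler) method.

    points: (n, 2) array of curve vertices in order.
    N(r) ~ r^(-D), so D is minus the slope of log N vs. log r.
    """
    counts = []
    for r in rulers:
        i, n = 0, 0
        while True:
            j = i + 1
            # advance to the first vertex at least one ruler away
            while j < len(points) and np.linalg.norm(points[j] - points[i]) < r:
                j += 1
            if j == len(points):
                break                  # ran off the end of the curve
            i, n = j, n + 1
        counts.append(n)
    slope = np.polyfit(np.log(rulers), np.log(counts), 1)[0]
    return -slope
```

For a smooth coastline-like curve D approaches 1; the more convoluted the boundary, the closer D gets to 2.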
A new morphological anomaly detection algorithm for hyperspectral images and its GPU implementation
NASA Astrophysics Data System (ADS)
Paz, Abel; Plaza, Antonio
2011-10-01
Anomaly detection is considered a very important task for hyperspectral data exploitation. It is now routinely applied in many application domains, including defence and intelligence, public safety, precision agriculture, geology, and forestry. Many of these applications require timely responses for swift decisions, which depend on high-performance computing implementations of the analysis algorithms. However, with the recent explosion in the amount and dimensionality of hyperspectral imagery, this problem calls for the incorporation of parallel computing techniques. In the past, clusters of computers have offered an attractive solution for fast anomaly detection in hyperspectral data sets already transmitted to Earth. However, these systems are expensive and difficult to adapt to on-board data processing scenarios, in which low-weight and low-power integrated components are essential to reduce mission payload and obtain analysis results in (near) real-time, i.e., at the same time as the data is collected by the sensor. An exciting new development in the field of commodity computing is the emergence of commodity graphics processing units (GPUs), which can now bridge the gap towards on-board processing of remotely sensed hyperspectral data. In this paper, we develop a new morphological algorithm for anomaly detection in hyperspectral images along with an efficient GPU implementation of the algorithm. The algorithm is implemented on latest-generation GPU architectures and evaluated against other anomaly detection algorithms using hyperspectral data collected by NASA's Airborne Visible Infra-Red Imaging Spectrometer (AVIRIS) over the World Trade Center (WTC) in New York, five days after the terrorist attacks that collapsed the two main towers in the WTC complex. The proposed GPU implementation achieves real-time performance in the considered case study.
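The abstract does not give the morphological algorithm's details; as a baseline sketch of what hyperspectral anomaly detection computes, here is the classical global RX detector (a standard comparison method, not the paper's algorithm): each pixel is scored by its Mahalanobis distance to the scene-wide background statistics.

```python
import numpy as np

def rx_detector(cube):
    """Global RX anomaly detector for a hyperspectral cube (rows, cols, bands).

    Returns a per-pixel Mahalanobis distance to the global background
    mean/covariance; large values flag spectrally anomalous pixels.
    """
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands).astype(float)
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    inv = np.linalg.inv(cov + 1e-6 * np.eye(bands))  # small regularization
    d = X - mu
    scores = np.einsum('ij,jk,ik->i', d, inv, d)     # row-wise quadratic form
    return scores.reshape(rows, cols)

# Synthetic scene: flat noisy background plus one spectrally distinct pixel.
rng = np.random.default_rng(1)
scene = rng.normal(0.0, 1.0, size=(32, 32, 10))
scene[5, 7] += 8.0                                   # planted anomaly
rx = rx_detector(scene)
anomaly = np.unravel_index(np.argmax(rx), rx.shape)
```

The per-pixel independence of the scoring step is what makes this family of detectors map so well onto GPUs.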
Hsu, Chung-Jen; Fikentscher, Joshua; Kreeb, Robert
2017-05-29
A channel congestion problem might occur when traffic density increases, because the number of basic safety messages carried on the communication channel also increases in vehicle-to-vehicle communications. A remedy algorithm proposed in SAE J2945/1 is designed to address the channel congestion issue by decreasing transmission frequency and radiated power. The aim of this study is to develop potential test procedures for evaluating or validating the congestion control algorithm. Simulations of a reference unit transmitting at a higher frequency are implemented to emulate a number of onboard equipment (OBE) units transmitting at the normal interval of 100 ms (10 Hz). When the transmitting interval is reduced to 1.25 ms (800 Hz), the reference unit emulates 80 vehicles transmitting at 10 Hz. By increasing the number of reference units transmitting at 800 Hz in the simulations, the corresponding channel busy percentages are obtained. An algorithm for Global Positioning System (GPS) data generation of virtual vehicles is developed to facilitate the validation of transmission intervals in the congestion control algorithm. Channel busy percentage is the channel busy time over a specified period of time. Three or four reference units are needed to generate channel busy percentages between 50% and 80%, and five reference units can generate channel busy percentages above 80%. The proposed test procedures can verify the operation of the congestion control algorithm when channel busy percentages are between 50% and 80% and above 80%. By using a GPS data generation algorithm, the test procedures can also verify the transmission intervals when traffic densities are 80 and 200 vehicles within a radius of 100 m. A suite of test tools with functional requirements is also proposed to facilitate the implementation of the test procedures. The potential test procedures for a congestion control algorithm are developed based on the simulation results of channel busy percentage and the GPS data generation algorithm.
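The emulation arithmetic above is simple to reproduce. The sketch below assumes an illustrative 200-microsecond message air-time (the abstract does not state the actual message duration) and ignores collisions and overlapping transmissions.

```python
def emulated_vehicles(ref_interval_s, normal_interval_s=0.100):
    """Number of 10 Hz vehicles one reference unit emulates when it
    transmits every ref_interval_s seconds."""
    return normal_interval_s / ref_interval_s

def channel_busy_percentage(n_ref_units, ref_interval_s, msg_duration_s):
    """Fraction of time the channel is occupied by the reference units'
    messages, as a percentage (no collision/overlap modeling)."""
    msgs_per_second = n_ref_units / ref_interval_s
    return min(100.0, 100.0 * msgs_per_second * msg_duration_s)

# One unit at a 1.25 ms interval (800 Hz) stands in for 80 vehicles at 10 Hz.
n_emulated = emulated_vehicles(0.00125)
# Four reference units at 800 Hz with an assumed 200 us message duration.
busy = channel_busy_percentage(4, 0.00125, 200e-6)
```

With the assumed message duration, four units give 64% channel busy, consistent with the abstract's statement that three or four units produce busy percentages in the 50-80% range.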
Audi, Ahmad; Pierrot-Deseilligny, Marc; Meynard, Christophe; Thom, Christian
2017-07-18
Images acquired with a long exposure time using a camera embedded on UAVs (Unmanned Aerial Vehicles) exhibit motion blur due to the erratic movements of the UAV. The aim of the present work is to be able to acquire several images with a short exposure time and use an image processing algorithm to produce a stacked image with an equivalent long exposure time. Our method is based on the feature point image registration technique. The algorithm is implemented on the light-weight IGN (Institut national de l'information géographique) camera, which has an IMU (Inertial Measurement Unit) sensor and an SoC (System on Chip)/FPGA (Field-Programmable Gate Array). To obtain the correct parameters for the resampling of the images, the proposed method accurately estimates the geometrical transformation between the first and the N-th images. Feature points are detected in the first image using the FAST (Features from Accelerated Segment Test) detector, then homologous points on other images are obtained by template matching using an initial position benefiting greatly from the presence of the IMU sensor. The SoC/FPGA in the camera is used to speed up some parts of the algorithm in order to achieve real-time performance as our ultimate objective is to exclusively write the resulting image to save bandwidth on the storage device. The paper includes a detailed description of the implemented algorithm, resource usage summary, resulting processing time, resulting images and block diagrams of the described architecture. The resulting stacked image obtained for real surveys does not seem visually impaired. An interesting by-product of this algorithm is the 3D rotation estimated by a photogrammetric method between poses, which can be used to recalibrate in real time the gyrometers of the IMU. Timing results demonstrate that the image resampling part of this algorithm is the most demanding processing task and should also be accelerated in the FPGA in future work.
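The stacking idea above (register each short exposure to the first frame, then average) can be illustrated with a toy registration step. This sketch uses brute-force integer-shift correlation instead of the paper's FAST feature detection and IMU-assisted template matching, purely to show the register-then-accumulate structure.

```python
import numpy as np

def estimate_shift(ref, img, max_shift=5):
    """Estimate the integer (dy, dx) translation of img relative to ref
    by brute-force correlation over a small search window."""
    best, best_score = (0, 0), -np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(img, -dy, axis=0), -dx, axis=1)
            score = np.sum(ref * shifted)
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best

def stack(images, max_shift=5):
    """Register every frame to the first and average them, emulating a
    long exposure from several short ones."""
    ref = images[0]
    acc = np.zeros_like(ref, dtype=float)
    for img in images:
        dy, dx = estimate_shift(ref, img, max_shift)
        acc += np.roll(np.roll(img, -dy, axis=0), -dx, axis=1)
    return acc / len(images)

# Three frames of the same scene with known (circular) shifts.
rng = np.random.default_rng(2)
base = rng.random((64, 64))
frames = [np.roll(np.roll(base, dy, axis=0), dx, axis=1)
          for dy, dx in [(0, 0), (2, -1), (-3, 4)]]
shift = estimate_shift(frames[0], frames[1])
stacked = stack(frames)
```

Real frames need sub-pixel resampling under a full homography, which is why the paper's resampling stage dominates the processing time.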
Baugh, J; Moussa, O; Ryan, C A; Nayak, A; Laflamme, R
2005-11-24
The counter-intuitive properties of quantum mechanics have the potential to revolutionize information processing by enabling the development of efficient algorithms with no known classical counterparts. Harnessing this power requires the development of a set of building blocks, one of which is a method to initialize the set of quantum bits (qubits) to a known state. Additionally, fresh ancillary qubits must be available during the course of computation to achieve fault tolerance. In any physical system used to implement quantum computation, one must therefore be able to selectively and dynamically remove entropy from the part of the system that is to be mapped to qubits. One such method is an 'open-system' cooling protocol in which a subset of qubits can be brought into contact with an external system of large heat capacity. Theoretical efforts have led to an implementation-independent cooling procedure, namely heat-bath algorithmic cooling. These efforts have culminated with the proposal of an optimal algorithm, the partner-pairing algorithm, which was used to compute the physical limits of heat-bath algorithmic cooling. Here we report the experimental realization of multi-step cooling of a quantum system via heat-bath algorithmic cooling. The experiment was carried out using nuclear magnetic resonance of a solid-state ensemble three-qubit system. We demonstrate the repeated repolarization of a particular qubit to an effective spin-bath temperature, and alternating logical operations within the three-qubit subspace to ultimately cool a second qubit below this temperature. Demonstration of the control necessary for these operations represents an important step forward in the manipulation of solid-state nuclear magnetic resonance qubits.
Implementation of a block Lanczos algorithm for Eigenproblem solution of gyroscopic systems
NASA Technical Reports Server (NTRS)
Gupta, Kajal K.; Lawson, Charles L.
1987-01-01
The details of implementation of a general numerical procedure developed for the accurate and economical computation of natural frequencies and associated modes of any elastic structure rotating along an arbitrary axis are described. A block version of the Lanczos algorithm is derived for the solution that fully exploits associated matrix sparsity and employs only real numbers in all relevant computations. It is also capable of determining multiple roots and proves to be most efficient when compared to other similar existing techniques.
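For readers unfamiliar with the Lanczos family, here is a minimal single-vector Lanczos tridiagonalization sketch (a block version would iterate on several starting vectors at once, which is what enables capturing multiple roots as the entry describes). Full reorthogonalization is used for robustness; production codes are more careful.

```python
import numpy as np

def lanczos(A, k, seed=0):
    """Lanczos tridiagonalization of a real symmetric matrix A.

    Returns the k x k tridiagonal matrix T whose extreme eigenvalues
    approximate those of A.
    """
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    Q = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(k)
    for j in range(k):
        Q[:, j] = q
        w = A @ q
        alpha[j] = q @ w
        # Full reorthogonalization against all previous Lanczos vectors.
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        beta[j] = np.linalg.norm(w)
        if beta[j] < 1e-12:
            break
        q = w / beta[j]
    return np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)

rng = np.random.default_rng(3)
M = rng.standard_normal((50, 50))
A = (M + M.T) / 2
T = lanczos(A, 50)
approx = np.linalg.eigvalsh(T)
exact = np.linalg.eigvalsh(A)
```

Run to completion (k = n), T is orthogonally similar to A, so the spectra agree; in practice k << n and only the extreme eigenvalues converge early.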
Decoding the Brain’s Algorithm for Categorization from its Neural Implementation
Mack, Michael L.; Preston, Alison R.; Love, Bradley C.
2013-01-01
Acts of cognition can be described at different levels of analysis: what behavior should characterize the act, what algorithms and representations underlie the behavior, and how the algorithms are physically realized in neural activity [1]. Theories that bridge levels of analysis offer more complete explanations by leveraging the constraints present at each level [2-4]. Despite the great potential for theoretical advances, few studies of cognition bridge levels of analysis. For example, formal cognitive models of category decisions accurately predict human decision making [5, 6], but whether model algorithms and representations supporting category decisions are consistent with the underlying neural implementation remains unknown. This uncertainty is largely due to the hurdle of forging links between theory and brain [7-9]. Here, we tackle this critical problem by using brain response to characterize the nature of mental computations that support category decisions, in order to evaluate two dominant, and opposing, models of categorization. We found that brain states during category decisions were significantly more consistent with latent model representations from exemplar [5] rather than prototype theory [10, 11]. Representations of individual experiences, not the abstraction of experiences, are critical for category decision making. Holding models accountable for behavior and neural implementation provides a means for advancing more complete descriptions of the algorithms of cognition. PMID:24094852
NASA Astrophysics Data System (ADS)
Besrour, Amine; Abdelkefi, Fatma; Siala, Mohamed; Snoussi, Hichem
2015-09-01
Nowadays, high dynamic range (HDR) imaging is the subject of extensive research. The major problem lies in implementing the best algorithm to acquire the best video quality. In fact, the main constraint is to design an optimal fusion that can cope with the rapid movement of video frames: existing merging algorithms have not been fast enough to reconstitute HDR video. In this paper, we review the previous existing works before detailing our algorithm and presenting results from the acquired HDR images, tone mapped with various techniques. Our proposed algorithm guarantees a more enhanced and faster solution than existing ones. It computes a saturation matrix related to the saturation rate of the neighboring pixels, and the computed coefficients are assigned to each of the tested pictures. This analysis provides faster and more efficient results in terms of quality and brightness. The originality of our work lies in its processing method, which accounts for pixel saturation across all captured pictures and combines them to obtain the best picture showing all possible details. These parameters are computed for each zone depending on the contrast and luminosity of the current pixel and its neighborhood. The final HDR image's coefficients are calculated dynamically, ensuring the best image quality by balancing brightness and contrast values in the final image.
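The abstract's saturation-weighted fusion can be illustrated with a generic exposure-fusion sketch (this is a textbook weighting scheme, not the authors' exact coefficient computation): pixels near mid-range get high weight, while under- and over-saturated pixels are down-weighted.

```python
import numpy as np

def fuse_exposures(stack, sigma=0.2):
    """Per-pixel weighted fusion of differently exposed frames.

    stack: array-like of shape (n_frames, H, W) with values in [0, 1].
    Pixels near 0.5 dominate; pixels near 0 or 1 (saturated) are
    down-weighted, so each region is driven by its best-exposed frame.
    """
    stack = np.asarray(stack, dtype=float)
    weights = np.exp(-((stack - 0.5) ** 2) / (2 * sigma ** 2))
    weights /= weights.sum(axis=0, keepdims=True)
    return (weights * stack).sum(axis=0)

# Three frames of one flat scene: underexposed, mid, overexposed.
dark = np.full((4, 4), 0.05)
mid = np.full((4, 4), 0.5)
bright = np.full((4, 4), 0.95)
fused = fuse_exposures([dark, mid, bright])
```

By symmetry the fused value here is exactly 0.5: the well-exposed frame dominates and the two saturated frames cancel each other's bias.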
Implementation of Human Trafficking Education and Treatment Algorithm in the Emergency Department.
Egyud, Amber; Stephens, Kimberly; Swanson-Bierman, Brenda; DiCuccio, Marge; Whiteman, Kimberly
2017-04-18
Health care professionals have not been successful in recognizing or rescuing victims of human trafficking. The purpose of this project was to implement a screening system and treatment algorithm in the emergency department to improve the identification and rescue of victims of human trafficking. The lack of recognition by health care professionals is related to inadequate education and training tools and confusion with other forms of violence such as trauma and sexual assault. A multidisciplinary team was formed to assess the evidence related to human trafficking and make recommendations for practice. After receiving education, staff completed a survey about knowledge gained from the training. An algorithm for identification and treatment of sex trafficking victims was implemented and included a 2-pronged identification approach: (1) medical red flags created by a risk-assessment tool embedded in the electronic health record and (2) a silent notification process. Outcome measures were the number of victims who were identified either by the medical red flags or by silent notification and were offered and accepted intervention. Survey results indicated that 75% of participants reported that the education improved their competence level. The results demonstrated that an education and treatment algorithm may be an effective strategy to improve recognition. One patient was identified as an actual victim of human trafficking; the remaining patients reported other forms of abuse. Education and a treatment algorithm were effective strategies to improve recognition and rescue of human trafficking victims and increase identification of other forms of abuse. Copyright © 2017 Elsevier Inc. All rights reserved.
Chen, Fangyue; Chen, Guanrong; He, Qinbin; He, Guolong; Xu, Xiubin
2009-08-01
Implementing linearly nonseparable Boolean functions (non-LSBF) has been an important and yet challenging task due to the extremely high complexity of this kind of functions and the exponentially increasing percentage of the number of non-LSBF in the entire set of Boolean functions as the number of input variables increases. In this paper, an algorithm named DNA-like learning and decomposing algorithm (DNA-like LDA) is proposed, which is capable of effectively implementing non-LSBF. The novel algorithm first trains the DNA-like offset sequence and decomposes non-LSBF into logic XOR operations of a sequence of LSBF, and then determines the weight-threshold values of the multilayer perceptron (MLP) that perform both the decompositions of LSBF and the function mapping the hidden neurons to the output neuron. The algorithm is validated by two typical examples about the problem of approximating the circular region and the well-known n-bit parity Boolean function (PBF).
Artificial immune algorithm implementation for optimized multi-axis sculptured surface CNC machining
NASA Astrophysics Data System (ADS)
Fountas, N. A.; Kechagias, J. D.; Vaxevanidis, N. M.
2016-11-01
This paper presents the results obtained by the implementation of an artificial immune algorithm to optimize standard multi-axis tool-paths applied to machine free-form surfaces. The investigation for its applicability was based on a full factorial experimental design addressing the two additional axes for tool inclination as independent variables whilst a multi-objective response was formulated by taking into consideration surface deviation and tool path time; objectives assessed directly from computer-aided manufacturing environment A standard sculptured part was developed by scratch considering its benchmark specifications and a cutting-edge surface machining tool-path was applied to study the effects of the pattern formulated when dynamically inclining a toroidal end-mill and guiding it towards the feed direction under fixed lead and tilt inclination angles. The results obtained form the series of the experiments were used for the fitness function creation the algorithm was about to sequentially evaluate. It was found that the artificial immune algorithm employed has the ability of attaining optimal values for inclination angles facilitating thus the complexity of such manufacturing process and ensuring full potentials in multi-axis machining modelling operations for producing enhanced CNC manufacturing programs. Results suggested that the proposed algorithm implementation may reduce the mean experimental objective value to 51.5%
ParaKMeans: Implementation of a parallelized K-means algorithm suitable for general laboratory use.
Kraj, Piotr; Sharma, Ashok; Garge, Nikhil; Podolsky, Robert; McIndoe, Richard A
2008-04-16
During the last decade, the use of microarrays to assess the transcriptome of many biological systems has generated an enormous amount of data. A common technique used to organize and analyze microarray data is to perform cluster analysis. While many clustering algorithms have been developed, they all suffer a significant decrease in computational performance as the size of the dataset being analyzed becomes very large. For example, clustering 10000 genes from an experiment containing 200 microarrays can be quite time consuming and challenging on a desktop PC. One solution to the scalability problem of clustering algorithms is to distribute or parallelize the algorithm across multiple computers. The software described in this paper is a high performance multithreaded application that implements a parallelized version of the K-means Clustering algorithm. Most parallel processing applications are not accessible to the general public and require specialized software libraries (e.g. MPI) and specialized hardware configurations. The parallel nature of the application comes from the use of a web service to perform the distance calculations and cluster assignments. Here we show our parallel implementation provides significant performance gains over a wide range of datasets using as little as seven nodes. The software was written in C# and was designed in a modular fashion to provide both deployment flexibility as well as flexibility in the user interface. ParaKMeans was designed to provide the general scientific community with an easy and manageable client-server application that can be installed on a wide variety of Windows operating systems.
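The design described above, farming the distance calculations and cluster assignments out to workers, can be sketched in a few lines. This is an illustrative thread-based version, not ParaKMeans' C#/web-service implementation; the chunked `assign_chunk` step is the part that parallelizes.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def assign_chunk(chunk, centers):
    """Distance computation + cluster assignment for one chunk of rows,
    the unit of work a parallel K-means distributes to workers."""
    d = ((chunk[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def farthest_first_init(X, k, rng):
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    return np.array(centers)

def parallel_kmeans(X, k, n_iter=20, n_workers=4, seed=0):
    rng = np.random.default_rng(seed)
    centers = farthest_first_init(X, k, rng)
    chunks = np.array_split(X, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(n_iter):
            parts = pool.map(assign_chunk, chunks, [centers] * n_workers)
            labels = np.concatenate(list(parts))
            new_centers = centers.copy()
            for j in range(k):
                mask = labels == j
                if mask.any():
                    new_centers[j] = X[mask].mean(axis=0)
            centers = new_centers
    return labels, centers

# Two well-separated synthetic "expression profiles".
rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 0.3, (50, 5)), rng.normal(5, 0.3, (50, 5))])
labels, centers = parallel_kmeans(X, 2)
```

Only the assignment step is embarrassingly parallel; the centroid update is a cheap reduction, which matches the client-server split the paper describes.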
Madduri, Kamesh; Ediger, David; Jiang, Karl; Bader, David A.; Chavarria-Miranda, Daniel
2009-02-15
We present a new lock-free parallel algorithm for computing betweenness centrality of massive small-world networks. With minor changes to the data structures, our algorithm also achieves better spatial cache locality compared to previous approaches. Betweenness centrality is a key algorithm kernel in HPCS SSCA#2, a benchmark extensively used to evaluate the performance of emerging high-performance computing architectures for graph-theoretic computations. We design optimized implementations of betweenness centrality and the SSCA#2 benchmark for two hardware multithreaded systems: a Cray XMT system with the Threadstorm processor, and a single-socket Sun multicore server with the UltraSPARC T2 processor. For a small-world network of 134 million vertices and 1.073 billion edges, the 16-processor XMT system and the 8-core Sun Fire T5120 server achieve TEPS scores (an algorithmic performance count for the SSCA#2 benchmark) of 160 million and 90 million respectively, which corresponds to more than a 2X performance improvement over the previous parallel implementations. To better characterize the performance of these multithreaded systems, we correlate the SSCA#2 performance results with data from the memory-intensive STREAM and RandomAccess benchmarks. Finally, we demonstrate the applicability of our implementation to analyze massive real-world datasets by computing approximate betweenness centrality for a large-scale IMDb movie-actor network.
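The kernel underlying such implementations is Brandes' algorithm; here is a minimal serial sketch for unweighted graphs (the paper's contribution is the lock-free parallelization and cache-friendly data layout, which this sketch does not attempt).

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for unweighted betweenness centrality.

    adj: dict vertex -> list of neighbours (undirected graph).
    Returns dict vertex -> centrality, each unordered pair counted once.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and predecessors.
        sigma = {v: 0 for v in adj}; sigma[s] = 1
        dist = {v: -1 for v in adj}; dist[s] = 0
        preds = {v: [] for v in adj}
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Dependency accumulation in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    for v in bc:
        bc[v] /= 2.0  # undirected: each pair was counted from both ends
    return bc

# Path graph 0-1-2-3: interior vertices carry all the shortest paths.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
bc = betweenness(path)
```

The per-source BFS iterations are independent, which is exactly the parallelism the multithreaded implementations exploit; the hard part is making the shared accumulations lock-free.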
Implementation and evaluation of various demons deformable image registration algorithms on a GPU.
Gu, Xuejun; Pan, Hubert; Liang, Yun; Castillo, Richard; Yang, Deshan; Choi, Dongju; Castillo, Edward; Majumdar, Amitava; Guerrero, Thomas; Jiang, Steve B
2010-01-07
Online adaptive radiation therapy (ART) promises the ability to deliver an optimal treatment in response to daily patient anatomic variation. A major technical barrier for the clinical implementation of online ART is the requirement of rapid image segmentation. Deformable image registration (DIR) has been used as an automated segmentation method to transfer tumor/organ contours from the planning image to daily images. However, the current computational time of DIR is insufficient for online ART. In this work, this issue is addressed by using computer graphics processing units (GPUs). A gray-scale-based DIR algorithm called demons and five of its variants were implemented on GPUs using the compute unified device architecture (CUDA) programming environment. The spatial accuracy of these algorithms was evaluated over five sets of pulmonary 4D CT images with an average size of 256 x 256 x 100 and more than 1100 expert-determined landmark point pairs each. For all the testing scenarios presented in this paper, the GPU-based DIR computation required around 7 to 11 s to yield an average 3D error ranging from 1.5 to 1.8 mm. It is interesting to find out that the original passive force demons algorithms outperform subsequently proposed variants based on the combination of accuracy, efficiency and ease of implementation.
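The demons force at the core of all the evaluated variants can be shown in a 1D toy form (a sketch of the classic passive-force update on synthetic signals, not the GPU/CUDA implementations or the 4D CT evaluation): the displacement at each point is nudged by the intensity difference times the fixed-image gradient, normalized by the usual demons denominator, then smoothed.

```python
import numpy as np

def demons_1d(fixed, moving, n_iter=200, step=1.0):
    """Toy 1D demons registration: estimate a displacement field u such
    that moving(x + u(x)) matches fixed(x)."""
    x = np.arange(len(fixed), dtype=float)
    u = np.zeros_like(x)
    grad_f = np.gradient(fixed)
    for _ in range(n_iter):
        warped = np.interp(x + u, x, moving)
        diff = warped - fixed
        denom = grad_f ** 2 + diff ** 2 + 1e-12
        u -= step * diff * grad_f / denom          # classic demons force
        u = np.convolve(u, np.ones(5) / 5, mode='same')  # regularize field
    return u

# A smooth bump shifted by 3 samples; registration should undo the shift.
x = np.arange(100, dtype=float)
fixed = np.exp(-((x - 50) ** 2) / 50)
moving = np.exp(-((x - 53) ** 2) / 50)
u = demons_1d(fixed, moving)
before = np.abs(moving - fixed).mean()
after = np.abs(np.interp(x + u, x, moving) - fixed).mean()
```

Each pixel's update is independent given the current field, which is why demons parallelizes so naturally onto GPUs.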
Ripple FPN reduced algorithm based on temporal high-pass filter and hardware implementation
NASA Astrophysics Data System (ADS)
Li, Yiyang; Li, Shuo; Zhang, Zhipeng; Jin, Weiqi; Wu, Lei; Jin, Minglei
2016-11-01
Cooled infrared detector arrays often suffer from undesired ripple fixed-pattern noise (FPN) when observing sky scenes. Ripple FPN seriously affects the imaging quality of a thermal imager, especially for small-target detection and tracking. It is hard to eliminate this FPN with calibration-based techniques or current scene-based nonuniformity correction algorithms. In this paper, we present a modified spatial low-pass and temporal high-pass nonuniformity correction algorithm using an adaptive time-domain threshold (THP&GM). The threshold is designed to significantly reduce ghosting artifacts. We test the algorithm on real infrared data in comparison to several previously published methods. This algorithm not only effectively corrects common FPN such as stripes, but also has a clear advantage over current methods in terms of detail preservation and convergence speed, especially for ripple FPN correction. Furthermore, we demonstrate our architecture with a prototype built on a Xilinx Virtex-5 XC5VLX50T field-programmable gate array (FPGA). The hardware implementation of the algorithm on the FPGA has two advantages: (1) low resource consumption, and (2) small hardware delay (less than 20 lines). The hardware has been successfully applied in an actual system.
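The temporal high-pass idea can be sketched without the paper's adaptive threshold or spatial low-pass stage: track each pixel's slowly varying component with a recursive low-pass filter and subtract it, so the static fixed pattern is removed while fast scene content passes through. (Real implementations, including THP&GM, add the adaptive threshold precisely to avoid ghosting on static scenes, which this bare sketch would exhibit.)

```python
import numpy as np

def temporal_highpass_nuc(frames, m=32):
    """Scene-based nonuniformity correction by temporal high-pass filtering.

    Each pixel's slowly varying component (mostly the fixed-pattern
    offset) is tracked with a recursive mean of time constant m frames
    and subtracted from the incoming frame.
    """
    lowpass = frames[0].astype(float)
    out = []
    for f in frames:
        lowpass += (f - lowpass) / m
        out.append(f - lowpass)
    return np.array(out)

# Temporally varying random scene plus a static column-stripe FPN.
rng = np.random.default_rng(5)
fpn = np.tile(rng.normal(0, 5, size=(1, 64)), (64, 1))
frames = [rng.normal(100, 1, size=(64, 64)) + fpn for _ in range(300)]
corrected = temporal_highpass_nuc(frames, m=32)
residual_fpn = corrected[-50:].mean(axis=0)  # time-average of late frames
```

The static stripes (standard deviation 5) are suppressed to a small residual, while the per-frame scene fluctuation is preserved.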
NASA Astrophysics Data System (ADS)
Finney, Greg A.; Persons, Christopher M.; Henning, Stephan; Hazen, Jessie; Whitley, Daniel
2014-06-01
IERUS Technologies, Inc. and the University of Alabama in Huntsville have partnered to perform characterization and development of algorithms and hardware for adaptive optics. To date the algorithm work has focused on implementation of the stochastic parallel gradient descent (SPGD) algorithm. SPGD is a metric-based approach in which a scalar metric is optimized by taking random perturbative steps for many actuators simultaneously. This approach scales to systems with a large number of actuators while maintaining bandwidth, while conventional methods are negatively impacted by the very large matrix multiplications that are required. The metric approach enables the use of higher speed sensors with fewer (or even a single) sensing element(s), enabling a higher control bandwidth. Furthermore, the SPGD algorithm is model-free, and thus is not strongly impacted by the presence of nonlinearities which degrade the performance of conventional phase reconstruction methods. Finally, for high energy laser applications, SPGD can be performed using the primary laser beam without the need for an additional beacon laser. The conventional SPGD algorithm was modified to use an adaptive gain to improve convergence while maintaining low steady state error. Results from laboratory experiments using phase plates as atmosphere surrogates will be presented, demonstrating areas in which the adaptive gain yields better performance and areas which require further investigation.
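The SPGD iteration described above is compact enough to sketch directly: all actuators are perturbed simultaneously with random +/- steps, the scalar metric is measured on both sides of the perturbation, and the control vector is nudged along the perturbation in proportion to the metric change. This uses a fixed gain and a simple quadratic surrogate metric, not the adaptive-gain variant or a real wavefront sensor.

```python
import numpy as np

def spgd(metric, u0, gain=1.0, perturb=0.1, n_iter=500, seed=6):
    """Stochastic parallel gradient descent (ascent on the metric)."""
    rng = np.random.default_rng(seed)
    u = np.array(u0, dtype=float)
    for _ in range(n_iter):
        delta = perturb * rng.choice([-1.0, 1.0], size=u.shape)
        dj = metric(u + delta) - metric(u - delta)  # two metric readings
        u += gain * dj * delta                      # step along perturbation
    return u

# Surrogate "Strehl-like" metric peaked at a target actuator vector.
target = np.linspace(-1, 1, 16)
metric = lambda u: -np.sum((u - target) ** 2)
u = spgd(metric, np.zeros(16))
final_error = np.abs(u - target).max()
```

Note that only two scalar metric evaluations are needed per iteration regardless of the number of actuators, which is the scaling advantage the abstract describes.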
Implementing Few-Body Algorithmic Regularization with Post-Newtonian Terms
NASA Astrophysics Data System (ADS)
Mikkola, Seppo; Merritt, David
2008-06-01
We discuss the implementation of a new regular algorithm for simulation of the gravitational few-body problem. The algorithm uses components from earlier methods, including the chain structure, the logarithmic Hamiltonian, and the time-transformed leapfrog. This algorithmic regularization code, AR-CHAIN, can be used for the normal N-body problem, as well as for problems with softened potentials and/or with velocity-dependent external perturbations, including post-Newtonian terms, which we include up to order PN2.5. Arbitrarily extreme mass ratios are allowed. Only linear coordinate transformations are used and thus the algorithm is somewhat simpler than many earlier regularized schemes. We present the results of performance tests which suggest that the new code is either comparable in performance or superior to the existing regularization schemes based on the Kustaanheimo-Stiefel (KS) transformation. This is true even for the two-body problem, independent of eccentricity. An important advantage of the new method is that, contrary to the older KS-CHAIN code, zero masses are allowed. We use our algorithm to integrate the orbits of the S stars around the Milky Way supermassive black hole for one million years, including PN2.5 terms and an intermediate-mass black hole. The three S stars with shortest periods are observed to escape from the system after a few hundred thousand years.
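The time-transformed leapfrog mentioned above builds on the plain kick-drift-kick leapfrog; here is that building block for the Kepler two-body problem in relative coordinates (a sketch only: AR-CHAIN adds the chain coordinates, logarithmic Hamiltonian time transformation, and PN terms, none of which are reproduced here). Being symplectic, the energy error stays bounded rather than drifting.

```python
import numpy as np

def leapfrog_two_body(r0, v0, dt, n_steps, gm=1.0):
    """Kick-drift-kick leapfrog for the Kepler problem (relative motion)."""
    r, v = np.array(r0, dtype=float), np.array(v0, dtype=float)
    for _ in range(n_steps):
        a = -gm * r / np.linalg.norm(r) ** 3
        v += 0.5 * dt * a              # kick
        r += dt * v                    # drift
        a = -gm * r / np.linalg.norm(r) ** 3
        v += 0.5 * dt * a              # kick
    return r, v

def energy(r, v, gm=1.0):
    return 0.5 * v @ v - gm / np.linalg.norm(r)

# A mildly eccentric bound orbit, integrated for roughly eleven periods.
r0, v0 = np.array([1.0, 0.0]), np.array([0.0, 1.1])
e0 = energy(r0, v0)
r, v = leapfrog_two_body(r0, v0, dt=0.01, n_steps=10_000)
drift = abs(energy(r, v) - e0) / abs(e0)
```

For high-eccentricity encounters the fixed step size fails near pericenter; that is exactly the regime the time transformation in algorithmic regularization is designed to handle.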
FPGA implementation of the hyperspectral Lossy Compression for Exomars (LCE) algorithm
NASA Astrophysics Data System (ADS)
García, Aday; Santos, L.; López, S.; Callicó, G. M.; López, J. F.; Sarmiento, R.
2014-10-01
The increase of data rates and data volumes in present remote sensing payload instruments, together with the restrictions imposed by downlink connection requirements, represents both a challenge and a must in the field of data and image compression. This is especially true for hyperspectral images, in which reduction of both spatial and spectral redundancy is mandatory. Recently the Consultative Committee for Space Data Systems (CCSDS) published the Lossless Multispectral and Hyperspectral Image Compression recommendation (CCSDS 123), a prediction-based technique resulting from the consensus of its members. Although this standard offers a good trade-off between coding performance and computational complexity, the appearance of future hyperspectral and ultraspectral sensors with vast amounts of data imposes further efforts from the scientific community to ensure optimal transmission to ground stations based on greater compression rates. Furthermore, hardware implementations with specific features to deal with solar radiation problems play an important role in achieving real-time applications. In this scenario, the Lossy Compression for Exomars (LCE) algorithm emerges as a good candidate to achieve these characteristics. Its good quality/compression ratio together with its low complexity facilitates implementation on hardware platforms such as FPGAs or ASICs. In this work the authors present the implementation of the LCE algorithm on an antifuse-based FPGA and the optimizations carried out to obtain the RTL description code using CatapultC, a High Level Synthesis (HLS) tool. Experimental results show an area occupancy of 75% in an RTAX2000 FPGA from Microsemi, with an operating frequency of 18 MHz. Additionally, the power budget obtained is presented, giving an idea of the suitability of the proposed algorithm implementation for onboard compression applications.
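The prediction-based structure shared by CCSDS 123 and related onboard compressors can be illustrated with a deliberately minimal previous-pixel predictor and residual mapping (this is not the CCSDS 123 predictor or the LCE quantizer, just the common skeleton): predict each sample, map the signed residual to a non-negative integer for an entropy coder, and verify the lossless round trip.

```python
import numpy as np

def predict_residuals(band):
    """Previous-pixel predictor over one band (row-major scan), with the
    signed residuals mapped to non-negative integers (zig-zag map)."""
    flat = band.ravel().astype(np.int64)
    pred = np.empty_like(flat)
    pred[0] = 0
    pred[1:] = flat[:-1]
    res = flat - pred
    return np.where(res >= 0, 2 * res, -2 * res - 1)

def reconstruct(mapped, shape):
    """Invert the zig-zag map, then invert the prediction by cumulative sum."""
    res = np.where(mapped % 2 == 0, mapped // 2, -(mapped + 1) // 2)
    return np.cumsum(res).reshape(shape)

rng = np.random.default_rng(7)
band = rng.integers(0, 4096, size=(16, 16))   # 12-bit samples
mapped = predict_residuals(band)
recon = reconstruct(mapped, band.shape)
lossless = np.array_equal(recon, band)
```

On real (spatially smooth) imagery the mapped residuals are small and highly compressible; the random test data here only exercises the round-trip correctness.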
Software for implementing trigger algorithms on the upgraded CMS Global Trigger System
NASA Astrophysics Data System (ADS)
Matsushita, Takashi; Arnold, Bernhard
2015-12-01
The Global Trigger is the final step of the CMS Level-1 Trigger and implements a trigger menu, a set of selection requirements applied to the final list of trigger objects. The conditions for trigger object selection, with possible topological requirements on multi-object triggers, are combined by simple combinatorial logic to form the algorithms. The LHC resumed operation in 2015 with the collision energy increased to 13 TeV and the luminosity expected to go up to 2x10^34 cm^-2 s^-1. The CMS Level-1 trigger system will be upgraded to improve its performance for selecting interesting physics events and to operate within the predefined data-acquisition rate in the challenging environment expected at LHC Run 2. The Global Trigger will be re-implemented on modern FPGAs on an Advanced Mezzanine Card in a MicroTCA crate. The upgraded system will benefit from the ability to process complex algorithms with DSP slices and from increased processing resources with optical links running at 10 Gbit/s, enabling more algorithms at a time than previously possible and allowing CMS to be more flexible in how it handles the trigger bandwidth. To handle the increased complexity of the trigger menu implemented on the upgraded Global Trigger, a set of new software has been developed. The software allows a physicist to define a menu with analysis-like triggers using an intuitive user interface. The menu is then realised on FPGAs with further software processing, instantiating predefined firmware blocks. The design and implementation of the software for preparing a menu for the upgraded CMS Global Trigger system are presented.
Yates, James W T
2009-10-01
An implementation of the Expectation-Maximisation (EM) algorithm in ACSLXTREME (AEGIS Technologies) for the analysis of population pharmacokinetic-pharmacodynamic (PKPD) data is demonstrated. The parameter estimation results are compared with those from NONMEM (Globomax) using the first-order conditional estimation method. The estimates are comparable, and it is concluded that the EM algorithm is a useful technique in population pharmacokinetic-pharmacodynamic modelling. The implementation also demonstrates the ease with which parameter estimation algorithms for population data can be implemented in simulation software packages.
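The EM iteration for population data can be sketched on a deliberately simplified model. The two-stage model below (subject effects drawn from a population distribution, observations with known residual variance) and all parameter values are invented for illustration; the PKPD models fitted in ACSLXTREME and NONMEM are far richer:

```python
import numpy as np

# Toy population model: subject effects theta_i ~ N(mu, w2); repeated
# observations y_ij = theta_i + eps, eps ~ N(0, s2) with s2 known.
rng = np.random.default_rng(11)
n_sub, n_obs = 200, 5
mu_true, w2_true, s2 = 2.0, 0.5, 0.1
theta = rng.normal(mu_true, np.sqrt(w2_true), n_sub)
y = theta[:, None] + rng.normal(0.0, np.sqrt(s2), (n_sub, n_obs))

mu, w2 = 0.0, 1.0                        # crude starting values
for _ in range(100):
    # E-step: posterior mean/variance of each subject effect
    v_post = 1.0 / (1.0 / w2 + n_obs / s2)
    m_post = v_post * (mu / w2 + y.sum(axis=1) / s2)
    # M-step: re-estimate the population parameters
    mu = m_post.mean()
    w2 = np.mean(v_post + (m_post - mu) ** 2)
```

Each E-step uses the current population parameters to form the posterior of each subject's effect; the M-step re-estimates the population mean and variance from those posteriors, so the estimates should land close to the values used to simulate the data.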
Crawford, Kevan C.; Sandquist, Gary M.
1990-07-01
The emphasis of this work is the development and implementation of an automatic control philosophy which uses the classical operational philosophies as a foundation. Three control algorithms were derived based on various simplifying assumptions. Two of the algorithms were tested in computer simulations. After realizing the insensitivity of the system to the simplifications, the most reduced form of the algorithms was implemented on the computer control system at the University of Utah (UNEL). Since the operational philosophies have a higher priority than automatic control, they determine when automatic control may be utilized. Unlike the operational philosophies, automatic control is not concerned with component failures. The object of this philosophy is the movement of absorber rods to produce a requested power. When the current power level is compared to the requested power level, an error may be detected which will require the movement of a control rod to correct it. The automatic control philosophy adds another dimension to the classical operational philosophies. Using this philosophy, normal operator interactions with the computer would be limited to run parameters such as power, period, and run time. This eliminates subjective judgements, objective judgements made under pressure, and distractions to the operator, and ensures the reactor will be operated in a safe and controlled manner while providing reproducible operations.
Implementation of Complex Signal Processing Algorithms for Position-Sensitive Microcalorimeters
NASA Technical Reports Server (NTRS)
Smith, Stephen J.
2008-01-01
We have recently reported on a theoretical digital signal-processing algorithm for improved energy and position resolution in position-sensitive, transition-edge sensor (PoST) X-ray detectors [Smith et al., Nucl. Instr. and Meth. A 556 (2006) 237]. PoSTs consist of one or more transition-edge sensors (TESs) on a large continuous or pixellated X-ray absorber and are under development as an alternative to arrays of single-pixel TESs. PoSTs provide a means to increase the field of view for the fewest number of read-out channels. In this contribution we extend the theoretical correlated energy position optimal filter (CEPOF) algorithm (originally developed for 2-TES continuous-absorber PoSTs) to investigate the practical implementation on multi-pixel single-TES PoSTs or Hydras. We use numerically simulated data for a nine-absorber device, which includes realistic detector noise, to demonstrate an iterative scheme that enables convergence on the correct photon absorption position and energy without any a priori assumptions. The position sensitivity of the CEPOF implemented on simulated data agrees very well with the theoretically predicted resolution. We discuss practical issues such as the impact of the random arrival phase of the measured data on the performance of the CEPOF. The CEPOF algorithm demonstrates that full-width-at-half-maximum energy resolution of < 8 eV coupled with position sensitivity down to a few hundred eV should be achievable for a fully optimized device.
An IDL/ENVI implementation of the FFT-based algorithm for automatic image registration
NASA Astrophysics Data System (ADS)
Xie, Hongjie; Hicks, Nigel; Randy Keller, G.; Huang, Haitao; Kreinovich, Vladik
2003-10-01
Georeferencing images is a laborious process, so schemes for automating it have been under investigation for some time. Among the most promising automatic registration algorithms are those based on the fast Fourier transform (FFT). The displacement between two given images can be determined by computing the ratio F1·conj(F2)/|F1·F2| and then applying the inverse Fourier transform. The result is an impulse-like function, which is approximately zero everywhere except at the displacement that is necessary to optimally register the images. By converting from rectangular coordinates to log-polar coordinates, shifts representing rotation and scaling can also be determined to complete the georectification process. An FFT-based algorithm has been successfully implemented in Interactive Data Language (IDL) and added as two user functions to an image processing software package, the ENvironment for Visualizing Images (ENVI) interface. ENVI handles all pre- and post-processing work such as input, output, display, filtering, analysis, and file management. To test this implementation, several dozen tests were conducted on both simulated and "real world" images. The results of these tests show the advantages and limitations of this algorithm. In particular, our tests show that the accuracy of the resulting registration is quite good compared to current manual methods.
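The translation-recovery step described above can be sketched in a few lines. The snippet below is an illustrative NumPy version of the cross-power-spectrum computation (the function name and the synthetic test image are invented for the demo; the paper's IDL/ENVI functions additionally handle rotation and scaling via the log-polar step):

```python
import numpy as np

def phase_correlate(im1, im2):
    """Estimate the (row, col) shift of im1 relative to im2 via the
    normalized cross-power spectrum, as in FFT-based registration."""
    F1 = np.fft.fft2(im1)
    F2 = np.fft.fft2(im2)
    cross = F1 * np.conj(F2)
    cross /= np.abs(cross) + 1e-12           # normalize; epsilon avoids /0
    corr = np.real(np.fft.ifft2(cross))      # impulse-like surface
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # wrap shifts larger than half the image size to negative values
    shifts = [p if p <= s // 2 else p - s for p, s in zip(peak, im1.shape)]
    return tuple(shifts)

# demo: shift a random image by (3, -5) and recover the displacement
rng = np.random.default_rng(0)
base = rng.random((64, 64))
shifted = np.roll(base, shift=(3, -5), axis=(0, 1))
dy, dx = phase_correlate(shifted, base)
```

For noise-free, circularly shifted inputs the inverse transform is a delta function at the displacement, which is exactly the impulse-like behaviour the abstract describes.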
A GPU implementation of a track-repeating algorithm for proton radiotherapy dose calculations.
Yepes, Pablo P; Mirkovic, Dragan; Taddei, Phillip J
2010-12-07
An essential component in proton radiotherapy is the algorithm to calculate the radiation dose to be delivered to the patient. The most common dose algorithms are fast but rely on approximate analytical approaches, and their level of accuracy is not always satisfactory, especially for heterogeneous anatomical areas like the thorax. Monte Carlo techniques provide superior accuracy; however, they often require large computational resources, which render them impractical for routine clinical use. Track-repeating algorithms, for example the fast dose calculator, have shown promise for achieving the accuracy of Monte Carlo simulations for proton radiotherapy dose calculations in a fraction of the computation time. We report on the implementation of the fast dose calculator for proton radiotherapy on a card equipped with graphics processing units (GPUs) rather than on a central processing unit architecture. This implementation reproduces the full Monte Carlo and CPU-based track-repeating dose calculations within 2%, while achieving a statistical uncertainty of 2% in less than 1 min using a single GPU card, which should allow real-time accurate dose calculations.
An implementation of differential evolution algorithm for inversion of geoelectrical data
NASA Astrophysics Data System (ADS)
Balkaya, Çağlayan
2013-11-01
Differential evolution (DE), a population-based evolutionary algorithm (EA), has been implemented to invert self-potential (SP) and vertical electrical sounding (VES) data sets. The algorithm uses three operators, mutation, crossover and selection, similar to a genetic algorithm (GA). Mutation is the most important operator for the success of DE. Three commonly used mutation strategies, DE/best/1 (strategy 1), DE/rand/1 (strategy 2) and DE/rand-to-best/1 (strategy 3), were applied together with a binomial-type crossover. The evolution cycle of DE was realized without boundary constraints. For the test studies performed with SP data, in addition to both noise-free and noisy synthetic data sets, two field data sets observed over the sulfide ore body in the Malachite mine (Colorado) and over the ore bodies in the Neem-Ka Thana copper belt (India) were considered. VES test studies were carried out using synthetically produced resistivity data representing a three-layered earth model and a field data set example from Gökçeada (Turkey), which displays a seawater infiltration problem. The mutation strategies mentioned above were also extensively tested on both the synthetic and field data sets under consideration. Of these, strategy 1 was found to be the most effective strategy for parameter estimation, providing lower computational cost together with good accuracy. The solutions obtained by DE for the synthetic SP cases were quite consistent with particle swarm optimization (PSO), which is a more widely used population-based optimization algorithm than DE in geophysics. Estimated parameters of the SP and VES data were also compared with those obtained from the Metropolis-Hastings (M-H) sampling algorithm based on simulated annealing (SA) without cooling to clarify uncertainties in the solutions. Comparison to the M-H algorithm shows that DE performs a fast approximate posterior sampling for low-dimensional inverse geophysical problems.
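The DE/best/1 mutation with binomial crossover that the study found most effective can be sketched compactly. The example below minimizes an invented 2-D quadratic rather than the SP/VES misfit, and, like the paper's setup, applies no boundary constraints during evolution (bounds are used only to seed the initial population):

```python
import numpy as np

def de_minimize(f, bounds, pop_size=20, F=0.8, CR=0.9, iters=200, seed=1):
    """Minimal DE/best/1/bin sketch: mutation v = best + F*(r1 - r2),
    binomial crossover, greedy selection. Illustrative demo, not the
    authors' geophysical inversion code."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    dim = len(lo)
    pop = lo + rng.random((pop_size, dim)) * (hi - lo)
    cost = np.array([f(x) for x in pop])
    for _ in range(iters):
        best = pop[np.argmin(cost)]
        for i in range(pop_size):
            r1, r2 = rng.choice([j for j in range(pop_size) if j != i],
                                size=2, replace=False)
            v = best + F * (pop[r1] - pop[r2])        # DE/best/1 mutation
            jrand = rng.integers(dim)                 # force one gene from v
            mask = rng.random(dim) < CR
            mask[jrand] = True
            trial = np.where(mask, v, pop[i])         # binomial crossover
            c = f(trial)
            if c <= cost[i]:                          # greedy selection
                pop[i], cost[i] = trial, c
    return pop[np.argmin(cost)], cost.min()

# demo on a 2-D quadratic with minimum at (1, -2)
x_best, f_best = de_minimize(lambda x: (x[0] - 1)**2 + (x[1] + 2)**2,
                             bounds=[(-5, 5), (-5, 5)])
```

Because the mutant is built around the current best member, this strategy converges quickly on smooth, low-dimensional objectives, which matches the paper's observation about its low computational cost.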
The mGA1.0: A common LISP implementation of a messy genetic algorithm
NASA Technical Reports Server (NTRS)
Goldberg, David E.; Kerzic, Travis
1990-01-01
Genetic algorithms (GAs) are finding increased application in difficult search, optimization, and machine learning problems in science and engineering. Increasing demands are being placed on algorithm performance, and the remaining challenges of genetic algorithm theory and practice are becoming increasingly unavoidable. Perhaps the most difficult of these challenges is the so-called linkage problem. Messy GAs were created to overcome the linkage problem of simple genetic algorithms by combining variable-length strings, gene expression, messy operators, and a nonhomogeneous phasing of evolutionary processing. Results on a number of difficult deceptive test functions are encouraging, with the mGA always finding global optima in a polynomial number of function evaluations. Theoretical and empirical studies are continuing, and a first version of a messy GA is ready for testing by others. A Common LISP implementation called mGA1.0 is documented and related to the basic principles and operators developed by Goldberg et al. (1989, 1990). Although the code was prepared with care, it is not a general-purpose code, only a research version. Important data structures and global variables are described. Brief function descriptions are then given, and sample input data are presented together with sample program output. A source listing with comments is also included.
NASA Astrophysics Data System (ADS)
Montgomery, P. C.; Salzenstein, F.; Montaner, D.; Serio, B.; Pfeiffer, P.
2013-04-01
Coherence scanning interferometry (CSI) is an optical profilometry technique that uses the scanning of white-light interference fringes over the depth of the surface of a sample to measure the surface roughness. Many different types of algorithms have been proposed to determine the fringe envelope, such as peak fringe intensity detection, demodulation, centroid detection, FFT, wavelets and signal correlation. In this paper we present a very compact and efficient algorithm based on the measurement of the signal modulation using a second-order nonlinear filter derived from Teager-Kaiser methods and known as the five-sample adaptive (FSA) algorithm. We describe its implementation in a measuring system for static surface roughness measurement. Two envelope peak detection techniques are demonstrated. The first, using second-order spline fitting, results in an axial sensitivity of 25 nm and is better adapted to rough samples. The second, using local phase correction, gives nanometric axial sensitivity and is more appropriate for smooth samples. The choice of technique is important to minimize artifacts. Surface measurement results are given for a silicon wafer and a metallic contact on poly-Si, and the results are compared with those from a commercial interferometer and an AFM, demonstrating the robustness of the FSA algorithm.
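The core of such envelope measurement is a nonlinear operator on a handful of neighbouring fringe samples. The sketch below implements a five-sample Teager-Kaiser-style envelope estimator in the spirit of Larkin's algorithm, assuming roughly pi/2 phase steps between samples; the exact adaptive filter used in the paper's FSA implementation may differ:

```python
import numpy as np

def fsa_envelope(I):
    """Five-sample Teager-Kaiser-style fringe envelope estimate:
    M^2(n) = (I[n-1]-I[n+1])^2 - (I[n-2]-I[n])*(I[n]-I[n+2]),
    which equals 4*A(n)^2 for pi/2 phase steps and slow envelopes."""
    I = np.asarray(I, dtype=float)
    n = np.arange(2, len(I) - 2)
    m2 = (I[n-1] - I[n+1])**2 - (I[n-2] - I[n]) * (I[n] - I[n+2])
    return np.sqrt(np.clip(m2, 0.0, None)) / 2.0

# demo: white-light fringe signal with a Gaussian envelope, pi/2 sampling
z = np.arange(400)
envelope = np.exp(-((z - 200) / 60.0) ** 2)
signal = 10.0 + envelope * np.cos(np.pi / 2 * z + 0.3)
est = fsa_envelope(signal)
peak_pos = 2 + np.argmax(est)   # offset by the 2 dropped edge samples
```

The operator cancels both the DC background and the carrier phase, so the envelope peak (the surface height estimate in CSI) can be located directly from the demodulated magnitude.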
The Research and Implementation of MUSER CLEAN Algorithm Based on OpenCL
NASA Astrophysics Data System (ADS)
Feng, Y.; Chen, K.; Deng, H.; Wang, F.; Mei, Y.; Wei, S. L.; Dai, W.; Yang, Q. P.; Liu, Y. B.; Wu, J. P.
2017-03-01
High-performance data processing on a single machine is an urgent need in the development of astronomical software. However, due to differing machine configurations, traditional programming techniques such as multi-threading and CUDA (Compute Unified Device Architecture)+GPU (Graphic Processing Unit) have obvious limitations in portability and seamlessness between different operating systems. The OpenCL (Open Computing Language) used in the development of the MUSER (MingantU SpEctral Radioheliograph) data processing system is introduced, and the Högbom CLEAN algorithm is re-implemented as a parallel CLEAN algorithm using the Python language and the PyOpenCL extension package. The experimental results show that the CLEAN algorithm based on OpenCL has approximately equal operating efficiency compared with the earlier CLEAN algorithm based on CUDA. More importantly, the data processing can also achieve high performance in a CPU-only (Central Processing Unit) environment, which solves the problem of the environmental dependence of CUDA+GPU. Overall, the research improves the adaptability of the system, with emphasis on the performance of MUSER image cleaning. At the same time, the realization of OpenCL in MUSER demonstrates its suitability for scientific data processing. In view of the high-performance computing features of OpenCL in heterogeneous environments, it will probably become a preferred technology in future high-performance astronomical software development.
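For reference, the serial Högbom CLEAN loop that such systems parallelize looks roughly like this. It is an illustrative NumPy sketch with an invented two-point-source demo, not the MUSER OpenCL code:

```python
import numpy as np

def hogbom_clean(dirty, psf, gain=0.1, niter=500, threshold=1e-3):
    """Serial Hogbom CLEAN: repeatedly find the brightest residual pixel,
    subtract a scaled, shifted copy of the PSF, and record the component."""
    res = dirty.copy()
    model = np.zeros_like(dirty)
    half = psf.shape[0] // 2           # assume square PSF, peak at centre
    for _ in range(niter):
        y, x = np.unravel_index(np.argmax(np.abs(res)), res.shape)
        peak = res[y, x]
        if abs(peak) < threshold:
            break
        model[y, x] += gain * peak
        # subtract the PSF centred on (y, x), clipped at image borders
        y0, y1 = max(0, y - half), min(res.shape[0], y + half + 1)
        x0, x1 = max(0, x - half), min(res.shape[1], x + half + 1)
        res[y0:y1, x0:x1] -= gain * peak * psf[
            half - (y - y0): half + (y1 - y),
            half - (x - x0): half + (x1 - x)]
    return model, res

# demo: two point sources observed through a Gaussian beam
g = np.exp(-0.5 * (np.arange(-7, 8) / 1.5) ** 2)
psf = np.outer(g, g); psf /= psf.max()
dirty = np.zeros((64, 64))
for (sy, sx, flux) in [(20, 30, 1.0), (40, 15, 0.5)]:
    dirty[sy - 7:sy + 8, sx - 7:sx + 8] += flux * psf
model, residual = hogbom_clean(dirty, psf, gain=0.2, niter=2000)
```

The peak search and PSF subtraction over the whole residual image are the data-parallel steps that map naturally onto OpenCL work-items.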
Next Generation Aura-OMI SO2 Retrieval Algorithm: Introduction and Implementation Status
NASA Technical Reports Server (NTRS)
Li, Can; Joiner, Joanna; Krotkov, Nickolay A.; Bhartia, Pawan K.
2014-01-01
We introduce our next-generation algorithm to retrieve SO2 using radiance measurements from the Aura Ozone Monitoring Instrument (OMI). We employ a principal component analysis technique to analyze OMI radiance spectra in the 310.5-340 nm range acquired over regions with no significant SO2. The resulting principal components (PCs) capture radiance variability caused by both physical processes (e.g., Rayleigh and Raman scattering, and ozone absorption) and measurement artifacts, enabling us to account for these various interferences in SO2 retrievals. By fitting these PCs, along with SO2 Jacobians calculated with a radiative transfer model, to OMI-measured radiance spectra, we directly estimate the SO2 vertical column density in one step. Compared with the previous-generation operational OMSO2 PBL (Planetary Boundary Layer) SO2 product, our new algorithm greatly reduces unphysical biases and decreases the noise by a factor of two, providing greater sensitivity to anthropogenic emissions. The new algorithm is fast, eliminates the need for instrument-specific radiance correction schemes, and can be easily adapted to other sensors. These attributes make it a promising technique for producing long-term, consistent SO2 records for air quality and climate research. We have operationally implemented this new algorithm on the OMI SIPS for producing the new-generation standard OMI SO2 products.
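The one-step fit described above amounts to a linear least-squares problem: the measured radiances are modelled as a combination of background principal components plus an SO2 Jacobian, and the retrieved column is the Jacobian's coefficient. In the toy sketch below, the spectra, dimensions, and column amount are all synthetic stand-ins, not real OMI data or radiative-transfer output:

```python
import numpy as np

rng = np.random.default_rng(42)
nwave, npc = 120, 4
pcs = rng.standard_normal((nwave, npc))        # stand-in background PCs
# invented SO2 spectral signature (Gaussian-shaped absorption feature)
jacobian = np.exp(-0.5 * ((np.arange(nwave) - 30) / 8.0) ** 2)

true_column = 2.5
measured = pcs @ np.array([1.0, -0.3, 0.7, 0.1]) + true_column * jacobian
measured += 1e-4 * rng.standard_normal(nwave)  # instrument noise

design = np.column_stack([pcs, jacobian])      # fit PCs and Jacobian together
coef, *_ = np.linalg.lstsq(design, measured, rcond=None)
so2_column = coef[-1]                          # last coefficient = SO2 amount
```

Fitting interference terms and the target Jacobian simultaneously is what lets the retrieval absorb instrument artifacts without a separate radiance correction step.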
Supercomputer implementation of finite element algorithms for high speed compressible flows
NASA Technical Reports Server (NTRS)
Thornton, E. A.; Ramakrishnan, R.
1986-01-01
Prediction of compressible flow phenomena using the finite element method is of recent origin and considerable interest. Two shock-capturing finite element formulations for high-speed compressible flows are described. A Taylor-Galerkin formulation uses a Taylor series expansion in time coupled with a Galerkin weighted residual statement. The Taylor-Galerkin algorithms use explicit artificial dissipation, and the performance of three dissipation models is compared. A Petrov-Galerkin algorithm has as its basis the concept of streamline upwinding. Vectorization strategies are developed to implement the finite element formulations on the NASA Langley VPS-32. The vectorization scheme results in finite element programs that use vectors with lengths of the order of the number of nodes or elements. The vectorization procedure speeds up processing rates by over two orders of magnitude. The Taylor-Galerkin and Petrov-Galerkin algorithms are evaluated for 2D inviscid flows on criteria such as solution accuracy, shock resolution, computational speed and storage requirements. The convergence rates of both algorithms are enhanced by local time-stepping schemes. Extension of the vectorization procedure to predicting 2D viscous and 3D inviscid flows is demonstrated. Conclusions are drawn regarding the applicability of the finite element procedures to realistic problems that require hundreds of thousands of nodes.
Fast Detection Anti-Collision Algorithm for RFID System Implemented On-Chip
NASA Astrophysics Data System (ADS)
Sampe, Jahariah; Othman, Masuri
This study presents a Fast Detection Anti-Collision Algorithm (FDACA) for Radio Frequency Identification (RFID) systems. The proposed FDACA is implemented on-chip using Application Specific Integrated Circuit (ASIC) technology, and the algorithm is based on the deterministic anti-collision technique. The FDACA is novel in providing faster identification by reducing the number of iterations during the identification process. The primary FDACA also reads the identification (ID) bits at once regardless of their length, and does not require the tags to remember instructions from the reader during the communication process; the tags are treated as address-carrying devices only. As a result, simple, small, low-cost and memoryless tags can be produced. The proposed system is designed in Verilog HDL, simulated using Modelsim XE II, and synthesized using Xilinx Synthesis Technology (XST). The system is implemented in hardware on a Field Programmable Gate Array (FPGA) board for real-time verification. The verification results show that the FDACA system identifies the tags without error up to an operating frequency of 180 MHz. Finally, the FDACA system is implemented on-chip using a 0.18 μm library and Synopsys compiler tools. The resynthesis results show that the identification rate of the proposed FDACA system is 333 Mega tags per second with a power requirement of 3.451 mW.
Implementation of a cone-beam backprojection algorithm on the cell broadband engine processor
NASA Astrophysics Data System (ADS)
Bockenbach, Olivier; Knaup, Michael; Kachelrieß, Marc
2007-03-01
Tomographic image reconstruction is computationally very demanding. In all cases the backprojection represents the performance bottleneck due to its high operation count and the high demand placed on the memory subsystem. In the past, solving this problem has led to the implementation of specific architectures, connecting Application Specific Integrated Circuits (ASICs) or Field Programmable Gate Arrays (FPGAs) to memory through dedicated high-speed busses. More recently, there have also been attempts to use Graphics Processing Units (GPUs) to perform the backprojection step. IBM, Toshiba and Sony have introduced the Cell Broadband Engine (CBE) processor, originally aimed at the gaming market and often considered a multicomputer on a chip. Clocked at 3 GHz, the Cell allows for a theoretical performance of 192 GFlops and a peak data transfer rate over the internal bus of 200 GB/s. This performance indeed makes the Cell a very attractive architecture for implementing tomographic image reconstruction algorithms. In this study, we investigate the relative performance of a perspective backprojection algorithm when implemented on a standard PC and on the Cell processor, and compare these results to the performance achievable with FPGA-based boards and high-end GPUs. The cone-beam backprojection performance was assessed by backprojecting a full circle scan of 512 projections of 1024x1024 pixels into a volume of size 512x512x512 voxels. This took 3.2 minutes on the PC (single CPU) and 13.6 seconds on the Cell.
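To see where the cycles go, here is a deliberately naive pixel-driven backprojection in 2-D parallel-beam geometry, a much-simplified stand-in for the paper's perspective cone-beam kernel (the sizes and the point-source sinogram are invented):

```python
import numpy as np

def backproject(sino, angles, size):
    """Naive pixel-driven backprojection of a parallel-beam sinogram.
    Every pixel fetches one detector sample per view, which is exactly
    the memory-bound inner loop that dominates reconstruction cost."""
    recon = np.zeros((size, size))
    centre = size // 2
    ys, xs = np.mgrid[0:size, 0:size] - centre
    ndet = sino.shape[1]
    for sino_row, theta in zip(sino, angles):
        # detector coordinate of every pixel for this view
        t = xs * np.cos(theta) + ys * np.sin(theta) + ndet // 2
        idx = np.clip(np.round(t).astype(int), 0, ndet - 1)
        recon += sino_row[idx]                 # accumulate one view
    return recon / len(angles)

# demo: sinogram of a single point at the volume centre, then backproject
size, nviews = 65, 180
angles = np.linspace(0, np.pi, nviews, endpoint=False)
sino = np.zeros((nviews, size))
sino[:, size // 2] = 1.0     # every view sees the point at detector centre
recon = backproject(sino, angles, size)
peak = np.unravel_index(np.argmax(recon), recon.shape)
```

The per-view accumulation over all pixels is independent work, which is why the step maps well onto the Cell's SPEs, FPGAs, or GPUs.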
Zhou, Ning; Huang, Zhenyu; Tuffner, Francis K.; Jin, Shuangshuang; Lin, Jenglung; Hauer, Matthew L.
2010-02-28
Small-signal stability problems are one of the major threats to grid stability and reliability. Prony analysis has been successfully applied to ringdown data to monitor electromechanical modes of a power system using phasor measurement unit (PMU) data. To facilitate the on-line application of mode estimation, this paper develops a recursive algorithm for implementing Prony analysis and proposes an oscillation detection method to detect ringdown data in real time. By automatically detecting ringdown data, the proposed method helps guarantee that Prony analysis is applied properly and promptly to the ringdown data, so that mode estimation can be performed reliably and in a timely manner. The proposed method is tested using Monte Carlo simulations based on a 17-machine model and is shown to properly identify the oscillation data for on-line application of Prony analysis. In addition, the proposed method is applied to field measurement data from WECC to show the performance of the proposed algorithm.
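The batch form of Prony analysis that underlies such a recursive algorithm fits a linear-prediction model to the ringdown samples and converts the roots of the characteristic polynomial into modal frequency and damping. The sketch below shows that batch form on a synthetic single-mode ringdown (the on-line recursive update from the paper is not reproduced here):

```python
import numpy as np

def prony_modes(y, order, dt):
    """Classic (batch) Prony analysis: fit y[n] = a1*y[n-1]+...+ap*y[n-p]
    by least squares, then map the polynomial roots to continuous-time
    poles s = log(z)/dt giving damping (Re) and frequency (Im)."""
    N = len(y)
    A = np.column_stack([y[order - k - 1:N - k - 1] for k in range(order)])
    a, *_ = np.linalg.lstsq(A, y[order:], rcond=None)
    roots = np.roots(np.concatenate(([1.0], -a)))
    s = np.log(roots.astype(complex)) / dt
    freq = np.abs(s.imag) / (2 * np.pi)        # Hz
    damping = s.real                           # 1/s (negative = decaying)
    return freq, damping

# demo: a 0.5 Hz electromechanical mode decaying at -0.1 1/s (invented)
dt = 0.05
t = np.arange(0, 20, dt)
y = np.exp(-0.1 * t) * np.cos(2 * np.pi * 0.5 * t)
freq, damping = prony_modes(y, order=2, dt=dt)
```

On noise-free data the fit is exact; on PMU data the least-squares step is what the recursive formulation updates sample by sample.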
The 2016-2100 total solar eclipse prediction by using Meeus Algorithm implemented on MATLAB
NASA Astrophysics Data System (ADS)
Melati, A.; Hodijah, S.
2016-11-01
The phenomena of solar and lunar eclipses can be predicted as to where and when they will happen. The Total Solar Eclipse (TSE) of March 9th, 2016 revived astronomy in Indonesia and provided public astronomy education. This research aims to predict total solar eclipse phenomena from 2016 until 2100. We used Besselian-element calculations and the Meeus algorithm implemented in MATLAB R2012b, combined with the VSOP87 and ELP2000-82 theories. As an example of the simulation, the TSE prediction for April 20th, 2042 differs in duration by 0.2 seconds from the NASA prediction. For the whole set of TSEs from 2016 until 2100 we found differences of 0.04-0.21 seconds compared with the NASA predictions.
NASA Astrophysics Data System (ADS)
Bouganssa, Issam; Sbihi, Mohamed; Zaim, Mounia
2017-07-01
The 2D Discrete Wavelet Transform (DWT) is a computationally intensive task that is usually implemented on specific architectures in many real-time imaging systems. In this paper, a high-throughput edge and contour detection algorithm is proposed based on the discrete wavelet transform. A technique for applying the filters in the three directions (horizontal, vertical and diagonal) of the image is used to bring out the maximum number of existing contours. The proposed architectures were designed in VHDL and mapped to a Xilinx Spartan-6 FPGA. The results of the synthesis show that the proposed architecture has a low area cost and can operate up to 100 MHz, which allows 2D wavelet analysis to be performed on a sequence of images while maintaining the flexibility of the system to support an adaptive algorithm.
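A software model of the directional decomposition is easy to write down. The sketch below uses a one-level Haar DWT (chosen for brevity; the paper does not specify the wavelet) and combines the horizontal, vertical and diagonal detail sub-bands into an edge map, on an invented test image:

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar DWT: approximation plus the three detail
    sub-bands (column, row and diagonal directions), 1/2 scaling per stage."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0      # row-pair low-pass
    d = (img[0::2, :] - img[1::2, :]) / 2.0      # row-pair high-pass
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0         # detail: column direction
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0         # detail: row direction
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0         # detail: diagonal
    return LL, LH, HL, HH

def dwt_edges(img, thresh):
    """Edge map from the combined magnitude of the three detail directions."""
    _, dh, dv, dd = haar_dwt2(img)
    mag = np.sqrt(dh**2 + dv**2 + dd**2)
    return mag > thresh

# demo: a bright square on a dark background; edges ring the square
img = np.zeros((32, 32))
img[7:23, 7:23] = 1.0
edges = dwt_edges(img, thresh=0.2)
```

Combining all three directions is what recovers contours of any orientation; a hardware pipeline evaluates the same separable filters one image row at a time.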
NASA Astrophysics Data System (ADS)
Pucello, N.; Rosati, M.; D'Agostino, G.; Pisacane, F.; Rosato, V.; Celino, M.
A genetic algorithm for the optimization of the ground-state structure of a metallic cluster has been developed and ported to a SIMD-MIMD parallel platform. The SIMD part of the platform is a Quadrics/APE100 consisting of 512 floating-point units, while the MIMD part is formed by a cluster of workstations. The proposed algorithm is composed of a part where the genetic operators are applied to the elements of the population and a part which performs a further local relaxation and the fitness calculation via Molecular Dynamics. These parts have been implemented on the MIMD and on the SIMD part, respectively. Results have been compared to those generated using Simulated Annealing.
Facchinetti, Andrea; Sparacino, Giovanni; Cobelli, Claudio
2013-09-01
Glucose readings provided by current continuous glucose monitoring (CGM) devices still suffer from accuracy and precision issues. In April 2013, we proposed a new conceptual architecture to deal with these problems and render CGM sensors algorithmically smarter, which consists of three modules for denoising, enhancement, and prediction placed in cascade to a commercial CGM sensor. The architecture was assessed on a data set consisting of 24 type 1 diabetes patients collected in four clinical centers of the AP@home Consortium (a European project of 7th Framework Programme funded by the European Committee). This article, as a companion to our prior publication, illustrates the technical details of the algorithms and of the implementation issues.
Morita, Kenji; Jitsev, Jenia; Morrison, Abigail
2016-09-15
Value-based action selection has been suggested to be realized in the corticostriatal local circuits through competition among neural populations. In this article, we review theoretical and experimental studies that have constructed and verified this notion, and provide new perspectives on how the local-circuit selection mechanisms implement reinforcement learning (RL) algorithms and computations beyond them. The striatal neurons are mostly inhibitory, and lateral inhibition among them has been classically proposed to realize "Winner-Take-All (WTA)" selection of the maximum-valued action (i.e., 'max' operation). Although this view has been challenged by the revealed weakness, sparseness, and asymmetry of lateral inhibition, which suggest more complex dynamics, WTA-like competition could still occur on short time scales. Unlike the striatal circuit, the cortical circuit contains recurrent excitation, which may enable retention or temporal integration of information and probabilistic "soft-max" selection. The striatal "max" circuit and the cortical "soft-max" circuit might co-implement an RL algorithm called Q-learning; the cortical circuit might also similarly serve for other algorithms such as SARSA. In these implementations, the cortical circuit presumably sustains activity representing the executed action, which negatively impacts dopamine neurons so that they can calculate reward-prediction-error. Regarding the suggested more complex dynamics of striatal, as well as cortical, circuits on long time scales, which could be viewed as a sequence of short WTA fragments, computational roles remain open: such a sequence might represent (1) sequential state-action-state transitions, constituting replay or simulation of the internal model, (2) a single state/action by the whole trajectory, or (3) probabilistic sampling of state/action.
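The "soft-max" selection and reward-prediction-error learning discussed above can be illustrated with a toy stateless example. The two-armed bandit below (reward probabilities, gains and trial count are all invented) pairs probabilistic soft-max action selection with a prediction-error value update; the full Q-learning target, which also takes a "max" over successor-state values, requires a multi-state task:

```python
import numpy as np

def softmax_select(q, beta, rng):
    """Probabilistic 'soft-max' selection over action values."""
    p = np.exp(beta * (q - q.max()))     # subtract max for stability
    p /= p.sum()
    return rng.choice(len(q), p=p)

# two-armed bandit: soft-max selection + prediction-error value update
rng = np.random.default_rng(0)
p_reward = np.array([0.8, 0.2])          # invented reward probabilities
q = np.zeros(2)                          # action values
alpha, beta = 0.1, 3.0                   # learning rate, inverse temperature
for _ in range(2000):
    a = softmax_select(q, beta, rng)
    r = float(rng.random() < p_reward[a])
    q[a] += alpha * (r - q[a])           # reward-prediction-error update
```

The inverse temperature beta plays the role attributed to the sharpness of circuit competition: large beta approaches winner-take-all "max" selection, small beta approaches uniform exploration.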
Quality Screening Algorithms Implemented in the New CALIPSO Level 3 Aerosol Profile Product
NASA Astrophysics Data System (ADS)
Tackett, J. L.; Winker, D. M.; Getzewich, B. J.; Vaughan, M.
2012-12-01
Global observations of aerosol extinction profiles can improve the ability of climate models to properly account for aerosol radiative forcing in Earth's atmosphere. In response to this need, a new CALIPSO (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations) level 3 aerosol profile product has been released which for the first time provides monthly, globally gridded and quality-screened aerosol extinction profiles within the troposphere for the entire 6-year mission. Level 3 aerosol extinction profiles are aggregated from CALIOP (Cloud-Aerosol Lidar with Orthogonal Polarization) lidar extinction retrievals reported in the CALIPSO level 2 aerosol profile product onto an equal-angle grid after quality screening algorithms are applied to reduce occurrences of failed retrievals, misclassified aerosol, surface contamination, and spurious outliers. The implementation of these quality screening algorithms is of substantial value to aerosol modeling groups who desire high-confidence datasets without having to independently develop quality screening metrics. Furthermore, quality screening is paramount to understanding the scientific content of the resultant CALIPSO level 3 aerosol profile product, since classification and retrieval errors in level 2 aerosol data may lead to misinterpretation of the distribution and optical properties of aerosol in the troposphere. This presentation summarizes the averaging and quality screening algorithms implemented in the CALIPSO level 3 aerosol profile product, provides the rationale for their implementation, and discusses averaging and filtering differences unique to CALIPSO data compared to level 3 products aggregated from passive satellite measurements. Examples are given that illustrate the benefits of quality screening and the dangers of improperly screening CALIPSO level 2 aerosol extinction data. Sensitivity study results are presented to highlight the impact of quality screening on final level 3 statistics. Since overlying cloud
Implementation of a Real-Time Stacking Algorithm in a Photogrammetric Digital Camera for Uavs
NASA Astrophysics Data System (ADS)
Audi, A.; Pierrot-Deseilligny, M.; Meynard, C.; Thom, C.
2017-08-01
In recent years, unmanned aerial vehicles (UAVs) have become an interesting tool in aerial photography and photogrammetry activities. In this context, some applications (like cloudy-sky surveys, narrow-spectral imagery and night-vision imagery) need a long exposure time, where one of the main problems is the motion blur caused by erratic camera movements during image acquisition. This paper describes an automatic real-time stacking algorithm which produces a final composite image of high photogrammetric quality with an equivalent long exposure time, using several images acquired with short exposure times. Our method is inspired by feature-based image registration techniques. The algorithm is implemented on the lightweight IGN camera, which has an IMU sensor and a SoC/FPGA. To obtain the correct parameters for the resampling of images, the presented method accurately estimates the geometrical relation between the first and the Nth image, taking into account the internal parameters and the distortion of the camera. Features are detected in the first image by the FAST detector, then homologous points on other images are obtained by template matching aided by the IMU sensors. The SoC/FPGA in the camera is used to speed up time-consuming parts of the algorithm, such as feature detection and image resampling, in order to achieve real-time performance, as we want to write only the resulting final image to save bandwidth on the storage device. The paper includes a detailed description of the implemented algorithm, a resource usage summary, the resulting processing time, resulting images, as well as block diagrams of the described architecture. The resulting stacked image obtained on real surveys shows no visible degradation. Timing results demonstrate that our algorithm can be used in real time since its processing time is less than the writing time of an image to the storage device. An interesting by-product of this algorithm is the 3D rotation estimated by a
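The register-then-accumulate loop can be prototyped compactly. The sketch below fakes camera motion with integer wrap-around shifts and re-registers each frame by template-matching a patch around one "feature" (a stand-in for the camera's FAST-plus-IMU matching; real frames need subpixel resampling with the calibrated distortion model):

```python
import numpy as np

def match_template(img, tpl):
    """Brute-force SSD template matching; returns the top-left corner of
    the best match (stand-in for the IMU-aided matching around features)."""
    th, tw = tpl.shape
    best, pos = np.inf, (0, 0)
    for y in range(img.shape[0] - th + 1):
        for x in range(img.shape[1] - tw + 1):
            ssd = np.sum((img[y:y+th, x:x+tw] - tpl) ** 2)
            if ssd < best:
                best, pos = ssd, (y, x)
    return pos

# demo: stack 4 translated copies of a frame; the aligned mean keeps detail
rng = np.random.default_rng(3)
base = rng.random((40, 40))
tpl_y, tpl_x, tsz = 12, 18, 9
tpl = base[tpl_y:tpl_y+tsz, tpl_x:tpl_x+tsz]   # patch around a "feature"
stack = base.copy()
for shift in [(2, -1), (-3, 2), (1, 3)]:
    frame = np.roll(base, shift, axis=(0, 1))          # simulated motion
    y, x = match_template(frame, tpl)
    dy, dx = y - tpl_y, x - tpl_x                      # estimated shift
    stack += np.roll(frame, (-dy, -dx), axis=(0, 1))   # re-register
stacked = stack / 4.0
```

Because each frame is short-exposure, averaging N registered frames raises the signal-to-noise ratio as if a single exposure N times longer had been taken, without the motion blur.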
AVR microcontroller simulator for software implemented hardware fault tolerance algorithms research
NASA Astrophysics Data System (ADS)
Piotrowski, Adam; Tarnowski, Szymon; Napieralski, Andrzej
2008-01-01
Reliability of new, advanced electronic systems becomes a serious problem, especially in places like accelerators and synchrotrons, where sophisticated digital devices operate close to radiation sources. One of the possible solutions to harden a microprocessor-based system is a strict programming approach known as Software Implemented Hardware Fault Tolerance. Unfortunately, in real environments it is not possible to perform precise and accurate tests of new algorithms due to hardware limitations. This paper highlights the AVR-family microcontroller simulator project, equipped with appropriate monitoring and SEU injection systems.
Implementation and testing of a frozen density matrix-divide and conquer algorithm
Ermolaeva, M.D.; Vaart, A. van der; Merz, K.M. Jr.
1999-03-25
The authors have implemented and tested a frozen density matrix (FDM) approximation to the basic divide and conquer (DC) semiempirical algorithm. Molecular dynamics and Monte Carlo simulations were performed to estimate the advantages of the method. Results were compared to those obtained from the original DC method and the combined quantum mechanical/molecular mechanical (QM/MM) method. The authors found that the FDM approximation speeds up DC calculations significantly, while introducing only small errors. They also found that the FDM DC scheme performs better than the standard QM/MM approach in terms of defining the electronic and structural properties of the systems studied herein.
Implementation of a new iterative learning control algorithm on real data
NASA Astrophysics Data System (ADS)
Zamanian, Hamed; Koohi, Ardavan
2016-02-01
In this paper, a new approach is proposed for closed-loop automatic tuning of a proportional integral derivative (PID) controller based on an iterative learning control (ILC) algorithm. A modified ILC scheme iteratively adjusts the control signal. Once a satisfactory performance is achieved, a linear compensator is identified from the ILC behavior using the causal relationship between the closed-loop signals. This compensator is approximated by a PD controller, which is used to tune the original PID controller. Results of implementing this approach on experimental data from the Damavand tokamak are presented and are consistent with simulation outcomes.
The Sustainable Technology Division has recently completed an implementation of the U.S. EPA's Waste Reduction (WAR) Algorithm that can be directly accessed from a Cape-Open compliant process modeling environment. The WAR Algorithm add-in can be used in AmsterChem's COFE (Cape-Op...
Implementation of an Interior-Point Algorithm for Real-Time Convex Optimization
NASA Technical Reports Server (NTRS)
Acikmese, Behcet; Motaghedi, Shui; Carson, John
2007-01-01
The primal-dual interior-point algorithm implemented in G-OPT is a relatively new and efficient way of solving convex optimization problems. Given a prescribed level of accuracy, the convergence to the optimal solution is guaranteed in a predetermined, finite number of iterations. G-OPT Version 1.0 is a flight software implementation written in C. Onboard application of the software enables autonomous, real-time guidance and control that explicitly incorporates mission constraints such as control authority (e.g. maximum thrust limits), hazard avoidance, and fuel limitations. This software can be used in planetary landing missions (Mars pinpoint landing and lunar landing), as well as in proximity operations around small celestial bodies (moons, asteroids, and comets). It also can be used in any spacecraft mission for thrust allocation in six-degrees-of-freedom control.
NASA Technical Reports Server (NTRS)
Doxley, Charles A.
2016-01-01
In the current world of applications that use reconfigurable technology implemented on field programmable gate arrays (FPGAs), there is a need for flexible architectures that can grow as the systems evolve. A project has limited resources and a fixed set of requirements that development efforts are tasked to meet. Designers must develop robust solutions that practically meet the current customer demands and also have the ability to grow for future performance. This paper describes the development of a high speed serial data streaming algorithm that allows for transmission of multiple data channels over a single serial link. The technique has the ability to change to meet new applications developed for future design considerations. This approach uses the Xilinx Serial RapidIO LOGICORE Solution to implement a flexible infrastructure to meet the current project requirements with the ability to adapt future system designs.
NASA Technical Reports Server (NTRS)
Yan, T. Y.; Yao, K.
1981-01-01
The theory and implementation of a multiplication-free linear mean-square error criterion equalizer for data transmission are considered. For many real-time signal processing situations, a large number of multiplications is objectionable. The linear estimation problem on a binary computer is considered, where the estimation parameters are constrained to be powers of two, and thus all multiplications are replaced by shifts. The optimal solution is obtained from an integer-programming-like problem, except that the allowable discrete points are non-integers. The branch-and-bound algorithm is used to obtain the coefficients of the equalization tapped delay line (TDL). Specific experimental performance results are given for an equalizer implemented with a 12-bit A/D device and an 8080 microprocessor.
Implementation of QR-decomposition based on CORDIC for unitary MUSIC algorithm
NASA Astrophysics Data System (ADS)
Lounici, Merwan; Luan, Xiaoming; Saadi, Wahab
2013-07-01
The DOA (Direction Of Arrival) estimation with subspace methods such as MUSIC (MUltiple SIgnal Classification) and ESPRIT (Estimation of Signal Parameters via Rotational Invariance Techniques) is based on an accurate estimation of the eigenvalues and eigenvectors of the covariance matrix. QR decomposition is implemented with the COordinate Rotation DIgital Computer (CORDIC) algorithm. QRD requires only additions and shifts [6], so it is faster and more regular than other methods. In this article, the hardware architecture of an EVD (Eigen Value Decomposition) processor based on a TSA (triangular systolic array) for QR decomposition is proposed. Using Xilinx System Generator (XSG), the design is implemented and the estimated logic device resource values are presented for different matrix sizes.
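As a sketch of why CORDIC maps so well onto hardware: each micro-rotation only adds and scales by 2^-i (a shift in fixed-point logic). The rotation-mode routine below is a textbook software version, not the article's systolic-array design:

```python
import math

def cordic_rotate(theta, n_iter=32):
    """CORDIC rotation mode: compute (cos theta, sin theta) using only
    additions and scalings by 2**-i (hardware shifts); valid for
    |theta| < ~1.74 rad (the sum of the micro-rotation angles)."""
    angles = [math.atan(2.0 ** -i) for i in range(n_iter)]
    gain = 1.0
    for i in range(n_iter):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, theta  # pre-scale so the CORDIC gain cancels
    for i in range(n_iter):
        d = 1.0 if z >= 0.0 else -1.0
        # one micro-rotation: only add/subtract and shift-by-i
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x, y
```

The same iteration in vectoring mode zeroes out a vector component, which is the primitive used to annihilate matrix entries in a CORDIC-based QR decomposition.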
NASA Astrophysics Data System (ADS)
Botwicz, Jakub; Buciak, Piotr; Sapiecha, Piotr
2006-03-01
Intrusion Prevention Systems (IPSs) have become widely recognized as a powerful tool and an important element of IT security safeguards. The essential feature of network IPSs is searching through network packets and matching multiple strings that are fingerprints of known attacks. String matching is highly resource consuming and also the most significant bottleneck of IPSs. In this article, an extension of the classical Karp-Rabin algorithm and its implementation architectures were examined. The result is software that generates the source code of a string matching module in a hardware description language, which can easily be used to create an Intrusion Prevention System implemented in reconfigurable hardware. The prepared module matches the complete set of Snort IPS signatures, achieving a throughput of over 2 Gbps on an Altera Stratix II evaluation board. The most significant advantage of the proposed architecture is that an update of the pattern database does not require reconfiguration of the circuitry.
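The classical Karp-Rabin idea that such hardware extends can be sketched in software: one rolling hash slides over the text while every pattern fingerprint sits in a hash table, so the per-byte cost is independent of the number of patterns. This is a minimal sketch (equal-length patterns, illustrative base and modulus), not the generated HDL module:

```python
def multi_match(text, patterns, base=256, mod=(1 << 61) - 1):
    """Karp-Rabin extended to many equal-length patterns: one rolling
    hash over the text, a hash-table lookup per position, and an exact
    comparison only on a fingerprint hit."""
    m = len(patterns[0])
    assert all(len(p) == m for p in patterns), "equal-length patterns only"
    targets = {}
    for p in patterns:
        h = 0
        for ch in p:
            h = (h * base + ord(ch)) % mod
        targets.setdefault(h, []).append(p)
    outw = pow(base, m, mod)  # weight of the character leaving the window
    hits, h = [], 0
    for i, ch in enumerate(text):
        h = (h * base + ord(ch)) % mod
        if i >= m:
            h = (h - ord(text[i - m]) * outw) % mod
        if i >= m - 1 and h in targets:
            start = i - m + 1
            for p in targets[h]:
                if text[start:start + m] == p:  # guard against collisions
                    hits.append((start, p))
    return hits
```

In the hardware version described above, the hash update and comparisons are unrolled into parallel logic; here the same idea runs serially.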
From algorithm to implementation: a case study on blind carrier synchronization
NASA Astrophysics Data System (ADS)
Schmidt, D.; Brack, T.; Wasenmüller, U.; Wehn, N.
2006-09-01
Increasing chip complexities demand higher design productivity. IP cores, which implement commonly needed operations, can help to dramatically shorten development and verification times for new designs. They often allow for an efficient mapping of algorithmic tasks to a hardware architecture. In this paper we present a novel configurable building block for blind carrier synchronization that features combined frequency and phase offset estimation and an alternative modulation removal that improves communication performance compared to state-of-the-art designs. The design flow used exploits the benefits of IP cores for rapid development while still offering the designer the full range of optimization possibilities for a specific design. It allowed us to perform an almost complete design space exploration, assuring a near-optimal solution to the given problem. The implementation platform is a Xilinx Virtex II Pro FPGA.
Multi-GPU implementation of a VMAT treatment plan optimization algorithm
Tian, Zhen E-mail: Xun.Jia@UTSouthwestern.edu Folkerts, Michael; Tan, Jun; Jia, Xun E-mail: Xun.Jia@UTSouthwestern.edu Jiang, Steve B. E-mail: Xun.Jia@UTSouthwestern.edu; Peng, Fei
2015-06-15
Purpose: Volumetric modulated arc therapy (VMAT) optimization is a computationally challenging problem due to its large data size, high degrees of freedom, and many hardware constraints. High-performance graphics processing units (GPUs) have been used to speed up the computations. However, a GPU's relatively small memory size cannot handle cases with a large dose-deposition coefficient (DDC) matrix, e.g., those with a large target size, multiple targets, multiple arcs, and/or a small beamlet size. The main purpose of this paper is to report an implementation of a column-generation-based VMAT algorithm, previously developed in the authors' group, on a multi-GPU platform to solve the memory limitation problem. While the column-generation-based VMAT algorithm has been previously developed, the GPU implementation details have not been reported. Hence, another purpose is to present detailed techniques employed for GPU implementation. The authors also would like to utilize this particular problem as an example to study the feasibility of using a multi-GPU platform to solve large-scale problems in medical physics. Methods: The column-generation approach generates VMAT apertures sequentially by solving a pricing problem (PP) and a master problem (MP) iteratively. In the authors' method, the sparse DDC matrix is first stored on a CPU in coordinate list (COO) format. On the GPU side, this matrix is split into four submatrices according to beam angles, which are stored on four GPUs in compressed sparse row format. Computation of the beamlet price, the first step in the PP, is accomplished using multiple GPUs. A fast inter-GPU data transfer scheme is accomplished using peer-to-peer access. The remaining steps of the PP and MP are implemented on the CPU or a single GPU due to their modest problem scale and computational loads. The Barzilai-Borwein algorithm with a subspace step scheme is adopted here to solve the MP. A head and neck (H and N) cancer case is
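The Barzilai-Borwein step mentioned for the MP solver is simple to state: reuse the last step s and gradient change y to pick the step length. A minimal single-GPU-free sketch (no subspace scheme, hypothetical quadratic objective rather than the VMAT master problem):

```python
import numpy as np

def bb_minimize(grad, x0, n_iter=50, alpha0=1e-3):
    """Gradient descent with the Barzilai-Borwein step length
    alpha_k = (s^T s) / (s^T y), where s = x_k - x_{k-1} and
    y = grad_k - grad_{k-1}; a cheap quasi-Newton-like scaling."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    x_new = x - alpha0 * g  # small fixed first step to bootstrap s and y
    for _ in range(n_iter):
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        sy = s @ y
        alpha = (s @ s) / sy if sy > 0 else alpha0  # safeguard
        x, g = x_new, g_new
        x_new = x - alpha * g
    return x_new
```

On a strongly convex quadratic the BB step lies between the inverse extreme eigenvalues of the Hessian, which is what gives the method its fast (if nonmonotone) convergence.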
Implementing Quantum Algorithms with Modular Gates in a Trapped Ion Chain
NASA Astrophysics Data System (ADS)
Figgatt, Caroline; Debnath, Shantanu; Linke, Norbert; Landsman, Kevin; Wright, Ken; Monroe, Chris
2016-05-01
We present experimental results on quantum algorithms performed using fully modular one- and two-qubit gates in a linear chain of 5 Yb+ ions. This is accomplished through arbitrary qubit addressing and manipulation from stimulated Raman transitions driven by a beat note between counter-propagating beams from a pulsed laser. The Raman beam pairs consist of one global beam and a set of counter-propagating individual addressing beams, one for each ion. This provides arbitrary single-qubit rotations as well as arbitrary selection of ion pairs for a fully-connected system of two-qubit modular XX-entangling gates implemented using a pulse-segmentation scheme. We execute controlled-NOT gates with an average fidelity of 97.0% for all 10 possible pairs. Programming arbitrary sequences of gates allows us to construct any quantum algorithm, making this system a universal quantum computer. As an example, we present experimental results for the Bernstein-Vazirani algorithm using 4 control qubits and 1 ancilla, performed with concatenated gates that can be reconfigured to construct all 16 possible oracles, and obtain a process fidelity of 90.3%. This work is supported by the ARO with funding from the IARPA MQCO program and the AFOSR MURI on Quantum Measurement and Verification.
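The Bernstein-Vazirani circuit described above is small enough to simulate directly. The statevector sketch below (an idealized simulation, not the trapped-ion pulse-level implementation) shows why one oracle query suffices: H^n, a phase oracle (-1)^(s·x), and H^n again concentrate all amplitude on the hidden string s:

```python
import numpy as np

def bernstein_vazirani(s_bits):
    """Statevector simulation of Bernstein-Vazirani on n qubits:
    one query to the phase oracle reveals s deterministically."""
    n = len(s_bits)
    H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    Hn = H
    for _ in range(n - 1):
        Hn = np.kron(Hn, H)           # n-qubit Hadamard transform
    state = np.zeros(2 ** n)
    state[0] = 1.0                    # |0...0>
    state = Hn @ state                # uniform superposition
    s_int = int("".join(map(str, s_bits)), 2)
    # phase oracle: amplitude of |x> picks up (-1)^(s.x mod 2)
    phases = np.array([(-1.0) ** bin(x & s_int).count("1")
                       for x in range(2 ** n)])
    state = Hn @ (phases * state)     # interference maps everything onto |s>
    measured = int(np.argmax(np.abs(state)))
    return [int(b) for b in format(measured, f"0{n}b")]
```

A classical algorithm needs n queries to the bit oracle; the quantum version needs one, which is what makes it a natural benchmark for the 16-oracle experiment above.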
NASA Astrophysics Data System (ADS)
Sham Kumar, S.; C. V., Sreehari; Vivek, K.; T. V., Praveen; Moosad, K. P. B.; Rajesh, R.
2015-06-01
This paper presents detailed experimental investigations of the performance of interferometer-based fiber optic hydrophones interrogated with different Phase Generated Carrier (PGC) demodulation algorithms. The study covers the effect of different parametric variations in the PGC implementations, compared through Signal to Noise And Distortion (SINAD) and Total Harmonic Distortion (THD) analysis. The paper discusses experiments on the most popular PGC-based algorithms, namely Arctangent, Differential Cross Multiplication (DCM), and Ameliorated PGC. A Distributed Feed-Back Fiber Laser (DFB-FL) based fiber optic hydrophone, with a Mach-Zehnder interferometer having an active phase modulator in the reference arm and a mechanism to counter polarization-related intensity fading, was used for the experiments. Experiments were carried out to study the effects of various parameters such as the type and configuration of the low-pass filter, the frequency of the modulation signal, and the frequency of the acoustic signal. It is observed that all three factors, viz. the type of low-pass filter and the frequencies of the modulating and acoustic signals, play an important role in retrieving the acoustic signal, depending on the type of algorithm used, and are discussed here.
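The Arctangent scheme among the algorithms compared can be illustrated numerically. The sketch below (synthetic signal, illustrative parameters, and a crude moving-average low-pass filter rather than the configurable filters studied in the paper) mixes the interferometric signal with the carrier and its second harmonic and recovers the phase with atan2; the modulation depth C = 2.63 is chosen so that J1(C) is approximately equal to J2(C) and the Bessel factors cancel:

```python
import numpy as np

# Synthetic interferometer signal (illustrative parameters).
fs, f0, fa = 1_000_000, 20_000, 50      # sample / carrier / acoustic rates, Hz
t = np.arange(int(0.04 * fs)) / fs
phi = 0.5 * np.sin(2 * np.pi * fa * t)  # acoustic phase signal to recover
C = 2.63                                # modulation depth with J1(C) ~= J2(C)
s = 1.0 + 0.8 * np.cos(C * np.cos(2 * np.pi * f0 * t) + phi)

def lowpass(x, win):
    """Crude moving-average low-pass spanning one carrier period."""
    return np.convolve(x, np.ones(win) / win, mode="same")

win = fs // f0                                       # 50 samples = 1 period
i1 = lowpass(s * np.cos(2 * np.pi * f0 * t), win)    # -> -B*J1(C)*sin(phi)
i2 = lowpass(s * np.cos(4 * np.pi * f0 * t), win)    # -> -B*J2(C)*cos(phi)
phi_hat = np.arctan2(-i1, -i2)  # Bessel factors cancel since J1(C) = J2(C)
```

The Jacobi-Anger expansion puts sin(phi) on the carrier fundamental and cos(phi) on its second harmonic, which is exactly what the two mixers extract.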
Implementation of the FDK algorithm for cone-beam CT on the cell broadband engine architecture
NASA Astrophysics Data System (ADS)
Scherl, Holger; Koerner, Mario; Hofmann, Hannes; Eckert, Wieland; Kowarschik, Markus; Hornegger, Joachim
2007-03-01
In most of today's commercially available cone-beam CT scanners, the well-known FDK method is used for solving the 3D reconstruction task. The computational complexity of this algorithm prohibits its use for many medical applications without hardware acceleration. The brand-new Cell Broadband Engine Architecture (CBEA), with its high level of parallelism, is a cost-efficient processor for performing the FDK reconstruction according to the medical requirements. The programming scheme, however, is quite different from that of any standard personal computer hardware. In this paper, we present an innovative implementation of the most time-consuming parts of the FDK algorithm: filtering and back-projection. We also explain the transformations required to parallelize the algorithm for the CBEA. Our software framework allows the filtering and back-projection to be computed in parallel, making on-the-fly reconstruction possible. The achieved results demonstrate that a complete FDK reconstruction is computed with the CBEA in less than seven seconds for a standard clinical scenario. Given the fact that scan times are usually much higher, we conclude that the reconstruction is finished right after the end of data acquisition. This enables us to present the reconstructed volume to the physician in real time, immediately after the last projection image has been acquired by the scanning device.
Santi, Peter Angelo; Cutler, Theresa Elizabeth; Favalli, Andrea; Koehler, Katrina Elizabeth; Henzl, Vladimir; Henzlova, Daniela; Parker, Robert Francis; Croft, Stephen
2015-12-01
In order to improve the accuracy and capabilities of neutron multiplicity counting, additional quantifiable information is needed in order to address the assumptions that are present in the point model. Extracting and utilizing higher order moments (Quads and Pents) from the neutron pulse train represents the most direct way of extracting additional information from the measurement data to allow for an improved determination of the physical properties of the item of interest. The extraction of higher order moments from a neutron pulse train required the development of advanced dead time correction algorithms which could correct for dead time effects in all of the measurement moments in a self-consistent manner. In addition, advanced analysis algorithms have been developed to address specific assumptions that are made within the current analysis model, namely that all neutrons are created at a single point within the item of interest, and that all neutrons that are produced within an item are created with the same energy distribution. This report will discuss the current status of implementation and initial testing of the advanced dead time correction and analysis algorithms that have been developed in an attempt to utilize higher order moments to improve the capabilities of correlated neutron measurement techniques.
Implementation of a two-qubit Grover algorithm using superconducting qubits
NASA Astrophysics Data System (ADS)
Steffen, Matthias; Corcoles, Antonio; Chow, Jerry; Gambetta, Jay; Smolin, John; Ware, Matt; Strand, Joel; Plourde, Britton
2013-03-01
High-fidelity two-qubit gates have previously been demonstrated with fixed-frequency superconducting qubits, employing the cross-resonance effect, which generates the qubit-qubit interaction by driving qubit 1 at the frequency of qubit 2. The drawback of previous implementations of the cross-resonance gate is that single-qubit gates on qubit 2 emerge when the qubits are multi-level systems instead of strictly two-level systems. As a result, two-qubit gates must be tuned up by careful timing or by explicitly applying single-qubit correction pulses. This is a cumbersome procedure and can add overall errors. Instead, we show a refocusing scheme which preserves the two-qubit interaction but eliminates the single-qubit gates. The total gate length is only increased by the duration of two single-qubit pi-pulses, which is a low overhead. Using this composite pulse, we show an implementation of a two-qubit Grover's algorithm without applying any correction pulses. The average success probability of the algorithm is consistent with fidelity metrics obtained by independent randomized benchmarking experiments (both single- and two-qubit). We acknowledge support from IARPA under contract W911NF-10-1-0324.
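For two qubits, a single Grover iteration finds the marked state with certainty, which is what makes the algorithm such a clean hardware benchmark. An idealized statevector sketch (no noise, no gate decomposition into cross-resonance pulses):

```python
import numpy as np

def grover_2qubit(marked):
    """One Grover iteration on 2 qubits: uniform superposition, phase
    oracle on the marked index, then inversion about the mean.
    With N = 4 a single iteration yields the marked state exactly."""
    psi = np.full(4, 0.5)       # H (x) H applied to |00>
    psi[marked] *= -1.0         # oracle: flip the sign of |marked>
    psi = 2 * psi.mean() - psi  # diffusion operator 2|s><s| - I
    return np.abs(psi) ** 2     # measurement probabilities

grover_2qubit(2)  # probabilities [0, 0, 1, 0]
```

On hardware the success probability falls short of 1 by the accumulated gate errors, which is why it tracks the randomized benchmarking fidelities quoted above.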
Pre-Hardware Optimization and Implementation Of Fast Optics Closed Control Loop Algorithms
NASA Technical Reports Server (NTRS)
Kizhner, Semion; Lyon, Richard G.; Herman, Jay R.; Abuhassan, Nader
2004-01-01
One of the main heritage tools used in scientific and engineering data spectrum analysis is the Fourier Integral Transform and its high-performance digital equivalent, the Fast Fourier Transform (FFT). The FFT is particularly useful in two-dimensional (2-D) image processing (FFT2) within optical systems control. However, timing constraints of a fast optics closed control loop would require a supercomputer to run the software implementation of the FFT2 and its inverse, as well as other representative image processing algorithms, such as numerical image folding and fringe feature extraction. A laboratory supercomputer is not always available even for ground operations and is not feasible for a flight project. However, the computationally intensive algorithms still warrant alternative implementation using reconfigurable computing (RC) technologies such as Digital Signal Processors (DSP) and Field Programmable Gate Arrays (FPGA), which provide low-cost, compact super-computing capabilities. We present a new RC hardware implementation and utilization architecture that significantly reduces the computational complexity of a few basic image-processing algorithms, such as FFT2, image folding, and phase diversity, for the NASA Solar Viewing Interferometer Prototype (SVIP), using a cluster of DSPs and FPGAs. The DSP cluster utilization architecture also assures avoidance of a single point of failure, while using commercially available hardware. This, combined with pre-hardware optimization of the control algorithms, for the first time allows construction of image-based 800 Hertz (Hz) optics closed control loops on board a spacecraft, based on the SVIP ground instrument. That spacecraft is the proposed Earth Atmosphere Solar Occultation Imager (EASI) to study the greenhouse gases CO2, C2H, H2O, O3, O2, and N2O from the Lagrange-2 point in space. This paper provides an advanced insight into a new type of science capabilities for future space exploration missions based on on-board image processing
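The reason FFT2 sits at the heart of such loops is the convolution theorem: a 2-D spatial filter becomes an element-wise product in the frequency domain, O(N^2 log N) instead of O(N^2 K^2) for direct filtering. A minimal numpy sketch (circular convolution with an illustrative kernel, not the SVIP flight code):

```python
import numpy as np

def fft2_convolve(img, kernel):
    """Circular 2-D convolution via the FFT2 convolution theorem."""
    K = np.zeros_like(img, dtype=float)
    kh, kw = kernel.shape
    K[:kh, :kw] = kernel
    # centre the kernel at the origin so the output is not shifted
    K = np.roll(K, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(K)).real
```

A hardware FFT2 core computes the two forward transforms and the inverse; the product in between is embarrassingly parallel, which is why the operation maps so well onto DSP/FPGA clusters.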
NASA Technical Reports Server (NTRS)
Kizhner, Semion; Flatley, Thomas P.; Hestnes, Phyllis; Jentoft-Nilsen, Marit; Petrick, David J.; Day, John H. (Technical Monitor)
2001-01-01
Spacecraft telemetry rates have steadily increased over the last decade presenting a problem for real-time processing by ground facilities. This paper proposes a solution to a related problem for the Geostationary Operational Environmental Spacecraft (GOES-8) image processing application. Although large super-computer facilities are the obvious heritage solution, they are very costly, making it imperative to seek a feasible alternative engineering solution at a fraction of the cost. The solution is based on a Personal Computer (PC) platform and synergy of optimized software algorithms and re-configurable computing hardware technologies, such as Field Programmable Gate Arrays (FPGA) and Digital Signal Processing (DSP). It has been shown in [1] and [2] that this configuration can provide superior inexpensive performance for a chosen application on the ground station or on-board a spacecraft. However, since this technology is still maturing, intensive pre-hardware steps are necessary to achieve the benefits of hardware implementation. This paper describes these steps for the GOES-8 application, a software project developed using Interactive Data Language (IDL) (Trademark of Research Systems, Inc.) on a Workstation/UNIX platform. The solution involves converting the application to a PC/Windows/RC platform, selected mainly by the availability of low cost, adaptable high-speed RC hardware. In order for the hybrid system to run, the IDL software was modified to account for platform differences. It was interesting to examine the gains and losses in performance on the new platform, as well as unexpected observations before implementing hardware. After substantial pre-hardware optimization steps, the necessity of hardware implementation for bottleneck code in the PC environment became evident and solvable beginning with the methodology described in [1], [2], and implementing a novel methodology for this specific application [6]. The PC-RC interface bandwidth problem for the
The Parallel Implementation of Algorithms for Finding the Reflection Symmetry of the Binary Images
NASA Astrophysics Data System (ADS)
Fedotova, S.; Seredin, O.; Kushnir, O.
2017-05-01
In this paper, we investigate an exact method of searching for an axis of binary image symmetry, based on a brute-force search among all potential symmetry axes. As a measure of symmetry, we use the set-theoretic Jaccard similarity applied to the two subsets of pixels into which the image is divided by some axis. The brute-force search always finds the axis of approximate symmetry, which can be considered as ground truth, but it requires quite a lot of time to process each image. As a first step of our contribution, we develop a parallel version of the brute-force algorithm. It allows us to process large image databases and obtain the desired axis of approximate symmetry for each shape in a database. Experimental studies on the "Butterflies" and "Flavia" datasets have shown that the proposed algorithm takes several minutes per image to find a symmetry axis. However, in real-world applications we need computational efficiency that allows solving the task of symmetry axis search in real or quasi-real time. So, for the task of fast shape symmetry calculation on a common multicore PC, we elaborated another parallel program, based on the procedure suggested before in (Fedotova, 2016). That method takes as an initial axis the one obtained by a superfast comparison of two skeleton primitive sub-chains. This process takes about 0.5 s on a common PC, which is considerably faster than any of the optimized brute-force methods, including those implemented on a supercomputer. In our experiments, for 70 percent of the cases the found axis coincides with the ground-truth one exactly, and for the rest it is very close to the ground truth.
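The Jaccard-based symmetry measure is easy to state in code. The sketch below restricts the brute force to vertical axes through integer columns (the paper searches over all orientations and offsets) and scores each candidate axis by the Jaccard similarity between the image and its reflection:

```python
import numpy as np

def jaccard_at_axis(img, c):
    """Jaccard similarity between a binary image and its reflection
    about the vertical axis through column c."""
    h, w = img.shape
    mirrored = np.zeros_like(img)
    for col in range(w):
        mc = 2 * c - col          # reflected column index
        if 0 <= mc < w:
            mirrored[:, mc] = img[:, col]
    inter = np.logical_and(img, mirrored).sum()
    union = np.logical_or(img, mirrored).sum()
    return inter / union if union else 0.0

def best_axis(img):
    """Brute force over all candidate vertical axes."""
    scores = [jaccard_at_axis(img, c) for c in range(img.shape[1])]
    return int(np.argmax(scores)), max(scores)
```

The per-axis scores are independent, so the outer loop parallelizes trivially, which is the structure the paper's parallel brute-force version exploits.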
A time-efficient algorithm for implementing the Catmull-Clark subdivision method
NASA Astrophysics Data System (ADS)
Ioannou, G.; Savva, A.; Stylianou, V.
2015-10-01
Splines are the most popular methods in figure modeling and CAGD (Computer Aided Geometric Design) for generating smooth surfaces from a number of control points. The control points define the shape of a figure, and splines calculate the required number of points which, when displayed on a computer screen, result in a smooth surface. However, spline methods are based on a rectangular topological structure of points, i.e., a two-dimensional table of vertices, and thus cannot generate complex figures, such as human and animal bodies, whose complex structure does not allow them to be defined by a regular rectangular grid. On the other hand, surface subdivision methods, which are derived from splines, generate surfaces defined by an arbitrary topology of control points. This is the reason that during the last fifteen years subdivision methods have taken the lead over regular spline methods in all areas of modeling, in both industry and research. The cost of software developed to read control points and calculate the surface lies in its run-time, due to the fact that the surface structure required for handling arbitrary topological grids is very complicated. Many software programs related to the implementation of subdivision surfaces have been developed; however, not many algorithms are documented in the literature to support developers in writing efficient code. This paper aims to assist programmers by presenting a time-efficient algorithm for implementing subdivision splines. The Catmull-Clark method, the most popular of the subdivision methods, has been employed to illustrate the algorithm.
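As a flavor of the data structures involved, the sketch below computes two ingredients of one Catmull-Clark step on a quad mesh, face points and edge points, using a dictionary keyed by the sorted vertex pair as the edge-to-face adjacency; this is an illustrative structure, not the paper's optimized algorithm:

```python
import numpy as np

def catmull_clark_points(verts, faces):
    """Face points (face centroids) and interior edge points (average of
    the edge's two endpoints and its two adjacent face points) for a
    quad mesh given as a vertex list and faces of vertex indices."""
    verts = np.asarray(verts, dtype=float)
    face_pts = {fi: verts[list(f)].mean(axis=0) for fi, f in enumerate(faces)}
    # edge -> adjacent faces, keyed by the sorted endpoint pair
    edge_faces = {}
    for fi, f in enumerate(faces):
        for a, b in zip(f, f[1:] + f[:1]):
            edge_faces.setdefault(tuple(sorted((a, b))), []).append(fi)
    edge_pts = {}
    for (a, b), adj in edge_faces.items():
        if len(adj) == 2:  # interior edge of a closed mesh
            edge_pts[(a, b)] = (verts[a] + verts[b]
                                + face_pts[adj[0]] + face_pts[adj[1]]) / 4.0
    return face_pts, edge_pts
```

A full subdivision step would also reposition the original vertices and rebuild the face list; the dictionary-based adjacency is what frees the scheme from the rectangular grid that plain splines require.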
NASA Astrophysics Data System (ADS)
Paz, Abel; Plaza, Antonio
2010-08-01
Automatic target and anomaly detection are considered very important tasks for hyperspectral data exploitation. These techniques are now routinely applied in many application domains, including defence and intelligence, public safety, precision agriculture, geology, and forestry. Many of these applications require timely responses for swift decisions, which depend upon the high computing performance of algorithm analysis. However, with the recent explosion in the amount and dimensionality of hyperspectral imagery, this problem calls for the incorporation of parallel computing techniques. In the past, clusters of computers have offered an attractive solution for fast anomaly and target detection in hyperspectral data sets already transmitted to Earth. However, these systems are expensive and difficult to adapt to on-board data processing scenarios, in which low-weight and low-power integrated components are essential to reduce mission payload and obtain analysis results in (near) real time, i.e., at the same time as the data is collected by the sensor. An exciting new development in the field of commodity computing is the emergence of commodity graphics processing units (GPUs), which can now bridge the gap towards on-board processing of remotely sensed hyperspectral data. In this paper, we describe several new GPU-based implementations of target and anomaly detection algorithms for hyperspectral data exploitation. The parallel algorithms are implemented on latest-generation Tesla C1060 GPU architectures and quantitatively evaluated using hyperspectral data collected by NASA's AVIRIS system over the World Trade Center (WTC) in New York, five days after the terrorist attacks that collapsed the two main towers in the WTC complex.
Novel algorithm implementations in DARC: the Durham AO real-time controller
NASA Astrophysics Data System (ADS)
Basden, Alastair; Bitenc, Urban; Jenkins, David
2016-07-01
The Durham AO Real-time Controller has been used on-sky with the CANARY AO demonstrator instrument since 2010, and is also used to provide control for several AO test-benches, including DRAGON. Over this period, many new real-time algorithms have been developed, implemented and demonstrated, leading to performance improvements for CANARY. Additionally, the computational performance of this real-time system has continued to improve. Here, we provide details about recent updates and changes made to DARC, and the relevance of these updates, including new algorithms, to forthcoming AO systems. We present the computational performance of DARC when used on different hardware platforms, including hardware accelerators, and determine the relevance and potential for ELT scale systems. Recent updates to DARC have included algorithms to handle elongated laser guide star images, including correlation wavefront sensing, with options to automatically update references during AO loop operation. Additionally, sub-aperture masking options have been developed to increase signal to noise ratio when operating with non-symmetrical wavefront sensor images. The development of end-user tools has progressed with new options for configuration and control of the system. New wavefront sensor camera models and DM models have been integrated with the system, increasing the number of possible hardware configurations available, and a fully open-source AO system is now a reality, including drivers necessary for commercial cameras and DMs. The computational performance of DARC makes it suitable for ELT scale systems when implemented on suitable hardware. We present tests made on different hardware platforms, along with the strategies taken to optimise DARC for these systems.
NASA Astrophysics Data System (ADS)
Bernabe, Sergio; Igual, Francisco D.; Botella, Guillermo; Prieto-Matias, Manuel; Plaza, Antonio
2015-10-01
In the last decade, the issue of endmember variability has received considerable attention, particularly when each pixel is modeled as a linear combination of endmembers or pure materials. As a result, several models and algorithms have been developed for considering the effect of endmember variability in spectral unmixing, possibly including multiple endmembers in the spectral unmixing stage. One of the most popular approaches for this purpose is the multiple endmember spectral mixture analysis (MESMA) algorithm. The procedure executed by MESMA can be summarized as follows: (i) first, a standard linear spectral unmixing (LSU) or fully constrained linear spectral unmixing (FCLSU) algorithm is run in an iterative fashion; (ii) then, different endmember combinations, randomly selected from a spectral library, are used to decompose each mixed pixel; (iii) finally, the model with the best fit, i.e., with the lowest root mean square error (RMSE) in the reconstruction of the original pixel, is adopted. However, this procedure can be computationally very expensive, since several endmember combinations need to be tested and several abundance estimation steps need to be conducted, a fact that compromises the use of MESMA in applications under real-time constraints. In this paper we develop (for the first time in the literature) an efficient implementation of MESMA on different platforms using OpenCL, an open standard for parallel programming on heterogeneous systems. Our experiments have been conducted using a simulated data set and the clMAGMA mathematical library. Implementations in the same descriptive language on different architectures are very important in order to assess the feasibility of using heterogeneous platforms for efficient hyperspectral image processing in real remote sensing missions.
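Steps (ii) and (iii) of MESMA can be sketched directly: try each endmember combination, unmix the pixel, and keep the lowest-RMSE model. The sketch below uses unconstrained least squares in place of FCLSU and runs serially, unlike the OpenCL implementation described:

```python
import itertools
import numpy as np

def mesma(pixel, library, k=2):
    """Try every k-endmember combination from the spectral library,
    unmix the pixel by least squares, and keep the lowest-RMSE model.
    Returns (rmse, endmember indices, abundance vector)."""
    best = (np.inf, None, None)
    for combo in itertools.combinations(range(len(library)), k):
        E = library[list(combo)].T               # bands x k endmember matrix
        a, *_ = np.linalg.lstsq(E, pixel, rcond=None)
        rmse = np.sqrt(np.mean((pixel - E @ a) ** 2))
        if rmse < best[0]:
            best = (rmse, combo, a)
    return best
```

Each combination's unmixing is independent of the others, which is exactly the structure that makes the per-pixel search attractive for OpenCL parallelization.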
Design and Implementation of the Automated Rendezvous Targeting Algorithms for Orion
NASA Technical Reports Server (NTRS)
DSouza, Christopher; Weeks, Michael
2010-01-01
The Orion vehicle will be designed to perform several rendezvous missions: rendezvous with the ISS in Low Earth Orbit (LEO), rendezvous with the EDS/Altair in LEO, a contingency rendezvous with the ascent stage of the Altair in Low Lunar Orbit (LLO), and a contingency rendezvous in LLO with the ascent and descent stage in the case of an aborted lunar landing. Each of these scenarios imposes different operational, timing, and performance constraints on the GNC system. To this end, a suite of on-board guidance and targeting algorithms has been designed to meet the requirement to perform the rendezvous independent of communications with the ground. This capability is particularly relevant for the lunar missions, some of which may occur on the far side of the moon. This paper describes these algorithms, which are structured and arranged so as to be flexible and able to safely perform a wide variety of rendezvous trajectories. The goal of the algorithms is not merely to fly one specific type of canned rendezvous profile; rather, they were designed from the start to be general enough that any type of trajectory profile (i.e., a coelliptic profile, a stable orbit rendezvous profile, an expedited LLO rendezvous profile, etc.) can be flown, all using the same rendezvous suite of algorithms. Each of these profiles makes use of maneuver types which have been designed with the dual goals of robustness and performance. They are designed to converge quickly under dispersed conditions, and they perform many of the functions performed on the ground today. The targeting algorithms consist of a phasing maneuver (NC), an altitude adjust maneuver (NH), a plane change maneuver (NPC), a coelliptic maneuver (NSR), a Lambert targeted maneuver, and several multiple-burn targeted maneuvers which combine one or more of these algorithms. The derivation and implementation of each of these
Ersek, Mary; Polissar, Nayak; Du Pen, Anna; Jablonski, Anita; Herr, Keela; Neradilek, Moni B
2015-01-01
Background Unrelieved pain among nursing home (NH) residents is a well-documented problem. Attempts have been made to enhance pain management for older adults, including those in NHs. Several evidence-based clinical guidelines have been published to assist providers in assessing and managing acute and chronic pain in older adults. Despite the proliferation and dissemination of these practice guidelines, research has shown that intensive systems-level implementation strategies are necessary to change clinical practice and patient outcomes within a health-care setting. One promising approach is the embedding of guidelines into explicit protocols and algorithms to enhance decision making. Purpose The goal of the article is to describe several issues that arose in the design and conduct of a study that compared the effectiveness of pain management algorithms coupled with a comprehensive adoption program versus the effectiveness of education alone in improving evidence-based pain assessment and management practices, decreasing pain and depressive symptoms, and enhancing mobility among NH residents. Methods The study used a cluster-randomized controlled trial (RCT) design in which the individual NH was the unit of randomization. Rogers' Diffusion of Innovations theory provided the framework for the intervention. Outcome measures were surrogate-reported usual pain, self-reported usual and worst pain, and self-reported pain-related interference with activities, depression, and mobility. Results The final sample consisted of 485 NH residents from 27 NHs. The investigators were able to use a staggered enrollment strategy to recruit and retain facilities. The adaptive randomization procedures were successful in balancing intervention and control sites on key NH characteristics. Several strategies were successfully implemented to enhance the adoption of the algorithm. Limitations/Lessons The investigators encountered several methodological challenges that were inherent to
2014-01-01
Background The huge quantity of data produced in Biomedical research needs sophisticated algorithmic methodologies for its storage, analysis, and processing. High Performance Computing (HPC) appears as a magic bullet in this challenge. However, several hard-to-solve parallelization and load balancing problems arise in this context. Here we discuss the HPC-oriented implementation of a general purpose learning algorithm, originally conceived for DNA analysis and recently extended to treat uncertainty on data (U-BRAIN). The U-BRAIN algorithm is a learning algorithm that finds a Boolean formula in disjunctive normal form (DNF), of approximately minimum complexity, that is consistent with a set of data (instances) which may have missing bits. The conjunctive terms of the formula are computed in an iterative way by identifying, from the given data, a family of sets of conditions that must be satisfied by all the positive instances and violated by all the negative ones; such conditions allow the computation of a set of coefficients (relevances) for each attribute (literal), that form a probability distribution, allowing the selection of the term literals. The great versatility that characterizes it makes U-BRAIN applicable in many of the fields in which there are data to be analyzed. However, the memory and execution time required are of order O(n^3) and O(n^5), respectively, so the algorithm is unaffordable for huge data sets. Results We find mathematical and programming solutions able to lead us towards the implementation of the algorithm U-BRAIN on parallel computers. First we give a Dynamic Programming model of the U-BRAIN algorithm, then we minimize the representation of the relevances. When the data are of great size we are forced to use the mass memory, and depending on where the data are actually stored, the access times can be quite different. According to the evaluation of algorithmic efficiency based on the Disk Model, in order to
D'Angelo, Gianni; Rampone, Salvatore
2014-01-01
The huge quantity of data produced in Biomedical research needs sophisticated algorithmic methodologies for its storage, analysis, and processing. High Performance Computing (HPC) appears as a magic bullet in this challenge. However, several hard-to-solve parallelization and load balancing problems arise in this context. Here we discuss the HPC-oriented implementation of a general purpose learning algorithm, originally conceived for DNA analysis and recently extended to treat uncertainty on data (U-BRAIN). The U-BRAIN algorithm is a learning algorithm that finds a Boolean formula in disjunctive normal form (DNF), of approximately minimum complexity, that is consistent with a set of data (instances) which may have missing bits. The conjunctive terms of the formula are computed in an iterative way by identifying, from the given data, a family of sets of conditions that must be satisfied by all the positive instances and violated by all the negative ones; such conditions allow the computation of a set of coefficients (relevances) for each attribute (literal), that form a probability distribution, allowing the selection of the term literals. The great versatility that characterizes it makes U-BRAIN applicable in many of the fields in which there are data to be analyzed. However, the memory and execution time required are of order O(n^3) and O(n^5), respectively, so the algorithm is unaffordable for huge data sets. We find mathematical and programming solutions able to lead us towards the implementation of the algorithm U-BRAIN on parallel computers. First we give a Dynamic Programming model of the U-BRAIN algorithm, then we minimize the representation of the relevances. When the data are of great size we are forced to use the mass memory, and depending on where the data are actually stored, the access times can be quite different. According to the evaluation of algorithmic efficiency based on the Disk Model, in order to reduce the costs of
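The learning goal described above, a DNF formula whose terms cover every positive instance and violate every negative one, can be illustrated with a much simpler scheme than U-BRAIN's relevance-based literal selection. The sketch below is a greedy seed-and-generalize learner, assuming complete and consistent Boolean data (no missing bits), offered only to make the DNF-consistency objective concrete.

```python
def term_satisfies(term, inst):
    """A term is a dict {attribute index: required bit}; an instance
    satisfies the conjunction if it matches every required bit."""
    return all(inst[i] == v for i, v in term.items())

def learn_dnf(positives, negatives):
    """Greedy DNF learner: each uncovered positive instance seeds a
    fully specific conjunction, which is then generalized literal by
    literal as long as it still violates every negative instance.
    (U-BRAIN instead ranks literals by relevance coefficients and
    handles missing bits; this sketch does neither.)"""
    terms, uncovered = [], list(positives)
    while uncovered:
        seed = uncovered[0]
        term = {i: b for i, b in enumerate(seed)}
        for i in list(term):
            trial = {k: v for k, v in term.items() if k != i}
            if not any(term_satisfies(trial, n) for n in negatives):
                term = trial  # literal i is not needed to reject negatives
        terms.append(term)
        uncovered = [p for p in uncovered if not term_satisfies(term, p)]
    return terms

def dnf_value(terms, inst):
    """Evaluate the learned formula: OR over its conjunctive terms."""
    return any(term_satisfies(t, inst) for t in terms)
```

On XOR-like data this produces the two expected two-literal terms; the point of the paper is doing this kind of construction at O(n^3) memory / O(n^5) time scale, which is why the parallel formulation matters.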
NASA Astrophysics Data System (ADS)
Barrett, James
The incorporation of small, privately owned generation operating in parallel with, and supplying power to, the distribution network is becoming more widespread. This method of operation does, however, have problems associated with it. In particular, a loss of the connection to the main utility supply which leaves a portion of the utility load connected to the embedded generator will result in a power island. This situation presents possible dangers to utility personnel and the public, complications for smooth system operation, and probable plant damage should the two systems be reconnected out of synchronism. Loss of Grid (or Islanding), as this situation is known, is the subject of this thesis. The work begins by detailing the requirements for operation of generation embedded in the utility supply, with particular attention drawn to the requirements for a loss of grid protection scheme. The mathematical basis for a new loss of grid protection algorithm is developed, and the inclusion of the algorithm in an integrated generator protection scheme is described. A detailed description is given of the implementation of the new algorithm in microprocessor-based relay hardware to allow practical tests on small embedded generation facilities, including an in-house multiple generator test facility. The results obtained from the practical tests are compared with those obtained from simulation studies carried out in previous work, and the differences are discussed. The performance of the algorithm is enhanced beyond that of the theoretical algorithm by using the simulation results, applying simple filtering together with pattern recognition techniques. This provides stability during severe load fluctuations under parallel operation and system fault conditions, and improved performance under normal operating conditions and for loss of grid detection. In addition to operating for a loss of grid connection, the algorithm will respond to load fluctuations which occur within a power island
FPGA-based implementation for steganalysis: a JPEG-compatibility algorithm
NASA Astrophysics Data System (ADS)
Gutierrez-Fernandez, E.; Portela-García, M.; Lopez-Ongil, C.; Garcia-Valderas, M.
2013-05-01
Steganalysis is the process of detecting hidden data in cover documents, such as digital images, videos, and audio files. It is the inverse of steganography, the method used to hide secret messages. The widespread use of computers and network technologies makes digital files very convenient means for storing secret data or transmitting secret messages through the Internet. Depending on the cover medium used to embed the data, there are different steganalysis methods. In the case of images, many of the steganalysis and steganographic methods focus on the JPEG image format, since JPEG is one of the most common formats. One of the main handicaps of steganalysis methods is processing speed, since it is usually necessary to process huge amounts of data, possibly including on-going Internet traffic in real time. In this paper, a JPEG steganalysis system is implemented in an FPGA in order to speed up the detection process with respect to software-based implementations and to increase the throughput. In particular, the implemented method is the JPEG-compatibility detection algorithm, which is based on the fact that when a JPEG image is modified, the resulting image is incompatible with the JPEG compression process.
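The JPEG-compatibility idea can be sketched in software before any FPGA mapping. The sketch below is a simplified, pure-Python illustration under strong assumptions: a single uniform quantization step Q for every coefficient (real JPEG uses a full 8x8 quantization table), no level shift or clipping, and a direct O(N^4) DCT. A block that genuinely came out of the JPEG pipeline survives a re-quantization round trip unchanged, while a tampered block generally cannot be reproduced.

```python
import math

N = 8

def _c(u):
    # orthonormal DCT scale factors
    return math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)

def dct2(block):
    """Orthonormal 8x8 2-D DCT-II (direct evaluation, for clarity)."""
    return [[_c(u) * _c(v) * sum(
        block[x][y]
        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
        for x in range(N) for y in range(N))
        for v in range(N)] for u in range(N)]

def idct2(coef):
    """Inverse transform (DCT-III), the orthonormal pair of dct2."""
    return [[sum(
        _c(u) * _c(v) * coef[u][v]
        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
        for u in range(N) for v in range(N))
        for y in range(N)] for x in range(N)]

def decompress(q_coef, Q):
    """Dequantize and return the rounded pixel-domain block."""
    deq = [[q_coef[u][v] * Q for v in range(N)] for u in range(N)]
    return [[round(p) for p in row] for row in idct2(deq)]

def jpeg_compatible(block, Q):
    """Re-quantize the block and compare the round trip with the input:
    equality means the block is consistent with JPEG compression at
    step Q; inequality flags a (possibly steganographic) modification."""
    coef = dct2(block)
    q = [[round(coef[u][v] / Q) for v in range(N)] for u in range(N)]
    return decompress(q, Q) == block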
NASA Astrophysics Data System (ADS)
Tang, Yu-Hang; Karniadakis, George; Crunch Team
2014-03-01
We present a scalable dissipative particle dynamics simulation code, fully implemented on Graphics Processing Units (GPUs) using a hybrid CUDA/MPI programming model, which achieves 10-30 times speedup on a single GPU over 16 CPU cores and almost linear weak scaling across a thousand nodes. A unified framework is developed within which the efficient generation of the neighbor list and the maintenance of particle data locality are addressed. Our algorithm generates strictly ordered neighbor lists in parallel, while the construction is deterministic and makes no use of atomic operations or sorting. Such a neighbor list leads to optimal data loading efficiency when combined with a two-level particle reordering scheme. A faster in situ generation scheme for Gaussian random numbers is proposed using precomputed binary signatures. We designed custom transcendental functions that are fast and accurate for evaluating the pairwise interaction. Computer benchmarks demonstrate the speedup of our implementation over the CPU implementation as well as strong and weak scalability. A large-scale simulation of spontaneous vesicle formation consisting of 128 million particles was conducted to illustrate the practicality of our code in real-world applications. This work was supported by the new Department of Energy Collaboratory on Mathematics for Mesoscopic Modeling of Materials (CM4). Simulations were carried out at the Oak Ridge Leadership Computing Facility through the INCITE program under project BIP017.
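The cell-list construction that underlies such neighbor lists is easy to show on the CPU. The sketch below is a minimal 2-D, non-periodic version verified against brute force; note that it produces the strict index ordering with a final sort, whereas the paper's GPU scheme is precisely about getting that ordering deterministically without sorting or atomics. Names and the toy geometry are illustrative.

```python
import math

def neighbor_lists(points, box, cutoff):
    """Cell-list neighbor search: bin 2-D particles into square cells no
    smaller than the cutoff, then test only candidates from the 3x3
    block of surrounding cells (O(n) for uniform density)."""
    n = max(1, int(box / cutoff))   # cells per side; cell size >= cutoff
    size = box / n
    cells = {}
    for idx, (x, y) in enumerate(points):
        key = (min(int(x / size), n - 1), min(int(y / size), n - 1))
        cells.setdefault(key, []).append(idx)
    out = []
    for i, (xi, yi) in enumerate(points):
        cx = min(int(xi / size), n - 1)
        cy = min(int(yi / size), n - 1)
        neigh = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in cells.get((cx + dx, cy + dy), ()):
                    if j != i and math.hypot(points[j][0] - xi,
                                             points[j][1] - yi) <= cutoff:
                        neigh.append(j)
        out.append(sorted(neigh))   # strict index order (sorted here;
                                    # the paper builds it sort-free)
    return out
```

The ordered lists are what make the subsequent force loop's memory accesses predictable, which is the data-locality point the abstract makes.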
Efficient algorithms and implementations of entropy-based moment closures for rarefied gases
NASA Astrophysics Data System (ADS)
Schaerer, Roman Pascal; Bansal, Pratyuksh; Torrilhon, Manuel
2017-07-01
We present efficient algorithms and implementations of the 35-moment system equipped with the maximum-entropy closure in the context of rarefied gases. While closures based on the principle of entropy maximization have been shown to yield very promising results for moderately rarefied gas flows, the computational cost of these closures is in general much higher than for closure theories with explicit closed-form expressions of the closing fluxes, such as Grad's classical closure. Following a similar approach as Garrett et al. (2015) [13], we investigate efficient implementations of the computationally expensive numerical quadrature method used for the moment evaluations of the maximum-entropy distribution by exploiting its inherent fine-grained parallelism with the parallelism offered by multi-core processors and graphics cards. We show that using a single graphics card as an accelerator allows speed-ups of two orders of magnitude when compared to a serial CPU implementation. To accelerate the time-to-solution for steady-state problems, we propose a new semi-implicit time discretization scheme. The resulting nonlinear system of equations is solved with a Newton type method in the Lagrange multipliers of the dual optimization problem in order to reduce the computational cost. Additionally, fully explicit time-stepping schemes of first and second order accuracy are presented. We investigate the accuracy and efficiency of the numerical schemes for several numerical test cases, including a steady-state shock-structure problem.
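The dual Newton iteration at the heart of such closures can be shown on a toy 1-D problem. The sketch below is a minimal illustration under assumptions far simpler than the paper's 35-moment system: monomial basis (1, x, x^2) on [-1, 1], trapezoidal quadrature in place of a tuned quadrature rule, and undamped full Newton steps in the Lagrange multipliers of the dual problem. All function names are illustrative.

```python
import math

# fixed trapezoidal quadrature on [-1, 1]
XS = [-1.0 + 2.0 * k / 400 for k in range(401)]
W = [(2.0 / 400) * (0.5 if k in (0, 400) else 1.0) for k in range(401)]

def moments(lam):
    """Moments and dual Hessian of f(x) = exp(lam . (1, x, x^2))."""
    m = [0.0, 0.0, 0.0]
    H = [[0.0] * 3 for _ in range(3)]
    for x, w in zip(XS, W):
        p = [1.0, x, x * x]
        f = w * math.exp(sum(l * pi for l, pi in zip(lam, p)))
        for a in range(3):
            m[a] += f * p[a]
            for b in range(3):
                H[a][b] += f * p[a] * p[b]
    return m, H

def solve3(A, b):
    """3x3 linear solve by Gauss-Jordan with partial pivoting."""
    M = [row[:] + [bv] for row, bv in zip(A, b)]
    for c in range(3):
        piv = max(range(c, 3), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(3):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][3] / M[i][i] for i in range(3)]

def closure(target, iters=30):
    """Newton in the dual multipliers: drive moments(lam) to target.
    The dual is strictly convex, so full steps suffice for mild targets."""
    lam = [0.0, 0.0, 0.0]
    for _ in range(iters):
        m, H = moments(lam)
        g = [mi - ti for mi, ti in zip(m, target)]
        d = solve3(H, g)
        lam = [l - di for l, di in zip(lam, d)]
    return lam
```

Every Newton step costs one pass of quadrature for the gradient and Hessian, which is the fine-grained, embarrassingly parallel work the paper offloads to GPUs.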
Implementation of the Algorithm for Congestion control in the Dynamic Circuit Network (DCN)
NASA Astrophysics Data System (ADS)
Nalamwar, H. S.; Ivanov, M. A.; Buddhawar, G. U.
2017-01-01
Transport Control Protocol (TCP) incast congestion happens when a number of senders work in parallel with the same server in a high-bandwidth, low-latency network. For many data center network applications, such as search engines, heavy traffic is present on such a server. Incast congestion degrades the entire performance, as packets are lost at the server side due to buffer overflow, and as a result the response time becomes longer. In this work, we focus on TCP throughput, round-trip time (RTT), receive window and retransmission. Our method is based on proactively adjusting the TCP receive window before packet loss occurs. We aim to avoid wasting bandwidth by adjusting the window size according to the number of packets. To avoid packet loss, the ICTCP algorithm has been implemented in the data center network (ToR).
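The proactive receive-window control can be sketched as a single decision rule. The snippet below is an ICTCP-style illustration, not the paper's implementation: it compares measured to expected per-connection throughput and shrinks or grows the advertised window in MSS steps before the buffer overflows. The threshold values are illustrative.

```python
def adjust_rwnd(rwnd, measured_bps, expected_bps, mss,
                decrease_thresh=0.2, increase_thresh=0.1):
    """Adjust a TCP receive window from the throughput ratio:
    - large shortfall of measured vs. expected -> window is oversized
      for this flow's share, back off by one MSS (floor of 2 MSS);
    - measured close to expected -> the flow keeps up, grow by one MSS;
    - otherwise leave the window unchanged.
    Thresholds are illustrative, not from the paper."""
    ratio = (expected_bps - measured_bps) / expected_bps
    if ratio > decrease_thresh:
        return max(2 * mss, rwnd - mss)
    if ratio < increase_thresh:
        return rwnd + mss
    return rwnd
```

Because the receiver enforces the window, this kind of control needs no sender-side TCP changes, which is what makes it deployable at the aggregation point.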
Dragonfly: an implementation of the expand–maximize–compress algorithm for single-particle imaging
Ayyer, Kartik; Lan, Ti-Yen; Elser, Veit; Loh, N. Duane
2016-01-01
Single-particle imaging (SPI) with X-ray free-electron lasers has the potential to change fundamentally how biomacromolecules are imaged. The structure would be derived from millions of diffraction patterns, each from a different copy of the macromolecule before it is torn apart by radiation damage. The challenges posed by the resultant data stream are staggering: millions of incomplete, noisy and un-oriented patterns have to be computationally assembled into a three-dimensional intensity map and then phase reconstructed. In this paper, the Dragonfly software package is described, based on a parallel implementation of the expand–maximize–compress reconstruction algorithm that is well suited for this task. Auxiliary modules to simulate SPI data streams are also included to assess the feasibility of proposed SPI experiments at the Linac Coherent Light Source, Stanford, California, USA. PMID:27504078
Dragonfly: an implementation of the expand-maximize-compress algorithm for single-particle imaging.
Ayyer, Kartik; Lan, Ti-Yen; Elser, Veit; Loh, N Duane
2016-08-01
Single-particle imaging (SPI) with X-ray free-electron lasers has the potential to change fundamentally how biomacromolecules are imaged. The structure would be derived from millions of diffraction patterns, each from a different copy of the macromolecule before it is torn apart by radiation damage. The challenges posed by the resultant data stream are staggering: millions of incomplete, noisy and un-oriented patterns have to be computationally assembled into a three-dimensional intensity map and then phase reconstructed. In this paper, the Dragonfly software package is described, based on a parallel implementation of the expand-maximize-compress reconstruction algorithm that is well suited for this task. Auxiliary modules to simulate SPI data streams are also included to assess the feasibility of proposed SPI experiments at the Linac Coherent Light Source, Stanford, California, USA.
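The expand-maximize-compress loop itself is compact enough to demonstrate on a toy problem. The sketch below is a 1-D analogue under simplifying assumptions: the unknown orientation is a cyclic shift rather than a 3-D rotation, and the noise is Gaussian rather than the Poisson photon statistics Dragonfly actually uses. All names are illustrative.

```python
import math
import random

def emc_reconstruct(frames, L, sigma, iters=20, seed=1):
    """Toy 1-D EMC: Expand the model into its L cyclic shifts, compute
    each frame's posterior over shifts (Maximize), and average the
    frames back into the model accordingly (Compress)."""
    rng = random.Random(seed)
    model = [rng.random() for _ in range(L)]  # random start breaks symmetry
    for _ in range(iters):
        acc = [0.0] * L
        for f in frames:
            # log-likelihood of each cyclic shift of the current model
            logs = []
            for s in range(L):
                d = sum((f[i] - model[(i + s) % L]) ** 2 for i in range(L))
                logs.append(-d / (2.0 * sigma * sigma))
            mx = max(logs)
            w = [math.exp(l - mx) for l in logs]
            tot = sum(w)
            # accumulate the frame into every shift, posterior-weighted
            for s in range(L):
                ws = w[s] / tot
                for i in range(L):
                    acc[(i + s) % L] += ws * f[i]
        model = [a / len(frames) for a in acc]  # each position gets
                                                # total weight = n frames
    return model
```

The reconstruction is only defined up to a global shift, mirroring the global-rotation ambiguity of the real 3-D problem; the frame loop is the part Dragonfly distributes over MPI ranks and GPUs.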
Ma, Yong; Li, Juan; Lu, Bin
2016-02-01
In order to monitor the emotional state changes of the audience in real time and to adjust the music playlist accordingly, we propose an algorithmic framework for an electroencephalogram (EEG) driven personalized affective music recommendation system based on a portable dry electrode, and we further present a preliminary implementation on the Android platform. We used a two-dimensional emotional model of arousal and valence as the reference, and mapped the EEG data and the corresponding seed songs to the emotional coordinate quadrants in order to establish the matching relationship. Then, Mel frequency cepstrum coefficients were applied to evaluate the similarity between the seed songs and the songs in the music library. Finally, during music playback, we used the EEG data to identify the audience's emotional state, and played and adjusted the corresponding song playlist based on the established matching relationship.
An Efficient Implementation of the Sign LMS Algorithm Using Block Floating Point Format
NASA Astrophysics Data System (ADS)
Chakraborty, Mrityunjoy; Shaik, Rafiahamed; Lee, Moon Ho
2007-12-01
An efficient scheme is presented for implementing the sign LMS algorithm in block floating point format, which permits processing of data over a wide dynamic range at a processor complexity and cost as low as that of a fixed point processor. The proposed scheme adopts appropriate formats for representing the filter coefficients and the data. It also employs a scaled representation for the step-size that has a time-varying mantissa and also a time-varying exponent. Using these and an upper bound on the step-size mantissa, update relations for the filter weight mantissas and exponent are developed, taking care so that neither overflow occurs, nor are quantities which are already very small multiplied directly. Separate update relations are also worked out for the step size mantissa. The proposed scheme employs mostly fixed-point-based operations, and thus achieves considerable speedup over its floating-point-based counterpart.
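The core recursion being implemented is the sign LMS update itself, shown below in plain floating point. This sketch omits the paper's actual contribution, the block floating point mantissa/exponent bookkeeping, and only illustrates why the algorithm is hardware-friendly: the error enters the update only through its sign, so each tap update is an add/subtract of a scaled input sample.

```python
def sign_lms(x, d, ntaps, mu):
    """Sign LMS adaptive FIR filter: w <- w + mu * sign(e) * x.
    Returns the final tap weights."""
    w = [0.0] * ntaps
    for n in range(ntaps - 1, len(x)):
        frame = x[n - ntaps + 1:n + 1][::-1]          # x[n], x[n-1], ...
        y = sum(wi * xi for wi, xi in zip(w, frame))  # filter output
        e = d[n] - y                                  # error sample
        s = (e > 0) - (e < 0)                         # sign(e)
        w = [wi + mu * s * xi for wi, xi in zip(w, frame)]
    return w
```

In the block floating point scheme of the paper, the multiplications above become fixed-point mantissa operations with a shared exponent, which is where the speedup over a floating-point implementation comes from.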
Richter, S. S.; Beekmann, S. E.; Croco, J. L.; Diekema, D. J.; Koontz, F. P.; Pfaller, M. A.; Doern, G. V.
2002-01-01
An algorithm was implemented in the clinical microbiology laboratory to assess the clinical significance of organisms that are often considered contaminants (coagulase-negative staphylococci, aerobic and anaerobic diphtheroids, Micrococcus spp., Bacillus spp., and viridans group streptococci) when isolated from blood cultures. From 25 August 1999 through 30 April 2000, 12,374 blood cultures were submitted to the University of Iowa Clinical Microbiology Laboratory. Potential contaminants were recovered from 495 of 1,040 positive blood cultures. If one or more additional blood cultures were obtained within ±48 h and all were negative, the isolate was considered a contaminant. Antimicrobial susceptibility testing (AST) of these probable contaminants was not performed unless requested. If no additional blood cultures were submitted or there were additional positive blood cultures (within ±48 h), a pathology resident gathered patient clinical information and made a judgment regarding the isolate's significance. To evaluate the accuracy of these algorithm-based assignments, a nurse epidemiologist performed a retrospective chart review in approximately 60% of the cases. Agreement between the findings of the retrospective chart review and the automatic classification of the isolates with additional negative blood cultures as probable contaminants occurred among 85.8% of 225 isolates. In response to physician requests, AST had been performed on 15 of the 32 isolates with additional negative cultures considered significant by retrospective chart review. Agreement of the pathology resident assignment with the retrospective chart review occurred among 74.6% of 71 isolates. The laboratory-based algorithm provided an acceptably accurate means for assessing the clinical significance of potential contaminants recovered from blood cultures. PMID:12089259
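The triage rule described above reduces to a short decision function. The sketch below encodes only the branching logic of the abstract (it assumes the caller has already restricted the input to cultures drawn within the ±48 h window); names and return labels are illustrative.

```python
def classify_isolate(additional_cultures):
    """Blood-culture triage rule from the study, sketched:
    - one or more additional cultures within +/-48 h, all negative
      -> probable contaminant (AST withheld unless requested);
    - no additional cultures, or any additional positive
      -> significance judged by clinical (chart) review."""
    if additional_cultures and all(r == "negative" for r in additional_cultures):
        return "probable contaminant"
    return "clinical review required"
```

Encoding the rule this explicitly is what let the laboratory apply it automatically and then audit it against the retrospective chart review.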
Duong, T A; Stubberud, A R
2000-06-01
In this paper, we present a mathematical foundation, including a convergence analysis, for the cascade architecture neural network. Our analysis also shows that the convergence of the cascade architecture neural network is assured because it satisfies Lyapunov criteria in an added-hidden-unit domain rather than in the time domain. From this analysis, a mathematical foundation for the cascade correlation learning algorithm can be found. Furthermore, it becomes apparent that the cascade correlation scheme is a special case of this mathematical analysis, from which an efficient hardware learning algorithm called Cascade Error Projection (CEP) is proposed. CEP provides efficient learning in hardware and is faster to train, because part of the weights are deterministically obtained, and the learning of the remaining weights from the inputs to the hidden unit is performed as single-layer perceptron learning with the previously determined weights kept frozen. In addition, one can start out with zero weight values (rather than random finite weight values) when the learning of each layer is commenced. Further, unlike the cascade correlation algorithm (where a pool of candidate hidden units is added), only a single hidden unit is added at a time. Therefore, simplicity in hardware implementation is also achieved. Finally, 5- to 8-bit parity and chaotic time series prediction problems are investigated; the simulation results demonstrate that 4-bit or more weight quantization is sufficient for learning neural networks using CEP. In addition, it is demonstrated that this technique is able to compensate for lower-bit weight resolution by incorporating additional hidden units. However, generalization results may suffer somewhat with lower-bit weight quantization.
NASA Astrophysics Data System (ADS)
Rajaram, Vignesh; Subramanian, Shankar C.
2016-07-01
An important aspect from the perspective of operational safety of heavy road vehicles is the detection and avoidance of collisions, particularly at high speeds. The development of a collision avoidance system is the overall focus of the research presented in this paper. The collision avoidance algorithm was developed using a sliding mode controller (SMC) and compared to one developed using linear full state feedback in terms of performance and controller effort. Important dynamic characteristics such as load transfer during braking, tyre-road interaction, dynamic brake force distribution and pneumatic brake system response were considered. The effect of aerodynamic drag on the controller performance was also studied. The developed control algorithms have been implemented on a Hardware-in-Loop experimental set-up equipped with the vehicle dynamic simulation software, IPG/TruckMaker®. The evaluation has been performed for realistic traffic scenarios with different loading and road conditions. The Hardware-in-Loop experimental results showed that the SMC and full state feedback controller were able to prevent the collision. However, when the discrepancies in the form of parametric variations were included, the SMC provided better results in terms of reduced stopping distance and lower controller effort compared to the full state feedback controller.
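The sliding-mode idea in this setting can be shown with a deliberately simplified model. The sketch below is not the paper's controller (which includes brake-force distribution, load transfer and pneumatic lag): it is a 1-D point-mass braking loop where the sliding surface is the gap between the current speed and a safe-speed profile that stops short of the obstacle, and a saturated sign term (a boundary layer, the standard chattering remedy) produces the brake command. All parameter values are illustrative.

```python
import math

def smc_brake(v0, gap0, a_max=6.0, margin=2.0, dt=0.01, boundary=0.5):
    """1-D sliding-mode braking sketch.
    Surface: s = v - v_safe(gap), with v_safe the profile that stops
    `margin` metres short of the obstacle at 60% of peak deceleration.
    Control: brake = a_max * sat(s / boundary), braking only.
    Returns the remaining gap when the vehicle has stopped."""
    a_ref = 0.6 * a_max
    v, gap = v0, gap0
    while v > 0.01:
        v_safe = math.sqrt(max(2.0 * a_ref * (gap - margin), 0.0))
        s = v - v_safe
        brake = max(0.0, a_max * max(-1.0, min(1.0, s / boundary)))
        v = max(0.0, v - brake * dt)   # forward-Euler vehicle model
        gap -= v * dt
    return gap
```

The robustness argument of the paper shows up even here: the sign-type control law depends only on which side of the surface the state is on, so moderate model error (mass, road friction) changes the tracking lag, not the qualitative stopping behaviour.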
Design and implementation of a hybrid MPI-CUDA model for the Smith-Waterman algorithm.
Khaled, Heba; Faheem, Hossam El Deen Mostafa; El Gohary, Rania
2015-01-01
This paper provides a novel hybrid model for solving the multiple pair-wise sequence alignment problem combining message passing interface and CUDA, the parallel computing platform and programming model invented by NVIDIA. The proposed model targets homogeneous cluster nodes equipped with similar Graphical Processing Unit (GPU) cards. The model consists of the Master Node Dispatcher (MND) and the Worker GPU Nodes (WGN). The MND distributes the workload among the cluster working nodes and then aggregates the results. The WGN performs the multiple pair-wise sequence alignments using the Smith-Waterman algorithm. We also propose a modified implementation to the Smith-Waterman algorithm based on computing the alignment matrices row-wise. The experimental results demonstrate a considerable reduction in the running time by increasing the number of the working GPU nodes. The proposed model achieved a performance of about 12 Giga cell updates per second when we tested against the SWISS-PROT protein knowledge base running on four nodes.
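The row-wise organization mentioned in the abstract is easy to illustrate on the scoring recurrence itself. The sketch below computes only the local alignment score (no traceback, no MPI/CUDA), filling the matrix row by row so that just one previous row is kept in memory; the scoring parameters are illustrative defaults, not the paper's.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment score with linear gap penalty,
    filled row-wise: cell (i, j) needs only the previous row and the
    cell to its left, so memory is O(len(b))."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best
```

The row-at-a-time data flow is also what makes the GPU mapping natural: within a (suitably skewed) row, cells can be updated in parallel, and the hybrid model then distributes whole pairwise alignments across cluster nodes.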
Experimentally-implemented genetic algorithm (Exp-GA): toward fully optimal photovoltaics.
Zhong, Yan Kai; Fu, Sze Ming; Ju, Nyan Ping; Chen, Po Yu; Lin, Albert
2015-09-21
The geometry and dimension design is the most critical part for success in nano-photonic devices. The choices of the geometrical parameters dramatically affect the device performance. Most of the time, simulation is conducted to locate a suitable geometry, but in many cases simulation can be ineffective. The most pronounced examples are large-area randomized patterns for solar cells, light emitting diodes (LEDs), and thermophotovoltaics (TPV). A large random pattern is nearly impossible to calculate and optimize due to the extended CPU runtime and memory limitations. Other scenarios in which numerical simulations become ineffective include three-dimensional complex structures with an anisotropic dielectric response, which leads to extended simulation time, especially for the repeated runs during geometry optimization. In this paper, we show that by incorporating a genetic algorithm (GA) into real-world experiments, a shortened trial-and-error time can be achieved. More importantly, this scheme can be used for many photonic design problems that are unsuitable for simulation-based optimization. Moreover, the experimentally implemented genetic algorithm (Exp-GA) has the additional advantage that the resultant objective value is a real one rather than a theoretical one. This prevents gaps between modeling and fabrication due to process variation or inaccurate numerical models. Using TPV emitters as an example, a 22% enhancement in the mean objective value is achieved.
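The GA loop that such an experiment-in-the-loop scheme requires is generic. The sketch below is a plain real-coded GA (tournament selection, uniform crossover, clipped Gaussian mutation) in which the `evaluate` callback stands in for a fabricate-and-measure cycle; in the paper each evaluation is a physical experiment, not the analytic test function used here. Population sizes, rates and names are illustrative.

```python
import random

def genetic_search(evaluate, bounds, pop=12, gens=25, pm=0.3, seed=7):
    """Real-coded GA maximizing evaluate(params) over box bounds.
    Returns the best individual and fitness seen over the whole run
    (every evaluation is 'expensive', so none are discarded)."""
    rng = random.Random(seed)
    dim = len(bounds)
    P = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop)]
    fit = [evaluate(ind) for ind in P]
    best_ind, best_fit = max(zip(P, fit), key=lambda t: t[1])
    for _ in range(gens):
        Q = []
        while len(Q) < pop:
            def pick():  # binary tournament selection
                i, j = rng.randrange(pop), rng.randrange(pop)
                return P[i] if fit[i] >= fit[j] else P[j]
            p1, p2 = pick(), pick()
            child = [p1[k] if rng.random() < 0.5 else p2[k]
                     for k in range(dim)]
            for k, (lo, hi) in enumerate(bounds):  # Gaussian mutation
                if rng.random() < pm:
                    child[k] = min(hi, max(lo, child[k]
                                           + rng.gauss(0.0, 0.1 * (hi - lo))))
            Q.append(child)
        P = Q
        fit = [evaluate(ind) for ind in P]
        gen_best = max(zip(P, fit), key=lambda t: t[1])
        if gen_best[1] > best_fit:
            best_ind, best_fit = gen_best
    return best_ind, best_fit
```

Because `evaluate` is a black box, nothing in the loop cares whether the objective comes from a simulator or a measurement rig, which is precisely the property Exp-GA exploits.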
NASA Astrophysics Data System (ADS)
Bernabé, Sergio; Martin, Gabriel; Botella, Guillermo; Prieto-Matias, Manuel; Plaza, Antonio
2016-04-01
In recent years, hyperspectral analysis has been applied in many remote sensing applications. In fact, hyperspectral unmixing has been a challenging task in hyperspectral data exploitation. This process consists of three stages: (i) estimation of the number of pure spectral signatures or endmembers, (ii) automatic identification of the estimated endmembers, and (iii) estimation of the fractional abundance of each endmember in each pixel of the scene. However, unmixing algorithms can be computationally very expensive, a fact that compromises their use in applications under real-time constraints. Several techniques have been proposed to solve this problem, but until now most works have focused on the second and third stages. The execution cost of the first stage is usually lower than that of the other stages; indeed, it can be optional if the estimate is known a priori. However, its acceleration on parallel architectures is still an interesting and open problem. In this paper we address this issue, focusing on the GENE algorithm, a promising geometry-based proposal introduced in [1]. We have evaluated our parallel implementation in terms of both accuracy and computational performance through Monte Carlo simulations for real and synthetic data experiments. Performance results on a modern GPU show satisfactory 16x speedup factors, which allow us to expect that this method could meet real-time requirements in a fully operational unmixing chain.
NASA Astrophysics Data System (ADS)
Landoni, M.; Riva, M.; Pepe, F.; Aliverti, M.; Cabral, A.; Calderone, G.; Cirami, R.; Cristiani, S.; Di Marcantonio, P.; Genoni, M.; Mégevand, D.; Moschetti, M.; Oggioni, L.; Pariani, G.
2016-08-01
In this paper we review the ESPRESSO guiding algorithm for the Front End subsystem. ESPRESSO, the Echelle Spectrograph for Rocky Exoplanets and Stable Spectroscopic Observations, will be installed on ESO's Very Large Telescope (VLT). The Front End Unit (FEU) is the ESPRESSO subsystem which collects the light coming from the Coudé trains of all four Telescope Units (UTs), provides field and pupil stabilization better than 0.05'' via piezoelectric tip-tilt devices, and injects the beams into the spectrograph fibers. The field and pupil stabilization is obtained through a re-imaging system that collects the halo of the light outside the injection fiber and the image of the telescope pupil. In particular, we focus on the software design of the system, from the class diagram to the actual implementation. A review of the theoretical mathematical background required to understand the final design is also reported. We show the performance of the algorithm on the actual Front End by adopting a telescope simulator and exploring various scientific requirements.
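The closed-loop structure of such a stabilization system can be illustrated with a one-axis sketch. The snippet below is not the ESPRESSO software: it is a minimal integrator loop in which each guiding cycle measures the centroid error and adds a fraction of it to the tip-tilt command, driving the residual toward zero. The gain and the constant-drift disturbance are illustrative.

```python
def guide_loop(disturbance, gain=0.4):
    """One-axis field-stabilization sketch: for each guiding cycle,
    the camera sees err = disturbance - current tip-tilt command, and
    an integrator adds gain * err to the command. Returns the residual
    (arcsec) seen at every cycle."""
    cmd = 0.0
    residuals = []
    for d in disturbance:
        err = d - cmd        # centroid offset measured this cycle
        cmd += gain * err    # integrator update to the piezo tip-tilt
        residuals.append(err)
    return residuals
```

For a constant disturbance the residual decays geometrically as (1 - gain)^n, so the loop settles well below a 0.05'' requirement within a couple of dozen cycles; real loop-gain tuning must of course trade this against measurement noise and disturbance bandwidth.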
Implementation of damage detection algorithms for the Alfred Zampa Memorial Suspension Bridge
NASA Astrophysics Data System (ADS)
Talebinejad, I.; Sedarat, H.; Emami-Naeini, A.; Krimotat, A.; Lynch, Jerome
2014-03-01
This study investigated a number of different damage detection algorithms for structural health monitoring of a typical suspension bridge. The Alfred Zampa Memorial Bridge, a part of the Interstate 80 in California, was selected for this study. The focus was to implement and validate simple damage detection algorithms for structural health monitoring of complex bridges. Accordingly, the numerical analysis involved development of a high fidelity finite element model of the bridge in order to simulate various structural damage scenarios. The finite element model of the bridge was validated based on the experimental modal properties. A number of damage scenarios were simulated by changing the stiffness of different bridge components including suspenders, main cable, bulkheads and deck. Several vibration-based damage detection methods namely the change in the stiffness, change in the flexibility, change in the uniform load surface and change in the uniform load surface curvature were employed to locate the simulated damages. The investigation here provides the relative merits and shortcomings of these methods when applied to long span suspension bridges. It also shows the applicability of these methods to locate the decay in the structure.
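Of the indices listed above, the change-in-flexibility method is the simplest to sketch. The snippet below illustrates it on a 3-storey shear-building model under a shortcut assumption: the flexibility matrix is obtained by inverting the stiffness matrix directly, standing in for the modal assembly F = sum(phi_i phi_i^T / omega_i^2) that would be used with measured modes. The damaged storey is taken as the first degree of freedom whose flexibility diagonal grows; all values are illustrative.

```python
def inv(K):
    """Matrix inverse by Gauss-Jordan with partial pivoting
    (adequate for the small dense matrices used here)."""
    n = len(K)
    M = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(K)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def shear_K(k1, k2, k3):
    """Stiffness matrix of a 3-storey shear building with
    storey stiffnesses k1 (bottom) to k3 (top)."""
    return [[k1 + k2, -k2, 0.0],
            [-k2, k2 + k3, -k3],
            [0.0, -k3, k3]]

def damage_story(k_healthy, k_damaged, tol=1e-9):
    """Change-in-flexibility index: compare the flexibility diagonal
    between states; the first DOF whose flexibility grew marks the
    damaged storey. Returns (storey index, per-DOF change)."""
    Fh = inv(shear_K(*k_healthy))
    Fd = inv(shear_K(*k_damaged))
    delta = [Fd[i][i] - Fh[i][i] for i in range(3)]
    return next(i for i, d in enumerate(delta) if d > tol), delta
```

For a shear building the flexibility diagonal accumulates up the height, so the change appears at the damaged storey and every DOF above it; this is also a hint of why, as the paper reports, such simple indices become harder to interpret on a long-span suspension bridge.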
Implementation of the Kinetic Plasma Code with Locally Recursive non-Locally Asynchronous Algorithms
NASA Astrophysics Data System (ADS)
Perepelkina, A. Yu; Levchenko, V. D.; Goryachev, I. A.
2014-05-01
Numerical simulation is presently considered impractical for several relevant plasma kinetics problems due to limitations of computer hardware, even with the use of supercomputers. To overcome these limitations, we suggest developing algorithms that effectively utilize the hierarchy of the computer memory subsystem by optimizing the traversal rules of the dependency graph. The paper discusses these ideas for general cases of numerical simulation and their implementation in a particle-in-cell code. This approach enables the simulation of problems previously inaccessible to modeling and the execution of series of numerical experiments in reasonable time. The latter is demonstrated on a multiscale problem: the development of filamentation instability in laser interaction with overdense plasma. One simulation variant, with parameters typical of supercomputer runs, is performed using a single cluster node. The series of such experiments revealed that the dependence of energy loss on the incoming laser pulse amplitude is nonmonotonic and reaches over 4%, an interesting result for fast-ignition research.
Research and hardware implementation of image enhancement algorithm in OLED system
NASA Astrophysics Data System (ADS)
Xu, Meihua; Li, Ke; Fan, Yule
2009-11-01
During generation and transmission, images are influenced by the performance of the imaging system, quantization noise and other factors, and may exhibit reduced clarity and low contrast. To improve picture quality in an OLED system, the timing control logic and hardware of the whole OLED system were implemented based on a detailed analysis of the OLED panel's electrical characteristics and of various gray-scale scanning principles. A sub-field scanning mode is adopted in the design, with 64 gray levels and a vertical sweep frequency of 60 Hz~100 Hz. An FPGA is the core control device of the whole system; it processes the decoded DVI signal, and the design achieves real-time video display on the OLED. To improve color-image quality, a homomorphic filtering technique with adjustable parameters is studied. First, the high- and low-frequency parts are separated with the help of the illumination-reflectance model. Then the digital image is processed with an approximate high-pass filter; the simplified filtering algorithm is a compromise between the complexity of the hardware implementation and image quality. Finally, the enhanced image is obtained through the inverse frequency-domain transformation. Experimental results show that detail information stands out and the overall visual effect is improved after processing.
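The homomorphic filtering chain described above (log transform, illumination/reflectance split, high-emphasis filtering, inverse transform) can be sketched in software. This is a hedged NumPy illustration using a standard Gaussian high-emphasis kernel, not the paper's hardware-simplified filter; all parameter values and the toy image are illustrative assumptions:

```python
import numpy as np

def homomorphic_filter(img, gamma_l=0.5, gamma_h=2.0, c=1.0, d0=30.0):
    """Homomorphic enhancement: the log transform separates illumination
    (low frequency) from reflectance (high frequency); a Gaussian
    high-emphasis filter attenuates illumination (gain gamma_l < 1) and
    boosts reflectance (gain gamma_h > 1); exp inverts the log."""
    z = np.log1p(img.astype(np.float64))
    Z = np.fft.fftshift(np.fft.fft2(z))
    rows, cols = img.shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    D2 = (u[:, None] ** 2) + (v[None, :] ** 2)   # squared distance from DC
    H = (gamma_h - gamma_l) * (1 - np.exp(-c * D2 / d0 ** 2)) + gamma_l
    out = np.real(np.fft.ifft2(np.fft.ifftshift(H * Z)))
    return np.expm1(out)

# Low-contrast toy image: smooth illumination gradient plus fine texture.
x = np.linspace(0, 1, 64)
illum = 50 + 100 * x[None, :].repeat(64, axis=0)
texture = 5 * np.sin(np.arange(64) * 2.0)[:, None].repeat(64, axis=1)
img = illum + texture
enhanced = homomorphic_filter(img)
```

The paper's approximate high-pass filter trades this exact frequency-domain kernel for something cheap enough to realize on the FPGA; the structure of the pipeline is the same.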
Real-time implementation of camera positioning algorithm based on FPGA & SOPC
NASA Astrophysics Data System (ADS)
Yang, Mingcao; Qiu, Yuehong
2014-09-01
In recent years, with the development of positioning algorithms and FPGAs, real-time, fast and accurate camera positioning has become practical. Based on an in-depth study of embedded hardware and dual-camera positioning systems, this thesis sets up an infrared optical positioning system based on an FPGA and an SOPC system, which enables real-time positioning of marker points in space. The completed work comprises: (1) a CMOS sensor, driven by the FPGA hardware, extracts the pixels of three target objects; visible-light LEDs are used as the target points of the instrument. (2) Prior to extraction of the feature-point coordinates, the image is filtered (median filtering is used here) to suppress artifacts introduced by the physical properties of the platform. (3) Marker-point coordinates are extracted by the FPGA hardware circuit: a new iterative threshold-selection method segments the image, the resulting binary image is labeled, and the coordinates of the feature points are calculated by the center-of-gravity method. (4) Direct linear transformation (DLT) and epipolar-constraint methods are applied to the three-dimensional reconstruction of space coordinates from the planar-array CMOS system. The SOPC system-on-a-chip is used here, taking advantage of its dual-core architecture to run matching and coordinate computation separately, thus increasing processing speed.
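Steps (2)-(3) above, iterative threshold selection followed by center-of-gravity coordinate extraction, can be sketched in software. This NumPy toy illustrates the general technique, not the thesis's FPGA circuit; the spot position, image size and function names are invented:

```python
import numpy as np

def iterative_threshold(img, tol=0.5):
    """Classic iterative (isodata-style) threshold selection: repeatedly
    set the threshold to the mean of the two class means until it
    converges."""
    t = img.mean()
    while True:
        low, high = img[img <= t], img[img > t]
        t_new = 0.5 * (low.mean() + high.mean())
        if abs(t_new - t) < tol:
            return t_new
        t = t_new

def centroid(img, t):
    """Center-of-gravity of pixels above the threshold, weighted by
    intensity; returns (x, y) with sub-pixel precision."""
    ys, xs = np.nonzero(img > t)
    w = img[ys, xs].astype(np.float64)
    return (xs * w).sum() / w.sum(), (ys * w).sum() / w.sum()

# Toy frame: one bright Gaussian spot (an "LED") centered at (20, 12).
yy, xx = np.mgrid[0:32, 0:32]
img = 10 + 200 * np.exp(-((xx - 20) ** 2 + (yy - 12) ** 2) / 4.0)
t = iterative_threshold(img)
cx, cy = centroid(img, t)
```

The intensity weighting is what gives the center-of-gravity method its sub-pixel accuracy, which is why it is a common choice for LED marker localization.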
Accurate implementation of leaping in space: The spatial partitioned-leaping algorithm
NASA Astrophysics Data System (ADS)
Iyengar, Krishna A.; Harris, Leonard A.; Clancy, Paulette
2010-03-01
There is a great need for accurate and efficient computational approaches that can account for both the discrete and stochastic nature of chemical interactions as well as spatial inhomogeneities and diffusion. This is particularly true in biology and nanoscale materials science, where the common assumptions of deterministic dynamics and well-mixed reaction volumes often break down. In this article, we present a spatial version of the partitioned-leaping algorithm, a multiscale accelerated-stochastic simulation approach built upon the τ-leaping framework of Gillespie. We pay special attention to the details of the implementation, particularly as it pertains to the time step calculation procedure. We point out conceptual errors that have been made in this regard in prior implementations of spatial τ-leaping and illustrate the manifestation of these errors through practical examples. Finally, we discuss the fundamental difficulties associated with incorporating efficient exact-stochastic techniques, such as the next-subvolume method, into a spatial leaping framework and suggest possible solutions.
A Weighted Spatial-Spectral Kernel RX Algorithm and Efficient Implementation on GPUs
Zhao, Chunhui; Li, Jiawei; Meng, Meiling; Yao, Xifeng
2017-01-01
The kernel RX (KRX) detector proposed by Kwon and Nasrabadi exploits a kernel function to obtain better detection performance. However, it still has two limitations. On the one hand, reasonable integration of spatial-spectral information can further improve its detection accuracy. On the other hand, parallel computing can reduce the processing time of available KRX detectors. Accordingly, this paper presents a novel weighted spatial-spectral kernel RX (WSSKRX) detector and its parallel implementation on graphics processing units (GPUs). The WSSKRX utilizes spatial neighborhood resources to reconstruct the test pixels by introducing a spectral factor and a spatial window, thereby effectively reducing the interference of background noise. Then, the kernel function is redesigned as a mapping trick in a KRX detector to implement the anomaly detection. In addition, a powerful GPU-based architecture is designed to accelerate WSSKRX. To substantiate the performance of the proposed algorithm, experiments are conducted on both synthetic and real data. PMID:28241511
Dave, J. K.; Halldorsdottir, V. G.; Eisenbrey, J. R.; Merton, D. A.; Liu, J. B.; Machado, P.; Zhao, H.; Park, S.; Dianis, S.; Chalek, C. L.; Thomenius, K. E.; Brown, D. B.; Forsberg, F.
2013-01-01
Incident acoustic output (IAO) dependent subharmonic signal amplitudes from ultrasound contrast agents can be categorized into occurrence, growth or saturation stages. Subharmonic aided pressure estimation (SHAPE) is a technique that utilizes growth stage subharmonic signal amplitudes for hydrostatic pressure estimation. In this study, we developed an automated IAO optimization algorithm to identify the IAO level eliciting growth stage subharmonic signals and also studied the effect of pulse length on SHAPE. This approach may help eliminate the problems of acquiring and analyzing the data offline at all IAO levels as was done in previous studies and thus, pave the way for real-time clinical pressure monitoring applications. The IAO optimization algorithm was implemented on a Logiq 9 (GE Healthcare, Milwaukee, WI) scanner interfaced with a computer. The optimization algorithm stepped the ultrasound scanner from 0 to 100 % IAO. A logistic equation fitting function was applied with the criterion of minimum least squared error between the fitted subharmonic amplitudes and the measured subharmonic amplitudes as a function of the IAO levels and the optimum IAO level was chosen corresponding to the inflection point calculated from the fitted data. The efficacy of the optimum IAO level was investigated for in vivo SHAPE to monitor portal vein (PV) pressures in 5 canines and was compared with the performance of IAO levels, below and above the optimum IAO level, for 4, 8 and 16 transmit cycles. The canines received a continuous infusion of Sonazoid microbubbles (1.5 μl/kg/min; GE Healthcare, Oslo, Norway). PV pressures were obtained using a surgically introduced pressure catheter (Millar Instruments, Inc., Houston, TX) and were recorded before and after increasing PV pressures. The experiments showed that optimum IAO levels for SHAPE in the canines ranged from 6 to 40 %. The best correlation between changes in PV pressures and in subharmonic amplitudes (r = -0.76; p = 0
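The core of the described optimization, fitting a logistic curve to subharmonic amplitude versus IAO and taking the inflection point as the optimum level, can be sketched as follows. This is a hedged illustration on synthetic data, not the authors' scanner-side implementation; it fixes the logistic amplitude and grid-searches only the slope and inflection parameters:

```python
import numpy as np

def fit_logistic_inflection(iao, amp):
    """Crude logistic fit amp ~ A / (1 + exp(-k (x - x0))) by grid search
    over (k, x0), with A fixed to max(amp), minimizing squared error.
    For a logistic curve, x0 is the inflection point, which is taken as
    the optimum IAO level."""
    A = amp.max()
    best_err, best_params = np.inf, None
    for k in np.linspace(0.05, 1.0, 40):
        for x0 in np.linspace(iao.min(), iao.max(), 101):
            pred = A / (1 + np.exp(-k * (iao - x0)))
            err = np.sum((pred - amp) ** 2)
            if err < best_err:
                best_err, best_params = err, (k, x0)
    return best_params

# Synthetic subharmonic growth curve with inflection near 35 % IAO.
iao = np.linspace(0, 100, 51)
amp = 30 / (1 + np.exp(-0.2 * (iao - 35)))
k_fit, x0_fit = fit_logistic_inflection(iao, amp)
```

In practice a proper nonlinear least-squares fit (all parameters free) would replace the grid search; the inflection-point criterion, selecting the middle of the growth stage, is the part that carries over.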
Xiao, Kai; Chen, Danny Z; Hu, X Sharon; Zhou, Bo
2012-12-01
The three-dimensional digital differential analyzer (3D-DDA) algorithm is a widely used ray traversal method, which is also at the core of many convolution/superposition (C/S) dose calculation approaches. However, porting existing C/S dose calculation methods onto graphics processing units (GPUs) has brought challenges to retaining the efficiency of this algorithm. In particular, a straightforward implementation of the original 3D-DDA algorithm incurs substantial branch divergence, which conflicts with the GPU programming model and leads to suboptimal performance. In this paper, an efficient GPU implementation of the 3D-DDA algorithm is proposed, which effectively reduces such branch divergence and improves the performance of C/S dose calculation programs running on GPUs. The main idea of the proposed method is to convert a number of conditional statements in the original 3D-DDA algorithm into a set of simple operations (e.g., arithmetic, comparison, and logic) that are better supported by the GPU architecture. To verify and demonstrate the performance improvement, this ray traversal method was integrated into a GPU-based collapsed cone convolution/superposition (CCCS) dose calculation program. The proposed method has been tested using a water phantom and various clinical cases on an NVIDIA GTX570 GPU. The CCCS dose calculation program based on the efficient 3D-DDA ray traversal implementation runs 1.42 to 2.67 times faster than the one based on the original 3D-DDA implementation, without losing any accuracy. The results show that the proposed method can effectively reduce branch divergence in the original 3D-DDA ray traversal algorithm and improve the performance of the CCCS program running on GPU. Considering the wide utilization of the 3D-DDA algorithm, various applications can benefit from this implementation method.
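The branch-reduction idea, replacing the DDA's nested conditionals with comparison/select operations, can be illustrated outside CUDA. The sketch below (Python for readability; the actual performance gain only appears on GPU hardware) advances a ray through a voxel grid, choosing the crossing axis with an argmin instead of an if/else chain; the function name and example ray are invented:

```python
import numpy as np

def dda_traverse(start, direction, n_steps):
    """Branch-reduced 3D-DDA: each step advances along the axis whose
    boundary-crossing parameter t is smallest.  The axis choice uses
    argmin (a comparison/select) rather than nested if/else chains,
    which is the kind of transformation that avoids warp divergence."""
    pos = np.floor(start).astype(int)
    step = np.where(direction >= 0, 1, -1)
    inv = 1.0 / direction            # assumes no zero components
    next_bound = pos + (step > 0)
    t = (next_bound - start) * inv   # ray parameter of next boundary, per axis
    dt = step * inv                  # parameter distance between boundaries
    visited = [tuple(pos)]
    for _ in range(n_steps):
        axis = int(np.argmin(t))     # select the crossing axis
        pos[axis] += step[axis]
        t[axis] += dt[axis]
        visited.append(tuple(pos))
    return visited

cells = dda_traverse(np.array([0.5, 0.5, 0.5]),
                     np.array([1.0, 0.5, 0.25]), 6)
```

On a GPU, the argmin/select compiles to predicated instructions executed uniformly by all threads of a warp, whereas a data-dependent if/else chain forces the warp to serialize its divergent paths.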
NASA Astrophysics Data System (ADS)
Hadia, Sarman K.; Thakker, R. A.; Bhatt, Kirit R.
2016-05-01
The study proposes an application of evolutionary algorithms, specifically an artificial bee colony (ABC), a variant ABC and particle swarm optimisation (PSO), to extract the parameters of a metal-oxide-semiconductor field-effect transistor (MOSFET) model. These algorithms are applied to the MOSFET parameter extraction problem using a Pennsylvania surface potential model. MOSFET parameter extraction procedures involve reducing the error between measured and modelled data. This study shows that the ABC algorithm optimises the parameter values based on the intelligent foraging behaviour of honey bee swarms; some modifications have also been applied to the basic ABC algorithm. Particle swarm optimisation is a population-based stochastic optimisation method inspired by bird flocking. The performances of these algorithms are compared with respect to the quality of the solutions. The simulation results of this study show that the PSO algorithm performs better than the variant ABC and basic ABC algorithms for parameter extraction of the MOSFET model, while the implementation of the ABC algorithm is shown to be simpler than that of the PSO algorithm.
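The general scheme the abstract describes, a particle swarm minimizing the squared error between measured and modelled data, can be sketched minimally. This is not the Pennsylvania surface-potential model: the toy two-parameter model, the hyperparameters and all names are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x, params):
    """Toy stand-in for a device model: y = a*x + b*x**2."""
    a, b = params
    return a * x + b * x ** 2

def pso_extract(x, measured, n_particles=30, n_iter=200,
                w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0)):
    """Minimal particle swarm: each particle is a candidate parameter
    vector; velocities are pulled toward the particle's personal best
    and the swarm's global best of the squared fitting error."""
    lo, hi = bounds
    pos = rng.uniform(lo, hi, (n_particles, 2))
    vel = np.zeros_like(pos)
    cost = lambda p: np.sum((model(x, p) - measured) ** 2)
    pbest = pos.copy()
    pbest_cost = np.array([cost(p) for p in pos])
    g = pbest[np.argmin(pbest_cost)].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, lo, hi)
        for i, p in enumerate(pos):
            c = cost(p)
            if c < pbest_cost[i]:
                pbest_cost[i], pbest[i] = c, p.copy()
        g = pbest[np.argmin(pbest_cost)].copy()
    return g

x = np.linspace(0, 1, 20)
measured = model(x, (2.0, -1.0))          # synthetic "measurements"
a_fit, b_fit = pso_extract(x, measured)
```

Swapping `model` for an actual device equation (and `x` for bias sweeps) turns this into the extraction loop the abstract outlines; the real problem is non-convex, which is precisely why population-based methods like PSO and ABC are attractive there.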
Experimental realization of a four-photon seven-qubit graph state for one-way quantum computation.
Lee, Sang Min; Park, Hee Su; Cho, Jaeyoon; Kang, Yoonshik; Lee, Jae Yong; Kim, Heonoh; Lee, Dong-Hoon; Choi, Sang-Kyung
2012-03-26
We propose and demonstrate the scaling up of photonic graph states through path qubit fusion. Two path qubits from separate two-photon four-qubit states are fused to generate a two-dimensional seven-qubit graph state composed of polarization and path qubits. Genuine seven-qubit entanglement is verified by evaluating the witness operator. Six qubits from the graph state are used to demonstrate the Deutsch-Jozsa algorithm for general two-bit functions with a success probability greater than 90%.
NASA Astrophysics Data System (ADS)
Liu, Zhi; Zhou, Baotong; Zhang, Changnian
2017-03-01
A vehicle-mounted panoramic system is important driver-assistance safety equipment. However, traditional systems render only a fixed top-down perspective view with a limited field, which can pose a safety hazard. In this paper, a texture-mapping algorithm for a 3D vehicle-mounted panoramic system is introduced, and an implementation of the algorithm using the OpenGL ES library on the Android smart platform is presented. Initial experimental results show that the proposed algorithm renders a good 3D panorama and allows the viewpoint to be changed freely.
NASA Astrophysics Data System (ADS)
Vela, Adan Ernesto
2011-12-01
From 2010 to 2030, the number of instrument flight rules aircraft operations handled by Federal Aviation Administration en route traffic centers is predicted to increase from approximately 39 million flights to 64 million flights. The projected growth in air transportation demand is likely to result in traffic levels that exceed the abilities of the unaided air traffic controller in managing, separating, and providing services to aircraft. Consequently, the Federal Aviation Administration, and other air navigation service providers around the world, are making several efforts to improve the capacity and throughput of existing airspaces. Ultimately, the stated goal of the Federal Aviation Administration is to triple the available capacity of the National Airspace System by 2025. In an effort to satisfy air traffic demand through the increase of airspace capacity, air navigation service providers are considering the inclusion of advisory conflict-detection and resolution systems. In a human-in-the-loop framework, advisory conflict-detection and resolution decision-support tools identify potential conflicts and propose resolution commands for the air traffic controller to verify and issue to aircraft. A number of researchers and air navigation service providers hypothesize that the inclusion of combined conflict-detection and resolution tools into air traffic control systems will reduce or transform controller workload and enable the required increases in airspace capacity. In an effort to understand the potential workload implications of introducing advisory conflict-detection and resolution tools, this thesis provides a detailed study of the conflict event process and the implementation of conflict-detection and resolution algorithms. Specifically, the research presented here examines a metric of controller taskload: how many resolution commands an air traffic controller issues under the guidance of a conflict-detection and resolution decision-support tool. The goal
Design and implementation of modern control algorithms for unmanned aerial vehicles
NASA Astrophysics Data System (ADS)
Hafez, Ahmed Taimour
Recently, Unmanned Aerial Vehicles (UAVs) have attracted a great deal of attention in academic, civilian and military communities as prospective solutions to a wide variety of applications. The use of cooperative UAVs has received growing interest in the last decade and this provides an opportunity for new operational paradigms. As applications of UAVs continue to grow in complexity, the trend of using multiple cooperative UAVs to perform these applications rises in order to increase the overall effectiveness and robustness. There is a need for generating suitable control techniques that allow for the real-time implementation of control algorithms for different missions and tactics executed by a group of cooperative UAVs. In this thesis, we investigate possible control patterns and associated algorithms for controlling a group of autonomous UAVs in real-time to perform various tactics. This research proposes new control approaches to solve the dynamic encirclement, tactic switching and formation problems for a group of cooperative UAVs in simulation and real-time. Firstly, a combination of Feedback Linearization (FL) and decentralized Linear Model Predictive Control (LMPC) is used to solve the dynamic encirclement problem. Secondly, a combination of decentralized LMPC and fuzzy logic control is used to solve the problem of tactic switching for a group of cooperative UAVs. Finally, a decentralized Learning Based Model Predictive Control (LBMPC) is used to solve the problem of formation for a group of cooperative UAVs in simulation. We show through simulations and validate through experiments that the proposed control policies succeed to control a group of cooperative UAVs to achieve the desired requirements and control objectives for different tactics. These proposed control policies provide reliable and effective control techniques for multiple cooperative UAV systems.
NASA Astrophysics Data System (ADS)
Ham, Woonchul; Song, Chulgyu
2017-05-01
In this paper, we propose a new three-dimensional stereo image reconstruction algorithm for a photoacoustic medical imaging system, and introduce and discuss its theoretical basis using the physical concept of the Radon transform. The key idea of the proposed algorithm is to evaluate the possibility that an acoustic source exists within a search region by using the geometric distance between each sensor element of the acoustic detector and the corresponding search-region grid point. We derive the mathematical expression for the magnitude of this existence possibility, which can be used to implement the proposed algorithm, and we handle both the one-dimensional and the two-dimensional sensing-array cases. Simulated k-Wave data are used to compare the image quality of the proposed algorithm with that of a conventional algorithm that necessarily relies on the FFT. The k-Wave MATLAB simulation results demonstrate the effectiveness of the proposed reconstruction algorithm.
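The described evaluation of an "existence possibility" per grid point from geometric distances can be read as a delay-and-sum backprojection. The sketch below is a hedged NumPy interpretation, not the authors' derivation: each grid point is scored by summing each sensor's signal at the sample corresponding to that point's time of flight. The 1D-array geometry, pulse model and names are invented:

```python
import numpy as np

def backproject(signals, t, sensors, grid, c=1.0):
    """Score each grid point by summing, over the sensor elements, the
    signal sample at the time of flight given by the geometric distance
    from the point to that element: a delay-and-sum 'existence
    possibility' map that needs no FFT."""
    scores = np.zeros(len(grid))
    dt = t[1] - t[0]
    for s_idx, s_pos in enumerate(sensors):
        d = np.linalg.norm(grid - s_pos, axis=1)        # distances to element
        k = np.clip(np.round((d / c - t[0]) / dt).astype(int), 0, len(t) - 1)
        scores += signals[s_idx, k]
    return scores

# Toy setup: a 1D sensor line at y=0 and one point source at (2.0, 3.0).
src = np.array([2.0, 3.0])
sensors = np.array([[x, 0.0] for x in np.linspace(0, 4, 9)])
t = np.linspace(0, 8, 400)
signals = np.zeros((len(sensors), len(t)))
for i, s in enumerate(sensors):
    tof = np.linalg.norm(src - s)
    signals[i] = np.exp(-((t - tof) ** 2) / 0.01)       # pulse at time of flight

grid = np.array([[x, y] for x in np.linspace(0, 4, 21)
                        for y in np.linspace(1, 5, 21)])
scores = backproject(signals, t, sensors, grid)
best = grid[np.argmax(scores)]
```

Only at the true source position do the time-of-flight delays line up across all elements, so the summed score peaks there; everywhere else the contributions are incoherent.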
Development of tight-binding based GW algorithm and its computational implementation for graphene
Majidi, Muhammad Aziz; Naradipa, Muhammad Avicenna; Phan, Wileam Yonatan; Syahroni, Ahmad; Rusydi, Andrivo
2016-04-19
Graphene has been a hot subject of research in the last decade, as it holds promise for various applications. One interesting issue is whether or not graphene should be classified as a strongly or weakly correlated system, as its optical properties may change with several factors, such as the substrate, voltage bias, adatoms, etc. As the repulsive Coulomb interactions among electrons can generate correlation effects that modify the single-particle spectra (density of states) and the two-particle spectra (optical conductivity) of graphene, we aim to explore such interactions in this study. Understanding these correlation effects is important because they eventually play an important role in inducing the effective attractive interactions between electrons and holes that bind them into excitons. We do this study theoretically by developing a GW method implemented on the basis of the tight-binding (TB) model Hamiltonian. Unlike the well-known GW method developed within the density functional theory (DFT) framework, our TB-based GW implementation may serve as an alternative technique suitable for systems whose Hamiltonian is to be constructed through a tight-binding or similar model. This study includes the theoretical formulation of the Green's function G, the renormalized interaction function W from the random phase approximation (RPA), and the corresponding self-energy derived from Feynman diagrams, as well as the development of the algorithm to compute these quantities. As an evaluation of the method, we calculate the density of states and the optical conductivity of graphene, and analyze the results.
NASA Astrophysics Data System (ADS)
Gutiérrez, Rebeca; Just, Dieter
2014-10-01
The Meteosat Third Generation (MTG) Programme is the next generation of European geostationary meteorological systems. The first MTG satellite, which is scheduled for launch at the end of 2018/early 2019, will host two imaging instruments: the Flexible Combined Imager (FCI) and the Lightning Imager. The FCI will continue the operation of the SEVIRI imager on the current Meteosat Second Generation (MSG) satellites, but with improved spatial, temporal and spectral resolution, not dissimilar to GOES-R (of NASA/NOAA). The transition from a spinner to a 3-axis stabilised platform, a 2-axis tapered scan pattern with overlaps between adjacent scan swaths, and the more stringent geometric, radiometric and timeliness requirements make the rectification process for MTG FCI more challenging than for MSG SEVIRI. The effect of non-uniform sampling on the image rectification process was analysed in an earlier paper. The use of classical interpolation methods, such as truncated Shannon interpolation or cubic convolution interpolation, was shown to cause significant errors when applied to non-uniform samples. Moreover, cubic splines and Lagrange interpolation were selected as candidate resampling algorithms for the FCI rectification that can cope with irregularities in the sampling acquisition process. This paper extends the study to the two-dimensional case, focusing on practical 2D interpolation methods and their feasibility for an operational implementation. Candidate kernels are described and assessed with respect to MTG requirements. The operational constraints of the Level 1 processor have been considered to develop an early image rectification prototype, including the impact of the potential curvature of the FCI scan swaths. The implementation follows a swath-based approach, uses parallel processing to speed up computation time and allows the selection of a number of resampling functions. Due to the tight time constraints of the FCI level 1 processing chain, focus is both on
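Lagrange interpolation over irregular sample positions, one of the candidate resampling kernels named above, can be sketched in one dimension. This NumPy toy (nearest-neighbour support selection, invented signal and sample positions) illustrates the kernel's tolerance of non-uniform sampling; it is not the Level 1 prototype:

```python
import numpy as np

def lagrange_resample(x_samples, y_samples, x_out, order=3):
    """Resample non-uniformly sampled data: for each output position take
    the `order`+1 nearest samples and evaluate the Lagrange interpolating
    polynomial through them.  Unlike convolution with a fixed kernel,
    this construction is exact for polynomials up to `order` regardless
    of how irregular the sample spacing is."""
    y_out = np.empty(len(x_out), dtype=float)
    for j, x in enumerate(x_out):
        near = np.argsort(np.abs(x_samples - x))[:order + 1]
        xs, ys = x_samples[near], y_samples[near]
        val = 0.0
        for i in range(len(xs)):
            L = np.prod([(x - xs[m]) / (xs[i] - xs[m])
                         for m in range(len(xs)) if m != i])
            val += ys[i] * L
        y_out[j] = val
    return y_out

# Irregular samples of a smooth signal, resampled onto a uniform grid.
rng = np.random.default_rng(1)
x_s = np.sort(rng.uniform(0, 2 * np.pi, 40))
y_s = np.sin(x_s)
x_u = np.linspace(0.5, 5.5, 50)
y_u = lagrange_resample(x_s, y_s, x_u)
```

Applying a fixed uniform-grid kernel (e.g. classical cubic convolution) to the same irregular samples would incur exactly the kind of error the earlier study reported, because those kernel weights assume equal spacing.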
Katouda, Michio; Naruse, Akira; Hirano, Yukihiko; Nakajima, Takahito
2016-11-15
A new parallel algorithm and its implementation for the RI-MP2 energy calculation utilizing peta-flop-class many-core supercomputers are presented. Some improvements from the previous algorithm (J. Chem. Theory Comput. 2013, 9, 5373) have been performed: (1) a dual-level hierarchical parallelization scheme that enables the use of more than 10,000 Message Passing Interface (MPI) processes and (2) a new data communication scheme that reduces network communication overhead. A multi-node and multi-GPU implementation of the present algorithm is presented for calculations on a central processing unit (CPU)/graphics processing unit (GPU) hybrid supercomputer. Benchmark results of the new algorithm and its implementation using the K computer (CPU clustering system) and TSUBAME 2.5 (CPU/GPU hybrid system) demonstrate high efficiency. The peak performance of 3.1 PFLOPS is attained using 80,199 nodes of the K computer. The peak performance of the multi-node and multi-GPU implementation is 514 TFLOPS using 1349 nodes and 4047 GPUs of TSUBAME 2.5. © 2016 Wiley Periodicals, Inc.
CAMPBELL, PHILIP L.
1999-08-01
This report presents an implementation of the Berlekamp-Massey linear feedback shift-register (LFSR) synthesis algorithm in the C programming language. Two pseudo-code versions of the algorithm are given, the operation of LFSRs is explained, a C version of the pseudo-code is presented, and the output of the code, when run on two input samples, is shown.
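For readers without access to the C listing, the same LFSR synthesis algorithm can be rendered compactly in Python. This is a standard GF(2) Berlekamp-Massey, not a transcription of the report's code:

```python
def berlekamp_massey(bits):
    """Berlekamp-Massey over GF(2): return (L, C) where C is the
    connection polynomial (c0 = 1) of the shortest LFSR of length L
    that generates the given bit sequence."""
    n = len(bits)
    C = [1] + [0] * n      # current connection polynomial
    B = [1] + [0] * n      # polynomial before the last length change
    L, m = 0, 1            # current LFSR length; shift since last change
    for i in range(n):
        # Discrepancy: does the current LFSR predict bits[i]?
        d = bits[i]
        for j in range(1, L + 1):
            d ^= C[j] & bits[i - j]
        if d == 0:
            m += 1
        elif 2 * L <= i:   # length must grow: C += x^m * B, update L
            T = C[:]
            for j in range(n + 1 - m):
                C[j + m] ^= B[j]
            L, B, m = i + 1 - L, T, 1
        else:              # fix C without changing the length
            for j in range(n + 1 - m):
                C[j + m] ^= B[j]
            m += 1
    return L, C[:L + 1]

# 15 bits of the LFSR s[j] = s[j-1] XOR s[j-4] (poly 1 + x + x^4),
# seed 0,0,0,1 -- a maximal-length sequence of period 15.
seq = [0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
L, C = berlekamp_massey(seq)
```

Running this on the sample sequence recovers the length-4 register and the connection polynomial 1 + x + x^4 that generated it.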
NASA Astrophysics Data System (ADS)
Massanes, Francesc; Cadennes, Marie; Brankov, Jovan G.
2011-07-01
We describe and evaluate a fast implementation of a classical block-matching motion estimation algorithm for multiple graphical processing units (GPUs) using the compute unified device architecture computing engine. The implemented block-matching algorithm uses summed absolute difference error criterion and full grid search (FS) for finding optimal block displacement. In this evaluation, we compared the execution time of a GPU and CPU implementation for images of various sizes, using integer and noninteger search grids. The results show that use of a GPU card can shorten computation time by a factor of 200 times for integer and 1000 times for a noninteger search grid. The additional speedup for a noninteger search grid comes from the fact that GPU has built-in hardware for image interpolation. Further, when using multiple GPU cards, the presented evaluation shows the importance of the data splitting method across multiple cards, but an almost linear speedup with a number of cards is achievable. In addition, we compared the execution time of the proposed FS GPU implementation with two existing, highly optimized nonfull grid search CPU-based motion estimations methods, namely implementation of the Pyramidal Lucas Kanade Optical flow algorithm in OpenCV and simplified unsymmetrical multi-hexagon search in H.264/AVC standard. In these comparisons, FS GPU implementation still showed modest improvement even though the computational complexity of FS GPU implementation is substantially higher than non-FS CPU implementation. We also demonstrated that for an image sequence of 720 × 480 pixels in resolution commonly used in video surveillance, the proposed GPU implementation is sufficiently fast for real-time motion estimation at 30 frames-per-second using two NVIDIA C1060 Tesla GPU cards.
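The full-grid-search SAD criterion at the heart of the implemented block-matching algorithm has a simple CPU reference form. The NumPy sketch below (single block, integer search grid; the paper's GPU/CUDA specifics are omitted and the toy frames are synthetic) returns the displacement minimizing the summed absolute difference:

```python
import numpy as np

def full_search_sad(ref, cur, block_xy, block=8, radius=4):
    """Full-grid-search block matching: slide the current-frame block over
    every integer displacement within +/-radius in the reference frame
    and return the displacement minimizing the summed absolute
    difference (SAD)."""
    by, bx = block_xy
    patch = cur[by:by + block, bx:bx + block].astype(np.int64)
    best, best_dxy = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue   # candidate block falls outside the frame
            sad = np.abs(ref[y:y + block, x:x + block].astype(np.int64) - patch).sum()
            if best is None or sad < best:
                best, best_dxy = sad, (dy, dx)
    return best_dxy, best

# Toy frame pair: the reference is the current frame shifted by (2, -1).
rng = np.random.default_rng(2)
cur = rng.integers(0, 256, (32, 32))
ref = np.roll(cur, shift=(2, -1), axis=(0, 1))
(dy, dx), sad = full_search_sad(ref, cur, (12, 12))
```

Every candidate displacement is evaluated independently of the others, which is exactly why FS maps so well onto a GPU: each SAD (and each pixel difference within it) can be assigned to its own thread.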
Position Accuracy Improvement by Implementing the DGNSS-CP Algorithm in Smartphones
Yoon, Donghwan; Kee, Changdon; Seo, Jiwon; Park, Byungwoon
2016-01-01
The position accuracy of Global Navigation Satellite System (GNSS) modules is one of the most significant factors in determining the feasibility of new location-based services for smartphones. Considering the structure of current smartphones, it is impossible to apply the ordinary range-domain Differential GNSS (DGNSS) method. Therefore, this paper describes and applies a DGNSS-correction projection method to a commercial smartphone. First, the local line-of-sight unit vector is calculated using the elevation and azimuth angle provided in the position-related output of Android’s LocationManager, and this is transformed to Earth-centered, Earth-fixed coordinates for use. To achieve position-domain correction for satellite systems other than GPS, such as GLONASS and BeiDou, the relevant line-of-sight unit vectors are used to construct an observation matrix suitable for multiple constellations. The results of static and dynamic tests show that the standalone GNSS accuracy is improved by about 30%–60%, thereby reducing the existing error of 3–4 m to just 1 m. The proposed algorithm enables the position error to be directly corrected via software, without the need to alter the hardware and infrastructure of the smartphone. This method of implementation and the subsequent improvement in performance are expected to be highly effective to portability and cost saving. PMID:27322284
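The position-domain projection step can be sketched under a simplifying assumption: with each satellite's line-of-sight unit vector as a row of the geometry matrix (plus a clock column), range-domain corrections project to a position correction via least squares. This is an illustrative reading of the DGNSS-CP idea, not the paper's exact formulation; the geometry and correction values are synthetic:

```python
import numpy as np

def dgnss_cp_correction(los_unit, prc):
    """Project range-domain corrections into the position domain: solve
    the least-squares system G @ [dx, dy, dz, dt] = prc, where each row
    of G is a line-of-sight unit vector augmented with 1 for the
    receiver clock.  The position part of the solution is applied to the
    standalone fix in software, with no access to raw pseudoranges."""
    G = np.hstack([los_unit, np.ones((los_unit.shape[0], 1))])
    sol, *_ = np.linalg.lstsq(G, prc, rcond=None)
    return sol[:3], sol[3]

# Toy geometry: 6 satellites; corrections consistent with a pure 3 m
# shift along the first axis plus a 1 m common clock term.
rng = np.random.default_rng(3)
los = rng.normal(size=(6, 3))
los /= np.linalg.norm(los, axis=1, keepdims=True)
true_shift = np.array([3.0, 0.0, 0.0])
prc = los @ true_shift + 1.0
d_pos, d_clk = dgnss_cp_correction(los, prc)
```

The appeal, as the paper notes, is architectural: only the smartphone's position output and the satellites' elevation/azimuth angles are needed, so the correction lives entirely in application software.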
NASA Astrophysics Data System (ADS)
Pietrzyk, Mariusz W.; Rannou, Didier; Brennan, Patrick C.
2012-02-01
This pilot study examines the effect of a novel decision support system on medical image interpretation. The system combines image spatial-frequency properties and eye-tracking data in order to recognize over- and under-calling errors. Before it can be implemented as a detection-aid scheme, training is required, during which an SVM-based algorithm learns to recognize FPs among all reported outcomes and FNs among all unreported regions with prolonged dwell. Eight radiologists inspected 50 PA chest radiographs with the specific task of identifying lung nodules. Twenty-five cases contained CT-proven subtle malignant lesions (5-20 mm), but prevalence was not known by the subjects, who took part in two sequential reading sessions, without and with support-system feedback, respectively. MCMR ROC DBM and JAFROC analyses were conducted and demonstrated significantly higher scores following feedback, with p-values of 0.04 and 0.03 respectively, highlighting significant improvements in radiologist performance once feedback was used. This positive effect on radiologists' performance might have important implications for future CAD-system development.
Liu, Ju-Chi; Chou, Hung-Chyun; Chen, Chien-Hsiu; Lin, Yi-Tseng; Kuo, Chung-Hsien
2016-01-01
A highly efficient time-shift correlation algorithm was proposed to deal with the peak-time uncertainty of the P300 evoked potential for a P300-based brain-computer interface (BCI). The time-shift correlation series data were used as the input nodes of an artificial neural network (ANN), and the classification of four LED visual stimuli was selected as the output node. Two operating modes, a fast-recognition mode (FM) and an accuracy-recognition mode (AM), were realized. The proposed BCI system was implemented on an embedded system for commanding an adult-size humanoid robot, and its performance was evaluated by investigating the ground-truth trajectories of the robot. When the humanoid robot walked in a spacious area, the FM was used to control the robot with a higher information transfer rate (ITR). When the robot walked in a crowded area, the AM was used for high recognition accuracy to reduce the risk of collision. The experimental results showed that, in 100 trials, the accuracy rate of the FM was 87.8% and the average ITR was 52.73 bits/min. In addition, the accuracy rate improved to 92% for the AM, while the average ITR decreased to 31.27 bits/min due to the stricter recognition constraints.
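The time-shift correlation series that feeds the ANN input nodes might look like the sketch below. The function name, the use of a normalized correlation coefficient per shift, and the windowing convention are illustrative assumptions, not the authors' embedded implementation.

```python
import numpy as np

def time_shift_correlation(signal, template, max_shift):
    """Correlate a P300 template with time-shifted windows of an EEG
    epoch; the resulting series (one value per shift) absorbs the peak-time
    uncertainty and can be fed to a classifier. Illustrative sketch."""
    series = []
    n = len(template)
    for s in range(-max_shift, max_shift + 1):
        seg = signal[max_shift + s : max_shift + s + n]
        # normalized correlation coefficient for this candidate shift
        series.append(float(np.corrcoef(seg, template)[0, 1]))
    return series
```

The classifier then sees the whole series, so a P300 peak that arrives a few samples early or late still produces a strong feature at some shift.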
Almansouri, Hani; Johnson, Christi R; Clayton, Dwight A; Polsky, Yarom; Bouman, Charlie; Santos-Villalobos, Hector J
2017-01-01
All commercial nuclear power plants (NPPs) in the United States contain concrete structures. These structures provide important foundation, support, shielding, and containment functions. Identification and management of aging and the degradation of concrete structures is fundamental to the proposed long-term operation of NPPs. Concrete structures in NPPs are often inaccessible and contain large volumes of massively thick concrete. While acoustic imaging using the synthetic aperture focusing technique (SAFT) works adequately well for thin specimens of concrete such as concrete transportation structures, enhancements are needed for heavily reinforced, thick concrete. We argue that image reconstruction quality for acoustic imaging in thick concrete could be improved with Model-Based Iterative Reconstruction (MBIR) techniques. MBIR works by designing a probabilistic model for the measurements (forward model) and a probabilistic model for the object (prior model). Both models are used to formulate an objective function (cost function). The final step in MBIR is to optimize the cost function. Previously, we have demonstrated a first implementation of MBIR for an ultrasonic transducer array system. The original forward model has been upgraded to account for direct arrival signal. Updates to the forward model will be documented and the new algorithm will be assessed with synthetic and empirical samples.
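The MBIR idea, a forward-model data term and a prior term combined into one cost function that is then optimized, can be illustrated with a toy quadratic example. The Gaussian forward model, the simple quadratic prior, and plain gradient descent below are simplifying assumptions for illustration, not the paper's ultrasonic forward model.

```python
import numpy as np

def mbir_reconstruct(A, y, lam=0.1, step=0.01, iters=500):
    """Toy model-based iterative reconstruction: minimize the MAP cost
    0.5*||y - Ax||^2 + 0.5*lam*||x||^2 by gradient descent. The data term
    comes from the forward model, the penalty from the prior model."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - y) + lam * x  # gradient of the MAP cost
        x -= step * grad
    return x
```

Real MBIR replaces the identity-like forward operator with a physics-based acoustic model and the quadratic penalty with an image prior, but the structure of the optimization is the same.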
Albert, Carlo; Vogel, Sören
2016-01-01
The General Unified Threshold model of Survival (GUTS) provides a consistent mathematical framework for survival analysis. However, the calibration of GUTS models is computationally challenging. We present a novel algorithm and its fast implementation in our R package, GUTS, that help to overcome these challenges. We show a step-by-step application example consisting of model calibration and uncertainty estimation as well as making probabilistic predictions and validating the model with new data. Using self-defined wrapper functions, we show how to produce informative text printouts and plots without effort, for inexperienced as well as advanced users. The complete ready-to-run script is available as supplemental material. We expect that our software will facilitate novel re-analyses of existing survival data as well as new research questions in a wide range of sciences. In particular, the ability to quickly quantify stressor thresholds in conjunction with dynamic compensating processes, and their uncertainty, is an improvement that complements current survival analysis methods. PMID:27340823
Overgaard, Rune V; Jonsson, Niclas; Tornøe, Christoffer W; Madsen, Henrik
2005-02-01
Pharmacokinetic/pharmacodynamic modelling is most often performed using non-linear mixed-effects models based on ordinary differential equations with uncorrelated intra-individual residuals. More sophisticated residual error models, such as stochastic differential equations (SDEs) with measurement noise, can in many cases provide a better description of the variations, which could be useful in various aspects of modelling. This general approach enables a decomposition of the intra-individual residual variation epsilon into system noise w and measurement noise e. The present work describes the implementation of SDEs in a non-linear mixed-effects model, where parameter estimation was performed by a novel approximation of the likelihood function. This approximation is constructed by combining the First-Order Conditional Estimation (FOCE) method used in non-linear mixed-effects modelling with the Extended Kalman Filter used in models with SDEs. Fundamental issues concerning the proposed model and estimation algorithm are addressed by simulation studies, concluding that system noise can successfully be separated from measurement noise and inter-individual variability.
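The state-estimation core of this approach, a Kalman filter that separates system noise from measurement noise, reduces in the scalar linear case to the familiar predict/update recursion. The sketch below (`kalman_1d`, with assumed noise variances) is the linear special case of the Extended Kalman Filter the authors embed in the FOCE likelihood, not their estimation code.

```python
import numpy as np

def kalman_1d(y, a=1.0, q=0.1, r=1.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter for the state x_{k+1} = a*x_k + w (system
    noise, variance q) observed as y_k = x_k + e (measurement noise,
    variance r). Returns the filtered state estimates."""
    x, p = x0, p0
    xs = []
    for yk in y:
        # prediction step: propagate mean and variance through the dynamics
        x, p = a * x, a * a * p + q
        # update step: blend prediction and measurement via the Kalman gain
        k = p / (p + r)
        x = x + k * (yk - x)
        p = (1 - k) * p
        xs.append(x)
    return np.array(xs)
```

The innovations (yk − x) and their variances (p + r) are exactly the quantities that enter the Gaussian likelihood approximation used for parameter estimation.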
NASA Astrophysics Data System (ADS)
Chun, Joohwan; Kailath, Thomas; Son, Jung-Young
2000-03-01
Systems such as infrared search and track (IRST) systems, forward-looking infrared (FLIR) systems, sonars, and 2-D radars consist of two functional blocks: a detection unit and a tracker. The detection unit, which has matched filters followed by a threshold device, generates a set of two-dimensional points, or detects, at every sampling time. For a radar or sonar, each generated detect has polar coordinates (range and azimuth), while an IRST or FLIR produces detects in Cartesian coordinates. In practice, the detection unit always has a non-zero false-alarm rate, so the set of detects usually contains clutter points as well as the target. In this paper, we present a new target tracking algorithm for clutter environments applicable to a wide range of tracking systems. More specifically, the two-dimensional tracking problem in a clutter environment is solved in the discrete-time Bayes-optimal (nonlinear and non-Gaussian) estimation framework. The proposed method recursively finds the entire probability density functions of the target position and velocity. With our approach, the nonlinear estimation problem is converted into simpler linear convolution operations, which can be implemented efficiently with optical devices such as lenses, CCDs (charge-coupled devices), SLMs (spatial light modulators), and films.
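One recursion of such a grid-based Bayes filter, a convolution for the motion prediction followed by a pointwise measurement update, can be sketched as below. The function name and the 1-D grid are illustrative assumptions; the convolution is the linear operation the paper maps onto optical hardware.

```python
import numpy as np

def bayes_grid_step(prior, motion_kernel, likelihood):
    """One step of grid-based Bayesian tracking: the prediction is the
    convolution of the prior density with the motion-noise kernel, and the
    measurement update is a pointwise product with the likelihood,
    followed by renormalization. Sketch on a 1-D grid."""
    predicted = np.convolve(prior, motion_kernel, mode="same")
    posterior = predicted * likelihood
    return posterior / posterior.sum()
```

Because the prediction step is a convolution, it can in principle be realized by a lens/SLM correlator rather than digital arithmetic, which is the point of the optical implementation.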
NASA Astrophysics Data System (ADS)
Perez, L.; Autrique, L.; Gillet, M.
2008-11-01
The aim of this paper is to investigate the thermal diffusivity identification of a multilayered material dedicated to fire protection. In a military framework, fire protection needs to meet specific requirements, and operational protective systems must be constantly improved in order to keep up with the development of new weapons. In the specific domain of passive fire protection, intumescent coatings can be an effective solution on the battlefield. Intumescent materials have the ability to swell up when they are heated, building a thick multilayered coating which provides efficient thermal insulation to the underlying material. Because the thermal aggressions (fire or explosion) leading to the intumescent phenomenon involve high temperatures, the mathematical model describing the evolution of the system state cannot be linearized. A previous sensitivity analysis has shown that the thermal diffusivity of the multilayered intumescent coating is a key parameter for validating the predictive numerical tool and therefore for thermal protection optimisation. A conjugate gradient method is implemented in order to minimise the quadratic cost function related to the error between the predicted and measured temperatures. This regularisation algorithm is well adapted to a large number of unknown parameters.
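The conjugate gradient minimisation of a quadratic cost can be sketched generically; the block below is textbook CG for J(x) = ½xᵀAx − bᵀx with A symmetric positive definite, not the authors' identification code, whose cost is evaluated through the nonlinear heat-transfer model.

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=100):
    """Conjugate gradient minimization of the quadratic cost
    J(x) = 0.5*x^T A x - b^T x, equivalently solving A x = b for
    symmetric positive-definite A. Standard textbook CG."""
    x = np.zeros_like(b) if x0 is None else x0.astype(float)
    r = b - A @ x          # residual = negative gradient of J
    p = r.copy()           # first search direction is steepest descent
    for _ in range(max_iter):
        rs = r @ r
        if rs < tol:
            break
        Ap = A @ p
        alpha = rs / (p @ Ap)       # exact line search along p
        x += alpha * p
        r -= alpha * Ap
        beta = (r @ r) / rs         # Fletcher-Reeves style update
        p = r + beta * p            # new A-conjugate search direction
    return x
```

In the identification problem, each "matrix-vector product" corresponds to a run of the forward thermal model, which is why a method with few, well-chosen iterations is attractive.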
Dealing with change in process choreographies: Design and implementation of propagation algorithms.
Fdhila, Walid; Indiono, Conrad; Rinderle-Ma, Stefanie; Reichert, Manfred
2015-04-01
Enabling process changes constitutes a major challenge for any process-aware information system. This not only holds for processes running within a single enterprise, but also for collaborative scenarios involving distributed and autonomous partners. In particular, if one partner adapts its private process, the change might affect the processes of the other partners as well. Accordingly, it might have to be propagated to concerned partners in a transitive way. A fundamental challenge in this context is to find ways of propagating the changes in a decentralized manner. Existing approaches are limited with respect to the change operations considered as well as their dependency on a particular process specification language. This paper presents a generic change propagation approach that is based on the Refined Process Structure Tree, i.e., the approach is independent of a specific process specification language. Further, it considers a comprehensive set of change patterns. For all these change patterns, it is shown that the provided change propagation algorithms preserve consistency and compatibility of the process choreography. Finally, a proof-of-concept prototype of a change propagation framework for process choreographies is presented. Overall, comprehensive change support in process choreographies will foster the implementation and operational support of agile collaborative process scenarios.
NASA Astrophysics Data System (ADS)
Cahyaningrum, Rosalia D.; Bustamam, Alhadi; Siswantining, Titin
2017-03-01
Microarray technology has become one of the imperative tools in the life sciences for observing gene-expression levels, including the expression of genes in people with carcinoma. Carcinoma is a cancer that forms in epithelial tissue. Such data can be analyzed, for example, to identify hereditary gene expression and to build classifications that can be used to improve the diagnosis of carcinoma. Microarray data usually come in such large dimensions that most methods require long computing times for clustering. Therefore, this study uses the spectral clustering method, which reduces the dimension of the data. Spectral clustering is based on the spectral decomposition of a matrix that represents the data in the form of a graph. After the data dimension is reduced, the data are partitioned. One well-known partitioning method is Partitioning Around Medoids (PAM), which minimizes the objective function by iteratively swapping non-medoid points with medoids until convergence. The objective of this research is to implement the spectral clustering method and the PAM partitioning algorithm to obtain groups of 7457 carcinoma genes based on similarity values. The result of this study is two groups of carcinoma genes.
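The two stages described above, spectral dimension reduction followed by PAM partitioning, can be sketched on toy data as follows. The function names, the Gaussian similarity used in the test, and the greedy swap loop are illustrative assumptions, not the 7457-gene computation.

```python
import numpy as np

def spectral_embed(S, k):
    """Spectral dimension reduction: embed each object as a row of the
    k eigenvectors of the symmetric normalized graph Laplacian built
    from the similarity matrix S."""
    d = S.sum(axis=1)
    L = np.diag(d) - S
    Lsym = np.diag(d ** -0.5) @ L @ np.diag(d ** -0.5)
    vals, vecs = np.linalg.eigh(Lsym)   # eigenvalues in ascending order
    return vecs[:, :k]                  # smallest-k eigenvectors as coordinates

def pam(X, k, iters=20):
    """Partitioning Around Medoids: greedily swap medoids with non-medoid
    points while the total distance to the nearest medoid decreases."""
    n = len(X)
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    medoids = list(range(k))

    def cost(meds):
        return D[:, meds].min(axis=1).sum()

    best = cost(medoids)
    for _ in range(iters):
        improved = False
        for i in range(k):
            for j in range(n):
                if j in medoids:
                    continue
                trial = medoids.copy()
                trial[i] = j
                c = cost(trial)
                if c < best:
                    medoids, best, improved = trial, c, True
        if not improved:
            break
    return D[:, medoids].argmin(axis=1)  # cluster label per object
```

On real expression data, `S` would be the gene-by-gene similarity matrix, and PAM would run on the spectrally embedded rows rather than on the raw high-dimensional profiles.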
NASA Astrophysics Data System (ADS)
Nagata, Yuichi
We propose a genetic algorithm (GA) for the traveling salesman problem (TSP). The GA uses edge assembly crossover (EAX), which is known to be effective for solving the TSP. We first propose a fast implementation of a localized EAX, in which localized edge exchanges are used in the EAX procedure. We also propose a selection model with an effective combination of the localized EAX that can maintain population diversity at negligible computational cost; an edge-entropy measure is used to evaluate population diversity. We demonstrate that the proposed GA is comparable to state-of-the-art heuristics for the TSP. In particular, the GA is superior to them on large instances of more than 10,000 cities. For example, the GA found an optimal solution of brd14051 (a 14,051-city instance) at a reasonable computational cost. The results are quite impressive because the GA does not use Lin-Kernighan local search (LKLS), even though almost all existing state-of-the-art heuristics for the TSP are based on LKLS and its variants.
McLoughlin, Kevin
2016-01-11
This report describes the design and implementation of an algorithm for estimating relative microbial abundances, together with confidence limits, using data from metagenomic DNA sequencing. For the background behind this project and a detailed discussion of our modeling approach for metagenomic data, we refer the reader to our earlier technical report, dated March 4, 2014. Briefly, we described a fully Bayesian generative model for paired-end sequence read data, incorporating the effects of the relative abundances, the distribution of sequence fragment lengths, fragment position bias, sequencing errors and variations between the sampled genomes and the nearest reference genomes. A distinctive feature of our modeling approach is the use of a Chinese restaurant process (CRP) to describe the selection of genomes to be sampled, and thus the relative abundances. The CRP component is desirable for fitting abundances to reads that may map ambiguously to multiple targets, because it naturally leads to sparse solutions that select the best representative from each set of nearly equivalent genomes.
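The Chinese restaurant process prior for selecting which genome a read is sampled from can be sketched as a simple sequential draw. `chinese_restaurant_process` is a hypothetical helper that samples from the prior, not the report's inference code.

```python
import random

def chinese_restaurant_process(n, alpha, seed=0):
    """Draw a partition of n reads over genome 'tables' from a CRP with
    concentration alpha: each new read joins an existing genome with
    probability proportional to its current count (rich-get-richer),
    or opens a new genome with probability proportional to alpha."""
    rng = random.Random(seed)
    tables = []        # count of reads assigned to each genome so far
    assignments = []   # genome index chosen for each read
    for i in range(n):
        r = rng.uniform(0, i + alpha)   # total mass = i customers + alpha
        if r < alpha or not tables:
            tables.append(1)            # open a new table (new genome)
            assignments.append(len(tables) - 1)
        else:
            r -= alpha
            t = 0
            while r >= tables[t]:       # walk tables weighted by their counts
                r -= tables[t]
                t += 1
            tables[t] += 1
            assignments.append(t)
    return assignments, tables
```

The rich-get-richer dynamics are what make the posterior sparse: reads that map ambiguously tend to pile onto the genome that already explains the most data.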
Gong, Zongyi; Klanian, Kelly; Patel, Tushita; Sullivan, Olivia; Williams, Mark B.
2012-01-01
Purpose: We are developing a dual modality tomosynthesis breast scanner in which x-ray transmission tomosynthesis and gamma emission tomosynthesis are performed sequentially with the breast in a common configuration. In both modalities projection data are obtained over an angular range of less than 180° from one side of the mildly compressed breast resulting in incomplete and asymmetrical sampling. The objective of this work is to implement and evaluate a maximum likelihood expectation maximization (MLEM) reconstruction algorithm for gamma emission breast tomosynthesis (GEBT). Methods: A combination of Monte Carlo simulations and phantom experiments was used to test the MLEM algorithm for GEBT. The algorithm utilizes prior information obtained from the x-ray breast tomosynthesis scan to partially compensate for the incomplete angular sampling and to perform attenuation correction (AC) and resolution recovery (RR). System spatial resolution, image artifacts, lesion contrast, and signal to noise ratio (SNR) were measured as image quality figures of merit. To test the robustness of the reconstruction algorithm and to assess the relative impacts of correction techniques with changing angular range, simulations and experiments were both performed using acquisition angular ranges of 45°, 90° and 135°. For comparison, a single projection containing the same total number of counts as the full GEBT scan was also obtained to simulate planar breast scintigraphy. Results: The in-plane spatial resolution of the reconstructed GEBT images is independent of source position within the reconstructed volume and independent of acquisition angular range. For 45° acquisitions, spatial resolution in the depth dimension (the direction of breast compression) is degraded with increasing source depth (increasing distance from the collimator surface). Increasing the acquisition angular range from 45° to 135° both greatly reduces this depth dependence and improves the average depth
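The MLEM recursion underlying such reconstructions can be sketched generically; the block below is the standard multiplicative update, without the paper's attenuation correction and resolution recovery, and `A` stands for an assumed system matrix.

```python
import numpy as np

def mlem(A, y, n_iter=50):
    """Maximum likelihood expectation maximization for emission tomography:
    x <- x * [A^T (y / A x)] / [A^T 1]. Multiplicative update that keeps
    the emission image x nonnegative; generic textbook form."""
    x = np.ones(A.shape[1])
    sens = A.T @ np.ones(A.shape[0])      # sensitivity image A^T 1
    for _ in range(n_iter):
        proj = A @ x                      # forward projection
        ratio = np.where(proj > 0, y / proj, 0.0)
        x = x * (A.T @ ratio) / sens      # back-project ratios, normalize
    return x
```

The prior information from the x-ray tomosynthesis scan enters a real GEBT reconstruction through the system matrix (attenuation factors and detector response), not through this bare update.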
NASA Astrophysics Data System (ADS)
Cobos Arribas, Pedro; Monasterio Huelin Macia, Felix
2003-04-01
An FPGA-based hardware implementation of the Santos-Victor optical flow algorithm, useful in robot guidance applications, is described in this paper. The system contains an ALTERA FPGA (20K100), an interface with a digital camera, three VRAM memories to hold the input data, and some output memories (a VRAM and an EDO) to hold the results. The system had previously been used to develop and test other vision algorithms, such as image compression and optical flow calculation with differential and correlation methods. The designed system lets the digital camera, or the FPGA output (the results of the algorithms), be connected to a PC through its FireWire or USB port. The problems encountered on this occasion have motivated the adoption of a different hardware structure for certain vision algorithms with special requirements that need very code-intensive processing.
NASA Astrophysics Data System (ADS)
Akil, Mohamed
2017-05-01
Real-time processing is becoming more and more important in many image processing applications. Image segmentation is one of the most fundamental tasks in image analysis, and consequently many different approaches to image segmentation have been proposed. The watershed transform is a well-known image segmentation tool, but it is a very data-intensive task. To accelerate watershed algorithms and obtain real-time processing, parallel architectures and programming models for multicore computing have been developed. This paper surveys approaches for the parallel implementation of sequential watershed algorithms on multicore general-purpose CPUs: homogeneous multicore processors with shared memory. To achieve an efficient parallel implementation, it is necessary to explore different strategies (parallelization/distribution/distributed scheduling) combined with different acceleration and optimization techniques to enhance parallelism. In this paper, we compare various parallelizations of sequential watershed algorithms on shared-memory multicore architectures. We analyze the performance measurements of each parallel implementation and the impact of the different sources of overhead on its performance. In this comparison study, we also discuss the advantages and disadvantages of the parallel programming models, comparing OpenMP (an application programming interface for multiprocessing) with Pthreads (POSIX Threads) to illustrate the impact of each parallel programming model on the performance of the parallel implementations.
Schneider, Nadine; Sayle, Roger A; Landrum, Gregory A
2015-10-26
Finding a canonical ordering of the atoms in a molecule is a prerequisite for generating a unique representation of the molecule. The canonicalization of a molecule is usually accomplished by applying some sort of graph relaxation algorithm, the most common of which is the Morgan algorithm. There are known issues with that algorithm that lead to noncanonical atom orderings as well as problems when it is applied to large molecules like proteins. Furthermore, each cheminformatics toolkit or software provides its own version of a canonical ordering, most based on unpublished algorithms, which also complicates the generation of a universal unique identifier for molecules. We present an alternative canonicalization approach that uses a standard stable-sorting algorithm instead of a Morgan-like index. Two new invariants that allow canonical ordering of molecules with dependent chirality as well as those with highly symmetrical cyclic graphs have been developed. The new approach proved to be robust and fast when tested on the 1.45 million compounds of the ChEMBL 20 data set in different scenarios like random renumbering of input atoms or SMILES round tripping. Our new algorithm is able to generate a canonical order of the atoms of protein molecules within a few milliseconds. The novel algorithm is implemented in the open-source cheminformatics toolkit RDKit. With this paper, we provide a reference Python implementation of the algorithm that could easily be integrated in any cheminformatics toolkit. This provides a first step toward a common standard for canonical atom ordering to generate a universal unique identifier for molecules other than InChI.
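The core idea, iterative rank refinement driven by a standard stable sort over (current rank, sorted neighbour ranks) keys, can be sketched as follows. This toy version omits the paper's chirality and symmetry invariants; `canonical_ranks` and `rank_list` are hypothetical helper names, not RDKit API.

```python
def canonical_ranks(adj, invariants):
    """Iteratively refine atom ranks: start from initial atom invariants,
    then repeatedly re-rank each atom by its own rank together with the
    sorted ranks of its neighbours, until the ranking is stable. The
    refinement only ever splits rank classes, so it terminates."""
    n = len(adj)
    ranks = rank_list(invariants)
    while True:
        keys = [(ranks[i], tuple(sorted(ranks[j] for j in adj[i])))
                for i in range(n)]
        new = rank_list(keys)
        if new == ranks:
            return ranks
        ranks = new

def rank_list(keys):
    """Dense rank of each element (0 = smallest key); ties share a rank."""
    order = sorted(set(keys))
    lookup = {k: r for r, k in enumerate(order)}
    return [lookup[k] for k in keys]
```

Atoms left sharing a rank at the fixed point are graph-symmetric under these invariants, which is exactly where the paper's extra invariants for dependent chirality and highly symmetric cycles come in.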
Sinha Gregory, Naina; Seley, Jane Jeffrie; Gerber, Linda M; Tang, Chin; Brillon, David
2016-12-01
More than one-third of hospitalized patients have hyperglycemia. Despite evidence that improving glycemic control leads to better outcomes, achieving recognized targets remains a challenge. The objective of this study was to evaluate the implementation of a computerized insulin order set and titration algorithm on rates of hypoglycemia and overall inpatient glycemic control. A prospective observational study evaluating the impact of a glycemic order set and titration algorithm in an academic medical center in non-critical care medical and surgical inpatients. The initial intervention was hospital-wide implementation of a comprehensive insulin order set. The secondary intervention was initiation of an insulin titration algorithm in two pilot medicine inpatient units. Point of care testing blood glucose reports were analyzed. These reports included rates of hypoglycemia (BG < 70 mg/dL) and hyperglycemia (BG >200 mg/dL in phase 1, BG > 180 mg/dL in phase 2). In the first phase of the study, implementation of the insulin order set was associated with decreased rates of hypoglycemia (1.92% vs 1.61%; p < 0.001) and increased rates of hyperglycemia (24.02% vs 27.27%; p < 0.001) from 2010 to 2011. In the second phase, addition of a titration algorithm was associated with decreased rates of hypoglycemia (2.57% vs 1.82%; p = 0.039) and increased rates of hyperglycemia (31.76% vs 41.33%; p < 0.001) from 2012 to 2013. A comprehensive computerized insulin order set and titration algorithm significantly decreased rates of hypoglycemia. This significant reduction in hypoglycemia was associated with increased rates of hyperglycemia. Hardwiring the algorithm into the electronic medical record may foster adoption.
Chlorophyll fluorescence: implementation in the full physics RemoTeC algorithm
NASA Astrophysics Data System (ADS)
Hahne, Philipp; Frankenberg, Christian; Hasekamp, Otto; Landgraf, Jochen; Butz, André
2014-05-01
Several operating and future satellite missions are dedicated to enhancing our understanding of the carbon cycle. They infer the atmospheric concentrations of carbon dioxide and methane from shortwave infrared absorption spectra of sunlight backscattered from Earth's atmosphere and surface. Exhibiting high spatial and temporal resolution, the inferred gas concentration databases provide valuable information for inverse modelling of source and sink processes at the Earth's surface. However, the inversion of sources and sinks requires highly accurate total column CO2 (XCO2) and CH4 (XCH4) measurements, which remains a challenge. Recently, Frankenberg et al. (2012) showed that, besides XCO2 and XCH4, chlorophyll fluorescence can be retrieved from sounders such as GOSAT by exploiting Fraunhofer lines in the vicinity of the O2 A-band. This has two implications: (a) chlorophyll fluorescence, itself a proxy for photosynthetic activity, yields new information on carbon cycle processes, and (b) neglecting the fluorescence signal can induce errors in the retrieved greenhouse gas concentrations. Our RemoTeC full physics algorithm iteratively retrieves the target gas concentrations XCO2 and XCH4 along with atmospheric scattering properties and other auxiliary parameters. The radiative transfer model (RTM) LINTRAN provides RemoTeC with the single and multiple scattered intensity field and its analytically calculated derivatives. Here, we report on the implementation of a fluorescence light source at the lower boundary of our RTM. Processing three years of GOSAT data, we evaluate the performance of the refined retrieval method. To this end, we compare different retrieval configurations, using the s- and p-polarization detectors independently and combined, and validate against independent data sources.
Optical pattern recognition architecture implementing the mean-square error correlation algorithm
Molley, P.A.
1991-10-22
This patent describes an optical architecture implementing the mean-square-error correlation algorithm, MSE = Σ(I − R)², for discriminating the presence of a reference image R in an input image scene I by computing the mean-square error between a time-varying reference image signal s₁(t) and a time-varying input image signal s₂(t). The architecture includes a laser diode light source which is temporally modulated by a double-sideband suppressed-carrier source modulation signal of the form I₁(t) = A₁(1 + √2 m₁ s₁(t) cos(2π f₀ t)), and the modulated light output from the laser diode source is diffracted by an acousto-optic deflector. The resultant intensity of the +1 diffracted order from the acousto-optic device is given by I₂(t) = A₂(1 + 2m₂² s₂²(t) − 2√2 m₂ s₂(t) cos(2π f₀ t)). The time integration of the two signals I₁(t) and I₂(t) on the CCD detector plane produces the mean-square error as a function of the shift τ, of the form R(τ) = A₁A₂{T + 2m₂² ∫ s₂²(t − τ) dt − 2m₁m₂ cos(2π f₀ τ) ∫ s₁(t) s₂(t − τ) dt}.
Parallel Implementation of the Wideband DOA Algorithm on the IBM Cell BE Processor
2010-05-01
The Multiple Signal Classification (MUSIC) algorithm is a powerful technique for determining the Direction of Arrival (DOA) of signals. This work describes a parallel implementation of the wideband DOA algorithm on the IBM Cell Broadband Engine processor (Cell BE). The process of adapting the serial MUSIC-based algorithm to the Cell BE is analyzed in terms of parallelism. The computational steps covered include DOA estimation using the MUSIC algorithm [4], computation of the focus matrix, computation of the number of sources, and separation of signal.
Implementation of FFT Algorithm using DSP TMS320F28335 for Shunt Active Power Filter
NASA Astrophysics Data System (ADS)
Patel, Pinkal Jashvantbhai; Patel, Rajesh M.; Patel, Vinod
2016-07-01
This work presents simulation, analysis and experimental verification of Fast Fourier Transform (FFT) algorithm for shunt active power filter based on three-level inverter. Different types of filters can be used for elimination of harmonics in the power system. In this work, FFT algorithm for reference current generation is discussed. FFT control algorithm is verified using PSIM simulation results with DLL block and C-code. Simulation results are compared with experimental results for FFT algorithm using DSP TMS320F28335 for shunt active power filter application.
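The FFT-based reference-current generation can be sketched as isolating the fundamental bin of the measured load current and returning the harmonic residual, which is the current a shunt active power filter must inject. The 50 Hz fundamental, the single-bin extraction, and the function name are simplifying assumptions, not the TMS320F28335 implementation.

```python
import numpy as np

def harmonic_reference(i_load, fs, f0=50.0):
    """Extract the fundamental component of a sampled load current with an
    FFT and return the residual (the harmonic content), i.e. the reference
    compensation current for a shunt active power filter. Assumes the
    record spans an integer number of fundamental cycles."""
    n = len(i_load)
    spec = np.fft.rfft(i_load)
    k0 = int(round(f0 * n / fs))       # FFT bin of the fundamental
    fund_spec = np.zeros_like(spec)
    fund_spec[k0] = spec[k0]           # keep only the fundamental bin
    i_fund = np.fft.irfft(fund_spec, n)
    return i_load - i_fund             # harmonic (compensation) reference
```

On the DSP, the same extraction runs in real time on a circular sample buffer, and the resulting reference drives the three-level inverter's current controller.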
NASA Astrophysics Data System (ADS)
Moliner, L.; Correcher, C.; González, A. J.; Conde, P.; Hernández, L.; Orero, A.; Rodríguez-Álvarez, M. J.; Sánchez, F.; Soriano, A.; Vidal, L. F.; Benlloch, J. M.
2013-02-01
In this work we present an innovative algorithm for the reconstruction of PET images based on the List-Mode (LM) technique, which improves spatial resolution compared to results obtained with current MLEM algorithms. This study is part of a larger project aimed at improving diagnosis in early Alzheimer's disease stages by means of a newly developed hybrid PET-MR insert. At present, Alzheimer's is the most prevalent neurodegenerative disease, and the best way to apply an effective treatment is early diagnosis. The PET device will consist of several monolithic LYSO crystals coupled to SiPM detectors. Monolithic crystals can reduce scanner costs while enabling the implementation of very small virtual pixels in their geometry. This is especially useful for LM reconstruction algorithms, since they do not need a pre-calculated system matrix. We have developed an LM algorithm which has been initially tested with a large-aperture (186 mm) breast PET system. Instead of using the common lines of response, the algorithm incorporates a novel calculation of tubes of response. The new approach improves the volumetric spatial resolution by about a factor of 2 at the border of the field of view compared with the traditionally used MLEM algorithm. Moreover, it has also been shown to decrease image noise, thus increasing image quality.
NASA Astrophysics Data System (ADS)
Alpatov, Boris; Babayan, Pavel; Ershov, Maksim; Strotov, Valery
2016-10-01
This paper describes the implementation of an orientation estimation algorithm in an FPGA-based vision system. An approach to estimating the orientation of objects lacking axial symmetry is proposed. The algorithm is intended to estimate the orientation of a specific known 3D object based on its 3D model, and consists of two stages: learning and estimation. The learning stage explores the studied object: using the 3D model, a set of training images is gathered by capturing the model from viewpoints evenly distributed on a sphere, with the viewpoints placed according to the geosphere principle. The gathered training image set is used to calculate descriptors, which are then used in the estimation stage of the algorithm. The estimation stage focuses on matching an observed image descriptor against the training image descriptors. The experimental research was performed using a set of images of an Airbus A380. The proposed orientation estimation algorithm showed good accuracy in all case studies, and its real-time performance in the FPGA-based vision system was demonstrated.
NASA Astrophysics Data System (ADS)
Zhang, Yongcun; Shang, Shipeng; Liu, Shutian
2017-04-01
Asymptotic homogenization (AH) is a general method for predicting the effective coefficient of thermal expansion (CTE) of periodic composites. It has a rigorous mathematical foundation and can give an accurate solution if the macrostructure is large enough to comprise an infinite number of unit cells. In this paper, a novel implementation algorithm of asymptotic homogenization (NIAH) is developed to calculate the effective CTE of periodic composite materials. Compared with previous implementations of AH, it has two obvious advantages. One is that its implementation is as simple as that of a representative volume element (RVE) analysis: the new algorithm can be executed easily using commercial finite element analysis (FEA) software as a black box. The detailed process of the new implementation of AH is provided. The other is that NIAH can simultaneously use more than one element type to discretize a unit cell, which can save considerable computational cost in predicting the CTE of a complex structure. Several examples are presented to demonstrate the effectiveness of the new implementation. This work is expected to greatly promote the widespread use of AH in predicting the CTE of periodic composite materials.
Strout, Michelle
2015-08-15
Programming parallel machines is fraught with difficulties: the obfuscation of algorithms due to implementation details such as communication and synchronization, the need for transparency between language constructs and performance, the difficulty of performing program analysis to enable automatic parallelization techniques, and the existence of important "dusty deck" codes. The SAIMI project developed abstractions that enable the orthogonal specification of algorithms and implementation details within the context of existing DOE applications. The main idea is to enable the injection of small programming models such as expressions involving transcendental functions, polyhedral iteration spaces with sparse constraints, and task graphs into full programs through the use of pragmas. These smaller, more restricted programming models enable orthogonal specification of many implementation details, such as how to map the computation onto parallel processors, how to schedule the computation, and how to allocate storage for the computation. At the same time, these small programming models enable the expression of the most computationally intense and communication-heavy portions of many scientific simulations. The ability to orthogonally manipulate the implementation for such computations will significantly ease performance-programming efforts and expose transformation possibilities and parameters to automated approaches such as autotuning. At Colorado State University, the SAIMI project was supported through DOE grant DE-SC3956 from April 2010 through August 2015. The SAIMI project has contributed a number of important results to programming abstractions that enable the orthogonal specification of implementation details in scientific codes. This final report summarizes the research that was funded by the SAIMI project.
Design and Implementation of High-Speed Input-Queued Switches Based on a Fair Scheduling Algorithm
NASA Astrophysics Data System (ADS)
Hu, Qingsheng; Zhao, Hua-An
To increase both the capacity and the processing speed of input-queued (IQ) switches, we proposed a fair scalable scheduling architecture (FSSA). By employing an FSSA comprised of several cascaded sub-schedulers, large-scale, high-performance switches or routers can be realized without the capacity limitation of a monolithic device. In this paper, we present a fair scheduling algorithm named FSSA_DI, based on an improved FSSA in which a distributed iteration scheme is employed to improve scheduler performance and reduce processing time. Simulation results show that FSSA_DI achieves better average delay and throughput under heavy loads compared to other existing algorithms. Moreover, a practical 64 × 64 FSSA using the FSSA_DI algorithm is implemented on four Xilinx Virtex-4 FPGAs. Measurement results show that the data rate of our solution can reach 800 Mbps, and a good tradeoff between performance and hardware complexity has been achieved.
Luk, F.T.
1990-01-01
Various papers on advanced signal-processing algorithms, architectures, and implementations are presented. Individual topics addressed include: wavelets and related time-scale transforms; real-time SAR change-detection using neural networks; nonlinear signal processing using radial basis functions; nonlinear classification and adaptive structures; efficient beam-based adaptive processing for planar arrays; translation, rotation, and scaling invariant object and texture classification using polyspectra; direction finding using a modified minimum-eigenvector technique; wavelets, tomography, and line-segment image representations. Also discussed are: solving unstructured grid problems on massively parallel computers; systolic arrays for Kalman filtering with algorithm-based fault tolerance; radar superrange resolution and Bragg cell interferometry; improved jammer localization using multiple focusing; parallel algorithms for automatic target recognition using laser radar imagery; and accurate characterization of error propagation in a highly parallel architecture.
Li, Bin; Tian, Lianfang; Ou, Shanxing
2010-01-01
In order to efficiently and effectively reconstruct 3D medical images and clearly display the detailed information of inner structures and the hidden interfaces between different media, an Improved Volume Rendering Optical Model (IVROM) for medical translucent volume rendering and its implementation using the preintegrated Shear-Warp volume rendering algorithm are proposed in this paper, which can be readily applied on a commodity PC. Based on the classical absorption and emission model, effects of volumetric shadows and direct and indirect scattering are also considered in the proposed IVROM model. Moreover, the implementation of the Improved Translucent Volume Rendering Method (ITVRM), integrating the IVROM model, Shear-Warp, and the preintegrated volume rendering algorithm, is described, in which the aliasing and staircase effects resulting from under-sampling in Shear-Warp are avoided by the preintegrated volume rendering technique. This study demonstrates the superiority of the proposed method.
On implementation of EM-type algorithms in the stochastic models for a matrix computing on GPU
Gorshenin, Andrey K.
2015-03-10
The paper discusses the main ideas of an implementation of EM-type algorithms for computation on graphics processors, and their application to probabilistic models based on Cox processes. An example of GPU-adapted MATLAB source code for finite normal mixtures with the expectation-maximization matrix formulas is given. The computational efficiency of the GPU versus the CPU is illustrated for different sample sizes.
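The "matrix formulas" referred to above are the vectorized E- and M-steps, which is what makes EM map well onto GPU hardware. A CPU-side NumPy sketch of EM for a one-dimensional finite normal mixture (the quantile-based initialization is our assumption, not the paper's):

```python
import numpy as np

def em_normal_mixture(x, k, iters=200):
    """EM for a 1-D finite normal mixture, written entirely with
    array (matrix) operations, the form that ports to GPU/MATLAB."""
    w = np.full(k, 1.0 / k)                              # mixture weights
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))        # spread-out means
    var = np.full(k, x.var())                            # initial variances
    for _ in range(iters):
        # E-step: (n, k) responsibility matrix.
        d = x[:, None] - mu[None, :]
        p = w * np.exp(-0.5 * d**2 / var) / np.sqrt(2 * np.pi * var)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: closed-form updates as matrix reductions.
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu)**2).sum(axis=0) / nk
    return w, mu, var

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3, 1, 4000), rng.normal(3, 1, 4000)])
w, mu, var = em_normal_mixture(x, 2)
```

On a balanced two-component sample the recovered means and weights land close to the generating values.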
Mateos-Pérez, José María; Soto-Montenegro, María Luisa; Peña-Zalbidea, Santiago; Desco, Manuel; Vaquero, Juan José
2016-02-01
We present a novel segmentation algorithm for dynamic PET studies that groups pixels according to the similarity of their time-activity curves. Sixteen mice bearing a human tumor cell line xenograft (CH-157MN) were imaged with three different (68)Ga-DOTA-peptides (DOTANOC, DOTATATE, DOTATOC) using a small animal PET-CT scanner. Regional activities (input function and tumor) were obtained after manual delineation of regions of interest over the image. The algorithm was implemented under the jClustering framework and used to extract the same regional activities as in the manual approach. The volume of distribution in the tumor was computed using the Logan linear method. A Kruskal-Wallis test was used to investigate significant differences between the manually and automatically obtained volumes of distribution. The algorithm successfully segmented all the studies. No significant differences were found for the same tracer across different segmentation methods. Manual delineation revealed significant differences between DOTANOC and the other two tracers (DOTANOC - DOTATATE, p=0.020; DOTANOC - DOTATOC, p=0.033). Similar differences were found using the leader-follower algorithm. An open implementation of a novel segmentation method for dynamic PET studies is presented and validated in rodent studies. It successfully replicated the manual results obtained in small-animal studies, thus making it a reliable substitute for this task and, potentially, for other dynamic segmentation procedures. Copyright © 2016 Elsevier Ltd. All rights reserved.
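The Logan linear method mentioned above reduces to a straight-line fit: after some time t*, plotting ∫C_T/C_T against ∫C_p/C_T yields a line whose slope is the volume of distribution. A sketch with a simulated one-tissue compartment (all parameter values are illustrative, not from the study):

```python
import numpy as np

def logan_vd(t, ct, cp, t_star=10.0):
    """Logan graphical analysis: slope of the late-time linear portion
    equals the total volume of distribution V_T."""
    # Cumulative trapezoid integrals of tissue and plasma curves.
    int_ct = np.concatenate([[0], np.cumsum(0.5*(ct[1:]+ct[:-1])*np.diff(t))])
    int_cp = np.concatenate([[0], np.cumsum(0.5*(cp[1:]+cp[:-1])*np.diff(t))])
    late = t >= t_star
    x = int_cp[late] / ct[late]
    y = int_ct[late] / ct[late]
    slope, _ = np.polyfit(x, y, 1)
    return slope

# One-tissue compartment: dC_T/dt = K1*Cp - k2*C_T, so V_T = K1/k2.
K1, k2 = 0.5, 0.2                      # 1/min (illustrative)
t = np.linspace(0, 60, 1201)           # 60-minute scan
cp = np.exp(-0.1 * t)                  # simple mono-exponential input
dt = t[1] - t[0]
ct = np.zeros_like(t)
for i in range(1, len(t)):             # forward-Euler tissue curve
    ct[i] = ct[i-1] + dt * (K1 * cp[i-1] - k2 * ct[i-1])
vd = logan_vd(t, ct, cp)
```

For a one-tissue model the Logan plot is exactly linear, so the fitted slope recovers K1/k2.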
Optical pattern recognition architecture implementing the mean-square error correlation algorithm
Molley, Perry A.
1991-01-01
An optical architecture implementing the mean-square error correlation algorithm, MSE = Σ(I − R)², for discriminating the presence of a reference image R in an input image scene I by computing the mean-square error between a time-varying reference image signal s₁(t) and a time-varying input image signal s₂(t) includes a laser diode light source which is temporally modulated by a double-sideband suppressed-carrier source modulation signal I₁(t) having the form I₁(t) = A₁[1 + √2 m₁ s₁(t) cos(2π f₀ t)], and the modulated light output from the laser diode source is diffracted by an acousto-optic deflector. The resultant intensity of the +1 diffracted order from the acousto-optic device is given by I₂(t) = A₂[2 m₂² s₂²(t) − 2√2 m₂ s₂(t) cos(2π f₀ t)]. The time integration of the two signals I₁(t) and I₂(t) on the CCD detector plane produces the result R(τ) of the mean-square error, having the form R(τ) = A₁A₂{T + 2 m₂² ∫ s₂²(t − τ) dt − 2 m₁ m₂ cos(2π f₀ τ) ∫ s₁(t) s₂(t − τ) dt}, where: s₁(t) is the signal input to the diode modulation source; s₂(t) is the signal input to the AOD modulation source; A₁ is the light intensity; A₂ is the diffraction efficiency; m₁ and m₂ are constants that determine the signal-to-bias ratio; f₀ is the frequency offset between the oscillator at f_c and the modulation at f_c + f₀; and a₀ and a₁ are constants chosen to bias the diode source and the acousto-optic deflector into their respective linear operating regions, so that the diode source exhibits a linear intensity characteristic and the AOD exhibits a linear amplitude characteristic.
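Stripped of the optics, the quantity accumulated on the CCD plane is an ordinary mean-square-error surface over the lag τ, minimized where the reference matches the scene. A digital sketch of that computation (illustrative only; not the patented optical implementation):

```python
import numpy as np

def mse_correlation(s1, s2):
    """MSE surface R(tau) = sum_t (s1(t) - s2(t - tau))^2 over all
    shifts of the short reference s1 within the longer scene s2."""
    n, m = len(s1), len(s2)
    return np.array([np.sum((s1 - s2[tau:tau + n])**2)
                     for tau in range(m - n + 1)])

rng = np.random.default_rng(0)
ref = rng.standard_normal(64)          # reference image signal s1
scene = rng.standard_normal(256)       # input scene signal s2
scene[100:164] = ref                   # embed the reference at offset 100
R = mse_correlation(ref, scene)
best = int(np.argmin(R))               # minimum MSE marks the match
```

Unlike a plain correlation peak, the MSE minimum is zero at an exact match, which is the discrimination property the architecture exploits.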
NASA Astrophysics Data System (ADS)
Liu, Wei; Chen, Shu-Ming; Zhang, Jian; Wu, Chun-Wang; Wu, Wei; Chen, Ping-Xing
2015-03-01
It is widely believed that Shor’s factoring algorithm provides a driving force to boost quantum computing research. However, a serious obstacle to its binary implementation is the large number of quantum gates. Non-binary quantum computing is an efficient way to reduce the required number of elemental gates. Here, we propose optimization schemes for Shor’s algorithm implementation and take a ternary version for factorizing 21 as an example. The optimized factorization is achieved by a two-qutrit quantum circuit, which consists of only two single-qutrit gates and one ternary controlled-NOT gate. This two-qutrit quantum circuit is then encoded into the nine lower vibrational states of an ion trapped in a weakly anharmonic potential. Optimal control theory (OCT) is employed to derive the manipulation electric field for transferring the encoded states. The ternary Shor’s algorithm can be implemented in one single step. Numerical simulation results show that the accuracy of the state transformations is about 0.9919. Project supported by the National Natural Science Foundation of China (Grant No. 61205108) and the High Performance Computing (HPC) Foundation of National University of Defense Technology, China.
Terra, Ricardo Mingarini; Waisberg, Daniel Reis; de Almeida, José Luiz Jesus; Devido, Marcela Santana; Pêgo-Fernandes, Paulo Manuel; Jatene, Fabio Biscegli
2012-01-01
OBJECTIVE: We aimed to evaluate whether the inclusion of videothoracoscopy in a pleural empyema treatment algorithm would change the clinical outcome of such patients. METHODS: This was a quality-improvement study. We conducted a retrospective review of patients who underwent pleural decortication for pleural empyema at our institution from 2002 to 2008. With the old algorithm (January 2002 to September 2005), open decortication was the procedure of choice, and videothoracoscopy was only performed in certain sporadic mid-stage cases. With the new algorithm (October 2005 to December 2008), videothoracoscopy became the first-line treatment option, whereas open decortication was only performed in patients with a thick pleural peel (>2 cm) observed by chest scan. The patients were divided into an old algorithm (n = 93) and new algorithm (n = 113) group and compared. The main outcome variables assessed included treatment failure (pleural space reintervention or death up to 60 days after medical discharge) and the occurrence of complications. RESULTS: Videothoracoscopy and open decortication were performed in 13 and 80 patients from the old algorithm group and in 81 and 32 patients from the new algorithm group, respectively (p<0.01). The patients in the new algorithm group were older (41±1 vs. 46.3±16.7 years, p = 0.014) and had higher Charlson Comorbidity Index scores [0(0-3) vs. 2(0-4), p = 0.032]. The occurrence of treatment failure was similar in both groups (19.35% vs. 24.77%, p = 0.35), although the complication rate was lower in the new algorithm group (48.3% vs. 33.6%, p = 0.04). CONCLUSIONS: The wider use of videothoracoscopy in pleural empyema treatment was associated with fewer complications and unaltered rates of mortality and reoperation, even though more severely ill patients were subjected to videothoracoscopic surgery. PMID:22760892
Study on algorithm and real-time implementation of infrared image processing based on FPGA
NASA Astrophysics Data System (ADS)
Pang, Yulin; Ding, Ruijun; Liu, Shanshan; Chen, Zhe
2010-10-01
With the fast development of Infrared Focal Plane Array (IRFPA) detectors, high-quality real-time image processing becomes more important in infrared imaging systems. Facing the demand for better visual effect and good performance, we find the FPGA an ideal hardware choice for realizing image processing algorithms, taking full advantage of its high speed, high reliability, and ability to process large amounts of data in parallel. In this paper, a dynamic linear extension algorithm is introduced, which automatically finds the proper extension range. This image enhancement algorithm is designed in Verilog HDL and realized on an FPGA. It runs at higher speed than serial processing devices such as CPUs and DSPs. Experiments show that this hardware unit for dynamic linear extension enhances the visual effect of infrared images effectively.
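The dynamic linear extension idea, automatically finding the useful grey-level range and stretching it linearly, can be sketched in frame-based form. The percentile-based range choice below is our assumption for illustration; the paper's FPGA pipeline operates on streaming pixels:

```python
import numpy as np

def dynamic_linear_stretch(img, low_frac=0.01, high_frac=0.99, out_max=255):
    """Linear grey-level extension with an automatically found range:
    clip to the [1%, 99%] histogram percentiles, then stretch to 8 bits."""
    lo, hi = np.quantile(img, [low_frac, high_frac])
    stretched = (img.astype(np.float64) - lo) / max(hi - lo, 1e-12) * out_max
    return np.clip(stretched, 0, out_max).astype(np.uint8)

# A 14-bit infrared frame occupying only a narrow band of grey levels.
rng = np.random.default_rng(0)
frame = rng.integers(8000, 8200, size=(64, 64))
out = dynamic_linear_stretch(frame)
```

The narrow 200-count input band is expanded to the full 8-bit display range, which is the visual-effect gain the paper targets.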
Multi-Core Parallel Implementation of Data Filtering Algorithm for Multi-Beam Bathymetry Data
NASA Astrophysics Data System (ADS)
Liu, Tianyang; Xu, Weiming; Yin, Xiaodong; Zhao, Xiliang
In order to improve multi-beam bathymetry data processing speed, we propose a parallel filtering algorithm based on multi-thread technology. The algorithm consists of two parts. The first is a parallel data re-ordering step, in which the surveying area is divided into a regular grid and the discrete bathymetry data are arranged into each grid cell in parallel. The second is the parallel filtering step, which involves dividing the grid into blocks and executing the filtering process in each block in parallel. In our experiment, the speedup of the proposed algorithm reaches about 3.67 on an 8-core computer. The results show the method improves computing efficiency significantly compared with the traditional algorithm.
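The two-stage scheme can be sketched with worker threads: partition the soundings into blocks, then filter every block concurrently. A simplified sketch, assuming a 1-D depth array and a 3-sigma outlier test (the paper grids by survey position, and its filter details differ):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_block_filter(depths, n_blocks=8, k=3.0):
    """Flag soundings deviating from their block's statistics,
    one worker per block."""
    blocks = np.array_split(depths, n_blocks)

    def filter_block(b):
        mu, sd = b.mean(), b.std()
        return np.abs(b - mu) <= k * sd      # True = keep the sounding

    with ThreadPoolExecutor(max_workers=n_blocks) as ex:
        keep = list(ex.map(filter_block, blocks))
    return np.concatenate(keep)

rng = np.random.default_rng(0)
d = rng.normal(50.0, 0.5, 10000)             # ~50 m depths
d[::1000] += 20.0                            # inject 10 depth spikes
mask = parallel_block_filter(d)
```

Because each block is statistically independent, the per-block filters run without locks, which is what makes the multi-core speedup possible.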
Vicroy, D.D.
1984-01-01
A simplified flight management descent algorithm was developed and programmed on a small programmable calculator. It was designed to aid the pilot in planning and executing a fuel-conservative descent to arrive at a metering fix at a time designated by the air traffic control system. The algorithm may also be used for planning fuel-conservative descents when time is not a consideration. The descent path was calculated for a constant Mach/airspeed schedule from linear approximations of airplane performance, with considerations given to gross weight, wind, and nonstandard temperature effects. An explanation and examples of how the algorithm is used, as well as a detailed flow chart and listing of the algorithm, are included.
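A back-of-the-envelope version of the planning problem, ignoring the Mach/airspeed schedule and the wind and temperature corrections the actual algorithm models, is simple kinematics. The fixed-rate assumption and function names below are ours, for illustration only:

```python
def top_of_descent(cruise_alt_ft, meter_fix_alt_ft, ground_speed_kt,
                   descent_rate_fpm):
    """Distance before the metering fix at which to begin a
    constant-rate descent (crude sketch, not the paper's algorithm)."""
    dh = cruise_alt_ft - meter_fix_alt_ft          # feet to lose
    minutes = dh / descent_rate_fpm                # time spent descending
    return ground_speed_kt * minutes / 60.0        # distance in NM

# Descend from FL350 to a 10,000 ft metering fix at 300 kt, 1,500 ft/min.
dist_nm = top_of_descent(35000, 10000, 300, 1500)
```

The real algorithm replaces the fixed rate and ground speed with linear performance approximations so the result stays fuel-conservative across weights and winds.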
Fu, Haohao; Shao, Xueguang; Chipot, Christophe; Cai, Wensheng
2016-08-09
Proper use of the adaptive biasing force (ABF) algorithm in free-energy calculations needs certain prerequisites to be met, namely, that the Jacobian for the metric transformation and its first derivative be available and the coarse variables be independent and fully decoupled from any holonomic constraint or geometric restraint, thereby singularly limiting the field of application of the approach. The extended ABF (eABF) algorithm circumvents these intrinsic limitations by applying the time-dependent bias onto a fictitious particle coupled to the coarse variable of interest by means of a stiff spring. However, with the current implementation of eABF in the popular molecular dynamics engine NAMD, a trajectory-based post-treatment is necessary to derive the underlying free-energy change. Usually, such a post hoc analysis leads to a decrease in the reliability of the free-energy estimates due to the inevitable loss of information, as well as to a drop in efficiency, which stems from substantial read-write accesses to file systems. We have developed a user-friendly, on-the-fly code for performing eABF simulations within NAMD. In the present contribution, this code is probed in eight illustrative examples. The performance of the algorithm is compared with traditional ABF, on the one hand, and the original eABF implementation combined with a post hoc analysis, on the other hand. Our results indicate that the on-the-fly eABF algorithm (i) supplies the correct free-energy landscape in those critical cases where the coarse variables at play are coupled to either each other or to geometric restraints or holonomic constraints, (ii) greatly improves the reliability of the free-energy change, compared to the outcome of a post hoc analysis, and (iii) represents a negligible additional computational effort compared to regular ABF. Moreover, in the proposed implementation, guidelines for choosing two parameters of the eABF algorithm, namely the stiffness of the spring and the mass
NASA Astrophysics Data System (ADS)
Huang, Quanzhen; Luo, Jun; Li, Hengyu; Wang, Xiaohua
2013-08-01
With the wide application of large-scale flexible structures in spacecraft, vibration control of these structures has become an important design issue. The filtered-X least mean square (FXLMS) algorithm is the most popular one in current active vibration control using adaptive filtering. It assumes that the source of interference can be measured, and the interference source is taken as the reference signal input to the controller. However, in an actual control system this assumption is not accurate, because it does not consider the impact of the reference signal on the output feedback signal. In this paper, an adaptive active vibration control algorithm based on an infinite impulse response (IIR) filter structure (FULMS, filtered-U least mean square) is proposed. The algorithm is based on the FXLMS framework, replacing the finite impulse response (FIR) filter with an IIR filter. This paper focuses on the structural design of the controller, the process of the FULMS filtering control method, the design of the experimental model object, and the construction of the experimental platform for the entire control system. The comparison of the FXLMS algorithm with FULMS is theoretically analyzed and experimentally validated. The results show that the FULMS algorithm converges faster and controls better. The design of the FULMS controller is feasible and effective, and of practical value in aerospace engineering applications.
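For context, the FIR-based FXLMS baseline the paper starts from can be sketched in a few lines: the reference is filtered through a model of the secondary path before the LMS weight update. All signals and paths below are synthetic illustrations; the paper's FULMS variant, with an IIR controller, is not shown:

```python
import numpy as np

def fxlms(x, d, s_path, n_taps=16, mu=0.005):
    """Filtered-X LMS: FIR controller driven by reference x, with the
    reference pre-filtered through the secondary-path model s_path."""
    w = np.zeros(n_taps)
    s = np.asarray(s_path, dtype=float)
    xf = np.convolve(x, s)[:len(x)]            # filtered reference x'
    e = np.zeros(len(x))
    ybuf = np.zeros(len(s))                    # recent controller outputs
    for n in range(n_taps, len(x)):
        xv = x[n - n_taps + 1:n + 1][::-1]     # newest-first reference taps
        y = w @ xv                             # anti-vibration output
        ybuf = np.roll(ybuf, 1)
        ybuf[0] = y
        e[n] = d[n] - s @ ybuf                 # residual at the error sensor
        xfv = xf[n - n_taps + 1:n + 1][::-1]
        w += mu * e[n] * xfv                   # filtered-X LMS update
    return w, e

rng = np.random.default_rng(0)
x = rng.standard_normal(10000)                 # reference (disturbance source)
p = np.array([0.0, 0.5, 0.3])                  # primary path (assumed)
s = np.array([0.8, 0.2])                       # secondary path (assumed known)
d = np.convolve(x, p)[:len(x)]                 # disturbance at the sensor
w, e = fxlms(x, d, s)
```

With an exact secondary-path model the residual vibration decays by orders of magnitude, the benchmark against which FULMS's faster convergence is measured.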
Implementation of spectral clustering on microarray data of carcinoma using k-means algorithm
NASA Astrophysics Data System (ADS)
Frisca; Bustamam, Alhadi; Siswantining, Titin
2017-03-01
Clustering is a data analysis method that aims to group data with similar characteristics. Spectral clustering is one of the most popular modern clustering algorithms. As an effective clustering technique, the spectral clustering method emerged from concepts in spectral graph theory. Spectral clustering needs a partitioning algorithm; options include PAM, SOM, Fuzzy c-means, and k-means. Based on research done by Capital and Choudhury in 2013, the k-means algorithm with Euclidean distance provides better accuracy than the PAM algorithm, so in this paper we use k-means as our partitioning algorithm. The major advantage of spectral clustering is in reducing data dimension, especially, in this case, the dimension of a large microarray dataset. A microarray is a small chip made of a glass plate containing thousands and even tens of thousands of kinds of genes in DNA fragments derived from doubled cDNA. Microarray data are widely used to detect cancer, for example carcinoma, in which cancer cells express abnormalities in their genes. The purpose of this research is to group data with high similarity in the same cluster and data with low similarity in different clusters. This research uses carcinoma microarray data comprising 7457 genes. The result of partitioning using the k-means algorithm is two clusters.
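The pipeline described above, affinity matrix, graph Laplacian, eigenvector embedding, then k-means partitioning, can be sketched as follows. The data are synthetic, and the Gaussian-width σ and farthest-point k-means initialization are our choices, not the paper's:

```python
import numpy as np

def spectral_kmeans(X, k, sigma=1.0, iters=50):
    """Normalized spectral clustering with k-means as the partitioner."""
    # Gaussian affinity matrix.
    d2 = ((X[:, None, :] - X[None, :, :])**2).sum(-1)
    W = np.exp(-d2 / (2 * sigma**2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian L = I - D^(-1/2) W D^(-1/2).
    dinv = 1.0 / np.sqrt(W.sum(1))
    L = np.eye(len(X)) - dinv[:, None] * W * dinv[None, :]
    # Rows of the k smallest eigenvectors form the spectral embedding.
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, :k]
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    # Plain k-means on the embedding, farthest-point initialization.
    C = U[[0]]
    for _ in range(1, k):
        dist = ((U[:, None, :] - C[None, :, :])**2).sum(-1).min(1)
        C = np.vstack([C, U[dist.argmax()]])
    for _ in range(iters):
        lab = ((U[:, None, :] - C[None, :, :])**2).sum(-1).argmin(1)
        C = np.array([U[lab == j].mean(0) for j in range(k)])
    return lab

# Two well-separated point clouds stand in for the two expression clusters.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.3, (30, 5)), rng.normal(4, 0.3, (30, 5))])
labels = spectral_kmeans(X, 2)
```

The dimension reduction happens at the eigenvector step: k-means runs on an n×k embedding rather than the original high-dimensional expression vectors.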
Implementation on Landsat Data of a Simple Cloud Mask Algorithm Developed for MODIS Land Bands
NASA Technical Reports Server (NTRS)
Oreopoulos, Lazaros; Wilson, Michael J.; Varnai, Tamas
2010-01-01
This letter assesses the performance on Landsat-7 images of a modified version of a cloud masking algorithm originally developed for clear-sky compositing of Moderate Resolution Imaging Spectroradiometer (MODIS) images at northern mid-latitudes. While data from recent Landsat missions include measurements at thermal wavelengths, and such measurements are also planned for the next mission, thermal tests are not included in the suggested algorithm in its present form to maintain greater versatility and ease of use. To evaluate the masking algorithm we take advantage of the availability of manual (visual) cloud masks developed at USGS for the collection of Landsat scenes used here. As part of our evaluation we also include the Automated Cloud Cover Assessment (ACCA) algorithm, which includes thermal tests and is used operationally by the Landsat-7 mission to provide scene cloud fractions, but no cloud masks. We show that the suggested algorithm can perform about as well as ACCA both in terms of scene cloud fraction and pixel-level cloud identification. Specifically, we find that the algorithm gives an error of 1.3% for the scene cloud fraction of 156 scenes, and a root mean square error of 7.2%, while it agrees with the manual mask for 93% of the pixels, figures very similar to those from ACCA (1.2%, 7.1%, 93.7%).
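A toy illustration of a thermal-free cloud test of this general kind: clouds are bright in the visible and spectrally flat, so simple reflectance and index thresholds can separate them from vegetation and snow. All thresholds and band combinations below are illustrative placeholders, not the tests used by the letter's algorithm:

```python
import numpy as np

def simple_cloud_mask(blue, red, nir, swir,
                      bright_thresh=0.3, ndvi_thresh=0.3):
    """Toy visible/NIR-only cloud test: bright, spectrally flat,
    and not snow-like. Thresholds are illustrative only."""
    bright = (blue > bright_thresh) & (red > bright_thresh)
    ndvi = (nir - red) / (nir + red + 1e-12)
    flat = np.abs(ndvi) < ndvi_thresh                      # clouds: NDVI ~ 0
    not_snow = (red - swir) / (red + swir + 1e-12) < 0.4   # NDSI-like test
    return bright & flat & not_snow

# Three pixels: vegetation, thick cloud, snow (reflectances in 0..1).
blue = np.array([0.05, 0.60, 0.70])
red  = np.array([0.06, 0.62, 0.65])
nir  = np.array([0.45, 0.60, 0.55])
swir = np.array([0.15, 0.45, 0.10])
mask = simple_cloud_mask(blue, red, nir, swir)
```

Operational algorithms such as ACCA chain many more tests (including thermal ones); the sketch only shows why reflective bands alone can carry most of the discrimination.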
NASA Astrophysics Data System (ADS)
Plaza, Antonio; Chang, Chein-I.; Plaza, Javier; Valencia, David
2006-05-01
The incorporation of hyperspectral sensors aboard airborne/satellite platforms is currently producing a nearly continual stream of multidimensional image data, and this high data volume has introduced new processing challenges. The price paid for the wealth of spatial and spectral information available from hyperspectral sensors is the enormous amount of data that they generate. Several applications exist, however, where having the desired information calculated quickly enough for practical use is essential. High computing performance of algorithm analysis is particularly important in homeland defense and security applications, in which swift decisions often involve detection of (sub-pixel) military targets (including hostile weaponry, camouflage, concealment, and decoys) or chemical/biological agents. In order to speed up the computational performance of hyperspectral imaging algorithms, this paper develops several fast parallel data processing techniques covering four classes of algorithms: (1) unsupervised classification, (2) spectral unmixing, (3) automatic target recognition, and (4) onboard data compression. A massively parallel Beowulf cluster (Thunderhead) at NASA's Goddard Space Flight Center in Maryland is used to measure the parallel performance of the proposed algorithms. In order to explore the viability of developing onboard, real-time hyperspectral data compression algorithms, a Xilinx Virtex-II field programmable gate array (FPGA) is also used in experiments. Our quantitative and comparative assessment of parallel techniques and strategies may help image analysts in the selection of parallel hyperspectral algorithms for specific applications.
NASA Astrophysics Data System (ADS)
Chen, Jun; Shibata, Tadashi
2007-04-01
Pulse-coupled neural networks (PCNNs) are biologically inspired algorithms that have been shown to be highly effective for image feature generation. However, conventional PCNNs are software-oriented algorithms that are too complicated to implement as very-large-scale integration (VLSI) hardware. To employ PCNNs in image-feature-generation VLSIs, a hardware-implementation-friendly PCNN is proposed here. By introducing the concepts of exponentially decaying output and a one-branch dendritic tree, the new PCNN eliminates the large number of convolution operators and floating-point multipliers in conventional PCNNs without compromising its performance at image feature generation. As an analog VLSI implementation of the new PCNN, an image-feature-generation circuit is proposed. By employing floating-gate metal-oxide-semiconductor (MOS) technology, the circuit achieves a full voltage-mode implementation of the PCNN in a compact structure. Inheriting the merits of the PCNN, the circuit is capable of generating rotation-independent and translation-independent features for input patterns, which has been verified by SPICE simulation.
Yang, Yuan; Quan, Nannan; Bu, Jingjing; Li, Xueping; Yu, Ningmei
2016-09-26
High-order modulation and demodulation technology can resolve the conflicting frequency requirements of wireless energy transmission and data communication. In order to achieve reliable wireless data communication based on high-order modulation technology for visual prostheses, this work proposes a Reed-Solomon (RS) error correcting code (ECC) circuit built on differential amplitude and phase shift keying (DAPSK) soft demodulation. Firstly, recognizing that the traditional division-based DAPSK soft demodulation algorithm is too complex for hardware implementation, an improved phase soft demodulation algorithm for visual prostheses is put forward to reduce hardware complexity. Based on this new algorithm, an improved RS soft decoding method is then proposed, in which the Chase algorithm is combined with hard decoding algorithms to achieve soft decoding. In order to meet the requirements of an implantable visual prosthesis, a method to calculate symbol-level reliability as the product of bit reliabilities is derived, which reduces the number of test vectors of the Chase algorithm. The proposed algorithms are verified by MATLAB simulation and FPGA experiments. In the MATLAB simulation, a biological channel attenuation model is added to the ECC circuit. The data rate is 8 Mbps in both the MATLAB simulation and the FPGA experiments. MATLAB simulation results show that the improved phase soft demodulation algorithm proposed in this paper saves hardware resources without losing bit error rate (BER) performance. Compared with the traditional demodulation circuit, the coding gain of the ECC circuit is improved by about 3 dB at the same BER of [Formula: see text]. The FPGA experimental results show that the system can correct data demodulation errors with the wireless coils 3 cm apart; the greater the distance, the higher the BER. Then we use a bit error rate analyzer to
Chabini, I.; Florian, M.
1994-12-31
In this paper we present a new class of sequential and parallel algorithms for transportation problems with linear and convex costs. First, we consider a capacitated transportation problem with an entropy-type objective function. We show that this problem has some interesting properties, namely that its optimal solution satisfies both the nonnegativity and capacity constraints. Then, we give a new solution method for this problem. The algorithm consists of a sequence of "balancing" iterations on the conservation-of-flow constraints, which may be viewed as a generalization of the well-known RAS algorithm for matrix balancing. We then prove the convergence of this method and extend it to strictly convex and linear cost transportation problems. For differentiable convex costs we develop an adaptation where each projection is an entropy-type capacitated transportation problem. For linear costs, we prove a triple equivalence between the entropy projection method, the proximal minimization approach (with our entropy-type function), and an entropy barrier method. We give a convergence rate analysis for strongly convex costs and linear objective functions. We show efficient implementations in both serial and parallel environments. Computational results indicate that this method yields very encouraging results. We solve large problems with several million variables on a network of transputers and Sun workstations. For the linear case, the serial implementation is compared to network simplex codes such as RELAX and RNET. Computational experiments indicate that this algorithm can outperform both RELAX and RNET. The parallel implementations are analysed using, in particular, a new measure of performance developed by the authors. The results demonstrate that this measure can give more information than the classical measure of speedup. Some unexpected behaviors are reported.
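The "balancing" iterations generalize the well-known RAS algorithm for matrix balancing. The classical RAS core (without the paper's capacity constraints and entropy projections) can be sketched as:

```python
import numpy as np

def ras_balance(A, row_targets, col_targets, iters=200):
    """RAS ("balancing") iterations: alternately rescale the rows and the
    columns of a positive matrix so that its marginals match the given
    targets. Requires row_targets.sum() == col_targets.sum(); the paper's
    entropy-projection method extends this scheme to capacitated
    transportation problems."""
    X = A.astype(float).copy()
    for _ in range(iters):
        X *= (row_targets / X.sum(axis=1))[:, None]   # row balancing step
        X *= (col_targets / X.sum(axis=0))[None, :]   # column balancing step
    return X
```

Convergence is geometric for strictly positive matrices, which is what makes the scheme attractive as a building block for the projection iterations described in the abstract.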
Egan, A; Laub, W
2014-06-15
Purpose: Several shortcomings of the current implementation of the analytic anisotropic algorithm (AAA) may lead to dose calculation errors in highly modulated treatments delivered to highly heterogeneous geometries. Here we introduce a set of dosimetric error predictors that can be applied to a clinical treatment plan and patient geometry in order to identify high-risk plans. Once a problematic plan is identified, the treatment can be recalculated with a more accurate algorithm in order to better assess its viability. Methods: Here we focus on three distinct sources of dosimetric error in the AAA algorithm. First, due to a combination of discrepancies in small-field beam modeling as well as volume averaging effects, dose calculated through small MLC apertures can be underestimated, while that behind small MLC blocks can be overestimated. Second, due to the rectilinear scaling of the Monte Carlo generated pencil beam kernel, energy is not properly transported through heterogeneities near, but not impeding, the central axis of the beamlet. And third, AAA overestimates dose in regions of very low density (< 0.2 g/cm³). We have developed an algorithm to detect the location and magnitude of each scenario within the patient geometry, namely the field-size index (FSI), the heterogeneous scatter index (HSI), and the low-density index (LDI), respectively. Results: The error indices successfully identify deviations between AAA and Monte Carlo dose distributions in simple phantom geometries. The algorithms are currently implemented in the MATLAB computing environment and are able to run on a typical RapidArc head and neck geometry in less than an hour. Conclusion: Because these error indices successfully identify each type of error in contrived cases, with sufficient benchmarking this method can be developed into a clinical tool that may help estimate AAA dose calculation errors and indicate when it might be advisable to use Monte Carlo calculations.
NASA Astrophysics Data System (ADS)
Rueda, Antonio J.; Noguera, José M.; Luque, Adrián
2016-02-01
In recent years GPU computing has gained wide acceptance as a simple low-cost solution for speeding up computationally expensive processing in many scientific and engineering applications. However, in most cases accelerating a traditional CPU implementation on a GPU is a non-trivial task that requires a thorough refactorization of the code and specific optimizations that depend on the architecture of the device. OpenACC is a promising technology that aims at reducing the effort required to accelerate C/C++/Fortran code on an attached multicore device. With this technology, the CPU code essentially only has to be augmented with a few compiler directives that identify the regions to be accelerated and the way in which data has to be moved between the CPU and GPU. Its potential benefits are multiple: better code readability, less development time, lower risk of errors, and less dependency on the underlying architecture and future evolution of GPU technology. Our aim in this work is to evaluate the pros and cons of using OpenACC against native GPU implementations in computationally expensive hydrological applications, using the classic D8 algorithm of O'Callaghan and Mark for river network extraction as a case study. We implemented the flow accumulation step of this algorithm on the CPU, using OpenACC, and in two different CUDA versions, comparing the length and complexity of the code and its performance with different datasets. We find that although OpenACC cannot match the performance of an optimized CUDA implementation (×3.5 slower on average), it provides a significant performance improvement over a CPU implementation (×2-6) with far simpler code and less implementation effort.
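For reference, a sequential sketch of the D8 flow-accumulation step follows; the GPU versions discussed in the abstract parallelize this same propagation. Handling of flats and pits is simplified here (they simply drain nowhere), which real terrain-analysis codes treat more carefully:

```python
import numpy as np

def d8_flow_accumulation(dem):
    """Sequential sketch of D8 flow accumulation (O'Callaghan & Mark):
    each cell drains to its steepest-descent neighbour, and accumulation
    counts how many cells drain through each cell. Cells are processed
    from highest elevation to lowest so every contributor is finished
    before its receiver."""
    rows, cols = dem.shape
    # D8 neighbour offsets and their distances (diagonals are sqrt(2) away)
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
            (0, 1), (1, -1), (1, 0), (1, 1)]
    dist = [np.hypot(dr, dc) for dr, dc in offs]
    acc = np.ones_like(dem, dtype=float)      # each cell contributes itself
    order = np.argsort(dem, axis=None)[::-1]  # highest elevation first
    for flat in order:
        r, c = divmod(int(flat), cols)
        best, target = 0.0, None
        for (dr, dc), d in zip(offs, dist):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                slope = (dem[r, c] - dem[rr, cc]) / d
                if slope > best:
                    best, target = slope, (rr, cc)
        if target is not None:                # pits/flats drain nowhere here
            acc[target] += acc[r, c]
    return acc
```

The sort-then-propagate order makes the dependency structure explicit, which is exactly what a GPU implementation must resolve (e.g. by iterating until accumulations stabilize).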
NASA Astrophysics Data System (ADS)
Popov, A.; Zolotarev, V.; Bychkov, S.
2016-11-01
This paper examines the results of experimental studies of a previously presented combined algorithm designed to increase the reliability of information systems. Data illustrating the organization and conduct of the studies are provided. As part of the study, the experimental data from simulation modeling were compared with data from the functioning of a real information system. The hypothesis of the homogeneity of the logical structure of information systems was formulated, making it possible to reconfigure the presented algorithm, more specifically, to transform it into a model for the analysis and prediction of arbitrary information systems. The results presented can be used for further research in this direction. The demonstrated ability to predict the functioning of information systems can be used for strategic and economic planning. The algorithm can also be used as a means of providing information security.
NASA Astrophysics Data System (ADS)
von Rudorff, Guido Falk; Wehmeyer, Christoph; Sebastiani, Daniel
2014-06-01
We adapt a swarm-intelligence-based optimization method (the artificial bee colony algorithm, ABC) to enhance its parallel scaling properties and to improve the escaping behavior from deep local minima. Specifically, we apply the approach to the geometry optimization of Lennard-Jones clusters. We illustrate the performance and the scaling properties of the parallelization scheme for several system sizes (5-20 particles). Our main findings are specific recommendations for ranges of the parameters of the ABC algorithm which yield maximal performance for Lennard-Jones clusters and Morse clusters. The suggested parameter ranges for these different interaction potentials turn out to be very similar; thus, we believe that our reported values are fairly general for the ABC algorithm applied to chemical optimization problems.
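A minimal serial sketch of the ABC algorithm follows, applied to a generic box-constrained objective rather than a Lennard-Jones or Morse energy; the parameter values (colony size, abandonment limit, cycle count) are illustrative and are not the paper's recommended ranges:

```python
import random

def abc_minimize(f, dim, bounds, n_food=10, limit=20, iters=300, seed=0):
    """Minimal artificial bee colony (ABC) sketch for box-constrained
    minimization. Employed bees refine food sources, onlookers revisit
    sources chosen by roulette on fitness, and scouts replace sources
    that failed to improve more than `limit` times."""
    rng = random.Random(seed)
    lo, hi = bounds
    rand_source = lambda: [rng.uniform(lo, hi) for _ in range(dim)]
    sources = [rand_source() for _ in range(n_food)]
    vals = [f(s) for s in sources]
    trials = [0] * n_food

    def try_neighbour(i):
        k = rng.randrange(n_food - 1)
        k = k if k < i else k + 1                  # random partner, k != i
        j = rng.randrange(dim)                     # perturb one coordinate
        cand = sources[i][:]
        cand[j] += rng.uniform(-1, 1) * (cand[j] - sources[k][j])
        cand[j] = min(hi, max(lo, cand[j]))
        v = f(cand)
        if v < vals[i]:                            # greedy acceptance
            sources[i], vals[i], trials[i] = cand, v, 0
        else:
            trials[i] += 1

    best_x, best_v = min(zip(sources, vals), key=lambda p: p[1])
    for _ in range(iters):
        for i in range(n_food):                    # employed bee phase
            try_neighbour(i)
        worst = max(vals)
        fit = [worst - v + 1e-12 for v in vals]    # higher is better
        total = sum(fit)
        for _ in range(n_food):                    # onlooker phase
            r, acc, pick = rng.uniform(0, total), 0.0, 0
            for i, w in enumerate(fit):
                acc += w
                if acc >= r:
                    pick = i
                    break
            try_neighbour(pick)
        for i in range(n_food):                    # scout phase
            if trials[i] > limit:
                fresh = rand_source()
                sources[i], vals[i], trials[i] = fresh, f(fresh), 0
        i = min(range(n_food), key=vals.__getitem__)
        if vals[i] < best_v:                       # remember global best
            best_x, best_v = sources[i][:], vals[i]
    return best_x, best_v
```

The employed/onlooker/scout loop is what the paper parallelizes; the scout phase is also the escape mechanism from deep local minima mentioned in the abstract.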
Design and implementation of a vision-based hovering and feature tracking algorithm for a quadrotor
NASA Astrophysics Data System (ADS)
Lee, Y. H.; Chahl, J. S.
2016-10-01
This paper demonstrates an approach to the vision-based control of unmanned quadrotors for hovering and object tracking. The algorithm used the Speeded Up Robust Features (SURF) method to detect objects. The pose of the object in the image was then calculated in order to pass the pose information to the flight controller. Finally, the flight controller steered the quadrotor to approach the object based on the calculated pose data. The above processes were run using the standard onboard resources of the 3DR Solo quadrotor in an embedded computing environment. The obtained results showed that the algorithm behaved well during its missions, tracking and hovering, although there were significant latencies due to the low CPU performance of the onboard image processing system.
NASA Astrophysics Data System (ADS)
Viau, C. R.
2012-06-01
The use of the intensity change and line-of-sight (LOS) change concepts has previously been documented in the open literature as techniques used by non-imaging infrared (IR) seekers to reject expendable IR countermeasures (IRCM). The purpose of this project was to implement IR counter-countermeasure (IRCCM) algorithms based on target intensity and kinematic behavior for a generic imaging IR (IIR) seeker model, with the underlying goal of obtaining a better understanding of how expendable IRCM can be used to defeat the latest generation of seekers. The report describes the Intensity Ratio Change (IRC) and LOS Rate Change (LRC) discrimination techniques. The algorithms and the seeker model are implemented in a physics-based simulation product called Tactical Engagement Simulation Software (TESS™). TESS is developed in the MATLAB®/Simulink® environment and is a suite of RF/IR missile software simulators used to evaluate and analyze the effectiveness of countermeasures against various classes of guided threats. The investigation evaluates the algorithms and tests their robustness by presenting the results of batch simulation runs of surface-to-air (SAM) and air-to-air (AAM) IIR missiles engaging a non-maneuvering target platform equipped with expendable IRCM as self-protection. The report discusses how varying critical parameters such as track memory time, ratio thresholds, and hold time can influence the outcome of an engagement.
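The Intensity Ratio Change idea can be illustrated with a toy discriminator. The window and threshold below correspond loosely to the report's track memory time and ratio threshold; the names and values are purely illustrative, not the report's parameters:

```python
def irc_flag(intensity_history, window=5, threshold=2.0):
    """Toy Intensity Ratio Change (IRC) check: compare the latest tracked
    intensity with its average over a short "track memory" window and flag
    a suspected decoy when the ratio jumps past a threshold. A flare
    separating from an aircraft typically produces such a sudden jump."""
    if len(intensity_history) <= window:
        return False                      # not enough track history yet
    baseline = sum(intensity_history[-window - 1:-1]) / window
    return intensity_history[-1] / baseline > threshold
```

A steady target never trips the flag, while a sudden brightening (e.g. a flare ejection) does; the LRC technique applies the same idea to the LOS rate instead of intensity.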
NASA Astrophysics Data System (ADS)
Orjuela-Vargas, S. A.; Triana-Martinez, J.; Yañez, J. P.; Philips, W.
2014-03-01
Video analysis in real time requires fast and efficient algorithms to extract relevant information from a considerable number of frames per second, commonly 25. Furthermore, robust algorithms for outdoor visual scenes must retrieve corresponding features throughout the day, where a key challenge is dealing with lighting changes. Currently, Local Binary Pattern (LBP) techniques are widely used for extracting features due to their robustness to illumination changes and their low implementation requirements. We propose to compute an automatic threshold based on the distribution of the intensity residuals resulting from the pairwise comparisons used in LBP techniques. The intensity residual distribution can be modelled by a Generalized Gaussian Distribution (GGD). In this paper we compute the adaptive threshold using the parameters of the GGD. We present a CUDA implementation of our proposed algorithm, using the LBPSYM technique. Our approach is tested on videos of four different urban scenes with moving traffic, captured during day and night. The extracted features can be used in a further step to determine patterns, identify objects, or detect background. However, further research must be conducted on blur correction, since scenes at night are commonly blurred due to artificial lighting.
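The thresholded-comparison idea can be sketched as follows. The paper derives the threshold from fitted GGD parameters of the centre-neighbour residuals; this sketch substitutes the mean absolute residual as a simpler, assumed stand-in scale estimate, and uses plain 8-neighbour LBP rather than LBPSYM:

```python
import numpy as np

def adaptive_lbp(image):
    """8-neighbour LBP codes with a data-driven comparison threshold.
    Residual-based thresholds make the codes less sensitive to small
    illumination fluctuations than plain sign comparison."""
    img = image.astype(float)
    H, W = img.shape
    center = img[1:H - 1, 1:W - 1]
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    # intensity residuals of each neighbour against the centre pixel
    residuals = np.stack([img[1 + dr:H - 1 + dr, 1 + dc:W - 1 + dc] - center
                          for dr, dc in offs])
    t = float(np.mean(np.abs(residuals)))    # adaptive threshold (stand-in)
    bits = (residuals > t).astype(np.uint8)  # one bit per neighbour
    weights = (2 ** np.arange(8)).reshape(8, 1, 1)
    return (bits * weights).sum(axis=0), t
```

Each step here (residuals, threshold, bit packing) is an independent per-pixel operation, which is why the method maps naturally onto CUDA.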
Implementation of Future Climate Satellite Cloud Algorithms: Case of the GCOM-C/SGLI
NASA Astrophysics Data System (ADS)
Dim, J. R.; Murakami, H.; Nakajima, T. Y.; Takamura, T.
2012-12-01
The Global Change Observation Mission-Climate/Second Generation GLobal Imager (GCOM-C/SGLI) is a future Earth observation satellite to be launched in 2015. Its major objective is the monitoring of long-term climate changes, a major factor of which is the impact of clouds. A new cloud algorithm adapted to the spectral characteristics of the GCOM-C/SGLI and the products derived from it are currently being tested. The tests consist of evaluating the performance of the cloud optical thickness (COT) and the cloud particle effective radius (CLER) against simulation data and against equivalent products derived from a compatible satellite, the Terra/MODerate resolution Imaging Spectroradiometer (Terra/MODIS). In addition to these tests, the sensitivity of the products derived from this algorithm to external and internal cloud-related parameters is analyzed. The base map of the initial data input for this algorithm is made of geometrically corrected radiances of the Advanced Earth Observation Satellite II/GLobal Imager (ADEOS-II/GLI) and the GCOM-C/SGLI simulated radiances. The results of these performance tests, based on temporally matched products, show that the GCOM-C/SGLI algorithm performs relatively well for averagely overcast scenes, with an agreement rate of ±20% with the satellite simulation products and the Terra/MODIS COT and CLER. A negative bias is, however, frequently observed, with the GCOM-C/SGLI retrieved parameters showing higher values at high COT levels. The algorithm also seems less responsive to thin clouds and small-particle clouds, mainly over land areas, compared to Terra/MODIS data and the satellite simulation products. Sensitivity to varying ground albedo, cloud phase, cloud structure, and cloud location is analyzed to understand the influence of these parameters on the results obtained. Possible consequences of these influences on long-term climate variations and the bases for improving the present algorithm under various cloud-type conditions are discussed.
Assessment of radial aspheres by the Arc-step algorithm as implemented by the Keratron keratoscope.
Tripoli, N K; Cohen, K L; Holmgren, D E; Coggins, J M
1995-11-01
To assess the accuracy with which the Keratron (Optikon 2000, Rome, Italy) measured rotationally symmetric, radially aspheric test surfaces according to an arc-step profile reconstruction algorithm and to discriminate between error caused by the algorithm and error from other sources. Height, local power, and axial power calculated from radius of curvature centered on the instrument's axis were reported by the Keratron for four surfaces that had radial profiles similar to normal corneas. The Keratron profile reconstruction algorithm was simulated by using ray tracing. Keratron measurements were compared with the surfaces' formulas and the ray-traced simulations. The heights reported by the Keratron were within 0.25 microns from the four surfaces at less than 3 mm from the keratoscope axis and generally within 1 micron of the height calculated from the surfaces' formulas. The Keratron's axial powers were within +/- 0.1 diopter of the simulation of the axial solution between 1 and 4 mm of the axis but were greater central to 1 mm and peripheral to 4 mm. The Keratron's local powers were within -0.25 diopters at less than 4 mm from the axis and peripherally were between +1.75 diopters and -0.75 diopter of power calculated from the surface's instantaneous radii of curvature. Height error because of the arc-step algorithm was less than -0.2 micron. The Keratron's arc-step profile reconstruction algorithm contributed to its ability to measure height more accurately than keratoscopes that use spherically biased algorithms and provided measurement of local power.
Transport implementation of the Bernstein-Vazirani algorithm with ion qubits
NASA Astrophysics Data System (ADS)
Fallek, S. D.; Herold, C. D.; McMahon, B. J.; Maller, K. M.; Brown, K. R.; Amini, J. M.
2016-08-01
Using trapped ion quantum bits in a scalable microfabricated surface trap, we perform the Bernstein-Vazirani algorithm. Our architecture takes advantage of the ion transport capabilities of such a trap. The algorithm is demonstrated using two- and three-ion chains. For three ions, an improvement is achieved compared to a classical system using the same number of oracle queries. For two ions and one query, we correctly determine an unknown bit string with probability 97.6(8)%. For three ions, we succeed with probability 80.9(3)%.
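The Bernstein-Vazirani circuit demonstrated in the trapped-ion experiment can be simulated classically at small sizes. The sketch below folds the usual ancilla qubit into a phase oracle, a standard equivalent formulation; it shows why a single oracle query suffices to recover the hidden string that a classical bit-probe needs n queries to learn:

```python
import numpy as np
from itertools import product

def bernstein_vazirani(s_bits):
    """Statevector simulation of the Bernstein-Vazirani circuit.
    For oracle f(x) = s.x mod 2, the sequence H^n -> phase oracle -> H^n
    maps |0...0> exactly onto |s>, so measuring once reveals s."""
    n = len(s_bits)
    dim = 2 ** n
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    Hn = H
    for _ in range(n - 1):
        Hn = np.kron(Hn, H)                       # n-qubit Hadamard
    state = np.zeros(dim)
    state[0] = 1.0                                # |0...0>
    state = Hn @ state                            # uniform superposition
    # phase oracle: |x> -> (-1)^(s.x) |x>, one query
    for idx, x in enumerate(product([0, 1], repeat=n)):
        if sum(si * xi for si, xi in zip(s_bits, x)) % 2:
            state[idx] *= -1.0
    state = Hn @ state                            # interference step
    measured = int(np.argmax(np.abs(state)))      # deterministic outcome |s>
    return [(measured >> (n - 1 - i)) & 1 for i in range(n)]
```

In the noiseless simulation the outcome is deterministic; the experimental success probabilities quoted in the abstract (97.6% and 80.9%) reflect gate and transport errors in the physical system.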
Gregoretti, Francesco; Belcastro, Vincenzo; di Bernardo, Diego; Oliva, Gennaro
2010-04-21
The reverse engineering of gene regulatory networks using gene expression profile data has become crucial for gaining novel biological knowledge. Large amounts of data that need to be analyzed are currently being produced due to advances in microarray technologies. Using current reverse engineering algorithms to analyze large data sets can be very computationally intensive. These emerging computational requirements can be met using parallel computing techniques. It has been shown that the Network Identification by multiple Regression (NIR) algorithm performs better than other ready-to-use reverse engineering software. However, it cannot be used with large networks with thousands of nodes (as is the case in biological networks) due to its high time and space complexity. In this work we overcome this limitation by designing and developing a parallel version of the NIR algorithm. The new implementation of the algorithm reaches very good accuracy even for large gene networks, improving our understanding of gene regulatory networks, which is crucial for a wide range of biomedical applications.
NASA Astrophysics Data System (ADS)
Ardaneswari, Gianinna; Bustamam, Alhadi; Siswantining, Titin
2017-03-01
A tumor is an abnormal growth of cells that serves no purpose. Carcinoma is a tumor that grows from the top of the cell membrane. In the field of molecular biology, microarray technology is used to store the genetic expression data of diseases. For each microarray gene, information is stored for each trait or condition. Clustering of gene expression data can be done with a biclustering algorithm, a clustering method that groups not only the objects to be clustered but also the properties or conditions of those objects. This research proposes a two-phase method for finding biclusters. In the first phase, a parallel k-means algorithm is applied to the gene expression data. Then, in the second phase, the Cheng and Church biclustering algorithm is performed to find biclusters. In this study, we discuss the implementation of this two-phase method, combining Cheng and Church biclustering with the parallel k-means algorithm, on Carcinoma tumor gene expression data. From the experimental results, we found that five biclusters are formed from the Carcinoma gene expression data.
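The core score of the Cheng and Church algorithm is the mean squared residue of a candidate submatrix, which can be sketched as follows; this covers only the scoring step, not the greedy node-deletion/addition search or the k-means pre-clustering phase:

```python
import numpy as np

def mean_squared_residue(X, rows, cols):
    """Mean squared residue (MSR) of a candidate bicluster: the score that
    Cheng and Church's algorithm greedily minimizes. A submatrix whose
    entries follow an additive row-plus-column model has MSR zero; a
    delta-bicluster keeps MSR below a chosen threshold delta."""
    sub = X[np.ix_(rows, cols)]
    row_mean = sub.mean(axis=1, keepdims=True)
    col_mean = sub.mean(axis=0, keepdims=True)
    residue = sub - row_mean - col_mean + sub.mean()
    return float((residue ** 2).mean())
```

Running k-means first, as the two-phase method does, restricts the expensive MSR-driven search to smaller, more homogeneous blocks of the expression matrix.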
2005-12-01
a user for a patrol mission. To increase the vehicle's abilities, other behaviours such as obstacle avoidance, path planning or leader/follower augment... [Fragmented abstract; recoverable section headings: 5.4 Leader/Follower Behaviour; 5.5 Waypoint Following.] Navigation Behaviour - Provide goal directedness in concert with an obstacle avoidance algorithm. Leader/Follower - Allow a follower vehicle to
NASA Technical Reports Server (NTRS)
Springer, P.
1993-01-01
This paper discusses the way in which the Cascade-Correlation algorithm was parallelized so that it could be run using the Time Warp Operating System (TWOS). TWOS is a special-purpose operating system designed to run parallel discrete event simulations with maximum efficiency on parallel or distributed computers.
Implementation of fractional-order electromagnetic potential through a genetic algorithm
NASA Astrophysics Data System (ADS)
Jesus, Isabel S.; Machado, J. A. Tenreiro
2009-05-01
Several phenomena present in electrical systems have motivated the development of comprehensive models based on the theory of fractional calculus (FC). Bearing these ideas in mind, in this work FC concepts are applied to define, and to evaluate, an electrical potential of fractional order, based on a genetic algorithm optimization scheme. The feasibility and the convergence of the proposed method are evaluated.
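A minimal real-coded genetic algorithm of the kind used for such parameter fitting can be sketched as follows; the objective here is a generic function standing in for the paper's fractional-potential fitting error, and the operator choices (tournament selection, arithmetic crossover, Gaussian mutation) and rates are illustrative assumptions:

```python
import random

def ga_minimize(f, dim, bounds, pop=30, gens=100, pc=0.8, pm=0.1, seed=0):
    """Minimal real-coded GA sketch: tournament selection, arithmetic
    crossover, Gaussian mutation, and two-member elitism."""
    rng = random.Random(seed)
    lo, hi = bounds
    clip = lambda x: min(hi, max(lo, x))
    P = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop)]
    for _ in range(gens):
        scored = sorted(P, key=f)
        nxt = scored[:2]                            # elitism: keep best two
        while len(nxt) < pop:
            a = min(rng.sample(scored, 3), key=f)   # tournament selection
            b = min(rng.sample(scored, 3), key=f)
            if rng.random() < pc:                   # arithmetic crossover
                w = rng.random()
                child = [w * x + (1 - w) * y for x, y in zip(a, b)]
            else:
                child = a[:]
            if rng.random() < pm:                   # Gaussian mutation
                j = rng.randrange(dim)
                child[j] = clip(child[j] + rng.gauss(0, 0.1 * (hi - lo)))
            nxt.append(child)
        P = nxt
    best = min(P, key=f)
    return best, f(best)
```

With elitism the best objective value is monotonically non-increasing over generations, which gives the convergence behavior the paper evaluates.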
NASA Technical Reports Server (NTRS)
Soeder, J. F.
1983-01-01
As turbofan engines become more complex, the development of controls necessitates the use of multivariable control techniques. A control developed for the F100-PW-100(3) turbofan engine using linear quadratic regulator theory and other modern multivariable control synthesis techniques is described, along with the assembly language implementation of this control on an SEL 810B minicomputer. This implementation was then evaluated using a real-time hybrid simulation of the engine. The control software was subsequently modified to run with a real engine; these modifications, in the form of sensor and actuator failure checks and control executive sequencing, are discussed. Finally, recommendations for control software implementations are presented.
NASA Technical Reports Server (NTRS)
Choudhary, Alok Nidhi; Leung, Mun K.; Huang, Thomas S.; Patel, Janak H.
1989-01-01
Several techniques to perform static and dynamic load balancing techniques for vision systems are presented. These techniques are novel in the sense that they capture the computational requirements of a task by examining the data when it is produced. Furthermore, they can be applied to many vision systems because many algorithms in different systems are either the same, or have similar computational characteristics. These techniques are evaluated by applying them on a parallel implementation of the algorithms in a motion estimation system on a hypercube multiprocessor system. The motion estimation system consists of the following steps: (1) extraction of features; (2) stereo match of images in one time instant; (3) time match of images from different time instants; (4) stereo match to compute final unambiguous points; and (5) computation of motion parameters. It is shown that the performance gains when these data decomposition and load balancing techniques are used are significant and the overhead of using these techniques is minimal.
Facchinetti, Andrea; Sparacino, Giovanni; Cobelli, Claudio
2013-01-01
Glucose readings provided by current continuous glucose monitoring (CGM) devices still suffer from accuracy and precision issues. In April 2013, we proposed a new conceptual architecture to deal with these problems and render CGM sensors algorithmically smarter, which consists of three modules for denoising, enhancement, and prediction placed in cascade to a commercial CGM sensor. The architecture was assessed on a data set consisting of 24 type 1 diabetes patients collected in four clinical centers of the AP@home Consortium (a European project of 7th Framework Programme funded by the European Committee). This article, as a companion to our prior publication, illustrates the technical details of the algorithms and of the implementation issues. PMID:24124959
Badawi, Ahmed M; Rushdi, Muhammad A
2006-01-01
This paper proposes a novel algorithm for speckle reduction in medical ultrasound imaging that preserves edges, with the added advantages of adaptive noise filtering and speed. We propose a nonlinear image diffusion algorithm that incorporates two local parameters of image quality, namely scatterer density and texture-based contrast, in addition to gradient, to weight the nonlinear diffusion process. The scatterer density is proposed to replace traditional measures of the quality of the ultrasound diffusion process such as MSE, RMSE, SNR, and PSNR. This novel diffusion filter was then implemented using a backpropagation neural network for fast parallel processing of volumetric images. The experimental results show that weighting the image diffusion with these parameters produces better noise reduction and better edge detection quality at reasonable computational cost. The proposed filter can be used as a preprocessing phase before applying any ultrasound segmentation or active contour model process.
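The underlying gradient-weighted diffusion is the classic Perona-Malik scheme, sketched below. The paper's filter additionally weights conduction by scatterer density and texture-based contrast; here `extra_weight` is a hypothetical per-pixel map standing in for those terms (1.0 everywhere by default, i.e. gradient-only diffusion):

```python
import numpy as np

def diffuse(image, steps=20, dt=0.2, kappa=0.1, extra_weight=None):
    """Perona-Malik-style nonlinear diffusion sketch. Smooths homogeneous
    regions while the exponential conduction term stalls diffusion across
    strong edges. Explicit scheme; stable for dt <= 0.25."""
    u = image.astype(float).copy()
    w = np.ones_like(u) if extra_weight is None else extra_weight
    for _ in range(steps):
        # one-sided differences to the four neighbours (zero-flux borders)
        dn = np.zeros_like(u); dn[1:, :] = u[:-1, :] - u[1:, :]
        ds = np.zeros_like(u); ds[:-1, :] = u[1:, :] - u[:-1, :]
        de = np.zeros_like(u); de[:, :-1] = u[:, 1:] - u[:, :-1]
        dw_ = np.zeros_like(u); dw_[:, 1:] = u[:, :-1] - u[:, 1:]
        # edge-stopping conduction: small where local gradients are large
        g = lambda d: np.exp(-(d / kappa) ** 2) * w
        u = u + dt * (g(dn) * dn + g(ds) * ds + g(de) * de + g(dw_) * dw_)
    return u
```

With a uniform weight the scheme conserves the image mean exactly (the neighbour fluxes are antisymmetric) while monotonically reducing variance, which is the desired smoothing behavior.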
How to implement a quantum algorithm on a large number of qubits by controlling one central qubit
NASA Astrophysics Data System (ADS)
Zagoskin, Alexander; Ashhab, Sahel; Johansson, J. R.; Nori, Franco
2010-03-01
It is desirable to minimize the number of control parameters needed to perform a quantum algorithm. We show that, under certain conditions, an entire quantum algorithm can be efficiently implemented by controlling a single central qubit in a quantum computer. We also show that the different system parameters do not need to be designed accurately during fabrication. They can be determined through the response of the central qubit to external driving. Our proposal is well suited for hybrid architectures that combine microscopic and macroscopic qubits. More details can be found in: A.M. Zagoskin, S. Ashhab, J.R. Johansson, F. Nori, Quantum two-level systems in Josephson junctions as naturally formed qubits, Phys. Rev. Lett. 97, 077001 (2006); and S. Ashhab, J.R. Johansson, F. Nori, Rabi oscillations in a qubit coupled to a quantum two-level system, New J. Phys. 8, 103 (2006).
Passive microwave remote sensing of rainfall with SSM/I: Algorithm development and implementation
NASA Technical Reports Server (NTRS)
Ferriday, James G.; Avery, Susan K.
1994-01-01
A physically based algorithm sensitive to emission and scattering is used to estimate rainfall using the Special Sensor Microwave/Imager (SSM/I). The algorithm is derived from radiative transfer calculations through an atmospheric cloud model specifying vertical distributions of ice and liquid hydrometeors as a function of rain rate. The algorithm is structured in two parts: SSM/I brightness temperatures are screened to detect rainfall and are then used in rain-rate calculation. The screening process distinguishes between nonraining background conditions and emission and scattering associated with hydrometeors. Thermometric temperature and polarization thresholds determined from the radiative transfer calculations are used to detect rain, whereas the rain-rate calculation is based on a linear function fit to a linear combination of channels. Separate calculations for ocean and land account for different background conditions. The rain-rate calculation is constructed to respond to both emission and scattering, reduce extraneous atmospheric and surface effects, and to correct for beam filling. The resulting SSM/I rain-rate estimates are compared to three precipitation radars as well as to a dynamically simulated rainfall event. Global estimates from the SSM/I algorithm are also compared to continental and shipboard measurements over a 4-month period. The algorithm is found to accurately describe both localized instantaneous rainfall events and global monthly patterns over both land and ocean. Over land the 4-month mean difference between SSM/I and the Global Precipitation Climatology Center continental rain gauge database is less than 10%. Over the ocean, the mean difference between SSM/I and the Legates and Willmott global shipboard rain gauge climatology is less than 20%.
Bjerrome Ahlin, Henrik; Kölby, Lars; Elander, Anna; Selvaggi, Gennaro
2014-12-01
was 2.5 for the two-step concentric circular approach, and 1.25 for the single step, algorithm-based approach; particularly, when the concentric circular technique was chosen for the single step, algorithm-based approach, only two of the patients required revision surgery to improve the cosmetic outcome. This study shows that the number of complications and the total number of surgeries performed to satisfy patients were lower after Monstrey's algorithm for mastectomies was implemented as routine practice at the Sahlgrenska University Hospital.
Non-quantum implementation of quantum computation algorithm using a spatial coding technique
NASA Astrophysics Data System (ADS)
Tate, N.; Ogura, Y.; Tanida, J.
2005-07-01
Non-quantum implementation of quantum information processing is studied. A spatial coding technique, which is one effective digital optical computing technique, is utilized to implement quantum teleportation efficiently. In the coding, quantum information is represented by the intensity and the phase of elemental cells. Correct operation is confirmed within the proposed scheme, which indicates the effectiveness of the proposed approach and a motive for further investigation.
NASA Astrophysics Data System (ADS)
Tommy, Lukas; Hardjianto, Mardi; Agani, Nazori
2017-04-01
Connect Four is a two-player game in which the players take turns dropping discs into a grid, aiming to connect four of their own discs vertically, horizontally, or diagonally. To play Connect Four properly, a computer requires artificial intelligence (AI). Many AI algorithms can be applied to Connect Four, but it is unclear which are most suitable. A suitable algorithm is one that chooses optimal moves and whose execution time remains acceptable at sufficiently deep search depths. In this research, standard alpha-beta (AB) pruning and MTD(f) are analyzed and compared on a Connect Four prototype in terms of optimality (win percentage) and speed (execution time and number of leaf nodes evaluated). Experiments were carried out in computer-versus-computer mode under 12 different conditions, varying the search depth (5 through 10) and which player moves first. In the experiments, MTD(f) achieved 45.83% wins, 37.5% losses, and 16.67% draws. At search depth 8, MTD(f) executed 35.19% faster and evaluated 56.27% fewer leaf nodes than AB pruning. The results of this research show that MTD(f) is as optimal as AB pruning on the Connect Four prototype, while on average running faster and evaluating fewer leaf nodes; at sufficiently deep search depths, MTD(f) is considerably faster than AB pruning.
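The MTD(f) driver compared in the abstract above can be sketched in a few lines: it repeatedly runs a zero-window, fail-soft alpha-beta search backed by a transposition table until the lower and upper bounds on the minimax value meet. The sketch below is illustrative only, using a toy game tree rather than a Connect Four position evaluator; all names are our own, not the paper's prototype.

```python
import math

# Toy game tree: leaves are ints (static evaluations), internal nodes are lists.
def alpha_beta(node, alpha, beta, maximizing, tt):
    """Fail-soft alpha-beta with a transposition table of (lower, upper) bounds."""
    key = (id(node), maximizing)
    lower, upper = tt.get(key, (-math.inf, math.inf))
    if lower >= beta:
        return lower
    if upper <= alpha:
        return upper
    if isinstance(node, int):            # leaf: static evaluation
        value = node
    elif maximizing:
        value, a = -math.inf, alpha
        for child in node:
            value = max(value, alpha_beta(child, a, beta, False, tt))
            a = max(a, value)
            if value >= beta:
                break
    else:
        value, b = math.inf, beta
        for child in node:
            value = min(value, alpha_beta(child, alpha, b, True, tt))
            b = min(b, value)
            if value <= alpha:
                break
    # store the bound implied by this fail-soft result
    if value <= alpha:
        tt[key] = (lower, value)
    elif value >= beta:
        tt[key] = (value, upper)
    else:
        tt[key] = (value, value)
    return value

def mtdf(root, first_guess=0):
    """MTD(f): converge on the minimax value via zero-window searches."""
    g, lower, upper, tt = first_guess, -math.inf, math.inf, {}
    while lower < upper:
        beta = g + 1 if g == lower else g
        g = alpha_beta(root, beta - 1, beta, True, tt)
        if g < beta:
            upper = g
        else:
            lower = g
    return g

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # classic minimax example
print(mtdf(tree))   # → 3
```

The transposition table is what makes the repeated zero-window searches cheap: later passes reuse the bounds proved by earlier ones instead of re-searching the whole tree.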
NASA Astrophysics Data System (ADS)
Valencia, David; Plaza, Antonio; Vega-Rodríguez, Miguel A.; Pérez, Rosa M.
2005-11-01
Hyperspectral imagery is a class of image data which is used in many scientific areas, most notably medical imaging and remote sensing. It is characterized by a wealth of spatial and spectral information. Over recent years, many algorithms have been developed with the purpose of finding "spectral endmembers," which are assumed to be pure signatures in remotely sensed hyperspectral data sets. Such pure signatures can then be used to estimate the abundance or concentration of materials in mixed pixels, thus allowing sub-pixel analysis which is crucial in many remote sensing applications due to current sensor optics and configuration. One of the most popular endmember extraction algorithms has been the pixel purity index (PPI), available from Kodak's Research Systems ENVI software package. This algorithm is very time-consuming, which has generally prevented it from delivering timely responses in a wide range of applications, including environmental monitoring, military applications, and hazard and threat assessment/tracking (including wildland fire detection, oil spill mapping, and chemical and biological standoff detection). Field programmable gate arrays (FPGAs) are hardware components with millions of gates. Their reprogrammability and high computational power make them particularly attractive in remote sensing applications which require a response in near real-time. In this paper, we present an FPGA design for the implementation of the PPI algorithm which takes advantage of a recently developed fast PPI (FPPI) algorithm that relies on software-based optimization. The proposed FPGA design represents our first step toward the development of a new reconfigurable system for fast, onboard analysis of remotely sensed hyperspectral imagery.
NASA Technical Reports Server (NTRS)
Schallhorn, Paul; Majumdar, Alok
2012-01-01
This paper describes a finite-volume-based numerical algorithm that allows multi-dimensional computation of fluid flow within a system-level network flow analysis. There are several thermo-fluid engineering problems where higher-fidelity solutions are needed that are not within the capacity of system-level codes. The proposed algorithm allows NASA's Generalized Fluid System Simulation Program (GFSSP) to perform multi-dimensional flow calculations within the framework of GFSSP's typical system-level flow network consisting of fluid nodes and branches. The paper presents several classical two-dimensional fluid dynamics problems that have been solved by GFSSP's multi-dimensional flow solver. The numerical solutions are compared with the analytical and benchmark solutions for Poiseuille flow, Couette flow, and flow in a driven cavity.
Implemented Wavelet Packet Tree based Denoising Algorithm in Bus Signals of a Wearable Sensorarray
NASA Astrophysics Data System (ADS)
Schimmack, M.; Nguyen, S.; Mercorelli, P.
2015-11-01
This paper introduces a thermosensing embedded system with a sensor bus that uses wavelets for noise location and denoising. Following the filter-bank principle, the measured signal is separated into two bands, low and high frequency, and the proposed algorithm identifies the defined noise in these two bands. With the Wavelet Packet Transform, a form of the Discrete Wavelet Transform, the system is able to decompose and reconstruct bus input signals of a sensor network. Using a seminorm, the noise of a sequence can be detected and located, so that the wavelet basis can be rearranged. This in particular allows for the elimination of the incoherent parts that make up the unavoidable measurement noise of bus signals. The proposed method was built on wavelet algorithms from the WaveLab 850 library of Stanford University (USA). This work gives an insight into the workings of the Wavelet Transform.
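As a minimal illustration of the decompose/threshold/reconstruct idea behind wavelet denoising (not the authors' WaveLab-based implementation), the sketch below applies one level of an orthonormal Haar wavelet transform, soft-thresholds the high-frequency band, and reconstructs the signal. The signal and the threshold value are arbitrary assumptions.

```python
def haar_dwt(signal):
    """One level of the orthonormal Haar wavelet transform."""
    s = 0.5 ** 0.5
    approx = [(a + b) * s for a, b in zip(signal[0::2], signal[1::2])]
    detail = [(a - b) * s for a, b in zip(signal[0::2], signal[1::2])]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt (perfect reconstruction)."""
    s = 0.5 ** 0.5
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) * s, (a - d) * s]
    return out

def soft_threshold(coeffs, t):
    """Shrink coefficients toward zero; small (noise) coefficients vanish."""
    return [max(abs(c) - t, 0.0) * (1 if c >= 0 else -1) for c in coeffs]

def denoise(signal, t):
    approx, detail = haar_dwt(signal)
    return haar_idwt(approx, soft_threshold(detail, t))

# Small pairwise jitter around two levels is removed; the steps survive.
noisy = [1.0, 1.1, 0.9, 1.0, 5.0, 5.1, 4.9, 5.0]
print([round(x, 2) for x in denoise(noisy, 0.2)])
```

A multi-level Wavelet Packet Transform, as in the paper, recursively splits both the low- and high-frequency bands in the same way before thresholding.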
Schalkoff, R.J.; Shaaban, K.M.; Carver, A.E.
1996-12-31
The ARIES #1 (Autonomous Robotic Inspection Experimental System) vision system is used to acquire drum surface images under controlled conditions and subsequently perform autonomous visual inspection leading to a classification as 'acceptable' or 'suspect'. Specific topics described include vision system design methodology, algorithmic structure, hardware processing structure, and image acquisition hardware. Most of these capabilities were demonstrated at the ARIES Phase II Demo held on Nov. 30, 1995. Finally, Phase III efforts are briefly addressed.
NASA Astrophysics Data System (ADS)
Tamma, Vincenzo
2016-12-01
We report a detailed analysis of the optical realization of the analogue algorithm described in the first paper of this series (Tamma in Quantum Inf Process 11128:1190, 2015) for the simultaneous factorization of an exponential number of integers. Such an analogue procedure, which scales exponentially in the context of first-order interference, opens up the horizon to polynomial scaling by exploiting multi-particle quantum interference.
Implementation of a Landing Footprint Algorithm for the HTV-2 and Trajectory Simulations
NASA Technical Reports Server (NTRS)
Clark, Casie M.
2012-01-01
This presentation details work performed during the Fall 2011 term in the Research Controls and Dynamics Branch at NASA Dryden Flight Research Center. Included is a study on a possible landing footprint algorithm, with direct application to the HTV-2. Also discussed is work in support of the MIPCC effort, which includes optimal trajectory solutions for the F-15A Streak Eagle aircraft and theoretical performance of an F-15A with a MIPCC propulsion system.
Cantor network, control algorithm, two-dimensional compact structure and its optical implementation
NASA Astrophysics Data System (ADS)
Wang, Ning; Liu, Liren; Yin, Yaozu
1995-12-01
A compact integrating-module technique for packaging an optical multistage Cantor network with a polarization multiplexing technique is suggested. The modules have a unique configuration: the solid-state combination of a polarization rotator, double birefringent slabs, and a 2 × 2 switch array. The design and fabrication of an eight-channel optical nonblocking Cantor network are demonstrated, and a fast-setup control algorithm is developed. The network systems are easy to assemble and insensitive to environmental disturbances.
NASA Technical Reports Server (NTRS)
Chen, Shu-Po
1999-01-01
This paper presents software for handling non-conforming fluid-structure interfaces in aeroelastic simulation. It reviews the interpolation and integration algorithm and highlights the flexibility and user-friendly features that allow the user to select existing structure and fluid packages, such as NASTRAN and CFL3D, to perform the simulation. The presented software is validated by computing the High Speed Civil Transport model.
Real-time implementation of a traction control algorithm on a scaled roller rig
NASA Astrophysics Data System (ADS)
Bosso, N.; Zampieri, N.
2013-04-01
Traction control is a very important aspect of railway vehicle dynamics. Its optimisation allows improvement of the performance of a locomotive by working close to the limit of adhesion. On the other hand, if the adhesion limit is exceeded, the wheels are subjected to heavy wear and there is a significant risk of vibrations in the traction system. Similar considerations can be made in the case of braking. The development and optimisation of a traction/braking control algorithm is a complex activity, because it is usually performed on a real vehicle on the track, where many uncertainties are present due to environmental conditions and vehicle characteristics. This work shows the use of a scaled roller rig to develop and optimise a traction control algorithm on a single wheelset. Measurements performed on the wheelset are used to estimate the optimal adhesion forces by means of a wheel/rail contact algorithm executed in real time. This allows application of the optimal adhesion force.
NASA Technical Reports Server (NTRS)
Patten, William Neff
1989-01-01
There is an evident need to discover a means of establishing reliable, implementable controls for systems that are plagued by nonlinear and/or uncertain model dynamics. The development of a generic controller design tool for tough-to-control systems is reported. The method utilizes a moving-grid, time-finite-element-based solution of the necessary conditions that describe an optimal controller for a system. The technique produces a discrete feedback controller. Real-time laboratory experiments are now being conducted to demonstrate the viability of the method. The resulting algorithm is being implemented in a microprocessor environment. Critical computational tasks are accomplished using a low-cost, on-board multiprocessor (INMOS T800 Transputers) and parallel processing. Progress to date validates the methodology presented. Applications of the technique to the control of highly flexible robotic appendages are suggested.
NASA Astrophysics Data System (ADS)
Zabolotny, W. M.; Byszuk, A.
2016-03-01
The CMS experiment Level-1 trigger system is undergoing an upgrade. In the barrel-endcap transition region, it is necessary to merge data from three types of muon detectors: RPC, DT, and CSC. The Overlap Muon Track Finder (OMTF) uses a novel approach to concentrate and process those data in a uniform manner to identify muons and their transverse momentum. The paper presents the algorithm and the FPGA firmware implementation of the OMTF and its data transmission system in CMS. It is foreseen that the OMTF will be subject to significant changes resulting from optimization, which will be done with the aid of physics simulations. Therefore, a special, high-level, parameterized HDL implementation is necessary.
NASA Astrophysics Data System (ADS)
Ham, Woonchul; Song, Chulgyu; Lee, Kangsan; Badarch, Luubaatar
2015-05-01
In this paper, we introduce a mobile embedded system for capturing stereo images based on two CMOS camera modules. We use WinCE as the operating system and capture the stereo images using a device driver for the CMOS camera interface and DirectDraw API functions. We also comment on the GPU hardware and CUDA programming used to implement a real-time 3D exaggeration algorithm that adjusts and synthesizes the disparity values of a region of interest (ROI). We comment on the aperture pattern used for deblurring of the CMOS camera module, based on the Kirchhoff diffraction formula, and clarify why sharper and clearer images can be obtained by blocking some portion of the aperture or by geometric sampling. The synthesized stereo image is monitored in real time on a shutter-glass-type three-dimensional LCD monitor, and the disparity values of each segment are analyzed to prove the validity of the ROI-emphasizing effect.
NASA Technical Reports Server (NTRS)
Kaufman, H.; Alag, G.
1975-01-01
Simple mechanical linkages have not solved the many control problems associated with high-performance aircraft maneuvering throughout a wide flight envelope. One procedure for retaining uniform handling qualities over such an envelope is to implement a digital adaptive controller. Toward such an implementation, an explicit adaptive controller that makes direct use of on-line parameter identification has been developed and applied to both linearized and nonlinear equations of motion for a typical fighter aircraft. This controller is composed of an on-line weighted least-squares parameter identifier, a Kalman state filter, and a model-following control law designed using single-stage performance indices. Simulation experiments with realistic measurement noise indicate that the proposed adaptive system has the potential for on-board implementation.
Taylor, Paul D; Attwood, Teresa K; Flower, Darren R
2006-12-06
We describe a novel and potentially important tool for candidate subunit vaccine selection through in silico reverse-vaccinology. A set of Bayesian networks able to make individual predictions for specific subcellular locations is implemented in three pipelines with different architectures: a parallel implementation with a confidence level-based decision engine and two serial implementations with a hierarchical decision structure, one initially rooted by prediction between membrane types and another rooted by soluble versus membrane prediction. The parallel pipeline outperformed the serial pipeline, but took twice as long to execute. The soluble-rooted serial pipeline outperformed the membrane-rooted predictor. Assessment using genomic test sets was more equivocal, as many more predictions are made by the parallel pipeline, yet the serial pipeline identifies 22 more of the 74 proteins of known location.
A Synchronization Algorithm and Implementation for High-Speed Block Codes Applications. Part 4
NASA Technical Reports Server (NTRS)
Lin, Shu; Zhang, Yu; Nakamura, Eric B.; Uehara, Gregory T.
1998-01-01
Block codes have trellis structures and decoders amenable to high speed CMOS VLSI implementation. For a given CMOS technology, these structures enable operating speeds higher than those achievable using convolutional codes for only modest reductions in coding gain. As a result, block codes have tremendous potential for satellite trunk and other future high-speed communication applications. This paper describes a new approach for implementation of the synchronization function for block codes. The approach utilizes the output of the Viterbi decoder and therefore employs the strength of the decoder. Its operation requires no knowledge of the signal-to-noise ratio of the received signal, has a simple implementation, adds no overhead to the transmitted data, and has been shown to be effective in simulation for received SNR greater than 2 dB.
NASA Astrophysics Data System (ADS)
Giacometto, F. J.; Vilardy, J. M.; Torres, C. O.; Mattos, L.
2011-01-01
Problems related to security in access control are currently being addressed with applications that rely on characteristics unique to each individual, such as biometric features. Biometric iris images are of particular interest, since both the texture pattern of the iris and its blood vessels are unique to each individual. This paper presents an FPGA implementation of an algorithm for creating templates for biometric authentication based on ocular features, exploiting the fact that the iris texture pattern is unique to each individual. Authentication is based on processes such as edge extraction, segmentation following the principles of John Daugman and Libor Masek, and normalization to obtain the templates needed to search for matches in a database and produce the expected authentication results.
Implementation of a near real-time burned area detection algorithm calibrated for VIIRS imagery
Brenna Schwert; Carl Albury; Jess Clark; Abigail Schaaf; Shawn Urbanski; Bryce Nordgren
2016-01-01
There is a need to implement methods for rapid burned area detection using a suitable replacement for Moderate Resolution Imaging Spectroradiometer (MODIS) imagery to meet future mapping and monitoring needs (Roy and Boschetti 2009, Tucker and Yager 2011). The Visible Infrared Imaging Radiometer Suite (VIIRS) sensor onboard the Suomi-National Polar-orbiting Partnership...
NASA Astrophysics Data System (ADS)
Miao, J. C.; Zhu, P.; Shi, G. L.; Chen, G. L.
2008-01-01
A sub-cycling integration algorithm (also called a multi-time-step integration algorithm), which has been successfully applied to FEM dynamic analysis, was first presented by Belytschko et al. (Comput Methods Appl Mech Eng 17/18:259-275, 1979). However, how to apply this type of algorithm to flexible multi-body dynamics (FMD) problems has not yet been investigated. Similar to the region-partitioning method used in FEM, this paper presents a central-difference-based sub-cycling integration method that decomposes the variables of an FMD equation into several groups and adopts a different integration step size for each group of variables. Based on the condensed form of an FMD equation, a group of common update formulae and a sub-step update formula, which together constitute the sub-cycling scheme, are established in the paper. Furthermore, an implementation flowchart of the sub-cycling is presented. Stability of the sub-cycling will be analyzed and numerical examples will be performed to verify the availability and precision of the sub-cycling in Part II of the paper.
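The common-update/sub-step structure of sub-cycling can be illustrated on a toy two-degree-of-freedom system (not the paper's FMD formulation): a "stiff" variable takes m central-difference substeps per major step, while its neighbour's state is held fixed between common updates (zeroth-order interpolation). All constants below are assumptions chosen for illustration.

```python
# Two coupled unit masses: a stiff DOF (large grounded spring K1) and a
# soft DOF (K2), coupled by a weak spring KC.
K1, K2, KC = 400.0, 1.0, 2.0

def accel(x1, x2):
    a1 = -K1 * x1 - KC * (x1 - x2)
    a2 = -K2 * x2 - KC * (x2 - x1)
    return a1, a2

def subcycled_step(x1, v1, x2, v2, dt, m):
    """One major central-difference step; DOF 1 sub-cycles with step dt/m."""
    _, a2 = accel(x1, x2)
    # soft DOF: one full central-difference (velocity-Verlet form) step
    x2_new = x2 + dt * v2 + 0.5 * dt * dt * a2
    # stiff DOF: m substeps with the soft DOF frozen at its start value
    h = dt / m
    for _ in range(m):
        a1, _ = accel(x1, x2)
        v_half = v1 + 0.5 * h * a1
        x1 = x1 + h * v_half
        a1_new, _ = accel(x1, x2)
        v1 = v_half + 0.5 * h * a1_new
    # common update: finish the soft DOF velocity with the new coupling state
    _, a2_new = accel(x1, x2_new)
    v2 = v2 + 0.5 * dt * (a2 + a2_new)
    return x1, v1, x2_new, v2

x1, v1, x2, v2 = 0.01, 0.0, 1.0, 0.0
dt, m = 0.05, 20            # stiff substep h = dt/m = 0.0025 keeps it stable
for _ in range(400):        # integrate to t = 20
    x1, v1, x2, v2 = subcycled_step(x1, v1, x2, v2, dt, m)
print(abs(x1) < 0.5 and abs(x2) < 3.0)   # trajectory remains bounded
```

The single major step dt would be unstable for the stiff DOF alone (its frequency is about 20 rad/s), which is exactly the situation sub-cycling is designed for; the paper's scheme additionally derives the common-update formulae from the condensed FMD equations rather than freezing the neighbour.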
Toole, Ryan; Fok, Mable P
2015-06-15
A photonic system exemplifying the neurobiological learning algorithm, spike timing dependent plasticity (STDP), is experimentally demonstrated using the cooperative effects of cross gain modulation and nonlinear polarization rotation within an SOA. Furthermore, an STDP-based photonic approach towards the measurement of the angle of arrival (AOA) of a microwave signal is developed, and a three-dimensional AOA localization scheme is explored. Measurement accuracies on the order of tens of centimeters, rivaling that of complex positioning systems that utilize a large distribution of measuring units, are achieved for larger distances and with a simpler setup using just three STDP-based AOA units.
Parallel Implementation of Fast Randomized Algorithms for Low Rank Matrix Decomposition
Lucas, Andrew J.; Stalizer, Mark; Feo, John T.
2014-03-01
We analyze the parallel performance of randomized interpolative decomposition by decomposing low-rank complex-valued Gaussian random matrices larger than 100 GB. We chose a Cray XMT supercomputer as it provides an almost ideal PRAM model, permitting quick investigation of parallel algorithms without obfuscation from hardware idiosyncrasies. We find that on non-square matrices performance scales almost linearly, with runtime about 100 times faster on 128 processors. We also verify that numerically discovered error bounds still hold on matrices two orders of magnitude larger than those previously tested.
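The core randomized step underlying such decompositions can be sketched in plain Python on a tiny, exactly low-rank real matrix (not the Cray XMT implementation, and a range finder rather than a full interpolative decomposition): multiply the matrix by a handful of Gaussian test vectors and orthonormalize the result to obtain a basis Q for the range, so that A ≈ QQᵀA.

```python
import random

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def gram_schmidt(cols):
    """Orthonormalize a list of vectors, dropping near-dependent ones."""
    basis = []
    for v in cols:
        w = v[:]
        for q in basis:
            dot = sum(a * b for a, b in zip(q, w))
            w = [wi - dot * qi for wi, qi in zip(w, q)]
        norm = sum(x * x for x in w) ** 0.5
        if norm > 1e-10:
            basis.append([x / norm for x in w])
    return basis

def randomized_range(A, k, p=2):
    """Sample Y = A·Omega with k+p Gaussian test vectors; orthonormalize."""
    n = len(A[0])
    omega = [[random.gauss(0, 1) for _ in range(k + p)] for _ in range(n)]
    Y = matmul(A, omega)
    return gram_schmidt(transpose(Y))   # orthonormal rows spanning range(A)

# Exactly rank-2 test matrix: A = u1 v1^T + u2 v2^T
random.seed(0)
m, n = 8, 6
u = [[random.gauss(0, 1) for _ in range(m)] for _ in range(2)]
v = [[random.gauss(0, 1) for _ in range(n)] for _ in range(2)]
A = [[sum(u[r][i] * v[r][j] for r in range(2)) for j in range(n)]
     for i in range(m)]

Qt = randomized_range(A, k=2)            # basis for the range of A
Q = transpose(Qt)
approx = matmul(Q, matmul(Qt, A))        # A ≈ Q Qᵀ A
err = max(abs(A[i][j] - approx[i][j]) for i in range(m) for j in range(n))
print(err < 1e-8)
```

For an exactly rank-k matrix a few extra samples (p) make the sketch capture the whole range with probability one; production codes replace the classical Gram-Schmidt with a stable QR and work on out-of-core or distributed blocks.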
Implementation Issues for Algorithmic VLSI (Very Large Scale Integration) Processor Arrays.
1984-10-01
analysis of the various algorithms are described in Appendices 5.A, 5.B and 5.C. A note on notation: following Ottmann et al. [40], the variable n is used... redundant operations OK. Ottmann: log n levels, up to n wasted processors, X-tree topology. Atallah: log n levels, redundant operations OK, up to n wasted processors... Journal of the Association for Computing Machinery 14(2):203-241, April 1967. [40] Thomas A. Ottmann, Arnold L. Rosenberg and Larry J. Stockmeyer. A dictionary machine (for VLSI
NASA Technical Reports Server (NTRS)
Li, Wei; Saleeb, Atef F.
1995-01-01
This two-part report is concerned with the development of a general framework for the implicit time-stepping integrators for the flow and evolution equations in generalized viscoplastic models. The primary goal is to present a complete theoretical formulation, and to address in detail the algorithmic and numerical analysis aspects involved in its finite element implementation, as well as to critically assess the numerical performance of the developed schemes in a comprehensive set of test cases. On the theoretical side, the general framework is developed on the basis of the unconditionally-stable, backward-Euler difference scheme as a starting point. Its mathematical structure is of sufficient generality to allow a unified treatment of different classes of viscoplastic models with internal variables. In particular, two specific models of this type, which are representative of the present state of the art in metal viscoplasticity, are considered in the applications reported here; i.e., fully associative (GVIPS) and non-associative (NAV) models. The matrix forms developed for both these models are directly applicable for both initially isotropic and anisotropic materials, in general (three-dimensional) situations as well as subspace applications (i.e., plane stress/strain, axisymmetric, generalized plane stress in shells). On the computational side, issues related to efficiency and robustness are emphasized in developing the (local) iterative algorithm. In particular, closed-form expressions for residual vectors and (consistent) material tangent stiffness arrays are given explicitly for both GVIPS and NAV models, with their maximum sizes 'optimized' to depend only on the number of independent stress components (but independent of the number of viscoplastic internal state parameters). Significant robustness of the local iterative solution is provided by complementing the basic Newton-Raphson scheme with a line-search strategy for convergence. In the present second part of
NASA Astrophysics Data System (ADS)
Li, Wei; Saleeb, Atef F.
1995-05-01
This two-part report is concerned with the development of a general framework for the implicit time-stepping integrators for the flow and evolution equations in generalized viscoplastic models. The primary goal is to present a complete theoretical formulation, and to address in detail the algorithmic and numerical analysis aspects involved in its finite element implementation, as well as to critically assess the numerical performance of the developed schemes in a comprehensive set of test cases. On the theoretical side, the general framework is developed on the basis of the unconditionally-stable, backward-Euler difference scheme as a starting point. Its mathematical structure is of sufficient generality to allow a unified treatment of different classes of viscoplastic models with internal variables. In particular, two specific models of this type, which are representative of the present state of the art in metal viscoplasticity, are considered in the applications reported here; i.e., fully associative (GVIPS) and non-associative (NAV) models. The matrix forms developed for both these models are directly applicable for both initially isotropic and anisotropic materials, in general (three-dimensional) situations as well as subspace applications (i.e., plane stress/strain, axisymmetric, generalized plane stress in shells). On the computational side, issues related to efficiency and robustness are emphasized in developing the (local) iterative algorithm. In particular, closed-form expressions for residual vectors and (consistent) material tangent stiffness arrays are given explicitly for both GVIPS and NAV models, with their maximum sizes 'optimized' to depend only on the number of independent stress components (but independent of the number of viscoplastic internal state parameters). Significant robustness of the local iterative solution is provided by complementing the basic Newton-Raphson scheme with a line-search strategy for convergence. In the present second part of
Analysis and Implementation of Graph Clustering for Digital News Using Star Clustering Algorithm
NASA Astrophysics Data System (ADS)
Ahdi, A. B.; SW, K. R.; Herdiani, A.
2017-01-01
Since the notion of Web 2.0 emerged and was adopted extensively by Internet services, there has been an unprecedented proliferation of digital news. Digital news articles are rich in content and in links to other news and sources, but they lack category information, so users cannot easily group the news they read. Digital news is naturally linked data, because every news item has relations to other news items and resources, and the most appropriate model for linked data is a graph, thanks to its flexibility in describing relations and its easy-to-understand visualization. To handle the grouping problem, we use a graph clustering approach. Among the many available graph clustering algorithms, such as MST clustering, Chameleon, Markov clustering, and Star Clustering, we choose Star Clustering because it is easy to understand, accurate, and efficient, and it guarantees the quality of the resulting clusters. In this research, we investigate the accuracy of the clustering results by comparing them with expert judgement. We obtained a fairly high accuracy of 80.98% and a promising cluster quality of 62.87%.
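The off-line star clustering step can be sketched as follows, on an invented toy similarity matrix and threshold rather than the paper's news data: threshold the similarity graph, then repeatedly promote the highest-degree unassigned vertex to a star center and attach its unassigned neighbours as satellites.

```python
def star_clustering(sim, names, sigma=0.5):
    """Off-line star clustering on a similarity matrix thresholded at sigma."""
    n = len(names)
    adj = {i: {j for j in range(n) if j != i and sim[i][j] >= sigma}
           for i in range(n)}
    unassigned = set(range(n))
    clusters = []
    while unassigned:
        # star center = highest-degree vertex among the unassigned ones
        center = max(unassigned, key=lambda i: len(adj[i] & unassigned))
        members = {center} | (adj[center] & unassigned)
        clusters.append(sorted(names[i] for i in members))
        unassigned -= members
    return clusters

docs = ["goal", "match", "league", "stock", "market"]
sim = [
    [1.0, 0.8, 0.7, 0.1, 0.0],
    [0.8, 1.0, 0.6, 0.2, 0.1],
    [0.7, 0.6, 1.0, 0.0, 0.1],
    [0.1, 0.2, 0.0, 1.0, 0.9],
    [0.0, 0.1, 0.1, 0.9, 1.0],
]
print(star_clustering(sim, docs))   # two stars: sports items, finance items
```

The threshold sigma controls cluster granularity; the guarantee the abstract alludes to is that every satellite is at least sigma-similar to its star center.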
Lavrentieva, A; Kontou, P; Soulountsi, V; Kioumis, J; Chrysou, O; Bitzani, M
2015-09-30
The purpose of this study was to examine the hypothesis that an algorithm based on serial measurements of procalcitonin (PCT) allows reduction in the duration of antibiotic therapy compared with empirical rules, and does not result in more adverse outcomes in burn patients with infectious complications. All burn patients requiring antibiotic therapy based on confirmed or highly suspected bacterial infections were eligible. Patients were assigned to either a procalcitonin-guided (study group) or a standard (control group) antibiotic regimen. The following variables were analyzed and compared in both groups: duration of antibiotic treatment, mortality rate, percentage of patients with relapse or superinfection, maximum SOFA score (days 1-28), length of ICU and hospital stay. A total of 46 Burn ICU patients receiving antibiotic therapy were enrolled in this study. In 24 patients antibiotic therapy was guided by daily procalcitonin and clinical assessment. PCT guidance resulted in a smaller antibiotic exposure (10.1±4 vs. 15.3±8 days, p=0.034) without negative effects on clinical outcome characteristics such as mortality rate, percentage of patients with relapse or superinfection, maximum SOFA score, length of ICU and hospital stay. The findings thus show that use of a procalcitonin-guided algorithm for antibiotic therapy in the burn intensive care unit may contribute to the reduction of antibiotic exposure without compromising clinical outcome parameters.
Simulation of Propellant Loading System Senior Design Implement in Computer Algorithm
NASA Technical Reports Server (NTRS)
Bandyopadhyay, Alak
2010-01-01
Propellant loading from the Storage Tank to the External Tank is one of the very important and time-consuming pre-launch ground operations for the launch vehicle. The propellant loading system is a complex integrated system involving many physical components, such as the storage tank filled with cryogenic fluid at a very low temperature, the long pipeline connecting the storage tank with the external tank, the external tank along with the flare stack, and vent systems for releasing the excess fuel. Some of the parameters useful for design purposes are the predicted pre-chill time, loading time, amount of fuel lost, and maximum pressure rise. The physics involved in the mathematical modeling is quite complex: the process is unsteady, there is phase change as some of the fuel changes from the liquid to the gas state, and there is conjugate heat transfer in the pipe walls as well as between the solid and fluid regions. The simulation is also tedious and time-consuming. Overall, this is a complex system, and the objective of the work is the students' involvement in the parametric study and optimization of numerical modeling toward the design of such a system. The students first have to become familiar with the physical process, the related mathematics, and the numerical algorithm. The work involves exploring (i) improved algorithms to make the transient simulation computationally efficient (reduced CPU time) and (ii) a parametric study to evaluate design parameters by changing the operational conditions.
Lavrentieva, A.; Kontou, P.; Soulountsi, V.; Kioumis, J.; Chrysou, O.; Bitzani, M.
2015-01-01
Summary The purpose of this study was to examine the hypothesis that an algorithm based on serial measurements of procalcitonin (PCT) allows reduction in the duration of antibiotic therapy compared with empirical rules, and does not result in more adverse outcomes in burn patients with infectious complications. All burn patients requiring antibiotic therapy based on confirmed or highly suspected bacterial infections were eligible. Patients were assigned to either a procalcitonin-guided (study group) or a standard (control group) antibiotic regimen. The following variables were analyzed and compared in both groups: duration of antibiotic treatment, mortality rate, percentage of patients with relapse or superinfection, maximum SOFA score (days 1-28), length of ICU and hospital stay. A total of 46 Burn ICU patients receiving antibiotic therapy were enrolled in this study. In 24 patients antibiotic therapy was guided by daily procalcitonin and clinical assessment. PCT guidance resulted in a smaller antibiotic exposure (10.1±4 vs. 15.3±8 days, p=0.034) without negative effects on clinical outcome characteristics such as mortality rate, percentage of patients with relapse or superinfection, maximum SOFA score, length of ICU and hospital stay. The findings thus show that use of a procalcitonin-guided algorithm for antibiotic therapy in the burn intensive care unit may contribute to the reduction of antibiotic exposure without compromising clinical outcome parameters. PMID:27279801
NBA-Palm: prediction of palmitoylation site implemented in Naïve Bayes algorithm
Xue, Yu; Chen, Hu; Jin, Changjiang; Sun, Zhirong; Yao, Xuebiao
2006-01-01
Background Protein palmitoylation, an essential and reversible post-translational modification (PTM), has been implicated in cellular dynamics and plasticity. Although numerous experimental studies have been performed to explore the molecular mechanisms underlying palmitoylation processes, the intrinsic feature of substrate specificity has remained elusive. Thus, computational approaches for palmitoylation prediction are highly desirable for further experimental design. Results In this work, we present NBA-Palm, a novel computational method based on the Naïve Bayes algorithm for prediction of palmitoylation sites. The training data is curated from the scientific literature (PubMed) and includes 245 palmitoylated sites from 105 distinct proteins after redundancy elimination. The proper window length for a potential palmitoylated peptide is optimized as six. To evaluate the prediction performance of NBA-Palm, 3-fold cross-validation, 8-fold cross-validation and Jack-Knife validation have been carried out. Prediction accuracies reach 85.79% for 3-fold cross-validation, 86.72% for 8-fold cross-validation and 86.74% for Jack-Knife validation. Two more algorithms, the RBF network and the support vector machine (SVM), have also been employed and compared with NBA-Palm. Conclusion Taken together, our analyses demonstrate that NBA-Palm is a useful computational program that provides insights for further experimentation. The accuracy of NBA-Palm is comparable with our previously described tool CSS-Palm. NBA-Palm is freely accessible from: . PMID:17044919
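The per-position Naïve Bayes scoring that a predictor like NBA-Palm describes can be sketched as follows. This is a minimal illustration, not the published implementation: the toy peptide windows, the Laplace smoothing constant, and the function names are all assumptions.

```python
from collections import defaultdict
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def train_nb(windows, labels, alpha=1.0):
    """Estimate per-position residue likelihoods with Laplace smoothing."""
    counts = {y: defaultdict(lambda: defaultdict(float)) for y in set(labels)}
    priors = defaultdict(float)
    for w, y in zip(windows, labels):
        priors[y] += 1
        for i, aa in enumerate(w):
            counts[y][i][aa] += 1
    model = {}
    for y in counts:
        pos_probs = []
        for i in range(len(windows[0])):
            total = sum(counts[y][i].values())
            pos_probs.append({aa: (counts[y][i][aa] + alpha) /
                                  (total + alpha * len(ALPHABET))
                              for aa in ALPHABET})
        model[y] = (math.log(priors[y] / len(labels)), pos_probs)
    return model

def predict_nb(model, w):
    """Return the class with the highest posterior log-score for window w."""
    def score(y):
        logprior, pos_probs = model[y]
        return logprior + sum(math.log(pos_probs[i][aa])
                              for i, aa in enumerate(w))
    return max(model, key=score)
```

A call such as `train_nb(["CCAAAA", "GGTTTT"], [1, 0])` builds the model; `predict_nb(model, "CCAAAG")` then scores a new six-residue window.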
NBA-Palm: prediction of palmitoylation site implemented in Naïve Bayes algorithm.
Xue, Yu; Chen, Hu; Jin, Changjiang; Sun, Zhirong; Yao, Xuebiao
2006-10-17
Protein palmitoylation, an essential and reversible post-translational modification (PTM), has been implicated in cellular dynamics and plasticity. Although numerous experimental studies have been performed to explore the molecular mechanisms underlying palmitoylation processes, the intrinsic feature of substrate specificity has remained elusive. Thus, computational approaches for palmitoylation prediction are highly desirable for further experimental design. In this work, we present NBA-Palm, a novel computational method based on the Naïve Bayes algorithm for prediction of palmitoylation sites. The training data is curated from the scientific literature (PubMed) and includes 245 palmitoylated sites from 105 distinct proteins after redundancy elimination. The proper window length for a potential palmitoylated peptide is optimized as six. To evaluate the prediction performance of NBA-Palm, 3-fold cross-validation, 8-fold cross-validation and Jack-Knife validation have been carried out. Prediction accuracies reach 85.79% for 3-fold cross-validation, 86.72% for 8-fold cross-validation and 86.74% for Jack-Knife validation. Two more algorithms, the RBF network and the support vector machine (SVM), have also been employed and compared with NBA-Palm. Taken together, our analyses demonstrate that NBA-Palm is a useful computational program that provides insights for further experimentation. The accuracy of NBA-Palm is comparable with our previously described tool CSS-Palm. NBA-Palm is freely accessible from: http://www.bioinfo.tsinghua.edu.cn/NBA-Palm.
2014-05-01
Development and Implementation of GPS Correlator Structures in MATLAB and Simulink with Focus on SDR Applications: Implementation of a Standard GPS Correlator Architecture (Baseline...)
NASA Astrophysics Data System (ADS)
Merk, D.; Zinner, T.
2013-02-01
In this paper a new detection scheme for convective initiation (CI) under day and night conditions is presented. The new algorithm combines the strengths of two existing methods for detecting CI with geostationary satellite data and uses the channels of the Spinning Enhanced Visible and Infrared Imager (SEVIRI) onboard Meteosat Second Generation (MSG). For the new algorithm five infrared (IR) criteria from the Satellite Convection Analysis and Tracking algorithm (SATCAST) and one high-resolution visible channel (HRV) criterion from Cb-TRAM were adapted. This set of criteria aims to identify the typical development of quickly developing convective cells in an early stage. The different criteria include time trends of the 10.8 IR channel and IR channel differences, as well as their time trends. To provide the trend fields an optical-flow-based method is used, the pyramidal matching algorithm, which is part of Cb-TRAM. The new detection scheme is implemented in Cb-TRAM and is verified for seven days which comprise different weather situations in central Europe. Contrasted with the original early-stage detection scheme of Cb-TRAM, skill scores are provided. From the comparison against detections of later thunderstorm stages, which are also provided by Cb-TRAM, a decrease in false prior warnings (false alarm ratio) from 91 to 81% is presented, an increase of the critical success index from 7.4 to 12.7%, and a decrease of the BIAS from 320 to 146% for normal scan mode. Similar trends are found for rapid scan mode. Most obvious is the decline of false alarms found for synoptic conditions with upper cold air masses triggering convection.
NASA Astrophysics Data System (ADS)
Merk, D.; Zinner, T.
2013-08-01
In this paper a new detection scheme for convective initiation (CI) under day and night conditions is presented. The new algorithm combines the strengths of two existing methods for detecting CI with geostationary satellite data. It uses the channels of the Spinning Enhanced Visible and Infrared Imager (SEVIRI) onboard Meteosat Second Generation (MSG). For the new algorithm five infrared (IR) criteria from the Satellite Convection Analysis and Tracking algorithm (SATCAST) and one high-resolution visible channel (HRV) criteria from Cb-TRAM were adapted. This set of criteria aims to identify the typical development of quickly developing convective cells in an early stage. The different criteria include time trends of the 10.8 IR channel, and IR channel differences, as well as their time trends. To provide the trend fields an optical-flow-based method is used: the pyramidal matching algorithm, which is part of Cb-TRAM. The new detection scheme is implemented in Cb-TRAM, and is verified for seven days which comprise different weather situations in central Europe. Contrasted with the original early-stage detection scheme of Cb-TRAM, skill scores are provided. From the comparison against detections of later thunderstorm stages, which are also provided by Cb-TRAM, a decrease in false prior warnings (false alarm ratio) from 91 to 81% is presented, an increase of the critical success index from 7.4 to 12.7%, and a decrease of the BIAS from 320 to 146% for normal scan mode. Similar trends are found for rapid scan mode. Most obvious is the decline of false alarms found for the synoptic class "cold air" masses.
NASA Astrophysics Data System (ADS)
Lohmann, Timo
Electric sector models are powerful tools that guide policy makers and stakeholders. Long-term power generation expansion planning models are a prominent example and determine a capacity expansion for an existing power system over a long planning horizon. With the changes in the power industry away from monopolies and regulation, the focus of these models has shifted to competing electric companies maximizing their profit in a deregulated electricity market. In recent years, consumers have started to participate in demand response programs, actively influencing electricity load and price in the power system. We introduce a model that features investment and retirement decisions over a long planning horizon of more than 20 years, as well as an hourly representation of day-ahead electricity markets in which sellers of electricity face buyers. This combination makes our model both unique and challenging to solve. Decomposition algorithms, and especially Benders decomposition, can exploit the model structure. We present a novel method that can be seen as an alternative to generalized Benders decomposition and relies on dynamic linear overestimation. We prove its finite convergence and present computational results, demonstrating its superiority over traditional approaches. In certain special cases of our model, all necessary solution values in the decomposition algorithms can be calculated directly, and solving mathematical programming problems becomes entirely unnecessary. This leads to highly efficient algorithms that drastically outperform their mathematical-programming-based counterparts. Furthermore, we discuss the implementation of all tailored algorithms and the challenges from a modeling software developer's standpoint, providing an insider's look into the modeling language GAMS. Finally, we apply our model to the Texas power system and design two electricity policies motivated by the U.S. Environmental Protection Agency's recently proposed CO2 emissions targets for the
Implementation of Monte Carlo Tree Search (MCTS) Algorithm in COMBATXXI using JDAFS
2014-07-31
TRAC-MRO, 255 Sedgewick Rd, Ft Leavenworth, KS. Distribution Statement: Approved for public release; distribution is unlimited. The TRADOC Analysis Center - Methods and Research Office (TRAC-MRO) sponsored this iteration in an attempt to test the feasibility of implementing the work completed in FY13.
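A minimal sketch of Monte Carlo Tree Search with UCT selection, the core of the algorithm named in this record. A toy take-1-or-2 Nim game stands in for a combat-simulation decision; the game, constants, and names are illustrative assumptions, not COMBATXXI/JDAFS code.

```python
import math, random

def moves(n):
    """Legal moves: take 1 or 2 stones; taking the last stone wins."""
    return [m for m in (1, 2) if m <= n]

def rollout(n, player):
    """Random playout; returns the winner (the player who took the last stone)."""
    while n > 0:
        n -= random.choice(moves(n))
        player = 1 - player
    return 1 - player

class Node:
    def __init__(self, n, player, parent=None, move=None):
        self.n, self.player = n, player
        self.parent, self.move = parent, move
        self.children, self.visits, self.wins = [], 0, 0.0
        self.untried = moves(n)

def uct_search(root_n, iters=3000, c=1.4):
    root = Node(root_n, player=0)
    for _ in range(iters):
        node = root
        # 1) selection: descend via the UCB1 rule until an expandable node
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch:
                       ch.wins / ch.visits +
                       c * math.sqrt(math.log(node.visits) / ch.visits))
        # 2) expansion: add one untried child
        if node.untried:
            m = node.untried.pop()
            node.children.append(Node(node.n - m, 1 - node.player, node, m))
            node = node.children[-1]
        # 3) simulation: random playout from the new state
        winner = rollout(node.n, node.player)
        # 4) backpropagation: credit wins from the mover's (parent's) perspective
        while node:
            node.visits += 1
            if node.parent is not None and winner == node.parent.player:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move
```

For this game the losing positions are the multiples of 3, so from 4 stones the search should learn to take 1, and from 5 stones to take 2.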
Obeng, A Owusu; Kaszemacher, T; Abul-Husn, N S; Gottesman, O; Vega, A; Waite, E; Myers, K; Cho, J; Bottinger, E P; Ellis, S B; Scott, S A
2016-11-01
Implementation of pharmacogenetic-guided warfarin dosing has been hindered by inconsistent results from reported clinical trials and a lack of available algorithms that include alleles prevalent in non-white populations. However, current evidence indicates that algorithm-guided dosing is more accurate than empirical dosing. To facilitate multiethnic algorithm-guided warfarin dosing using preemptive genetic testing, we developed a strategy that accounts for the complexity of race, leverages electronic health records for algorithm variables, and deploys point-of-care dose recommendations. © 2016 American Society for Clinical Pharmacology and Therapeutics.
Algorithmic complexity for psychology: a user-friendly implementation of the coding theorem method.
Gauvrit, Nicolas; Singmann, Henrik; Soler-Toscano, Fernando; Zenil, Hector
2016-03-01
Kolmogorov-Chaitin complexity has long been believed to be impossible to approximate when it comes to short sequences (e.g. of length 5-50). However, with the newly developed coding theorem method the complexity of strings of length 2-11 can now be numerically estimated. We present the theoretical basis of algorithmic complexity for short strings (ACSS) and describe an R-package providing functions based on ACSS that will cover psychologists' needs and improve upon previous methods in three ways: (1) ACSS is now available not only for binary strings, but for strings based on up to 9 different symbols, (2) ACSS no longer requires time-consuming computing, and (3) a new approach based on ACSS gives access to an estimation of the complexity of strings of any length. Finally, three illustrative examples show how these tools can be applied to psychology.
FPGA-based real-time phase measuring profilometry algorithm design and implementation
NASA Astrophysics Data System (ADS)
Zhan, Guomin; Tang, Hongwei; Zhong, Kai; Li, Zhongwei; Shi, Yusheng
2016-11-01
Phase measuring profilometry (PMP) has been widely used in many fields, such as computer-aided verification (CAV) and flexible manufacturing systems (FMS). High frame-rate (HFR) real-time vision-based feedback control will be a common demand in the near future. However, the instruction time delay in the computer caused by numerous repetitive operations greatly limits the efficiency of data processing. FPGAs have the advantages of a pipelined architecture and parallel execution, making them well suited to the PMP algorithm. In this paper, we design a fully pipelined hardware architecture for PMP. The functions of the hardware architecture include rectification, phase calculation, phase shifting, and stereo matching. Experiments verified the performance of this method, and the factors that may influence the computational accuracy were analyzed.
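The phase-calculation stage at the heart of PMP recovers a wrapped phase from phase-shifted fringe images. A standard four-step version (images shifted by π/2) can be sketched as follows; this is a software reference in numpy, not the paper's FPGA pipeline, and the synthetic fringe parameters are assumptions.

```python
import numpy as np

def wrapped_phase(i0, i1, i2, i3):
    """Four-step phase shifting: I_k = A + B*cos(phi + k*pi/2).
    Since I3 - I1 = 2B*sin(phi) and I0 - I2 = 2B*cos(phi),
    the wrapped phase in (-pi, pi] is atan2(I3 - I1, I0 - I2)."""
    return np.arctan2(i3 - i1, i0 - i2)

# synthetic fringes with a known ground-truth phase
phi = np.linspace(-3, 3, 100)            # stays inside (-pi, pi): no wrapping
A, B = 0.5, 0.4                          # background and modulation amplitude
imgs = [A + B * np.cos(phi + k * np.pi / 2) for k in range(4)]
est = wrapped_phase(*imgs)
```

On an FPGA the `arctan2` is typically realized with a CORDIC unit; the arithmetic above is otherwise the same per pixel.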
Alteration of Box-Jenkins methodology by implementing genetic algorithm method
NASA Astrophysics Data System (ADS)
Ismail, Zuhaimy; Maarof, Mohd Zulariffin Md; Fadzli, Mohammad
2015-02-01
A time series is a set of values sequentially observed through time. The Box-Jenkins methodology is a systematic method of identifying, fitting, checking, and using integrated autoregressive moving average (ARIMA) time series models for forecasting. The Box-Jenkins method is appropriate for medium-to-long time series (at least 50 observations). When modeling such series, the difficulty lies in choosing the accurate model order at the identification stage and in finding the right parameter estimates. This paper presents the development of a Genetic Algorithm heuristic to solve the identification and estimation problems in Box-Jenkins modeling. Data on international tourist arrivals to Malaysia were used to illustrate the effectiveness of the proposed method. The forecasts generated by the proposed model outperformed the single traditional Box-Jenkins model.
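The idea of using a genetic algorithm for Box-Jenkins order identification can be sketched with a tiny integer GA over (p, d, q). The stand-in fitness below plays the role of an information criterion (e.g., the AIC of a fitted ARIMA model); the bounds, operators, and the toy optimum are illustrative assumptions, not the authors' design.

```python
import random

def ga_minimize(fitness, bounds, pop_size=30, gens=50, pmut=0.3, seed=1):
    """Tiny integer GA: each chromosome is a tuple of orders, e.g. (p, d, q).
    Truncation selection keeps the better half (elitism), children come from
    one-point crossover of two elite parents plus optional point mutation."""
    rng = random.Random(seed)
    def rand_ind():
        return tuple(rng.randint(lo, hi) for lo, hi in bounds)
    pop = [rand_ind() for _ in range(pop_size)]
    for _ in range(gens):
        elite = sorted(pop, key=fitness)[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(elite):
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, len(bounds))         # one-point crossover
            child = list(a[:cut] + b[cut:])
            if rng.random() < pmut:                     # point mutation
                i = rng.randrange(len(bounds))
                child[i] = rng.randint(*bounds[i])
            children.append(tuple(child))
        pop = elite + children
    return min(pop, key=fitness)

# stand-in fitness: pretend the information criterion is minimized at (2, 1, 1)
best = ga_minimize(lambda x: (x[0] - 2) ** 2 + (x[1] - 1) ** 2 + (x[2] - 1) ** 2,
                   bounds=[(0, 5), (0, 2), (0, 5)])
```

In a real application, `fitness` would fit an ARIMA(p, d, q) model to the series and return its AIC or BIC.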
NASA Technical Reports Server (NTRS)
Mandy, Christophe P.; Sakamoto, Hiraku; Saenz-Otero, Alvar; Miller, David W.
2007-01-01
The MIT Space Systems Laboratory developed the Synchronized Position Hold Engage and Reorient Experimental Satellites (SPHERES) as a risk-tolerant spaceborne facility to develop and mature control, estimation, and autonomy algorithms for distributed satellite systems for applications such as satellite formation flight. Tests performed study interferometric mission-type formation flight maneuvers in deep space. These tests consist of having the satellites trace a coordinated trajectory under tight control that would allow simulated apertures to constructively interfere observed light and measure the resulting increase in angular resolution. This paper focuses on formation initialization (establishment of a formation using limited-field-of-view relative sensors), formation coordination (synchronization of the different satellites' motion), and fuel balancing among the different satellites.
Pre-Hardware Optimization of Spacecraft Image Processing Algorithms and Hardware Implementation
NASA Technical Reports Server (NTRS)
Kizhner, Semion; Petrick, David J.; Flatley, Thomas P.; Hestnes, Phyllis; Jentoft-Nilsen, Marit; Day, John H. (Technical Monitor)
2002-01-01
Spacecraft telemetry rates and telemetry product complexity have steadily increased over the last decade presenting a problem for real-time processing by ground facilities. This paper proposes a solution to a related problem for the Geostationary Operational Environmental Spacecraft (GOES-8) image data processing and color picture generation application. Although large super-computer facilities are the obvious heritage solution, they are very costly, making it imperative to seek a feasible alternative engineering solution at a fraction of the cost. The proposed solution is based on a Personal Computer (PC) platform and synergy of optimized software algorithms, and reconfigurable computing hardware (RC) technologies, such as Field Programmable Gate Arrays (FPGA) and Digital Signal Processors (DSP). It has been shown that this approach can provide superior inexpensive performance for a chosen application on the ground station or on-board a spacecraft.
Numerical implementation of the convexification algorithm for an optical diffusion tomograph
NASA Astrophysics Data System (ADS)
Shan, Hua; Klibanov, Michael V.; Liu, Hanli; Pantong, Natee; Su, Jianzhong
2008-04-01
A globally convergent (the so-called convexification) algorithm was previously developed for coefficient inverse problems (CIPs) with the time/frequency-dependent data. In this publication the convexification is extended to the case of a CIP for an elliptic equation with the data generated by the source running along a straight line. The data are incomplete, since they are given only at a part of the boundary. Applications to both electrical impedance and optical tomographies are feasible, which include, in particular, imaging of land mines and underground bunkers, as well as diffuse optical imaging of targets on battlefields through smogs and flames. However, our numerical setup is intended for medical applications to small animals. Numerical experiments in the 2D case are presented.
VLSI implementation of a new LMS-based algorithm for noise removal in ECG signal
NASA Astrophysics Data System (ADS)
Satheeskumaran, S.; Sabrigiriraj, M.
2016-06-01
Least mean square (LMS)-based adaptive filters are widely deployed for removing artefacts in the electrocardiogram (ECG) because of their low computational cost. However, they possess a high mean square error (MSE) in noisy environments. The transform-domain variable step-size LMS algorithm reduces the MSE at the cost of computational complexity. In this paper, a variable step-size delayed LMS adaptive filter is used to remove artefacts from the ECG signal for improved feature extraction. Dedicated digital signal processors provide fast processing, but they are not flexible. By using field programmable gate arrays, pipelined architectures can be used to enhance system performance. The pipelined architecture can enhance the operating efficiency of the adaptive filter and save power consumption. This technique provides a high signal-to-noise ratio and low MSE with reduced computational complexity; hence, it is a useful method for monitoring patients with heart-related problems.
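A plain fixed step-size LMS noise canceller illustrates the principle behind the adaptive filters discussed above (the paper's variable step-size delayed variant adds a step-size schedule and pipeline delays on top of this). The signal model and constants below are assumptions for the sketch.

```python
import numpy as np

def lms(ref, primary, taps=8, mu=0.005):
    """Adaptive noise canceller: the filtered reference tracks the noise
    in the primary input, and the error output is the cleaned signal."""
    w = np.zeros(taps)
    out = np.zeros(len(primary))
    for n in range(taps - 1, len(primary)):
        x = ref[n - taps + 1:n + 1][::-1]   # newest reference sample first
        e = primary[n] - w @ x              # error = primary - noise estimate
        w += 2 * mu * e * x                 # LMS weight update
        out[n] = e
    return out

rng = np.random.default_rng(0)
t = np.arange(4000)
clean = np.sin(2 * np.pi * t / 200)          # slow "ECG-like" component
ref = rng.standard_normal(t.size)            # reference noise input
# primary = clean signal + FIR-coloured version of the reference noise
primary = clean + np.convolve(ref, [0.6, -0.3, 0.2])[:t.size]
cleaned = lms(ref, primary)
```

After convergence the filter weights approximate the colouring FIR, so the error output approaches the clean component.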
FPGA-based genetic algorithm implementation for AC chopper fed induction motor
NASA Astrophysics Data System (ADS)
Mahendran, S.; Gnanambal, I.; Maheswari, A.
2016-12-01
A genetic algorithm (GA)-based harmonic elimination technique is proposed for designing an AC chopper. The GA is used to calculate optimal firing angles that eliminate lower-order harmonics in the output voltage. The total harmonic distortion of the output voltage is taken as the fitness function in the GA; thus, the load ratings need not be known to calculate the switching angles with the proposed technique. For performance assessment of the GA, the Newton-Raphson (NR) method is applied in this work. Simulation results show that the proposed technique is better in terms of lower computational complexity and quick convergence. The simulation results were verified with a field programmable gate array controller-based prototype. The simulation study and experimental investigations show that the proposed GA method is superior to the conventional methods.
NASA Astrophysics Data System (ADS)
Sun, Hong; Wu, Qian-zhong
2013-09-01
In order to improve the precision of an optical-electric tracking device, an improved MEMS-based design is proposed that addresses the tracking error and random drift of the gyroscope sensor. Following the principles of time-series analysis of random sequences, an AR model of the gyro random error is established and the gyro output signals are repeatedly filtered with a Kalman filter. An ARM microcontroller drives the servo motor under a fuzzy PID full-closed-loop control algorithm, with lead-correction and feed-forward links added to reduce the response lag to angle inputs: feed-forward makes the output follow the input closely, while the lead-compensation link shortens the response to input signals and thereby reduces errors. A wireless video monitoring module and remote monitoring software (Visual Basic 6.0) monitor the servo motor state in real time: the video module gathers video signals and transmits them wirelessly to the host computer, which displays the motor's running state in a Visual Basic 6.0 window. The main error sources are analyzed in detail; a quantitative analysis of the errors from the bandwidth and the gyro sensor makes the contribution of each error to the whole more intuitive and, consequently, helps decrease the system error. Simulation and experimental results show that the system has good tracking characteristics, making it valuable for engineering applications.
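The gyro-drift filtering described above pairs an AR model of the random error with a Kalman filter. A scalar AR(1) version can be sketched as follows; the model parameters (a, q, r) are illustrative assumptions, not identified from real gyro data.

```python
import numpy as np

def kalman_ar1(z, a, q, r):
    """Scalar Kalman filter for an AR(1) drift model:
    x[k+1] = a*x[k] + w,  w ~ N(0, q);   z[k] = x[k] + v,  v ~ N(0, r)."""
    xhat, p = 0.0, 1.0
    est = np.empty_like(z)
    for k, zk in enumerate(z):
        # predict through the AR(1) dynamics
        xhat, p = a * xhat, a * a * p + q
        # update with the noisy measurement
        K = p / (p + r)
        xhat += K * (zk - xhat)
        p *= (1 - K)
        est[k] = xhat
    return est

rng = np.random.default_rng(1)
a, q, r, n = 0.95, 0.01, 0.5, 2000
x = np.zeros(n)                         # true AR(1) drift
for k in range(1, n):
    x[k] = a * x[k - 1] + rng.normal(0, np.sqrt(q))
z = x + rng.normal(0, np.sqrt(r), n)    # noisy gyro output
est = kalman_ar1(z, a, q, r)
```

The filtered estimate should track the drift with substantially lower error than the raw measurements.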
A new algorithm for determining 3D biplane imaging geometry: theory and implementation
NASA Astrophysics Data System (ADS)
Singh, Vikas; Xu, Jinhui; Hoffmann, Kenneth R.; Xu, Guang; Chen, Zhenming; Gopal, Anant
2005-04-01
Biplane imaging is a primary method for visual and quantitative assessment of the vasculature. A key problem in this method, called the Imaging Geometry Determination problem (IGD for short), is to determine the rotation matrix R and the translation vector t which relate the two coordinate systems. In this paper, we propose a new approach, called IG-Sieving, to calculate R and t using corresponding points in the two images. Our technique first generates an initial estimate of R and t from the gantry angles of the imaging system, and then optimizes them by solving an optimal-cell-search problem in a 6-D parametric space (three variables defining R plus the three variables of t). To efficiently find the optimal imaging geometry (IG) in 6-D, our approach divides the high-dimensional search domain into a set of lower-dimensional regions, thereby reducing the optimal-cell-search problem to a set of optimization problems in 3D sub-spaces. For each such sub-space, our approach first applies efficient computational geometry techniques to identify "possibly-feasible" IGs, and then uses a criterion we call fall-in-number to sieve out good IGs. We show that in a bounded number of optimization steps, a (possibly infinite) set of near-optimal IGs can be determined. Simulation results indicate that our method can reconstruct 3D points with average 3D center-of-mass errors of about 0.8 cm for input image-data errors as high as 0.1 cm. More importantly, our algorithm provides a novel insight into the geometric structure of the solution space, which could be exploited to significantly improve the accuracy of other biplane algorithms.
Implementation and optimization of ultrasound signal processing algorithms on mobile GPU
NASA Astrophysics Data System (ADS)
Kong, Woo Kyu; Lee, Wooyoul; Kim, Kyu Cheol; Yoo, Yangmo; Song, Tai-Kyong
2014-03-01
A general-purpose graphics processing unit (GPGPU) has been used for improving computing power in medical ultrasound imaging systems. Recently, mobile GPUs have become powerful enough to handle 3D games and videos at high frame rates on Full HD or HD resolution displays. This paper proposes a method to implement ultrasound signal processing on a mobile GPU available in a high-end smartphone (Galaxy S4, Samsung Electronics, Seoul, Korea) with programmable shaders on the OpenGL ES 2.0 platform. To maximize the performance of the mobile GPU, the shader design was optimized and the load was shared between the vertex and fragment shaders. The beamformed data were captured from a tissue-mimicking phantom (Model 539 Multipurpose Phantom, ATS Laboratories, Inc., Bridgeport, CT, USA) by using a commercial ultrasound imaging system equipped with a research package (Ultrasonix Touch, Ultrasonix, Richmond, BC, Canada). The real-time performance is evaluated by frame rates while varying the range of signal processing blocks. The implementation of ultrasound signal processing on OpenGL ES 2.0 was verified by analyzing the PSNR against a MATLAB gold standard with the same signal path. The CNR was also analyzed to verify the method. From the evaluations, the proposed mobile GPU-based processing method shows no significant difference from the MATLAB processing (i.e., PSNR<52.51 dB). Comparable CNR results were obtained from both processing methods (i.e., 11.31). With the mobile GPU implementation, frame rates of 57.6 Hz were achieved. The total execution time was 17.4 ms, which was faster than the acquisition time (i.e., 34.4 ms). These results indicate that the mobile GPU-based processing method can support real-time ultrasound B-mode processing on a smartphone.
Implementation of ILLIAC 4 algorithms for multispectral image interpretation. [earth resources data
NASA Technical Reports Server (NTRS)
Ray, R. M.; Thomas, J. D.; Donovan, W. E.; Swain, P. H.
1974-01-01
Research has focused on the design and partial implementation of a comprehensive ILLIAC software system for computer-assisted interpretation of multispectral earth resources data such as that now collected by the Earth Resources Technology Satellite. Research suggests generally that the ILLIAC 4 should be as much as two orders of magnitude more cost effective than serial processing computers for digital interpretation of ERTS imagery via multivariate statistical classification techniques. The potential of the ARPA Network as a mechanism for interfacing geographically-dispersed users to an ILLIAC 4 image processing facility is discussed.
NASA Technical Reports Server (NTRS)
Maine, R. E.; Iliff, K. W.
1980-01-01
A new formulation is proposed for the problem of parameter estimation of dynamic systems with both process and measurement noise. The formulation gives estimates that are maximum likelihood asymptotically in time. The means used to overcome the difficulties encountered by previous formulations are discussed. It is then shown how the proposed formulation can be efficiently implemented in a computer program. A computer program using the proposed formulation is available in a form suitable for routine application. Examples with simulated and real data are given to illustrate that the program works well.
NASA Astrophysics Data System (ADS)
Cieszyńska, Agata; Śliwińska-Wilczewska, Sylwia
2017-04-01
mixtures of conditions were applied in the laboratory experiments. The results of these experiments were the foundation for creating a picocyanobacteria life-cycle algorithm, the pico-bioalgorithm. The algorithm is based on the Ecological Regional Ocean Model formulas for functional phytoplankton groups. Accordingly, the pico-bioalgorithm includes the dependence on water temperature and salinity and on nutrient availability, along with coefficients determining the mortality of picoplankton cells and coefficients for respiration and growth rates. To prescribe the limiting properties, a modified Michaelis-Menten formula with squared arguments is used as the limiting function. Picoplanktonic organisms are very specific and can live in environments that might at first be deemed impossible for such organisms to survive in. The issue of picoplanktonic species inhabiting the Baltic Sea needs to be explored in detail, and the present study and proposed algorithm are an important step in this scientific exploration. This work has been funded by the National Centre of Science project (contract number: 2012/07/N/ST10/03485) entitled "Improved understanding of phytoplankton blooms in the Baltic Sea based on numerical models and existing data sets". The author (AC) received funding from the National Centre of Science doctoral scholarship program (contract number: 2016/20/T/ST10/00214).
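The modified Michaelis-Menten limiting function with squared arguments mentioned above has a simple closed form; a sketch follows, with the symbol names (S for the limiting resource, K for the half-saturation constant) assumed for illustration.

```python
def limiter(s, k):
    """Michaelis-Menten limitation with squared arguments:
    f(S) = S^2 / (K^2 + S^2), rising from 0 toward 1 as the resource S grows,
    with f(K) = 0.5 at the half-saturation point.  The squared form gives a
    steeper (sigmoidal) response than the classical S / (K + S)."""
    return s * s / (k * k + s * s)
```

In a growth model such a limiter would multiply the maximum growth rate, one factor per limiting resource (nutrients, temperature, salinity).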
NASA Astrophysics Data System (ADS)
Hatzaki, Maria; Flocas, Elena A.; Simmonds, Ian; Kouroutzoglou, John; Keay, Kevin; Rudeva, Irina
2013-04-01
Migratory cyclones and anticyclones mainly account for the short-term weather variations in extra-tropical regions. In contrast to cyclones, which have drawn major scientific attention due to their direct link to active weather and precipitation, climatological studies on anticyclones are limited, even though they too are associated with extreme weather phenomena and play an important role in global and regional climate. This is especially true for the Mediterranean, a region particularly vulnerable to climate change, where the little research that has been done is essentially confined to the manual analysis of synoptic charts. To construct a comprehensive climatology of migratory anticyclonic systems in the Mediterranean using an objective methodology, the Melbourne University automatic tracking algorithm is applied to the ERA-Interim reanalysis mean sea level pressure database. The algorithm's reliability in accurately capturing the weather patterns and synoptic climatology of the transient activity has been widely proven. The algorithm has been extensively applied to cyclone studies worldwide, and it has also been successfully applied to the Mediterranean, though its use for anticyclone tracking has been limited to the Southern Hemisphere. In this study the performance of the tracking algorithm under different data resolutions and different choices of parameter settings in the scheme is examined. Our focus is on the appropriate modification of the algorithm to efficiently capture the individual characteristics of the anticyclonic tracks in the Mediterranean, a closed basin with complex topography. We show that the number of detected anticyclonic centers and the resulting tracks largely depend upon the data resolution and the search radius. We also find that different-scale anticyclones and secondary centers that lie within larger anticyclone structures can be adequately represented; this is important, since the extensions of major
Describes the set of activities that ensure control strategies are put into effect and that air quality goals and standards are met, as well as permitting programs and additional resources related to implementation under the Clean Air Act.
NASA Astrophysics Data System (ADS)
Balick, Lee K.; Hirsch, Karen L.; McLachlan, Peter M.; Borel, Christoph C.; Clodius, William B.; Villeneuve, Pierre V.
2000-11-01
The Multispectral Thermal Imager (MTI) is a satellite system developed by the DoE. It has 10 spectral bands in the reflectance domain and 5 in the thermal IR. It is pointable and, at nadir, provides 5m IFOV in four visible and short near IR bands and 20m IFOV at longer wavelengths. Several of the bands in the reflectance domain were designed to enable quantitative compensation for aerosol effects and water vapor (daytime). These include 3 bands in and adjacent to the 940nm water vapor feature, a band at 1380nm for cirrus cloud detection and a SWIR band with small atmospheric effects. The concepts and development of these techniques have been described in detail at previous SPIE conferences and in journals. This paper describes the adaptation of these algorithms to the MTI automated processing pipeline (standardized level 2 products) for retrieval of aerosol optical depth (and subsequent compensation of reflectance bands for calibration to reflectance) and the atmospheric water vapor content (thermal IR compensation). Input data sources and flow are described. Validation results are presented. Pre-launch validation was performed using images from the NASA AVIRIS hyperspectral imaging sensor flown in the stratosphere on NASA ER-2 aircraft compared to ground based sun photometer and radiosonde measurements from different sources. These data sets span a range of environmental conditions.
Pérez-Marín, D; Garrido-Varo, A; Guerrero, J E
2005-01-01
Seven thousand four hundred and twenty-three compound feed samples were used to develop near-infrared (NIR) calibrations for predicting the percentage of each ingredient used in the manufacture of a given compound feedingstuff. Spectra were collected at 2 nm increments using a FOSS NIRSystems 5000 monochromator. The reference data used for each ingredient percentage were those declared in the formula for each feedingstuff. Two chemometric tools for developing NIRS prediction models were compared: the so-called GLOBAL MPLS (modified partial least squares), traditionally used in developing NIRS applications, and the more recently developed calibration strategy known as LOCAL. The LOCAL procedure is designed to select, from a large database, samples with spectra resembling the sample being analyzed. Selected samples are used as calibration sets to develop specific MPLS equations for predicting each unknown sample. For all predicted ingredients, LOCAL calibrations resulted in a significant improvement in both standard error of prediction (SEP) and bias values compared with GLOBAL calibrations. Determination coefficient values (r(2)) also improved using the LOCAL strategy, exceeding 0.90 for most ingredients. Use of the LOCAL algorithm for calibration thus proved valuable in minimizing the errors in NIRS calibration equations for predicting a parameter as complex as the percentage of each ingredient in compound feedingstuffs.
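The LOCAL strategy described above, selecting spectrally similar calibration samples from a large database and fitting a model on just those, can be sketched as follows. Plain least squares stands in for the modified PLS regression used with NIRS data, and the distance measure, synthetic data, and names are assumptions.

```python
import numpy as np

def local_predict(X, y, x_new, k=50):
    """LOCAL-style prediction: pick the k calibration spectra closest to the
    query (Euclidean distance here; spectral correlation is also common) and
    fit a linear model on just those samples."""
    d = np.linalg.norm(X - x_new, axis=1)
    idx = np.argsort(d)[:k]                       # k nearest calibration samples
    A = np.column_stack([np.ones(k), X[idx]])     # intercept + local predictors
    coef, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
    return float(np.concatenate(([1.0], x_new)) @ coef)

# synthetic "spectra" with a nonlinear ingredient response, so a single global
# linear model would be biased while local fits adapt to each neighbourhood
rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, (2000, 3))
y = np.sin(X[:, 0]) + X[:, 1] ** 2
x_new = np.array([0.2, 0.3, -0.1])
pred = local_predict(X, y, x_new, k=50)
```

Each unknown sample gets its own calibration subset and its own fitted equation, which is the essence of the LOCAL approach.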
NASA Astrophysics Data System (ADS)
Baba, Toshitaka; Takahashi, Narumi; Kaneda, Yoshiyuki; Ando, Kazuto; Matsuoka, Daisuke; Kato, Toshihiro
2015-12-01
Because of improvements in offshore tsunami observation technology, dispersion phenomena during tsunami propagation have often been observed in recent tsunamis, for example the 2004 Indian Ocean and 2011 Tohoku tsunamis. The dispersive propagation of tsunamis can be simulated with the Boussinesq model, but the model demands substantial computational resources. However, rapid progress has been made in parallel computing technology. In this study, we investigated a parallelized approach for dispersive tsunami wave modeling. Our new parallel software solves the nonlinear Boussinesq dispersive equations in spherical coordinates. A variable nested algorithm was used to increase spatial resolution in the target region. The software can also be used to predict tsunami inundation on land. We used the dispersive tsunami model to simulate the 2011 Tohoku earthquake on the K supercomputer. Good agreement was apparent between the dispersive wave model results and the tsunami waveforms observed offshore. The finest bathymetric grid interval was 2/9 arcsec (approx. 5 m) along longitude and latitude lines. This grid made it possible to simulate tsunami soliton fission near the Sendai coast. Incorporating the three-dimensional shape of buildings and structures led to improved modeling of tsunami inundation.
Implementation of the Canny Edge Detection algorithm for a stereo vision system
Wang, J.R.; Davis, T.A.; Lee, G.K.
1996-12-31
There exist many applications in which three-dimensional information is necessary. For example, in manufacturing systems, parts inspection may require the extraction of three-dimensional information from two-dimensional images through the use of a stereo vision system. In medical applications, one may wish to reconstruct a three-dimensional image of a human organ from two or more transducer images. An important component of three-dimensional reconstruction is edge detection, whereby an image boundary is separated from the background for further processing. In this paper, a modification of the Canny Edge Detection approach is suggested to extract an image from a cluttered background. The resulting cleaned image can then be sent to the image matching, interpolation and inverse perspective transformation blocks to reconstruct the 3-D scene. A brief discussion of the stereo vision system that has been developed at the Mars Mission Research Center (MMRC) is also presented. Results of a version of the Canny Edge Detection algorithm show promise as an accurate edge extractor that may be used in the edge-pixel-based binocular stereo vision system.
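The classic stages of the Canny approach referenced above can be illustrated with a minimal sketch: gradient, magnitude, double threshold, and hysteresis linking. This is not the authors' modified detector; it omits Gaussian smoothing and non-maximum suppression for brevity, and all thresholds are illustrative.

```python
import numpy as np

def canny_sketch(img, low=0.1, high=0.3):
    """Minimal sketch of the Canny stages: gradient, magnitude,
    double threshold, and hysteresis linking (no non-max suppression)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    if mag.max() > 0:
        mag = mag / mag.max()
    strong = mag >= high
    weak = (mag >= low) & ~strong
    # Hysteresis: promote weak pixels 8-connected to already-accepted edges.
    edges = strong.copy()
    changed = True
    while changed:
        changed = False
        ys, xs = np.nonzero(weak & ~edges)
        for y, x in zip(ys, xs):
            y0, y1 = max(y - 1, 0), min(y + 2, img.shape[0])
            x0, x1 = max(x - 1, 0), min(x + 2, img.shape[1])
            if edges[y0:y1, x0:x1].any():
                edges[y, x] = True
                changed = True
    return edges

# A vertical step edge should be detected along the boundary columns.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = canny_sketch(img)
print(edges.any())
```

The hysteresis step is what lets weak edge fragments survive only when they connect to strong evidence, which is the property that helps separate a boundary from background clutter.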
Barley, Mark H; Turner, Nicolas J; Goodacre, Royston
2017-06-19
In directed evolution (DE) the assessment of candidate enzymes and their modification is essential. In this study we have investigated genetic algorithms (GAs) in this context and conducted a systematic study of the behavior of GAs on 20 fitness landscapes (FLs) of varying complexity. This has allowed the tuning of the GAs to be explored. On the basis of this study, recommendations for the best GA settings to use for a GA-directed high-throughput experimental program (in which populations and the number of generations are necessarily low) are reported. The FLs were based upon simple linear models and were characterized by the behavior of the GA on the landscape as demonstrated by stall plots and the footprints and adhesion of candidate solutions, which highlighted local optima (LOs). In order to maximize progress of the GA and to reduce the chances of becoming stuck in an LO it was best to use: 1) a large number of generations, 2) high populations, 3) removal of duplicate sequences (clones), 4) double mutation, and 5) high selection pressure (the two best individuals go to the next generation), and 6) to consider using a designed sequence as the starting point of the GA run. We believe that these recommendations might be appropriate starting points for studies employing GAs within DE experiments. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
NASA Astrophysics Data System (ADS)
Karabiber, Fethullah; Grassi, Giuseppe; Vecchio, Pietro; Arik, Sabri; Yalcin, M. Erhan
2011-01-01
Based on the cellular neural network (CNN) paradigm, the bio-inspired (bi-i) cellular vision system is a computing platform consisting of state-of-the-art sensing, cellular sensing-processing and digital signal processing. This paper presents the implementation of a novel CNN-based segmentation algorithm onto the bi-i system. The experimental results, carried out for different benchmark video sequences, highlight the feasibility of the approach, which provides a frame rate of about 26 frames/s. Comparisons with existing CNN-based methods show that, even though these methods are from two to six times faster than the proposed one, the conceived approach is more accurate and, consequently, represents a satisfying trade-off between real-time requirements and accuracy.
NASA Astrophysics Data System (ADS)
Arimbi, Mentari Dian; Bustamam, Alhadi; Lestari, Dian
2017-03-01
Data clustering can be executed through partition or hierarchical methods for many types of data, including DNA sequences. Both clustering methods can be combined by processing a partition algorithm in the first level and a hierarchical one in the second level, called hybrid clustering. In the partition phase, popular methods such as PAM, K-means, or Fuzzy c-means could be applied. In this study we selected partitioning around medoids (PAM) for our partition stage. Following the partition algorithm, in the hierarchical stage we applied the divisive analysis algorithm (DIANA) in order to obtain more specific cluster and sub-cluster structures. The number of main clusters is determined using the Davies-Bouldin Index (DBI): we choose the number of clusters that minimizes the DBI value. In this work, we conduct the clustering on 1252 HPV DNA sequences from GenBank. Characteristic extraction is performed first, followed by normalization and genetic distance calculation using the Euclidean distance. In our implementation, we used the hybrid PAM and DIANA approach in the R open source programming tool. We obtained 3 main clusters with an average DBI value of 0.979 using PAM in the first stage. After executing DIANA in the second stage, we obtained 4 sub-clusters for Cluster-1, 9 sub-clusters for Cluster-2 and 2 sub-clusters for Cluster-3, with DBI values of 0.972, 0.771, and 0.768 for each main cluster, respectively. Since the second stage produces lower DBI values than the first stage, we conclude that this hybrid approach can improve the accuracy of our clustering results.
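The two building blocks of the first stage, PAM partitioning and the Davies-Bouldin index, can be sketched as follows. This is a toy illustration with Euclidean distances on synthetic 2-D points, not the R implementation used in the study; the greedy medoid update and the parameters are illustrative.

```python
import numpy as np

def pam(X, k, n_iter=20, seed=0):
    """Toy partitioning-around-medoids (PAM): assign points to the
    nearest medoid, then move each medoid to the member minimizing
    total within-cluster distance."""
    rng = np.random.default_rng(seed)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    medoids = rng.choice(len(X), size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                within = D[np.ix_(members, members)].sum(axis=1)
                medoids[c] = members[np.argmin(within)]
    return np.argmin(D[:, medoids], axis=1)

def davies_bouldin(X, labels):
    """Davies-Bouldin index: lower values mean tighter, better-separated
    clusters, which is why the study minimizes it."""
    ks = np.unique(labels)
    cents = np.array([X[labels == c].mean(axis=0) for c in ks])
    S = np.array([np.linalg.norm(X[labels == c] - cents[i], axis=1).mean()
                  for i, c in enumerate(ks)])
    db = 0.0
    for i in range(len(ks)):
        ratios = [(S[i] + S[j]) / np.linalg.norm(cents[i] - cents[j])
                  for j in range(len(ks)) if j != i]
        db += max(ratios)
    return db / len(ks)

# Two well-separated blobs should give two clean clusters and a low DBI.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(10, 0.1, (20, 2))])
labels = pam(X, 2)
print(davies_bouldin(X, labels) < 0.5)
```

Sweeping k and keeping the value with the smallest DBI reproduces the model-selection rule described in the abstract.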
Komorkiewicz, Mateusz; Kryjak, Tomasz; Gorgon, Marek
2014-01-01
This article presents an efficient hardware implementation of the Horn-Schunck algorithm that can be used in an embedded optical flow sensor. An architecture is proposed that realises the iterative Horn-Schunck algorithm in a pipelined manner. This modification allows a data throughput of 175 Mpixels/s to be achieved and makes processing of a Full HD video stream (1920 × 1080 @ 60 fps) possible. The structure of the optical flow module as well as the pre- and post-filtering blocks and a flow reliability computation unit is described in detail. Three versions of the optical flow module, with different numerical precision, working frequency and result accuracy, are proposed. The errors caused by switching from floating- to fixed-point computations are also evaluated. The described architecture was tested on popular sequences from the optical flow dataset of Middlebury University. It achieves state-of-the-art results among hardware implementations of single-scale methods. The designed fixed-point architecture achieves a performance of 418 GOPS with a power efficiency of 34 GOPS/W. The proposed floating-point module achieves 103 GFLOPS, with a power efficiency of 24 GFLOPS/W. Moreover, a 100 times speedup compared to a modern CPU with SIMD support is reported. A complete, working vision system realized on a Xilinx VC707 evaluation board is also presented. It is able to compute optical flow for a Full HD video stream received from an HDMI camera in real-time. The obtained results prove that FPGA devices are an ideal platform for embedded vision systems. PMID:24526303
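The iterative update that the hardware pipelines can be sketched in software as a minimal single-scale Horn-Schunck loop. This only shows the update rule, not the fixed-point FPGA architecture; alpha, iteration count, and the toy input are illustrative.

```python
import numpy as np

def horn_schunck(I1, I2, alpha=1.0, n_iter=100):
    """Minimal Horn-Schunck: per-pixel flow (u, v) is pulled toward the
    neighbourhood average while satisfying brightness constancy,
    via Jacobi-style iterations."""
    I1, I2 = I1.astype(float), I2.astype(float)
    Iy, Ix = np.gradient(I1)
    It = I2 - I1
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    def nbr_mean(f):
        # 4-neighbour average (np.roll wraps around; fine for this toy).
        return (np.roll(f, 1, 0) + np.roll(f, -1, 0) +
                np.roll(f, 1, 1) + np.roll(f, -1, 1)) / 4.0
    denom = alpha ** 2 + Ix ** 2 + Iy ** 2
    for _ in range(n_iter):
        ub, vb = nbr_mean(u), nbr_mean(v)
        t = (Ix * ub + Iy * vb + It) / denom
        u = ub - Ix * t
        v = vb - Iy * t
    return u, v

# A bright block moving one pixel to the right between frames: the
# recovered horizontal flow should be positive on average.
I1 = np.zeros((16, 16)); I1[6:10, 4:8] = 1.0
I2 = np.zeros((16, 16)); I2[6:10, 5:9] = 1.0
u, v = horn_schunck(I1, I2)
print(u.mean() > 0)
```

The body of the loop is a handful of multiply-adds per pixel, which is what makes the algorithm amenable to the pipelined fixed-point implementation the paper describes.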
NASA Astrophysics Data System (ADS)
Cua, G. B.; Fischer, M.; Heimers, S.; Clinton, J. F.; Diehl, T.; Kaestli, P.; Becker, J.; Saul, J.
2011-12-01
The Virtual Seismologist (VS) earthquake early warning (EEW) methodology is a Bayesian approach to EEW, wherein the most probable source estimate at any given time is a combination of contributions from a likelihood function that evolves in response to incoming data from the on-going earthquake, and selected prior information, which can include factors such as network topology, the Gutenberg-Richter relationship or previously observed seismicity. The VS algorithm, implemented by the Swiss Seismological Service (SED) at ETH Zurich, is a fundamental component of the California Integrated Seismic Network (CISN) ShakeAlert system, and has been thoroughly tested in real-time at the Southern California Seismic Network since July 2008, and at the Northern California Seismic Network since February 2009. SeisComP3 (SC3) is a fully featured automated real-time earthquake monitoring software developed by GeoForschungsZentrum Potsdam in collaboration with commercial partner, gempa GmbH. It is becoming a community standard software for earthquake detection and waveform processing for regional and global networks across the globe, including at the SED. As part of efforts in the development of real-time seismology tools supported by the Network of European Research Infrastructures for Earthquake Risk Assessment and Mitigation (NERA), the VS EEW algorithm has been implemented within the SeisComP3 framework. We discuss the software design and development, as well as testing and performance evaluation on real-time and archived waveform data from the SED. The "VS in SC3" effort facilitates the seamless integration of real-time EEW within standard network processing at SED, as well as the distribution, installation, and real-time testing of the VS codes at various European networks, in particular, real-time test sites at Naples, Istanbul, Patras, and Iceland planned as part of FP7 project REAKT "Strategies and Tools for Real-Time Earthquake Risk Mitigation".
Implementation of the SU(2) Hamiltonian symmetry for the DMRG algorithm
NASA Astrophysics Data System (ADS)
Alvarez, Gonzalo
2012-10-01
In the Density Matrix Renormalization Group (DMRG) algorithm (White, 1992, 1993) [1,2], Hamiltonian symmetries play an important role. Using symmetries, the matrix representation of the Hamiltonian can be blocked. Diagonalizing each matrix block is more efficient than diagonalizing the original matrix. This paper explains how the DMRG++ code (Alvarez, 2009) [3] has been extended to handle the non-local SU(2) symmetry in a model-independent way. Improvements in CPU times compared to runs with only local symmetries are discussed for the one-orbital Hubbard model, and for a two-orbital Hubbard model for iron-based superconductors. The computational bottleneck of the algorithm and the use of shared-memory parallelization are also addressed.
Program summary
Program title: DMRG++
Catalog identifier: AEDJ_v2_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEDJ_v2_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Special license. See http://cpc.cs.qub.ac.uk/licence/AEDJ_v2_0.html
No. of lines in distributed program, including test data, etc.: 211560
No. of bytes in distributed program, including test data, etc.: 10572185
Distribution format: tar.gz
Programming language: C++
Computer: PC
Operating system: Multiplatform, tested on Linux
Has the code been vectorized or parallelized?: Yes. 1 to 8 processors with MPI, 2 to 4 cores with pthreads
RAM: 1 GB (256 MB is enough to run the included test)
Classification: 23
Catalog identifier of previous version: AEDJ_v1_0
Journal reference of previous version: Comput. Phys. Comm. 180 (2009) 1572
External routines: BLAS and LAPACK
Nature of problem: Strongly correlated electron systems display a broad range of important phenomena, and their study is a major area of research in condensed matter physics. In this context, model Hamiltonians are used to simulate the relevant interactions of a given compound, and the relevant degrees of freedom. These studies
2015-01-01
False negative docking outcomes for highly symmetric molecules are a barrier to the accurate evaluation of docking programs, scoring functions, and protocols. This work describes an implementation of a symmetry-corrected root-mean-square deviation (RMSD) method into the program DOCK based on the Hungarian algorithm for solving the minimum assignment problem, which dynamically assigns atom correspondence in molecules with symmetry. The algorithm adds only a trivial amount of computation time to the RMSD calculations and is shown to increase the reported overall docking success rate by approximately 5% when tested over 1043 receptor–ligand systems. For some families of protein systems the results are even more dramatic, with success rate increases up to 16.7%. Several additional applications of the method are also presented including as a pairwise similarity metric to compare molecules during de novo design, as a scoring function to rank-order virtual screening results, and for the analysis of trajectories from molecular dynamics simulation. The new method, including source code, is available to registered users of DOCK6 (http://dock.compbio.ucsf.edu). PMID:24410429
McDonnell, Mark D; Mohan, Ashutosh; Stricker, Christian
2013-01-01
The release of neurotransmitter vesicles after arrival of a pre-synaptic action potential (AP) at cortical synapses is known to be a stochastic process, as is the availability of vesicles for release. These processes are known to also depend on the recent history of AP arrivals, and this can be described in terms of time-varying probabilities of vesicle release. Mathematical models of such synaptic dynamics frequently are based only on the mean number of vesicles released by each pre-synaptic AP, since if it is assumed there are sufficiently many vesicle sites, then variance is small. However, it has been shown recently that variance across sites can be significant for neuron and network dynamics, and this suggests the potential importance of studying short-term plasticity using simulations that do generate trial-to-trial variability. Therefore, in this paper we study several well-known conceptual models for stochastic availability and release. We state explicitly the random variables that these models describe and propose efficient algorithms for accurately implementing stochastic simulations of these random variables in software or hardware. Our results are complemented by mathematical analysis and statement of pseudo-code algorithms. PMID:23675343
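One of the conceptual models of this kind, binomial release from a fixed pool of sites with stochastic recovery between APs, can be sketched directly. The parameter names and values below are illustrative, not taken from the paper.

```python
import math
import random

def simulate_release(n_sites=10, p_release=0.5, tau_recover=5.0,
                     ap_times=(0.0, 2.0, 4.0, 6.0, 8.0), seed=42):
    """Toy stochastic synapse: each site holds at most one vesicle,
    releases it with probability p_release on each AP, and refills
    during the inter-AP interval with probability 1 - exp(-dt/tau)."""
    rng = random.Random(seed)
    occupied = [True] * n_sites          # all sites start with a vesicle
    released_counts = []
    prev_t = None
    for t in ap_times:
        if prev_t is not None:           # recovery between APs
            p_refill = 1.0 - math.exp(-(t - prev_t) / tau_recover)
            occupied = [o or rng.random() < p_refill for o in occupied]
        released = 0
        for i in range(n_sites):
            if occupied[i] and rng.random() < p_release:
                occupied[i] = False      # vesicle consumed
                released += 1
        released_counts.append(released)
        prev_t = t
    return released_counts

counts = simulate_release()
print(counts)
```

Averaging the counts over many seeds recovers the mean-only description, while a single run exhibits the trial-to-trial variability the paper argues should be simulated explicitly.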
NASA Astrophysics Data System (ADS)
Baker, B. I.; Friberg, P. A.
2014-12-01
Modern seismic networks typically deploy three-component (3C) sensors, but still fail to utilize all of the information available in the seismograms when performing automated phase picking for real-time event location. In most cases a variation on a short-term over long-term average threshold detector is used for picking, and an association program is then used to assign phase types to the picks. However, the 3C waveforms from an earthquake contain an abundance of information related to the P and S phases in both their polarization and energy partitioning. An approach that has been overlooked and has demonstrated encouraging results is the Component Energy Comparison Method (CECM) of Nagano et al., published in Geophysics in 1989. CECM is well suited to real-time use because the calculation is not computationally intensive. Furthermore, the CECM method has fewer tuning variables (3) than traditional pickers in Earthworm such as the Rex Allen algorithm (N=18) or even the Anthony Lomax Filter Picker module (N=5). In addition to computing the CECM detector, we study the detector sensitivity by rotating the signal into principal components, and estimate the P-phase onset from a curvature function describing the CECM rather than from the CECM itself. We present our results implementing this algorithm in a real-time module for Earthworm and show the improved phase picks compared to the traditional single-component pickers in Earthworm.
Allen, William J; Rizzo, Robert C
2014-02-24
False negative docking outcomes for highly symmetric molecules are a barrier to the accurate evaluation of docking programs, scoring functions, and protocols. This work describes an implementation of a symmetry-corrected root-mean-square deviation (RMSD) method into the program DOCK based on the Hungarian algorithm for solving the minimum assignment problem, which dynamically assigns atom correspondence in molecules with symmetry. The algorithm adds only a trivial amount of computation time to the RMSD calculations and is shown to increase the reported overall docking success rate by approximately 5% when tested over 1043 receptor-ligand systems. For some families of protein systems the results are even more dramatic, with success rate increases up to 16.7%. Several additional applications of the method are also presented including as a pairwise similarity metric to compare molecules during de novo design, as a scoring function to rank-order virtual screening results, and for the analysis of trajectories from molecular dynamics simulation. The new method, including source code, is available to registered users of DOCK6 ( http://dock.compbio.ucsf.edu ).
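The core idea, assigning atom correspondence by solving a minimum-cost assignment problem before computing RMSD, can be sketched with SciPy's Hungarian-algorithm solver. This is a simplified stand-in for the DOCK implementation: a real version would restrict assignments to chemically equivalent (same-element) atoms rather than allowing arbitrary pairings.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def symm_rmsd(A, B):
    """Symmetry-corrected RMSD: find the correspondence between the
    rows of A and B that minimizes total squared distance (Hungarian
    algorithm), then compute RMSD under that assignment."""
    cost = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    rows, cols = linear_sum_assignment(cost)
    return float(np.sqrt(cost[rows, cols].mean()))

# Naive index-matched RMSD penalizes a pure relabeling of equivalent
# atoms; the symmetry-corrected RMSD does not.
A = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
B = A[[1, 0, 2]]                    # same structure, atoms 0 and 1 swapped
naive = float(np.sqrt(((A - B) ** 2).sum(axis=1).mean()))
print(naive, symm_rmsd(A, B))
```

This illustrates the false-negative mechanism the abstract describes: an identical pose scores a non-zero naive RMSD purely because symmetric atoms were labeled differently.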
NASA Astrophysics Data System (ADS)
Raghunath, N.; Faber, T. L.; Suryanarayanan, S.; Votaw, J. R.
2009-02-01
Image quality is significantly degraded even by small amounts of patient motion in very high-resolution PET scanners. When patient motion is known, deconvolution methods can be used to correct the reconstructed image and reduce motion blur. This paper describes the implementation and optimization of an iterative deconvolution method that uses an ordered subset approach to make it practical and clinically viable. We performed ten separate FDG PET scans using the Hoffman brain phantom and simultaneously measured its motion using the Polaris Vicra tracking system (Northern Digital Inc., Ontario, Canada). The feasibility and effectiveness of the technique was studied by performing scans with different motion and deconvolution parameters. Deconvolution resulted in visually better images and significant improvement as quantified by the Universal Quality Index (UQI) and contrast measures. Finally, the technique was applied to human studies to demonstrate marked improvement. Thus, the deconvolution technique presented here appears promising as a valid alternative to existing motion correction methods for PET. It has the potential for deblurring an image from any modality if the causative motion is known and its effect can be represented in a system matrix.
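The flavour of such iterative deconvolution can be shown with a 1-D Richardson-Lucy-style loop, which refines an estimate by the ratio of observed data to the re-blurred estimate. This is a generic sketch, not the ordered-subset PET implementation described above; the signal and PSF are toy examples.

```python
import numpy as np

def richardson_lucy(blurred, psf, n_iter=50):
    """Iterative deconvolution: multiply the current estimate by the
    back-projected ratio of observed to re-blurred data each pass."""
    est = np.full_like(blurred, blurred.mean())
    psf_flip = psf[::-1]
    for _ in range(n_iter):
        conv = np.convolve(est, psf, mode='same')
        ratio = blurred / np.maximum(conv, 1e-12)
        est = est * np.convolve(ratio, psf_flip, mode='same')
    return est

# Two point sources blurred by a known kernel; deconvolution should
# restore the dominant peak at its true position.
truth = np.zeros(32); truth[12] = 1.0; truth[20] = 0.5
psf = np.array([0.25, 0.5, 0.25])
blurred = np.convolve(truth, psf, mode='same')
est = richardson_lucy(blurred, psf)
print(int(np.argmax(est)))
```

An ordered-subset variant accelerates this by updating with subsets of the data per pass, which is the practicality improvement the abstract highlights.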
NASA Astrophysics Data System (ADS)
Jeon, Seok-Hee; Gil, Sang-Keun
2016-12-01
We propose an optical design of cipher block chaining (CBC) encryption mode using digital holography, which is implemented by the two-step quadrature phase-shifting digital holographic encryption technique using orthogonal polarization. A block of plain text is encrypted with the encryption key by applying the two-step phase-shifting digital holographic method; then, it is changed into cipher text blocks which are digital holograms. Optically, these digital holograms with the encrypted information are Fourier transform holograms and are recorded onto charge-coupled devices with 256 quantization gray levels. This means that the proposed optical CBC encryption is a scheme that has an analog-type of pseudorandom pattern information in the cipher text, while the conventional electronic CBC encryption is a kind of bitwise block message encryption processed by digital bits. Also, the proposed method enables the cryptosystem to have higher security strength and faster processing than the conventional electronic method because of the large two-dimensional (2-D) array key space and parallel processing. The results of computer simulations verify that the proposed optical CBC encryption design is very effective in CBC mode due to fast and secure optical encryption of 2-D data and shows the feasibility for the CBC encryption mode.
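The chaining rule of CBC mode itself is simple to sketch. Below, a toy XOR-with-key operation stands in for the block cipher (or, in the paper's setting, the optical phase-shifting transform); the point is only to show why identical plaintext blocks yield different ciphertext blocks. The key, IV, and block contents are illustrative.

```python
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(blocks, key, iv):
    """CBC chaining: XOR each plaintext block with the previous
    ciphertext block (the IV for the first block) before 'encrypting'
    it -- here the 'cipher' is just XOR with the key."""
    out, prev = [], iv
    for p in blocks:
        c = xor_bytes(xor_bytes(p, prev), key)
        out.append(c)
        prev = c
    return out

def cbc_decrypt(blocks, key, iv):
    """Inverse chaining: 'decrypt' each block, then XOR with the
    previous ciphertext block (or the IV for the first block)."""
    out, prev = [], iv
    for c in blocks:
        out.append(xor_bytes(xor_bytes(c, key), prev))
        prev = c
    return out

blocks = [b"blockAAA", b"blockAAA"]   # two identical plaintext blocks
key = b"\x5a" * 8
iv = b"\x01" * 8
ct = cbc_encrypt(blocks, key, iv)
print(ct[0] != ct[1], cbc_decrypt(ct, key, iv) == blocks)
```

Because each ciphertext block feeds into the next, repeated plaintext does not produce repeated ciphertext; in the proposed optical scheme the same chaining structure is realized with 2-D holographic blocks instead of byte strings.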
An implementation of the TFQMR algorithm on a distributed-memory machine
Buecker, M.
1994-12-31
A fundamental task of numerical computing is to solve linear systems Ax = b, which we label as (1). Such systems are essential parts of many scientific problems; e.g., finite element methods lead to linear systems that approximate partial differential equations. Linear systems can be solved by direct as well as iterative methods. Using a direct method means factorizing the coefficient matrix; a typical and well-known direct method is Gaussian elimination. Direct methods work well as long as the systems remain small. Unfortunately, there is need for the solution of large linear systems. When systems get large, direct methods result in enormous computing time because of their complexity. As an example, take (1) with an N × N matrix A. Then the solution of (1) can be computed by Gaussian elimination requiring O(N^3) operations. Additionally, an implementation of a direct method has to take care of the high storage requirement. In contrast to direct methods, iterative methods use successive approximations to obtain more accurate solutions to a linear system at each step. The iteration process generates a sequence of iterates x_n converging to the solution x of (1). The iteration process ends if either x_n fulfills a chosen convergence criterion or breakdowns, i.e. division by zero, occur. Possible breakdowns can be avoided by look-ahead techniques.
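The successive-approximation idea can be sketched with the simplest member of this family, a Jacobi iteration. TFQMR itself is a more sophisticated Krylov-subspace method; this toy only contrasts the iterative approach with direct factorization, and the example system is illustrative.

```python
import numpy as np

def jacobi(A, b, n_iter=100):
    """Simple iterative solver: x_{n+1} = D^{-1} (b - (A - D) x_n),
    where D is the diagonal of A. Converges for diagonally dominant A;
    each sweep costs one matrix-vector product, no factorization."""
    D = np.diag(A)
    R = A - np.diag(D)
    x = np.zeros_like(b, dtype=float)
    for _ in range(n_iter):
        x = (b - R @ x) / D
    return x

# A small diagonally dominant system: the iterates converge to the
# same solution Gaussian elimination would produce.
A = np.array([[4., 1., 0.], [1., 5., 1.], [0., 1., 3.]])
b = np.array([1., 2., 3.])
x = jacobi(A, b)
print(np.allclose(A @ x, b, atol=1e-8))
```

For large sparse systems this per-iteration cost pattern, a matrix-vector product instead of an O(N^3) factorization, is exactly why iterative methods such as TFQMR are preferred.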
Automated infrasound signal detection algorithms implemented in MatSeis - Infra Tool.
Hart, Darren
2004-07-01
MatSeis's infrasound analysis tool, Infra Tool, uses frequency-slowness processing to deconstruct the array data into three outputs per processing step: correlation, azimuth and slowness. Until now, infrasound signal detection was accomplished manually by an experienced analyst trained to recognize patterns in the signal-processing outputs. Our goal was to automate the process of infrasound signal detection. The critical aspect of infrasound signal detection is to identify consecutive processing steps where the azimuth is constant (flat) while the time-lag correlation of the windowed waveform is above the background value. These two conditions describe the arrival of a correlated set of wavefronts at an array. The Hough Transform and Inverse Slope methods are used to determine the representative slope for a specified number of azimuth data points. The representative slope is then used in conjunction with the associated correlation value and azimuth data variance to determine if and when an infrasound signal was detected. A format for an infrasound signal detection output file is also proposed. The detection output file will list the processed array element names, followed by detection characteristics for each method. Each detection is supplied with a listing of frequency-slowness processing characteristics: human time (YYYY/MM/DD HH:MM:SS.SSS), epochal time, correlation, fstat, azimuth (deg) and trace velocity (km/s). As an example, a ground truth event was processed using the four-element DLIAR infrasound array located in New Mexico. The event is known as the Watusi chemical explosion, which occurred on 2002/09/28 at 21:25:17 with an explosive yield of 38,000 lb TNT equivalent. Knowing the source and array location, the array-to-event distance was computed to be approximately 890 km. This test determined the station-to-event azimuth (281.8 and 282.1 degrees) to within 1.6 and 1.4 degrees for the Inverse Slope and Hough Transform detection algorithms, respectively, and
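The detection criterion, a near-zero slope of the azimuth series combined with above-background correlation, can be sketched with a closed-form least-squares slope. The thresholds and series below are illustrative, not the tuned values used with the DLIAR data.

```python
def detect_flat_azimuth(azimuths, correlations, slope_tol=1.0, corr_min=0.5):
    """Declare a detection when the least-squares slope of the azimuth
    series is near zero (constant back-azimuth) while the mean
    correlation stays above a background threshold."""
    n = len(azimuths)
    xs = range(n)
    mx = sum(xs) / n
    my = sum(azimuths) / n
    num = sum((x - mx) * (a - my) for x, a in zip(xs, azimuths))
    den = sum((x - mx) ** 2 for x in xs)
    slope = num / den
    mean_corr = sum(correlations) / n
    return (abs(slope) < slope_tol and mean_corr > corr_min), slope

# Nearly constant azimuth around 282 degrees with high correlation.
az = [281.9, 282.0, 282.1, 281.8, 282.0]
corr = [0.80, 0.85, 0.90, 0.88, 0.82]
hit, slope = detect_flat_azimuth(az, corr)
print(hit)
```

A steeply drifting azimuth series fails the slope test and is rejected, which is the behaviour both the Inverse Slope and Hough Transform variants are designed to encode.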
Cickovski, Trevor; Flor, Tiffany; Irving-Sachs, Galen; Novikov, Philip; Parda, James; Narasimhan, Giri
2015-01-01
In order to make multiple copies of a target sequence in the laboratory, the technique of Polymerase Chain Reaction (PCR) requires the design of "primers", which are short fragments of nucleotides complementary to the flanking regions of the target sequence. If the same primer is to amplify multiple closely related target sequences, then it is necessary to make the primer "degenerate", which allows it to hybridize to target sequences with a limited amount of variability that may have been caused by mutations. However, the PCR technique can only tolerate a limited amount of degeneracy, and therefore the design of degenerate primers requires the identification of reasonably well-conserved regions in the input sequences. We take an existing algorithm for designing degenerate primers that is based on clustering and parallelize it in a web-accessible software package, GPUDePiCt, using a shared memory model and the computing power of Graphics Processing Units (GPUs). We test our implementation on large sets of aligned sequences from the human genome and show a multi-fold speedup for clustering using our hybrid GPU/CPU implementation over a pure CPU approach for these sequences, which consist of more than 7,500 nucleotides. We also demonstrate that this speedup is consistent over larger numbers and longer lengths of aligned sequences.
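The degeneracy constraint mentioned above is easy to quantify: a primer written in IUPAC ambiguity codes encodes a number of plain sequences equal to the product of the per-position ambiguity counts, and designs are typically capped at some maximum. The cap value below is illustrative.

```python
# IUPAC ambiguity codes and the number of plain bases each represents.
IUPAC = {'A': 1, 'C': 1, 'G': 1, 'T': 1,
         'R': 2, 'Y': 2, 'S': 2, 'W': 2, 'K': 2, 'M': 2,
         'B': 3, 'D': 3, 'H': 3, 'V': 3, 'N': 4}

def degeneracy(primer):
    """Number of distinct plain sequences a degenerate primer encodes:
    the product of per-position ambiguity counts."""
    d = 1
    for base in primer:
        d *= IUPAC[base]
    return d

def within_limit(primer, max_degeneracy=512):
    """PCR tolerates only limited degeneracy; reject primers over a cap."""
    return degeneracy(primer) <= max_degeneracy

print(degeneracy('ACGT'), degeneracy('ACRYN'), within_limit('ACRYN'))  # -> 1 16 True
```

Because degeneracy multiplies position by position, it grows quickly in variable regions, which is why the design algorithm must first find well-conserved regions across the clustered input sequences.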
Implementation of a parallel algorithm for thermo-chemical nonequilibrium flow simulations
NASA Astrophysics Data System (ADS)
Wong, C. C.; Blottner, F. G.; Payne, J. L.; Soetrisno, M.
1995-01-01
Massively parallel (MP) computing is considered to be the future direction of high performance computing. When engineers apply this new MP computing technology to solve large-scale problems, one major interest is the maximum problem size that an MP computer can handle. To determine the maximum size, it is important to address the code scalability issue. Scalability implies whether the code can provide an increase in performance proportional to an increase in problem size. If the size of the problem increases, by utilizing more computer nodes, the ideal elapsed time to simulate a problem should not increase much. Hence one important task in the development of the MP computing technology is to ensure scalability. A scalable code is an efficient code. In order to obtain good scaled performance, it is necessary to first have the code optimized for single-node performance before proceeding to a large-scale simulation with a large number of computer nodes. This paper will discuss the implementation of a massively parallel computing strategy and the process of optimization to improve the scaled performance. Specifically, we will look at domain decomposition, resource management in the code, communication overhead, and problem mapping. By incorporating these improvements and adopting an efficient MP computing strategy, efficiencies of about 85% and 96% have been achieved using 64 nodes on MP computers for perfect gas and chemically reactive gas problems, respectively. A comparison of the performance between MP computers and a vectorized computer, such as the Cray Y-MP, will also be presented.
Cui, Ganglong; Yang, Weitao
2011-05-28
The significance of conical intersections in photophysics, photochemistry, and photodissociation of polyatomic molecules in gas phase has been demonstrated by numerous experimental and theoretical studies. Optimization of conical intersections of small- and medium-size molecules in gas phase has currently become a routine optimization process, as it has been implemented in many electronic structure packages. However, optimization of conical intersections of small- and medium-size molecules in solution or macromolecules remains inefficient, even poorly defined, due to the large number of degrees of freedom and the costly evaluations of gradient difference and nonadiabatic coupling vectors. In this work, based on the sequential quantum mechanics and molecular mechanics (QM/MM) and QM/MM-minimum free energy path methods, we have designed two conical intersection optimization methods for small- and medium-size molecules in solution or macromolecules. The first one is sequential QM conical intersection optimization and MM minimization for potential energy surfaces; the second one is sequential QM conical intersection optimization and MM sampling for potential of mean force surfaces, i.e., free energy surfaces. In such methods, the region where electronic structures change remarkably is placed into the QM subsystem, while the rest of the system is placed into the MM subsystem; thus, dimensionalities of gradient difference and nonadiabatic coupling vectors are decreased due to the relatively small QM subsystem. Furthermore, in comparison with the concurrent optimization scheme, sequential QM conical intersection optimization and MM minimization or sampling reduce the number of evaluations of gradient difference and nonadiabatic coupling vectors because these vectors need to be calculated only when the QM subsystem moves, independent of the MM minimization or sampling. Taken together, costly evaluations of gradient difference and nonadiabatic coupling vectors in solution or
Gan, Yixiang; Kamlah, Marc
2008-07-01
In this investigation, a thermo-mechanical model of pebble beds is adopted and developed based on experiments by Dr. Reimann at Forschungszentrum Karlsruhe (FZK). The framework of the present material model is composed of a non-linear elastic law, the Drucker-Prager-Cap theory, and a modified creep law. Furthermore, the volumetric inelastic strain dependent thermal conductivity of beryllium pebble beds is taken into account and full thermo-mechanical coupling is considered. Investigation showed that the Drucker-Prager-Cap model implemented in ABAQUS cannot fulfill the requirements of both the prediction of large creep strains and the hardening behaviour caused by creep, which are of importance with respect to the application of pebble beds in fusion blankets. Therefore, UMAT (user defined material's mechanical behaviour) and UMATHT (user defined material's thermal behaviour) routines are used to re-implement the present thermo-mechanical model in ABAQUS. An elastic predictor radial return mapping algorithm is used to solve the non-associated plasticity iteratively, and a proper tangent stiffness matrix is obtained for cost efficiency in the calculation. An explicit creep mechanism is adopted for the prediction of time-dependent behaviour in order to represent large creep strains at high temperatures. Finally, the thermo-mechanical interactions are implemented in a UMATHT routine for the coupled analysis. The oedometric compression tests and creep tests of pebble beds at different temperatures are simulated with the help of the present UMAT and UMATHT routines, and the comparison between the simulation and the experiments is made. (authors)
NASA Astrophysics Data System (ADS)
Hohil, Myron E.; Desai, Sachi; Morcos, Amir
2006-05-01
(PAWSS) Limited Objective Experiment (LOE) conducted by the Joint Project Manager for Nuclear Biological Contamination Avoidance (JPM NBC CA) and a matrixed team from the Edgewood Chemical and Biological Center (ECBC) at ranges exceeding 3 km. The details of the field-test experiment and the real-time implementation/integration of the stand-alone acoustic sensor system are discussed herein.
NASA Astrophysics Data System (ADS)
Hohil, Myron E.; Desai, Sachi; Morcos, Amir
2006-09-01
(PAWSS) Limited Objective Experiment (LOE) conducted by the Joint Project Manager for Nuclear Biological Contamination Avoidance (JPM NBC CA) and a matrixed team from the Edgewood Chemical and Biological Center (ECBC) at ranges exceeding 3 km. The details of the field-test experiment and the real-time implementation/integration of the stand-alone acoustic sensor system are discussed herein.
NASA Astrophysics Data System (ADS)
Nguyen-Truong, Hieu T.; Le, Hung M.
2015-06-01
We present in this study a new and robust algorithm for feed-forward neural network (NN) fitting. The method is developed for application to potential energy surface (PES) construction, in which simultaneous energy-gradient fitting is implemented using the well-established Levenberg-Marquardt (LM) algorithm. Three fitting examples are demonstrated: the vibrational PES of H2O and the reactive PESs of O3 and ClOOCl. In all three test cases, our new LM implementation works very efficiently. Besides increased fitting accuracy, it offers two further advantages: fewer training iterations are needed and fewer data points are required for fitting.
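A minimal sketch of LM training of a small feed-forward net, under stated assumptions: a toy 1-D curve stands in for a real PES, only energies are fitted (the paper fits energies and gradients simultaneously), and the Jacobian is built by finite differences for brevity rather than analytically.

```python
import numpy as np

# Toy target: a 1-D "potential energy" curve fitted by a 1-5-1 tanh network.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 40)
y = np.sin(np.pi * x)

def net(theta, x):
    w1, b1, w2, b2 = theta[:5], theta[5:10], theta[10:15], theta[15]
    h = np.tanh(np.outer(x, w1) + b1)      # hidden layer: 5 tanh units
    return h @ w2 + b2

def residual(theta):
    return net(theta, x) - y

def jacobian(theta, eps=1e-6):
    """Finite-difference Jacobian of the residual vector (for brevity;
    a production fit would use analytic derivatives)."""
    J = np.empty((x.size, theta.size))
    for j in range(theta.size):
        t = theta.copy(); t[j] += eps
        J[:, j] = (residual(t) - residual(theta)) / eps
    return J

theta, lam = rng.normal(scale=0.5, size=16), 1e-2
for _ in range(200):
    r, J = residual(theta), jacobian(theta)
    # LM step: damped Gauss-Newton, interpolating toward gradient descent
    step = np.linalg.solve(J.T @ J + lam * np.eye(theta.size), J.T @ r)
    if np.sum(residual(theta - step)**2) < np.sum(r**2):
        theta, lam = theta - step, max(lam * 0.7, 1e-10)  # accept, trust the model more
    else:
        lam *= 2.0                                        # reject, increase damping

print(np.sqrt(np.mean(residual(theta)**2)))               # RMS fitting error
```

The damping parameter `lam` is the essential LM ingredient: it makes each step safe far from the minimum while recovering Gauss-Newton's fast convergence near it, which is why LM typically needs far fewer iterations than plain back-propagation on small fitting problems like this.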
NASA Astrophysics Data System (ADS)
Ullah, Muhammed Zafar
Neural networks and fuzzy logic are two key technologies that have recently received growing attention for solving real-world, nonlinear, time-variant problems. Because of their learning and/or reasoning capabilities, these techniques do not need a mathematical model of the system, which may be difficult, if not impossible, to obtain for complex systems. One of the major problems in the portable-electronics and electric-vehicle world is secondary cell charging, which shows non-linear characteristics. Portable electronic equipment, such as notebook computers, cordless and cellular telephones, and cordless electric lawn tools, uses batteries in increasing numbers. Consumers demand fast charging times, increased battery lifetime, and fuel-gauge capabilities. All of these demands require that the state of charge within a battery be known. Charging secondary cells fast is a problem that is difficult to solve using conventional techniques. Charge control is important in fast charging, for preventing overcharging and improving battery life. This research work provides a quick and reliable approach to charger design using neuro-fuzzy technology, which learns the exact battery charging characteristics. Neuro-fuzzy technology is an intelligent combination of a neural net with fuzzy logic that learns system behavior from system input-output data rather than from a mathematical model. The primary objective of this research is to improve the secondary cell charging algorithm and to achieve faster charging times based on neural network and fuzzy logic techniques. A new controller architecture will also be developed for implementing the charging algorithm for the secondary battery.
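To make the fuzzy half of the approach concrete, here is a minimal fuzzy charge-current controller. In the thesis the rules would be learned from charging data by the neural component; in this sketch the membership functions, rule outputs, and current limits are all hand-picked assumptions, not the author's controller.

```python
def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def charge_current(soc, max_current=2.0):
    """Map state of charge (0..1) to a charge current (A) by three rules:
    low SOC -> fast charge, medium SOC -> moderate, high SOC -> trickle."""
    mu = {"low":  tri(soc, -0.4, 0.0, 0.5),
          "mid":  tri(soc,  0.2, 0.5, 0.8),
          "high": tri(soc,  0.5, 1.0, 1.4)}
    out = {"low": max_current, "mid": 0.5 * max_current, "high": 0.1 * max_current}
    num = sum(mu[k] * out[k] for k in mu)    # weighted-average defuzzification
    den = sum(mu.values())
    return num / den if den else 0.0
```

The overlapping memberships give a smooth taper from fast charge at low state of charge to a trickle near full, which is exactly the non-linear charge profile that the learned neuro-fuzzy rules are meant to capture from data.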
Sebastiani, Giada
2009-05-14
Chronic hepatitis B and C, together with alcoholic and non-alcoholic fatty liver diseases, represent the major causes of progressive liver disease that can eventually evolve into cirrhosis and its end-stage complications, including decompensation, bleeding and liver cancer. Formation and accumulation of fibrosis in the liver is the common pathway leading to an evolutive liver disease. Precise definition of the liver fibrosis stage is essential for management of the patient in clinical practice, since the presence of bridging fibrosis represents a strong indication for antiviral therapy in chronic viral hepatitis, while cirrhosis requires a specific follow-up including screening for esophageal varices and hepatocellular carcinoma. Liver biopsy has always been the standard of reference for assessment of hepatic fibrosis, but it has limitations: it is invasive, costly and prone to sampling errors. Recently, blood markers and instrumental methods have been proposed for the non-invasive assessment of liver fibrosis. However, there are still doubts about their implementation in clinical practice, and a real consensus on how and when to use them is still not available. This is due to an unsatisfactory accuracy for some of them, and to an incomplete validation for others. Some studies suggest that the performance of non-invasive methods for liver fibrosis assessment may increase when they are combined. Combination algorithms of non-invasive methods may therefore represent a rational and reliable approach to implementing non-invasive assessment of liver fibrosis in clinical practice and to reducing, rather than abolishing, liver biopsies.
Singan, Vasanth R; Simpson, Jeremy C
2016-01-01
Quantitative co-localization studies strengthen the analysis of fluorescence microscopy-based assays and are essential for illustrating and understanding many cellular processes and interactions. In an earlier study, we presented a rank-based intensity-weighting scheme for the quantification of co-localization between structures in fluorescence microscopy images. This method, which uses a combined pixel co-occurrence and intensity correlation approach, is superior to conventional algorithms and provides a more accurate quantification of co-localization. In this brief report we provide the source code and implementation of the rank-weighted co-localization (RWC) algorithm in three (two open-source and one proprietary) image analysis platforms. The RWC algorithm has been implemented as a plugin for ImageJ, a module for CellProfiler, and an Acapella script for the Columbus image analysis software. We have provided a web resource from which users can download the plugins and modules implementing the RWC algorithm in these commonly used image analysis platforms. The implementations have been designed for easy incorporation into existing tools in a 'ready-for-use' format. The resources can be accessed through the following web link: http://simpsonlab.pbworks.com/w/page/48541482/Bioinformatic_Tools.
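The combined co-occurrence/rank-agreement idea can be sketched as follows. This is an illustrative approximation in the spirit of the RWC algorithm, not the published RWC formula: the specific weighting (penalizing rank disparity between the two channels at co-occurring pixels) is an assumption, and real analyses should use the authors' plugins linked above.

```python
import numpy as np

def rank_weighted_coloc(ch1, ch2, t1, t2):
    """Simplified rank-weighted co-localization score for two single-channel
    images: pixel co-occurrence above thresholds, weighted by how well the
    two channels' intensity ranks agree at each co-occurring pixel."""
    ch1 = np.asarray(ch1, dtype=float).ravel()
    ch2 = np.asarray(ch2, dtype=float).ravel()
    mask = (ch1 > t1) & (ch2 > t2)        # pixel co-occurrence above thresholds
    if not mask.any():
        return 0.0
    r1 = ch1.argsort().argsort()          # intensity ranks, 0 = dimmest pixel
    r2 = ch2.argsort().argsort()
    n = ch1.size
    w = (n - np.abs(r1 - r2)) / n         # weight falls as the ranks diverge
    # fraction of channel-1 signal that co-occurs, weighted by rank agreement
    return float((w[mask] * ch1[mask]).sum() / ch1.sum())
```

Two identical channels score 1.0, while channels whose bright pixels co-occur but with mismatched intensity ordering score lower, which is the qualitative behaviour that distinguishes rank-weighted measures from plain co-occurrence counting.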