Sample records for computational algorithm based

  1. Network Community Detection based on the Physarum-inspired Computational Framework.

    PubMed

    Gao, Chao; Liang, Mingxin; Li, Xianghua; Zhang, Zili; Wang, Zhen; Zhou, Zhili

    2016-12-13

    Community detection is a crucial and essential problem in the structure analytics of complex networks, which can help us understand and predict the characteristics and functions of complex networks. Many methods, ranging from the optimization-based algorithms to the heuristic-based algorithms, have been proposed for solving such a problem. Due to the inherent complexity of identifying network structure, how to design an effective algorithm with a higher accuracy and a lower computational cost still remains an open problem. Inspired by the computational capability and positive feedback mechanism in the wake of foraging process of Physarum, which is a large amoeba-like cell consisting of a dendritic network of tube-like pseudopodia, a general Physarum-based computational framework for community detection is proposed in this paper. Based on the proposed framework, the inter-community edges can be identified from the intra-community edges in a network and the positive feedback of solving process in an algorithm can be further enhanced, which are used to improve the efficiency of original optimization-based and heuristic-based community detection algorithms, respectively. Some typical algorithms (e.g., genetic algorithm, ant colony optimization algorithm, and Markov clustering algorithm) and real-world datasets have been used to estimate the efficiency of our proposed computational framework. Experiments show that the algorithms optimized by Physarum-inspired computational framework perform better than the original ones, in terms of accuracy and computational cost. Moreover, a computational complexity analysis verifies the scalability of our framework.

  2. Efficient 3D geometric and Zernike moments computation from unstructured surface meshes.

    PubMed

    Pozo, José María; Villa-Uriol, Maria-Cruz; Frangi, Alejandro F

    2011-03-01

    This paper introduces and evaluates a fast exact algorithm and a series of faster approximate algorithms for the computation of 3D geometric moments from an unstructured surface mesh of triangles. Being based on the object surface reduces the computational complexity of these algorithms with respect to volumetric grid-based algorithms. In contrast, it can only be applied for the computation of geometric moments of homogeneous objects. This advantage and restriction are shared with other proposed algorithms based on the object boundary. The proposed exact algorithm reduces the computational complexity for computing geometric moments up to order N with respect to previously proposed exact algorithms, from N^9 to N^6. The approximate series algorithm appears as a power series on the ratio between triangle size and object size, which can be truncated at any desired degree. The higher the number and quality of the triangles, the better the approximation. This approximate algorithm reduces the computational complexity to N^3. In addition, the paper introduces a fast algorithm for the computation of 3D Zernike moments from the computed geometric moments, with a computational complexity N^4, while the previously proposed algorithm is of order N^6. The error introduced by the proposed approximate algorithms is evaluated in different shapes, and the cost-benefit ratio in terms of error and computational time is analyzed for different moment orders.
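
    The surface-based idea can be illustrated for the two lowest moment orders. The sketch below is not the paper's algorithm; it is the standard signed-tetrahedron construction, assuming a closed mesh with outward-oriented triangles and a homogeneous interior. The function name and the mesh layout (vertex list plus index triples) are illustrative assumptions.

```python
def mesh_volume_and_centroid(vertices, triangles):
    """Zeroth- and first-order geometric moments of a closed triangle mesh.

    vertices:  list of (x, y, z) tuples
    triangles: list of (i, j, k) vertex indices, oriented outward
    """
    volume = 0.0
    first_moment = [0.0, 0.0, 0.0]
    for i, j, k in triangles:
        a, b, c = vertices[i], vertices[j], vertices[k]
        # Signed volume of the tetrahedron (origin, a, b, c) is det[a, b, c] / 6.
        det = (a[0] * (b[1] * c[2] - b[2] * c[1])
               - a[1] * (b[0] * c[2] - b[2] * c[0])
               + a[2] * (b[0] * c[1] - b[1] * c[0]))
        tet_volume = det / 6.0
        volume += tet_volume
        # The centroid of that tetrahedron is (origin + a + b + c) / 4.
        for axis in range(3):
            first_moment[axis] += tet_volume * (a[axis] + b[axis] + c[axis]) / 4.0
    centroid = [m / volume for m in first_moment]
    return volume, centroid


# Unit right tetrahedron with outward-oriented faces.
tetra_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tetra_faces = [(1, 2, 3), (0, 2, 1), (0, 1, 3), (0, 3, 2)]
tetra_volume, tetra_centroid = mesh_volume_and_centroid(tetra_vertices, tetra_faces)
```

    Each triangle contributes independently, so the cost grows with the number of triangles rather than with the number of voxels in a volumetric grid.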

  3. QPSO-Based Adaptive DNA Computing Algorithm

    PubMed Central

    Karakose, Mehmet; Cigdem, Ugur

    2013-01-01

    DNA (deoxyribonucleic acid) computing, a new computation model that uses DNA molecules for information storage, has been increasingly used for optimization and data analysis in recent years. However, the DNA computing algorithm has some limitations in terms of convergence speed, adaptability, and effectiveness. In this paper, a new approach for improving DNA computing is proposed. This approach aims to run the DNA computing algorithm with adaptive parameters towards the desired goal using quantum-behaved particle swarm optimization (QPSO). The contributions of the proposed QPSO-based adaptive DNA computing algorithm are as follows: (1) the population size, crossover rate, maximum number of operations, enzyme and virus mutation rates, and fitness function of the DNA computing algorithm are tuned simultaneously for the adaptive process, (2) the adaptive algorithm is performed using the QPSO algorithm for goal-driven progress, faster operation, and flexibility in data, and (3) a numerical realization of the DNA computing algorithm with the proposed approach is implemented in system identification. Two experiments with different systems were carried out to evaluate the performance of the proposed approach with comparative results. Experimental results obtained with Matlab and FPGA demonstrate its ability to provide effective optimization, considerable convergence speed, and high accuracy compared with the DNA computing algorithm. PMID:23935409

  4. A review of classification algorithms for EEG-based brain-computer interfaces.

    PubMed

    Lotte, F; Congedo, M; Lécuyer, A; Lamarche, F; Arnaldi, B

    2007-06-01

    In this paper we review classification algorithms used to design brain-computer interface (BCI) systems based on electroencephalography (EEG). We briefly present the commonly employed algorithms and describe their critical properties. Based on the literature, we compare them in terms of performance and provide guidelines to choose the suitable classification algorithm(s) for a specific BCI.

  5. A High Performance Cloud-Based Protein-Ligand Docking Prediction Algorithm

    PubMed Central

    Chen, Jui-Le; Yang, Chu-Sing

    2013-01-01

    The potential of predicting druggability for a particular disease by integrating biological and computer science technologies has witnessed success in recent years. Although computer science technologies can be used to reduce the costs of pharmaceutical research, the computation time of structure-based protein-ligand docking prediction remains unsatisfactory. Hence, in this paper, a novel docking prediction algorithm, named fast cloud-based protein-ligand docking prediction algorithm (FCPLDPA), is presented to accelerate docking prediction. The proposed algorithm works by leveraging two high-performance operators: (1) a novel migration (information exchange) operator designed specifically for cloud-based environments to reduce the computation time; (2) an efficient operator aimed at filtering out the worst search directions. Our simulation results illustrate that the proposed method outperforms the other docking algorithms compared in this paper in terms of both the computation time and the quality of the end result. PMID:23762864

  6. A cross-disciplinary introduction to quantum annealing-based algorithms

    NASA Astrophysics Data System (ADS)

    Venegas-Andraca, Salvador E.; Cruz-Santos, William; McGeoch, Catherine; Lanzagorta, Marco

    2018-04-01

    A central goal in quantum computing is the development of quantum hardware and quantum algorithms in order to analyse challenging scientific and engineering problems. Research in quantum computation involves contributions from both physics and computer science; hence this article presents a concise introduction to basic concepts from both fields that are used in annealing-based quantum computation, an alternative to the more familiar quantum gate model. We introduce some concepts from computer science required to define difficult computational problems and to realise the potential relevance of quantum algorithms to find novel solutions to those problems. We introduce the structure of quantum annealing-based algorithms as well as two examples of this kind of algorithms for solving instances of the max-SAT and Minimum Multicut problems. An overview of the quantum annealing systems manufactured by D-Wave Systems is also presented.

  7. Reversible Data Hiding Based on DNA Computing

    PubMed Central

    Xie, Yingjie

    2017-01-01

    Biocomputing, especially DNA computing, has developed rapidly and is widely used in information security. In this paper, a novel algorithm for reversible data hiding based on DNA computing is proposed. Inspired by histogram modification, a classical algorithm for reversible data hiding, we combine it with DNA computing to realize this algorithm with biological technology. Compared with previous results, our experimental results show a significantly improved ER (embedding rate). Furthermore, the PSNR (peak signal-to-noise ratio) of some test images is also improved. Experimental results show that the algorithm is suitable for protecting the copyright of the cover image in DNA-based information security. PMID:28280504

  8. Parallel Directionally Split Solver Based on Reformulation of Pipelined Thomas Algorithm

    NASA Technical Reports Server (NTRS)

    Povitsky, A.

    1998-01-01

    In this research an efficient parallel algorithm for 3-D directionally split problems is developed. The proposed algorithm is based on a reformulated version of the pipelined Thomas algorithm that starts the backward step computations immediately after the completion of the forward step computations for the first portion of lines. This algorithm has data available for other computational tasks while processors are idle from the Thomas algorithm. The proposed 3-D directionally split solver is based on the static scheduling of processors where local and non-local, data-dependent and data-independent computations are scheduled while processors are idle. A theoretical model of parallelization efficiency is used to define optimal parameters of the algorithm, to show an asymptotic parallelization penalty and to obtain an optimal cover of a global domain with subdomains. It is shown by computational experiments and by the theoretical model that the proposed algorithm reduces the parallelization penalty by about a factor of two relative to the basic algorithm for the range of the number of processors (subdomains) considered and the number of grid nodes per subdomain.
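
    For orientation, the forward and backward steps mentioned above are those of the serial Thomas algorithm for a tridiagonal system. The sketch below shows only that serial building block, not the pipelined or reformulated parallel version; the function name and argument layout are illustrative assumptions.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the serial Thomas algorithm.

    Row i reads: lower[i] * x[i-1] + diag[i] * x[i] + upper[i] * x[i+1] = rhs[i].
    lower[0] and upper[-1] are ignored. No pivoting is done, so the matrix is
    assumed to be diagonally dominant (or otherwise safe to eliminate in order).
    """
    n = len(diag)
    c = [0.0] * n  # modified superdiagonal
    d = [0.0] * n  # modified right-hand side

    # Forward step: eliminate the subdiagonal, row by row.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Backward step: substitute from the last unknown upward.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# 3x3 system with diagonal 2 and off-diagonals -1, built so that x = [1, 2, 3].
solution = thomas_solve([0.0, -1.0, -1.0], [2.0, 2.0, 2.0], [-1.0, -1.0, 0.0], [0.0, 0.0, 4.0])
```

    The backward loop cannot start for a given line until its forward loop has finished, which is the data dependency that leaves processors idle in a pipelined parallel version.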

  9. Edge Pushing is Equivalent to Vertex Elimination for Computing Hessians

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Mu; Pothen, Alex; Hovland, Paul

    We prove the equivalence of two different Hessian evaluation algorithms in AD. The first is the Edge Pushing algorithm of Gower and Mello, which may be viewed as a second order Reverse mode algorithm for computing the Hessian. In earlier work, we have derived the Edge Pushing algorithm by exploiting a Reverse mode invariant based on the concept of live variables in compiler theory. The second algorithm is based on eliminating vertices in a computational graph of the gradient, in which intermediate variables are successively eliminated from the graph, and the weights of the edges are updated suitably. We prove that if the vertices are eliminated in a reverse topological order while preserving symmetry in the computational graph of the gradient, then the Vertex Elimination algorithm and the Edge Pushing algorithm perform identical computations. In this sense, the two algorithms are equivalent. This insight that unifies two seemingly disparate approaches to Hessian computations could lead to improved algorithms and implementations for computing Hessians. Read More: http://epubs.siam.org/doi/10.1137/1.9781611974690.ch11

  10. Algorithm-Based Fault Tolerance Integrated with Replication

    NASA Technical Reports Server (NTRS)

    Some, Raphael; Rennels, David

    2008-01-01

    In a proposed approach to programming and utilization of commercial off-the-shelf computing equipment, a combination of algorithm-based fault tolerance (ABFT) and replication would be utilized to obtain high degrees of fault tolerance without incurring excessive costs. The basic idea of the proposed approach is to integrate ABFT with replication such that the algorithmic portions of computations would be protected by ABFT, and the logical portions by replication. ABFT is an extremely efficient, inexpensive, high-coverage technique for detecting and mitigating faults in computer systems used for algorithmic computations, but does not protect against errors in logical operations surrounding algorithms.
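
    The record does not say which ABFT scheme is meant. As one well-known instance of the idea, the sketch below uses checksum-augmented matrix multiplication: a column-checksum row is appended to A and a row-checksum column to B, so that a single corrupted entry of the product can be located by the row and column whose checksums disagree. All function names are illustrative assumptions.

```python
def matmul(a, b):
    """Plain dense matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def with_column_checksum(a):
    """Append a row holding the sum of each column of a (a itself is not modified)."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Append to every row of b the sum of that row."""
    return [row + [sum(row)] for row in b]


def find_fault(c_full, tol=1e-9):
    """Return (bad_rows, bad_cols) of the data part whose checksums do not match."""
    n_rows = len(c_full) - 1
    n_cols = len(c_full[0]) - 1
    bad_rows = [i for i in range(n_rows)
                if abs(sum(c_full[i][:n_cols]) - c_full[i][n_cols]) > tol]
    bad_cols = [j for j in range(n_cols)
                if abs(sum(c_full[i][j] for i in range(n_rows)) - c_full[n_rows][j]) > tol]
    return bad_rows, bad_cols


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c_full = matmul(with_column_checksum(a), with_row_checksum(b))
clean_report = find_fault(c_full)

c_full[1][0] += 5.0            # simulate a transient fault in one product entry
fault_report = find_fault(c_full)
```

    The checks cover only the arithmetic inside the product; the control flow that calls `matmul` is exactly the kind of logical operation the abstract says would need replication instead.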

  11. A highly efficient multi-core algorithm for clustering extremely large datasets

    PubMed Central

    2010-01-01

    Background: In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most approaches for parallelizing algorithms largely rely on network communication protocols connecting and requiring multiple computers. One answer to this problem is to utilize the intrinsic capabilities in current multi-core hardware to distribute the tasks among the different cores of one computer. Results: We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms based on the design principles of transactional memory for clustering gene expression microarray type data and categorical SNP data. Our new shared memory parallel algorithms prove to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis employing repeated runs with slightly changed parameters. The computation speed of our Java-based algorithm was increased by a factor of 10 for large data sets while preserving computational accuracy compared to single-core implementations and a recently published network-based parallelization. Conclusions: Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that, using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer. PMID:20370922

  12. Microscope self-calibration based on micro laser line imaging and soft computing algorithms

    NASA Astrophysics Data System (ADS)

    Apolinar Muñoz Rodríguez, J.

    2018-06-01

    A technique to perform microscope self-calibration via micro laser line and soft computing algorithms is presented. In this technique, the microscope vision parameters are computed by means of soft computing algorithms based on laser line projection. To implement the self-calibration, a microscope vision system is constructed by means of a CCD camera and a 38 μm laser line. From this arrangement, the microscope vision parameters are represented via Bezier approximation networks, which are accomplished through the laser line position. In this procedure, a genetic algorithm determines the microscope vision parameters by means of laser line imaging. Also, the approximation networks compute the three-dimensional vision by means of the laser line position. Additionally, the soft computing algorithms re-calibrate the vision parameters when the microscope vision system is modified during the vision task. The proposed self-calibration improves accuracy of the traditional microscope calibration, which is accomplished via external references to the microscope system. The capability of the self-calibration based on soft computing algorithms is determined by means of the calibration accuracy and the micro-scale measurement error. This contribution is corroborated by an evaluation based on the accuracy of the traditional microscope calibration.

  13. Molecular Isotopic Distribution Analysis (MIDAs) with Adjustable Mass Accuracy

    NASA Astrophysics Data System (ADS)

    Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo

    2014-01-01

    In this paper, we present Molecular Isotopic Distribution Analysis (MIDAs), a new software tool designed to compute molecular isotopic distributions with adjustable accuracies. MIDAs offers two algorithms, one polynomial-based and one Fourier-transform-based, both of which compute molecular isotopic distributions accurately and efficiently. The polynomial-based algorithm contains few novel aspects, whereas the Fourier-transform-based algorithm consists mainly of improvements to other existing Fourier-transform-based algorithms. We have benchmarked the performance of the two algorithms implemented in MIDAs with that of eight software packages (BRAIN, Emass, Mercury, Mercury5, NeutronCluster, Qmass, JFC, IC) using a consensus set of benchmark molecules. Under the proposed evaluation criteria, MIDAs's algorithms, JFC, and Emass compute with comparable accuracy the coarse-grained (low-resolution) isotopic distributions and are more accurate than the other software packages. For fine-grained isotopic distributions, we compared IC, MIDAs's polynomial algorithm, and MIDAs's Fourier transform algorithm. Among the three, IC and MIDAs's polynomial algorithm compute isotopic distributions that better resemble their corresponding exact fine-grained (high-resolution) isotopic distributions. MIDAs can be accessed freely through a user-friendly web-interface at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/midas/index.html.
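
    The general polynomial view of a coarse-grained isotopic distribution can be sketched as follows. This is not MIDAs's implementation: each element is represented by a polynomial whose coefficients are isotope abundances indexed by extra nominal mass, and the molecule's distribution is the product of those polynomials raised to the atom counts. The abundance values are approximate and included only for illustration; the names are assumptions.

```python
# Approximate natural abundances, indexed by extra neutrons over the lightest isotope.
# Illustrative values only; a real tool would use a curated isotope table.
ABUNDANCES = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "O": [0.99757, 0.00038, 0.00205],
}


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (a discrete convolution)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def poly_pow(p, n):
    """Raise a polynomial to a non-negative integer power by repeated squaring."""
    result = [1.0]
    base = p
    while n > 0:
        if n & 1:
            result = poly_mul(result, base)
        base = poly_mul(base, base)
        n >>= 1
    return result


def coarse_isotopic_distribution(formula):
    """formula maps element symbol -> atom count; returns P(M+0), P(M+1), ..."""
    dist = [1.0]
    for element, count in formula.items():
        dist = poly_mul(dist, poly_pow(ABUNDANCES[element], count))
    return dist


methane = coarse_isotopic_distribution({"C": 1, "H": 4})
```

    Entry `methane[k]` is the probability that the molecule carries k extra nominal mass units. Fine-grained distributions, which resolve isotopologues sharing a nominal mass, need exact masses and are outside this sketch.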

  14. Molecular Isotopic Distribution Analysis (MIDAs) with adjustable mass accuracy.

    PubMed

    Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo

    2014-01-01

    In this paper, we present Molecular Isotopic Distribution Analysis (MIDAs), a new software tool designed to compute molecular isotopic distributions with adjustable accuracies. MIDAs offers two algorithms, one polynomial-based and one Fourier-transform-based, both of which compute molecular isotopic distributions accurately and efficiently. The polynomial-based algorithm contains few novel aspects, whereas the Fourier-transform-based algorithm consists mainly of improvements to other existing Fourier-transform-based algorithms. We have benchmarked the performance of the two algorithms implemented in MIDAs with that of eight software packages (BRAIN, Emass, Mercury, Mercury5, NeutronCluster, Qmass, JFC, IC) using a consensus set of benchmark molecules. Under the proposed evaluation criteria, MIDAs's algorithms, JFC, and Emass compute with comparable accuracy the coarse-grained (low-resolution) isotopic distributions and are more accurate than the other software packages. For fine-grained isotopic distributions, we compared IC, MIDAs's polynomial algorithm, and MIDAs's Fourier transform algorithm. Among the three, IC and MIDAs's polynomial algorithm compute isotopic distributions that better resemble their corresponding exact fine-grained (high-resolution) isotopic distributions. MIDAs can be accessed freely through a user-friendly web-interface at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/midas/index.html.

  15. Three-dimensional photoacoustic tomography based on graphics-processing-unit-accelerated finite element method.

    PubMed

    Peng, Kuan; He, Ling; Zhu, Ziqiang; Tang, Jingtian; Xiao, Jiaying

    2013-12-01

    Compared with commonly used analytical reconstruction methods, the frequency-domain finite element method (FEM) based approach has proven to be an accurate and flexible algorithm for photoacoustic tomography. However, the FEM-based algorithm is computationally demanding, especially for three-dimensional cases. To enhance the algorithm's efficiency, in this work a parallel computational strategy is implemented in the framework of the FEM-based reconstruction algorithm using a graphic-processing-unit parallel frame named the "compute unified device architecture." A series of simulation experiments is carried out to test the accuracy and accelerating effect of the improved method. The results obtained indicate that the parallel calculation does not change the accuracy of the reconstruction algorithm, while its computational cost is significantly reduced by a factor of 38.9 with a GTX 580 graphics card using the improved method.

  16. On Parallel Push-Relabel based Algorithms for Bipartite Maximum Matching

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Langguth, Johannes; Azad, Md Ariful; Halappanavar, Mahantesh

    2014-07-01

    We study multithreaded push-relabel based algorithms for computing maximum cardinality matching in bipartite graphs. Matching is a fundamental combinatorial (graph) problem with applications in a wide variety of problems in science and engineering. We are motivated by its use in the context of sparse linear solvers for computing maximum transversal of a matrix. We implement and test our algorithms on several multi-socket multicore systems and compare their performance to state-of-the-art augmenting path-based serial and parallel algorithms using a testset comprised of a wide range of real-world instances. Building on several heuristics for enhancing performance, we demonstrate good scaling for the parallel push-relabel algorithm. We show that it is comparable to the best augmenting path-based algorithms for bipartite matching. To the best of our knowledge, this is the first extensive study of multithreaded push-relabel based algorithms. In addition to a direct impact on the applications using matching, the proposed algorithmic techniques can be extended to preflow-push based algorithms for computing maximum flow in graphs.

  17. Study on the algorithm of computational ghost imaging based on discrete fourier transform measurement matrix

    NASA Astrophysics Data System (ADS)

    Zhang, Leihong; Liang, Dong; Li, Bei; Kang, Yi; Pan, Zilan; Zhang, Dawei; Gao, Xiumin; Ma, Xiuhua

    2016-07-01

    Based on an analysis of the cosine light field with a determined analytic expression and of the pseudo-inverse method, the object is illuminated by a preset light field with a determined discrete Fourier transform measurement matrix, and the object image is reconstructed by the pseudo-inverse method. The analytic expression of the algorithm of computational ghost imaging based on the discrete Fourier transform measurement matrix is deduced theoretically and compared with the algorithm of compressive computational ghost imaging based on a random measurement matrix. The reconstruction process and the reconstruction error are analyzed. On this basis, a simulation is carried out to verify the theoretical analysis. When the number of sampling measurements is similar to the number of object pixels, the rank of the discrete Fourier transform matrix is the same as that of the random measurement matrix, the PSNRs of the images reconstructed by the FGI and PGI algorithms are similar, and the reconstruction error of the traditional CGI algorithm is lower than that of the FGI and PGI algorithms. As the number of sampling measurements decreases, the PSNR of the image reconstructed by the FGI algorithm decreases slowly, whereas the PSNR of the images reconstructed by the PGI and CGI algorithms decreases sharply. The reconstruction time of the FGI algorithm is lower than that of the other algorithms and is not affected by the number of sampling measurements. The FGI algorithm can effectively filter out random white noise through a low-pass filter and thus denoise during reconstruction, with a higher denoising capability than the CGI algorithm. The FGI algorithm can improve the reconstruction accuracy and the reconstruction speed of computational ghost imaging.
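
    A heavily idealized sketch of pseudo-inverse reconstruction with a discrete Fourier transform measurement matrix is given below. It assumes a 1-D object, noiseless bucket signals, and complex-valued patterns (a physical system would realize them with real cosine light fields); it is not the paper's algorithm. Because distinct DFT rows are orthogonal with squared norm n, the pseudo-inverse of any row subset is its conjugate transpose divided by n. All names are illustrative.

```python
import cmath


def dft_rows(n, rows):
    """Selected rows of the n-point DFT matrix, used here as measurement patterns."""
    return [[cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n)] for k in rows]


def measure(matrix, obj):
    """Bucket signals y = F x: one inner product per illumination pattern."""
    return [sum(f * x for f, x in zip(row, obj)) for row in matrix]


def reconstruct(matrix, signals, n):
    """Apply pinv(F) = F^H / n, valid because distinct DFT rows are orthogonal."""
    return [sum(row[m].conjugate() * y for row, y in zip(matrix, signals)) / n
            for m in range(n)]


obj = [0.0, 1.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0]
n = len(obj)

# Full sampling: as many measurements as pixels, so the object is recovered exactly.
full = dft_rows(n, range(n))
recovered = reconstruct(full, measure(full, obj), n)

# Undersampling: a few frequency rows only, which yields a band-limited estimate.
partial = dft_rows(n, [0, 1, 7])
band_limited = reconstruct(partial, measure(partial, obj), n)
```

    With a row subset the reconstruction cost stays a single matrix-vector product, and the result is the object filtered to the sampled frequencies.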

  18. Real-time polarization imaging algorithm for camera-based polarization navigation sensors.

    PubMed

    Lu, Hao; Zhao, Kaichun; You, Zheng; Huang, Kaoli

    2017-04-10

    Biologically inspired polarization navigation is a promising approach due to its autonomous nature, high precision, and robustness. Many researchers have built point source-based and camera-based polarization navigation prototypes in recent years. Camera-based prototypes can benefit from their high spatial resolution but incur a heavy computation load. The pattern recognition algorithm in most polarization imaging algorithms involves several nonlinear calculations that impose a significant computation burden. In this paper, the polarization imaging and pattern recognition algorithms are optimized through reduction to several linear calculations by exploiting the orthogonality of the Stokes parameters without affecting precision according to the features of the solar meridian and the patterns of the polarized skylight. The algorithm contains a pattern recognition algorithm with a Hough transform as well as orientation measurement algorithms. The algorithm was loaded and run on a digital signal processing system to test its computational complexity. The test showed that the running time decreased to several tens of milliseconds from several thousand milliseconds. Through simulations and experiments, it was found that the algorithm can measure orientation without reducing precision. It can hence satisfy the practical demands of low computational load and high precision for use in embedded systems.
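
    For readers unfamiliar with the Stokes parameters mentioned above, the sketch below shows the textbook way to obtain the linear Stokes components, angle of polarization, and degree of linear polarization from intensities measured behind polarizers at 0, 45, 90 and 135 degrees. It is not the paper's optimized algorithm, and the function names are illustrative assumptions.

```python
import math


def linear_stokes(i0, i45, i90, i135):
    """Linear Stokes components from four polarizer orientations (degrees)."""
    s0 = i0 + i90      # total intensity
    s1 = i0 - i90      # horizontal versus vertical preference
    s2 = i45 - i135    # +45 versus -45 degree preference
    return s0, s1, s2


def angle_and_degree(s0, s1, s2):
    """Angle of polarization (radians) and degree of linear polarization."""
    angle = 0.5 * math.atan2(s2, s1)
    degree = math.hypot(s1, s2) / s0
    return angle, degree


def malus(theta, alpha):
    """Intensity of unit, fully polarized light at angle theta behind a polarizer at alpha."""
    return math.cos(theta - alpha) ** 2


# Synthetic pixel: fully linearly polarized light oriented at 30 degrees.
theta = math.radians(30.0)
readings = [malus(theta, math.radians(a)) for a in (0.0, 45.0, 90.0, 135.0)]
aop, dolp = angle_and_degree(*linear_stokes(*readings))
```

    The Stokes components are linear in the measured intensities; only the final `atan2` and square root are nonlinear, which is the kind of structure an embedded implementation can exploit.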

  19. Postprocessing of Voxel-Based Topologies for Additive Manufacturing Using the Computational Geometry Algorithms Library (CGAL)

    DTIC Science & Technology

    2015-06-01

    Postprocessing of 3-dimensional (3-D) topologies that are defined as a set of voxels using the Computational Geometry Algorithms Library (CGAL... computational geometry algorithms, several of which are suited to the task. The work flow described in this report involves first defining a set of

  20. CAD system for footwear design based on whole real 3D data of last surface

    NASA Astrophysics Data System (ADS)

    Song, Wanzhong; Su, Xianyu

    2000-10-01

    Two major parts of the application of CAD in footwear design are studied: the development of the last surface and the computer-aided design of planar shoe templates. A new quasi-experiential development algorithm for the last surface based on triangulation approximation is presented. Compared with other last-surface development algorithms, this algorithm consumes less time and does not need any interactive operation for precise development. Based on this algorithm, a software package, SHOEMAKER(TM), which includes computer-aided automatic measurement, automatic development of the last surface and computer-aided design of shoe templates, has been developed.

  1. On computation of Gröbner bases for linear difference systems

    NASA Astrophysics Data System (ADS)

    Gerdt, Vladimir P.

    2006-04-01

    In this paper, we present an algorithm for computing Gröbner bases of linear ideals in a difference polynomial ring over a ground difference field. The input difference polynomials generating the ideal are also assumed to be linear. The algorithm is an adaptation to difference ideals of our polynomial algorithm based on Janet-like reductions.

  2. Multirate-based fast parallel algorithms for 2-D DHT-based real-valued discrete Gabor transform.

    PubMed

    Tao, Liang; Kwan, Hon Keung

    2012-07-01

    Novel algorithms for the multirate and fast parallel implementation of the 2-D discrete Hartley transform (DHT)-based real-valued discrete Gabor transform (RDGT) and its inverse transform are presented in this paper. A 2-D multirate-based analysis convolver bank is designed for the 2-D RDGT, and a 2-D multirate-based synthesis convolver bank is designed for the 2-D inverse RDGT. The parallel channels in each of the two convolver banks have a unified structure and can apply the 2-D fast DHT algorithm to speed up their computations. The computational complexity of each parallel channel is low and is independent of the Gabor oversampling rate. All the 2-D RDGT coefficients of an image are computed in parallel during the analysis process and can be reconstructed in parallel during the synthesis process. The computational complexity and time of the proposed parallel algorithms are analyzed and compared with those of the existing fastest algorithms for 2-D discrete Gabor transforms. The results indicate that the proposed algorithms are the fastest, which make them attractive for real-time image processing.
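
    The discrete Hartley transform underlying these algorithms maps real input to real output. The sketch below shows the direct O(N^2) 1-D definition only, not the fast or 2-D multirate algorithms of the paper; the function names are illustrative.

```python
import math


def dht(x):
    """Direct 1-D discrete Hartley transform: H[k] = sum_m x[m] * cas(2*pi*m*k/N)."""
    n = len(x)
    out = []
    for k in range(n):
        total = 0.0
        for m in range(n):
            angle = 2.0 * math.pi * m * k / n
            total += x[m] * (math.cos(angle) + math.sin(angle))  # cas = cos + sin
        out.append(total)
    return out


def inverse_dht(h):
    """The DHT is its own inverse up to a factor of 1/N."""
    n = len(h)
    return [value / n for value in dht(h)]


signal = [1.0, 2.0, 3.0, 4.0]
spectrum = dht(signal)
round_trip = inverse_dht(spectrum)
```

    Because the kernel is real, a Gabor transform built on it avoids complex arithmetic, which is the motivation for the real-valued formulation.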

  3. Computing Gröbner Bases within Linear Algebra

    NASA Astrophysics Data System (ADS)

    Suzuki, Akira

    In this paper, we present an alternative algorithm to compute Gröbner bases, which is based on sparse linear algebra computations. In this algorithm, both S-polynomial computations and monomial reductions are carried out simultaneously in linear algebra, so it can be implemented in any computational system that can handle linear algebra. For a given ideal in a polynomial ring, it calculates a Gröbner basis along with an appropriate corresponding term order.

  4. Developing Subdomain Allocation Algorithms Based on Spatial and Communicational Constraints to Accelerate Dust Storm Simulation

    PubMed Central

    Gui, Zhipeng; Yu, Manzhu; Yang, Chaowei; Jiang, Yunfeng; Chen, Songqing; Xia, Jizhe; Huang, Qunying; Liu, Kai; Li, Zhenlong; Hassan, Mohammed Anowarul; Jin, Baoxuan

    2016-01-01

    Dust storms have serious disastrous impacts on the environment, human health, and assets. The development and application of dust storm models have contributed significantly to better understanding and prediction of the distribution, intensity and structure of dust storms. However, dust storm simulation is a data- and computing-intensive process. To improve the computing performance, high performance computing has been widely adopted by dividing the entire study area into multiple subdomains and allocating each subdomain to different computing nodes in a parallel fashion. Inappropriate allocation may introduce imbalanced task loads and unnecessary communications among computing nodes. Therefore, allocation is a key factor that may impact the efficiency of the parallel process. An allocation algorithm is expected to consider the computing cost and communication cost for each computing node to minimize total execution time and reduce overall communication cost for the entire simulation. This research introduces three algorithms to optimize the allocation by considering the spatial and communicational constraints: 1) an Integer Linear Programming (ILP) based algorithm from a combinatorial optimization perspective; 2) a K-Means and Kernighan-Lin combined heuristic algorithm (K&K) integrating geometric and coordinate-free methods by merging local and global partitioning; 3) an automatic seeded region growing based geometric and local partitioning algorithm (ASRG). The performance and effectiveness of the three algorithms are compared based on different factors. Further, we adopt the K&K algorithm as the demonstrated algorithm for the experiment of dust model simulation with the non-hydrostatic mesoscale model (NMM-dust) and compare the performance with the MPI default sequential allocation. The results demonstrate that the K&K method significantly improves the simulation performance with better subdomain allocation. This method can also be adopted for other relevant atmospheric and numerical modeling. PMID:27044039

  5. GPU-based parallel algorithm for blind image restoration using midfrequency-based methods

    NASA Astrophysics Data System (ADS)

    Xie, Lang; Luo, Yi-han; Bao, Qi-liang

    2013-08-01

    GPU-based general-purpose computing is a new branch of modern parallel computing, so the study of parallel algorithms specially designed for GPU hardware architecture is of great significance. In order to solve the problem of high computational complexity and poor real-time performance in blind image restoration, the midfrequency-based algorithm for blind image restoration was analyzed and improved in this paper. Furthermore, a midfrequency-based filtering method is used to restore the image with hardly any recursion or iteration. Combining the algorithm with data intensiveness, data parallel computing and the GPU execution model of single instruction and multiple threads, a new parallel midfrequency-based algorithm for blind image restoration is proposed in this paper, which is suitable for stream computing on the GPU. In this algorithm, the GPU is utilized to accelerate the estimation of class-G point spread functions and midfrequency-based filtering. Aiming at better management of the GPU threads, the threads in a grid are scheduled according to the decomposition of the filtering data in the frequency domain after the optimization of data access and the communication between the host and the device. The kernel parallelism structure is determined by the decomposition of the filtering data to ensure the transmission rate to get around the memory bandwidth limitation. The results show that, with the new algorithm, the operational speed is significantly increased and the real-time performance of image restoration is effectively improved, especially for high-resolution images.

  6. Implementation of Multispectral Image Classification on a Remote Adaptive Computer

    NASA Technical Reports Server (NTRS)

    Figueiredo, Marco A.; Gloster, Clay S.; Stephens, Mark; Graves, Corey A.; Nakkar, Mouna

    1999-01-01

    As the demand for higher performance computers for the processing of remote sensing science algorithms increases, the need to investigate new computing paradigms is justified. Field Programmable Gate Arrays enable the implementation of algorithms at the hardware gate level, leading to orders of magnitude performance increase over microprocessor-based systems. The automatic classification of spaceborne multispectral images is an example of a computation-intensive application that can benefit from implementation on an FPGA-based custom computing machine (adaptive or reconfigurable computer). A probabilistic neural network is used here to classify pixels of a multispectral LANDSAT-2 image. The implementation described utilizes Java client/server application programs to access the adaptive computer from a remote site. Results verify that a remote hardware version of the algorithm (implemented on an adaptive computer) is significantly faster than a local software version of the same algorithm implemented on a typical general-purpose computer.

  7. Development and application of unified algorithms for problems in computational science

    NASA Technical Reports Server (NTRS)

    Shankar, Vijaya; Chakravarthy, Sukumar

    1987-01-01

    A framework is presented for developing computationally unified numerical algorithms for solving nonlinear equations that arise in modeling various problems in mathematical physics. The concept of computational unification is an attempt to encompass efficient solution procedures for computing various nonlinear phenomena that may occur in a given problem. For example, in Computational Fluid Dynamics (CFD), a unified algorithm will be one that allows for solutions to subsonic (elliptic), transonic (mixed elliptic-hyperbolic), and supersonic (hyperbolic) flows for both steady and unsteady problems. The objectives are: development of superior unified algorithms emphasizing accuracy and efficiency aspects; development of codes based on selected algorithms leading to validation; application of mature codes to realistic problems; and extension/application of CFD-based algorithms to problems in other areas of mathematical physics. The ultimate objective is to achieve integration of multidisciplinary technologies to enhance synergism in the design process through computational simulation. Specific unified algorithms for a hierarchy of gas dynamics equations are presented, along with their applications to two other areas: electromagnetic scattering, and laser-materials interaction accounting for melting.

  8. A Parallel Nonrigid Registration Algorithm Based on B-Spline for Medical Images.

    PubMed

    Du, Xiaogang; Dang, Jianwu; Wang, Yangping; Wang, Song; Lei, Tao

    2016-01-01

    The nonrigid registration algorithm based on B-spline Free-Form Deformation (FFD) plays a key role and is widely applied in medical image processing due to the good flexibility and robustness. However, it requires a tremendous amount of computing time to obtain more accurate registration results especially for a large amount of medical image data. To address the issue, a parallel nonrigid registration algorithm based on B-spline is proposed in this paper. First, the Logarithm Squared Difference (LSD) is considered as the similarity metric in the B-spline registration algorithm to improve registration precision. After that, we create a parallel computing strategy and lookup tables (LUTs) to reduce the complexity of the B-spline registration algorithm. As a result, the computing time of three time-consuming steps including B-splines interpolation, LSD computation, and the analytic gradient computation of LSD, is efficiently reduced, for the B-spline registration algorithm employs the Nonlinear Conjugate Gradient (NCG) optimization method. Experimental results of registration quality and execution efficiency on the large amount of medical images show that our algorithm achieves a better registration accuracy in terms of the differences between the best deformation fields and ground truth and a speedup of 17 times over the single-threaded CPU implementation due to the powerful parallel computing ability of Graphics Processing Unit (GPU).
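    The lookup-table idea mentioned above can be sketched for the cubic B-spline basis functions used in free-form deformation. The sketch assumes an integer control-point spacing in pixels and a 1D lattice padded so that four control points are always available; the function names and the index convention are illustrative assumptions, not the authors' implementation.

```python
def cubic_bspline_basis(u):
    """Return the four uniform cubic B-spline basis values at u in [0, 1)."""
    u2, u3 = u * u, u * u * u
    return (
        (1 - u) ** 3 / 6.0,
        (3 * u3 - 6 * u2 + 4) / 6.0,
        (-3 * u3 + 3 * u2 + 3 * u + 1) / 6.0,
        u3 / 6.0,
    )

def build_lut(spacing):
    """Precompute basis values for every pixel offset inside one lattice cell.

    With an integer spacing, the fractional position u = (x % spacing) / spacing
    takes only `spacing` distinct values, so the weights can be tabulated once.
    """
    return [cubic_bspline_basis(k / spacing) for k in range(spacing)]

def displacement_1d(x, control, spacing, lut):
    """Displacement at integer pixel x from a padded 1D control lattice."""
    i = x // spacing                  # index of the first contributing control point
    weights = lut[x % spacing]        # table lookup replaces polynomial evaluation
    return sum(w * control[i + l] for l, w in enumerate(weights))

lut = build_lut(4)
d = displacement_1d(7, [2.0] * 8, 4, lut)
print(lut[0], d)
```

    Because the four weights always sum to one, a constant control lattice yields that same constant displacement, which makes a convenient sanity check for the table.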

  9. A class of parallel algorithms for computation of the manipulator inertia matrix

    NASA Technical Reports Server (NTRS)

    Fijany, Amir; Bejczy, Antal K.

    1989-01-01

    Parallel and parallel/pipeline algorithms for computation of the manipulator inertia matrix are presented. An algorithm based on composite rigid-body spatial inertia method, which provides better features for parallelization, is used for the computation of the inertia matrix. Two parallel algorithms are developed which achieve the time lower bound in computation. Also described is the mapping of these algorithms with topological variation on a two-dimensional processor array, with nearest-neighbor connection, and with cardinality variation on a linear processor array. An efficient parallel/pipeline algorithm for the linear array was also developed, but at significantly higher efficiency.

  10. Algorithmic complexity of quantum capacity

    NASA Astrophysics Data System (ADS)

    Oskouei, Samad Khabbazi; Mancini, Stefano

    2018-04-01

    We analyze the notion of quantum capacity from the perspective of algorithmic (descriptive) complexity. To this end, we resort to the concept of semi-computability in order to describe quantum states and quantum channel maps. We introduce algorithmic entropies (like algorithmic quantum coherent information) and derive relevant properties for them. Then we show that quantum capacity based on semi-computable concept equals the entropy rate of algorithmic coherent information, which in turn equals the standard quantum capacity. Thanks to this, we finally prove that the quantum capacity, for a given semi-computable channel, is limit computable.

  11. Range image registration based on hash map and moth-flame optimization

    NASA Astrophysics Data System (ADS)

    Zou, Li; Ge, Baozhen; Chen, Lei

    2018-03-01

    Over the past decade, evolutionary algorithms (EAs) have been introduced to solve range image registration problems because of their robustness and high precision. However, EA-based range image registration algorithms are time-consuming. To reduce the computational time, an EA-based range image registration algorithm using hash map and moth-flame optimization is proposed. In this registration algorithm, a hash map is used to avoid over-exploitation in registration process. Additionally, we present a search equation that is better at exploration and a restart mechanism to avoid being trapped in local minima. We compare the proposed registration algorithm with the registration algorithms using moth-flame optimization and several state-of-the-art EA-based registration algorithms. The experimental results show that the proposed algorithm has a lower computational cost than other algorithms and achieves similar registration precision.

  12. Hybrid Architectures for Evolutionary Computing Algorithms

    DTIC Science & Technology

    2008-01-01

    other EC algorithms to FPGA Core Burns P1026/MAPLD 200532 Genetic Algorithm Hardware References S. Scott, A. Samal, and S. Seth, “HGA: A Hardware Based...on Parallel and Distributed Processing (IPPS/SPDP '98), pp. 316-320, Proceedings. IEEE Computer Society 1998. [12] Scott, S. D., Samal, A., and...Algorithm Hardware References S. Scott, A. Samal, and S. Seth, “HGA: A Hardware Based Genetic Algorithm”, Proceedings of the 1995 ACM Third

  13. High performance transcription factor-DNA docking with GPU computing

    PubMed Central

    2012-01-01

    Background: Protein-DNA docking is a very challenging problem in structural bioinformatics and has important implications in a number of applications, such as structure-based prediction of transcription factor binding sites and rational drug design. Protein-DNA docking is very computationally demanding due to the high cost of energy calculation and the statistical nature of conformational sampling algorithms. More importantly, experiments show that the docking quality depends on the coverage of the conformational sampling space. It is therefore desirable to accelerate the computation of the docking algorithm, not only to reduce computing time, but also to improve docking quality. Methods: In an attempt to accelerate the sampling process and to improve the docking performance, we developed a graphics processing unit (GPU)-based protein-DNA docking algorithm. The algorithm employs a potential-based energy function to describe the binding affinity of a protein-DNA pair, and integrates Monte-Carlo simulation and a simulated annealing method to search through the conformational space. Algorithmic techniques were developed to improve the computation efficiency and scalability on GPU-based high performance computing systems. Results: The effectiveness of our approach is tested on a non-redundant set of 75 TF-DNA complexes and a newly developed TF-DNA docking benchmark. We demonstrated that the GPU-based docking algorithm can significantly accelerate the simulation process and thereby improve the chance of finding near-native TF-DNA complex structures. This study also suggests that further improvement in protein-DNA docking research would require efforts from two integral aspects: improvement in computation efficiency and energy function design. Conclusions: We present a high performance computing approach for improving the prediction accuracy of protein-DNA docking. The GPU-based docking algorithm accelerates the search of the conformational space and thus increases the chance of finding more near-native structures. To the best of our knowledge, this is the first ad hoc effort of applying GPU or GPU clusters to the protein-DNA docking problem. PMID:22759575
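    The combination of Monte-Carlo moves with simulated annealing described above can be sketched on a toy problem. The one-dimensional state, the quadratic energy function, and all parameter values below are assumptions for illustration; in docking the state would be a protein-DNA conformation and the energy a knowledge-based potential, and nothing here reflects the authors' GPU code.

```python
import math
import random

def simulated_annealing(energy, x0, step, t_start, t_end, cooling, rng):
    """Minimise `energy` with Metropolis moves under a geometric cooling schedule."""
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    t = t_start
    while t > t_end:
        cand = x + rng.uniform(-step, step)   # random perturbation of the state
        e_cand = energy(cand)
        # Metropolis criterion: always accept downhill moves; accept uphill
        # moves with probability exp(-dE / T), which shrinks as T cools.
        if e_cand <= e or rng.random() < math.exp(-(e_cand - e) / t):
            x, e = cand, e_cand
            if e < best_e:
                best_x, best_e = x, e
        t *= cooling
    return best_x, best_e

def toy_energy(x):
    """Hypothetical energy landscape with its minimum at x = 3."""
    return (x - 3.0) ** 2

best_x, best_e = simulated_annealing(
    toy_energy, x0=0.0, step=0.5, t_start=1.0, t_end=1e-3,
    cooling=0.99, rng=random.Random(0))
print(best_x, best_e)
```

    Each temperature step is independent of the acceptance history, so many such chains can be run side by side, which is one reason this kind of sampling maps well onto parallel hardware.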

  14. Image restoration for three-dimensional fluorescence microscopy using an orthonormal basis for efficient representation of depth-variant point-spread functions

    PubMed Central

    Patwary, Nurmohammed; Preza, Chrysanthe

    2015-01-01

    A depth-variant (DV) image restoration algorithm for wide field fluorescence microscopy, using an orthonormal basis decomposition of DV point-spread functions (PSFs), is investigated in this study. The efficient PSF representation is based on a previously developed principal component analysis (PCA), which is computationally intensive. We present an approach developed to reduce the number of DV PSFs required for the PCA computation, thereby making the PCA-based approach computationally tractable for thick samples. Restoration results from both synthetic and experimental images show consistency and that the proposed algorithm addresses efficiently depth-induced aberration using a small number of principal components. Comparison of the PCA-based algorithm with a previously-developed strata-based DV restoration algorithm demonstrates that the proposed method improves performance by 50% in terms of accuracy and simultaneously reduces the processing time by 64% using comparable computational resources. PMID:26504634

  15. A Parallel Nonrigid Registration Algorithm Based on B-Spline for Medical Images

    PubMed Central

    Wang, Yangping; Wang, Song

    2016-01-01

    The nonrigid registration algorithm based on B-spline Free-Form Deformation (FFD) plays a key role and is widely applied in medical image processing due to the good flexibility and robustness. However, it requires a tremendous amount of computing time to obtain more accurate registration results especially for a large amount of medical image data. To address the issue, a parallel nonrigid registration algorithm based on B-spline is proposed in this paper. First, the Logarithm Squared Difference (LSD) is considered as the similarity metric in the B-spline registration algorithm to improve registration precision. After that, we create a parallel computing strategy and lookup tables (LUTs) to reduce the complexity of the B-spline registration algorithm. As a result, the computing time of three time-consuming steps including B-splines interpolation, LSD computation, and the analytic gradient computation of LSD, is efficiently reduced, for the B-spline registration algorithm employs the Nonlinear Conjugate Gradient (NCG) optimization method. Experimental results of registration quality and execution efficiency on the large amount of medical images show that our algorithm achieves a better registration accuracy in terms of the differences between the best deformation fields and ground truth and a speedup of 17 times over the single-threaded CPU implementation due to the powerful parallel computing ability of Graphics Processing Unit (GPU). PMID:28053653

  16. Cloud Computing and Its Applications in GIS

    NASA Astrophysics Data System (ADS)

    Kang, Cao

    2011-12-01

    Cloud computing is a novel computing paradigm that offers highly scalable and highly available distributed computing services. The objectives of this research are to: 1. analyze and understand cloud computing and its potential for GIS; 2. discover the feasibilities of migrating truly spatial GIS algorithms to distributed computing infrastructures; 3. explore a solution to host and serve large volumes of raster GIS data efficiently and speedily. These objectives thus form the basis for three professional articles. The first article is entitled "Cloud Computing and Its Applications in GIS". This paper introduces the concept, structure, and features of cloud computing. Features of cloud computing such as scalability, parallelization, and high availability make it a very capable computing paradigm. Unlike High Performance Computing (HPC), cloud computing uses inexpensive commodity computers. The uniform administration systems in cloud computing make it easier to use than GRID computing. Potential advantages of cloud-based GIS systems such as lower barrier to entry are consequently presented. Three cloud-based GIS system architectures are proposed: public cloud-based GIS systems, private cloud-based GIS systems and hybrid cloud-based GIS systems. Public cloud-based GIS systems provide the lowest entry barriers for users among these three architectures, but their advantages are offset by data security and privacy related issues. Private cloud-based GIS systems provide the best data protection, though they have the highest entry barriers. Hybrid cloud-based GIS systems provide a compromise between these extremes. The second article is entitled "A cloud computing algorithm for the calculation of Euclidian distance for raster GIS". Euclidean distance is a truly spatial GIS algorithm. Classical algorithms such as the pushbroom and growth ring techniques require computational propagation through the entire raster image, which makes them incompatible with the distributed nature of cloud computing. This paper presents a parallel Euclidean distance algorithm that works seamlessly with the distributed nature of cloud computing infrastructures. The mechanism of this algorithm is to subdivide a raster image into sub-images and wrap them with a one pixel deep edge layer of individually computed distance information. Each sub-image is then processed by a separate node, after which the resulting sub-images are reassembled into the final output. It is shown that while any rectangular sub-image shape can be used, those approximating squares are computationally optimal. This study also serves as a demonstration of this subdivide and layer-wrap strategy, which would enable the migration of many truly spatial GIS algorithms to cloud computing infrastructures. However, this research also indicates that certain spatial GIS algorithms such as cost distance cannot be migrated by adopting this mechanism, which presents significant challenges for the development of cloud-based GIS systems. The third article is entitled "A Distributed Storage Schema for Cloud Computing based Raster GIS Systems". This paper proposes a NoSQL Database Management System (NDDBMS) based raster GIS data storage schema. NDDBMS has good scalability and is able to use distributed commodity computers, which make it superior to Relational Database Management Systems (RDBMS) in a cloud computing environment. In order to provide optimized data service performance, the proposed storage schema analyzes the nature of commonly used raster GIS data sets. It discriminates two categories of commonly used data sets, and then designs corresponding data storage models for both categories. As a result, the proposed storage schema is capable of hosting and serving enormous volumes of raster GIS data speedily and efficiently on cloud computing infrastructures. In addition, the scheme also takes advantage of the data compression characteristics of Quadtrees, thus promoting efficient data storage. Through this assessment of cloud computing technology, the exploration of the challenges and solutions to the migration of GIS algorithms to cloud computing infrastructures, and the examination of strategies for serving large amounts of GIS data in a cloud computing infrastructure, this dissertation lends support to the feasibility of building a cloud-based GIS system. However, there are still challenges that need to be addressed before a full-scale functional cloud-based GIS system can be successfully implemented. (Abstract shortened by UMI.)

  17. Computational path planner for product assembly in complex environments

    NASA Astrophysics Data System (ADS)

    Shang, Wei; Liu, Jianhua; Ning, Ruxin; Liu, Mi

    2013-03-01

    Assembly path planning is a crucial problem in assembly related design and manufacturing processes. Sampling based motion planning algorithms are used for computational assembly path planning. However, the performance of such algorithms may degrade considerably in environments with complex product structure, narrow passages or other challenging scenarios. A computational path planner for automatic assembly path planning in complex 3D environments is presented. The global planning process is divided into three phases based on the environment, and specific algorithms are proposed and utilized in each phase to solve the challenging issues. A novel ray test based stochastic collision detection method is proposed to evaluate the intersection between two polyhedral objects. This method avoids fake collisions in conventional methods and degrades the geometric constraint when a part has to be removed with surface contact with other parts. A refined history based rapidly-exploring random tree (RRT) algorithm, which biases the growth of the tree based on its planning history, is proposed and employed in the planning phase where the path is simple but the space is highly constrained. A novel adaptive RRT algorithm is developed for the path planning problem with challenging scenarios and uncertain environment. With extending values assigned on each tree node and extending schemes applied, the tree can adapt its growth to explore complex environments more efficiently. Experiments on the key algorithms are carried out and comparisons are made between the conventional path planning algorithms and the presented ones. The comparison results show that, based on the proposed algorithms, the path planner can compute assembly paths in challenging complex environments more efficiently and with higher success. This research provides a reference for the study of computational assembly path planning under complex environments.

  18. Design of a Performance-Responsive Drill and Practice Algorithm for Computer-Based Training.

    ERIC Educational Resources Information Center

    Vazquez-Abad, Jesus; LaFleur, Marc

    1990-01-01

    Reviews criticisms of the use of drill and practice programs in educational computing and describes potentials for its use in instruction. Topics discussed include guidelines for developing computer-based drill and practice; scripted training courseware; item format design; item bank design; and a performance-responsive algorithm for item…

  19. Energy-Aware Computation Offloading of IoT Sensors in Cloudlet-Based Mobile Edge Computing.

    PubMed

    Ma, Xiao; Lin, Chuang; Zhang, Han; Liu, Jianwei

    2018-06-15

    Mobile edge computing is proposed as a promising computing paradigm to relieve the excessive burden of data centers and mobile networks, which is induced by the rapid growth of Internet of Things (IoT). This work introduces the cloud-assisted multi-cloudlet framework to provision scalable services in cloudlet-based mobile edge computing. Due to the constrained computation resources of cloudlets and limited communication resources of wireless access points (APs), IoT sensors with identical computation offloading decisions interact with each other. To optimize the processing delay and energy consumption of computation tasks, theoretic analysis of the computation offloading decision problem of IoT sensors is presented in this paper. In more detail, the computation offloading decision problem of IoT sensors is formulated as a computation offloading game and the condition of Nash equilibrium is derived by introducing the tool of a potential game. By exploiting the finite improvement property of the game, the Computation Offloading Decision (COD) algorithm is designed to provide decentralized computation offloading strategies for IoT sensors. Simulation results demonstrate that the COD algorithm can significantly reduce the system cost compared with the random-selection algorithm and the cloud-first algorithm. Furthermore, the COD algorithm can scale well with increasing IoT sensors.
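    The finite improvement property mentioned above means that letting sensors switch, one at a time, to a strictly cheaper option must eventually stop at a Nash equilibrium. The sketch below shows such best-response dynamics on a simplified congestion model: the cost functions (a fixed local cost per sensor, and a cloudlet cost that grows linearly with the number of sensors using it) are assumptions for illustration and are not the delay and energy model or the COD algorithm of the paper.

```python
def best_response_offloading(local_cost, cloudlet_unit_cost, max_rounds=1000):
    """Run best-response dynamics until no sensor can lower its own cost.

    Option 0 means "compute locally"; option k >= 1 means "offload to
    cloudlet k - 1", whose cost is unit_cost * (number of sensors using it).
    """
    n = len(local_cost)
    choice = [0] * n
    load = [0] * len(cloudlet_unit_cost)
    for _ in range(max_rounds):
        improved = False
        for i in range(n):
            current = choice[i]
            if current:
                load[current - 1] -= 1        # take sensor i out before comparing
            options = [(local_cost[i], 0)] + [
                (unit * (load[k] + 1), k + 1)
                for k, unit in enumerate(cloudlet_unit_cost)
            ]
            current_cost = next(c for c, o in options if o == current)
            best_cost, best = min(options)
            if best_cost < current_cost:      # switch only on strict improvement
                choice[i] = best
                improved = True
            if choice[i]:
                load[choice[i] - 1] += 1      # put sensor i back on its choice
        if not improved:
            return choice                     # nobody wants to deviate: equilibrium
    return choice

decisions = best_response_offloading([5.0, 5.0, 1.0, 5.0], [2.0, 3.0])
print(decisions)
```

    Each sensor only needs the current load on each cloudlet to make its decision, which is what makes this style of update decentralized.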

  20. QRS detection based ECG quality assessment.

    PubMed

    Hayn, Dieter; Jammerbund, Bernhard; Schreier, Günter

    2012-09-01

    Although immediate feedback concerning ECG signal quality during recording is useful, up to now not much literature describing quality measures is available. We have implemented and evaluated four ECG quality measures. Empty lead criterion (A), spike detection criterion (B) and lead crossing point criterion (C) were calculated from basic signal properties. Measure D quantified the robustness of QRS detection when applied to the signal. An advanced Matlab-based algorithm combining all four measures and a simplified algorithm for Android platforms, excluding measure D, were developed. Both algorithms were evaluated by taking part in the Computing in Cardiology Challenge 2011. Each measure's accuracy and computing time were evaluated separately. During the challenge, the advanced algorithm correctly classified 93.3% of the ECGs in the training-set and 91.6% in the test-set. Scores for the simplified algorithm were 0.834 in event 2 and 0.873 in event 3. Computing time for measure D was almost five times higher than for other measures. Required accuracy levels depend on the application and are related to computing time. While our simplified algorithm may be accurate for real-time feedback during ECG self-recordings, QRS detection based measures can further increase the performance if sufficient computing power is available.
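    Measures of the kind labelled A and B above depend only on basic signal properties, so they can be sketched in a few lines. The definitions and thresholds below (a lead counts as empty when its amplitude range is tiny, and a spike is a large jump between consecutive samples) are assumptions made for illustration; the abstract does not give the authors' actual criteria or values.

```python
def empty_lead(samples, min_range=0.05):
    """Flag a lead whose amplitude range is below min_range (assumed mV)."""
    return max(samples) - min(samples) < min_range

def has_spike(samples, max_step=2.0):
    """Flag a lead containing a sample-to-sample jump above max_step (assumed mV)."""
    return any(abs(b - a) > max_step for a, b in zip(samples, samples[1:]))

def acceptable(leads):
    """Accept a recording only if no lead is empty and no lead has a spike."""
    return not any(empty_lead(s) or has_spike(s) for s in leads.values())

good = acceptable({"I": [0.0, 0.4, 1.0, 0.3]})
bad = acceptable({"I": [0.0, 0.4, 1.0, 0.3], "II": [0.0, 0.0, 0.0, 0.0]})
print(good, bad)
```

    Both checks are single passes over the samples, which fits the abstract's point that such measures are cheap compared with running a QRS detector.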

  1. Computing border bases using mutant strategies

    NASA Astrophysics Data System (ADS)

    Ullah, E.; Abbas Khan, S.

    2014-01-01

    Border bases, a generalization of Gröbner bases, have actively been addressed during recent years due to their applicability to industrial problems. In cryptography and coding theory a useful application of border bases is to solve zero-dimensional systems of polynomial equations over finite fields, which motivates us to develop optimizations of the algorithms that compute border bases. In 2006, Kehrein and Kreuzer formulated the Border Basis Algorithm (BBA), an algorithm which allows the computation of border bases that relate to a degree-compatible term ordering. In 2007, J. Ding et al. introduced mutant strategies based on finding special lower degree polynomials in the ideal. The mutant strategies aim to distinguish special lower degree polynomials (mutants) from the other polynomials and give them priority in the process of generating new polynomials in the ideal. In this paper we develop hybrid algorithms that use the ideas of J. Ding et al. involving the concept of mutants to optimize the Border Basis Algorithm for solving systems of polynomial equations over finite fields. In particular, we recall a version of the Border Basis Algorithm which is actually called the Improved Border Basis Algorithm and propose two hybrid algorithms, called MBBA and IMBBA. The new mutant variants provide us space efficiency as well as time efficiency. The efficiency of these newly developed hybrid algorithms is discussed using standard cryptographic examples.

  2. Intelligent fuzzy approach for fast fractal image compression

    NASA Astrophysics Data System (ADS)

    Nodehi, Ali; Sulong, Ghazali; Al-Rodhaan, Mznah; Al-Dhelaan, Abdullah; Rehman, Amjad; Saba, Tanzila

    2014-12-01

    Fractal image compression (FIC) is recognized as an NP-hard problem, and it suffers from a high number of mean square error (MSE) computations. In this paper, a two-phase algorithm was proposed to reduce the MSE computation of FIC. In the first phase, based on edge property, range and domains are arranged. In the second one, the imperialist competitive algorithm (ICA) is used according to the classified blocks. For maintaining the quality of the retrieved image and accelerating algorithm operation, we divided the solutions into two groups: developed countries and undeveloped countries. Simulations were carried out to evaluate the performance of the developed approach. The promising results achieved show better performance than genetic algorithm (GA)-based and Full-search algorithms in terms of decreasing the number of MSE computations. By reducing the number of MSE computations, the proposed algorithm ran 463 times faster than the Full-search algorithm, while the retrieved image quality did not change considerably.
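    Why classifying blocks reduces the number of MSE computations can be shown with a small sketch. It compares a full search over all domain blocks with a search restricted to domain blocks of the same class as the range block. The flat-list block representation, the range-based edge test, and its threshold are assumptions for illustration; the sketch omits the contrast and brightness transform of real FIC and does not implement the ICA phase.

```python
def mse(a, b):
    """Mean square error between two equally sized blocks (flat lists)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def is_edge_block(block, threshold=5):
    """Hypothetical edge test: a block has an edge if its value range is large."""
    return max(block) - min(block) >= threshold

def search(range_block, domain_blocks, candidates):
    """Return (best_index, best_mse, number_of_mse_evaluations)."""
    best_index, best_err = -1, float("inf")
    for i in candidates:
        err = mse(range_block, domain_blocks[i])
        if err < best_err:
            best_index, best_err = i, err
    return best_index, best_err, len(candidates)

domains = [[0, 0, 0, 0], [10, 10, 10, 10], [0, 10, 0, 10], [2, 8, 2, 8]]
range_block = [0, 9, 0, 9]

full = search(range_block, domains, list(range(len(domains))))
same_class = [i for i, d in enumerate(domains)
              if is_edge_block(d) == is_edge_block(range_block)]
pruned = search(range_block, domains, same_class)
print(full, pruned)
```

    In this toy case the restricted search finds the same best match with half the MSE evaluations; in general, pruning by class can miss the global best, which is why the abstract reports image quality alongside speed.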

  3. Embedded assessment algorithms within home-based cognitive computer game exercises for elders.

    PubMed

    Jimison, Holly; Pavel, Misha

    2006-01-01

    With the recent consumer interest in computer-based activities designed to improve cognitive performance, there is a growing need for scientific assessment algorithms to validate the potential contributions of cognitive exercises. In this paper, we present a novel methodology for incorporating dynamic cognitive assessment algorithms within computer games designed to enhance cognitive performance. We describe how this approach works for a variety of computer applications and describe cognitive monitoring results for one of the computer game exercises. The real-time cognitive assessments also provide a control signal for adapting the difficulty of the game exercises and providing tailored help for elders of varying abilities.

  4. Parallelizing flow-accumulation calculations on graphics processing units—From iterative DEM preprocessing algorithm to recursive multiple-flow-direction algorithm

    NASA Astrophysics Data System (ADS)

    Qin, Cheng-Zhi; Zhan, Lijun

    2012-06-01

    As one of the important tasks in digital terrain analysis, the calculation of flow accumulations from gridded digital elevation models (DEMs) usually involves two steps in a real application: (1) using an iterative DEM preprocessing algorithm to remove the depressions and flat areas commonly contained in real DEMs, and (2) using a recursive flow-direction algorithm to calculate the flow accumulation for every cell in the DEM. Because both algorithms are computationally intensive, quick calculation of the flow accumulations from a DEM (especially for a large area) presents a practical challenge to personal computer (PC) users. In recent years, rapid increases in hardware capacity of the graphics processing units (GPUs) provided in modern PCs have made it possible to meet this challenge in a PC environment. Parallel computing on GPUs using a compute-unified-device-architecture (CUDA) programming model has been explored to speed up the execution of the single-flow-direction algorithm (SFD). However, the parallel implementation on a GPU of the multiple-flow-direction (MFD) algorithm, which generally performs better than the SFD algorithm, has not been reported. Moreover, GPU-based parallelization of the DEM preprocessing step in the flow-accumulation calculations has not been addressed. This paper proposes a parallel approach to calculate flow accumulations (including both iterative DEM preprocessing and a recursive MFD algorithm) on a CUDA-compatible GPU. For the parallelization of an MFD algorithm (MFD-md), two different parallelization strategies using a GPU are explored. The first parallelization strategy, which has been used in the existing parallel SFD algorithm on GPU, has the problem of computing redundancy. Therefore, we designed a parallelization strategy based on graph theory. The application results show that the proposed parallel approach to calculate flow accumulations on a GPU performs much faster than either sequential algorithms or other parallel GPU-based algorithms based on existing parallelization strategies.
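    The recursive step described above can be sketched for the simpler single-flow-direction case. This is a sequential CPU illustration, not the GPU method or the MFD-md algorithm of the paper; it assumes a DEM that has already been preprocessed so that it contains no depressions or flats, and it uses the common steepest-descent rule over eight neighbours.

```python
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def flow_targets(dem):
    """Map each cell to its steepest-descent neighbour (None for an outlet)."""
    rows, cols = len(dem), len(dem[0])
    target = {}
    for r in range(rows):
        for c in range(cols):
            best, best_slope = None, 0.0
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dist = (dr * dr + dc * dc) ** 0.5   # diagonal steps are longer
                    slope = (dem[r][c] - dem[nr][nc]) / dist
                    if slope > best_slope:
                        best, best_slope = (nr, nc), slope
            target[(r, c)] = best
    return target

def flow_accumulation(dem):
    """Count, for every cell, itself plus all cells draining through it."""
    target = flow_targets(dem)
    donors = {cell: [] for cell in target}
    for cell, t in target.items():
        if t is not None:
            donors[t].append(cell)
    acc = {}

    def visit(cell):
        # Recursive definition: own unit of flow plus everything upstream.
        if cell not in acc:
            acc[cell] = 1 + sum(visit(d) for d in donors[cell])
        return acc[cell]

    for cell in target:
        visit(cell)
    return [[acc[(r, c)] for c in range(len(dem[0]))] for r in range(len(dem))]

dem = [[9, 8, 7],
       [8, 6, 5],
       [7, 5, 1]]
accumulation = flow_accumulation(dem)
print(accumulation)
```

    The dependency of each cell on its upstream donors forms a graph, and it is this graph that limits how the calculation can be split across parallel threads.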

  5. Parallel Computational Protein Design.

    PubMed

    Zhou, Yichao; Donald, Bruce R; Zeng, Jianyang

    2017-01-01

    Computational structure-based protein design (CSPD) is an important problem in computational biology, which aims to design or improve a prescribed protein function based on a protein structure template. It provides a practical tool for real-world protein engineering applications. A popular CSPD method that is guaranteed to find the global minimum energy solution (GMEC) is to combine both dead-end elimination (DEE) and A* tree search algorithms. However, in this framework, the A* search algorithm can run in exponential time in the worst case, which may become the computation bottleneck of the large-scale computational protein design process. To address this issue, we extend and add a new module to the OSPREY program that was previously developed in the Donald lab (Gainza et al., Methods Enzymol 523:87, 2013) to implement a GPU-based massively parallel A* algorithm for improving the protein design pipeline. By exploiting the modern GPU computational framework and optimizing the computation of the heuristic function for A* search, our new program, called gOSPREY, can provide up to four orders of magnitude speedups in large protein design cases with a small memory overhead compared to the traditional A* search algorithm implementation, while still guaranteeing the optimality. In addition, gOSPREY can be configured to run in a bounded-memory mode to tackle the problems in which the conformation space is too large and the global optimal solution could not be computed previously. Furthermore, the GPU-based A* algorithm implemented in the gOSPREY program can be combined with the state-of-the-art rotamer pruning algorithms such as iMinDEE (Gainza et al., PLoS Comput Biol 8:e1002335, 2012) and DEEPer (Hallen et al., Proteins 81:18-39, 2013) to also consider continuous backbone and side-chain flexibility.
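    The optimality guarantee mentioned above comes from A* expanding nodes in order of cost-so-far plus an admissible (never overestimating) heuristic. The sketch below is a generic sequential A* on a small hand-made graph; the graph, the heuristic values, and the function names are assumptions for illustration. In protein design the nodes would be partial rotamer assignments and the heuristic a lower bound on the remaining energy, and nothing here reflects the gOSPREY implementation.

```python
import heapq

def a_star(start, goal, neighbours, heuristic):
    """Return (cost, path) of a cheapest path, or None if the goal is unreachable."""
    frontier = [(heuristic(start), 0.0, start, [start])]   # entries: (f, g, node, path)
    best_g = {start: 0.0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue                                       # stale queue entry
        for nxt, cost in neighbours(node):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + heuristic(nxt), g2, nxt, path + [nxt]))
    return None

# Hypothetical weighted graph and an admissible heuristic toward "D".
GRAPH = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)], "C": [("D", 1)], "D": []}
H = {"A": 2, "B": 2, "C": 1, "D": 0}

result = a_star("A", "D", lambda n: GRAPH[n], lambda n: H[n])
print(result)
```

    A tighter heuristic means fewer nodes are expanded before the goal is popped, which is why the abstract singles out the heuristic computation as a target for optimization.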

  6. Benchmarking neuromorphic vision: lessons learnt from computer vision

    PubMed Central

    Tan, Cheston; Lallee, Stephane; Orchard, Garrick

    2015-01-01

    Neuromorphic Vision sensors have improved greatly since the first silicon retina was presented almost three decades ago. They have recently matured to the point where they are commercially available and can be operated by laymen. However, despite improved availability of sensors, there remains a lack of good datasets, while algorithms for processing spike-based visual data are still in their infancy. On the other hand, frame-based computer vision algorithms are far more mature, thanks in part to widely accepted datasets which allow direct comparison between algorithms and encourage competition. We are presented with a unique opportunity to shape the development of Neuromorphic Vision benchmarks and challenges by leveraging what has been learnt from the use of datasets in frame-based computer vision. Taking advantage of this opportunity, in this paper we review the role that benchmarks and challenges have played in the advancement of frame-based computer vision, and suggest guidelines for the creation of Neuromorphic Vision benchmarks and challenges. We also discuss the unique challenges faced when benchmarking Neuromorphic Vision algorithms, particularly when attempting to provide direct comparison with frame-based computer vision. PMID:26528120

  7. Visual saliency-based fast intracoding algorithm for high efficiency video coding

    NASA Astrophysics Data System (ADS)

    Zhou, Xin; Shi, Guangming; Zhou, Wei; Duan, Zhemin

    2017-01-01

    Intraprediction has been significantly improved in high efficiency video coding over H.264/AVC with quad-tree-based coding unit (CU) structure from size 64×64 to 8×8 and more prediction modes. However, these techniques cause a dramatic increase in computational complexity. An intracoding algorithm is proposed that consists of perceptual fast CU size decision algorithm and fast intraprediction mode decision algorithm. First, based on the visual saliency detection, an adaptive and fast CU size decision method is proposed to alleviate intraencoding complexity. Furthermore, a fast intraprediction mode decision algorithm with step halving rough mode decision method and early modes pruning algorithm is presented to selectively check the potential modes and effectively reduce the complexity of computation. Experimental results show that our proposed fast method reduces the computational complexity of the current HM to about 57% in encoding time with only 0.37% increases in BD rate. Meanwhile, the proposed fast algorithm has reasonable peak signal-to-noise ratio losses and nearly the same subjective perceptual quality.

  8. Minimum time acceleration of aircraft turbofan engines by using an algorithm based on nonlinear programming

    NASA Technical Reports Server (NTRS)

    Teren, F.

    1977-01-01

    Minimum time accelerations of aircraft turbofan engines are presented. The calculation of these accelerations was made by using a piecewise linear engine model, and an algorithm based on nonlinear programming. Use of this model and algorithm allows such trajectories to be readily calculated on a digital computer with a minimal expenditure of computer time.

  9. Generalization of the Lord-Wingersky Algorithm to Computing the Distribution of Summed Test Scores Based on Real-Number Item Scores

    ERIC Educational Resources Information Center

    Kim, Seonghoon

    2013-01-01

    With known item response theory (IRT) item parameters, Lord and Wingersky provided a recursive algorithm for computing the conditional frequency distribution of number-correct test scores, given proficiency. This article presents a generalized algorithm for computing the conditional distribution of summed test scores involving real-number item…
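
    As an illustration of the recursion mentioned above, the following is a minimal sketch of the classical Lord-Wingersky algorithm for dichotomous (0/1) items only; it is not the generalized real-number algorithm of the article. The function name and the example probabilities are hypothetical, and the per-item probabilities of a correct response are assumed to have been computed from an IRT model at a fixed proficiency.

```python
def lord_wingersky(p_correct):
    """Distribution of number-correct scores for independent 0/1 items.

    p_correct[i] is the (assumed, precomputed) probability of answering
    item i correctly at one fixed proficiency level.
    Returns a list where entry s is P(score == s).
    """
    dist = [1.0]  # before any item is added, score 0 has probability 1
    for p in p_correct:
        new = [0.0] * (len(dist) + 1)
        for score, prob in enumerate(dist):
            new[score] += prob * (1.0 - p)  # item answered incorrectly
            new[score + 1] += prob * p      # item answered correctly
        dist = new
    return dist


# Hypothetical three-item test
dist = lord_wingersky([0.8, 0.6, 0.5])
```

    Each item is folded in with one pass over the current distribution, so the cost grows quadratically with the number of items rather than exponentially with the number of response patterns.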

  10. A Benders based rolling horizon algorithm for a dynamic facility location problem

    DOE PAGES

    Marufuzzaman, Mohammad; Gedik, Ridvan; Roni, Mohammad S.

    2016-06-28

    This study presents a well-known capacitated dynamic facility location problem (DFLP) that satisfies the customer demand at a minimum cost by determining the time period for opening, closing, or retaining an existing facility in a given location. To solve this challenging NP-hard problem, this paper develops a unique hybrid solution algorithm that combines a rolling horizon algorithm with an accelerated Benders decomposition algorithm. Extensive computational experiments are performed on benchmark test instances to evaluate the hybrid algorithm’s efficiency and robustness in solving the DFLP problem. Computational results indicate that the hybrid Benders based rolling horizon algorithm consistently offers high quality feasible solutions in a much shorter computational time period than the standalone rolling horizon and accelerated Benders decomposition algorithms in the experimental range.

  11. Volumetric visualization algorithm development for an FPGA-based custom computing machine

    NASA Astrophysics Data System (ADS)

    Sallinen, Sami J.; Alakuijala, Jyrki; Helminen, Hannu; Laitinen, Joakim

    1998-05-01

    Rendering volumetric medical images is a burdensome computational task for contemporary computers due to the large size of the data sets. Custom designed reconfigurable hardware could considerably speed up volume visualization if an algorithm suitable for the platform is used. We present an algorithm and speedup techniques for visualizing volumetric medical CT and MR images with a custom-computing machine based on a Field Programmable Gate Array (FPGA). We also present simulated performance results of the proposed algorithm calculated with a software implementation running on a desktop PC. Our algorithm is capable of generating perspective projection renderings of single and multiple isosurfaces with transparency, simulated X-ray images, and Maximum Intensity Projections (MIP). Although more speedup techniques exist for parallel projection than for perspective projection, we have constrained ourselves to perspective viewing, because of its importance in the field of radiotherapy. The algorithm we have developed is based on ray casting, and the rendering is sped up by three different methods: shading speedup by gradient precalculation, a new generalized version of Ray-Acceleration by Distance Coding (RADC), and background ray elimination by speculative ray selection.

  12. Arbitrated Quantum Signature with Hamiltonian Algorithm Based on Blind Quantum Computation

    NASA Astrophysics Data System (ADS)

    Shi, Ronghua; Ding, Wanting; Shi, Jinjing

    2018-03-01

    A novel arbitrated quantum signature (AQS) scheme is proposed motivated by the Hamiltonian algorithm (HA) and blind quantum computation (BQC). The generation and verification of signature algorithm is designed based on HA, which enables the scheme to rely less on computational complexity. It is unnecessary to recover original messages when verifying signatures since the blind quantum computation is applied, which can improve the simplicity and operability of our scheme. It is proved that the scheme can be deployed securely, and the extended AQS has some extensive applications in E-payment system, E-government, E-business, etc.

  13. Arbitrated Quantum Signature with Hamiltonian Algorithm Based on Blind Quantum Computation

    NASA Astrophysics Data System (ADS)

    Shi, Ronghua; Ding, Wanting; Shi, Jinjing

    2018-07-01

    A novel arbitrated quantum signature (AQS) scheme is proposed motivated by the Hamiltonian algorithm (HA) and blind quantum computation (BQC). The generation and verification of signature algorithm is designed based on HA, which enables the scheme to rely less on computational complexity. It is unnecessary to recover original messages when verifying signatures since the blind quantum computation is applied, which can improve the simplicity and operability of our scheme. It is proved that the scheme can be deployed securely, and the extended AQS has some extensive applications in E-payment system, E-government, E-business, etc.

  14. Computational Workbench for Multibody Dynamics

    NASA Technical Reports Server (NTRS)

    Edmonds, Karina

    2007-01-01

    PyCraft is a computer program that provides an interactive, workbenchlike computing environment for developing and testing algorithms for multibody dynamics. Examples of multibody dynamic systems amenable to analysis with the help of PyCraft include land vehicles, spacecraft, robots, and molecular models. PyCraft is based on the Spatial-Operator-Algebra (SOA) formulation for multibody dynamics. The SOA operators enable construction of simple and compact representations of complex multibody dynamical equations. Within the PyCraft computational workbench, users can, essentially, use the high-level SOA operator notation to represent the variety of dynamical quantities and algorithms and to perform computations interactively. PyCraft provides a Python-language interface to underlying C++ code. Working with SOA concepts, a user can create and manipulate Python-level operator classes in order to implement and evaluate new dynamical quantities and algorithms. During use of PyCraft, virtually all SOA-based algorithms are available for computational experiments.

  15. Parallel algorithms for computation of the manipulator inertia matrix

    NASA Technical Reports Server (NTRS)

    Amin-Javaheri, Masoud; Orin, David E.

    1989-01-01

    The development of an O(log2 N) parallel algorithm for the manipulator inertia matrix is presented. It is based on the most efficient serial algorithm which uses the composite rigid body method. Recursive doubling is used to reformulate the linear recurrence equations which are required to compute the diagonal elements of the matrix. It results in O(log2 N) levels of computation. Computation of the off-diagonal elements involves N linear recurrences of varying size and a new method, which avoids redundant computation of position and orientation transforms for the manipulator, is developed. The O(log2 N) algorithm is presented in both equation and graphic forms which clearly show the parallelism inherent in the algorithm.
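
    The recursive doubling idea can be sketched on a generic scalar first-order linear recurrence x[i] = a[i] * x[i-1] + b[i]. This is only an illustration under assumed names and data, not the inertia-matrix recurrences of the paper. Each step is treated as an affine map, and maps are composed pairwise at doubling distances, so N steps collapse into about log2 N rounds. The updates inside one round are mutually independent and could run in parallel; here they are simulated serially.

```python
def recursive_doubling(a, b, x0):
    """Solve x[i] = a[i] * x[i-1] + b[i] (with x[-1] = x0) by recursive doubling.

    Returns (solutions, rounds). coef[i] holds the affine map (A, B) that
    takes x0 directly to x[i] once all rounds are done.
    """
    n = len(a)
    coef = list(zip(a, b))
    step, rounds = 1, 0
    while step < n:
        prev = coef[:]  # snapshot: every update in this round reads old values
        for i in range(step, n):
            a_hi, b_hi = prev[i]
            a_lo, b_lo = prev[i - step]
            # compose: apply the lower segment first, then the upper one
            coef[i] = (a_hi * a_lo, a_hi * b_lo + b_hi)
        step *= 2
        rounds += 1
    return [A * x0 + B for A, B in coef], rounds


# Hypothetical data, compared against the plain serial recurrence
a = [2, 3, 1, 2, 1, 3, 2, 1]
b = [1, 0, 4, 2, -1, 5, 0, 3]
xs, rounds = recursive_doubling(a, b, x0=1)

serial, x = [], 1
for ai, bi in zip(a, b):
    x = ai * x + bi
    serial.append(x)
```

    For eight steps the loop finishes in three rounds, which matches the logarithmic number of computation levels described in the abstract.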

  16. Hardware architecture design of image restoration based on time-frequency domain computation

    NASA Astrophysics Data System (ADS)

    Wen, Bo; Zhang, Jing; Jiao, Zipeng

    2013-10-01

    Image restoration algorithms based on time-frequency domain computation (TFDC) are highly mature and widely applied in engineering. To achieve high-speed implementation of these algorithms, the TFDC hardware architecture is proposed. First, the main module is designed by analyzing the common processing and numerical calculation. Then, to improve commonality, an iteration control module is planned for iterative algorithms. In addition, to reduce the computational cost and memory requirements, the necessary optimizations are suggested for the time-consuming modules, which include two-dimensional FFT/IFFT and the plural calculation. Finally, the TFDC hardware architecture is adopted for the hardware design of a real-time image restoration system. The result proves that the TFDC hardware architecture and its optimizations can be applied to image restoration algorithms based on TFDC, with good algorithm commonality, hardware realizability and high efficiency.

  17. The remote sensing image segmentation mean shift algorithm parallel processing based on MapReduce

    NASA Astrophysics Data System (ADS)

    Chen, Xi; Zhou, Liqing

    2015-12-01

    With the development of satellite remote sensing technology and the growth of remote sensing image data, traditional remote sensing image segmentation technology cannot meet the processing and storage requirements of massive remote sensing images. This article applies cloud computing and parallel computing technology to the remote sensing image segmentation process and builds a cheap and efficient computer cluster system that uses parallel processing to implement the MeanShift algorithm for remote sensing image segmentation based on the MapReduce model. This not only ensures the quality of remote sensing image segmentation but also improves segmentation speed and better meets real-time requirements. The MapReduce-based parallel MeanShift algorithm for remote sensing image segmentation is of practical significance and value.

  18. General Quantum Meet-in-the-Middle Search Algorithm Based on Target Solution of Fixed Weight

    NASA Astrophysics Data System (ADS)

    Fu, Xiang-Qun; Bao, Wan-Su; Wang, Xiang; Shi, Jian-Hong

    2016-10-01

    Similar to the classical meet-in-the-middle algorithm, the storage and computation complexity are the key factors that decide the efficiency of the quantum meet-in-the-middle algorithm. Aiming at the target vector of fixed weight, based on the quantum meet-in-the-middle algorithm, the algorithm for searching all n-product vectors with the same weight is presented, whose complexity is better than the exhaustive search algorithm. And the algorithm can reduce the storage complexity of the quantum meet-in-the-middle search algorithm. Then based on the algorithm and the knapsack vector of the Chor-Rivest public-key crypto of fixed weight d, we present a general quantum meet-in-the-middle search algorithm based on the target solution of fixed weight, whose computational complexity is $\sum_{j=0}^{d} \left( O\left(\sqrt{C_{n-k+1}^{d-j}}\right) + O\left(C_k^j \log C_k^j\right) \right)$ with $\sum_{i=0}^{d} C_k^i$ memory cost. And the optimal value of k is given. Compared to the quantum meet-in-the-middle search algorithm for knapsack problem and the quantum algorithm for searching a target solution of fixed weight, the computational complexity of the algorithm is lower. And its storage complexity is smaller than the quantum meet-in-the-middle-algorithm. Supported by the National Basic Research Program of China under Grant No. 2013CB338002 and the National Natural Science Foundation of China under Grant No. 61502526

  19. An efficient parallel termination detection algorithm

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Baker, A. H.; Crivelli, S.; Jessup, E. R.

    2004-05-27

    Information local to any one processor is insufficient to monitor the overall progress of most distributed computations. Typically, a second distributed computation for detecting termination of the main computation is necessary. In order to be a useful computational tool, the termination detection routine must operate concurrently with the main computation, adding minimal overhead, and it must promptly and correctly detect termination when it occurs. In this paper, we present a new algorithm for detecting the termination of a parallel computation on distributed-memory MIMD computers that satisfies all of those criteria. A variety of termination detection algorithms have been devised. Of these, the algorithm presented by Sinha, Kale, and Ramkumar (henceforth, the SKR algorithm) is unique in its ability to adapt to the load conditions of the system on which it runs, thereby minimizing the impact of termination detection on performance. Because their algorithm also detects termination quickly, we consider it to be the most efficient practical algorithm presently available. The termination detection algorithm presented here was developed for use in the PMESC programming library for distributed-memory MIMD computers. Like the SKR algorithm, our algorithm adapts to system loads and imposes little overhead. Also like the SKR algorithm, ours is tree-based, and it does not depend on any assumptions about the physical interconnection topology of the processors or the specifics of the distributed computation. In addition, our algorithm is easier to implement and requires only half as many tree traverses as does the SKR algorithm. This paper is organized as follows. In section 2, we define our computational model. In section 3, we review the SKR algorithm. We introduce our new algorithm in section 4, and prove its correctness in section 5. We discuss its efficiency and present experimental results in section 6.

  20. Trellises and Trellis-Based Decoding Algorithms for Linear Block Codes. Part 3

    NASA Technical Reports Server (NTRS)

    Lin, Shu

    1998-01-01

    Decoding algorithms based on the trellis representation of a code (block or convolutional) drastically reduce decoding complexity. The best known and most commonly used trellis-based decoding algorithm is the Viterbi algorithm. It is a maximum likelihood decoding algorithm. Convolutional codes with the Viterbi decoding have been widely used for error control in digital communications over the last two decades. This chapter is concerned with the application of the Viterbi decoding algorithm to linear block codes. First, the Viterbi algorithm is presented. Then, optimum sectionalization of a trellis to minimize the computational complexity of a Viterbi decoder is discussed and an algorithm is presented. Some design issues for IC (integrated circuit) implementation of a Viterbi decoder are considered and discussed. Finally, a new decoding algorithm based on the principle of compare-select-add is presented. This new algorithm can be applied to both block and convolutional codes and is more efficient than the conventional Viterbi algorithm based on the add-compare-select principle. This algorithm is particularly efficient for rate 1/n antipodal convolutional codes and their high-rate punctured codes. It reduces computational complexity by one-third compared with the Viterbi algorithm.
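
    For orientation, the conventional add-compare-select form of the Viterbi algorithm can be sketched as below. This is a hard-decision decoder for a small rate-1/2 convolutional code with generators (7, 5) in octal, chosen here only as a familiar example; it is neither the block-code trellis nor the compare-select-add variant discussed in the chapter, and all names and the test message are hypothetical.

```python
def branch_output(u, s1, s2):
    """Output pair of the (7, 5) encoder for input bit u in state (s1, s2)."""
    return (u ^ s1 ^ s2, u ^ s2)


def encode(bits):
    s1 = s2 = 0  # encoder starts in the all-zero state
    out = []
    for u in bits:
        out.append(branch_output(u, s1, s2))
        s1, s2 = u, s1
    return out


def viterbi(received):
    """Return (decoded bits, Hamming distance of the surviving path)."""
    inf = float("inf")
    metrics = {(0, 0): 0}
    paths = {(0, 0): []}
    for r in received:
        new_metrics, new_paths = {}, {}
        for (s1, s2), m in metrics.items():
            for u in (0, 1):
                e = branch_output(u, s1, s2)
                # add: path metric plus Hamming branch metric
                cand = m + (e[0] != r[0]) + (e[1] != r[1])
                nxt = (u, s1)
                # compare and select: keep the best path into each state
                if cand < new_metrics.get(nxt, inf):
                    new_metrics[nxt] = cand
                    new_paths[nxt] = paths[(s1, s2)] + [u]
        metrics, paths = new_metrics, new_paths
    best = min(metrics, key=metrics.get)
    return paths[best], metrics[best]


# Hypothetical message; one channel error is injected in the second symbol
msg = [1, 0, 1, 1, 0, 0]
rx = encode(msg)
rx[1] = (rx[1][0] ^ 1, rx[1][1])
decoded, distance = viterbi(rx)
```

    Only one survivor per state is stored at each step, which is the source of the complexity reduction that trellis-based decoding offers over exhaustive comparison of codewords.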

  1. GPU-accelerated computing for Lagrangian coherent structures of multi-body gravitational regimes

    NASA Astrophysics Data System (ADS)

    Lin, Mingpei; Xu, Ming; Fu, Xiaoyu

    2017-04-01

    Based on a well-established theoretical foundation, Lagrangian Coherent Structures (LCSs) have elicited widespread research on the intrinsic structures of dynamical systems in many fields, including the field of astrodynamics. Although the application of LCSs in dynamical problems seems straightforward theoretically, its associated computational cost is prohibitive. We propose a block decomposition algorithm developed on Compute Unified Device Architecture (CUDA) platform for the computation of the LCSs of multi-body gravitational regimes. In order to take advantage of GPU's outstanding computing properties, such as Shared Memory, Constant Memory, and Zero-Copy, the algorithm utilizes a block decomposition strategy to facilitate computation of finite-time Lyapunov exponent (FTLE) fields of arbitrary size and timespan. Simulation results demonstrate that this GPU-based algorithm can satisfy double-precision accuracy requirements and greatly decrease the time needed to calculate final results, increasing speed by approximately 13 times. Additionally, this algorithm can be generalized to various large-scale computing problems, such as particle filters, constellation design, and Monte-Carlo simulation.

  2. Simulation of biochemical reactions with time-dependent rates by the rejection-based algorithm

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thanh, Vo Hong, E-mail: vo@cosbi.eu; Priami, Corrado, E-mail: priami@cosbi.eu; Department of Mathematics, University of Trento, Trento

    We address the problem of simulating biochemical reaction networks with time-dependent rates and propose a new algorithm based on our rejection-based stochastic simulation algorithm (RSSA) [Thanh et al., J. Chem. Phys. 141(13), 134116 (2014)]. The computation for selecting next reaction firings by our time-dependent RSSA (tRSSA) is computationally efficient. Furthermore, the generated trajectory is exact by exploiting the rejection-based mechanism. We benchmark tRSSA on different biological systems with varying forms of reaction rates to demonstrate its applicability and efficiency. We reveal that for nontrivial cases, the selection of reaction firings in existing algorithms introduces approximations because the integration of reaction rates is very computationally demanding and simplifying assumptions are introduced. The selection of the next reaction firing by our approach is easier while preserving the exactness.

  3. Dynamic displacement measurement of large-scale structures based on the Lucas-Kanade template tracking algorithm

    NASA Astrophysics Data System (ADS)

    Guo, Jie; Zhu, Chang'an

    2016-01-01

    The development of optics and computer technologies enables the application of the vision-based technique that uses digital cameras to the displacement measurement of large-scale structures. Compared with traditional contact measurements, vision-based technique allows for remote measurement, has a non-intrusive characteristic, and does not necessitate mass introduction. In this study, a high-speed camera system is developed to complete the displacement measurement in real time. The system consists of a high-speed camera and a notebook computer. The high-speed camera can capture images at a speed of hundreds of frames per second. To process the captured images in computer, the Lucas-Kanade template tracking algorithm in the field of computer vision is introduced. Additionally, a modified inverse compositional algorithm is proposed to reduce the computing time of the original algorithm and improve the efficiency further. The modified algorithm can rapidly accomplish one displacement extraction within 1 ms without having to install any pre-designed target panel onto the structures in advance. The accuracy and the efficiency of the system in the remote measurement of dynamic displacement are demonstrated in the experiments on motion platform and sound barrier on suspension viaduct. Experimental results show that the proposed algorithm can extract accurate displacement signal and accomplish the vibration measurement of large-scale structures.

  4. Research on Image Encryption Based on DNA Sequence and Chaos Theory

    NASA Astrophysics Data System (ADS)

    Tian Zhang, Tian; Yan, Shan Jun; Gu, Cheng Yan; Ren, Ran; Liao, Kai Xin

    2018-04-01

    Nowadays encryption is a common technique to protect image data from unauthorized access. In recent years, many scientists have proposed various encryption algorithms based on DNA sequence to provide a new idea for the design of image encryption algorithm. Therefore, a new method of image encryption based on DNA computing technology is proposed in this paper, whose original image is encrypted by DNA coding and 1-D logistic chaotic mapping. First, the algorithm uses two modules as the encryption key. The first module uses the real DNA sequence, and the second module is made by one-dimensional logistic chaos mapping. Secondly, the algorithm uses DNA complementary rules to encode original image, and uses the key and DNA computing technology to compute each pixel value of the original image, so as to realize the encryption of the whole image. Simulation results show that the algorithm has good encryption effect and security.
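
    The two ingredients named in the abstract, a 1-D logistic chaotic map and DNA coding, can be illustrated with the toy sketch below. It is an assumption-laden illustration and not the scheme of the paper: the key values x0 and r, the 2-bit coding rule (00 to A, 01 to C, 10 to G, 11 to T), and the use of a plain XOR in the binary domain are all hypothetical choices, and no security claim is made for it.

```python
def logistic_keystream(x0, r, n, burn_in=100):
    """Generate n key bytes from the logistic map x <- r * x * (1 - x)."""
    x = x0
    for _ in range(burn_in):  # discard the initial transient
        x = r * x * (1.0 - x)
    out = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) % 256)
    return out


DNA = "ACGT"  # assumed coding rule: 00->A, 01->C, 10->G, 11->T


def dna_encode(byte):
    """Encode one 8-bit value as four nucleotides, most significant pair first."""
    return "".join(DNA[(byte >> shift) & 3] for shift in (6, 4, 2, 0))


def dna_decode(bases):
    value = 0
    for ch in bases:
        value = (value << 2) | DNA.index(ch)
    return value


def encrypt(pixels, x0=0.3141, r=3.99):
    ks = logistic_keystream(x0, r, len(pixels))
    return [dna_encode(p ^ k) for p, k in zip(pixels, ks)]


def decrypt(cipher, x0=0.3141, r=3.99):
    ks = logistic_keystream(x0, r, len(cipher))
    return [dna_decode(c) ^ k for c, k in zip(cipher, ks)]


# Hypothetical 8-bit pixel values
pixels = [0, 17, 128, 255, 42]
cipher = encrypt(pixels)
restored = decrypt(cipher)
```

    Decryption regenerates the same keystream from the shared initial value and parameter, which is why both must be kept secret and reproduced exactly.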

  5. Parallel grid generation algorithm for distributed memory computers

    NASA Technical Reports Server (NTRS)

    Moitra, Stuti; Moitra, Anutosh

    1994-01-01

    A parallel grid-generation algorithm and its implementation on the Intel iPSC/860 computer are described. The grid-generation scheme is based on an algebraic formulation of homotopic relations. Methods for utilizing the inherent parallelism of the grid-generation scheme are described, and implementation of multiple levels of parallelism on multiple instruction multiple data machines are indicated. The algorithm is capable of providing near orthogonality and spacing control at solid boundaries while requiring minimal interprocessor communications. Results obtained on the Intel hypercube for a blended wing-body configuration are used to demonstrate the effectiveness of the algorithm. Fortran implementations based on the native programming model of the iPSC/860 computer and the Express system of software tools are reported. Computational gains in execution time speed-up ratios are given.

  6. A Scheduling Algorithm for Cloud Computing System Based on the Driver of Dynamic Essential Path.

    PubMed

    Xie, Zhiqiang; Shao, Xia; Xin, Yu

    2016-01-01

    To solve the problem of task scheduling in the cloud computing system, this paper proposes a scheduling algorithm for cloud computing based on the driver of dynamic essential path (DDEP). This algorithm applies a predecessor-task layer priority strategy to solve the problem of constraint relations among task nodes. The strategy assigns different priority values to every task node based on the scheduling order of task node as affected by the constraint relations among task nodes, and the task node list is generated by the different priority value. To address the scheduling order problem in which task nodes have the same priority value, the dynamic essential long path strategy is proposed. This strategy computes the dynamic essential path of the pre-scheduling task nodes based on the actual computation cost and communication cost of task node in the scheduling process. The task node that has the longest dynamic essential path is scheduled first as the completion time of task graph is indirectly influenced by the finishing time of task nodes in the longest dynamic essential path. Finally, we demonstrate the proposed algorithm via simulation experiments using Matlab tools. The experimental results indicate that the proposed algorithm can effectively reduce the task Makespan in most cases and meet a high quality performance objective.

  7. A Scheduling Algorithm for Cloud Computing System Based on the Driver of Dynamic Essential Path

    PubMed Central

    Xie, Zhiqiang; Shao, Xia; Xin, Yu

    2016-01-01

    To solve the problem of task scheduling in the cloud computing system, this paper proposes a scheduling algorithm for cloud computing based on the driver of dynamic essential path (DDEP). This algorithm applies a predecessor-task layer priority strategy to solve the problem of constraint relations among task nodes. The strategy assigns different priority values to every task node based on the scheduling order of task node as affected by the constraint relations among task nodes, and the task node list is generated by the different priority value. To address the scheduling order problem in which task nodes have the same priority value, the dynamic essential long path strategy is proposed. This strategy computes the dynamic essential path of the pre-scheduling task nodes based on the actual computation cost and communication cost of task node in the scheduling process. The task node that has the longest dynamic essential path is scheduled first as the completion time of task graph is indirectly influenced by the finishing time of task nodes in the longest dynamic essential path. Finally, we demonstrate the proposed algorithm via simulation experiments using Matlab tools. The experimental results indicate that the proposed algorithm can effectively reduce the task Makespan in most cases and meet a high quality performance objective. PMID:27490901

  8. A Parallel Point Matching Algorithm for Landmark Based Image Registration Using Multicore Platform

    PubMed Central

    Yang, Lin; Gong, Leiguang; Zhang, Hong; Nosher, John L.; Foran, David J.

    2013-01-01

    Point matching is crucial for many computer vision applications. Establishing the correspondence between a large number of data points is a computationally intensive process. Some point matching related applications, such as medical image registration, require real time or near real time performance if applied to critical clinical applications like image assisted surgery. In this paper, we report a new multicore platform based parallel algorithm for fast point matching in the context of landmark based medical image registration. We introduced a non-regular data partition algorithm which utilizes the K-means clustering algorithm to group the landmarks based on the number of available processing cores, which optimizes the memory usage and data transfer. We have tested our method using the IBM Cell Broadband Engine (Cell/B.E.) platform. The results demonstrated a significant speed up over its sequential implementation. The proposed data partition and parallelization algorithm, though tested only on one multicore platform, is generic by its design. Therefore the parallel algorithm can be extended to other computing platforms, as well as other point matching related applications. PMID:24308014

  9. A Decentralized Eigenvalue Computation Method for Spectrum Sensing Based on Average Consensus

    NASA Astrophysics Data System (ADS)

    Mohammadi, Jafar; Limmer, Steffen; Stańczak, Sławomir

    2016-07-01

    This paper considers eigenvalue estimation for the decentralized inference problem for spectrum sensing. We propose a decentralized eigenvalue computation algorithm based on the power method, which is referred to as generalized power method GPM; it is capable of estimating the eigenvalues of a given covariance matrix under certain conditions. Furthermore, we have developed a decentralized implementation of GPM by splitting the iterative operations into local and global computation tasks. The global tasks require data exchange to be performed among the nodes. For this task, we apply an average consensus algorithm to efficiently perform the global computations. As a special case, we consider a structured graph that is a tree with clusters of nodes at its leaves. For an accelerated distributed implementation, we propose to use computation over multiple access channel (CoMAC) as a building block of the algorithm. Numerical simulations are provided to illustrate the performance of the two algorithms.
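
    The split into local and global tasks can be sketched with the ordinary power method combined with a simple average consensus on a ring of nodes. This is a simplified illustration, not the GPM or CoMAC scheme of the paper: it assumes each node holds a hypothetical local matrix whose network-wide mean is the covariance matrix of interest, that all nodes iterate synchronously, and that consensus has converged so every node shares the same vector.

```python
import math


def average_consensus(values, rounds=60):
    """values[k] is node k's local vector. Nodes sit on a ring and repeatedly
    average with their two neighbours; every node approaches the global mean."""
    k = len(values)
    for _ in range(rounds):
        values = [
            [(values[(i - 1) % k][d] + values[i][d] + values[(i + 1) % k][d]) / 3.0
             for d in range(len(values[i]))]
            for i in range(k)
        ]
    return values


def matvec(m, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def decentralized_power_method(local_mats, iters=100):
    """Estimate the dominant eigenvalue of the mean of the local matrices."""
    n = len(local_mats[0])
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        local = [matvec(m, v) for m in local_mats]  # local computation task
        w = average_consensus(local)[0]             # global task via consensus
        lam = math.sqrt(sum(x * x for x in w))      # norm of (mean matrix) * v
        v = [x / lam for x in w]
    return lam, v


# Four hypothetical nodes; the mean of their matrices is [[3, 1], [1, 1]]
mats = [
    [[2.0, 0.0], [0.0, 1.0]],
    [[4.0, 2.0], [2.0, 1.0]],
    [[3.0, 1.0], [1.0, 1.0]],
    [[3.0, 1.0], [1.0, 1.0]],
]
lam, v = decentralized_power_method(mats)
```

    Only the matrix-vector products are exchanged between neighbours, so no node ever needs the full set of local matrices.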

  10. A novel combined SLAM based on RBPF-SLAM and EIF-SLAM for mobile system sensing in a large scale environment.

    PubMed

    He, Bo; Zhang, Shujing; Yan, Tianhong; Zhang, Tao; Liang, Yan; Zhang, Hongjin

    2011-01-01

    Mobile autonomous systems are very important for marine scientific investigation and military applications. Many algorithms have been studied to deal with the computational efficiency problem required for large scale simultaneous localization and mapping (SLAM) and its related accuracy and consistency. Among these methods, submap-based SLAM is a more effective one. By combining the strength of two popular mapping algorithms, the Rao-Blackwellised particle filter (RBPF) and extended information filter (EIF), this paper presents a combined SLAM, an efficient submap-based solution to the SLAM problem in a large scale environment. RBPF-SLAM is used to produce local maps, which are periodically fused into an EIF-SLAM algorithm. RBPF-SLAM can avoid linearization of the robot model during operation and provide a robust data association, while EIF-SLAM can improve the whole computational speed, and avoid the tendency of RBPF-SLAM to be over-confident. In order to further improve the computational speed in a real time environment, a binary-tree-based decision-making strategy is introduced. Simulation experiments show that the proposed combined SLAM algorithm significantly outperforms currently existing algorithms in terms of accuracy and consistency, as well as the computing efficiency. Finally, the combined SLAM algorithm is experimentally validated in a real environment by using the Victoria Park dataset.

  11. Exact and heuristic algorithms for Space Information Flow.

    PubMed

    Uwitonze, Alfred; Huang, Jiaqing; Ye, Yuanqing; Cheng, Wenqing; Li, Zongpeng

    2018-01-01

    Space Information Flow (SIF) is a new promising research area that studies network coding in geometric space, such as Euclidean space. The design of algorithms that compute the optimal SIF solutions remains one of the key open problems in SIF. This work proposes the first exact SIF algorithm and a heuristic SIF algorithm that compute min-cost multicast network coding for N (N ≥ 3) given terminal nodes in 2-D Euclidean space. Furthermore, we find that the Butterfly network in Euclidean space is the second example besides the Pentagram network where SIF is strictly better than Euclidean Steiner minimal tree. The exact algorithm design is based on two key techniques: Delaunay triangulation and linear programming. Delaunay triangulation technique helps to find practically good candidate relay nodes, after which a min-cost multicast linear programming model is solved over the terminal nodes and the candidate relay nodes, to compute the optimal multicast network topology, including the optimal relay nodes selected by linear programming from all the candidate relay nodes and the flow rates on the connection links. The heuristic algorithm design is also based on Delaunay triangulation and linear programming techniques. The exact algorithm can achieve the optimal SIF solution with an exponential computational complexity, while the heuristic algorithm can achieve the sub-optimal SIF solution with a polynomial computational complexity. We prove the correctness of the exact SIF algorithm. The simulation results show the effectiveness of the heuristic SIF algorithm.

  12. A unifying framework for rigid multibody dynamics and serial and parallel computational issues

    NASA Technical Reports Server (NTRS)

    Fijany, Amir; Jain, Abhinandan

    1989-01-01

    A unifying framework for various formulations of the dynamics of open-chain rigid multibody systems is discussed. Their suitability for serial and parallel processing is assessed. The framework is based on the derivation of intrinsic, i.e., coordinate-free, equations of the algorithms, which provides a suitable abstraction and permits a distinction to be made between the computational redundancy in the intrinsic and extrinsic equations. A set of spatial notation is used which allows the derivation of the various algorithms in a common setting and thus clarifies the relationships among them. The three classes of algorithms, viz., O(n), O(n exp 2) and O(n exp 3), for the solution of the dynamics problem are investigated. Researchers begin with the derivation of O(n exp 3) algorithms based on the explicit computation of the mass matrix, which provides insight into the underlying basis of the O(n) algorithms. From a computational perspective, the optimal choice of a coordinate frame for the projection of the intrinsic equations is discussed and the serial computational complexity of the different algorithms is evaluated. The three classes of algorithms are also analyzed for suitability for parallel processing. It is shown that the problem belongs to the class NC and the time and processor bounds are of O(log2/2(n)) and O(n exp 4), respectively. However, the algorithm that achieves the above bounds is not stable. Researchers show that the fastest stable parallel algorithm achieves a computational complexity of O(n) with O(n exp 2) processors, and results from the parallelization of the O(n exp 3) serial algorithm.

  13. Three list scheduling temporal partitioning algorithm of time space characteristic analysis and compare for dynamic reconfigurable computing

    NASA Astrophysics Data System (ADS)

    Chen, Naijin

    2013-03-01

    The Level Based Partitioning (LBP) algorithm, the Cluster Based Partitioning (CBP) algorithm and the Enhance Static List (ESL) temporal partitioning algorithm, each based on an adjacency matrix and on an adjacency table, are designed and implemented in this paper. The partitioning time and memory occupation of the three algorithms are also compared. Experimental results show that the LBP algorithm has the shortest partitioning time and better parallel character. As far as memory occupation and partitioning time are concerned, the algorithms based on the adjacency table need less partitioning time and less memory.

  14. Cloud computing task scheduling strategy based on improved differential evolution algorithm

    NASA Astrophysics Data System (ADS)

    Ge, Junwei; He, Qian; Fang, Yiqiu

    2017-04-01

    In order to optimize the cloud computing task scheduling scheme, an improved differential evolution algorithm for cloud computing task scheduling is proposed. First, a cloud computing task scheduling model is established and a fitness function is derived from it. The improved differential evolution algorithm is then used to optimize the fitness function, with a selection strategy that changes dynamically with the generation number and a dynamic mutation strategy to ensure both global and local search ability. The performance test was carried out on the CloudSim simulation platform. The experimental results show that the improved differential evolution algorithm can reduce the cloud computing task execution time and the user cost, and implements the optimal scheduling of cloud computing tasks well.
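
    As a point of reference for the record above, the following is a minimal sketch of classic differential evolution (DE/rand/1/bin), not the improved variant with dynamic selection and mutation that the abstract describes. The function name `differential_evolution`, its parameters, and the toy `sphere` fitness are assumptions made for illustration; a scheduling fitness (for example, execution time plus cost) would take its place.

```python
import random


def differential_evolution(fitness, bounds, pop_size=20, f_weight=0.8,
                           cr=0.9, generations=200, seed=0):
    """Minimize `fitness` with classic DE/rand/1/bin (illustrative baseline)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f_weight * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clip to the search box
                else:
                    v = pop[i][j]
                trial.append(v)
            s = fitness(trial)
            if s <= scores[i]:  # greedy one-to-one selection
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def sphere(x):
    """Toy fitness standing in for a scheduling cost (hypothetical)."""
    return sum(v * v for v in x)


best_x, best_score = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

    Because selection is greedy, the best score never gets worse from one generation to the next; the dynamic strategies mentioned in the abstract would change how `f_weight` and the selection rule evolve over generations.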

  15. Computer-aided US diagnosis of breast lesions by using cell-based contour grouping.

    PubMed

    Cheng, Jie-Zhi; Chou, Yi-Hong; Huang, Chiun-Sheng; Chang, Yeun-Chung; Tiu, Chui-Mei; Chen, Kuei-Wu; Chen, Chung-Ming

    2010-06-01

    To develop a computer-aided diagnostic algorithm with automatic boundary delineation for differential diagnosis of benign and malignant breast lesions at ultrasonography (US) and investigate the effect of boundary quality on the performance of a computer-aided diagnostic algorithm. This was an institutional review board-approved retrospective study with waiver of informed consent. A cell-based contour grouping (CBCG) segmentation algorithm was used to delineate the lesion boundaries automatically. Seven morphologic features were extracted. The classifier was a logistic regression function. Five hundred twenty breast US scans were obtained from 520 subjects (age range, 15-89 years), including 275 benign (mean size, 15 mm; range, 5-35 mm) and 245 malignant (mean size, 18 mm; range, 8-29 mm) lesions. The newly developed computer-aided diagnostic algorithm was evaluated on the basis of boundary quality and differentiation performance. The segmentation algorithms and features in two conventional computer-aided diagnostic algorithms were used for comparative study. The CBCG-generated boundaries were shown to be comparable with the manually delineated boundaries. The area under the receiver operating characteristic curve (AUC) and differentiation accuracy were 0.968 +/- 0.010 and 93.1% +/- 0.7, respectively, for all 520 breast lesions. At the 5% significance level, the newly developed algorithm was shown to be superior to the use of the boundaries and features of the two conventional computer-aided diagnostic algorithms in terms of AUC (0.974 +/- 0.007 versus 0.890 +/- 0.008 and 0.788 +/- 0.024, respectively). The newly developed computer-aided diagnostic algorithm that used a CBCG segmentation method to measure boundaries achieved a high differentiation performance. Copyright RSNA, 2010
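
    The AUC figures quoted in the record above can be read as the probability that a randomly chosen malignant case receives a higher classifier score than a randomly chosen benign one. A minimal sketch of that pairwise definition follows; the function name `roc_auc` and the example scores are invented for illustration and are not data from the study.

```python
def roc_auc(scores, labels):
    """AUC as P(positive score > negative score); ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs: 1 = malignant, 0 = benign.
example_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
example_labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(example_scores, example_labels)
```

    The double loop is quadratic in the number of cases, which is acceptable for a few hundred lesions; a rank-based formula would be the usual choice for larger sets.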

  16. Recommendations for dose calculations of lung cancer treatment plans treated with stereotactic ablative body radiotherapy (SABR)

    NASA Astrophysics Data System (ADS)

    Devpura, S.; Siddiqui, M. S.; Chen, D.; Liu, D.; Li, H.; Kumar, S.; Gordon, J.; Ajlouni, M.; Movsas, B.; Chetty, I. J.

    2014-03-01

    The purpose of this study was to systematically evaluate dose distributions computed with 5 different dose algorithms for patients with lung cancers treated using stereotactic ablative body radiotherapy (SABR). Treatment plans for 133 lung cancer patients, initially computed with a 1D-pencil beam (equivalent-path-length, EPL-1D) algorithm, were recalculated with 4 other algorithms commissioned for treatment planning, including 3-D pencil-beam (EPL-3D), anisotropic analytical algorithm (AAA), collapsed cone convolution superposition (CCC), and Monte Carlo (MC). The plan prescription dose was 48 Gy in 4 fractions normalized to the 95% isodose line. Tumors were classified according to location: peripheral tumors surrounded by lung (lung-island, N=39), peripheral tumors attached to the rib-cage or chest wall (lung-wall, N=44), and centrally-located tumors (lung-central, N=50). Relative to the EPL-1D algorithm, PTV D95 and mean dose values computed with the other 4 algorithms were lowest for "lung-island" tumors with smallest field sizes (3-5 cm). On the other hand, the smallest differences were noted for lung-central tumors treated with largest field widths (7-10 cm). Amongst all locations, dose distribution differences were most strongly correlated with tumor size for lung-island tumors. For most cases, convolution/superposition and MC algorithms were in good agreement. Mean lung dose (MLD) values computed with the EPL-1D algorithm were highly correlated with that of the other algorithms (correlation coefficient =0.99). The MLD values were found to be ~10% lower for small lung-island tumors with the model-based (conv/superposition and MC) vs. the correction-based (pencil-beam) algorithms with the model-based algorithms predicting greater low dose spread within the lungs. This study suggests that pencil beam algorithms should be avoided for lung SABR planning. 
    For the most challenging cases, small tumors surrounded entirely by lung tissue (lung-island type), a Monte-Carlo-based algorithm may be warranted.

  17. Optimization and large scale computation of an entropy-based moment closure

    NASA Astrophysics Data System (ADS)

    Kristopher Garrett, C.; Hauck, Cory; Hill, Judith

    2015-12-01

    We present computational advances and results in the implementation of an entropy-based moment closure, M_N, in the context of linear kinetic equations, with an emphasis on heterogeneous and large-scale computing platforms. Entropy-based closures are known in several cases to yield more accurate results than closures based on standard spectral approximations, such as P_N, but the computational cost is generally much higher and often prohibitive. Several optimizations are introduced to improve the performance of entropy-based algorithms over previous implementations. These optimizations include the use of GPU acceleration and the exploitation of the mathematical properties of spherical harmonics, which are used as test functions in the moment formulation. To test the emerging high-performance computing paradigm of communication-bound simulations, we present timing results at the largest computational scales currently available. These results show, in particular, load balancing issues in scaling the M_N algorithm that do not appear for the P_N algorithm. We also observe that in weak scaling tests, the ratio in time to solution of M_N to P_N decreases.

  18. Optimization and large scale computation of an entropy-based moment closure

    DOE PAGES

    Hauck, Cory D.; Hill, Judith C.; Garrett, C. Kristopher

    2015-09-10

    We present computational advances and results in the implementation of an entropy-based moment closure, M_N, in the context of linear kinetic equations, with an emphasis on heterogeneous and large-scale computing platforms. Entropy-based closures are known in several cases to yield more accurate results than closures based on standard spectral approximations, such as P_N, but the computational cost is generally much higher and often prohibitive. Several optimizations are introduced to improve the performance of entropy-based algorithms over previous implementations. These optimizations include the use of GPU acceleration and the exploitation of the mathematical properties of spherical harmonics, which are used as test functions in the moment formulation. To test the emerging high-performance computing paradigm of communication-bound simulations, we present timing results at the largest computational scales currently available. Lastly, these results show, in particular, load balancing issues in scaling the M_N algorithm that do not appear for the P_N algorithm. We also observe that in weak scaling tests, the ratio in time to solution of M_N to P_N decreases.

  19. Computer-Based Algorithmic Determination of Muscle Movement Onset Using M-Mode Ultrasonography

    DTIC Science & Technology

    2017-05-01

    contraction images were analyzed visually and with three different classes of algorithms: pixel standard deviation (SD), high-pass filter and Teager Kaiser...Linear relationships and agreements between computed and visual muscle onset were calculated. The top algorithms were high-pass filtered with a 30 Hz...suggest that computer automated determination using high-pass filtering is a potential objective alternative to visual determination in human

  20. Interactive collision detection for deformable models using streaming AABBs.

    PubMed

    Zhang, Xinyu; Kim, Young J

    2007-01-01

    We present an interactive and accurate collision detection algorithm for deformable, polygonal objects based on the streaming computational model. Our algorithm can detect all possible pairwise primitive-level intersections between two severely deforming models at highly interactive rates. In our streaming computational model, we consider a set of axis-aligned bounding boxes (AABBs) that bound each of the given deformable objects as an input stream and perform massively parallel pairwise overlap tests on the incoming streams. As a result, we are able to prevent performance stalls in the streaming pipeline that can be caused by the expensive indexing mechanism required by bounding volume hierarchy-based streaming algorithms. At runtime, as the underlying models deform over time, we employ a novel streaming algorithm to update the geometric changes in the AABB streams. Moreover, in order to get only the computed result (i.e., collision results between AABBs) without reading back the entire output streams, we propose a streaming en/decoding strategy that can be performed in a hierarchical fashion. After determining overlapped AABBs, we perform primitive-level (e.g., triangle) intersection checking on a serial computational model such as CPUs. We implemented the entire pipeline of our algorithm using off-the-shelf graphics processors (GPUs), such as nVIDIA GeForce 7800 GTX, for streaming computations, and Intel Dual Core 3.4G processors for serial computations. We benchmarked our algorithm with different models of varying complexities, ranging from 15K up to 50K triangles, under various deformation motions, and obtained timings of 30 to 100 FPS depending on the complexity of the models and their relative configurations. Finally, we made comparisons with a well-known GPU-based collision detection algorithm, CULLIDE [4], and observed about a threefold performance improvement over the earlier approach. We also made comparisons with a SW-based AABB culling algorithm [2] and observed about a twofold improvement.
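
    The pairwise AABB overlap test at the core of the record above is simple enough to sketch on a CPU. The snippet below is a serial illustration only, assuming triangles given as three (x, y, z) tuples; the names `aabb_of`, `aabbs_overlap` and `overlapping_pairs` are invented here, and the streaming GPU pipeline and hierarchical read-back described in the abstract are not reproduced.

```python
from itertools import product


def aabb_of(points):
    """Axis-aligned bounding box of a primitive, as (min_corner, max_corner)."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def aabbs_overlap(a, b):
    """Two boxes overlap only if their intervals overlap on every axis."""
    (amin, amax), (bmin, bmax) = a, b
    return all(amin[k] <= bmax[k] and bmin[k] <= amax[k] for k in range(3))


def overlapping_pairs(boxes_a, boxes_b):
    """Brute-force all-pairs test; each pair is independent of the others."""
    return [(i, j)
            for (i, a), (j, b) in product(enumerate(boxes_a), enumerate(boxes_b))
            if aabbs_overlap(a, b)]


tri0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
tri1 = [(0.5, 0.5, -0.5), (0.5, 0.5, 0.5), (1.5, 0.5, 0.0)]
tri2 = [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0), (5.0, 6.0, 5.0)]
pairs = overlapping_pairs([aabb_of(tri0)], [aabb_of(tri1), aabb_of(tri2)])
```

    Since every pair is tested independently, the all-pairs loop maps naturally onto massively parallel hardware; surviving pairs would then go to an exact triangle-triangle test.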

  1. Novel techniques for data decomposition and load balancing for parallel processing of vision systems: Implementation and evaluation using a motion estimation system

    NASA Technical Reports Server (NTRS)

    Choudhary, Alok Nidhi; Leung, Mun K.; Huang, Thomas S.; Patel, Janak H.

    1989-01-01

    Computer vision systems employ a sequence of vision algorithms in which the output of an algorithm is the input of the next algorithm in the sequence. Algorithms that constitute such systems exhibit vastly different computational characteristics, and therefore, require different data decomposition techniques and efficient load balancing techniques for parallel implementation. However, since the input data for a task is produced as the output data of the previous task, this information can be exploited to perform knowledge based data decomposition and load balancing. Presented here are algorithms for a motion estimation system. The motion estimation is based on the point correspondence between the involved images which are a sequence of stereo image pairs. Researchers propose algorithms to obtain point correspondences by matching feature points among stereo image pairs at any two consecutive time instants. Furthermore, the proposed algorithms employ non-iterative procedures, which results in saving considerable amounts of computation time. The system consists of the following steps: (1) extraction of features; (2) stereo match of images in one time instant; (3) time match of images from consecutive time instants; (4) stereo match to compute final unambiguous points; and (5) computation of motion parameters.

  2. Computing the Envelope for Stepwise Constant Resource Allocations

    NASA Technical Reports Server (NTRS)

    Muscettola, Nicola; Clancy, Daniel (Technical Monitor)

    2001-01-01

    Estimating tight resource levels is a fundamental problem in the construction of flexible plans with resource utilization. In this paper we describe an efficient algorithm that builds a resource envelope, the tightest possible such bound. The algorithm is based on transforming the temporal network of resource consuming and producing events into a flow network with nodes equal to the events and edges equal to the necessary predecessor links between events. The incremental solution of a staged maximum flow problem on the network is then used to compute the time of occurrence and the height of each step of the resource envelope profile. The staged algorithm has the same computational complexity as solving a maximum flow problem on the entire flow network. This makes the method computationally feasible for use in the inner loop of search-based scheduling algorithms.

  3. Real-time optical flow estimation on a GPU for a skid-steered mobile robot

    NASA Astrophysics Data System (ADS)

    Kniaz, V. V.

    2016-04-01

    Accurate egomotion estimation is required for mobile robot navigation. Often the egomotion is estimated using optical flow algorithms. For an accurate estimation of optical flow, most modern algorithms require high memory resources and processor speed. However, the simple single-board computers that control the motion of the robot usually do not provide such resources. On the other hand, most modern single-board computers are equipped with an embedded GPU that could be used in parallel with a CPU to improve the performance of the optical flow estimation algorithm. This paper presents a new Z-flow algorithm for efficient computation of optical flow using an embedded GPU. The algorithm is based on phase correlation optical flow estimation and provides real-time performance on a low-cost embedded GPU. The layered optical flow model is used. Layer segmentation is performed using a graph-cut algorithm with a time-derivative-based energy function. Such an approach makes the algorithm both fast and robust in low-light and low-texture conditions. The algorithm implementation for a Raspberry Pi Model B computer is discussed. For evaluation of the algorithm, the computer was mounted on a Hercules mobile skid-steered robot equipped with a monocular camera. The evaluation was performed using a hardware-in-the-loop simulation and experiments with the Hercules mobile robot. The algorithm was also evaluated using the KITTI Optical Flow 2015 dataset. The resulting endpoint error of the optical flow calculated with the developed algorithm was low enough for navigation of the robot along the desired trajectory.

  4. Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment.

    PubMed

    Lee, Wei-Po; Hsiao, Yu-Ting; Hwang, Wei-Che

    2014-01-16

    To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it becomes a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: most popular is the method of MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use a well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. 
    By coupling the parallel model population-based optimization method and the parallel computational framework, high quality solutions can be obtained within relatively short time. This integrated approach is a promising way for inferring large networks.

  5. Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment

    PubMed Central

    2014-01-01

    Background: To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it becomes a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: most popular is the method of MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. Results: This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use a well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Conclusions: Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. By coupling the parallel model population-based optimization method and the parallel computational framework, high quality solutions can be obtained within relatively short time. This integrated approach is a promising way for inferring large networks. PMID:24428926

  6. Interfacing External Quantum Devices to a Universal Quantum Computer

    PubMed Central

    Lagana, Antonio A.; Lohe, Max A.; von Smekal, Lorenz

    2011-01-01

    We present a scheme to use external quantum devices using the universal quantum computer previously constructed. We thereby show how the universal quantum computer can utilize networked quantum information resources to carry out local computations. Such information may come from specialized quantum devices or even from remote universal quantum computers. We show how to accomplish this by devising universal quantum computer programs that implement well known oracle based quantum algorithms, namely the Deutsch, Deutsch-Jozsa, and the Grover algorithms using external black-box quantum oracle devices. In the process, we demonstrate a method to map existing quantum algorithms onto the universal quantum computer. PMID:22216276
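
    To make the role of a black-box oracle concrete, here is a small classical state-vector simulation of Grover search. It only illustrates the oracle-plus-diffusion structure of the algorithm named in the record above; it does not model the universal quantum computer or the external devices of the paper. The name `grover_probabilities` and the predicate-style `oracle` argument are assumptions made for this sketch.

```python
import math


def grover_probabilities(n_qubits, oracle):
    """Simulate Grover search classically; `oracle(k)` is True for marked k.

    The iteration count assumes exactly one marked item.
    """
    size = 2 ** n_qubits
    amps = [1.0 / math.sqrt(size)] * size  # uniform superposition
    iterations = int(math.pi / 4.0 * math.sqrt(size))
    for _ in range(iterations):
        # Oracle call: flip the phase of every marked basis state.
        amps = [-a if oracle(k) else a for k, a in enumerate(amps)]
        # Diffusion step: reflect all amplitudes about their mean.
        mean = sum(amps) / size
        amps = [2.0 * mean - a for a in amps]
    return [a * a for a in amps]  # measurement probabilities


probabilities = grover_probabilities(3, lambda k: k == 5)
most_likely = max(range(len(probabilities)), key=probabilities.__getitem__)
```

    The algorithm touches the search problem only through `oracle`, which is why the oracle can in principle be supplied by a separate device while the rest of the program stays unchanged.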

  7. Interfacing external quantum devices to a universal quantum computer.

    PubMed

    Lagana, Antonio A; Lohe, Max A; von Smekal, Lorenz

    2011-01-01

    We present a scheme to use external quantum devices using the universal quantum computer previously constructed. We thereby show how the universal quantum computer can utilize networked quantum information resources to carry out local computations. Such information may come from specialized quantum devices or even from remote universal quantum computers. We show how to accomplish this by devising universal quantum computer programs that implement well known oracle based quantum algorithms, namely the Deutsch, Deutsch-Jozsa, and the Grover algorithms using external black-box quantum oracle devices. In the process, we demonstrate a method to map existing quantum algorithms onto the universal quantum computer. © 2011 Lagana et al.

  8. Quantum adiabatic computation and adiabatic conditions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wei Zhaohui; Ying Mingsheng

    2007-08-15

    Recently, quantum adiabatic computation has attracted more and more attention in the literature. It is a novel quantum computation model based on adiabatic approximation, and the analysis of a quantum adiabatic algorithm depends highly on the adiabatic conditions. However, it has been pointed out that the traditional adiabatic conditions are problematic. Thus, results obtained previously should be checked and sufficient adiabatic conditions applicable to adiabatic computation should be proposed. Based on a result of Tong et al. [Phys. Rev. Lett. 98, 150402 (2007)], we propose a modified adiabatic criterion which is more applicable to the analysis of adiabatic algorithms. As an example, we prove the validity of the local adiabatic search algorithm by employing our criterion.

  9. A comparison of graph- and kernel-based -omics data integration algorithms for classifying complex traits.

    PubMed

    Yan, Kang K; Zhao, Hongyu; Pang, Herbert

    2017-12-06

    High-throughput sequencing data are widely collected and analyzed in the study of complex diseases in quest of improving human health. Well-studied algorithms mostly deal with single data source, and cannot fully utilize the potential of these multi-omics data sources. In order to provide a holistic understanding of human health and diseases, it is necessary to integrate multiple data sources. Several algorithms have been proposed so far, however, a comprehensive comparison of data integration algorithms for classification of binary traits is currently lacking. In this paper, we focus on two common classes of integration algorithms, graph-based that depict relationships with subjects denoted by nodes and relationships denoted by edges, and kernel-based that can generate a classifier in feature space. Our paper provides a comprehensive comparison of their performance in terms of various measurements of classification accuracy and computation time. Seven different integration algorithms, including graph-based semi-supervised learning, graph sharpening integration, composite association network, Bayesian network, semi-definite programming-support vector machine (SDP-SVM), relevance vector machine (RVM) and Ada-boost relevance vector machine are compared and evaluated with hypertension and two cancer data sets in our study. In general, kernel-based algorithms create more complex models and require longer computation time, but they tend to perform better than graph-based algorithms. The performance of graph-based algorithms has the advantage of being faster computationally. The empirical results demonstrate that composite association network, relevance vector machine, and Ada-boost RVM are the better performers. We provide recommendations on how to choose an appropriate algorithm for integrating data from multiple sources.

  10. A generalized Condat's algorithm of 1D total variation regularization

    NASA Astrophysics Data System (ADS)

    Makovetskii, Artyom; Voronin, Sergei; Kober, Vitaly

    2017-09-01

    A common way of solving the denoising problem is to use total variation (TV) regularization. Many efficient numerical algorithms have been developed for solving the TV regularization problem. Condat described a fast direct algorithm to compute the processed 1D signal. There also exists a direct linear-time algorithm for 1D TV denoising, referred to as the taut string algorithm. Condat's algorithm is based on a dual problem to the 1D TV regularization. In this paper, we propose a variant of Condat's algorithm based on the direct 1D TV regularization problem. The use of Condat's algorithm with the taut string approach leads to a clear geometric description of the extremal function. Computer simulation results are provided to illustrate the performance of the proposed algorithm for restoration of degraded signals.
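
    For readers unfamiliar with the problem being solved, the sketch below evaluates the usual 1D TV denoising objective, 0.5 * sum (x_i - y_i)^2 + lambda * sum |x_{i+1} - x_i|, for a candidate signal. It is not Condat's algorithm or the taut string algorithm; the name `tv_objective` and the sample signal are assumptions chosen for illustration.

```python
def tv_objective(x, y, lam):
    """Standard 1D TV denoising objective for candidate x and observation y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    fidelity = 0.5 * sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    total_variation = sum(abs(x[i + 1] - x[i]) for i in range(len(x) - 1))
    return fidelity + lam * total_variation


noisy = [1.0, 1.2, 0.9, 3.0, 3.1, 2.9]      # hypothetical noisy step signal
piecewise = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0]  # piecewise-constant candidate

cost_noisy = tv_objective(noisy, noisy, 1.0)
cost_piecewise = tv_objective(piecewise, noisy, 1.0)
```

    With lambda = 1 the piecewise-constant candidate scores lower than the noisy signal itself, which is the trade-off a TV solver exploits: small wiggles are flattened while the large jump is kept.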

  11. Current algorithmic solutions for peptide-based proteomics data generation and identification.

    PubMed

    Hoopmann, Michael R; Moritz, Robert L

    2013-02-01

    Peptide-based proteomic data sets are ever increasing in size and complexity. These data sets provide computational challenges when attempting to quickly analyze spectra and obtain correct protein identifications. Database search and de novo algorithms must consider high-resolution MS/MS spectra and alternative fragmentation methods. Protein inference is a tricky problem when analyzing large data sets of degenerate peptide identifications. Combining multiple algorithms for improved peptide identification puts significant strain on computational systems when investigating large data sets. This review highlights some of the recent developments in peptide and protein identification algorithms for analyzing shotgun mass spectrometry data when encountering the aforementioned hurdles. Also explored are the roles that analytical pipelines, public spectral libraries, and cloud computing play in the evolution of peptide-based proteomics. Copyright © 2012 Elsevier Ltd. All rights reserved.

  12. BCM: toolkit for Bayesian analysis of Computational Models using samplers.

    PubMed

    Thijssen, Bram; Dijkstra, Tjeerd M H; Heskes, Tom; Wessels, Lodewyk F A

    2016-10-21

    Computational models in biology are characterized by a large degree of uncertainty. This uncertainty can be analyzed with Bayesian statistics, however, the sampling algorithms that are frequently used for calculating Bayesian statistical estimates are computationally demanding, and each algorithm has unique advantages and disadvantages. It is typically unclear, before starting an analysis, which algorithm will perform well on a given computational model. We present BCM, a toolkit for the Bayesian analysis of Computational Models using samplers. It provides efficient, multithreaded implementations of eleven algorithms for sampling from posterior probability distributions and for calculating marginal likelihoods. BCM includes tools to simplify the process of model specification and scripts for visualizing the results. The flexible architecture allows it to be used on diverse types of biological computational models. In an example inference task using a model of the cell cycle based on ordinary differential equations, BCM is significantly more efficient than existing software packages, allowing more challenging inference problems to be solved. BCM represents an efficient one-stop-shop for computational modelers wishing to use sampler-based Bayesian statistics.

  13. Tools for Analyzing Computing Resource Management Strategies and Algorithms for SDR Clouds

    NASA Astrophysics Data System (ADS)

    Marojevic, Vuk; Gomez-Miguelez, Ismael; Gelonch, Antoni

    2012-09-01

    Software defined radio (SDR) clouds centralize the computing resources of base stations. The computing resource pool is shared between radio operators and dynamically loads and unloads digital signal processing chains for providing wireless communications services on demand. Each new user session request requires the allocation of computing resources for executing the corresponding SDR transceivers. The huge amount of computing resources of SDR cloud data centers and the numerous session requests at certain hours of a day require efficient computing resource management. We propose a hierarchical approach, where the data center is divided into clusters that are managed in a distributed way. This paper presents a set of computing resource management tools for analyzing computing resource management strategies and algorithms for SDR clouds. We use the tools for evaluating different strategies and algorithms. The results show that more sophisticated algorithms can achieve higher resource occupations and that a tradeoff exists between cluster size and algorithm complexity.

  14. Design and Implementation of Hybrid CORDIC Algorithm Based on Phase Rotation Estimation for NCO

    PubMed Central

    Zhang, Chaozhu; Han, Jinan; Li, Ke

    2014-01-01

    The numerical controlled oscillator has wide application in radar, digital receiver, and software radio system. Firstly, this paper introduces the traditional CORDIC algorithm. Then in order to improve computing speed and save resources, this paper proposes a kind of hybrid CORDIC algorithm based on phase rotation estimation applied in numerical controlled oscillator (NCO). Through estimating the direction of part phase rotation, the algorithm reduces part phase rotation and add-subtract unit, so that it decreases delay. Furthermore, the paper simulates and implements the numerical controlled oscillator by Quartus II software and Modelsim software. Finally, simulation results indicate that the improvement over traditional CORDIC algorithm is achieved in terms of ease of computation, resource utilization, and computing speed/delay while maintaining the precision. It is suitable for high speed and precision digital modulation and demodulation. PMID:25110750
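
    As an illustration of the traditional CORDIC algorithm that the abstract takes as its starting point, the sketch below computes cosine and sine with shift-and-add style micro-rotations. It is not the hybrid variant with phase rotation estimation proposed in the paper. The function name `cordic_sin_cos` and the choice of 32 iterations are assumptions made for this example, and floating-point arithmetic stands in for the fixed-point shifts a hardware NCO would use.

```python
import math


def cordic_sin_cos(theta, iterations=32):
    """Rotation-mode CORDIC: return (cos(theta), sin(theta)) for |theta| <= pi/2."""
    # Elementary rotation angles atan(2^-i); hardware keeps these in a small table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Every micro-rotation stretches the vector by sqrt(1 + 2^-2i).
    # Starting from 1/gain compensates for the accumulated stretch.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(iterations):
        # Rotate towards the target: the sign of the residual angle z picks the direction.
        d = 1.0 if z >= 0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x, y
```

    Each iteration needs only an add or subtract and a multiplication by a power of two, which is why the method maps well to hardware. An NCO built this way would feed the output of a phase accumulator into `theta` after folding it into the convergence range.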

  15. Flood inundation extent mapping based on block compressed tracing

    NASA Astrophysics Data System (ADS)

    Shen, Dingtao; Rui, Yikang; Wang, Jiechen; Zhang, Yu; Cheng, Liang

    2015-07-01

    Flood inundation extent, depth, and duration are important factors affecting flood hazard evaluation. At present, flood inundation analysis is based mainly on a seeded region-growing algorithm, which is an inefficient process because it requires excessive recursive computations and it is incapable of processing massive datasets. To address this problem, we propose a block compressed tracing algorithm for mapping the flood inundation extent, which reads the DEM data in blocks before transferring them to raster compression storage. This allows a smaller computer memory to process a larger amount of data, which solves the problem of the regular seeded region-growing algorithm. In addition, the use of a raster boundary tracing technique allows the algorithm to avoid the time-consuming computations required by the seeded region-growing. Finally, we conduct a comparative evaluation in the Chin-sha River basin; the results show that the proposed method solves the problem of flood inundation extent mapping based on massive DEM datasets with higher computational efficiency than the original method, which makes it suitable for practical applications.
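
    For orientation, the baseline seeded region-growing approach mentioned in the abstract can be sketched as follows. This is the conventional method, written with an explicit queue instead of recursion, and not the block compressed tracing algorithm the authors propose. The function name `inundation_extent`, the 4-connected neighbourhood, and the list-of-lists DEM layout are assumptions for this example.

```python
from collections import deque


def inundation_extent(dem, seed, water_level):
    """Return the set of (row, col) cells reachable from `seed` whose
    elevation lies below `water_level` (4-connected region growing)."""
    rows, cols = len(dem), len(dem[0])
    if dem[seed[0]][seed[1]] >= water_level:
        return set()  # the seed cell itself stays dry

    flooded = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in flooded and dem[nr][nc] < water_level:
                flooded.add((nr, nc))
                queue.append((nr, nc))
    return flooded
```

    The whole DEM and the visited set have to fit in memory here, which is the limitation on massive datasets that the abstract describes.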

  16. The challenge of computer mathematics.

    PubMed

    Barendregt, Henk; Wiedijk, Freek

    2005-10-15

    Progress in the foundations of mathematics has made it possible to formulate all thinkable mathematical concepts, algorithms and proofs in one language and in an impeccable way. This is not in spite of, but partially based on the famous results of Gödel and Turing. In this way statements are about mathematical objects and algorithms, proofs show the correctness of statements and computations, and computations are dealing with objects and proofs. Interactive computer systems for a full integration of defining, computing and proving are based on this. The human defines concepts, constructs algorithms and provides proofs, while the machine checks that the definitions are well formed and the proofs and computations are correct. Results formalized so far demonstrate the feasibility of this 'computer mathematics'. Also there are very good applications. The challenge is to make the systems more mathematician-friendly, by building libraries and tools. The eventual goal is to help humans to learn, develop, communicate, referee and apply mathematics.

  17. Algorithms for Disconnected Diagrams in Lattice QCD

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gambhir, Arjun Singh; Stathopoulos, Andreas; Orginos, Konstantinos

    2016-11-01

    Computing disconnected diagrams in Lattice QCD (operator insertion in a quark loop) entails the computationally demanding problem of taking the trace of the all-to-all quark propagator. We first outline the basic algorithm used to compute a quark loop as well as improvements to this method. Then, we motivate and introduce an algorithm based on the synergy between hierarchical probing and singular value deflation. We present results for the chiral condensate using a 2+1-flavor clover ensemble and compare estimates of the nucleon charges with the basic algorithm.

  18. The fractional Fourier transform and applications

    NASA Technical Reports Server (NTRS)

    Bailey, David H.; Swarztrauber, Paul N.

    1991-01-01

    This paper describes the 'fractional Fourier transform', which admits computation by an algorithm that has complexity proportional to the fast Fourier transform algorithm. Whereas the discrete Fourier transform (DFT) is based on integral roots of unity $e^{-2\pi i/n}$, the fractional Fourier transform is based on fractional roots of unity $e^{-2\pi i\alpha}$, where $\alpha$ is arbitrary. The fractional Fourier transform and the corresponding fast algorithm are useful for such applications as computing DFTs of sequences with prime lengths, computing DFTs of sparse sequences, analyzing sequences with noninteger periodicities, performing high-resolution trigonometric interpolation, detecting lines in noisy images, and detecting signals with linearly drifting frequencies. In many cases, the resulting algorithms are faster by arbitrarily large factors than conventional techniques.
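
    To make the relationship with the DFT concrete, the sketch below evaluates the transform directly as a sum over powers of the fractional root of unity, assuming the definition G_k = sum_j x_j exp(-2*pi*i*j*k*alpha). This is a plain O(n^2) evaluation for illustration only, not the fast algorithm the paper describes, and the function names are hypothetical.

```python
import cmath


def fractional_dft(x, alpha):
    """Directly evaluate G_k = sum_j x[j] * exp(-2*pi*i*j*k*alpha), k = 0..n-1."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k * alpha) for j in range(n))
        for k in range(n)
    ]


def dft(x):
    """The ordinary DFT is the special case alpha = 1/n."""
    return fractional_dft(x, 1.0 / len(x))
```

    Choosing alpha = 1/n recovers the ordinary DFT, while other values of alpha sample the spectrum on a finer or shifted grid, which is what makes noninteger periodicities accessible.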

  19. Zero-block mode decision algorithm for H.264/AVC.

    PubMed

    Lee, Yu-Ming; Lin, Yinyi

    2009-03-01

    In the previous paper, we proposed a zero-block intermode decision algorithm for H.264 video coding based upon the number of zero-blocks of 4 x 4 DCT coefficients between the current macroblock and the co-located macroblock. The proposed algorithm can achieve significant improvement in computation, but the computation performance is limited for high bit-rate coding. To improve computation efficiency, in this paper, we suggest an enhanced zero-block decision algorithm, which uses an early zero-block detection method to compute the number of zero-blocks instead of direct DCT and quantization (DCT/Q) calculation and incorporates two adequate decision methods into semi-stationary and nonstationary regions of a video sequence. In addition, the zero-block decision algorithm is also applied to the intramode prediction in the P frame. The enhanced zero-block decision algorithm reduces total encoding time by 27% on average compared to the zero-block decision algorithm.

  20. Using Computer-Based "Experiments" in the Analysis of Chemical Reaction Equilibria

    ERIC Educational Resources Information Center

    Li, Zhao; Corti, David S.

    2018-01-01

    The application of the Reaction Monte Carlo (RxMC) algorithm to standard textbook problems in chemical reaction equilibria is discussed. The RxMC method is a molecular simulation algorithm for studying the equilibrium properties of reactive systems, and therefore provides the opportunity to develop computer-based "experiments" for the…

  1. A New Augmentation Based Algorithm for Extracting Maximal Chordal Subgraphs.

    PubMed

    Bhowmick, Sanjukta; Chen, Tzu-Yi; Halappanavar, Mahantesh

    2015-02-01

    A graph is chordal if every cycle of length greater than three contains an edge between non-adjacent vertices. Chordal graphs are of interest both theoretically, since they admit polynomial time solutions to a range of NP-hard graph problems, and practically, since they arise in many applications including sparse linear algebra, computer vision, and computational biology. A maximal chordal subgraph is a chordal subgraph that is not a proper subgraph of any other chordal subgraph. Existing algorithms for computing maximal chordal subgraphs depend on dynamically ordering the vertices, which is an inherently sequential process and therefore limits the algorithms' parallelizability. In this paper we explore techniques to develop a scalable parallel algorithm for extracting a maximal chordal subgraph. We demonstrate that an earlier attempt at developing a parallel algorithm may induce a non-optimal vertex ordering and is therefore not guaranteed to terminate with a maximal chordal subgraph. We then give a new algorithm that first computes and then repeatedly augments a spanning chordal subgraph. After proving that the algorithm terminates with a maximal chordal subgraph, we then demonstrate that this algorithm is more amenable to parallelization and that the parallel version also terminates with a maximal chordal subgraph. That said, the complexity of the new algorithm is higher than that of the previous parallel algorithm, although the earlier algorithm computes a chordal subgraph which is not guaranteed to be maximal. We experimented with our augmentation-based algorithm on both synthetic and real-world graphs. We provide scalability results and also explore the effect of different choices for the initial spanning chordal subgraph on both the running time and on the number of edges in the maximal chordal subgraph.
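
    The definition of chordality at the start of this abstract can be tested with a standard technique: run a maximum cardinality search and verify that the reverse of the visit order is a perfect elimination ordering. The sketch below is a simple quadratic-time version for illustration. It only tests chordality and is not the augmentation-based extraction algorithm of the paper. The function name `is_chordal` and the adjacency-dictionary input format are assumptions.

```python
def is_chordal(adj):
    """adj maps each vertex to the set of its neighbours (undirected, no self-loops)."""
    # Maximum cardinality search: always visit the unvisited vertex
    # that has the most already-visited neighbours.
    weight = {v: 0 for v in adj}
    position = {}
    order = []
    unvisited = set(adj)
    while unvisited:
        v = max(unvisited, key=lambda u: weight[u])
        unvisited.remove(v)
        position[v] = len(order)
        order.append(v)
        for u in adj[v]:
            if u in unvisited:
                weight[u] += 1

    # The graph is chordal iff, for every vertex, its earlier-visited neighbours
    # other than the most recent one are all adjacent to that most recent one.
    for v in order:
        earlier = [u for u in adj[v] if position[u] < position[v]]
        if not earlier:
            continue
        parent = max(earlier, key=lambda u: position[u])
        if any(u != parent and u not in adj[parent] for u in earlier):
            return False
    return True
```

    A four-cycle fails the test, and adding one diagonal makes it pass. The search visits vertices one at a time in a data-dependent order, which illustrates the kind of dynamic vertex ordering the abstract calls inherently sequential.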

  2. A physics-based algorithm for real-time simulation of electrosurgery procedures in minimally invasive surgery.

    PubMed

    Lu, Zhonghua; Arikatla, Venkata S; Han, Zhongqing; Allen, Brian F; De, Suvranu

    2014-12-01

    High-frequency electricity is used in the majority of surgical interventions. However, modern computer-based training and simulation systems rely on physically unrealistic models that fail to capture the interplay of the electrical, mechanical and thermal properties of biological tissue. We present a real-time and physically realistic simulation of electrosurgery by modelling the electrical, thermal and mechanical properties as three iteratively solved finite element models. To provide subfinite-element graphical rendering of vaporized tissue, a dual-mesh dynamic triangulation algorithm based on isotherms is proposed. The block compressed row storage (BCRS) structure is shown to be critical in allowing computationally efficient changes in the tissue topology due to vaporization. We have demonstrated our physics-based electrosurgery cutting algorithm through various examples. Our matrix manipulation algorithms designed for topology changes have shown low computational cost. Our simulator offers substantially greater physical fidelity compared to previous simulators that use simple geometry-based heat characterization. Copyright © 2013 John Wiley & Sons, Ltd.

  3. Natural Inspired Intelligent Visual Computing and Its Application to Viticulture.

    PubMed

    Ang, Li Minn; Seng, Kah Phooi; Ge, Feng Lu

    2017-05-23

    This paper presents an investigation of natural inspired intelligent computing and its corresponding application towards visual information processing systems for viticulture. The paper has three contributions: (1) a review of visual information processing applications for viticulture; (2) the development of natural inspired computing algorithms based on artificial immune system (AIS) techniques for grape berry detection; and (3) the application of the developed algorithms towards real-world grape berry images captured in natural conditions from vineyards in Australia. The AIS algorithms in (2) were developed based on a nature-inspired clonal selection algorithm (CSA) which is able to detect the arcs in the berry images with precision, based on a fitness model. The arcs detected are then extended to perform the multiple arcs and ring detectors information processing for the berry detection application. The performance of the developed algorithms was compared with traditional image processing algorithms like the circular Hough transform (CHT) and other well-known circle detection methods. The proposed AIS approach gave an F-score of 0.71 compared with F-scores of 0.28 and 0.30 for the CHT and a parameter-free circle detection technique (RPCD) respectively.

  4. An imperialist competitive algorithm for virtual machine placement in cloud computing

    NASA Astrophysics Data System (ADS)

    Jamali, Shahram; Malektaji, Sepideh; Analoui, Morteza

    2017-05-01

    Cloud computing, the recently emerged revolution in the IT industry, is empowered by virtualisation technology. In this paradigm, the user's applications run over some virtual machines (VMs). The process of selecting proper physical machines to host these virtual machines is called virtual machine placement. It plays an important role in the resource utilisation and power efficiency of a cloud computing environment. In this paper, we propose an imperialist competitive-based algorithm for the virtual machine placement problem called ICA-VMPLC. The base optimisation algorithm is chosen to be ICA because of its ease in neighbourhood movement, good convergence rate and suitable terminology. The proposed algorithm investigates the search space in a unique manner to efficiently obtain an optimal placement solution that simultaneously minimises power consumption and total resource wastage. Its final solution performance is compared with several existing methods such as grouping genetic and ant colony-based algorithms as well as a bin packing heuristic. The simulation results show that the proposed method is superior to other tested algorithms in terms of power consumption, resource wastage, CPU usage efficiency and memory usage efficiency.

  5. A nonvoxel-based dose convolution/superposition algorithm optimized for scalable GPU architectures.

    PubMed

    Neylon, J; Sheng, K; Yu, V; Chen, Q; Low, D A; Kupelian, P; Santhanam, A

    2014-10-01

    Real-time adaptive planning and treatment has been infeasible due in part to its high computational complexity. There have been many recent efforts to utilize graphics processing units (GPUs) to accelerate the computational performance and dose accuracy in radiation therapy. Data structure and memory access patterns are the key GPU factors that determine the computational performance and accuracy. In this paper, the authors present a nonvoxel-based (NVB) approach to maximize computational and memory access efficiency and throughput on the GPU. The proposed algorithm employs a ray-tracing mechanism to restructure the 3D data sets computed from the CT anatomy into a nonvoxel-based framework. In a process that takes only a few milliseconds of computing time, the algorithm restructured the data sets by ray-tracing through precalculated CT volumes to realign the coordinate system along the convolution direction, as defined by zenithal and azimuthal angles. During the ray-tracing step, the data were resampled according to radial sampling and parallel ray-spacing parameters making the algorithm independent of the original CT resolution. The nonvoxel-based algorithm presented in this paper also demonstrated a trade-off in computational performance and dose accuracy for different coordinate system configurations. In order to find the best balance between the computed speedup and the accuracy, the authors employed an exhaustive parameter search on all sampling parameters that defined the coordinate system configuration: zenithal, azimuthal, and radial sampling of the convolution algorithm, as well as the parallel ray spacing during ray tracing. The angular sampling parameters were varied between 4 and 48 discrete angles, while both radial sampling and parallel ray spacing were varied from 0.5 to 10 mm. The gamma distribution analysis method (γ) was used to compare the dose distributions using 2% and 2 mm dose difference and distance-to-agreement criteria, respectively. 
    Accuracy was investigated using three distinct phantoms with varied geometries and heterogeneities and on a series of 14 segmented lung CT data sets. Performance gains were calculated using three 256 mm cube homogeneous water phantoms, with isotropic voxel dimensions of 1, 2, and 4 mm. The nonvoxel-based GPU algorithm was independent of the data size and provided significant computational gains over the CPU algorithm for large CT data sizes. The parameter search analysis also showed that the ray combination of 8 zenithal and 8 azimuthal angles along with 1 mm radial sampling and 2 mm parallel ray spacing maintained dose accuracy with greater than 99% of voxels passing the γ test. Combining the acceleration obtained from GPU parallelization with the sampling optimization, the authors achieved a total performance improvement factor of >175 000 when compared to our voxel-based ground truth CPU benchmark and a factor of 20 compared with a voxel-based GPU dose convolution method. The nonvoxel-based convolution method yielded substantial performance improvements over a generic GPU implementation, while maintaining accuracy as compared to a CPU computed ground truth dose distribution. Such an algorithm can be a key contribution toward developing tools for adaptive radiation therapy systems.
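
    The γ test with 2% dose difference and 2 mm distance-to-agreement criteria used in this abstract can be illustrated in one dimension. The sketch below assumes the commonly used gamma index, in which each reference point takes the minimum combined dose and distance deviation over all evaluated points and passes when that minimum is at most 1. The global normalisation to the maximum reference dose, the 1D profiles, and the function name `gamma_pass_rate` are assumptions for this example, not details taken from the paper.

```python
import math


def gamma_pass_rate(ref, evl, spacing_mm, dose_tol=0.02, dta_mm=2.0):
    """Fraction of reference points with gamma <= 1 for two 1D dose
    profiles sampled on the same grid with `spacing_mm` between points."""
    # Absolute dose criterion: a percentage of the maximum reference dose (assumed).
    dose_criterion = dose_tol * max(ref)
    passed = 0
    for i, d_ref in enumerate(ref):
        gamma = min(
            math.sqrt(
                ((i - j) * spacing_mm / dta_mm) ** 2
                + ((d_evl - d_ref) / dose_criterion) ** 2
            )
            for j, d_evl in enumerate(evl)
        )
        if gamma <= 1.0:
            passed += 1
    return passed / len(ref)
```

    Under these assumptions, a profile shifted by 1 mm still passes everywhere because the shift is within the 2 mm distance criterion, whereas a uniform dose offset far above 2% fails at every point.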

  6. A nonvoxel-based dose convolution/superposition algorithm optimized for scalable GPU architectures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Neylon, J., E-mail: jneylon@mednet.ucla.edu; Sheng, K.; Yu, V.

    Purpose: Real-time adaptive planning and treatment has been infeasible due in part to its high computational complexity. There have been many recent efforts to utilize graphics processing units (GPUs) to accelerate the computational performance and dose accuracy in radiation therapy. Data structure and memory access patterns are the key GPU factors that determine the computational performance and accuracy. In this paper, the authors present a nonvoxel-based (NVB) approach to maximize computational and memory access efficiency and throughput on the GPU. Methods: The proposed algorithm employs a ray-tracing mechanism to restructure the 3D data sets computed from the CT anatomy into a nonvoxel-based framework. In a process that takes only a few milliseconds of computing time, the algorithm restructured the data sets by ray-tracing through precalculated CT volumes to realign the coordinate system along the convolution direction, as defined by zenithal and azimuthal angles. During the ray-tracing step, the data were resampled according to radial sampling and parallel ray-spacing parameters making the algorithm independent of the original CT resolution. The nonvoxel-based algorithm presented in this paper also demonstrated a trade-off in computational performance and dose accuracy for different coordinate system configurations. In order to find the best balance between the computed speedup and the accuracy, the authors employed an exhaustive parameter search on all sampling parameters that defined the coordinate system configuration: zenithal, azimuthal, and radial sampling of the convolution algorithm, as well as the parallel ray spacing during ray tracing. The angular sampling parameters were varied between 4 and 48 discrete angles, while both radial sampling and parallel ray spacing were varied from 0.5 to 10 mm.
    The gamma distribution analysis method (γ) was used to compare the dose distributions using 2% and 2 mm dose difference and distance-to-agreement criteria, respectively. Accuracy was investigated using three distinct phantoms with varied geometries and heterogeneities and on a series of 14 segmented lung CT data sets. Performance gains were calculated using three 256 mm cube homogeneous water phantoms, with isotropic voxel dimensions of 1, 2, and 4 mm. Results: The nonvoxel-based GPU algorithm was independent of the data size and provided significant computational gains over the CPU algorithm for large CT data sizes. The parameter search analysis also showed that the ray combination of 8 zenithal and 8 azimuthal angles along with 1 mm radial sampling and 2 mm parallel ray spacing maintained dose accuracy with greater than 99% of voxels passing the γ test. Combining the acceleration obtained from GPU parallelization with the sampling optimization, the authors achieved a total performance improvement factor of >175 000 when compared to our voxel-based ground truth CPU benchmark and a factor of 20 compared with a voxel-based GPU dose convolution method. Conclusions: The nonvoxel-based convolution method yielded substantial performance improvements over a generic GPU implementation, while maintaining accuracy as compared to a CPU computed ground truth dose distribution. Such an algorithm can be a key contribution toward developing tools for adaptive radiation therapy systems.

  7. Efficient and accurate Greedy Search Methods for mining functional modules in protein interaction networks.

    PubMed

    He, Jieyue; Li, Chaojun; Ye, Baoliu; Zhong, Wei

    2012-06-25

    Most computational algorithms mainly focus on detecting highly connected subgraphs in PPI networks as protein complexes but ignore their inherent organization. Furthermore, many of these algorithms are computationally expensive. However, recent analysis indicates that experimentally detected protein complexes generally contain Core/attachment structures. In this paper, a Greedy Search Method based on Core-Attachment structure (GSM-CA) is proposed. The GSM-CA method detects densely connected regions in large protein-protein interaction networks based on the edge weight and two criteria for determining core nodes and attachment nodes. The GSM-CA method improves the prediction accuracy compared to other similar module detection approaches; however, it is computationally expensive. Many module detection approaches are based on the traditional hierarchical methods, which are also computationally inefficient because the hierarchical tree structure produced by these approaches cannot provide adequate information to identify whether a network belongs to a module structure or not. In order to speed up the computational process, the Greedy Search Method based on Fast Clustering (GSM-FC) is proposed in this work. The edge weight based GSM-FC method uses a greedy procedure to traverse all edges just once to separate the network into the suitable set of modules. The proposed methods are applied to the protein interaction network of S. cerevisiae. Experimental results indicate that many significant functional modules are detected, most of which match the known complexes. Results also demonstrate that the GSM-FC algorithm is faster and more accurate as compared to other competing algorithms. Based on the new edge weight definition, the proposed algorithm takes advantage of the greedy search procedure to separate the network into the suitable set of modules. Experimental analysis shows that the identified modules are statistically significant.
    The algorithm can reduce the computational time significantly while keeping high prediction accuracy.

  8. Parallel, stochastic measurement of molecular surface area.

    PubMed

    Juba, Derek; Varshney, Amitabh

    2008-08-01

    Biochemists often wish to compute surface areas of proteins. A variety of algorithms have been developed for this task, but they are designed for traditional single-processor architectures. The current trend in computer hardware is towards increasingly parallel architectures for which these algorithms are not well suited. We describe a parallel, stochastic algorithm for molecular surface area computation that maps well to the emerging multi-core architectures. Our algorithm is also progressive, providing a rough estimate of surface area immediately and refining this estimate as time goes on. Furthermore, the algorithm generates points on the molecular surface which can be used for point-based rendering. We demonstrate a GPU implementation of our algorithm and show that it compares favorably with several existing molecular surface computation programs, giving fast estimates of the molecular surface area with good accuracy.
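
    A stochastic surface area estimate of the general kind described here can be sketched with plain Monte Carlo sampling: draw random points on each atom's sphere and count the fraction that is not buried inside any other atom. This is a generic illustration, not the authors' algorithm or their GPU implementation. The function names, the tuple format `(x, y, z, radius)`, and the sample count are assumptions.

```python
import math
import random


def random_unit_vector(rng):
    """Uniform random direction: normalise a 3D Gaussian sample."""
    while True:
        x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 1e-12:
            return x / norm, y / norm, z / norm


def estimate_surface_area(atoms, samples_per_atom=2000, seed=0):
    """Estimate the exposed area of a union of spheres given as (x, y, z, radius)."""
    rng = random.Random(seed)
    total = 0.0
    for i, (cx, cy, cz, r) in enumerate(atoms):
        exposed = 0
        for _ in range(samples_per_atom):
            ux, uy, uz = random_unit_vector(rng)
            px, py, pz = cx + r * ux, cy + r * uy, cz + r * uz
            # A sample is buried if it lies strictly inside any other sphere.
            buried = any(
                (px - ox) ** 2 + (py - oy) ** 2 + (pz - oz) ** 2 < orad ** 2
                for j, (ox, oy, oz, orad) in enumerate(atoms)
                if j != i
            )
            if not buried:
                exposed += 1
        # Each sphere contributes its full area scaled by the exposed fraction.
        total += 4.0 * math.pi * r * r * exposed / samples_per_atom
    return total
```

    The samples are independent, so the work splits naturally across processors, and the estimate can be reported early and refined as more samples arrive, matching the parallel and progressive properties the abstract highlights.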

  9. Smell Detection Agent Based Optimization Algorithm

    NASA Astrophysics Data System (ADS)

    Vinod Chandra, S. S.

    2016-09-01

    In this paper, a novel nature-inspired optimization algorithm has been employed and the trained behaviour of dogs in detecting smell trails is adapted into computational agents for problem solving. The algorithm involves creation of a surface with smell trails and subsequent iteration of the agents in resolving a path. This algorithm can be applied in different computational constraints that incorporate path-based problems. Implementation of the algorithm can be treated as a shortest path problem for a variety of datasets. The simulated agents have been used to evolve the shortest path between two nodes in a graph. This algorithm is useful to solve NP-hard problems that are related to path discovery. This algorithm is also useful to solve many practical optimization problems. The extensive derivation of the algorithm can be enabled to solve shortest path problems.

  10. Research and implementation of the algorithm for unwrapped and distortion correction basing on CORDIC for panoramic image

    NASA Astrophysics Data System (ADS)

    Zhang, Zhenhai; Li, Kejie; Wu, Xiaobing; Zhang, Shujiang

    2008-03-01

    This paper presents an unwrapping and distortion-correction algorithm based on the Coordinate Rotation Digital Computer (CORDIC) and bilinear interpolation algorithms, with the purpose of processing dynamic panoramic annular images. An original annular panoramic image captured by a panoramic annular lens (PAL) can be unwrapped and corrected to a conventional rectangular image without distortion, which matches human vision much more closely. The algorithm for panoramic image processing is modeled in VHDL and implemented on an FPGA. The experimental results show that the proposed unwrapping and distortion-correction algorithm has lower computational complexity and that the architecture for dynamic panoramic image processing has lower hardware cost and power consumption. The results also show that the proposed algorithm is valid.

  11. Imaging quality analysis of computer-generated holograms using the point-based method and slice-based method

    NASA Astrophysics Data System (ADS)

    Zhang, Zhen; Chen, Siqing; Zheng, Huadong; Sun, Tao; Yu, Yingjie; Gao, Hongyue; Asundi, Anand K.

    2017-06-01

    Computer holography has made notable progress in recent years. The point-based method and slice-based method are the chief calculation algorithms for generating holograms in holographic display. Although both methods have been validated numerically and optically, the differences in their imaging quality have not been specifically analyzed. In this paper, we analyze the imaging quality of computer-generated phase holograms generated by point-based Fresnel zone plates (PB-FZP), point-based Fresnel diffraction algorithm (PB-FDA) and slice-based Fresnel diffraction algorithm (SB-FDA). The calculation formulas and hologram generation with the three methods are demonstrated. In order to suppress the speckle noise, sequential phase-only holograms are generated in our work. Numerical and experimental results for the reconstructed images are also presented. By comparing the imaging quality, the merits and drawbacks of the three methods are analyzed. Finally, conclusions are given.

  12. Monkey search algorithm for ECE components partitioning

    NASA Astrophysics Data System (ADS)

    Kuliev, Elmar; Kureichik, Vladimir; Kureichik, Vladimir, Jr.

    2018-05-01

    The paper considers one of the important design problems – a partitioning of electronic computer equipment (ECE) components (blocks). It belongs to the NP-hard class of problems and has a combinatorial and logic nature. In the paper, a partitioning problem formulation can be found as a partition of graph into parts. To solve the given problem, the authors suggest using a bioinspired approach based on a monkey search algorithm. Based on the developed software, computational experiments were carried out that show the algorithm efficiency, as well as its recommended settings for obtaining more effective solutions in comparison with a genetic algorithm.

  13. A new augmentation based algorithm for extracting maximal chordal subgraphs

    DOE PAGES

    Bhowmick, Sanjukta; Chen, Tzu-Yi; Halappanavar, Mahantesh

    2014-10-18

    A graph is chordal if every cycle of length greater than three contains an edge between non-adjacent vertices. Chordal graphs are of interest both theoretically, since they admit polynomial time solutions to a range of NP-hard graph problems, and practically, since they arise in many applications including sparse linear algebra, computer vision, and computational biology. A maximal chordal subgraph is a chordal subgraph that is not a proper subgraph of any other chordal subgraph. Existing algorithms for computing maximal chordal subgraphs depend on dynamically ordering the vertices, which is an inherently sequential process and therefore limits the algorithms’ parallelizability. In our paper we explore techniques to develop a scalable parallel algorithm for extracting a maximal chordal subgraph. We demonstrate that an earlier attempt at developing a parallel algorithm may induce a non-optimal vertex ordering and is therefore not guaranteed to terminate with a maximal chordal subgraph. We then give a new algorithm that first computes and then repeatedly augments a spanning chordal subgraph. After proving that the algorithm terminates with a maximal chordal subgraph, we then demonstrate that this algorithm is more amenable to parallelization and that the parallel version also terminates with a maximal chordal subgraph. That said, the complexity of the new algorithm is higher than that of the previous parallel algorithm, although the earlier algorithm computes a chordal subgraph which is not guaranteed to be maximal. Finally, we experimented with our augmentation-based algorithm on both synthetic and real-world graphs. We provide scalability results and also explore the effect of different choices for the initial spanning chordal subgraph on both the running time and on the number of edges in the maximal chordal subgraph.

  14. A novel clinical decision support algorithm for constructing complete medication histories.

    PubMed

    Long, Ju; Yuan, Michael Juntao

    2017-07-01

    A patient's complete medication history is a crucial element for physicians to develop a full understanding of the patient's medical conditions and treatment options. However, due to the fragmented nature of medical data, it can be very time-consuming, and often impossible, for physicians to construct a complete medication history for complex patients. In this paper, we describe an accurate, computationally efficient and scalable algorithm to construct a medication history timeline. The algorithm is developed and validated based on 1 million random prescription records from a large national prescription data aggregator. Our evaluation shows that the algorithm can be scaled horizontally on-demand, making it suitable for future delivery in a cloud-computing environment. We also propose that this cloud-based medication history computation algorithm could be integrated into Electronic Medical Records, enabling informed clinical decision-making at the point of care. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Scalable software-defined optical networking with high-performance routing and wavelength assignment algorithms.

    PubMed

    Lee, Chankyun; Cao, Xiaoyuan; Yoshikane, Noboru; Tsuritani, Takehiro; Rhee, June-Koo Kevin

    2015-10-19

    The feasibility of software-defined optical networking (SDON) for a practical application critically depends on the scalability of centralized control performance. In this paper, highly scalable routing and wavelength assignment (RWA) algorithms are investigated on an OpenFlow-based SDON testbed for proof-of-concept demonstration. Efficient RWA algorithms are proposed to achieve high performance in achieving network capacity with reduced computation cost, which is a significant attribute in a scalable centralized-control SDON. The proposed heuristic RWA algorithms differ in the orders of request processes and in the procedures of routing table updates. Combined with a shortest-path-based routing algorithm, a hottest-request-first processing policy that considers demand intensity and end-to-end distance information offers both the highest throughput of networks and acceptable computation scalability. We further investigate the trade-off relationship between network throughput and computation complexity in the routing table update procedure by a simulation study.

  16. Semiannual Report, April 1, 1989 through September 30, 1989 (Institute for Computer Applications in Science and Engineering)

    DTIC Science & Technology

    1990-02-01

    noise. Tobias B. Orloff Work began on developing a high quality rendering algorithm based on the radiosity method. The algorithm is similar to...previous progressive radiosity algorithms except for the following improvements: 1. At each iteration vertex radiosities are computed using a modified scan...line approach, thus eliminating the quadratic cost associated with a ray tracing computation of vertex radiosities. 2. At each iteration the scene is

  17. A Fast Synthetic Aperture Radar Raw Data Simulation Using Cloud Computing.

    PubMed

    Li, Zhixin; Su, Dandan; Zhu, Haijiang; Li, Wei; Zhang, Fan; Li, Ruirui

    2017-01-08

    Synthetic Aperture Radar (SAR) raw data simulation is a fundamental problem in radar system design and imaging algorithm research. The growth of surveying swath and resolution results in a significant increase in data volume and simulation period, which can be considered to be a comprehensive data intensive and computing intensive issue. Although several high performance computing (HPC) methods have demonstrated their potential for accelerating simulation, the input/output (I/O) bottleneck of huge raw data has not been eased. In this paper, we propose a cloud computing based SAR raw data simulation algorithm, which employs the MapReduce model to accelerate the raw data computing and the Hadoop distributed file system (HDFS) for fast I/O access. The MapReduce model is designed for the irregular parallel accumulation of raw data simulation, which greatly reduces the parallel efficiency of graphics processing unit (GPU) based simulation methods. In addition, three kinds of optimization strategies are put forward from the aspects of programming model, HDFS configuration and scheduling. The experimental results show that the cloud computing based algorithm achieves 4× speedup over the baseline serial approach in an 8-node cloud environment, and each optimization strategy can improve performance by about 20%. This work proves that the proposed cloud algorithm is capable of solving the computing intensive and data intensive issues in SAR raw data simulation, and is easily extended to large scale computing to achieve higher acceleration.

  18. Assignment Of Finite Elements To Parallel Processors

    NASA Technical Reports Server (NTRS)

    Salama, Moktar A.; Flower, Jon W.; Otto, Steve W.

    1990-01-01

    Elements assigned approximately optimally to subdomains. Mapping algorithm based on simulated-annealing concept used to minimize approximate time required to perform finite-element computation on hypercube computer or other network of parallel data processors. Mapping algorithm needed when shape of domain complicated or otherwise not obvious what allocation of elements to subdomains minimizes cost of computation.

  19. Efficient shortest-path-tree computation in network routing based on pulse-coupled neural networks.

    PubMed

    Qu, Hong; Yi, Zhang; Yang, Simon X

    2013-06-01

    Shortest path tree (SPT) computation is a critical issue for routers using link-state routing protocols, such as the most commonly used open shortest path first and intermediate system to intermediate system. Each router needs to recompute a new SPT rooted from itself whenever a change happens in the link state. Most commercial routers do this computation by deleting the current SPT and building a new one using static algorithms such as the Dijkstra algorithm at the beginning. Such recomputation of an entire SPT is inefficient, which may consume a considerable amount of CPU time and result in a time delay in the network. Some dynamic updating methods using the information in the updated SPT have been proposed in recent years. However, there are still many limitations in those dynamic algorithms. In this paper, a new modified model of pulse-coupled neural networks (M-PCNNs) is proposed for the SPT computation. It is rigorously proved that the proposed model is capable of solving some optimization problems, such as the SPT. A static algorithm is proposed based on the M-PCNNs to compute the SPT efficiently for large-scale problems. In addition, a dynamic algorithm that makes use of the structure of the previously computed SPT is proposed, which significantly improves the efficiency of the algorithm. Simulation results demonstrate the effective and efficient performance of the proposed approach.
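
    As a point of reference for the static baseline this abstract mentions, the following is a minimal sketch of computing an SPT with the Dijkstra algorithm. It is not the M-PCNN method of the paper; the function name `shortest_path_tree` and the four-router topology are hypothetical and chosen only for illustration.

```python
import heapq

def shortest_path_tree(graph, root):
    """Return (dist, parent) describing the shortest-path tree rooted at `root`.

    `graph` maps each node to a dict {neighbor: non-negative link cost}.
    """
    dist = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale queue entry, a shorter path was already settled
        done.add(u)
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u  # tree edge u -> v
                heapq.heappush(heap, (nd, v))
    return dist, parent

# Hypothetical four-router topology with symmetric link costs.
topology = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
dist, parent = shortest_path_tree(topology, "A")
```

    Rerunning this function from scratch after every link-state change is the full recomputation that the abstract describes as inefficient; a dynamic algorithm would instead reuse `parent` from the previous run.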

  20. Combinatorial-topological framework for the analysis of global dynamics.

    PubMed

    Bush, Justin; Gameiro, Marcio; Harker, Shaun; Kokubu, Hiroshi; Mischaikow, Konstantin; Obayashi, Ippei; Pilarczyk, Paweł

    2012-12-01

    We discuss an algorithmic framework based on efficient graph algorithms and algebraic-topological computational tools. The framework is aimed at automatic computation of a database of global dynamics of a given m-parameter semidynamical system with discrete time on a bounded subset of the n-dimensional phase space. We introduce the mathematical background, which is based upon Conley's topological approach to dynamics, describe the algorithms for the analysis of the dynamics using rectangular grids both in phase space and parameter space, and show two sample applications.

  1. Combinatorial-topological framework for the analysis of global dynamics

    NASA Astrophysics Data System (ADS)

    Bush, Justin; Gameiro, Marcio; Harker, Shaun; Kokubu, Hiroshi; Mischaikow, Konstantin; Obayashi, Ippei; Pilarczyk, Paweł

    2012-12-01

    We discuss an algorithmic framework based on efficient graph algorithms and algebraic-topological computational tools. The framework is aimed at automatic computation of a database of global dynamics of a given m-parameter semidynamical system with discrete time on a bounded subset of the n-dimensional phase space. We introduce the mathematical background, which is based upon Conley's topological approach to dynamics, describe the algorithms for the analysis of the dynamics using rectangular grids both in phase space and parameter space, and show two sample applications.

  2. Exact and Heuristic Algorithms for Runway Scheduling

    NASA Technical Reports Server (NTRS)

    Malik, Waqar A.; Jung, Yoon C.

    2016-01-01

    This paper explores the Single Runway Scheduling (SRS) problem with arrivals, departures, and crossing aircraft on the airport surface. Constraints for wake vortex separations, departure area navigation separations and departure time window restrictions are explicitly considered. The main objective of this research is to develop exact and heuristic based algorithms that can be used in real-time decision support tools for Air Traffic Control Tower (ATCT) controllers. The paper provides a multi-objective dynamic programming (DP) based algorithm that finds the exact solution to the SRS problem, but may prove unusable for application in real-time environment due to large computation times for moderate sized problems. We next propose a second algorithm that uses heuristics to restrict the search space for the DP based algorithm. A third algorithm based on a combination of insertion and local search (ILS) heuristics is then presented. Simulation conducted for the east side of Dallas/Fort Worth International Airport allows comparison of the three proposed algorithms and indicates that the ILS algorithm performs favorably in its ability to find efficient solutions and its computation times.

  3. A Comparative Analysis of Community Detection Algorithms on Artificial Networks

    PubMed Central

    Yang, Zhao; Algesheimer, René; Tessone, Claudio J.

    2016-01-01

    Many community detection algorithms have been developed to uncover the mesoscopic properties of complex networks. However, how good an algorithm is, in terms of accuracy and computing time, remains an open question. Testing algorithms on real-world networks has certain restrictions that make their insights potentially biased: the networks are usually small, and the underlying communities are not defined objectively. In this study, we employ the Lancichinetti-Fortunato-Radicchi benchmark graph to test eight state-of-the-art algorithms. We quantify the accuracy using complementary measures and algorithms’ computing time. Based on simple network properties and the aforementioned results, we provide guidelines that help to choose the most adequate community detection algorithm for a given network. Moreover, these rules allow uncovering limitations in the use of specific algorithms given macroscopic network properties. Our contribution is threefold: firstly, we provide actual techniques to determine which is the most suited algorithm in most circumstances based on observable properties of the network under consideration. Secondly, we use the mixing parameter as an easily measurable indicator of finding the ranges of reliability of the different algorithms. Finally, we study the dependency with network size focusing on both the algorithm’s predicting power and the effective computing time. PMID:27476470

  4. Combining Acceleration Techniques for Low-Dose X-Ray Cone Beam Computed Tomography Image Reconstruction.

    PubMed

    Huang, Hsuan-Ming; Hsiao, Ing-Tsung

    2017-01-01

    Over the past decade, image quality in low-dose computed tomography has been greatly improved by various compressive sensing- (CS-) based reconstruction methods. However, these methods have some disadvantages including high computational cost and slow convergence rate. Many different speed-up techniques for CS-based reconstruction algorithms have been developed. The purpose of this paper is to propose a fast reconstruction framework that combines a CS-based reconstruction algorithm with several speed-up techniques. First, total difference minimization (TDM) was implemented using the soft-threshold filtering (STF). Second, we combined TDM-STF with the ordered subsets transmission (OSTR) algorithm for accelerating the convergence. To further speed up the convergence of the proposed method, we applied the power factor and the fast iterative shrinkage thresholding algorithm to OSTR and TDM-STF, respectively. Results obtained from simulation and phantom studies showed that many speed-up techniques could be combined to greatly improve the convergence speed of a CS-based reconstruction algorithm. More importantly, the increased computation time (≤10%) was minor as compared to the acceleration provided by the proposed method. In this paper, we have presented a CS-based reconstruction framework that combines several acceleration techniques. Both simulation and phantom studies provide evidence that the proposed method has the potential to satisfy the requirement of fast image reconstruction in practical CT.
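
    The soft-threshold step named in this abstract can be illustrated with the scalar soft-thresholding operator used by iterative shrinkage-thresholding methods. This is only a sketch of that operator, not the TDM-STF or OSTR pipeline of the paper; the function name and the sample values are hypothetical.

```python
def soft_threshold(values, t):
    """Shrink each value toward zero by t; values with |v| <= t become 0."""
    out = []
    for v in values:
        if v > t:
            out.append(v - t)
        elif v < -t:
            out.append(v + t)
        else:
            out.append(0.0)
    return out

# Small coefficients are zeroed, large ones are reduced in magnitude by t.
shrunk = soft_threshold([3.0, -0.5, 0.2, -2.0], 1.0)
```

    In a CS-based reconstruction this operator would typically be applied to a sparsifying transform of the image (for example, differences between neighboring pixels) rather than to the pixel values directly.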

  5. A Non-Intrusive Algorithm for Sensitivity Analysis of Chaotic Flow Simulations

    NASA Technical Reports Server (NTRS)

    Blonigan, Patrick J.; Wang, Qiqi; Nielsen, Eric J.; Diskin, Boris

    2017-01-01

    We demonstrate a novel algorithm for computing the sensitivity of statistics in chaotic flow simulations to parameter perturbations. The algorithm is non-intrusive but requires exposing an interface. Based on the principle of shadowing in dynamical systems, this algorithm is designed to reduce the effect of the sampling error in computing sensitivity of statistics in chaotic simulations. We compare the effectiveness of this method to that of the conventional finite difference method.

  6. Parallel Implementation of the Wideband DOA Algorithm on the IBM Cell BE Processor

    DTIC Science & Technology

    2010-05-01

    Abstract—The Multiple Signal Classification (MUSIC) algorithm is a powerful technique for determining the Direction of Arrival (DOA) of signals...Broadband Engine Processor (Cell BE). The process of adapting the serial based MUSIC algorithm to the Cell BE will be analyzed in terms of parallelism and...using Multiple Signal Classification MUSIC algorithm [4] • Computation of Focus matrix • Computation of number of sources • Separation of Signal

  7. Stream-based Hebbian eigenfilter for real-time neuronal spike discrimination

    PubMed Central

    2012-01-01

    Background: Principal component analysis (PCA) has been widely employed for automatic neuronal spike sorting. Calculating principal components (PCs) is computationally expensive, and requires complex numerical operations and large memory resources. Substantial hardware resources are therefore needed for hardware implementations of PCA. General Hebbian algorithm (GHA) has been proposed for calculating PCs of neuronal spikes in our previous work, which eliminates the need for computationally expensive covariance analysis and eigenvalue decomposition in conventional PCA algorithms. However, large memory resources are still inherently required for storing a large volume of aligned spikes for training PCs. The large size memory will consume large hardware resources and contribute significant power dissipation, which make GHA difficult to implement in portable or implantable multi-channel recording micro-systems. Method: In this paper, we present a new algorithm for PCA-based spike sorting based on GHA, namely stream-based Hebbian eigenfilter, which eliminates the inherent memory requirements of GHA while keeping the accuracy of spike sorting by utilizing the pseudo-stationarity of neuronal spikes. Because of the reduction of large hardware storage requirements, the proposed algorithm can lead to ultra-low hardware resources and power consumption of hardware implementations, which is critical for the future multi-channel micro-systems. Both clinical and synthetic neural recording data sets were employed for evaluating the accuracy of the stream-based Hebbian eigenfilter. The performance of spike sorting using stream-based eigenfilter and the computational complexity of the eigenfilter were rigorously evaluated and compared with conventional PCA algorithms. Field programmable logic arrays (FPGAs) were employed to implement the proposed algorithm, evaluate the hardware implementations and demonstrate the reduction in both power consumption and hardware memories achieved by the streaming computing. Results and discussion: Results demonstrate that the stream-based eigenfilter can achieve the same accuracy and is 10 times more computationally efficient when compared with conventional PCA algorithms. Hardware evaluations show that 90.3% logic resources, 95.1% power consumption and 86.8% computing latency can be reduced by the stream-based eigenfilter when compared with PCA hardware. By utilizing the streaming method, 92% memory resources and 67% power consumption can be saved when compared with the direct implementation of GHA. Conclusion: Stream-based Hebbian eigenfilter presents a novel approach to enable real-time spike sorting with reduced computational complexity and hardware costs. This new design can be further utilized for multi-channel neuro-physiological experiments or chronic implants. PMID:22490725

  8. A Linked List-Based Algorithm for Blob Detection on Embedded Vision-Based Sensors.

    PubMed

    Acevedo-Avila, Ricardo; Gonzalez-Mendoza, Miguel; Garcia-Garcia, Andres

    2016-05-28

    Blob detection is a common task in vision-based applications. Most existing algorithms are aimed at execution on general purpose computers; while very few can be adapted to the computing restrictions present in embedded platforms. This paper focuses on the design of an algorithm capable of real-time blob detection that minimizes system memory consumption. The proposed algorithm detects objects in one image scan; it is based on a linked-list data structure tree used to label blobs depending on their shape and node information. An example application showing the results of a blob detection co-processor has been built on a low-powered field programmable gate array hardware as a step towards developing a smart video surveillance system. The detection method is intended for general purpose application. As such, several test cases focused on character recognition are also examined. The results obtained present a fair trade-off between accuracy and memory requirements; and prove the validity of the proposed approach for real-time implementation on resource-constrained computing platforms.

  9. Evidence for Model-based Computations in the Human Amygdala during Pavlovian Conditioning

    PubMed Central

    Prévost, Charlotte; McNamee, Daniel; Jessup, Ryan K.; Bossaerts, Peter; O'Doherty, John P.

    2013-01-01

    Contemporary computational accounts of instrumental conditioning have emphasized a role for a model-based system in which values are computed with reference to a rich model of the structure of the world, and a model-free system in which values are updated without encoding such structure. Much less studied is the possibility of a similar distinction operating at the level of Pavlovian conditioning. In the present study, we scanned human participants while they participated in a Pavlovian conditioning task with a simple structure while measuring activity in the human amygdala using a high-resolution fMRI protocol. After fitting a model-based algorithm and a variety of model-free algorithms to the fMRI data, we found evidence for the superiority of a model-based algorithm in accounting for activity in the amygdala compared to the model-free counterparts. These findings support an important role for model-based algorithms in describing the processes underpinning Pavlovian conditioning, as well as providing evidence of a role for the human amygdala in model-based inference. PMID:23436990

  10. Performance comparison of heuristic algorithms for task scheduling in IaaS cloud computing environment.

    PubMed

    Madni, Syed Hamid Hussain; Abd Latiff, Muhammad Shafie; Abdullahi, Mohammed; Abdulhamid, Shafi'i Muhammad; Usman, Mohammed Joda

    2017-01-01

    Cloud computing infrastructure is suitable for meeting computational needs of large task sizes. Optimal scheduling of tasks in cloud computing environment has been proved to be an NP-complete problem, hence the need for the application of heuristic methods. Several heuristic algorithms have been developed and used in addressing this problem, but choosing the appropriate algorithm for solving task assignment problem of a particular nature is difficult since the methods are developed under different assumptions. Therefore, six rule based heuristic algorithms are implemented and used to schedule autonomous tasks in homogeneous and heterogeneous environments with the aim of comparing their performance in terms of cost, degree of imbalance, makespan and throughput. First Come First Serve (FCFS), Minimum Completion Time (MCT), Minimum Execution Time (MET), Max-min, Min-min and Sufferage are the heuristic algorithms considered for the performance comparison and analysis of task scheduling in cloud computing.
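
    Of the six heuristics listed, Min-min is easy to state in code: repeatedly pick the unscheduled task whose best achievable completion time is smallest and assign it to the machine that achieves it. The sketch below assumes independent tasks and a known execution-time matrix; the function name and the sample matrix are hypothetical and not taken from the paper.

```python
def min_min_schedule(exec_time):
    """Min-min heuristic for independent tasks.

    exec_time[t][m] is the execution time of task t on machine m.
    Returns (assignment, makespan), where assignment maps task -> machine.
    """
    n_machines = len(exec_time[0])
    ready = [0] * n_machines            # time at which each machine becomes free
    unscheduled = set(range(len(exec_time)))
    assignment = {}
    while unscheduled:
        best = None                     # (completion time, task, machine)
        for t in sorted(unscheduled):
            for m in range(n_machines):
                completion = ready[m] + exec_time[t][m]
                if best is None or completion < best[0]:
                    best = (completion, t, m)
        completion, t, m = best
        assignment[t] = m
        ready[m] = completion
        unscheduled.remove(t)
    return assignment, max(ready)

# Hypothetical instance: three tasks, two heterogeneous machines.
assignment, makespan = min_min_schedule([[3, 5], [2, 4], [6, 1]])
```

    Max-min differs only in the selection rule: among the per-task minimum completion times it picks the largest rather than the smallest, which tends to place long tasks first.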

  11. Performance comparison of heuristic algorithms for task scheduling in IaaS cloud computing environment

    PubMed Central

    Madni, Syed Hamid Hussain; Abd Latiff, Muhammad Shafie; Abdullahi, Mohammed; Usman, Mohammed Joda

    2017-01-01

    Cloud computing infrastructure is suitable for meeting computational needs of large task sizes. Optimal scheduling of tasks in cloud computing environment has been proved to be an NP-complete problem, hence the need for the application of heuristic methods. Several heuristic algorithms have been developed and used in addressing this problem, but choosing the appropriate algorithm for solving task assignment problem of a particular nature is difficult since the methods are developed under different assumptions. Therefore, six rule based heuristic algorithms are implemented and used to schedule autonomous tasks in homogeneous and heterogeneous environments with the aim of comparing their performance in terms of cost, degree of imbalance, makespan and throughput. First Come First Serve (FCFS), Minimum Completion Time (MCT), Minimum Execution Time (MET), Max-min, Min-min and Sufferage are the heuristic algorithms considered for the performance comparison and analysis of task scheduling in cloud computing. PMID:28467505

  12. 160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA)

    PubMed Central

    Li, Isaac TS; Shum, Warren; Truong, Kevin

    2007-01-01

    Background: To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. Results: In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160-fold compared to a pure software implementation running on the same FPGA with an Altera Nios II softprocessor. Conclusion: This design of FPGA accelerated hardware offers a new promising direction to seeking computation improvement of genomic database searching. PMID:17555593
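
    The "score of a single cell of the SW matrix" is the unit that the hardware module computes. A minimal software sketch of that recurrence is shown below; the scoring parameters (match, mismatch, linear gap penalty) and the function name are assumptions for illustration and are not taken from the paper.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]   # first row and column stay 0
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Each cell depends only on its upper-left, upper and left neighbors.
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best
```

    Because each cell depends only on three neighbors, all cells on the same anti-diagonal are independent of one another, which is what makes a grid of identical hardware cells possible.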

  13. 160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA).

    PubMed

    Li, Isaac T S; Shum, Warren; Truong, Kevin

    2007-06-07

    To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160-fold compared to a pure software implementation running on the same FPGA with an Altera Nios II softprocessor. This design of FPGA accelerated hardware offers a new promising direction to seeking computation improvement of genomic database searching.

  14. Fast polar decomposition of an arbitrary matrix

    NASA Technical Reports Server (NTRS)

    Higham, Nicholas J.; Schreiber, Robert S.

    1988-01-01

    The polar decomposition of an m x n matrix A of full rank, where m is greater than or equal to n, can be computed using a quadratically convergent algorithm. The algorithm is based on a Newton iteration involving a matrix inverse. With the use of a preliminary complete orthogonal decomposition the algorithm can be extended to arbitrary A. How to use the algorithm to compute the positive semi-definite square root of a Hermitian positive semi-definite matrix is described. A hybrid algorithm which adaptively switches from the matrix inversion based iteration to a matrix multiplication based iteration due to Kovarik, and to Bjorck and Bowie is formulated. The decision when to switch is made using a condition estimator. This matrix multiplication rich algorithm is shown to be more efficient on machines for which matrix multiplication can be executed 1.5 times faster than matrix inversion.
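
    A Newton iteration involving a matrix inverse can be sketched for the simplest case. The code below assumes a real, square, nonsingular 2 x 2 matrix and uses the unscaled iteration X <- (X + X^-T) / 2, which converges to the orthogonal polar factor U; the symmetric factor is then H = U^T A. The helper names, the 2 x 2 restriction and the stopping rule are illustrative assumptions, not the report's full m x n or hybrid algorithm.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def transpose2(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def matmul2(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def polar_newton_2x2(a, tol=1e-12, max_iter=50):
    """Return (U, H) with A = U H, U orthogonal, H symmetric positive definite."""
    x = [row[:] for row in a]
    for _ in range(max_iter):
        y = transpose2(inv2(x))          # X^-T
        nxt = [[0.5 * (x[i][j] + y[i][j]) for j in range(2)] for i in range(2)]
        delta = max(abs(nxt[i][j] - x[i][j]) for i in range(2) for j in range(2))
        x = nxt
        if delta < tol:
            break
    u = x
    h = matmul2(transpose2(u), a)        # H = U^T A
    return u, h

A = [[2.0, 1.0], [0.0, 3.0]]
U, H = polar_newton_2x2(A)
```

    Each step maps every singular value s of the iterate to (s + 1/s) / 2, which is why the iteration drives all singular values to 1 and converges quadratically near the solution.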

  15. An efficient parallel algorithm for the solution of a tridiagonal linear system of equations

    NASA Technical Reports Server (NTRS)

    Stone, H. S.

    1971-01-01

    Tridiagonal linear systems of equations are solved on conventional serial machines in a time proportional to N, where N is the number of equations. The conventional algorithms do not lend themselves directly to parallel computations on computers of the ILLIAC IV class, in the sense that they appear to be inherently serial. An efficient parallel algorithm is presented in which computation time grows as log₂ N. The algorithm is based on recursive doubling solutions of linear recurrence relations, and can be used to solve recurrence relations of all orders.
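
    Recursive doubling can be illustrated on a first-order linear recurrence x[i] = a[i] * x[i-1] + b[i]. Each step is an affine map, and composing affine maps is associative, so all prefixes can be combined in about log₂ N rounds whose inner updates are mutually independent. The sketch below runs those rounds sequentially in plain Python; it shows only this building block, not the full tridiagonal solver, and the function name and sample data are hypothetical.

```python
def recursive_doubling(a, b, x0=0.0):
    """Solve x[i] = a[i] * x[i-1] + b[i] for i = 0..n-1, with x[-1] = x0."""
    n = len(a)
    coef = list(a)   # invariant: x[i] = coef[i] * x[i - span] + off[i]
    off = list(b)
    step = 1
    while step < n:                      # about log2(n) rounds
        new_coef, new_off = coef[:], off[:]
        for i in range(step, n):         # updates in one round are independent
            # Compose the map ending at i with the map ending at i - step.
            new_coef[i] = coef[i] * coef[i - step]
            new_off[i] = coef[i] * off[i - step] + off[i]
        coef, off = new_coef, new_off
        step *= 2
    return [c * x0 + o for c, o in zip(coef, off)]

a = [2.0, 3.0, 1.0, 0.5, 2.0]
b = [1.0, 1.0, 1.0, 1.0, 1.0]
xs = recursive_doubling(a, b, x0=1.0)
```

    On a parallel machine the inner loop would run as one vector operation per round, giving the logarithmic growth in time that the abstract refers to.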

  16. System identification using Nuclear Norm & Tabu Search optimization

    NASA Astrophysics Data System (ADS)

    Ahmed, Asif A.; Schoen, Marco P.; Bosworth, Ken W.

    2018-01-01

    In recent years, subspace System Identification (SI) algorithms have seen increased research, stemming from advanced minimization methods being applied to the Nuclear Norm (NN) approach in system identification. These minimization algorithms are based on hard computing methodologies. To the authors’ knowledge, as of now, there has been no work reported that utilizes soft computing algorithms to address the minimization problem within the nuclear norm SI framework. A linear, time-invariant, discrete time system is used in this work as the basic model for characterizing a dynamical system to be identified. The main objective is to extract a mathematical model from collected experimental input-output data. Hankel matrices are constructed from experimental data, and the extended observability matrix is employed to define an estimated output of the system. This estimated output and the actual - measured - output are utilized to construct a minimization problem. An embedded rank measure assures minimum state realization outcomes. Current NN-SI algorithms employ hard computing algorithms for minimization. In this work, we propose a simple Tabu Search (TS) algorithm for minimization. TS algorithm based SI is compared with the iterative Alternating Direction Method of Multipliers (ADMM) line search optimization based NN-SI. For comparison, several different benchmark system identification problems are solved by both approaches. Results show improved performance of the proposed SI-TS algorithm compared to the NN-SI ADMM algorithm.

  17. FANSe2: a robust and cost-efficient alignment tool for quantitative next-generation sequencing applications.

    PubMed

    Xiao, Chuan-Le; Mai, Zhi-Biao; Lian, Xin-Lei; Zhong, Jia-Yong; Jin, Jing-Jie; He, Qing-Yu; Zhang, Gong

    2014-01-01

    Correct and bias-free interpretation of the deep sequencing data is inevitably dependent on the complete mapping of all mappable reads to the reference sequence, especially for quantitative RNA-seq applications. Seed-based algorithms are generally slow but robust, while Burrows-Wheeler Transform (BWT) based algorithms are fast but less robust. To have both advantages, we developed an algorithm FANSe2 with iterative mapping strategy based on the statistics of real-world sequencing error distribution to substantially accelerate the mapping without compromising the accuracy. Its sensitivity and accuracy are higher than the BWT-based algorithms in the tests using both prokaryotic and eukaryotic sequencing datasets. The gene identification results of FANSe2 are experimentally validated, while the previous algorithms have false positives and false negatives. FANSe2 showed remarkably better consistency to the microarray than most other algorithms in terms of gene expression quantifications. We implemented a scalable and almost maintenance-free parallelization method that can utilize the computational power of multiple office computers, a novel feature not present in any other mainstream algorithm. With three normal office computers, we demonstrated that FANSe2 mapped an RNA-seq dataset generated from an entire Illumina HiSeq 2000 flowcell (8 lanes, 608 M reads) to masked human genome within 4.1 hours with higher sensitivity than Bowtie/Bowtie2. FANSe2 thus provides robust accuracy, full indel sensitivity, fast speed, versatile compatibility and economical computational utilization, making it a useful and practical tool for deep sequencing applications. FANSe2 is freely available at http://bioinformatics.jnu.edu.cn/software/fanse2/.

  18. New Parallel Algorithms for Landscape Evolution Model

    NASA Astrophysics Data System (ADS)

    Jin, Y.; Zhang, H.; Shi, Y.

    2017-12-01

    Most landscape evolution models (LEM) developed in the last two decades solve the diffusion equation to simulate the transportation of surface sediments. This numerical approach is difficult to parallelize due to the computation of drainage area for each node, which requires a huge amount of communication if run in parallel. In order to overcome this difficulty, we developed two parallel algorithms for LEM with a stream net. One algorithm handles the partition of grid with traditional methods and applies an efficient global reduction algorithm to do the computation of drainage areas and transport rates for the stream net; the other algorithm is based on a new partition algorithm, which partitions the nodes in catchments between processes first, and then partitions the cells according to the partition of nodes. Both methods focus on decreasing communication between processes and take advantage of massive computing techniques, and numerical experiments show that they are both adequate to handle large scale problems with millions of cells. We implemented the two algorithms in our program based on the widely used finite element library deal.II, so that it can be easily coupled with ASPECT.

  19. Renewable energy in electric utility capacity planning: a decomposition approach with application to a Mexican utility

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Staschus, K.

    1985-01-01

    In this dissertation, efficient algorithms for electric-utility capacity expansion planning with renewable energy are developed. The algorithms include a deterministic phase that quickly finds a near-optimal expansion plan using derating and a linearized approximation to the time-dependent availability of nondispatchable energy sources. A probabilistic second phase needs comparatively few computer-time consuming probabilistic simulation iterations to modify this solution towards the optimal expansion plan. For the deterministic first phase, two algorithms, based on a Lagrangian Dual decomposition and a Generalized Benders Decomposition, are developed. The probabilistic second phase uses a Generalized Benders Decomposition approach. Extensive computational tests of the algorithms are reported. Among the deterministic algorithms, the one based on Lagrangian Duality proves fastest. The two-phase approach is shown to save up to 80% in computing time as compared to a purely probabilistic algorithm. The algorithms are applied to determine the optimal expansion plan for the Tijuana-Mexicali subsystem of the Mexican electric utility system. A strong recommendation to push conservation programs in the desert city of Mexicali results from this implementation.

  20. Parallel Optimization of 3D Cardiac Electrophysiological Model Using GPU

    PubMed Central

    Xia, Yong; Zhang, Henggui

    2015-01-01

    Large-scale 3D virtual heart model simulations are highly demanding in computational resources. This imposes a big challenge to the traditional computation resources based on CPU environment, which already cannot meet the requirement of the whole computation demands or are not easily available due to expensive costs. GPU as a parallel computing environment therefore provides an alternative to solve the large-scale computational problems of whole heart modeling. In this study, using a 3D sheep atrial model as a test bed, we developed a GPU-based simulation algorithm to simulate the conduction of electrical excitation waves in the 3D atria. In the GPU algorithm, a multicellular tissue model was split into two components: one is the single cell model (ordinary differential equation) and the other is the diffusion term of the monodomain model (partial differential equation). Such a decoupling enabled realization of the GPU parallel algorithm. Furthermore, several optimization strategies were proposed based on the features of the virtual heart model, which enabled a 200-fold speedup as compared to a CPU implementation. In conclusion, an optimized GPU algorithm has been developed that provides an economic and powerful platform for 3D whole heart simulations. PMID:26581957

  1. Parallel Optimization of 3D Cardiac Electrophysiological Model Using GPU.

    PubMed

    Xia, Yong; Wang, Kuanquan; Zhang, Henggui

    2015-01-01

    Large-scale 3D virtual heart model simulations are highly demanding in computational resources. This imposes a big challenge to the traditional computation resources based on CPU environment, which already cannot meet the requirement of the whole computation demands or are not easily available due to expensive costs. GPU as a parallel computing environment therefore provides an alternative to solve the large-scale computational problems of whole heart modeling. In this study, using a 3D sheep atrial model as a test bed, we developed a GPU-based simulation algorithm to simulate the conduction of electrical excitation waves in the 3D atria. In the GPU algorithm, a multicellular tissue model was split into two components: one is the single cell model (ordinary differential equation) and the other is the diffusion term of the monodomain model (partial differential equation). Such a decoupling enabled realization of the GPU parallel algorithm. Furthermore, several optimization strategies were proposed based on the features of the virtual heart model, which enabled a 200-fold speedup as compared to a CPU implementation. In conclusion, an optimized GPU algorithm has been developed that provides an economic and powerful platform for 3D whole heart simulations.
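
    The decoupling described above (a per-cell ODE step followed by a diffusion PDE step) can be sketched on a 1D cable as follows. This is only an illustration under stated assumptions: the FitzHugh-Nagumo kinetics, the explicit Euler updates, the no-flux boundaries, and all parameter values are stand-ins chosen here, not the sheep atrial model or the GPU kernels of the paper.

```python
# Sketch of operator splitting for a monodomain-style model on a 1D cable.
# Assumptions (not from the abstract): FitzHugh-Nagumo kinetics as the
# single-cell model, explicit Euler time steps, no-flux boundaries.

def cell_step(v, w, dt, a=0.1, eps=0.01, gamma=0.5):
    """Advance the single-cell ODEs by dt at every node independently."""
    new_v, new_w = [], []
    for vi, wi in zip(v, w):
        dv = vi * (1.0 - vi) * (vi - a) - wi   # fast voltage variable
        dw = eps * (vi - gamma * wi)           # slow recovery variable
        new_v.append(vi + dt * dv)
        new_w.append(wi + dt * dw)
    return new_v, new_w


def diffusion_step(v, dt, dx, d=1.0):
    """Advance the diffusion PDE by dt using mirrored (no-flux) boundaries."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else v[1]
        right = v[i + 1] if i < n - 1 else v[n - 2]
        out.append(v[i] + dt * d * (left - 2.0 * v[i] + right) / (dx * dx))
    return out


def simulate(n=50, steps=200, dt=0.05, dx=1.0):
    """Alternate the two sub-steps; each is independent per node."""
    v = [1.0 if i < 5 else 0.0 for i in range(n)]  # stimulate the left end
    w = [0.0] * n
    for _ in range(steps):
        v, w = cell_step(v, w, dt)     # reaction (ODE) part
        v = diffusion_step(v, dt, dx)  # diffusion (PDE) part
    return v, w
```

    Because each sub-step updates every node from the previous state only, both loops are data-parallel, which is what makes this kind of splitting a natural fit for a GPU.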

  2. Active control of impulsive noise with symmetric α-stable distribution based on an improved step-size normalized adaptive algorithm

    NASA Astrophysics Data System (ADS)

    Zhou, Yali; Zhang, Qizhi; Yin, Yixin

    2015-05-01

    In this paper, active control of impulsive noise with symmetric α-stable (SαS) distribution is studied. A general step-size normalized filtered-x Least Mean Square (FxLMS) algorithm is developed based on the analysis of existing algorithms, and the Gaussian distribution function is used to normalize the step size. Compared with existing algorithms, the proposed algorithm needs neither the parameter selection and thresholds estimation nor the process of cost function selection and complex gradient computation. Computer simulations have been carried out to suggest that the proposed algorithm is effective for attenuating SαS impulsive noise, and then the proposed algorithm has been implemented in an experimental ANC system. Experimental results show that the proposed scheme has good performance for SαS impulsive noise attenuation.

  3. A modern control theory based algorithm for control of the NASA/JPL 70-meter antenna axis servos

    NASA Technical Reports Server (NTRS)

    Hill, R. E.

    1987-01-01

    A digital computer-based state variable controller was designed and applied to the 70-m antenna axis servos. The general equations and structure of the algorithm and provisions for alternate position error feedback modes to accommodate intertarget slew, encoder referenced tracking, and precision tracking modes are described. Development of the discrete time domain control model and computation of estimator and control gain parameters based on closed loop pole placement criteria are discussed. The new algorithm was successfully implemented and tested in the 70-m antenna at Deep Space Network station 63 in Spain.

  4. LAWS simulation: Sampling strategies and wind computation algorithms

    NASA Technical Reports Server (NTRS)

    Emmitt, G. D. A.; Wood, S. A.; Houston, S. H.

    1989-01-01

    In general, work has continued on developing and evaluating algorithms designed to manage the Laser Atmospheric Wind Sounder (LAWS) lidar pulses and to compute the horizontal wind vectors from the line-of-sight (LOS) measurements. These efforts fall into three categories: Improvements to the shot management and multi-pair algorithms (SMA/MPA); observing system simulation experiments; and ground-based simulations of LAWS.

  5. SOI layout decomposition for double patterning lithography on high-performance computer platforms

    NASA Astrophysics Data System (ADS)

    Verstov, Vladimir; Zinchenko, Lyudmila; Makarchuk, Vladimir

    2014-12-01

    In the paper silicon on insulator layout decomposition algorithms for the double patterning lithography on high performance computing platforms are discussed. Our approach is based on the use of a contradiction graph and a modified concurrent breadth-first search algorithm. We evaluate our technique on 45 nm Nangate Open Cell Library including non-Manhattan geometry. Experimental results show that our soft computing algorithms decompose layout successfully and a minimal distance between polygons in layout is increased.

  6. A novel iris localization algorithm using correlation filtering

    NASA Astrophysics Data System (ADS)

    Pohit, Mausumi; Sharma, Jitu

    2015-06-01

    Fast and efficient segmentation of iris from the eye images is a primary requirement for robust database independent iris recognition. In this paper we have presented a new algorithm for computing the inner and outer boundaries of the iris and locating the pupil centre. Pupil-iris boundary computation is based on correlation filtering approach, whereas iris-sclera boundary is determined through one dimensional intensity mapping. The proposed approach is computationally less extensive when compared with the existing algorithms like Hough transform.

  7. Algorithms and software used in selecting structure of machine-training cluster based on neurocomputers

    NASA Astrophysics Data System (ADS)

    Romanchuk, V. A.; Lukashenko, V. V.

    2018-05-01

    The technique of functioning of a control system by a computing cluster based on neurocomputers is proposed. Particular attention is paid to the method of choosing the structure of the computing cluster due to the fact that the existing methods are not effective because of a specialized hardware base - neurocomputers, which are highly parallel computer devices with an architecture different from the von Neumann architecture. A developed algorithm for choosing the computational structure of a cloud cluster is described, starting from the direction of data transfer in the flow control graph of the program and its adjacency matrix.

  8. Cognitive Correlates of Performance in Algorithms in a Computer Science Course for High School

    ERIC Educational Resources Information Center

    Avancena, Aimee Theresa; Nishihara, Akinori

    2014-01-01

    Computer science for high school faces many challenging issues. One of these is whether the students possess the appropriate cognitive ability for learning the fundamentals of computer science. Online tests were created based on known cognitive factors and fundamental algorithms and were implemented among the second grade students in the…

  9. Computation-aware algorithm selection approach for interlaced-to-progressive conversion

    NASA Astrophysics Data System (ADS)

    Park, Sang-Jun; Jeon, Gwanggil; Jeong, Jechang

    2010-05-01

    We discuss deinterlacing results in a computationally constrained and varied environment. The proposed computation-aware algorithm selection approach (CASA) for fast interlaced to progressive conversion algorithm consists of three methods: the line-averaging (LA) method for plain regions, the modified edge-based line-averaging (MELA) method for medium regions, and the proposed covariance-based adaptive deinterlacing (CAD) method for complex regions. The proposed CASA uses two criteria, mean-squared error (MSE) and CPU time, for assigning the method. We proposed a CAD method. The principal idea of CAD is based on the correspondence between the high and low-resolution covariances. We estimated the local covariance coefficients from an interlaced image using Wiener filtering theory and then used these optimal minimum MSE interpolation coefficients to obtain a deinterlaced image. The CAD method, though more robust than most known methods, was not found to be very fast compared to the others. To alleviate this issue, we proposed an adaptive selection approach using a fast deinterlacing algorithm rather than using only one CAD algorithm. The proposed hybrid approach of switching between the conventional schemes (LA and MELA) and our CAD was proposed to reduce the overall computational load. A reliable condition to be used for switching the schemes was presented after a wide set of initial training processes. The results of computer simulations showed that the proposed methods outperformed a number of methods presented in the literature.
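
    Of the three methods named above, only line averaging (LA) is simple enough to sketch from the abstract alone. The snippet below is a minimal illustration, assuming the field holds the even lines of the frame and that the last missing line is filled by copying its neighbour; the function name is hypothetical, and MELA and CAD are not shown because the abstract does not specify them.

```python
def line_average_deinterlace(field):
    """Rebuild a full frame from a field that holds only the even lines.

    field: list of rows, each a list of pixel values.
    Each missing odd line is the average of the lines above and below it;
    the final missing line has no line below, so it copies the line above.
    """
    frame = []
    for i, row in enumerate(field):
        frame.append(list(row))  # existing line is kept unchanged
        if i + 1 < len(field):
            below = field[i + 1]
            frame.append([(a + b) / 2 for a, b in zip(row, below)])
        else:
            frame.append(list(row))
    return frame
```

    For example, `line_average_deinterlace([[0, 10], [4, 20]])` yields four lines, with the interpolated second line equal to `[2.0, 15.0]`.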

  10. Approximation algorithms for planning and control

    NASA Technical Reports Server (NTRS)

    Boddy, Mark; Dean, Thomas

    1989-01-01

    A control system operating in a complex environment will encounter a variety of different situations, with varying amounts of time available to respond to critical events. Ideally, such a control system will do the best possible with the time available. In other words, its responses should approximate those that would result from having unlimited time for computation, where the degree of the approximation depends on the amount of time it actually has. There exist approximation algorithms for a wide variety of problems. Unfortunately, the solution to any reasonably complex control problem will require solving several computationally intensive problems. Algorithms for successive approximation are a subclass of the class of anytime algorithms, algorithms that return answers for any amount of computation time, where the answers improve as more time is allotted. An architecture is described for allocating computation time to a set of anytime algorithms, based on expectations regarding the value of the answers they return. The architecture described is quite general, producing optimal schedules for a set of algorithms under widely varying conditions.
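
    The anytime property described above (an answer is available after any amount of computation and improves with more) can be illustrated with a toy successive-approximation routine. This is a sketch under assumptions: the bisection square root, the iteration-count budget standing in for wall-clock time, and the name `anytime_sqrt` are illustrative choices, and the scheduling architecture of the paper is not reproduced.

```python
def anytime_sqrt(x, budget):
    """Approximate sqrt(x) for x >= 0 by bisection within an iteration budget.

    Returns (estimate, error_bound). The bracket [lo, hi] always contains
    the true root, so the midpoint is off by at most half the bracket width.
    """
    lo, hi = 0.0, max(1.0, x)
    for _ in range(budget):
        mid = (lo + hi) / 2
        if mid * mid <= x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2, (hi - lo) / 2
```

    Each extra iteration halves the error bound, so a scheduler that knows this improvement profile could, in principle, decide how much budget an answer is worth.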

  11. TSaT-MUSIC: a novel algorithm for rapid and accurate ultrasonic 3D localization

    NASA Astrophysics Data System (ADS)

    Mizutani, Kyohei; Ito, Toshio; Sugimoto, Masanori; Hashizume, Hiromichi

    2011-12-01

    We describe a fast and accurate indoor localization technique using the multiple signal classification (MUSIC) algorithm. The MUSIC algorithm is known as a high-resolution method for estimating directions of arrival (DOAs) or propagation delays. A critical problem in using the MUSIC algorithm for localization is its computational complexity. Therefore, we devised a novel algorithm called Time Space additional Temporal-MUSIC, which can rapidly and simultaneously identify DOAs and delays of multicarrier ultrasonic waves from transmitters. Computer simulations have proved that the computation time of the proposed algorithm is almost constant in spite of increasing numbers of incoming waves and is faster than that of existing methods based on the MUSIC algorithm. The robustness of the proposed algorithm is discussed through simulations. Experiments in real environments showed that the standard deviation of position estimations in 3D space is less than 10 mm, which is satisfactory for indoor localization.

  12. Design of nucleic acid sequences for DNA computing based on a thermodynamic approach

    PubMed Central

    Tanaka, Fumiaki; Kameda, Atsushi; Yamamoto, Masahito; Ohuchi, Azuma

    2005-01-01

    We have developed an algorithm for designing multiple sequences of nucleic acids that have a uniform melting temperature between the sequence and its complement and that do not hybridize non-specifically with each other based on the minimum free energy (ΔGmin). Sequences that satisfy these constraints can be utilized in computations, various engineering applications such as microarrays, and nano-fabrications. Our algorithm is a random generate-and-test algorithm: it generates a candidate sequence randomly and tests whether the sequence satisfies the constraints. The novelty of our algorithm is that the filtering method uses a greedy search to calculate ΔGmin. This effectively excludes inappropriate sequences before ΔGmin is calculated, thereby reducing computation time drastically when compared with an algorithm without the filtering. Experimental results in silico showed the superiority of the greedy search over the traditional approach based on the Hamming distance. In addition, experimental results in vitro demonstrated that the experimental free energy (ΔGexp) of 126 sequences correlated better with ΔGmin (|R| = 0.90) than with the Hamming distance (|R| = 0.80). These results validate the rationality of a thermodynamic approach. We implemented our algorithm in a graphical user interface-based program written in Java. PMID:15701762
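
    The random generate-and-test loop described above can be sketched as follows. Note the assumptions: this sketch filters with the Hamming distance, i.e. the traditional baseline the abstract compares against, not the authors' ΔGmin-based filter, and it uses GC fraction as a crude stand-in for uniform melting temperature. All names and thresholds are hypothetical.

```python
import random

BASES = "ACGT"


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def gc_fraction(seq):
    """Fraction of G and C bases (assumed proxy for melting temperature)."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def design_sequences(count, length, min_distance, seed=0, max_tries=100000):
    """Generate random candidates and keep those passing both tests."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(max_tries):
        if len(accepted) == count:
            break
        cand = "".join(rng.choice(BASES) for _ in range(length))
        # Test 1: GC content close to 50% (assumed uniformity criterion).
        if abs(gc_fraction(cand) - 0.5) > 0.1:
            continue
        # Test 2: far enough from every sequence accepted so far.
        if all(hamming(cand, s) >= min_distance for s in accepted):
            accepted.append(cand)
    return accepted
```

    A thermodynamic variant would replace the second test with a free-energy calculation, which is far more expensive per candidate; that cost is why a cheap pre-filter matters.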

  13. The Application of the Weighted k-Partite Graph Problem to the Multiple Alignment for Metabolic Pathways.

    PubMed

    Chen, Wenbin; Hendrix, William; Samatova, Nagiza F

    2017-12-01

    The problem of aligning multiple metabolic pathways is a very challenging problem in computational biology. A metabolic pathway consists of three types of entities: reactions, compounds, and enzymes. Based on similarities between enzymes, Tohsato et al. gave an algorithm for aligning multiple metabolic pathways. However, the algorithm given by Tohsato et al. neglects the similarities among reactions, compounds, enzymes, and pathway topology. Designing algorithms for the alignment of multiple metabolic pathways based on the similarity of reactions, compounds, and enzymes is a difficult computational problem. In this article, we propose an algorithm for the problem of aligning multiple metabolic pathways based on the similarities among reactions, compounds, enzymes, and pathway topology. First, we compute a weight between each pair of like entities in different input pathways based on the entities' similarity score and topological structure using Ay et al.'s methods. We then construct a weighted k-partite graph for the reactions, compounds, and enzymes. We extract a mapping between these entities by solving the maximum-weighted k-partite matching problem by applying a novel heuristic algorithm. By analyzing the alignment results of multiple pathways in different organisms, we show that the alignments found by our algorithm correctly identify common subnetworks among multiple pathways.

  14. An exact computational method for performance analysis of sequential test algorithms for detecting network intrusions

    NASA Astrophysics Data System (ADS)

    Chen, Xinjia; Lacy, Fred; Carriere, Patrick

    2015-05-01

    Sequential test algorithms are playing increasingly important roles in quickly detecting network intrusions such as portscanners. In view of the fact that such algorithms are usually analyzed based on intuitive approximation or asymptotic analysis, we develop an exact computational method for the performance analysis of such algorithms. Our method can be used to calculate the probability of false alarm and average detection time up to arbitrarily pre-specified accuracy.

  15. Think globally and solve locally: secondary memory-based network learning for automated multi-species function prediction

    PubMed Central

    2014-01-01

    Background: Network-based learning algorithms for automated function prediction (AFP) are negatively affected by the limited coverage of experimental data and limited a priori known functional annotations. As a consequence their application to model organisms is often restricted to well characterized biological processes and pathways, and their effectiveness with poorly annotated species is relatively limited. A possible solution to this problem might consist in the construction of big networks including multiple species, but this in turn poses challenging computational problems, due to the scalability limitations of existing algorithms and the main memory requirements induced by the construction of big networks. Distributed computation or the usage of big computers could in principle respond to these issues, but they raise further algorithmic problems and require resources not satisfiable with simple off-the-shelf computers. Results: We propose a novel framework for scalable network-based learning of multi-species protein functions based on both a local implementation of existing algorithms and the adoption of innovative technologies: we solve “locally” the AFP problem, by designing “vertex-centric” implementations of network-based algorithms, but we do not give up thinking “globally” by exploiting the overall topology of the network. This is made possible by the adoption of secondary memory-based technologies that allow the efficient use of the large memory available on disks, thus overcoming the main memory limitations of modern off-the-shelf computers. This approach has been applied to the analysis of a large multi-species network including more than 300 species of bacteria and to a network with more than 200,000 proteins belonging to 13 Eukaryotic species. To our knowledge this is the first work where secondary-memory based network analysis has been applied to multi-species function prediction using biological networks with hundreds of thousands of proteins. Conclusions: The combination of these algorithmic and technological approaches makes feasible the analysis of large multi-species networks using ordinary computers with limited speed and primary memory, and in perspective could enable the analysis of huge networks (e.g. the whole proteomes available in SwissProt), using well-equipped stand-alone machines. PMID:24843788

  16. Multiobjective Aerodynamic Shape Optimization Using Pareto Differential Evolution and Generalized Response Surface Metamodels

    NASA Technical Reports Server (NTRS)

    Madavan, Nateri K.

    2004-01-01

    Differential Evolution (DE) is a simple, fast, and robust evolutionary algorithm that has proven effective in determining the global optimum for several difficult single-objective optimization problems. The DE algorithm has been recently extended to multiobjective optimization problems by using a Pareto-based approach. In this paper, a Pareto DE algorithm is applied to multiobjective aerodynamic shape optimization problems that are characterized by computationally expensive objective function evaluations. To improve computational efficiency, the algorithm is coupled with generalized response surface meta-models based on artificial neural networks. Results are presented for some test optimization problems from the literature to demonstrate the capabilities of the method.

  17. Fast parallel approach for 2-D DHT-based real-valued discrete Gabor transform.

    PubMed

    Tao, Liang; Kwan, Hon Keung

    2009-12-01

    Two-dimensional fast Gabor transform algorithms are useful for real-time applications due to the high computational complexity of the traditional 2-D complex-valued discrete Gabor transform (CDGT). This paper presents two block time-recursive algorithms for 2-D DHT-based real-valued discrete Gabor transform (RDGT) and its inverse transform and develops a fast parallel approach for the implementation of the two algorithms. The computational complexity of the proposed parallel approach is analyzed and compared with that of the existing 2-D CDGT algorithms. The results indicate that the proposed parallel approach is attractive for real time image processing.

  18. Flight data acquisition methodology for validation of passive ranging algorithms for obstacle avoidance

    NASA Technical Reports Server (NTRS)

    Smith, Phillip N.

    1990-01-01

    The automation of low-altitude rotorcraft flight depends on the ability to detect, locate, and navigate around obstacles lying in the rotorcraft's intended flightpath. Computer vision techniques provide a passive method of obstacle detection and range estimation, for obstacle avoidance. Several algorithms based on computer vision methods have been developed for this purpose using laboratory data; however, further development and validation of candidate algorithms require data collected from rotorcraft flight. A data base containing low-altitude imagery augmented with the rotorcraft and sensor parameters required for passive range estimation is not readily available. Here, the emphasis is on the methodology used to develop such a data base from flight-test data consisting of imagery, rotorcraft and sensor parameters, and ground-truth range measurements. As part of the data preparation, a technique for obtaining the sensor calibration parameters is described. The data base will enable the further development of algorithms for computer vision-based obstacle detection and passive range estimation, as well as provide a benchmark for verification of range estimates against ground-truth measurements.

  19. Data association approaches in bearings-only multi-target tracking

    NASA Astrophysics Data System (ADS)

    Xu, Benlian; Wang, Zhiquan

    2008-03-01

    To meet the requirements on computational time complexity and correctness of data association in multi-target tracking, two algorithms are proposed in this paper. Algorithm 1 is developed from a modified version of the dual simplex method, and it has the advantage of a direct and explicit form of the optimal solution. Algorithm 2 is based on the idea of Algorithm 1 and a rotational sort method; it retains the advantages of Algorithm 1 while reducing the computational burden, with a complexity only 1/N times that of Algorithm 1. Finally, numerical analyses are carried out to evaluate the performance of the two data association algorithms.

  20. Opposition-Based Memetic Algorithm and Hybrid Approach for Sorting Permutations by Reversals.

    PubMed

    Soncco-Álvarez, José Luis; Muñoz, Daniel M; Ayala-Rincón, Mauricio

    2018-02-21

    Sorting unsigned permutations by reversals is a difficult problem; indeed, it was proved to be NP-hard by Caprara (1997). Because of its high complexity, many approximation algorithms to compute the minimal reversal distance have been proposed, culminating in the currently best-known theoretical ratio of 1.375. In this article, two memetic algorithms to compute the reversal distance are proposed. The first one uses the technique of opposition-based learning, leading to an opposition-based memetic algorithm (OBMA); the second one improves the previous algorithm by applying the heuristic of two-breakpoint elimination, leading to a hybrid approach (Hybrid-OBMA). Several experiments were performed with one hundred randomly generated permutations, single benchmark permutations, and biological permutations. Results of the experiments showed that the proposed OBMA and Hybrid-OBMA algorithms achieve the best results for practical cases, that is, for permutations of length up to 120. Also, Hybrid-OBMA was shown to improve on the results of OBMA for permutations of length greater than or equal to 60. The applicability of our proposed algorithms was checked by processing permutations based on biological data, in which case OBMA gave the best average results for all instances.
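
    The notions of breakpoint and reversal used above can be made concrete with a short sketch. This is not the OBMA or Hybrid-OBMA algorithm, which the abstract does not specify; it only shows the standard breakpoint count for an unsigned permutation of 1..n and the lower bound it gives on the reversal distance, since one reversal changes only two adjacencies and therefore removes at most two breakpoints. Function names are illustrative.

```python
def breakpoints(perm):
    """Count breakpoints of an unsigned permutation of 1..n.

    The permutation is framed with 0 and n+1; a breakpoint is a pair of
    neighbours whose values are not consecutive integers.
    """
    ext = [0] + list(perm) + [len(perm) + 1]
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)


def reverse(perm, i, j):
    """Return a copy of perm with the segment perm[i..j] reversed."""
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]


def reversal_distance_lower_bound(perm):
    """Each reversal removes at most two breakpoints, so ceil(b / 2) is a bound."""
    return (breakpoints(perm) + 1) // 2
```

    For `[3, 2, 1, 4]` there are two breakpoints, the bound is one reversal, and `reverse([3, 2, 1, 4], 0, 2)` indeed sorts it.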

  1. CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions.

    PubMed

    Liu, Yongchao; Wirawan, Adrianto; Schmidt, Bertil

    2013-04-04

    The maximal sensitivity for local alignments makes the Smith-Waterman algorithm a popular choice for protein sequence database search based on pairwise alignment. However, the algorithm is compute-intensive due to a quadratic time complexity. Corresponding runtimes are further compounded by the rapid growth of sequence databases. We present CUDASW++ 3.0, a fast Smith-Waterman protein database search algorithm, which couples CPU and GPU SIMD instructions and carries out concurrent CPU and GPU computations. For the CPU computation, this algorithm employs SSE-based vector execution units as accelerators. For the GPU computation, we have investigated for the first time a GPU SIMD parallelization, which employs CUDA PTX SIMD video instructions to gain more data parallelism beyond the SIMT execution model. Moreover, sequence alignment workloads are automatically distributed over CPUs and GPUs based on their respective compute capabilities. Evaluation on the Swiss-Prot database shows that CUDASW++ 3.0 gains a performance improvement over CUDASW++ 2.0 up to 2.9 and 3.2, with a maximum performance of 119.0 and 185.6 GCUPS, on a single-GPU GeForce GTX 680 and a dual-GPU GeForce GTX 690 graphics card, respectively. In addition, our algorithm has demonstrated significant speedups over other top-performing tools: SWIPE and BLAST+. CUDASW++ 3.0 is written in CUDA C++ and PTX assembly languages, targeting GPUs based on the Kepler architecture. This algorithm obtains significant speedups over its predecessor: CUDASW++ 2.0, by benefiting from the use of CPU and GPU SIMD instructions as well as the concurrent execution on CPUs and GPUs. The source code and the simulated data are available at http://cudasw.sourceforge.net.
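
    The quadratic cost mentioned above comes from filling one dynamic-programming cell per pair of residues. A minimal scalar sketch of the Smith-Waterman score is given below for orientation; the match, mismatch, and linear gap values are simplifying assumptions made here, not the scoring scheme or the SIMD kernels of CUDASW++ 3.0.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b (linear gap penalty).

    Runs in O(len(a) * len(b)) time and keeps only two rows in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(
                0,                # start a new local alignment
                prev[j - 1] + s,  # align a[i-1] with b[j-1]
                prev[j] + gap,    # gap in b
                cur[j - 1] + gap, # gap in a
            )
            best = max(best, cur[j])
        prev = cur
    return best
```

    In a database search, each query-subject pair is an independent instance of this recurrence, which is the source of the parallelism that CPU and GPU implementations exploit.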

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    I. W. Ginsberg

    Multiresolutional decompositions known as spectral fingerprints are often used to extract spectral features from multispectral/hyperspectral data. In this study, the authors investigate the use of wavelet-based algorithms for generating spectral fingerprints. The wavelet-based algorithms are compared to the currently used method, traditional convolution with first-derivative Gaussian filters. The comparison analysis consists of two parts: (a) the computational expense of the new method is compared with the computational costs of the current method and (b) the outputs of the wavelet-based methods are compared with those of the current method to determine any practical differences in the resulting spectral fingerprints. The results show that the wavelet-based algorithms can greatly reduce the computational expense of generating spectral fingerprints, while practically no differences exist in the resulting fingerprints. The analysis is conducted on a database of hyperspectral signatures, namely, Hyperspectral Digital Image Collection Experiment (HYDICE) signatures. The reduction in computational expense is by a factor of about 30, and the average Euclidean distance between resulting fingerprints is on the order of 0.02.

  3. A low-complexity geometric bilateration method for localization in Wireless Sensor Networks and its comparison with Least-Squares methods.

    PubMed

    Cota-Ruiz, Juan; Rosiles, Jose-Gerardo; Sifuentes, Ernesto; Rivas-Perea, Pablo

    2012-01-01

    This research presents a distributed and formula-based bilateration algorithm that can be used to provide an initial set of locations. In this scheme each node uses distance estimates to anchors to solve a set of circle-circle intersection (CCI) problems, solved through a purely geometric formulation. The resulting CCIs are processed to pick those that cluster together and then take the average to produce an initial node location. The algorithm is compared in terms of accuracy and computational complexity with a Least-Squares localization algorithm, based on the Levenberg-Marquardt methodology. Results in accuracy vs. computational performance show that the bilateration algorithm is competitive compared with well-known optimized localization algorithms.
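
    A single circle-circle intersection (CCI), the building block named above, has a closed-form geometric solution. The sketch below uses the textbook formulation, which may differ in detail from the paper's; the function name and the choice to return an empty list for non-intersecting or concentric circles are assumptions made here.

```python
import math


def circle_circle_intersection(c0, r0, c1, r1):
    """Intersection points of two circles given as (centre, radius).

    Returns a list of two points (identical when the circles are tangent),
    or an empty list when the circles do not intersect or are concentric.
    """
    (x0, y0), (x1, y1) = c0, c1
    d = math.hypot(x1 - x0, y1 - y0)
    if d == 0 or d > r0 + r1 or d < abs(r0 - r1):
        return []
    # Distance from c0 to the chord joining the two intersection points.
    a = (r0 * r0 - r1 * r1 + d * d) / (2 * d)
    # Half-length of that chord.
    h = math.sqrt(max(0.0, r0 * r0 - a * a))
    xm = x0 + a * (x1 - x0) / d
    ym = y0 + a * (y1 - y0) / d
    p = (xm + h * (y1 - y0) / d, ym - h * (x1 - x0) / d)
    q = (xm - h * (y1 - y0) / d, ym + h * (x1 - x0) / d)
    return [p, q]
```

    With anchors at (0, 0) and (8, 0) and both distance estimates equal to 5, the candidates are (4, -3) and (4, 3); a node would then keep the candidates that cluster across anchor pairs and average them.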

  4. Big data mining analysis method based on cloud computing

    NASA Astrophysics Data System (ADS)

    Cai, Qing Qiu; Cui, Hong Gang; Tang, Hao

    2017-08-01

    In the era of information explosion, the very large scale and the discrete, unstructured or semi-structured nature of big data have gone far beyond what traditional data management can handle. With the arrival of the cloud computing era, cloud computing provides a new technical approach to mining massive data, which can effectively solve the problem that traditional data mining methods cannot adapt to massive data. This paper introduces the meaning and characteristics of cloud computing, analyzes the advantages of using cloud computing technology to realize data mining, designs a mining algorithm for association rules based on the MapReduce parallel processing architecture, and carries out experimental verification. The parallel association rule mining algorithm based on a cloud computing platform can greatly improve the execution speed of data mining.
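
    The map and reduce roles in association rule mining can be sketched in a single process as follows. This is only an illustration of the counting step for item pairs, under the assumption that support counting is what gets parallelized; the abstract does not give the paper's algorithm, and the function names are hypothetical. No Hadoop API is used.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transaction):
    """Emit (item_pair, 1) for every pair of distinct items in a transaction."""
    items = sorted(set(transaction))
    for pair in combinations(items, 2):
        yield pair, 1


def reduce_phase(pairs):
    """Sum the emitted counts per key, as a reducer would after the shuffle."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


def frequent_pairs(transactions, min_support):
    """Keep the item pairs that occur in at least min_support transactions."""
    emitted = (kv for t in transactions for kv in map_phase(t))
    counts = reduce_phase(emitted)
    return {k: v for k, v in counts.items() if v >= min_support}
```

    Because `map_phase` looks at one transaction at a time, the transactions can be split across workers, and only the per-key sums need to be combined.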

  5. A supportive architecture for CFD-based design optimisation

    NASA Astrophysics Data System (ADS)

    Li, Ni; Su, Zeya; Bi, Zhuming; Tian, Chao; Ren, Zhiming; Gong, Guanghong

    2014-03-01

    Multi-disciplinary design optimisation (MDO) is one of the critical methodologies for the implementation of enterprise systems (ES). MDO requiring the analysis of fluid dynamics raises a special challenge due to its extremely intensive computation. The rapid development of computational fluid dynamics (CFD) techniques has led to a rise in their applications in various fields. Especially for the exterior designs of vehicles, CFD has become one of the three main design tools, comparable to analytical approaches and wind tunnel experiments. CFD-based design optimisation is an effective way to achieve the desired performance under the given constraints. However, due to the complexity of CFD, integrating CFD analysis into an intelligent optimisation algorithm is not straightforward. It is a challenge to solve a CFD-based design problem, which usually has high dimensionality and multiple objectives and constraints. It is desirable to have an integrated architecture for CFD-based design optimisation. However, our review of existing work found that very few researchers have studied assistive tools to facilitate CFD-based design optimisation. In this paper, a multi-layer architecture and a general procedure are proposed to integrate different CFD toolsets with intelligent optimisation algorithms, parallel computing techniques and other techniques for efficient computation. In the proposed architecture, the integration is performed either at the code level or data level to fully utilise the capabilities of different assistive tools. Two intelligent algorithms are developed and embedded with parallel computing. These algorithms, together with the supportive architecture, lay a solid foundation for various applications of CFD-based design optimisation. To illustrate the effectiveness of the proposed architecture and algorithms, case studies on the aerodynamic shape design of a hypersonic cruising vehicle are provided, and the results show that the proposed architecture and developed algorithms perform successfully and efficiently in dealing with design optimisation with over 200 design variables.

  6. Efficient mapping algorithms for scheduling robot inverse dynamics computation on a multiprocessor system

    NASA Technical Reports Server (NTRS)

    Lee, C. S. G.; Chen, C. L.

    1989-01-01

    Two efficient mapping algorithms for scheduling the robot inverse dynamics computation consisting of m computational modules with precedence relationship to be executed on a multiprocessor system consisting of p identical homogeneous processors with processor and communication costs to achieve minimum computation time are presented. An objective function is defined in terms of the sum of the processor finishing time and the interprocessor communication time. The minimax optimization is performed on the objective function to obtain the best mapping. This mapping problem can be formulated as a combination of the graph partitioning and the scheduling problems; both have been known to be NP-complete. Thus, to speed up the searching for a solution, two heuristic algorithms were proposed to obtain fast but suboptimal mapping solutions. The first algorithm utilizes the level and the communication intensity of the task modules to construct an ordered priority list of ready modules and the module assignment is performed by a weighted bipartite matching algorithm. For a near-optimal mapping solution, the problem can be solved by the heuristic algorithm with simulated annealing. These proposed optimization algorithms can solve various large-scale problems within a reasonable time. Computer simulations were performed to evaluate and verify the performance and the validity of the proposed mapping algorithms. Finally, experiments for computing the inverse dynamics of a six-jointed PUMA-like manipulator based on the Newton-Euler dynamic equations were implemented on an NCUBE/ten hypercube computer to verify the proposed mapping algorithms. Computer simulation and experimental results are compared and discussed.

  7. Robust tuning of robot control systems

    NASA Technical Reports Server (NTRS)

    Minis, I.; Uebel, M.

    1992-01-01

    The computed torque control problem is examined for a robot arm with flexible, geared, joint drive systems which are typical in many industrial robots. The standard computed torque algorithm is not directly applicable to this class of manipulators because of the dynamics introduced by the joint drive system. The proposed approach to computed torque control combines a computed torque algorithm with torque controller at each joint. Three such control schemes are proposed. The first scheme uses the joint torque control system currently implemented on the robot arm and a novel form of the computed torque algorithm. The other two use the standard computed torque algorithm and a novel model following torque control system based on model following techniques. Standard tasks and performance indices are used to evaluate the performance of the controllers. Both numerical simulations and experiments are used in evaluation. The study shows that all three proposed systems lead to improved tracking performance over a conventional PD controller.

  8. A Fast Synthetic Aperture Radar Raw Data Simulation Using Cloud Computing

    PubMed Central

    Li, Zhixin; Su, Dandan; Zhu, Haijiang; Li, Wei; Zhang, Fan; Li, Ruirui

    2017-01-01

    Synthetic Aperture Radar (SAR) raw data simulation is a fundamental problem in radar system design and imaging algorithm research. The growth of surveying swath and resolution results in a significant increase in data volume and simulation period, which can be considered to be a comprehensive data intensive and computing intensive issue. Although several high performance computing (HPC) methods have demonstrated their potential for accelerating simulation, the input/output (I/O) bottleneck of huge raw data has not been eased. In this paper, we propose a cloud computing based SAR raw data simulation algorithm, which employs the MapReduce model to accelerate the raw data computing and the Hadoop distributed file system (HDFS) for fast I/O access. The MapReduce model is designed for the irregular parallel accumulation of raw data simulation, which greatly reduces the parallel efficiency of graphics processing unit (GPU) based simulation methods. In addition, three kinds of optimization strategies are put forward from the aspects of programming model, HDFS configuration and scheduling. The experimental results show that the cloud computing based algorithm achieves 4× speedup over the baseline serial approach in an 8-node cloud environment, and each optimization strategy can improve about 20%. This work proves that the proposed cloud algorithm is capable of solving the computing intensive and data intensive issues in SAR raw data simulation, and is easily extended to large scale computing to achieve higher acceleration. PMID:28075343

  9. Multigrid optimal mass transport for image registration and morphing

    NASA Astrophysics Data System (ADS)

    Rehman, Tauseef ur; Tannenbaum, Allen

    2007-02-01

    In this paper we present a computationally efficient Optimal Mass Transport algorithm. This method is based on the Monge-Kantorovich theory and is used for computing elastic registration and warping maps in image registration and morphing applications. This is a parameter free method which utilizes all of the grayscale data in an image pair in a symmetric fashion. No landmarks need to be specified for correspondence. In our work, we demonstrate significant improvement in computation time when our algorithm is applied as compared to the originally proposed method by Haker et al [1]. The original algorithm was based on a gradient descent method for removing the curl from an initial mass preserving map regarded as 2D vector field. This involves inverting the Laplacian in each iteration which is now computed using full multigrid technique resulting in an improvement in computational time by a factor of two. Greater improvement is achieved by decimating the curl in a multi-resolutional framework. The algorithm was applied to 2D short axis cardiac MRI images and brain MRI images for testing and comparison.

  10. Fast parallel molecular algorithms for DNA-based computation: solving the elliptic curve discrete logarithm problem over GF(2^n).

    PubMed

    Li, Kenli; Zou, Shuting; Xv, Jin

    2008-01-01

    Elliptic curve cryptographic algorithms convert input data to unrecognizable encryption and the unrecognizable data back again into its original decrypted form. The security of this form of encryption hinges on the enormous difficulty that is required to solve the elliptic curve discrete logarithm problem (ECDLP), especially over GF(2^n), n in Z+. This paper describes an effective method to find solutions to the ECDLP by means of a molecular computer. We propose that this research accomplishment would represent a breakthrough for applied biological computation and this paper demonstrates that in principle this is possible. Three DNA-based algorithms: a parallel adder, a parallel multiplier, and a parallel inverse over GF(2^n) are described. The biological operation time of all of these algorithms is polynomial with respect to n. Considering this analysis, cryptography using a public key might be less secure. In this respect, a principal contribution of this paper is to provide enhanced evidence of the potential of molecular computing to tackle such ambitious computations.

  11. Fast Parallel Molecular Algorithms for DNA-Based Computation: Solving the Elliptic Curve Discrete Logarithm Problem over GF(2^n)

    PubMed Central

    Li, Kenli; Zou, Shuting; Xv, Jin

    2008-01-01

    Elliptic curve cryptographic algorithms convert input data to unrecognizable encryption and the unrecognizable data back again into its original decrypted form. The security of this form of encryption hinges on the enormous difficulty that is required to solve the elliptic curve discrete logarithm problem (ECDLP), especially over GF(2^n), n ∈ Z+. This paper describes an effective method to find solutions to the ECDLP by means of a molecular computer. We propose that this research accomplishment would represent a breakthrough for applied biological computation and this paper demonstrates that in principle this is possible. Three DNA-based algorithms: a parallel adder, a parallel multiplier, and a parallel inverse over GF(2^n) are described. The biological operation time of all of these algorithms is polynomial with respect to n. Considering this analysis, cryptography using a public key might be less secure. In this respect, a principal contribution of this paper is to provide enhanced evidence of the potential of molecular computing to tackle such ambitious computations. PMID:18431451

  12. An improved conscan algorithm based on a Kalman filter

    NASA Technical Reports Server (NTRS)

    Eldred, D. B.

    1994-01-01

    Conscan is commonly used by DSN antennas to allow adaptive tracking of a target whose position is not precisely known. This article describes an algorithm that is based on a Kalman filter and is proposed to replace the existing fast Fourier transform based (FFT-based) algorithm for conscan. Advantages of this algorithm include better pointing accuracy, continuous update information, and accommodation of missing data. Additionally, a strategy for adaptive selection of the conscan radius is proposed. The performance of the algorithm is illustrated through computer simulations and compared to the FFT algorithm. The results show that the Kalman filter algorithm is consistently superior.

  13. A projected preconditioned conjugate gradient algorithm for computing many extreme eigenpairs of a Hermitian matrix [A projected preconditioned conjugate gradient algorithm for computing a large eigenspace of a Hermitian matrix]

    DOE PAGES

    Vecharynski, Eugene; Yang, Chao; Pask, John E.

    2015-02-25

    Here, we present an iterative algorithm for computing an invariant subspace associated with the algebraically smallest eigenvalues of a large sparse or structured Hermitian matrix A. We are interested in the case in which the dimension of the invariant subspace is large (e.g., over several hundreds or thousands) even though it may still be small relative to the dimension of A. These problems arise from, for example, density functional theory (DFT) based electronic structure calculations for complex materials. The key feature of our algorithm is that it performs fewer Rayleigh–Ritz calculations compared to existing algorithms such as the locally optimal block preconditioned conjugate gradient or the Davidson algorithm. It is a block algorithm, and hence can take advantage of efficient BLAS3 operations and be implemented with multiple levels of concurrency. We discuss a number of practical issues that must be addressed in order to implement the algorithm efficiently on a high performance computer.

  14. A Low Cost VLSI Architecture for Spike Sorting Based on Feature Extraction with Peak Search.

    PubMed

    Chang, Yuan-Jyun; Hwang, Wen-Jyi; Chen, Chih-Chang

    2016-12-07

    The goal of this paper is to present a novel VLSI architecture for spike sorting with high classification accuracy, low area costs and low power consumption. A novel feature extraction algorithm with low computational complexities is proposed for the design of the architecture. In the feature extraction algorithm, a spike is separated into two portions based on its peak value. The area of each portion is then used as a feature. The algorithm is simple to implement and less susceptible to noise interference. Based on the algorithm, a novel architecture capable of identifying peak values and computing spike areas concurrently is proposed. To further accelerate the computation, a spike can be divided into a number of segments for the local feature computation. The local features are subsequently merged with the global ones by a simple hardware circuit. The architecture can also be easily operated in conjunction with the circuits for commonly-used spike detection algorithms, such as the Non-linear Energy Operator (NEO). The architecture has been implemented by an Application-Specific Integrated Circuit (ASIC) with 90-nm technology. Comparisons to the existing works show that the proposed architecture is well suited for real-time multi-channel spike detection and feature extraction requiring low hardware area costs, low power consumption and high classification accuracy.

  15. Comparison and optimization of in silico algorithms for predicting the pathogenicity of sodium channel variants in epilepsy.

    PubMed

    Holland, Katherine D; Bouley, Thomas M; Horn, Paul S

    2017-07-01

    Variants in neuronal voltage-gated sodium channel α-subunit genes SCN1A, SCN2A, and SCN8A are common in early onset epileptic encephalopathies and other autosomal dominant childhood epilepsy syndromes. However, in clinical practice, missense variants are often classified as variants of uncertain significance when missense variants are identified but heritability cannot be determined. Genetic testing reports often include results of computational tests to estimate pathogenicity and the frequency of that variant in population-based databases. The objective of this work was to enhance clinicians' understanding of results by (1) determining how effectively computational algorithms predict epileptogenicity of sodium channel (SCN) missense variants; (2) optimizing their predictive capabilities; and (3) determining if epilepsy-associated SCN variants are present in population-based databases. This will help clinicians better understand indeterminate SCN test results in people with epilepsy. Pathogenic, likely pathogenic, and benign variants in SCNs were identified using databases of sodium channel variants. Benign variants were also identified from population-based databases. Eight algorithms commonly used to predict pathogenicity were compared. In addition, logistic regression was used to determine if a combination of algorithms could better predict pathogenicity. Based on American College of Medical Genetic Criteria, 440 variants were classified as pathogenic or likely pathogenic and 84 were classified as benign or likely benign. Twenty-eight variants previously associated with epilepsy were present in population-based gene databases. The output provided by most computational algorithms had a high sensitivity but low specificity with an accuracy of 0.52-0.77. Accuracy could be improved by adjusting the threshold for pathogenicity. Using this adjustment, the Mendelian Clinically Applicable Pathogenicity (M-CAP) algorithm had an accuracy of 0.90 and a combination of algorithms increased the accuracy to 0.92. Potentially pathogenic variants are present in population-based sources. Most computational algorithms overestimate pathogenicity; however, a weighted combination of several algorithms increased classification accuracy to >0.90. Wiley Periodicals, Inc. © 2017 International League Against Epilepsy.
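
The threshold adjustment mentioned in this abstract can be illustrated with a minimal sketch. It assumes each algorithm outputs a numeric score where higher means "more likely pathogenic"; the scores, labels, and function names below are hypothetical and are not taken from the study.

```python
def accuracy(scores, labels, threshold):
    """Fraction of variants classified correctly when score >= threshold means pathogenic."""
    correct = sum((s >= threshold) == bool(l) for s, l in zip(scores, labels))
    return correct / len(labels)


def best_threshold(scores, labels):
    """Try every observed score as a cut-off and keep the most accurate one."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: accuracy(scores, labels, t))


# Hypothetical scores (1 = pathogenic, 0 = benign); not data from the paper.
scores = [0.2, 0.4, 0.6, 0.7, 0.9]
labels = [0, 0, 0, 1, 1]

default_acc = accuracy(scores, labels, 0.5)   # a fixed cut-off overcalls pathogenicity here
tuned = best_threshold(scores, labels)
tuned_acc = accuracy(scores, labels, tuned)
```

In practice the cut-off would be chosen on one set of variants and checked on another, since tuning and evaluating on the same data overstates accuracy.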

  16. Markov chain algorithms: a template for building future robust low-power systems

    PubMed Central

    Deka, Biplab; Birklykke, Alex A.; Duwe, Henry; Mansinghka, Vikash K.; Kumar, Rakesh

    2014-01-01

    Although computational systems are looking towards post CMOS devices in the pursuit of lower power, the expected inherent unreliability of such devices makes it difficult to design robust systems without additional power overheads for guaranteeing robustness. As such, algorithmic structures with inherent ability to tolerate computational errors are of significant interest. We propose to cast applications as stochastic algorithms based on Markov chains (MCs) as such algorithms are both sufficiently general and tolerant to transition errors. We show with four example applications—Boolean satisfiability, sorting, low-density parity-check decoding and clustering—how applications can be cast as MC algorithms. Using algorithmic fault injection techniques, we demonstrate the robustness of these implementations to transition errors with high error rates. Based on these results, we make a case for using MCs as an algorithmic template for future robust low-power systems. PMID:24842030

  17. Improved Ant Colony Clustering Algorithm and Its Performance Study

    PubMed Central

    Gao, Wei

    2016-01-01

    Clustering analysis is used in many disciplines and applications; it is an important tool that descriptively identifies homogeneous groups of objects based on attribute values. The ant colony clustering algorithm is a swarm-intelligent method used for clustering problems that is inspired by the behavior of ant colonies that cluster their corpses and sort their larvae. A new abstraction ant colony clustering algorithm using a data combination mechanism is proposed to improve the computational efficiency and accuracy of the ant colony clustering algorithm. The abstraction ant colony clustering algorithm is used to cluster benchmark problems, and its performance is compared with the ant colony clustering algorithm and other methods used in existing literature. Based on similar computational difficulties and complexities, the results show that the abstraction ant colony clustering algorithm produces results that are not only more accurate but also more efficiently determined than the ant colony clustering algorithm and the other methods. Thus, the abstraction ant colony clustering algorithm can be used for efficient multivariate data clustering. PMID:26839533

  18. A Scalable O(N) Algorithm for Large-Scale Parallel First-Principles Molecular Dynamics Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Osei-Kuffuor, Daniel; Fattebert, Jean-Luc

    2014-01-01

    Traditional algorithms for first-principles molecular dynamics (FPMD) simulations only gain a modest capability increase from current petascale computers, due to their O(N^3) complexity and their heavy use of global communications. To address this issue, we are developing a truly scalable O(N) complexity FPMD algorithm, based on density functional theory (DFT), which avoids global communications. The computational model uses a general nonorthogonal orbital formulation for the DFT energy functional, which requires knowledge of selected elements of the inverse of the associated overlap matrix. We present a scalable algorithm for approximately computing selected entries of the inverse of the overlap matrix, based on an approximate inverse technique, by inverting local blocks corresponding to principal submatrices of the global overlap matrix. The new FPMD algorithm exploits sparsity and uses nearest neighbor communication to provide a computational scheme capable of extreme scalability. Accuracy is controlled by the mesh spacing of the finite difference discretization, the size of the localization regions in which the electronic orbitals are confined, and a cutoff beyond which the entries of the overlap matrix can be omitted when computing selected entries of its inverse. We demonstrate the algorithm's excellent parallel scaling for up to O(100K) atoms on O(100K) processors, with a wall-clock time of O(1) minute per molecular dynamics time step.

  19. Scalable Parallel Density-based Clustering and Applications

    NASA Astrophysics Data System (ADS)

    Patwary, Mostofa Ali

    2014-04-01

    Recently, density-based clustering algorithms (DBSCAN and OPTICS) have received significant attention from the scientific community due to their unique capability of discovering arbitrarily shaped clusters and eliminating noise data. These algorithms have several applications that require high performance computing, including finding halos and subhalos (clusters) from massive cosmology data in astrophysics, analyzing satellite images, X-ray crystallography, and anomaly detection. However, parallelization of these algorithms is extremely challenging, as they exhibit an inherently sequential data access order and an unbalanced workload, resulting in low parallel efficiency. To break the data access sequentiality and to achieve high parallelism, we develop new parallel algorithms, both for DBSCAN and OPTICS, designed using graph algorithmic techniques. For example, our parallel DBSCAN algorithm exploits the similarities between DBSCAN and computing connected components. Using datasets containing up to a billion floating point numbers, we show that our parallel density-based clustering algorithms significantly outperform the existing algorithms, achieving speedups up to 27.5 on 40 cores on shared memory architecture and speedups up to 5,765 using 8,192 cores on distributed memory architecture. In our experiments, we found that while achieving the scalability, our algorithms produce clustering results with comparable quality to the classical algorithms.
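
The link between DBSCAN and connected components noted in this abstract can be sketched with a disjoint-set (union-find) structure. The version below is a sequential, brute-force illustration and not the authors' parallel algorithm; the function name, the O(n^2) neighbour search, and the convention that a point counts as its own neighbour are assumptions made for the example.

```python
def dbscan_union_find(points, eps, min_pts):
    """Label points by cluster using union-find; noise points get None."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    def near(i, j):
        # Squared Euclidean distance compared against eps squared.
        return sum((a - b) ** 2 for a, b in zip(points[i], points[j])) <= eps * eps

    # Brute-force neighbourhoods; each point is its own neighbour.
    neighbors = [[j for j in range(n) if near(i, j)] for i in range(n)]
    core = [len(nb) >= min_pts for nb in neighbors]

    # Core points within eps of each other belong to one connected component.
    for i in range(n):
        if core[i]:
            for j in neighbors[i]:
                if core[j]:
                    union(i, j)

    labels = [None] * n
    for i in range(n):
        if core[i]:
            labels[i] = find(i)
        else:
            # A border point joins the cluster of its first core neighbour.
            for j in neighbors[i]:
                if core[j]:
                    labels[i] = find(j)
                    break
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan_union_find(points, eps=1.5, min_pts=3)
```

Because unions can be applied in any order, this formulation removes the sequential expansion order of classical DBSCAN, which is the property a parallel version can build on.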

  20. AdaBoost-based algorithm for network intrusion detection.

    PubMed

    Hu, Weiming; Hu, Wei; Maybank, Steve

    2008-04-01

    Network intrusion detection aims at distinguishing the attacks on the Internet from normal use of the Internet. It is an indispensable part of the information security system. Due to the variety of network behaviors and the rapid development of attack fashions, it is necessary to develop fast machine-learning-based intrusion detection algorithms with high detection rates and low false-alarm rates. In this correspondence, we propose an intrusion detection algorithm based on the AdaBoost algorithm. In the algorithm, decision stumps are used as weak classifiers. The decision rules are provided for both categorical and continuous features. By combining the weak classifiers for continuous features and the weak classifiers for categorical features into a strong classifier, the relations between these two different types of features are handled naturally, without any forced conversions between continuous and categorical features. Adaptable initial weights and a simple strategy for avoiding overfitting are adopted to improve the performance of the algorithm. Experimental results show that our algorithm has low computational complexity and error rates, as compared with algorithms of higher computational complexity, as tested on the benchmark sample data.
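
As a rough illustration of boosting with decision stumps as weak classifiers, the following is a minimal sketch of textbook discrete AdaBoost. It handles continuous features only and starts from uniform weights, so it does not reproduce the categorical decision rules, adaptable initial weights, or overfitting strategy described in the abstract. All function names and the toy data are hypothetical.

```python
import math


def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] <= thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best


def stump_predict(stump, row):
    _, j, thr, polarity = stump
    return polarity if row[j] <= thr else -polarity


def adaboost(X, y, rounds=3):
    """Train an ensemble of (alpha, stump) pairs; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n  # uniform initial weights (an assumption of this sketch)
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the others.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1


# Toy data: no single threshold separates the classes, but three stumps do.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
predictions = [predict(model, row) for row in X]
```

The toy data show why combining weak classifiers helps: each stump alone misclassifies at least two of the six points, while the weighted vote of three stumps classifies all of them correctly.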

  1. A rate-constrained fast full-search algorithm based on block sum pyramid.

    PubMed

    Song, Byung Cheol; Chun, Kang-Wook; Ra, Jong Beom

    2005-03-01

    This paper presents a fast full-search algorithm (FSA) for rate-constrained motion estimation. The proposed algorithm, which is based on the block sum pyramid frame structure, successively eliminates unnecessary search positions according to rate-constrained criterion. This algorithm provides the identical estimation performance to a conventional FSA having rate constraint, while achieving considerable reduction in computation.
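
The idea of eliminating search positions without losing full-search accuracy can be sketched with the single-level block-sum bound: for two blocks of equal size, the absolute difference of their pixel sums never exceeds their sum of absolute differences (SAD). The sketch below uses only that one bound and omits both the multi-level pyramid and the rate term of the paper. Function names and the tiny frames are hypothetical.

```python
def block_sum(frame, top, left, size):
    """Sum of the size x size block whose top-left corner is (top, left)."""
    return sum(frame[top + r][left + c] for r in range(size) for c in range(size))


def sad(cur, ref, top, left, size):
    """Sum of absolute differences between cur and the block of ref at (top, left)."""
    return sum(abs(cur[r][c] - ref[top + r][left + c])
               for r in range(size) for c in range(size))


def full_search_with_elimination(cur, ref):
    """Return ((best_sad, top, left), number_of_SADs_actually_computed)."""
    size = len(cur)
    cur_sum = sum(map(sum, cur))
    best = None
    checked = 0
    for top in range(len(ref) - size + 1):
        for left in range(len(ref[0]) - size + 1):
            # |sum(cur) - sum(candidate)| <= SAD, so a candidate whose bound
            # already reaches the best SAD cannot improve on it.
            bound = abs(cur_sum - block_sum(ref, top, left, size))
            if best is not None and bound >= best[0]:
                continue
            checked += 1
            cost = sad(cur, ref, top, left, size)
            if best is None or cost < best[0]:
                best = (cost, top, left)
    return best, checked


ref = [[10, 20, 30, 40],
       [50, 60, 70, 80],
       [90, 15, 25, 35],
       [45, 55, 65, 75]]
cur = [[70, 80],
       [25, 35]]  # copied from ref at top=1, left=2
best, checked = full_search_with_elimination(cur, ref)
```

A real implementation would precompute the block sums incrementally instead of recomputing them per position, as this sketch does for brevity.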

  2. Algorithms for computing solvents of unilateral second-order matrix polynomials over prime finite fields using lambda-matrices

    NASA Astrophysics Data System (ADS)

    Burtyka, Filipp

    2018-01-01

    The paper considers algorithms for finding diagonalizable and non-diagonalizable roots (so called solvents) of monic arbitrary unilateral second-order matrix polynomial over prime finite field. These algorithms are based on polynomial matrices (lambda-matrices). This is an extension of existing general methods for computing solvents of matrix polynomials over field of complex numbers. We analyze how techniques for complex numbers can be adapted for finite field and estimate asymptotic complexity of the obtained algorithms.

  3. GPU-based cloud service for Smith-Waterman algorithm using frequency distance filtration scheme.

    PubMed

    Lee, Sheng-Ta; Lin, Chun-Yuan; Hung, Che Lun

    2013-01-01

    As the conventional means of analyzing the similarity between a query sequence and database sequences, the Smith-Waterman algorithm is feasible for a database search owing to its high sensitivity. However, this algorithm is still quite time consuming. CUDA programming can improve computations efficiently by using the computational power of massive computing hardware as graphics processing units (GPUs). This work presents a novel Smith-Waterman algorithm with a frequency-based filtration method on GPUs rather than merely accelerating the comparisons yet expending computational resources to handle such unnecessary comparisons. A user friendly interface is also designed for potential cloud server applications with GPUs. Additionally, two data sets, H1N1 protein sequences (query sequence set) and human protein database (database set), are selected, followed by a comparison of CUDA-SW and CUDA-SW with the filtration method, referred to herein as CUDA-SWf. Experimental results indicate that reducing unnecessary sequence alignments can improve the computational time by up to 41%. Importantly, by using CUDA-SWf as a cloud service, this application can be accessed from any computing environment of a device with an Internet connection without time constraints.
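
A minimal CPU sketch of the filter-then-align idea follows. It assumes one common definition of frequency distance (the larger of the total surplus and the total deficit in per-character counts) and a linear gap penalty. The abstract does not give the paper's exact filter, scoring scheme, or cut-off, so `max_fd` and the scoring parameters are hypothetical.

```python
from collections import Counter


def frequency_distance(a, b):
    """Max of total surplus and total deficit in character counts between a and b."""
    fa, fb = Counter(a), Counter(b)
    chars = fa.keys() | fb.keys()
    pos = sum(max(fa[ch] - fb[ch], 0) for ch in chars)
    neg = sum(max(fb[ch] - fa[ch], 0) for ch in chars)
    return max(pos, neg)


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score, keeping only two rows of the DP table."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


def filtered_search(query, database, max_fd):
    """Align only the database sequences that pass the cheap frequency filter."""
    hits = []
    for seq in database:
        if frequency_distance(query, seq) > max_fd:
            continue  # skipped without running the quadratic alignment
        hits.append((seq, smith_waterman_score(query, seq)))
    return hits


hits = filtered_search("ACGT", ["ACGT", "TTTT"], max_fd=2)
```

The filter costs time linear in the sequence lengths, while the alignment is quadratic, which is why discarding dissimilar sequences early can pay off.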

  4. A parallel implementation of an off-lattice individual-based model of multicellular populations

    NASA Astrophysics Data System (ADS)

    Harvey, Daniel G.; Fletcher, Alexander G.; Osborne, James M.; Pitt-Francis, Joe

    2015-07-01

    As computational models of multicellular populations include ever more detailed descriptions of biophysical and biochemical processes, the computational cost of simulating such models limits their ability to generate novel scientific hypotheses and testable predictions. While developments in microchip technology continue to increase the power of individual processors, parallel computing offers an immediate increase in available processing power. To make full use of parallel computing technology, it is necessary to develop specialised algorithms. To this end, we present a parallel algorithm for a class of off-lattice individual-based models of multicellular populations. The algorithm divides the spatial domain between computing processes and comprises communication routines that ensure the model is correctly simulated on multiple processors. The parallel algorithm is shown to accurately reproduce the results of a deterministic simulation performed using a pre-existing serial implementation. We test the scaling of computation time, memory use and load balancing as more processes are used to simulate a cell population of fixed size. We find approximate linear scaling of both speed-up and memory consumption on up to 32 processor cores. Dynamic load balancing is shown to provide speed-up for non-regular spatial distributions of cells in the case of a growing population.

  5. Dynamic virtual machine allocation policy in cloud computing complying with service level agreement using CloudSim

    NASA Astrophysics Data System (ADS)

    Aneri, Parikh; Sumathy, S.

    2017-11-01

    Cloud computing provides services over the internet and supplies application resources and data to users on demand. Cloud computing is based on a consumer-provider model: the cloud provider offers resources that consumers access through the cloud computing model to build applications according to their demand. A cloud data center is a bulk of resources in a shared-pool architecture that cloud users can access. Virtualization is the heart of the cloud computing model; it provides virtual machines according to application-specific configurations, and applications are free to choose their own configuration. On the one hand, there is a huge number of resources; on the other hand, the data center has to serve a huge number of requests effectively. Therefore, the resource allocation policy and the scheduling policy play a very important role in allocating and managing resources in this cloud computing model. This paper proposes a load balancing policy using the Hungarian algorithm. The Hungarian algorithm provides a dynamic load balancing policy with a monitor component. The monitor component helps to increase cloud resource utilization by managing the Hungarian algorithm, monitoring its state and altering that state based on artificial intelligence. CloudSim, used in this proposal, is an extensible toolkit that simulates a cloud computing environment.

  6. DE and NLP Based QPLS Algorithm

    NASA Astrophysics Data System (ADS)

    Yu, Xiaodong; Huang, Dexian; Wang, Xiong; Liu, Bo

    As a novel evolutionary computing technique, Differential Evolution (DE) has been considered to be an effective optimization method for complex optimization problems, and has achieved many successful applications in engineering. In this paper, a new algorithm of Quadratic Partial Least Squares (QPLS) based on Nonlinear Programming (NLP) is presented. DE is used to solve the NLP so as to calculate the optimal input weights and the parameters of the inner relationship. Simulation results based on the soft measurement of the diesel oil solidifying point on a real crude distillation unit demonstrate the superiority of the proposed algorithm over linear PLS and over QPLS based on Sequential Quadratic Programming (SQP) in terms of fitting accuracy and computational cost.

  7. LDPC decoder with a limited-precision FPGA-based floating-point multiplication coprocessor

    NASA Astrophysics Data System (ADS)

    Moberly, Raymond; O'Sullivan, Michael; Waheed, Khurram

    2007-09-01

    Implementing the sum-product algorithm, in an FPGA with an embedded processor, invites us to consider a tradeoff between computational precision and computational speed. The algorithm, known outside of the signal processing community as Pearl's belief propagation, is used for iterative soft-decision decoding of LDPC codes. We determined the feasibility of a coprocessor that will perform product computations. Our FPGA-based coprocessor (design) performs computer algebra with significantly less precision than the standard (e.g. integer, floating-point) operations of general purpose processors. Using synthesis, targeting a 3,168 LUT Xilinx FPGA, we show that key components of a decoder are feasible and that the full single-precision decoder could be constructed using a larger part. Soft-decision decoding by the iterative belief propagation algorithm is impacted both positively and negatively by a reduction in the precision of the computation. Reducing precision reduces the coding gain, but the limited-precision computation can operate faster. A proposed solution offers custom logic to perform computations with less precision, yet uses the floating-point format to interface with the software. Simulation results show the achievable coding gain. Synthesis results help theorize the full capacity and performance of an FPGA-based coprocessor.

  8. Optical image hiding based on computational ghost imaging

    NASA Astrophysics Data System (ADS)

    Wang, Le; Zhao, Shengmei; Cheng, Weiwen; Gong, Longyan; Chen, Hanwu

    2016-05-01

    Image hiding schemes play an important role in the current era of big data. They provide copyright protection for digital images. In this paper, we propose a novel image hiding scheme based on computational ghost imaging that offers strong robustness and high security. The watermark is encrypted with the configuration of a computational ghost imaging system, and the random speckle patterns compose a secret key. The least significant bit algorithm is adopted to embed the watermark, and both the second-order correlation algorithm and the compressed sensing (CS) algorithm are used to extract the watermark. The experimental and simulation results show that authorized users can get the watermark with the secret key. The watermark image cannot be retrieved when the eavesdropping ratio is less than 45% with the second-order correlation algorithm, or less than 20% with the TVAL3 CS reconstruction algorithm. In addition, the proposed scheme is robust against 'salt and pepper' noise and image cropping degradations.
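
The least significant bit embedding step can be illustrated on its own, as a minimal sketch that assumes 8-bit grayscale pixel values held in a flat list. It leaves out the ghost-imaging encryption and the correlation or compressed-sensing extraction described in the abstract; the function names and sample values are hypothetical.

```python
def embed_lsb(pixels, bits):
    """Overwrite the lowest bit of the first len(bits) pixels with the payload bits."""
    if len(bits) > len(pixels):
        raise ValueError("payload is longer than the cover image")
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit  # clear the lowest bit, then set it to the payload bit
    return out


def extract_lsb(pixels, n_bits):
    """Read the payload back from the lowest bit of the first n_bits pixels."""
    return [p & 1 for p in pixels[:n_bits]]


cover = [200, 13, 255, 0]
payload = [1, 0, 0, 1]
stego = embed_lsb(cover, payload)
recovered = extract_lsb(stego, len(payload))
```

Each pixel changes by at most one grey level, which is why the embedded data are visually imperceptible in the cover image.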

  9. Computing and analyzing the sensitivity of MLP due to the errors of the i.i.d. inputs and weights based on CLT.

    PubMed

    Yang, Sheng-Sung; Ho, Chia-Lu; Siu, Sammy

    2010-12-01

    In this paper, we propose an algorithm based on the central limit theorem to compute the sensitivity of the multilayer perceptron (MLP) due to the errors of the inputs and weights. For simplicity and practicality, all inputs and weights studied here are independently identically distributed (i.i.d.). The theoretical results derived from the proposed algorithm show that the sensitivity of the MLP is affected by the number of layers and the number of neurons adopted in each layer. To prove the reliability of the proposed algorithm, some experimental results of the sensitivity are also presented, and they match the theoretical ones. The good agreement between the theoretical results and the experimental results verifies the reliability and feasibility of the proposed algorithm. Furthermore, the proposed algorithm can also be applied to compute precisely the sensitivity of the MLP with any available activation functions and any types of i.i.d. inputs and weights.

  10. A High-Performance Genetic Algorithm: Using Traveling Salesman Problem as a Case

    PubMed Central

    Tsai, Chun-Wei; Tseng, Shih-Pang; Yang, Chu-Sing

    2014-01-01

    This paper presents a simple but efficient algorithm for reducing the computation time of genetic algorithm (GA) and its variants. The proposed algorithm is motivated by the observation that genes common to all the individuals of a GA have a high probability of surviving the evolution and ending up being part of the final solution; as such, they can be saved away to eliminate the redundant computations at the later generations of a GA. To evaluate the performance of the proposed algorithm, we use it not only to solve the traveling salesman problem but also to provide an extensive analysis on the impact it may have on the quality of the end result. Our experimental results indicate that the proposed algorithm can significantly reduce the computation time of GA and GA-based algorithms while limiting the degradation of the quality of the end result to a very small percentage compared to traditional GA. PMID:24892038

  11. A high-performance genetic algorithm: using traveling salesman problem as a case.

    PubMed

    Tsai, Chun-Wei; Tseng, Shih-Pang; Chiang, Ming-Chao; Yang, Chu-Sing; Hong, Tzung-Pei

    2014-01-01

    This paper presents a simple but efficient algorithm for reducing the computation time of genetic algorithm (GA) and its variants. The proposed algorithm is motivated by the observation that genes common to all the individuals of a GA have a high probability of surviving the evolution and ending up being part of the final solution; as such, they can be saved away to eliminate the redundant computations at the later generations of a GA. To evaluate the performance of the proposed algorithm, we use it not only to solve the traveling salesman problem but also to provide an extensive analysis on the impact it may have on the quality of the end result. Our experimental results indicate that the proposed algorithm can significantly reduce the computation time of GA and GA-based algorithms while limiting the degradation of the quality of the end result to a very small percentage compared to traditional GA.

  12. Ndarts

    NASA Technical Reports Server (NTRS)

    Jain, Abhinandan

    2011-01-01

    Ndarts software provides algorithms for computing quantities associated with the dynamics of articulated, rigid-link, multibody systems. It is designed as a general-purpose dynamics library that can be used for the modeling of robotic platforms, space vehicles, molecular dynamics, and other such applications. The architecture and algorithms in Ndarts are based on the Spatial Operator Algebra (SOA) theory for computational multibody and robot dynamics developed at JPL. It uses minimal, internal coordinate models. The algorithms are low-order, recursive scatter/gather algorithms. In comparison with the earlier Darts++ software, this version has a more general and cleaner design needed to support a larger class of computational dynamics needs. It includes a frames infrastructure, allows algorithms to operate on subgraphs of the system, and implements lazy and deferred computation for better efficiency. Dynamics modeling modules such as Ndarts are core building blocks of control and simulation software for space, robotic, mechanism, bio-molecular, and material systems modeling.

  13. Computer algorithms and applications used to assist the evaluation and treatment of adolescent idiopathic scoliosis: a review of published articles 2000-2009.

    PubMed

    Phan, Philippe; Mezghani, Neila; Aubin, Carl-Éric; de Guise, Jacques A; Labelle, Hubert

    2011-07-01

    Adolescent idiopathic scoliosis (AIS) is a complex spinal deformity whose assessment and treatment present many challenges. Computer applications have been developed to assist clinicians. A literature review on computer applications used in AIS evaluation and treatment has been undertaken. The algorithms used, their accuracy and clinical usability were analyzed. Computer applications have been used to create new classifications for AIS based on 2D and 3D features, assess scoliosis severity or risk of progression and assist bracing and surgical treatment. It was found that classification accuracy could be improved using computer algorithms, that AIS patient follow-up and screening could be done using surface topography, thereby limiting radiation, and that bracing and surgical treatment could be optimized using simulations. Yet few computer applications are routinely used in clinics. With the development of 3D imaging and databases, huge amounts of clinical and geometrical data need to be taken into consideration when researching and managing AIS. Computer applications based on advanced algorithms will be able to handle tasks that could not otherwise be done, which may improve the management of AIS patients. Clinically oriented applications and evidence that they can improve current care will be required for their integration in the clinical setting.

  14. A Linked List-Based Algorithm for Blob Detection on Embedded Vision-Based Sensors

    PubMed Central

    Acevedo-Avila, Ricardo; Gonzalez-Mendoza, Miguel; Garcia-Garcia, Andres

    2016-01-01

    Blob detection is a common task in vision-based applications. Most existing algorithms are aimed at execution on general purpose computers, while very few can be adapted to the computing restrictions present in embedded platforms. This paper focuses on the design of an algorithm capable of real-time blob detection that minimizes system memory consumption. The proposed algorithm detects objects in one image scan; it is based on a linked-list data structure tree used to label blobs depending on their shape and node information. An example application showing the results of a blob detection co-processor has been built on a low-powered field programmable gate array hardware as a step towards developing a smart video surveillance system. The detection method is intended for general purpose application. As such, several test cases focused on character recognition are also examined. The results obtained present a fair trade-off between accuracy and memory requirements and prove the validity of the proposed approach for real-time implementation on resource-constrained computing platforms. PMID:27240382

  15. A model of proto-object based saliency

    PubMed Central

    Russell, Alexander F.; Mihalaş, Stefan; von der Heydt, Rudiger; Niebur, Ernst; Etienne-Cummings, Ralph

    2013-01-01

    Organisms use the process of selective attention to optimally allocate their computational resources to the instantaneously most relevant subsets of a visual scene, ensuring that they can parse the scene in real time. Many models of bottom-up attentional selection assume that elementary image features, like intensity, color and orientation, attract attention. Gestalt psychologists, however, argue that humans perceive whole objects before they analyze individual features. This is supported by recent psychophysical studies that show that objects predict eye-fixations better than features. In this report we present a neurally inspired algorithm of object based, bottom-up attention. The model rivals the performance of state of the art non-biologically plausible feature based algorithms (and outperforms biologically plausible feature based algorithms) in its ability to predict perceptual saliency (eye fixations and subjective interest points) in natural scenes. The model achieves this by computing saliency as a function of proto-objects that establish the perceptual organization of the scene. All computational mechanisms of the algorithm have direct neural correlates, and our results provide evidence for the interface theory of attention. PMID:24184601

  16. A systematic investigation of computation models for predicting Adverse Drug Reactions (ADRs).

    PubMed

    Kuang, Qifan; Wang, MinQi; Li, Rong; Dong, YongCheng; Li, Yizhou; Li, Menglong

    2014-01-01

    Early and accurate identification of adverse drug reactions (ADRs) is critically important for drug development and clinical safety. Computer-aided prediction of ADRs has attracted increasing attention in recent years, and many computational models have been proposed. However, because of the lack of systematic analysis and comparison of the different computational models, there remain limitations in designing more effective algorithms and selecting more useful features. There is therefore an urgent need to review and analyze previous computation models to obtain general conclusions that can provide useful guidance to construct more effective computational models to predict ADRs. In the current study, the main work is to compare and analyze the performance of existing computational methods to predict ADRs, by implementing and evaluating additional algorithms that have been earlier used for predicting drug targets. Our results indicated that topological and intrinsic features were complementary to an extent and the Jaccard coefficient had an important and general effect on the prediction of drug-ADR associations. By comparing the structure of each algorithm, we found that the final formulas of these algorithms could all be converted to a linear model in form. Based on this finding, we propose a new algorithm called the general weighted profile method, which yielded the best overall performance among the algorithms investigated in this paper. Several meaningful conclusions and useful findings regarding the prediction of ADRs are provided for selecting optimal features and algorithms.
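
    To make the role of the Jaccard coefficient concrete, the sketch below computes it between drug feature sets and uses it as a weight in a generic similarity-weighted score. This is only an illustration under stated assumptions: the scoring function, the drug names and the feature and ADR sets are hypothetical, and the score is not the paper's general weighted profile method, whose formula the abstract does not give.

```python
from typing import Dict, Iterable, Set


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B| (0.0 when both sets are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_weighted_score(
    drug: str,
    adr: str,
    features: Dict[str, Set[str]],
    known_adrs: Dict[str, Set[str]],
) -> float:
    """Score a drug-ADR pair as the similarity-weighted fraction of
    other drugs that are known to cause the ADR (illustrative only)."""
    num = den = 0.0
    for other, other_adrs in known_adrs.items():
        if other == drug:
            continue
        w = jaccard(features[drug], features[other])
        num += w * (1.0 if adr in other_adrs else 0.0)
        den += w
    return num / den if den else 0.0


# Hypothetical feature sets (for example, targets) and known ADRs.
features = {"d1": {"t1", "t2", "t3"}, "d2": {"t2", "t3"}, "d3": {"t4"}}
known_adrs = {"d2": {"nausea"}, "d3": {"rash"}}

print(similarity_weighted_score("d1", "nausea", features, known_adrs))
print(similarity_weighted_score("d1", "rash", features, known_adrs))
```

    The score is linear in the known drug-ADR indicators, which is the general form the abstract says the compared algorithms reduce to.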

  17. Method for concurrent execution of primitive operations by dynamically assigning operations based upon computational marked graph and availability of data

    NASA Technical Reports Server (NTRS)

    Mielke, Roland V. (Inventor); Stoughton, John W. (Inventor)

    1990-01-01

    Computationally complex primitive operations of an algorithm are executed concurrently in a plurality of functional units under the control of an assignment manager. The algorithm is preferably defined as a computationally marked graph containing data status edges (paths) corresponding to each of the data flow edges. The assignment manager assigns primitive operations to the functional units and monitors completion of the primitive operations to determine data availability using the computational marked graph of the algorithm. All data accessing of the primitive operations is performed by the functional units independently of the assignment manager.
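
    The dispatch rule described above, in which an operation is assigned once its input data are available and a functional unit is free, can be sketched as follows. This is a simplified illustration, not the patented method: it assumes every operation takes one time step, models data availability as a set of completed operations rather than as tokens on data status edges, and uses hypothetical names throughout.

```python
from typing import Dict, List, Set


def schedule(deps: Dict[str, Set[str]], units: int) -> List[List[str]]:
    """Return, for each time step, the operations dispatched to the
    available functional units.

    deps maps each operation to the operations whose results it needs.
    """
    done: Set[str] = set()
    pending: Set[str] = set(deps)
    steps: List[List[str]] = []
    while pending:
        # An operation is ready when all of its inputs have been produced.
        ready = sorted(op for op in pending if deps[op] <= done)
        if not ready:
            raise ValueError("no operation is ready: cyclic dependencies")
        batch = ready[:units]  # at most one operation per functional unit
        steps.append(batch)
        done.update(batch)
        pending.difference_update(batch)
    return steps


# Hypothetical data-flow graph with five primitive operations.
deps = {"a": set(), "b": set(), "c": {"a", "b"}, "d": {"c"}, "e": {"a"}}
steps = schedule(deps, units=2)
print(steps)
```

    With two functional units, independent operations such as "c" and "e" run in the same step, while "d" waits until "c" has completed.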

  18. Fast parallel molecular algorithms for DNA-based computation: factoring integers.

    PubMed

    Chang, Weng-Long; Guo, Minyi; Ho, Michael Shan-Hui

    2005-06-01

    The RSA public-key cryptosystem is an algorithm that converts input data to an unrecognizable encryption and converts the unrecognizable data back into its original decryption form. The security of the RSA public-key cryptosystem is based on the difficulty of factoring the product of two large prime numbers. This paper demonstrates how to factor the product of two large prime numbers, and is a breakthrough in basic biological operations using a molecular computer. In order to achieve this, we propose three DNA-based algorithms for parallel subtractor, parallel comparator, and parallel modular arithmetic that formally verify our designed molecular solutions for factoring the product of two large prime numbers. Furthermore, this work indicates that public-key cryptosystems are perhaps insecure and also presents clear evidence of the ability of molecular computing to perform complicated mathematical operations.

  19. Research on rolling element bearing fault diagnosis based on genetic algorithm matching pursuit

    NASA Astrophysics Data System (ADS)

    Rong, R. W.; Ming, T. F.

    2017-12-01

    To solve the problem of slow computation speed, the matching pursuit algorithm is applied to rolling bearing fault diagnosis and improved in two respects: the construction of the dictionary and the way atoms are searched. Specifically, the Gabor function, which reflects time-frequency localization characteristics well, is used to construct the dictionary, and the genetic algorithm is used to improve the search speed. A time-frequency analysis method based on the genetic algorithm matching pursuit (GAMP) algorithm is proposed. The way to set property parameters to improve the decomposition results is studied. Simulation and experimental results illustrate that the weak fault features of rolling bearings can be extracted effectively by the proposed method, while the computation speed increases noticeably.

  20. Efficient FFT Algorithm for Psychoacoustic Model of the MPEG-4 AAC

    NASA Astrophysics Data System (ADS)

    Lee, Jae-Seong; Lee, Chang-Joon; Park, Young-Cheol; Youn, Dae-Hee

    This paper proposes an efficient FFT algorithm for the Psycho-Acoustic Model (PAM) of MPEG-4 AAC. The proposed algorithm synthesizes FFT coefficients using MDCT and MDST coefficients through circular convolution. The complexity of the MDCT and MDST coefficients is approximately half of the original FFT. We also design a new PAM based on the proposed FFT algorithm, which has 15% lower computational complexity than the original PAM without degradation of sound quality. Subjective as well as objective test results are presented to confirm the efficiency of the proposed FFT computation algorithm and the PAM.

  1. Resonant transition-based quantum computation

    NASA Astrophysics Data System (ADS)

    Chiang, Chen-Fu; Hsieh, Chang-Yu

    2017-05-01

    In this article we assess a novel quantum computation paradigm based on the resonant transition (RT) phenomenon commonly associated with atomic and molecular systems. We thoroughly analyze the intimate connections between the RT-based quantum computation and the well-established adiabatic quantum computation (AQC). Both quantum computing frameworks encode solutions to computational problems in the spectral properties of a Hamiltonian and rely on the quantum dynamics to obtain the desired output state. We discuss how one can adapt any adiabatic quantum algorithm to a corresponding RT version and the two approaches are limited by different aspects of Hamiltonians' spectra. The RT approach provides a compelling alternative to the AQC under various circumstances. To better illustrate the usefulness of the novel framework, we analyze the time complexity of an algorithm for 3-SAT problems and discuss straightforward methods to fine tune its efficiency.

  2. Colour based fire detection method with temporal intensity variation filtration

    NASA Astrophysics Data System (ADS)

    Trambitckii, K.; Anding, K.; Musalimov, V.; Linß, G.

    2015-02-01

    The development of video, computing technologies, and computer vision makes automatic fire detection from video information possible. In this project, different algorithms were implemented to find a more efficient way of detecting fire. This article describes a colour-based fire detection algorithm. However, colour information alone is not enough to detect fire properly, mainly because the scene may contain many objects with a colour similar to fire. The temporal intensity variation of pixels is used to separate such objects from the fire. These variations are averaged over a series of several frames. The algorithm works robustly and was implemented as a computer program using the OpenCV library.
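
    The two-stage idea, a colour test followed by a temporal intensity variation filter, can be sketched in plain Python. The colour rule (R above a threshold and R > G > B), the thresholds and the function names are assumptions made for illustration; the abstract does not state the actual rule, and the original program used OpenCV rather than nested lists.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0..255
Frame = List[List[Pixel]]      # rows of pixels


def is_fire_coloured(p: Pixel, r_min: int = 180) -> bool:
    """Assumed colour rule: bright red channel and R > G > B."""
    r, g, b = p
    return r >= r_min and r > g > b


def intensity(p: Pixel) -> float:
    """Mean of the three colour channels."""
    return sum(p) / 3.0


def fire_mask(frames: List[Frame], var_min: float = 10.0) -> List[List[bool]]:
    """Mark pixels that are fire-coloured in the last frame and whose
    frame-to-frame intensity change, averaged over the series, is large."""
    if len(frames) < 2:
        raise ValueError("at least two frames are needed")
    rows, cols = len(frames[0]), len(frames[0][0])
    mask = [[False] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            series = [intensity(f[y][x]) for f in frames]
            diffs = [abs(b - a) for a, b in zip(series, series[1:])]
            mean_variation = sum(diffs) / len(diffs)
            mask[y][x] = (
                is_fire_coloured(frames[-1][y][x]) and mean_variation >= var_min
            )
    return mask


# Hypothetical 1x2 image over three frames: a flickering flame pixel
# next to a static orange object.
frames = [
    [[(255, 120, 20), (230, 140, 40)]],
    [[(200, 90, 10), (230, 140, 40)]],
    [[(250, 130, 30), (230, 140, 40)]],
]
mask = fire_mask(frames)
print(mask)
```

    Both pixels pass the colour test, but only the flickering one passes the temporal filter, which is how static fire-coloured objects are rejected.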

  3. Job-shop scheduling applied to computer vision

    NASA Astrophysics Data System (ADS)

    Sebastian y Zuniga, Jose M.; Torres-Medina, Fernando; Aracil, Rafael; Reinoso, Oscar; Jimenez, Luis M.; Garcia, David

    1997-09-01

    This paper presents a method for minimizing the total elapsed time spent by n tasks running on m different processors working in parallel. The developed algorithm not only minimizes the total elapsed time but also reduces the idle time and waiting time of in-process tasks. This condition is very important in some applications of computer vision in which the time to finish the total process is particularly critical -- quality control in industrial inspection, real-time computer vision, guided robots. The scheduling algorithm is based on the use of two matrices, obtained from the precedence relationships between tasks, and the data obtained from the two matrices. The developed scheduling algorithm has been tested in one application of quality control using computer vision. The results obtained have been satisfactory in the application of different image processing algorithms.

  4. Total variation-based neutron computed tomography

    NASA Astrophysics Data System (ADS)

    Barnard, Richard C.; Bilheux, Hassina; Toops, Todd; Nafziger, Eric; Finney, Charles; Splitter, Derek; Archibald, Rick

    2018-05-01

    We perform the neutron computed tomography reconstruction problem via an inverse problem formulation with a total variation penalty. In the case of highly under-resolved angular measurements, the total variation penalty suppresses high-frequency artifacts which appear in filtered back projections. In order to efficiently compute solutions for this problem, we implement a variation of the split Bregman algorithm; due to the error-forgetting nature of the algorithm, the computational cost of updating can be significantly reduced via very inexact approximate linear solvers. We present the effectiveness of the algorithm in the significantly low-angular sampling case using synthetic test problems as well as data obtained from a high flux neutron source. The algorithm removes artifacts and can even roughly capture small features when an extremely low number of angles are used.

  5. Improved Savitzky-Golay-method-based fluorescence subtraction algorithm for rapid recovery of Raman spectra.

    PubMed

    Chen, Kun; Zhang, Hongyuan; Wei, Haoyun; Li, Yan

    2014-08-20

    In this paper, we propose an improved subtraction algorithm for rapid recovery of Raman spectra that can substantially reduce the computation time. This algorithm is based on an improved Savitzky-Golay (SG) iterative smoothing method, which involves two key novel approaches: (a) the use of the Gauss-Seidel method and (b) the introduction of a relaxation factor into the iterative procedure. By applying a novel successive relaxation (SG-SR) iterative method to the relaxation factor, additional improvement in the convergence speed over the standard Savitzky-Golay procedure is realized. The proposed improved algorithm (the RIA-SG-SR algorithm), which uses SG-SR-based iteration instead of Savitzky-Golay iteration, has been optimized and validated with a mathematically simulated Raman spectrum, as well as experimentally measured Raman spectra from non-biological and biological samples. The method results in a significant reduction in computing cost while yielding consistent rejection of fluorescence and noise for spectra with low signal-to-fluorescence ratios and varied baselines. In the simulation, RIA-SG-SR achieved 1 order of magnitude improvement in iteration number and 2 orders of magnitude improvement in computation time compared with the range-independent background-subtraction algorithm (RIA). Furthermore the computation time of the experimentally measured raw Raman spectrum processing from skin tissue decreased from 6.72 to 0.094 s. In general, the processing of the SG-SR method can be conducted within dozens of milliseconds, which can provide a real-time procedure in practical situations.

  6. An Evolutionary Algorithm for Fast Intensity Based Image Matching Between Optical and SAR Satellite Imagery

    NASA Astrophysics Data System (ADS)

    Fischer, Peter; Schuegraf, Philipp; Merkle, Nina; Storch, Tobias

    2018-04-01

    This paper presents a hybrid evolutionary algorithm for fast intensity based matching between satellite imagery from SAR and very high-resolution (VHR) optical sensor systems. The precise and accurate co-registration of image time series and images of different sensors is a key task in multi-sensor image processing scenarios. The necessary preprocessing step of image matching and tie-point detection is divided into a search problem and a similarity measurement. Within this paper we evaluate the use of an evolutionary search strategy for establishing the spatial correspondence between satellite imagery of optical and radar sensors. The aim of the proposed algorithm is to decrease the computational costs during the search process by formulating the search as an optimization problem. Based upon the canonical evolutionary algorithm, the proposed algorithm is adapted for SAR/optical imagery intensity based matching. Extensions are drawn using techniques like hybridization (e.g. local search) and others to lower the number of objective function calls and refine the result. The algorithm significantly decreases the computational costs whilst finding the optimal solution in a reliable way.

  7. Computer vision camera with embedded FPGA processing

    NASA Astrophysics Data System (ADS)

    Lecerf, Antoine; Ouellet, Denis; Arias-Estrada, Miguel

    2000-03-01

    Traditional computer vision is based on a camera-computer system in which the image understanding algorithms are embedded in the computer. To circumvent the computational load of vision algorithms, low-level processing and imaging hardware can be integrated in a single compact module where a dedicated architecture is implemented. This paper presents a Computer Vision Camera based on an open architecture implemented in an FPGA. The system is targeted to real-time computer vision tasks where low level processing and feature extraction tasks can be implemented in the FPGA device. The camera integrates a CMOS image sensor, an FPGA device, two memory banks, and an embedded PC for communication and control tasks. The FPGA device is a medium size one equivalent to 25,000 logic gates. The device is connected to two high speed memory banks, an IS interface, and an imager interface. The camera can be accessed for architecture programming, data transfer, and control through an Ethernet link from a remote computer. A hardware architecture can be defined in a Hardware Description Language (like VHDL), simulated and synthesized into digital structures that can be programmed into the FPGA and tested on the camera. The architecture of a classical multi-scale edge detection algorithm based on a Laplacian of Gaussian convolution has been developed to show the capabilities of the system.

  8. Efficient computation of the joint probability of multiple inherited risk alleles from pedigree data.

    PubMed

    Madsen, Thomas; Braun, Danielle; Peng, Gang; Parmigiani, Giovanni; Trippa, Lorenzo

    2018-06-25

    The Elston-Stewart peeling algorithm enables estimation of an individual's probability of harboring germline risk alleles based on pedigree data, and serves as the computational backbone of important genetic counseling tools. However, it remains limited to the analysis of risk alleles at a small number of genetic loci because its computing time grows exponentially with the number of loci considered. We propose a novel, approximate version of this algorithm, dubbed the peeling and paring algorithm, which scales polynomially in the number of loci. This allows extending peeling-based models to include many genetic loci. The algorithm creates a trade-off between accuracy and speed, and allows the user to control this trade-off. We provide exact bounds on the approximation error and evaluate it in realistic simulations. Results show that the loss of accuracy due to the approximation is negligible in important applications. This algorithm will improve genetic counseling tools by increasing the number of pathogenic risk alleles that can be addressed. To illustrate, we create an extended five-gene version of BRCAPRO, a widely used model for estimating the carrier probabilities of BRCA1 and BRCA2 risk alleles, and assess its computational properties. © 2018 WILEY PERIODICALS, INC.

  9. Solving satisfiability problems using a novel microarray-based DNA computer.

    PubMed

    Lin, Che-Hsin; Cheng, Hsiao-Ping; Yang, Chang-Biau; Yang, Chia-Ning

    2007-01-01

    An algorithm based on a modified sticker model, combined with an advanced MEMS-based microarray technology, is demonstrated to solve the SAT problem, which has long served as a benchmark in DNA computing. Unlike conventional DNA computing algorithms, which need an initial data pool covering correct and incorrect answers and then execute a series of separation procedures to destroy the unwanted ones, we built solutions in parts to satisfy one clause in one step, and eventually solved the entire Boolean formula through successive steps. No time-consuming sample preparation procedures and delicate sample applying equipment were required for the computing process. Moreover, experimental results show the bound DNA sequences can sustain the chemical solutions during computing processes such that the proposed method shall be useful in dealing with large-scale problems.

  10. Pattern recognition for passive polarimetric data using nonparametric classifiers

    NASA Astrophysics Data System (ADS)

    Thilak, Vimal; Saini, Jatinder; Voelz, David G.; Creusere, Charles D.

    2005-08-01

    Passive polarization based imaging is a useful tool in computer vision and pattern recognition. A passive polarization imaging system forms a polarimetric image from the reflection of ambient light that contains useful information for computer vision tasks such as object detection (classification) and recognition. Applications of polarization based pattern recognition include material classification and automatic shape recognition. In this paper, we present two target detection algorithms for images captured by a passive polarimetric imaging system. The proposed detection algorithms are based on Bayesian decision theory. In these approaches, an object can belong to one of any given number of classes, and classification involves making decisions that minimize the average probability of making incorrect decisions. This minimum is achieved by assigning an object to the class that maximizes the a posteriori probability. Computing a posteriori probabilities requires estimates of class conditional probability density functions (likelihoods) and prior probabilities. A probabilistic neural network (PNN), which is a nonparametric method that can compute Bayes optimal boundaries, and a k-nearest neighbor (KNN) classifier are used for density estimation and classification. The proposed algorithms are applied to polarimetric image data gathered in the laboratory with a liquid crystal-based system. The experimental results validate the effectiveness of the above algorithms for target detection from polarimetric data.
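
    The decision rule described above, assigning an object to the class with the largest prior times likelihood, can be sketched with a Gaussian Parzen-window density estimate, which is the estimator a PNN is built on. The scalar feature, the class names, the sample values and the smoothing parameter sigma are all hypothetical; the paper's features and settings are not given in the abstract.

```python
import math
from typing import Dict, List


def parzen_likelihood(x: float, samples: List[float], sigma: float) -> float:
    """Estimate p(x | class) as the average of Gaussian kernels
    centred on the training samples of that class."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = sum(
        norm * math.exp(-((x - s) ** 2) / (2.0 * sigma ** 2)) for s in samples
    )
    return total / len(samples)


def classify(
    x: float,
    training: Dict[str, List[float]],
    priors: Dict[str, float],
    sigma: float = 0.2,
) -> str:
    """Return the class that maximizes prior * likelihood, which is
    proportional to the a posteriori probability."""
    return max(
        training,
        key=lambda c: priors[c] * parzen_likelihood(x, training[c], sigma),
    )


# Hypothetical scalar feature (for example, a degree of polarization).
training = {
    "target": [0.8, 0.9, 1.1],
    "background": [0.1, 0.2, 0.3],
}
priors = {"target": 0.5, "background": 0.5}

print(classify(0.95, training, priors))
print(classify(0.15, training, priors))
```

    Because the evidence p(x) is the same for every class, comparing prior times likelihood is enough to find the maximum a posteriori class.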

  11. A Parallel Sliding Region Algorithm to Make Agent-Based Modeling Possible for a Large-Scale Simulation: Modeling Hepatitis C Epidemics in Canada.

    PubMed

    Wong, William W L; Feng, Zeny Z; Thein, Hla-Hla

    2016-11-01

    Agent-based models (ABMs) are computer simulation models that define interactions among agents and simulate emergent behaviors that arise from the ensemble of local decisions. ABMs have been increasingly used to examine trends in infectious disease epidemiology. However, the main limitation of ABMs is the high computational cost for a large-scale simulation. To improve the computational efficiency for large-scale ABM simulations, we built a parallelizable sliding region algorithm (SRA) for ABM and compared it to a nonparallelizable ABM. We developed a complex agent network and performed two simulations to model hepatitis C epidemics based on the real demographic data from Saskatchewan, Canada. The first simulation used the SRA that processed on each postal code subregion subsequently. The second simulation processed the entire population simultaneously. It was concluded that the parallelizable SRA showed computational time saving with comparable results in a province-wide simulation. Using the same method, SRA can be generalized for performing a country-wide simulation. Thus, this parallel algorithm enables the possibility of using ABM for large-scale simulation with limited computational resources.

  12. Efficient parallel resolution of the simplified transport equations in mixed-dual formulation

    NASA Astrophysics Data System (ADS)

    Barrault, M.; Lathuilière, B.; Ramet, P.; Roman, J.

    2011-03-01

    A reactivity computation consists of computing the highest eigenvalue of a generalized eigenvalue problem, for which an inverse power algorithm is commonly used. Very fine models are difficult for our sequential solver, based on the simplified transport equations, to handle in terms of memory consumption and computational time. A first implementation of a Lagrangian based domain decomposition method leads to poor parallel efficiency because of an increase in the power iterations [1]. In order to obtain a high parallel efficiency, we improve the parallelization scheme by changing the location of the loop over the subdomains in the overall algorithm and by benefiting from the characteristics of the Raviart-Thomas finite element. The new parallel algorithm still allows us to locally adapt the numerical scheme (mesh, finite element order). However, it can be significantly optimized for the matching grid case. The good behavior of the new parallelization scheme is demonstrated for the matching grid case on several hundreds of nodes for computations based on a pin-by-pin discretization.

  13. Combinatorial Algorithms to Enable Computational Science and Engineering: Work from the CSCAPES Institute

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boman, Erik G.; Catalyurek, Umit V.; Chevalier, Cedric

    2015-01-16

    This final progress report summarizes the work accomplished at the Combinatorial Scientific Computing and Petascale Simulations Institute. We developed Zoltan, a parallel mesh partitioning library that made use of accurate hypergraph models to provide load balancing in mesh-based computations. We developed several graph coloring algorithms for computing Jacobian and Hessian matrices and organized them into a software package called ColPack. We developed parallel algorithms for graph coloring and graph matching problems, and also designed multi-scale graph algorithms. Three PhD students graduated, six more are continuing their PhD studies, and four postdoctoral scholars were advised. Six of these students and Fellows have joined DOE Labs (Sandia, Berkeley), as staff scientists or as postdoctoral scientists. We also organized the SIAM Workshop on Combinatorial Scientific Computing (CSC) in 2007, 2009, and 2011 to continue to foster the CSC community.

  14. Automatically-computed prehospital severity scores are equivalent to scores based on medic documentation.

    PubMed

    Reisner, Andrew T; Chen, Liangyou; McKenna, Thomas M; Reifman, Jaques

    2008-10-01

    Prehospital severity scores can be used in routine prehospital care, mass casualty care, and military triage. If computers could reliably calculate clinical scores, new clinical and research methodologies would be possible. One obstacle is that vital signs measured automatically can be unreliable. We hypothesized that Signal Quality Indices (SQI's), computer algorithms that differentiate between reliable and unreliable monitored physiologic data, could improve the predictive power of computer-calculated scores. In a retrospective analysis of trauma casualties transported by air ambulance, we computed the Triage Revised Trauma Score (RTS) from archived travel monitor data. We compared the areas-under-the-curve (AUC's) of receiver operating characteristic curves for prediction of mortality and red blood cell transfusion for 187 subjects with comparable quantities of good-quality and poor-quality data. Vital signs deemed reliable by SQI's led to significantly more discriminatory severity scores than vital signs deemed unreliable. We also compared automatically-computed RTS (using the SQI's) versus RTS computed from vital signs documented by medics. For the subjects in whom the SQI algorithms identified 15 consecutive seconds of reliable vital signs data (n = 350), the automatically-computed scores' AUC's were the same as the medic-based scores' AUC's. Using the Prehospital Index in place of RTS led to very similar results, corroborating our findings. SQI algorithms improve automatically-computed severity scores, and automatically-computed scores using SQI's are equivalent to medic-based scores.

  15. An Improved Clustering Algorithm of Tunnel Monitoring Data for Cloud Computing

    PubMed Central

    Zhong, Luo; Tang, KunHao; Li, Lin; Yang, Guang; Ye, JingJing

    2014-01-01

    With the rapid development of urban construction, the number of urban tunnels is increasing and the data they produce are becoming more and more complex. As a result, traditional clustering algorithms cannot handle the massive tunnel data. To solve this problem, an improved parallel clustering algorithm based on k-means has been proposed. It uses MapReduce within a cloud computing environment to process the data. It not only can handle massive data but also is more efficient. Moreover, it is able to compute the average dissimilarity degree of each cluster in order to clean the abnormal data. PMID:24982971
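
    The k-means-on-MapReduce idea in this abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the function names (map_phase, reduce_phase, average_dissimilarity) are hypothetical, no real MapReduce framework is used, and "average dissimilarity degree" is assumed here to mean the mean Euclidean distance of a cluster's points to its centroid.

```python
from collections import defaultdict
from math import dist


def map_phase(points, centroids):
    """Map step: emit (index of nearest centroid, point) for every point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield nearest, p


def reduce_phase(pairs, centroids):
    """Reduce step: average the points assigned to each centroid."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centroids = list(centroids)  # a centroid with no points stays unchanged
    for idx, members in groups.items():
        new_centroids[idx] = tuple(sum(c) / len(members) for c in zip(*members))
    return new_centroids


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of map/reduce rounds and return the centroids."""
    for _ in range(iterations):
        centroids = reduce_phase(map_phase(points, centroids), centroids)
    return centroids


def average_dissimilarity(points, centroids):
    """Assumed measure: mean distance of each cluster's points to its centroid."""
    groups = defaultdict(list)
    for idx, p in map_phase(points, centroids):
        groups[idx].append(dist(p, centroids[idx]))
    return {idx: sum(d) / len(d) for idx, d in groups.items()}


points = [(0, 0), (0, 2), (10, 10), (10, 12)]
centroids = kmeans(points, [(0, 0), (10, 10)])
spread = average_dissimilarity(points, centroids)
```

    Points whose distance to their centroid is far above the cluster's average could then be flagged as abnormal; the threshold for that would be a further assumption.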

  16. Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades.

    PubMed

    Orchard, Garrick; Jayawant, Ajinkya; Cohen, Gregory K; Thakor, Nitish

    2015-01-01

    Creating datasets for Neuromorphic Vision is a challenging task. A lack of available recordings from Neuromorphic Vision sensors means that data must typically be recorded specifically for dataset creation rather than collecting and labeling existing data. The task is further complicated by a desire to simultaneously provide traditional frame-based recordings to allow for direct comparison with traditional Computer Vision algorithms. Here we propose a method for converting existing Computer Vision static image datasets into Neuromorphic Vision datasets using an actuated pan-tilt camera platform. Moving the sensor rather than the scene or image is a more biologically realistic approach to sensing and eliminates timing artifacts introduced by monitor updates when simulating motion on a computer monitor. We present conversion of two popular image datasets (MNIST and Caltech101) which have played important roles in the development of Computer Vision, and we provide performance metrics on these datasets using spike-based recognition algorithms. This work contributes datasets for future use in the field, as well as results from spike-based algorithms against which future works can compare. Furthermore, by converting datasets already popular in Computer Vision, we enable more direct comparison with frame-based approaches.

  17. An improved dehazing algorithm of aerial high-definition image

    NASA Astrophysics Data System (ADS)

    Jiang, Wentao; Ji, Ming; Huang, Xiying; Wang, Chao; Yang, Yizhou; Li, Tao; Wang, Jiaoying; Zhang, Ying

    2016-01-01

    For unmanned aerial vehicle (UAV) images, the sensor cannot capture high-quality images in fog and haze. To solve this problem, an improved dehazing algorithm for aerial high-definition images is proposed. Based on the dark channel prior model, the new algorithm first extracts the edges from the crude estimated transmission map and expands the extracted edges. Then, according to the expanded edges, the algorithm sets a threshold value to divide the crude estimated transmission map into different areas and applies a different guided filter to each area to compute the optimized transmission map. The experimental results demonstrate that the performance of the proposed algorithm is substantially the same as that of the algorithm based on dark channel prior and guided filter. The average computation time of the new algorithm is around 40% of that algorithm's, and the detection ability for UAV images in fog and haze is effectively improved.

  18. Short-term Power Load Forecasting Based on Balanced KNN

    NASA Astrophysics Data System (ADS)

    Lv, Xianlong; Cheng, Xingong; YanShuang; Tang, Yan-mei

    2018-03-01

    To improve the accuracy of load forecasting, a short-term load forecasting model based on a balanced KNN algorithm is proposed. According to the load characteristics, the massive historical power load data are divided into scenes by the K-means algorithm. To handle unbalanced load scenes, the balanced KNN algorithm is proposed to classify the scenes accurately. The locally weighted linear regression algorithm is used to fit and predict the load. Using the Apache Hadoop programming framework for cloud computing, the proposed algorithm model is parallelized and improved to enhance its ability to deal with massive, high-dimensional data. Household electricity consumption data for a residential district are analyzed on a 23-node cloud computing cluster, and experimental results show that the load forecasting accuracy and execution time of the proposed model are better than those of the traditional forecasting algorithm.

  19. A Feature Selection Algorithm to Compute Gene Centric Methylation from Probe Level Methylation Data.

    PubMed

    Baur, Brittany; Bozdag, Serdar

    2016-01-01

    DNA methylation is an important epigenetic event that affects gene expression during development and various diseases such as cancer. Understanding the mechanism of action of DNA methylation is important for downstream analysis. In the Illumina Infinium HumanMethylation 450K array, there are tens of probes associated with each gene. Given methylation intensities of all these probes, it is necessary to compute which of these probes are most representative of the gene centric methylation level. In this study, we developed a feature selection algorithm based on sequential forward selection that utilized different classification methods to compute gene centric DNA methylation using probe level DNA methylation data. We compared our algorithm to other feature selection algorithms such as support vector machines with recursive feature elimination, genetic algorithms and ReliefF. We evaluated all methods based on the predictive power of selected probes on their mRNA expression levels and found that a K-Nearest Neighbors classification using the sequential forward selection algorithm performed better than other algorithms based on all metrics. We also observed that transcriptional activities of certain genes were more sensitive to DNA methylation changes than transcriptional activities of other genes. Our algorithm was able to predict the expression of those genes with high accuracy using only DNA methylation data. Our results also showed that those DNA methylation-sensitive genes were enriched in Gene Ontology terms related to the regulation of various biological processes.
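
    Sequential forward selection, the search strategy named in this abstract, can be sketched generically as below. This is only an illustration under assumptions: the scoring function, the probe names, and the weights are invented for the example, whereas the study scores candidate probe sets with classifiers such as K-Nearest Neighbors.

```python
def sequential_forward_selection(features, score, k):
    """Greedily add the feature that maximizes score(subset) until k are chosen."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical per-probe usefulness; p1 and p2 are assumed to be redundant.
WEIGHTS = {"p1": 0.9, "p2": 0.85, "p3": 0.2}


def toy_score(subset):
    """Stand-in for a cross-validated classifier score (an assumption)."""
    total = sum(WEIGHTS[f] for f in subset)
    if "p1" in subset and "p2" in subset:
        total -= 0.8  # penalize the redundant pair
    return total


chosen = sequential_forward_selection(WEIGHTS, toy_score, k=2)
```

    In this toy setup the greedy search picks p1 first and then prefers p3 over the individually stronger but redundant p2, which is the behavior that distinguishes subset-based selection from ranking features one at a time.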

  20. One-way quantum computing in superconducting circuits

    NASA Astrophysics Data System (ADS)

    Albarrán-Arriagada, F.; Alvarado Barrios, G.; Sanz, M.; Romero, G.; Lamata, L.; Retamal, J. C.; Solano, E.

    2018-03-01

    We propose a method for the implementation of one-way quantum computing in superconducting circuits. Measurement-based quantum computing is a universal quantum computation paradigm in which an initial cluster state provides the quantum resource, while the iteration of sequential measurements and local rotations encodes the quantum algorithm. Up to now, technical constraints have limited a scalable approach to this quantum computing alternative. The initial cluster state can be generated with available controlled-phase gates, while the quantum algorithm makes use of high-fidelity readout and coherent feedforward. With current technology, we estimate that quantum algorithms with above 20 qubits may be implemented in the path toward quantum supremacy. Moreover, we propose an alternative initial state with properties of maximal persistence and maximal connectedness, reducing the required resources of one-way quantum computing protocols.

  1. An expert fitness diagnosis system based on elastic cloud computing.

    PubMed

    Tseng, Kevin C; Wu, Chia-Chuan

    2014-01-01

    This paper presents an expert diagnosis system based on cloud computing. It classifies a user's fitness level based on supervised machine learning techniques. This system is able to learn and make customized diagnoses according to the user's physiological data, such as age, gender, and body mass index (BMI). In addition, an elastic algorithm based on Poisson distribution is presented to allocate computation resources dynamically. It predicts the required resources in the future according to the exponential moving average of past observations. The experimental results show that Naïve Bayes is the best classifier with the highest accuracy (90.8%) and that the elastic algorithm is able to capture tightly the trend of requests generated from the Internet and thus assign corresponding computation resources to ensure the quality of service.
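
    The exponential moving average used for resource prediction can be sketched as follows. The smoothing factor alpha, the per-server capacity, and the servers_needed helper are hypothetical; the Poisson-based part of the elastic algorithm is not modeled here.

```python
import math


def exponential_moving_average(observations, alpha=0.5):
    """Return the EMA after each observation; the last value is the forecast."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    ema = []
    for x in observations:
        # The first observation seeds the average; later ones are blended in.
        ema.append(x if not ema else alpha * x + (1 - alpha) * ema[-1])
    return ema


def servers_needed(forecast, capacity_per_server):
    """Hypothetical allocation rule: enough servers to cover the forecast load."""
    return max(1, math.ceil(forecast / capacity_per_server))


requests_per_minute = [10.0, 20.0, 30.0]
smoothed = exponential_moving_average(requests_per_minute, alpha=0.5)
allocation = servers_needed(smoothed[-1], capacity_per_server=10.0)
```

    A larger alpha makes the forecast follow recent request bursts more closely, while a smaller alpha smooths them out.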

  2. A Voxel-Based Filtering Algorithm for Mobile LIDAR Data

    NASA Astrophysics Data System (ADS)

    Qin, H.; Guan, G.; Yu, Y.; Zhong, L.

    2018-04-01

    This paper presents a stepwise voxel-based filtering algorithm for mobile LiDAR data. In the first step, to improve computational efficiency, mobile LiDAR points, in xy-plane, are first partitioned into a set of two-dimensional (2-D) blocks with a given block size, in each of which all laser points are further organized into an octree partition structure with a set of three-dimensional (3-D) voxels. Then, a voxel-based upward growing processing is performed to roughly separate terrain from non-terrain points with global and local terrain thresholds. In the second step, the extracted terrain points are refined by computing voxel curvatures. This voxel-based filtering algorithm is comprehensively discussed in the analyses of parameter sensitivity and overall performance. An experimental study performed on multiple point cloud samples, collected by different commercial mobile LiDAR systems, showed that the proposed algorithm provides a promising solution to terrain point extraction from mobile point clouds.
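
    The first-step partitioning described in this abstract (2-D blocks in the xy-plane, then 3-D voxels inside each block) might be sketched as below. As a simplifying assumption, a uniform voxel grid stands in for the octree structure used in the paper, and the function name and size parameters are hypothetical.

```python
from collections import defaultdict
from math import floor


def partition_points(points, block_size, voxel_size):
    """Group (x, y, z) points into xy blocks, then into voxels within each block."""
    blocks = defaultdict(lambda: defaultdict(list))
    for x, y, z in points:
        block = (floor(x / block_size), floor(y / block_size))
        voxel = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        blocks[block][voxel].append((x, y, z))
    return blocks


cloud = [(0.2, 0.3, 0.1), (0.4, 0.1, 0.2), (5.5, 0.2, 1.7)]
blocks = partition_points(cloud, block_size=5.0, voxel_size=1.0)
```

    The lowest occupied voxels in each block would be natural seeds for the upward growing step, although the thresholds that step needs are not given in the abstract.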

  3. Geometric Detection Algorithms for Cavities on Protein Surfaces in Molecular Graphics: A Survey

    PubMed Central

    Simões, Tiago; Lopes, Daniel; Dias, Sérgio; Fernandes, Francisco; Pereira, João; Jorge, Joaquim; Bajaj, Chandrajit; Gomes, Abel

    2017-01-01

    Detecting and analyzing protein cavities provides significant information about active sites for biological processes (e.g., protein-protein or protein-ligand binding) in molecular graphics and modeling. Using the three-dimensional structure of a given protein (i.e., atom types and their locations in 3D) as retrieved from a PDB (Protein Data Bank) file, it is now computationally viable to determine a description of these cavities. Such cavities correspond to pockets, clefts, invaginations, voids, tunnels, channels, and grooves on the surface of a given protein. In this work, we survey the literature on protein cavity computation and classify algorithmic approaches into three categories: evolution-based, energy-based, and geometry-based. Our survey focuses on geometric algorithms, whose taxonomy is extended to include not only sphere-, grid-, and tessellation-based methods, but also surface-based, hybrid geometric, consensus, and time-varying methods. Finally, we detail those techniques that have been customized for GPU (Graphics Processing Unit) computing. PMID:29520122

  4. Fast computational scheme of image compression for 32-bit microprocessors

    NASA Technical Reports Server (NTRS)

    Kasperovich, Leonid

    1994-01-01

    This paper presents a new computational scheme of image compression based on the discrete cosine transform (DCT), underlying JPEG and MPEG International Standards. The algorithm for the 2-d DCT computation uses integer operations (register shifts and additions / subtractions only); its computational complexity is about 8 additions per image pixel. As a meaningful example of an on-board image compression application we consider the software implementation of the algorithm for the Mars Rover (Marsokhod, in Russian) imaging system being developed as part of the Mars-96 International Space Project. It is shown that a fast software solution for 32-bit microprocessors may compete with DCT-based image compression hardware.

  5. Accelerating Multiagent Reinforcement Learning by Equilibrium Transfer.

    PubMed

    Hu, Yujing; Gao, Yang; An, Bo

    2015-07-01

    An important approach in multiagent reinforcement learning (MARL) is equilibrium-based MARL, which adopts equilibrium solution concepts in game theory and requires agents to play equilibrium strategies at each state. However, most existing equilibrium-based MARL algorithms cannot scale due to a large number of computationally expensive equilibrium computations (e.g., computing Nash equilibria is PPAD-hard) during learning. For the first time, this paper finds that during the learning process of equilibrium-based MARL, the one-shot games corresponding to each state's successive visits often have the same or similar equilibria (for some states more than 90% of games corresponding to successive visits have similar equilibria). Inspired by this observation, this paper proposes to use equilibrium transfer to accelerate equilibrium-based MARL. The key idea of equilibrium transfer is to reuse previously computed equilibria when each agent has a small incentive to deviate. By introducing transfer loss and transfer condition, a novel framework called equilibrium transfer-based MARL is proposed. We prove that although equilibrium transfer brings transfer loss, equilibrium-based MARL algorithms can still converge to an equilibrium policy under certain assumptions. Experimental results in widely used benchmarks (e.g., grid world game, soccer game, and wall game) show that the proposed framework: 1) not only significantly accelerates equilibrium-based MARL (up to 96.7% reduction in learning time), but also achieves higher average rewards than algorithms without equilibrium transfer and 2) scales significantly better than algorithms without equilibrium transfer when the state/action space grows and the number of agents increases.

  6. A Computer Based Educational Aid for the Instruction of Combat Modeling

    DTIC Science & Technology

    1992-02-27

    representation (36:363-370), and, as Knuth put it, "An algorithm must be seen to be believed" (23:4). Graphics not only aid in achieving instructional...consisted primarily of research, identification and use of existing combat model computer algorithms, interviews, and use of operation research...to-air combat models’ operating manuals provided valuable insight into program structure and algorithms used to represent the combat. From these

  7. Identifying Optimal Measurement Subspace for the Ensemble Kalman Filter

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou, Ning; Huang, Zhenyu; Welch, Greg

    2012-05-24

    To reduce the computational load of the ensemble Kalman filter while maintaining its efficacy, an optimization algorithm based on the generalized eigenvalue decomposition method is proposed for identifying the most informative measurement subspace. When the number of measurements is large, the proposed algorithm can be used to make an effective tradeoff between computational complexity and estimation accuracy. This algorithm also can be extended to other Kalman filters for measurement subspace selection.

  8. Sinogram-based adaptive iterative reconstruction for sparse view x-ray computed tomography

    NASA Astrophysics Data System (ADS)

    Trinca, D.; Zhong, Y.; Wang, Y.-Z.; Mamyrbayev, T.; Libin, E.

    2016-10-01

    With the availability of more powerful computing processors, iterative reconstruction algorithms have recently been successfully implemented as an approach to achieving significant dose reduction in X-ray CT. In this paper, we propose an adaptive iterative reconstruction algorithm for X-ray CT, that is shown to provide results comparable to those obtained by proprietary algorithms, both in terms of reconstruction accuracy and execution time. The proposed algorithm is thus provided for free to the scientific community, for regular use, and for possible further optimization.

  9. Symmetric log-domain diffeomorphic Registration: a demons-based approach.

    PubMed

    Vercauteren, Tom; Pennec, Xavier; Perchant, Aymeric; Ayache, Nicholas

    2008-01-01

    Modern morphometric studies use non-linear image registration to compare anatomies and perform group analysis. Recently, log-Euclidean approaches have contributed to promote the use of such computational anatomy tools by permitting simple computations of statistics on a rather large class of invertible spatial transformations. In this work, we propose a non-linear registration algorithm perfectly fit for log-Euclidean statistics on diffeomorphisms. Our algorithm works completely in the log-domain, i.e. it uses a stationary velocity field. This implies that we guarantee the invertibility of the deformation and have access to the true inverse transformation. This also means that our output can be directly used for log-Euclidean statistics without relying on the heavy computation of the log of the spatial transformation. As it is often desirable, our algorithm is symmetric with respect to the order of the input images. Furthermore, we use an alternate optimization approach related to Thirion's demons algorithm to provide a fast non-linear registration algorithm. First results show that our algorithm outperforms both the demons algorithm and the recently proposed diffeomorphic demons algorithm in terms of accuracy of the transformation while remaining computationally efficient.

  10. The Research and Implementation of MUSER CLEAN Algorithm Based on OpenCL

    NASA Astrophysics Data System (ADS)

    Feng, Y.; Chen, K.; Deng, H.; Wang, F.; Mei, Y.; Wei, S. L.; Dai, W.; Yang, Q. P.; Liu, Y. B.; Wu, J. P.

    2017-03-01

    High-performance data processing on a single machine is urgently needed in the development of astronomical software. However, because machine configurations differ, traditional programming techniques such as multi-threading and CUDA (Compute Unified Device Architecture)+GPU (Graphics Processing Unit) have obvious limitations in portability and seamlessness between different operating systems. The OpenCL (Open Computing Language) used in the development of the MUSER (MingantU SpEctral Radioheliograph) data processing system is introduced, and the Högbom CLEAN algorithm is re-implemented as a parallel CLEAN algorithm using the Python language and the PyOpenCL extension package. The experimental results show that the CLEAN algorithm based on OpenCL has approximately the same operating efficiency as the former CLEAN algorithm based on CUDA. More importantly, data processing in a CPU-only (Central Processing Unit) environment can also achieve high performance with this system, which solves the problem of the environmental dependence of CUDA+GPU. Overall, the research improves the adaptability of the system with emphasis on the performance of MUSER image CLEAN computing. Meanwhile, the realization of OpenCL in MUSER proves its usability in scientific data processing. In view of the high-performance computing features of OpenCL in heterogeneous environments, it will probably become the preferred technology in future high-performance astronomical software development.
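
    For readers unfamiliar with the Högbom CLEAN algorithm mentioned here, a serial one-dimensional sketch in plain Python is given below. It is not the MUSER or PyOpenCL implementation; the loop gain, threshold, and toy point spread function (PSF) are assumptions chosen for illustration.

```python
def hogbom_clean_1d(dirty, psf, gain=0.1, threshold=1e-3, max_iter=1000):
    """Iteratively subtract a scaled, shifted PSF at the residual's peak."""
    residual = list(dirty)
    model = [0.0] * len(dirty)
    center = max(range(len(psf)), key=lambda i: psf[i])  # index of the PSF peak
    for _ in range(max_iter):
        peak = max(range(len(residual)), key=lambda i: abs(residual[i]))
        if abs(residual[peak]) < threshold:
            break
        # Move a fraction (the loop gain) of the peak into the model.
        flux = gain * residual[peak] / psf[center]
        model[peak] += flux
        for j, value in enumerate(psf):
            k = peak + j - center
            if 0 <= k < len(residual):
                residual[k] -= flux * value
    return model, residual


# Toy data: one source of amplitude 2 at index 3, blurred by the PSF.
psf = [0.5, 1.0, 0.5]
dirty = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
model, residual = hogbom_clean_1d(dirty, psf)
```

    The peak search and the PSF subtraction are the two steps that a parallel version would distribute across work items; how the MUSER system does this is not described in the abstract.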

  11. A low-power and high-quality implementation of the discrete cosine transformation

    NASA Astrophysics Data System (ADS)

    Heyne, B.; Götze, J.

    2007-06-01

    In this paper a computationally efficient and high-quality preserving DCT architecture is presented. It is obtained by optimizing the Loeffler DCT based on the Cordic algorithm. The computational complexity is reduced from 11 multiply and 29 add operations (Loeffler DCT) to 38 add and 16 shift operations (which is similar to the complexity of the binDCT). The experimental results show that the proposed DCT algorithm not only reduces the computational complexity significantly, but also retains the good transformation quality of the Loeffler DCT. Therefore, the proposed Cordic based Loeffler DCT is especially suited for low-power and high-quality CODECs in battery-based systems.
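
    The shift-and-add rotation that CORDIC provides can be sketched in fixed-point Python as below. This shows only the generic rotation primitive, not the authors' CORDIC-based Loeffler DCT; the word length (16 fractional bits) and iteration count are assumptions, and the initial gain compensation uses floating point for brevity, whereas hardware would fold it into constants.

```python
import math

FRAC_BITS = 16    # assumed fixed-point precision
ITERATIONS = 16   # assumed number of micro-rotations
SCALE = 1 << FRAC_BITS

# arctan(2^-i) in fixed point, one entry per micro-rotation.
ANGLES = [round(math.atan(2.0 ** -i) * SCALE) for i in range(ITERATIONS)]
# Accumulated CORDIC gain; inputs are pre-divided by it.
GAIN = math.prod(math.sqrt(1 + 2.0 ** (-2 * i)) for i in range(ITERATIONS))


def cordic_rotate(x, y, angle):
    """Rotate (x, y) by angle in radians (|angle| below about 1.74)."""
    xi = round(x / GAIN * SCALE)
    yi = round(y / GAIN * SCALE)
    zi = round(angle * SCALE)
    for i in range(ITERATIONS):
        # Each step uses only shifts and additions/subtractions.
        if zi >= 0:
            xi, yi, zi = xi - (yi >> i), yi + (xi >> i), zi - ANGLES[i]
        else:
            xi, yi, zi = xi + (yi >> i), yi - (xi >> i), zi + ANGLES[i]
    return xi / SCALE, yi / SCALE


cos_30, sin_30 = cordic_rotate(1.0, 0.0, math.pi / 6)
```

    Reducing ITERATIONS trades accuracy for fewer add and shift operations, which is the kind of tradeoff between complexity and transformation quality that the abstract discusses.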

  12. Accelerated computer generated holography using sparse bases in the STFT domain.

    PubMed

    Blinder, David; Schelkens, Peter

    2018-01-22

    Computer-generated holography at high resolutions is a computationally intensive task. Efficient algorithms are needed to generate holograms at acceptable speeds, especially for real-time and interactive applications such as holographic displays. We propose a novel technique to generate holograms using a sparse basis representation in the short-time Fourier space combined with a wavefront-recording plane placed in the middle of the 3D object. By computing the point spread functions in the transform domain, we update only a small subset of the precomputed largest-magnitude coefficients to significantly accelerate the algorithm over conventional look-up table methods. We implement the algorithm on a GPU, and report a speedup factor of over 30. We show that this transform is superior to wavelet-based approaches, and show quantitative and qualitative improvements over the state-of-the-art WASABI method; we report accuracy gains of 2 dB PSNR, as well as improved view preservation.

  13. Overview of fast algorithm in 3D dynamic holographic display

    NASA Astrophysics Data System (ADS)

    Liu, Juan; Jia, Jia; Pan, Yijie; Wang, Yongtian

    2013-08-01

    3D dynamic holographic display is one of the most attractive techniques for achieving real 3D vision with full depth cues without any extra devices. However, a huge amount of 3D information and data must be processed and computed in real time to generate the hologram in 3D dynamic holographic display, which is a challenge even for the most advanced computers. Many fast algorithms have been proposed for speeding up the calculation and reducing the memory usage, such as the look-up table (LUT), compressed look-up table (C-LUT), split look-up table (S-LUT), and novel look-up table (N-LUT), which are based on the point-based method, and the full analytical polygon-based method and the one-step polygon-based method, which are based on the polygon-based method. In this presentation, we give an overview of various fast algorithms based on the point-based method and the polygon-based method, and focus on the fast algorithm with low memory usage, the C-LUT, and on the one-step polygon-based method using 2D Fourier analysis of the 3D affine transformation. Numerical simulations and optical experiments are presented, and several other algorithms are compared. The results show that the C-LUT algorithm and the one-step polygon-based method are efficient methods for saving calculation time. It is believed that these methods could be used in real-time 3D holographic display in the future.

  14. Advanced biologically plausible algorithms for low-level image processing

    NASA Astrophysics Data System (ADS)

    Gusakova, Valentina I.; Podladchikova, Lubov N.; Shaposhnikov, Dmitry G.; Markin, Sergey N.; Golovan, Alexander V.; Lee, Seong-Whan

    1999-08-01

    At present, in computer vision, the approach based on modeling biological vision mechanisms is being extensively developed. However, up to now, real-world image processing has no effective solution within either biologically inspired or conventional approaches. Evidently, new algorithms and system architectures based on advanced biological motivation should be developed to solve the computational problems related to this visual task. A basic problem that must be solved to create an effective artificial visual system for processing real-world images is the search for new algorithms of low-level image processing, which to a great extent determine system performance. In the present paper, the results of psychophysical experiments and several advanced biologically motivated algorithms for low-level processing are presented. These algorithms are based on a local space-variant filter, context encoding of the visual information presented in the center of the input window, and automatic detection of perceptually important image fragments. The core of the latter algorithm is the use of local feature conjunctions, such as noncolinear oriented segments, and composite feature map formation. The developed algorithms were integrated into the foveal active vision model, MARR. It is supposed that the proposed algorithms may significantly improve model performance in real-world image processing during memorizing, search, and recognition.

  15. Deep Convolutional Framelet Denosing for Low-Dose CT via Wavelet Residual Network.

    PubMed

    Kang, Eunhee; Chang, Won; Yoo, Jaejun; Ye, Jong Chul

    2018-06-01

    Model-based iterative reconstruction algorithms for low-dose X-ray computed tomography (CT) are computationally expensive. To address this problem, we recently proposed a deep convolutional neural network (CNN) for low-dose X-ray CT and won the second place in 2016 AAPM Low-Dose CT Grand Challenge. However, some of the textures were not fully recovered. To address this problem, here we propose a novel framelet-based denoising algorithm using wavelet residual network which synergistically combines the expressive power of deep learning and the performance guarantee from the framelet-based denoising algorithms. The new algorithms were inspired by the recent interpretation of the deep CNN as a cascaded convolution framelet signal representation. Extensive experimental results confirm that the proposed networks have significantly improved performance and preserve the detail texture of the original images.

  16. Finding Frequent Closed Itemsets in Sliding Window in Linear Time

    NASA Astrophysics Data System (ADS)

    Chen, Junbo; Zhou, Bo; Chen, Lu; Wang, Xinyu; Ding, Yiqun

    One of the most well-studied problems in data mining is computing the collection of frequent itemsets in large transactional databases. Since the introduction of the famous Apriori algorithm [14], many others have been proposed to find the frequent itemsets. Among such algorithms, the approach of mining closed itemsets has raised much interest in the data mining community. The algorithms taking this approach include TITANIC [8], CLOSET+ [6], DCI-Closed [4], FCI-Stream [3], GC-Tree [15], TGC-Tree [16], etc. Among these algorithms, FCI-Stream, GC-Tree and TGC-Tree are online algorithms that work under sliding window environments. According to the performance evaluation in [16], GC-Tree [15] is the fastest one. In this paper, an improved algorithm based on GC-Tree is proposed, the computational complexity of which is proved to be a linear combination of the average transaction size and the average closed itemset size. The algorithm is based on the essential theorem presented in Sect. 4.2. Empirically, the new algorithm is several orders of magnitude faster than the state-of-the-art algorithm, GC-Tree.

  17. Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

    NASA Technical Reports Server (NTRS)

    Sun, Xian-He

    1997-01-01

    Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is to 1) develop highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporate newly developed algorithms into actual simulation packages. The work plan has been well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, and (2) using a compact scheme to gain high-order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.

  18. An algorithmic approach to solving polynomial equations associated with quantum circuits

    NASA Astrophysics Data System (ADS)

    Gerdt, V. P.; Zinin, M. V.

    2009-12-01

    In this paper we present two algorithms for reducing systems of multivariate polynomial equations over the finite field F_2 to the canonical triangular form called lexicographical Gröbner basis. This triangular form is the most appropriate for finding solutions of the system. On the other hand, the system of polynomials over F_2 whose variables also take values in F_2 (Boolean polynomials) completely describes the unitary matrix generated by a quantum circuit. In particular, the matrix itself can be computed by counting the number of solutions (roots) of the associated polynomial system. Thereby, efficient construction of the lexicographical Gröbner bases over F_2 associated with quantum circuits gives a method for computing their circuit matrices that is alternative to the direct numerical method based on linear algebra. We compare our implementation of both algorithms with some other software packages available for computing Gröbner bases over F_2.

  19. Implementation of real-time digital signal processing systems

    NASA Technical Reports Server (NTRS)

    Narasimha, M.; Peterson, A.; Narayan, S.

    1978-01-01

    Special purpose hardware implementation of DFT Computers and digital filters is considered in the light of newly introduced algorithms and IC devices. Recent work by Winograd on high-speed convolution techniques for computing short length DFT's, has motivated the development of more efficient algorithms, compared to the FFT, for evaluating the transform of longer sequences. Among these, prime factor algorithms appear suitable for special purpose hardware implementations. Architectural considerations in designing DFT computers based on these algorithms are discussed. With the availability of monolithic multiplier-accumulators, a direct implementation of IIR and FIR filters, using random access memories in place of shift registers, appears attractive. The memory addressing scheme involved in such implementations is discussed. A simple counter set-up to address the data memory in the realization of FIR filters is also described. The combination of a set of simple filters (weighting network) and a DFT computer is shown to realize a bank of uniform bandpass filters. The usefulness of this concept in arriving at a modular design for a million channel spectrum analyzer, based on microprocessors, is discussed.

  20. Automatic computation of 2D cardiac measurements from B-mode echocardiography

    NASA Astrophysics Data System (ADS)

    Park, JinHyeong; Feng, Shaolei; Zhou, S. Kevin

    2012-03-01

    We propose a robust and fully automatic algorithm which computes the 2D echocardiography measurements recommended by the American Society of Echocardiography. The algorithm employs knowledge-based imaging technologies which can learn the expert's knowledge from the training images and the expert's annotations. Based on the models constructed in the learning stage, the algorithm searches for the initial locations of the landmark points for the measurements by utilizing the heart structure of the left ventricle, including the mitral valve and aortic valve. It employs the pseudo anatomic M-mode image, generated by accumulating the line images in the 2D parasternal long axis view over time, to refine the measurement landmark points. Experimental results on a large volume of data show that the algorithm runs fast and that its robustness is comparable to an expert's.

  1. A consensus algorithm for approximate string matching and its application to QRS complex detection

    NASA Astrophysics Data System (ADS)

    Alba, Alfonso; Mendez, Martin O.; Rubio-Rincon, Miguel E.; Arce-Santana, Edgar R.

    2016-08-01

    In this paper, a novel algorithm for approximate string matching (ASM) is proposed. The novelty resides in the fact that, unlike most other methods, the proposed algorithm is not based on the Hamming or Levenshtein distances, but instead computes a score for each symbol in the search text based on a consensus measure. Those symbols with sufficiently high scores will likely correspond to approximate instances of the pattern string. To demonstrate the usefulness of the proposed method, it has been applied to the detection of QRS complexes in electrocardiographic signals with competitive results when compared against the classic Pan-Tompkins (PT) algorithm. The proposed method outperformed PT in 72% of the test cases, with no extra computational cost.

  2. Social Media: Menagerie of Metrics

    DTIC Science & Technology

    2010-01-27

    intelligence, an evolutionary algorithm (EA) is a subset of evolutionary computation, a generic population-based metaheuristic optimization algorithm . An EA...Cloning - 22 Animals were cloned to date; genetic algorithms can help prediction (e.g. “elitism” - attempts to ensure selection by including performers...28, 2010 Evolutionary Algorithm • Evolutionary algorithm From Wikipedia, the free encyclopedia Artificial intelligence portal In artificial

  3. Performance improvement of multi-class detection using greedy algorithm for Viola-Jones cascade selection

    NASA Astrophysics Data System (ADS)

    Tereshin, Alexander A.; Usilin, Sergey A.; Arlazarov, Vladimir V.

    2018-04-01

    This paper studies the problem of multi-class object detection in a video stream with Viola-Jones cascades. An adaptive algorithm for selecting a Viola-Jones cascade, based on a greedy choice strategy for solving the N-armed bandit problem, is proposed. The efficiency of the algorithm is shown on the problem of detecting and recognizing bank card logos in a video stream. The proposed algorithm can be effectively used in document localization and identification, recognition of road scene elements, localization and tracking of lengthy objects, and for solving other problems of rigid object detection in heterogeneous data flows. The computational efficiency of the algorithm makes it possible to use it both on personal computers and on mobile devices based on processors with low power consumption.
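
    A greedy strategy for the N-armed bandit problem, as mentioned in the abstract above, can be sketched generically. The code below is a minimal illustration and not the authors' cascade selector: each "arm" stands in for one detector cascade, the reward functions are hypothetical placeholders, and every arm is tried once before pure exploitation begins.

```python
def greedy_bandit(reward_fns, rounds):
    """Greedy N-armed bandit: pull each arm once, then always pull the
    arm with the highest running mean reward.

    reward_fns: list of zero-argument callables, one per arm (hypothetical
                stand-ins for "run cascade i on the next frame").
    Returns (pull counts, estimated mean reward) per arm.
    """
    n = len(reward_fns)
    counts = [0] * n
    means = [0.0] * n
    for t in range(rounds):
        if t < n:
            arm = t                                   # initial sweep
        else:
            arm = max(range(n), key=lambda i: means[i])  # greedy choice
        reward = reward_fns[arm]()
        counts[arm] += 1
        # Incremental update of the running mean for the chosen arm.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


if __name__ == "__main__":
    arms = [lambda: 0.2, lambda: 0.9, lambda: 0.5]
    print(greedy_bandit(arms, 10))
```

    With deterministic rewards the greedy rule locks onto the best arm immediately after the initial sweep; with noisy rewards a purely greedy rule can get stuck on a suboptimal arm, which is why adaptive variants are of interest.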

  4. A real time microcomputer implementation of sensor failure detection for turbofan engines

    NASA Technical Reports Server (NTRS)

    Delaat, John C.; Merrill, Walter C.

    1989-01-01

    An algorithm was developed which detects, isolates, and accommodates sensor failures using analytical redundancy. The performance of this algorithm was demonstrated on a full-scale F100 turbofan engine. The algorithm was implemented in real-time on a microprocessor-based controls computer which includes parallel processing and high order language programming. Parallel processing was used to achieve the required computational power for the real-time implementation. High order language programming was used in order to reduce the programming and maintenance costs of the algorithm implementation software. The sensor failure algorithm was combined with an existing multivariable control algorithm to give a complete control implementation with sensor analytical redundancy. The real-time microprocessor implementation of the algorithm, which resulted in the successful completion of the algorithm engine demonstration, is described.

  5. A joint swarm intelligence algorithm for multi-user detection in MIMO-OFDM system

    NASA Astrophysics Data System (ADS)

    Hu, Fengye; Du, Dakun; Zhang, Peng; Wang, Zhijun

    2014-11-01

    In the multi-input multi-output orthogonal frequency division multiplexing (MIMO-OFDM) system, traditional multi-user detection (MUD) algorithms, which are usually used to suppress multiple access interference, struggle to balance system detection performance against algorithmic complexity. To solve this problem, this paper proposes a joint swarm intelligence algorithm called Ant Colony and Particle Swarm Optimisation (AC-PSO) by integrating particle swarm optimisation (PSO) and ant colony optimisation (ACO) algorithms. Simulation results show that, with low computational complexity, MUD for the MIMO-OFDM system based on the AC-PSO algorithm achieves performance comparable to that of the maximum likelihood algorithm. Thus, the proposed AC-PSO algorithm provides a satisfactory trade-off between computational complexity and detection performance.

  6. Kernel Temporal Differences for Neural Decoding

    PubMed Central

    Bae, Jihye; Sanchez Giraldo, Luis G.; Pohlmeyer, Eric A.; Francis, Joseph T.; Sanchez, Justin C.; Príncipe, José C.

    2015-01-01

    We study the feasibility and capability of the kernel temporal difference (KTD)(λ) algorithm for neural decoding. KTD(λ) is an online, kernel-based learning algorithm, which has been introduced to estimate value functions in reinforcement learning. This algorithm combines kernel-based representations with the temporal difference approach to learning. One of our key observations is that by using strictly positive definite kernels, the algorithm's convergence can be guaranteed for policy evaluation. The algorithm's nonlinear functional approximation capabilities are shown in both simulations of policy evaluation and neural decoding problems (policy improvement). KTD can handle high-dimensional neural states containing spatial-temporal information at a reasonable computational complexity, allowing real-time applications. When the algorithm seeks a proper mapping between a monkey's neural states and desired positions of a computer cursor or a robot arm, in both open-loop and closed-loop experiments, it can effectively learn the neural state to action mapping. Finally, a visualization of the coadaptation process between the decoder and the subject shows the algorithm's capabilities in reinforcement learning brain machine interfaces. PMID:25866504

  7. [Orthogonal Vector Projection Algorithm for Spectral Unmixing].

    PubMed

    Song, Mei-ping; Xu, Xing-wei; Chang, Chein-I; An, Ju-bai; Yao, Li

    2015-12-01

    Spectral unmixing is an important part of hyperspectral technologies and is essential for material quantity analysis in hyperspectral imagery. Most linear unmixing algorithms require matrix multiplication and matrix inversion or the computation of matrix determinants. These are difficult to program and especially hard to realize in hardware. At the same time, the computational cost of the algorithms increases significantly as the number of endmembers grows. Here, based on the traditional Orthogonal Subspace Projection algorithm, a new method called Orthogonal Vector Projection is proposed using the orthogonality principle. It simplifies the process by avoiding matrix multiplication and inversion. It first computes the final orthogonal vector for each endmember spectrum via the Gram-Schmidt process. These orthogonal vectors are then used as projection vectors for the pixel signature. The unconstrained abundance can be obtained directly by projecting the signature onto the projection vectors and computing the ratio of the projected vector length to the orthogonal vector length. Compared with the Orthogonal Subspace Projection and Least Squares Error algorithms, this method does not need matrix inversion, which is computationally costly and hard to implement in hardware. It completes the orthogonalization process by repeated vector operations, which makes it easy to apply in both parallel computation and hardware. The soundness of the algorithm is proved through its relationship with the Orthogonal Subspace Projection and Least Squares Error algorithms. Its computational complexity is also compared with that of the other two algorithms and is the lowest of the three. Finally, experimental results on a synthetic image and a real image are provided as further evidence of the effectiveness of the method.
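
    The projection step described above can be sketched with plain vector operations. This is a minimal reading of the abstract, not the authors' implementation: the function names are hypothetical, and it is assumed that the "final orthogonal vector" of an endmember is the part of its spectrum orthogonal to all other endmembers, obtained by running Gram-Schmidt with that endmember last.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orthogonal_component(target, others):
    """Part of `target` orthogonal to every vector in `others` (Gram-Schmidt)."""
    basis = []
    for v in others:
        w = list(v)
        for b in basis:                      # remove components along earlier basis vectors
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm_sq = dot(w, w)
        if norm_sq > 1e-12:                  # skip linearly dependent spectra
            norm = norm_sq ** 0.5
            basis.append([wi / norm for wi in w])
    u = list(target)
    for b in basis:
        c = dot(u, b)
        u = [ui - c * bi for ui, bi in zip(u, b)]
    return u


def ovp_abundances(endmembers, pixel):
    """Unconstrained abundances: ratio of the pixel's projection onto each
    endmember's orthogonal vector to that vector's squared length."""
    abundances = []
    for i, m in enumerate(endmembers):
        others = endmembers[:i] + endmembers[i + 1:]
        u = orthogonal_component(m, others)
        abundances.append(dot(pixel, u) / dot(u, u))
    return abundances


if __name__ == "__main__":
    endmembers = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
    pixel = [0.7, 0.8, 0.3]   # 0.2*e1 + 0.5*e2 + 0.3*e3
    print(ovp_abundances(endmembers, pixel))
```

    Only dot products and scaled vector subtractions appear; no matrix is inverted. The sketch assumes the endmember spectra are linearly independent, otherwise the division by `dot(u, u)` is undefined.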

  8. Gradient gravitational search: An efficient metaheuristic algorithm for global optimization.

    PubMed

    Dash, Tirtharaj; Sahu, Prabhat K

    2015-05-30

    The adaptation of novel techniques developed in the field of computational chemistry to solve the concerned problems for large and flexible molecules is taking the center stage with regard to efficient algorithm, computational cost and accuracy. In this article, the gradient-based gravitational search (GGS) algorithm, using analytical gradients for a fast minimization to the next local minimum has been reported. Its efficiency as metaheuristic approach has also been compared with Gradient Tabu Search and others like: Gravitational Search, Cuckoo Search, and Back Tracking Search algorithms for global optimization. Moreover, the GGS approach has also been applied to computational chemistry problems for finding the minimal value potential energy of two-dimensional and three-dimensional off-lattice protein models. The simulation results reveal the relative stability and physical accuracy of protein models with efficient computational cost. © 2015 Wiley Periodicals, Inc.

  9. SubspaceEM: A Fast Maximum-a-posteriori Algorithm for Cryo-EM Single Particle Reconstruction

    PubMed Central

    Dvornek, Nicha C.; Sigworth, Fred J.; Tagare, Hemant D.

    2015-01-01

    Single particle reconstruction methods based on the maximum-likelihood principle and the expectation-maximization (E–M) algorithm are popular because of their ability to produce high resolution structures. However, these algorithms are computationally very expensive, requiring a network of computational servers. To overcome this computational bottleneck, we propose a new mathematical framework for accelerating maximum-likelihood reconstructions. The speedup is by orders of magnitude and the proposed algorithm produces similar quality reconstructions compared to the standard maximum-likelihood formulation. Our approach uses subspace approximations of the cryo-electron microscopy (cryo-EM) data and projection images, greatly reducing the number of image transformations and comparisons that are computed. Experiments using simulated and actual cryo-EM data show that speedup in overall execution time compared to traditional maximum-likelihood reconstruction reaches factors of over 300. PMID:25839831

  10. Integration of symbolic and algorithmic hardware and software for the automation of space station subsystems

    NASA Technical Reports Server (NTRS)

    Gregg, Hugh; Healey, Kathleen; Hack, Edmund; Wong, Carla

    1987-01-01

    Expert systems that require access to data bases, complex simulations and real time instrumentation have both symbolic and algorithmic computing needs. These needs could both be met using a general computing workstation running both symbolic and algorithmic code, or separate, specialized computers networked together. The latter approach was chosen to implement TEXSYS, the thermal expert system, developed to demonstrate the ability of an expert system to autonomously control the thermal control system of the space station. TEXSYS has been implemented on a Symbolics workstation, and will be linked to a microVAX computer that will control a thermal test bed. Integration options are explored and several possible solutions are presented.

  11. An O(log^2 N) parallel algorithm for computing the eigenvalues of a symmetric tridiagonal matrix

    NASA Technical Reports Server (NTRS)

    Swarztrauber, Paul N.

    1989-01-01

    An O(log^2 N) parallel algorithm is presented for computing the eigenvalues of a symmetric tridiagonal matrix using a parallel algorithm for computing the zeros of the characteristic polynomial. The method is based on a quadratic recurrence in which the characteristic polynomial is constructed on a binary tree from polynomials whose degree doubles at each level. Intervals that contain exactly one zero are determined by the zeros of polynomials at the previous level, which ensures that different processors compute different zeros. The exact behavior of the polynomials at the interval endpoints is used to eliminate the usual problems induced by finite precision arithmetic.
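
    For orientation, the classical sequential way of locating zeros of the characteristic polynomial of a symmetric tridiagonal matrix is Sturm-sequence counting combined with bisection. The sketch below shows that textbook approach, not the parallel quadratic-recurrence method of the report; the function names and the tolerance are assumptions made for the example.

```python
def count_below(diag, off, x):
    """Number of eigenvalues of the symmetric tridiagonal matrix that are < x.

    diag: main diagonal a_1..a_n, off: off-diagonal b_1..b_{n-1}.
    Uses the ratio form of the Sturm sequence:
        q_k = (a_k - x) - b_{k-1}^2 / q_{k-1}
    and counts how many q_k are negative.
    """
    count = 0
    q = 1.0
    for k in range(len(diag)):
        b_sq = off[k - 1] ** 2 if k > 0 else 0.0
        q = diag[k] - x - b_sq / q
        if q == 0.0:
            q = 1e-300            # avoid division by zero on the next step
        if q < 0.0:
            count += 1
    return count


def eigenvalue(diag, off, k, tol=1e-12):
    """k-th smallest eigenvalue (k = 0, 1, ...) by bisection."""
    n = len(diag)
    # Gershgorin discs give an interval containing every eigenvalue.
    radius = [(abs(off[i - 1]) if i > 0 else 0.0) +
              (abs(off[i]) if i < n - 1 else 0.0) for i in range(n)]
    lo = min(d - r for d, r in zip(diag, radius))
    hi = max(d + r for d, r in zip(diag, radius))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) > k:
            hi = mid              # at least k+1 eigenvalues lie below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    diag, off = [2.0, 2.0, 2.0], [-1.0, -1.0]
    print([eigenvalue(diag, off, k) for k in range(3)])
```

    Each eigenvalue is found independently of the others, so different values of `k` could be handed to different processors; the report's contribution is a different scheme that also builds the polynomial itself in parallel.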

  12. Online selective kernel-based temporal difference learning.

    PubMed

    Chen, Xingguo; Gao, Yang; Wang, Ruili

    2013-12-01

    In this paper, an online selective kernel-based temporal difference (OSKTD) learning algorithm is proposed to deal with large scale and/or continuous reinforcement learning problems. OSKTD includes two online procedures: online sparsification and parameter updating for the selective kernel-based value function. A new sparsification method (i.e., a kernel distance-based online sparsification method) is proposed based on selective ensemble learning, which is computationally less complex than other sparsification methods. With the proposed sparsification method, the sparsified dictionary of samples is constructed online by checking whether a sample needs to be added to the sparsified dictionary. In addition, based on local validity, a selective kernel-based value function is proposed to select the best samples from the sample dictionary for the selective kernel-based value function approximator. The parameters of the selective kernel-based value function are iteratively updated by using the temporal difference (TD) learning algorithm combined with the gradient descent technique. The complexity of the online sparsification procedure in the OSKTD algorithm is O(n). In addition, two typical experiments (Maze and Mountain Car) are used to compare with both traditional and up-to-date O(n) algorithms (GTD, GTD2, and TDC using the kernel-based value function), and the results demonstrate the effectiveness of our proposed algorithm. In the Maze problem, OSKTD converges to an optimal policy and converges faster than both traditional and up-to-date algorithms. In the Mountain Car problem, OSKTD converges, requires less computation time than other sparsification methods, reaches a better local optimum than the traditional algorithms, and converges much faster than the up-to-date algorithms. In addition, OSKTD can reach a final optimum that is competitive with the up-to-date algorithms.
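
    A kernel distance-based sparsification test of the kind named above can be sketched as follows. The abstract does not give the exact criterion, so this is a generic version under assumptions: a Gaussian kernel, a hypothetical squared-distance `threshold`, and the rule that a sample joins the dictionary only if it is farther than the threshold from every stored sample.

```python
import math


def gaussian_kernel(x, y, sigma=0.5):
    """Gaussian (RBF) kernel between two equal-length tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_distance_sq(x, y, kernel):
    """Squared distance between x and y in the kernel's feature space."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


def sparsify_online(samples, threshold, kernel=gaussian_kernel):
    """Build a dictionary one sample at a time.

    Each new sample is compared against every dictionary entry once,
    so the cost per sample is linear in the dictionary size.
    """
    dictionary = []
    for s in samples:
        if all(kernel_distance_sq(s, d, kernel) > threshold for d in dictionary):
            dictionary.append(s)
    return dictionary


if __name__ == "__main__":
    states = [(0.0,), (0.05,), (1.0,), (1.02,), (2.0,)]
    print(sparsify_online(states, threshold=0.5))
```

    Near-duplicate states such as `(0.05,)` and `(1.02,)` are rejected because they lie close to an existing dictionary entry in feature space, which keeps the value-function representation compact.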

  13. A Systematic Investigation of Computation Models for Predicting Adverse Drug Reactions (ADRs)

    PubMed Central

    Kuang, Qifan; Wang, MinQi; Li, Rong; Dong, YongCheng; Li, Yizhou; Li, Menglong

    2014-01-01

    Background: Early and accurate identification of adverse drug reactions (ADRs) is critically important for drug development and clinical safety. Computer-aided prediction of ADRs has attracted increasing attention in recent years, and many computational models have been proposed. However, because of the lack of systematic analysis and comparison of the different computational models, there remain limitations in designing more effective algorithms and selecting more useful features. There is therefore an urgent need to review and analyze previous computation models to obtain general conclusions that can provide useful guidance to construct more effective computational models to predict ADRs. Principal Findings: In the current study, the main work is to compare and analyze the performance of existing computational methods to predict ADRs, by implementing and evaluating additional algorithms that have been earlier used for predicting drug targets. Our results indicated that topological and intrinsic features were complementary to an extent and the Jaccard coefficient had an important and general effect on the prediction of drug-ADR associations. By comparing the structure of each algorithm, we found that the final formulas of these algorithms could all be converted to a linear model in form. Based on this finding, we propose a new algorithm called the general weighted profile method, which yielded the best overall performance among the algorithms investigated in this paper. Conclusion: Several meaningful conclusions and useful findings regarding the prediction of ADRs are provided for selecting optimal features and algorithms. PMID:25180585
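
    The Jaccard coefficient mentioned above measures the overlap of two sets as the size of their intersection divided by the size of their union. A minimal sketch follows; the side-effect names are made up for the example and do not come from the paper's data.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two collections.

    Returns 0.0 when both collections are empty (a convention chosen
    here to avoid dividing by zero).
    """
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


if __name__ == "__main__":
    # Hypothetical ADR profiles of two drugs.
    drug_x = {"nausea", "rash", "headache"}
    drug_y = {"rash", "headache", "dizziness"}
    print(jaccard(drug_x, drug_y))
```

    Two drugs sharing two of four distinct reactions score 0.5; such pairwise similarities are the kind of quantity a profile-based predictor can weight when scoring a candidate drug-ADR pair.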

  14. Characterization of robotics parallel algorithms and mapping onto a reconfigurable SIMD machine

    NASA Technical Reports Server (NTRS)

    Lee, C. S. G.; Lin, C. T.

    1989-01-01

    The kinematics, dynamics, Jacobian, and their corresponding inverse computations are six essential problems in the control of robot manipulators. Efficient parallel algorithms for these computations are discussed and analyzed. Their characteristics are identified and a scheme for mapping these algorithms to a reconfigurable parallel architecture is presented. Based on the characteristics including type of parallelism, degree of parallelism, uniformity of the operations, fundamental operations, data dependencies, and communication requirement, it is shown that most of the algorithms for robotic computations possess highly regular properties and some common structures, especially the linear recursive structure. Moreover, they are well-suited to be implemented on a single-instruction-stream multiple-data-stream (SIMD) computer with reconfigurable interconnection network. The model of a reconfigurable dual network SIMD machine with internal direct feedback is introduced. A systematic procedure to map these computations to the proposed machine is presented. A new scheduling problem for SIMD machines is investigated and a heuristic algorithm, called neighborhood scheduling, that reorders the processing sequence of subtasks to reduce the communication time is described. Mapping results of a benchmark algorithm are illustrated and discussed.

  15. Computer architecture for efficient algorithmic executions in real-time systems: New technology for avionics systems and advanced space vehicles

    NASA Technical Reports Server (NTRS)

    Carroll, Chester C.; Youngblood, John N.; Saha, Aindam

    1987-01-01

    Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel systems to increase system performance. Research conducted in development of specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processing elements. This allocation is based on the critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, task definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.
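
    Critical path analysis of a task graph, on which the allocation above is said to be based, amounts to finding the longest chain of dependent tasks. The sketch below is a generic illustration and not the report's allocation algorithm: the task names and durations are invented, and the graph is assumed to be acyclic.

```python
def critical_path(durations, deps):
    """Longest-duration dependency chain in an acyclic task graph.

    durations: task -> execution time
    deps:      task -> iterable of tasks that must finish first
    Returns (total length, list of tasks on the critical path).
    """
    finish = {}   # earliest finish time of each task
    prev = {}     # predecessor that determines each task's start time

    def visit(task):
        if task in finish:
            return finish[task]
        start, best = 0, None
        for p in sorted(deps.get(task, ())):
            f = visit(p)
            if f > start:
                start, best = f, p
        finish[task] = start + durations[task]
        prev[task] = best
        return finish[task]

    for t in durations:
        visit(t)

    last = max(finish, key=finish.get)
    length = finish[last]
    path = []
    node = last
    while node is not None:       # walk back along determining predecessors
        path.append(node)
        node = prev[node]
    return length, path[::-1]


if __name__ == "__main__":
    durations = {"a": 3, "b": 2, "c": 4, "d": 1}
    deps = {"c": {"a", "b"}, "d": {"c"}}
    print(critical_path(durations, deps))
```

    Tasks on the critical path bound the total execution time no matter how many processing elements are available, so an allocator would typically schedule them first and fit the remaining tasks around them.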

  16. Computer architecture for efficient algorithmic executions in real-time systems: new technology for avionics systems and advanced space vehicles

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carroll, C.C.; Youngblood, J.N.; Saha, A.

    1987-12-01

    Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel systems to increase system performance. Research conducted in development of specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processing elements. This allocation is based on the critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, task definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.

  17. Using modified fruit fly optimisation algorithm to perform the function test and case studies

    NASA Astrophysics Data System (ADS)

    Pan, Wen-Tsao

    2013-06-01

    Evolutionary computation is a computing mode established by practically simulating natural evolutionary processes based on the concept of Darwinian Theory, and it is a common research method. The main contribution of this paper was to reinforce the function of searching for the optimised solution using the fruit fly optimization algorithm (FOA), in order to avoid the acquisition of local extremum solutions. The evolutionary computation has grown to include the concepts of animal foraging behaviour and group behaviour. This study discussed three common evolutionary computation methods and compared them with the modified fruit fly optimization algorithm (MFOA). It further investigated the ability of the three mathematical functions in computing extreme values, as well as the algorithm execution speed and the forecast ability of the forecasting model built using the optimised general regression neural network (GRNN) parameters. The findings indicated that there was no obvious difference between particle swarm optimization and the MFOA in regards to the ability to compute extreme values; however, they were both better than the artificial fish swarm algorithm and FOA. In addition, the MFOA performed better than the particle swarm optimization in regards to the algorithm execution speed, and the forecast ability of the forecasting model built using the MFOA's GRNN parameters was better than that of the other three forecasting models.

  18. Designing for deeper learning in a blended computer science course for middle school students

    NASA Astrophysics Data System (ADS)

    Grover, Shuchi; Pea, Roy; Cooper, Stephen

    2015-04-01

    The focus of this research was to create and test an introductory computer science course for middle school. Titled "Foundations for Advancing Computational Thinking" (FACT), the course aims to prepare and motivate middle school learners for future engagement with algorithmic problem solving. FACT was also piloted as a seven-week course on Stanford's OpenEdX MOOC platform for blended in-class learning. Unique aspects of FACT include balanced pedagogical designs that address the cognitive, interpersonal, and intrapersonal aspects of "deeper learning"; a focus on pedagogical strategies for mediating and assessing for transfer from block-based to text-based programming; curricular materials for remedying misperceptions of computing; and "systems of assessments" (including formative and summative quizzes and tests, directed as well as open-ended programming assignments, and a transfer test) to get a comprehensive picture of students' deeper computational learning. Empirical investigations, accomplished over two iterations of a design-based research effort with students (aged 11-14 years) in a public school, sought to examine student understanding of algorithmic constructs, and how well students transferred this learning from Scratch to text-based languages. Changes in student perceptions of computing as a discipline were measured. Results and mixed-method analyses revealed that students in both studies (1) achieved substantial learning gains in algorithmic thinking skills, (2) were able to transfer their learning from Scratch to a text-based programming context, and (3) achieved significant growth toward a more mature understanding of computing as a discipline. Factor analyses of prior computing experience, multivariate regression analyses, and qualitative analyses of student projects and artifact-based interviews were conducted to better understand the factors affecting learning outcomes. 
    Prior computing experiences (as measured by a pretest) and math ability were found to be strong predictors of learning outcomes.

  19. A General, Adaptive, Roadmap-Based Algorithm for Protein Motion Computation.

    PubMed

    Molloy, Kevin; Shehu, Amarda

    2016-03-01

    Precious information on protein function can be extracted from a detailed characterization of protein equilibrium dynamics. This remains elusive in wet and dry laboratories, as function-modulating transitions of a protein between functionally-relevant, thermodynamically-stable and meta-stable structural states often span disparate time scales. In this paper we propose a novel, robotics-inspired algorithm that circumvents time-scale challenges by drawing analogies between protein motion and robot motion. The algorithm adapts the popular roadmap-based framework in robot motion computation to handle the more complex protein conformation space and its underlying rugged energy surface. Given known structures representing stable and meta-stable states of a protein, the algorithm yields a time- and energy-prioritized list of transition paths between the structures, with each path represented as a series of conformations. The algorithm balances computational resources between a global search aimed at obtaining a global view of the network of protein conformations and their connectivity and a detailed local search focused on realizing such connections with physically-realistic models. Promising results are presented on a variety of proteins that demonstrate the general utility of the algorithm and its capability to improve the state of the art without employing system-specific insight.

  20. Localized Ambient Solidity Separation Algorithm Based Computer User Segmentation.

    PubMed

    Sun, Xiao; Zhang, Tongda; Chai, Yueting; Liu, Yi

    2015-01-01

    Most popular clustering methods make strong assumptions about the dataset. For example, k-means implicitly assumes that all clusters come from spherical Gaussian distributions which have different means but the same covariance. However, when dealing with datasets that have diverse distribution shapes or high dimensionality, these assumptions might not be valid anymore. In order to overcome this weakness, we proposed a new clustering algorithm named localized ambient solidity separation (LASS) algorithm, using a new isolation criterion called centroid distance. Compared with other density-based isolation criteria, our proposed centroid distance isolation criterion addresses the problem caused by high dimensionality and varying density. The experiment on a designed two-dimensional benchmark dataset shows that our proposed LASS algorithm not only inherits the advantage of the original dissimilarity increments clustering method to separate naturally isolated clusters but also can identify the clusters which are adjacent, overlapping, and under background noise. Finally, we compared our LASS algorithm with the dissimilarity increments clustering method on a massive computer user dataset with over two million records that contains demographic and behavioral information. The results show that the LASS algorithm works extremely well on this computer user dataset and can gain more knowledge from it.

  1. Localized Ambient Solidity Separation Algorithm Based Computer User Segmentation

    PubMed Central

    Sun, Xiao; Zhang, Tongda; Chai, Yueting; Liu, Yi

    2015-01-01

    Most popular clustering methods make strong assumptions about the dataset. For example, k-means implicitly assumes that all clusters come from spherical Gaussian distributions which have different means but the same covariance. However, when dealing with datasets that have diverse distribution shapes or high dimensionality, these assumptions might not be valid anymore. In order to overcome this weakness, we proposed a new clustering algorithm named localized ambient solidity separation (LASS) algorithm, using a new isolation criterion called centroid distance. Compared with other density-based isolation criteria, our proposed centroid distance isolation criterion addresses the problem caused by high dimensionality and varying density. The experiment on a designed two-dimensional benchmark dataset shows that our proposed LASS algorithm not only inherits the advantage of the original dissimilarity increments clustering method to separate naturally isolated clusters but also can identify the clusters which are adjacent, overlapping, and under background noise. Finally, we compared our LASS algorithm with the dissimilarity increments clustering method on a massive computer user dataset with over two million records that contains demographic and behavioral information. The results show that the LASS algorithm works extremely well on this computer user dataset and can gain more knowledge from it. PMID:26221133

  2. Optimized Laplacian image sharpening algorithm based on graphic processing unit

    NASA Astrophysics Data System (ADS)

    Ma, Tinghuai; Li, Lu; Ji, Sai; Wang, Xin; Tian, Yuan; Al-Dhelaan, Abdullah; Al-Rodhaan, Mznah

    2014-12-01

    In classical Laplacian image sharpening, all pixels are processed one by one, which leads to a large amount of computation. Traditional Laplacian sharpening processed on a CPU is considerably time-consuming, especially for large pictures. In this paper, we propose a parallel implementation of Laplacian sharpening based on Compute Unified Device Architecture (CUDA), which is a computing platform of Graphic Processing Units (GPU), and analyze the impact of picture size on performance and the relationship between data transfer time and parallel computing time. Further, according to the different features of different memory types, an improved scheme of our method is developed, which exploits shared memory in the GPU instead of global memory and further increases the efficiency. Experimental results show that the two novel algorithms outperform the traditional sequential method based on OpenCV in terms of computing speed.
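
    The per-pixel operation that the paper parallelizes can be written down in a few lines. The sketch below is a sequential reference in Python, not the authors' CUDA kernel; it assumes a 4-neighbour Laplacian, 8-bit grey levels, and that border pixels are simply copied.

```python
def laplacian_sharpen(image):
    """Sharpen a grey-level image (list of rows of ints in 0..255).

    For each interior pixel the 4-neighbour Laplacian
        L = up + down + left + right - 4 * centre
    is subtracted from the pixel, which boosts local contrast.
    Each output pixel depends only on the input image, so all pixels
    could be computed independently (one GPU thread per pixel).
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]          # border pixels are copied unchanged
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            lap = (image[r - 1][c] + image[r + 1][c] +
                   image[r][c - 1] + image[r][c + 1] - 4 * image[r][c])
            out[r][c] = max(0, min(255, image[r][c] - lap))  # clamp to 8 bits
    return out


if __name__ == "__main__":
    spot = [[10, 10, 10],
            [10, 20, 10],
            [10, 10, 10]]
    print(laplacian_sharpen(spot))
```

    Because neighbouring output pixels read overlapping input pixels, a GPU version benefits from staging a tile of the image in fast on-chip memory, which is the motivation for the shared-memory variant mentioned in the abstract.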

  3. RGCA: A Reliable GPU Cluster Architecture for Large-Scale Internet of Things Computing Based on Effective Performance-Energy Optimization

    PubMed Central

    Chen, Qingkui; Zhao, Deyu; Wang, Jingjuan

    2017-01-01

    This paper aims to develop a low-cost, high-performance and high-reliability computing system to process large-scale data using common data mining algorithms in the Internet of Things (IoT) computing environment. Considering the characteristics of IoT data processing, similar to mainstream high performance computing, we use a GPU (Graphics Processing Unit) cluster to achieve better IoT services. Firstly, we present an energy consumption calculation method (ECCM) based on WSNs. Then, using the CUDA (Compute Unified Device Architecture) Programming model, we propose a Two-level Parallel Optimization Model (TLPOM) which exploits reasonable resource planning and common compiler optimization techniques to obtain the best blocks and threads configuration considering the resource constraints of each node. The key to this part is dynamic coupling Thread-Level Parallelism (TLP) and Instruction-Level Parallelism (ILP) to improve the performance of the algorithms without additional energy consumption. Finally, combining the ECCM and the TLPOM, we use the Reliable GPU Cluster Architecture (RGCA) to obtain a high-reliability computing system considering the nodes’ diversity, algorithm characteristics, etc. The results show that the performance of the algorithms significantly increased by 34.1%, 33.96% and 24.07% for Fermi, Kepler and Maxwell on average with TLPOM and the RGCA ensures that our IoT computing system provides low-cost and high-reliability services. PMID:28777325

  4. RGCA: A Reliable GPU Cluster Architecture for Large-Scale Internet of Things Computing Based on Effective Performance-Energy Optimization.

    PubMed

    Fang, Yuling; Chen, Qingkui; Xiong, Neal N; Zhao, Deyu; Wang, Jingjuan

    2017-08-04

    This paper aims to develop a low-cost, high-performance and high-reliability computing system to process large-scale data using common data mining algorithms in the Internet of Things (IoT) computing environment. Considering the characteristics of IoT data processing, similar to mainstream high performance computing, we use a GPU (Graphics Processing Unit) cluster to achieve better IoT services. Firstly, we present an energy consumption calculation method (ECCM) based on WSNs. Then, using the CUDA (Compute Unified Device Architecture) Programming model, we propose a Two-level Parallel Optimization Model (TLPOM) which exploits reasonable resource planning and common compiler optimization techniques to obtain the best blocks and threads configuration considering the resource constraints of each node. The key to this part is dynamic coupling Thread-Level Parallelism (TLP) and Instruction-Level Parallelism (ILP) to improve the performance of the algorithms without additional energy consumption. Finally, combining the ECCM and the TLPOM, we use the Reliable GPU Cluster Architecture (RGCA) to obtain a high-reliability computing system considering the nodes' diversity, algorithm characteristics, etc. The results show that the performance of the algorithms significantly increased by 34.1%, 33.96% and 24.07% for Fermi, Kepler and Maxwell on average with TLPOM and the RGCA ensures that our IoT computing system provides low-cost and high-reliability services.

  5. Computationally Efficient Power Allocation Algorithm in Multicarrier-Based Cognitive Radio Networks: OFDM and FBMC Systems

    NASA Astrophysics Data System (ADS)

    Shaat, Musbah; Bader, Faouzi

    2010-12-01

    Cognitive Radio (CR) systems have been proposed to increase spectrum utilization by opportunistically accessing the unused spectrum. Multicarrier communication systems are promising candidates for CR systems. Due to its high spectral efficiency, filter bank multicarrier (FBMC) can be considered as an alternative to conventional orthogonal frequency division multiplexing (OFDM) for transmission over CR networks. This paper addresses the problem of resource allocation in multicarrier-based CR networks. The objective is to maximize the downlink capacity of the network under constraints on both the total power and the interference introduced to the primary users (PUs). The optimal solution has high computational complexity, which makes it unsuitable for practical applications, and hence a low-complexity suboptimal solution is proposed. The proposed algorithm utilizes the spectrum holes in PU bands as well as active PU bands. The performance of the proposed algorithm is investigated for OFDM- and FBMC-based CR systems. Simulation results illustrate that the proposed resource allocation algorithm with low computational complexity achieves near-optimal performance and demonstrates the efficiency of using FBMC in the CR context.

  6. A parallel simulated annealing algorithm for standard cell placement on a hypercube computer

    NASA Technical Reports Server (NTRS)

    Jones, Mark Howard

    1987-01-01

    A parallel version of a simulated annealing algorithm is presented which is targeted to run on a hypercube computer. A strategy for mapping the cells in a two dimensional area of a chip onto processors in an n-dimensional hypercube is proposed such that both small and large distance moves can be applied. Two types of moves are allowed: cell exchanges and cell displacements. The computation of the cost function in parallel among all the processors in the hypercube is described along with a distributed data structure that needs to be stored in the hypercube to support parallel cost evaluation. A novel tree broadcasting strategy is used extensively in the algorithm for updating cell locations in the parallel environment. Studies on the performance of the algorithm on example industrial circuits show that it is faster and gives better final placement results than the uniprocessor simulated annealing algorithms. An improved uniprocessor algorithm is proposed which is based on the improved results obtained from parallelization of the simulated annealing algorithm.

  7. Performance of fusion algorithms for computer-aided detection and classification of mines in very shallow water obtained from testing in navy Fleet Battle Exercise-Hotel 2000

    NASA Astrophysics Data System (ADS)

    Ciany, Charles M.; Zurawski, William; Kerfoot, Ian

    2001-10-01

    The performance of Computer Aided Detection/Computer Aided Classification (CAD/CAC) Fusion algorithms on side-scan sonar images was evaluated using data taken at the Navy's Fleet Battle Exercise-Hotel held in Panama City, Florida, in August 2000. A 2-of-3 binary fusion algorithm is shown to provide robust performance. The algorithm accepts the classification decisions and associated contact locations from three different CAD/CAC algorithms, clusters the contacts based on Euclidean distance, and then declares a valid target when a clustered contact is declared by at least 2 of the 3 individual algorithms. This simple binary fusion provided a 96 percent probability of correct classification at a false alarm rate of 0.14 false alarms per image per side. The performance represented a 3.8:1 reduction in false alarms over the best performing single CAD/CAC algorithm, with no loss in probability of correct classification.
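
    The 2-of-3 voting rule in this abstract can be illustrated with a short Python sketch. The abstract does not say how contacts are clustered beyond the use of Euclidean distance, so the greedy centroid-based grouping, the radius parameter, and the function name fuse_two_of_three below are assumptions for illustration only.

```python
from math import hypot


def centroid(cluster):
    """Mean (x, y) position of the contacts in a cluster."""
    n = len(cluster)
    return (sum(c[1] for c in cluster) / n, sum(c[2] for c in cluster) / n)


def fuse_two_of_three(contacts, radius):
    """contacts: list of (algorithm_id, x, y); returns fused target positions.

    Assumed clustering rule: a contact joins the first cluster whose current
    centroid lies within `radius`; otherwise it starts a new cluster.
    """
    clusters = []
    for contact in contacts:
        _, x, y = contact
        for cluster in clusters:
            cx, cy = centroid(cluster)
            if hypot(x - cx, y - cy) <= radius:
                cluster.append(contact)
                break
        else:
            clusters.append([contact])
    # Declare a target only if at least 2 distinct algorithms voted for it.
    return [centroid(c) for c in clusters if len({a for a, _, _ in c}) >= 2]


contacts = [("A", 10.0, 10.0), ("B", 10.5, 9.5), ("C", 40.0, 40.0),
            ("A", 70.0, 20.0), ("C", 70.2, 20.1), ("B", 69.8, 19.9)]
targets = fuse_two_of_three(contacts, radius=2.0)
```

    In this example the isolated contact reported only by algorithm "C" is rejected, which is how the voting step suppresses single-algorithm false alarms.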

  8. Application-oriented offloading in heterogeneous networks for mobile cloud computing

    NASA Astrophysics Data System (ADS)

    Tseng, Fan-Hsun; Cho, Hsin-Hung; Chang, Kai-Di; Li, Jheng-Cong; Shih, Timothy K.

    2018-04-01

    Internet applications have become so complicated that mobile devices need more computing resources to shorten execution time, yet they are restricted by limited battery capacity. Mobile cloud computing (MCC) has emerged to tackle the finite-resource problem of mobile devices. MCC offloads the tasks and jobs of mobile devices to cloud and fog environments by using an offloading scheme. It is vital to MCC to decide which tasks should be offloaded and how to offload them efficiently. In this paper, we formulate the offloading problem between a mobile device and a cloud data center and propose two application-oriented algorithms for minimum execution time, i.e., the Minimum Offloading Time for Mobile device (MOTM) algorithm and the Minimum Execution Time for Cloud data center (METC) algorithm. The MOTM algorithm minimizes offloading time by selecting appropriate offloading links based on application categories. The METC algorithm minimizes execution time in the cloud data center by selecting virtual and physical machines that match the resource requirements of applications. Simulation results show that the proposed mechanism not only minimizes total execution time for mobile devices but also decreases their energy consumption.

  9. Structure-guided Protein Transition Modeling with a Probabilistic Roadmap Algorithm.

    PubMed

    Maximova, Tatiana; Plaku, Erion; Shehu, Amarda

    2016-07-07

    Proteins are macromolecules in perpetual motion, switching between structural states to modulate their function. A detailed characterization of the precise yet complex relationship between protein structure, dynamics, and function requires elucidating transitions between functionally-relevant states. Doing so challenges both wet and dry laboratories, as protein dynamics involves disparate temporal scales. In this paper we present a novel, sampling-based algorithm to compute transition paths. The algorithm exploits two main ideas. First, it leverages known structures to initialize its search and define a reduced conformation space for rapid sampling. This is key to address the insufficient sampling issue suffered by sampling-based algorithms. Second, the algorithm embeds samples in a nearest-neighbor graph where transition paths can be efficiently computed via queries. The algorithm adapts the probabilistic roadmap framework that is popular in robot motion planning. In addition to efficiently computing lowest-cost paths between any given structures, the algorithm allows investigating hypotheses regarding the order of experimentally-known structures in a transition event. This novel contribution is likely to open up new venues of research. Detailed analysis is presented on multiple-basin proteins of relevance to human disease. Multiscaling and the AMBER ff14SB force field are used to obtain energetically-credible paths at atomistic detail.

  10. Loci-STREAM Version 0.9

    NASA Technical Reports Server (NTRS)

    Wright, Jeffrey; Thakur, Siddharth

    2006-01-01

    Loci-STREAM is an evolving computational fluid dynamics (CFD) software tool for simulating possibly chemically reacting, possibly unsteady flows in diverse settings, including rocket engines, turbomachines, oil refineries, etc. Loci-STREAM implements a pressure-based flow-solving algorithm that utilizes unstructured grids. (The benefit of low memory usage by pressure-based algorithms is well recognized by experts in the field.) The algorithm is robust for flows at all speeds from zero to hypersonic. The flexibility of arbitrary polyhedral grids enables accurate, efficient simulation of flows in complex geometries, including those of plume-impingement problems. The present version - Loci-STREAM version 0.9 - includes an interface with the Portable, Extensible Toolkit for Scientific Computation (PETSc) library for access to enhanced linear-equation-solving programs therein that accelerate convergence toward a solution. The name "Loci" reflects the creation of this software within the Loci computational framework, which was developed at Mississippi State University for the primary purpose of simplifying the writing of complex multidisciplinary application programs to run in distributed-memory computing environments including clusters of personal computers. Loci has been designed to relieve application programmers of the details of programming for distributed-memory computers.

  11. SPHINX--an algorithm for taxonomic binning of metagenomic sequences.

    PubMed

    Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Singh, Nitin Kumar; Mande, Sharmila S

    2011-01-01

    Compared with composition-based binning algorithms, the binning accuracy and specificity of alignment-based binning algorithms are significantly higher. However, being alignment-based, the latter class of algorithms requires an enormous amount of time and computing resources for binning huge metagenomic datasets. The motivation was to develop a binning approach that can analyze metagenomic datasets as rapidly as composition-based approaches, but nevertheless has the accuracy and specificity of alignment-based algorithms. This article describes a hybrid binning approach (SPHINX) that achieves high binning efficiency by utilizing the principles of both 'composition'- and 'alignment'-based binning algorithms. Validation results with simulated sequence datasets indicate that SPHINX is able to analyze metagenomic sequences as rapidly as composition-based algorithms. Furthermore, the binning efficiency (in terms of accuracy and specificity of assignments) of SPHINX is observed to be comparable with results obtained using alignment-based algorithms. A web server for the SPHINX algorithm is available at http://metagenomics.atc.tcs.com/SPHINX/.

  12. Quantum computation for solving linear systems

    NASA Astrophysics Data System (ADS)

    Cao, Yudong

    Quantum computation is a subject born out of the combination of physics and computer science. It studies how the laws of quantum mechanics can be exploited to perform computations much more efficiently than current computers (termed classical computers, as opposed to quantum computers). The thesis starts by introducing ideas from quantum physics and theoretical computer science and, based on these ideas, introduces the basic concepts of quantum computing. These introductory discussions are intended to give non-specialists the essential knowledge needed for understanding the new results presented in the subsequent chapters. After introducing the basics of quantum computing, we focus on the recently proposed quantum algorithm for linear systems. The new results include i) special instances of quantum circuits that can be implemented using current experimental resources; ii) detailed quantum algorithms that are suitable for a broader class of linear systems. We show that for some particular problems the quantum algorithm is able to achieve exponential speedup over its classical counterparts.

  13. OpenCL-based vicinity computation for 3D multiresolution mesh compression

    NASA Astrophysics Data System (ADS)

    Hachicha, Soumaya; Elkefi, Akram; Ben Amar, Chokri

    2017-03-01

    3D multiresolution mesh compression systems are still widely addressed in many domains. These systems increasingly require volumetric data to be processed in real time. Therefore, performance is becoming constrained by hardware resource usage and the need for an overall reduction in computational time. In this paper, our contribution lies entirely in computing, in real time, the triangle neighborhoods of 3D progressive meshes for a robust compression algorithm based on the scan-based wavelet transform (WT) technique. The originality of this latter algorithm is to compute the WT with minimum memory usage by processing data as they are acquired. However, with large data, this technique is considered poor in terms of computational complexity. For that reason, this work exploits the GPU to accelerate the computation using OpenCL as a heterogeneous programming language. Experiments demonstrate that, aside from the portability across various platforms and the flexibility guaranteed by the OpenCL-based implementation, this method achieves a speedup factor of 5 compared to the sequential CPU implementation.
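
    As a rough illustration of what a triangle neighborhood computation involves, the following Python sketch finds, for each triangle of an indexed mesh, the triangles that share an edge with it. The abstract does not define its vicinity criterion, so edge sharing is an assumption here, as is the function name triangle_neighbors; the sketch is sequential and is not the authors' OpenCL kernel.

```python
from collections import defaultdict


def triangle_neighbors(triangles):
    """triangles: list of (v0, v1, v2) vertex-index triples.

    Returns, for each triangle, the sorted indices of triangles sharing an
    edge with it (assumed definition of "neighborhood").
    """
    edge_to_triangles = defaultdict(list)
    for index, (a, b, c) in enumerate(triangles):
        for u, v in ((a, b), (b, c), (c, a)):
            # Store edges with ordered endpoints so (u, v) matches (v, u).
            edge_to_triangles[(min(u, v), max(u, v))].append(index)
    neighbors = [set() for _ in triangles]
    for sharing in edge_to_triangles.values():
        for index in sharing:
            neighbors[index].update(o for o in sharing if o != index)
    return [sorted(n) for n in neighbors]


strip = [(0, 1, 2), (1, 3, 2), (2, 3, 4)]
adjacency = triangle_neighbors(strip)
```

    Building the edge map takes one pass over the triangles, and each triangle's lookup is independent of the others, which is the kind of work that maps naturally onto a GPU.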

  14. FOCuS: a metaheuristic algorithm for computing knockouts from genome-scale models for strain optimization.

    PubMed

    Mutturi, Sarma

    2017-06-27

    Although a handful of tools are available for constraint-based flux analysis to generate knockout strains, most of these are based on either bilevel-MIP or its modifications. However, metaheuristic approaches, which are known for their flexibility and scalability, have been less studied. Moreover, in the existing tools, sectioning of the search space to find optimal knockouts has not been considered. Herein, a novel computational procedure, termed FOCuS (Flower-pOllination coupled Clonal Selection algorithm), was developed to find the optimal reaction knockouts from a metabolic network to maximize the production of specific metabolites. FOCuS derives its benefits from the nature-inspired flower pollination algorithm and the artificial immune system-inspired clonal selection algorithm to converge to an optimal solution. To evaluate the performance of FOCuS, reported results obtained from both MIP and other metaheuristic-based tools were compared in selected case studies. The results demonstrated the robustness of FOCuS irrespective of the size of the metabolic network and the number of knockouts. Moreover, sectioning of the search space, coupled with pooling of priority reactions based on their contribution to the objective function to generate a smaller search space, significantly reduced the computational time.

  15. Layout Study and Application of Mobile App Recommendation Approach Based On Spark Streaming Framework

    NASA Astrophysics Data System (ADS)

    Wang, H. T.; Chen, T. T.; Yan, C.; Pan, H.

    2018-05-01

    For mobile app recommendation, we combine the weighted Slope One algorithm with the item-based collaborative filtering algorithm to further improve on the cold-start, data matrix sparseness, and other problems of the traditional collaborative filtering algorithm. The recommendation algorithm is parallelized on the Spark platform, and the Spark Streaming real-time computing framework is introduced to improve the timeliness of app recommendations.
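
    Weighted Slope One, the algorithm named in this abstract, is simple enough to sketch in plain Python. The example below is the textbook single-machine form, not the authors' Spark implementation or their combination with item-based collaborative filtering; the function names and the toy ratings are hypothetical.

```python
from collections import defaultdict


def build_deviations(ratings):
    """ratings: {user: {item: rating}}.

    Returns (dev, count), where dev[(j, i)] is the average of r_j - r_i over
    users who rated both items and count[(j, i)] is the number of such users.
    """
    diff_sum = defaultdict(float)
    count = defaultdict(int)
    for user_ratings in ratings.values():
        for j, r_j in user_ratings.items():
            for i, r_i in user_ratings.items():
                if i != j:
                    diff_sum[(j, i)] += r_j - r_i
                    count[(j, i)] += 1
    dev = {pair: diff_sum[pair] / count[pair] for pair in count}
    return dev, dict(count)


def predict(ratings, dev, count, user, target):
    """Weighted Slope One prediction of `user`'s rating for item `target`."""
    numerator, denominator = 0.0, 0
    for i, r_i in ratings[user].items():
        n = count.get((target, i), 0)
        if i != target and n:
            # Each co-rated item votes, weighted by how many users support it.
            numerator += (dev[(target, i)] + r_i) * n
            denominator += n
    return numerator / denominator if denominator else None


ratings = {"john": {"A": 5, "B": 3, "C": 2},
           "mark": {"A": 3, "B": 4},
           "lucy": {"B": 2, "C": 5}}
dev, count = build_deviations(ratings)
lucy_a = predict(ratings, dev, count, "lucy", "A")
```

    The function returns None when no co-rated item exists, which is the cold-start case that the abstract sets out to mitigate.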

  16. Algorithm-Based Fault Tolerance for Numerical Subroutines

    NASA Technical Reports Server (NTRS)

    Tumon, Michael; Granat, Robert; Lou, John

    2007-01-01

    A software library implements a new methodology of detecting faults in numerical subroutines, thus enabling application programs that contain the subroutines to recover transparently from single-event upsets. The software library in question is fault-detecting middleware that is wrapped around the numerical subroutines. Conventional serial versions (based on LAPACK and FFTW) and a parallel version (based on ScaLAPACK) exist. The source code of the application program that contains the numerical subroutines is not modified, and the middleware is transparent to the user. The methodology used is a type of algorithm-based fault tolerance (ABFT). In ABFT, a checksum is computed before a computation and compared with the checksum of the computational result; an error is declared if the difference between the checksums exceeds some threshold. Novel normalization methods are used in the checksum comparison to ensure correct fault detections independent of algorithm inputs. In tests of this software reported in the peer-reviewed literature, this library was shown to enable detection of 99.9 percent of significant faults while generating no false alarms.
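
    The checksum comparison described for ABFT can be illustrated on matrix multiplication, where the column sums of the product are predictable from the inputs. The Python sketch below is not the library's code: the function names, the tolerance, and the simple magnitude-based scaling are placeholder assumptions and do not reproduce the normalization methods mentioned in the abstract.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def column_sums(m):
    return [sum(column) for column in zip(*m)]


def checked_matmul(a, b, multiply=matmul, tol=1e-9):
    """Return (product, fault_detected) using a column-checksum test.

    The checksum predicted from the inputs, (1^T A) B, must equal the column
    sums of the computed product. `multiply` is the routine under test.
    """
    expected = matmul([column_sums(a)], b)[0]   # computed before the call
    product = multiply(a, b)
    actual = column_sums(product)
    scale = max(1.0, max(abs(v) for v in expected))  # assumed normalization
    fault = any(abs(e - s) > tol * scale for e, s in zip(expected, actual))
    return product, fault


def faulty_multiply(a, b):
    """Simulates an upset by corrupting one element of the result."""
    product = matmul(a, b)
    product[0][1] += 1.0
    return product


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
_, clean_fault = checked_matmul(a, b)
_, injected_fault = checked_matmul(a, b, multiply=faulty_multiply)
```

    Because the check wraps the multiplication routine rather than changing it, the calling code stays the same, which mirrors the middleware arrangement the abstract describes.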

  17. Toward cognitive pipelines of medical assistance algorithms.

    PubMed

    Philipp, Patrick; Maleshkova, Maria; Katic, Darko; Weber, Christian; Götz, Michael; Rettinger, Achim; Speidel, Stefanie; Kämpgen, Benedikt; Nolden, Marco; Wekerle, Anna-Laura; Dillmann, Rüdiger; Kenngott, Hannes; Müller, Beat; Studer, Rudi

    2016-09-01

    Assistance algorithms for medical tasks have great potential to support physicians with their daily work. However, medicine is also one of the most demanding domains for computer-based support systems, since medical assistance tasks are complex and the practical experience of the physician is crucial. Recent developments in the area of cognitive computing appear to be well suited to tackle medicine as an application domain. We propose a system based on the idea of cognitive computing and consisting of auto-configurable medical assistance algorithms and their self-adapting combination. The system enables automatic execution of new algorithms, given they are made available as Medical Cognitive Apps and are registered in a central semantic repository. Learning components can be added to the system to optimize the results in cases where numerous Medical Cognitive Apps are available for the same task. Our prototypical implementation is applied to the areas of surgical phase recognition based on sensor data and image processing for tumor progression mappings. Our results suggest that such assistance algorithms can be automatically configured in execution pipelines, candidate results can be automatically scored and combined, and the system can learn from experience. Furthermore, our evaluation shows that the Medical Cognitive Apps provide the same correct results as they did for local execution and run in a reasonable amount of time. The proposed solution is applicable to a variety of medical use cases and effectively supports the automated and self-adaptive configuration of cognitive pipelines based on medical interpretation algorithms.

  18. Identifying online user reputation of user-object bipartite networks

    NASA Astrophysics Data System (ADS)

    Liu, Xiao-Lu; Liu, Jian-Guo; Yang, Kai; Guo, Qiang; Han, Jing-Ti

    2017-02-01

    Identifying online user reputation based on the rating information of the user-object bipartite networks is important for understanding online user collective behaviors. Based on the Bayesian analysis, we present a parameter-free algorithm for ranking online user reputation, where the user reputation is calculated based on the probability that their ratings are consistent with the main part of all user opinions. The experimental results show that the AUC values of the presented algorithm could reach 0.8929 and 0.8483 for the MovieLens and Netflix data sets, respectively, which is better than the results generated by the CR and IARR methods. Furthermore, the experimental results for different user groups indicate that the presented algorithm outperforms the iterative ranking methods in both ranking accuracy and computation complexity. Moreover, the results for the synthetic networks show that the computation complexity of the presented algorithm is a linear function of the network size, which suggests that the presented algorithm is very effective and efficient for the large scale dynamic online systems.

  19. Fast intersection detection algorithm for PC-based robot off-line programming

    NASA Astrophysics Data System (ADS)

    Fedrowitz, Christian H.

    1994-11-01

    This paper presents a method for fast and reliable collision detection in complex production cells. The algorithm is part of the PC-based robot off-line programming system of the University of Siegen (Ropsus). The method is based on a solid model which is managed by a simplified constructive solid geometry model (CSG-model). The collision detection problem is divided into two steps. In the first step the complexity of the problem is reduced in linear time. In the second step the remaining solids are tested for intersection. For this the Simplex algorithm, which is known from linear optimization, is used. It computes a point which is common to two convex polyhedra. The polyhedra intersect if such a point exists. Given the simplified geometrical model of Ropsus, the algorithm also runs in linear time. In conjunction with the first step, this yields a collision detection algorithm that requires linear time overall. Moreover, it computes the resultant intersection polyhedron using the dual transformation.

  20. High-speed parallel implementation of a modified PBR algorithm on DSP-based EH topology

    NASA Astrophysics Data System (ADS)

    Rajan, K.; Patnaik, L. M.; Ramakrishna, J.

    1997-08-01

    Algebraic Reconstruction Technique (ART) is an age-old method used for solving the problem of three-dimensional (3-D) reconstruction from projections in electron microscopy and radiology. In medical applications, direct 3-D reconstruction is at the forefront of investigation. The simultaneous iterative reconstruction technique (SIRT) is an ART-type algorithm with the potential of generating in a few iterations tomographic images of a quality comparable to that of convolution backprojection (CBP) methods. Pixel-based reconstruction (PBR) is similar to SIRT reconstruction, and it has been shown that PBR algorithms give better quality pictures compared to those produced by SIRT algorithms. In this work, we propose a few modifications to the PBR algorithms. The modified algorithms are shown to give better quality pictures compared to PBR algorithms. The PBR algorithm and the modified PBR algorithms are highly compute intensive. Not many attempts have been made to reconstruct objects in the true 3-D sense because of the high computational overhead. In this study, we have developed parallel two-dimensional (2-D) and 3-D reconstruction algorithms based on modified PBR. We attempt to solve the two problems encountered by the PBR and modified PBR algorithms, i.e., the long computational time and the large memory requirements, by parallelizing the algorithm on a multiprocessor system. We investigate the possible task and data partitioning schemes by exploiting the potential parallelism in the PBR algorithm subject to minimizing the memory requirement. We have implemented an extended hypercube (EH) architecture for the high-speed execution of the 3-D reconstruction algorithm using the commercially available fast floating point digital signal processor (DSP) chips as the processing elements (PEs) and dual-port random access memories (DPR) as channels between the PEs. We discuss and compare the performances of the PBR algorithm on an IBM 6000 RISC workstation, on a Silicon Graphics Indigo 2 workstation, and on an EH system. The results show that an EH(3,1) using DSP chips as PEs executes the modified PBR algorithm about 100 times faster than an IBM 6000 RISC workstation. We have executed the algorithms on a 4-node IBM SP2 parallel computer. The results show that execution time of the algorithm on an EH(3,1) is better than that of a 4-node IBM SP2 system. The speed-up of an EH(3,1) system with eight PEs and one network controller is approximately 7.85.
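
    For readers unfamiliar with the ART family, the following Python sketch shows one common form of the SIRT update that the abstract uses as its reference point. It is neither the PBR algorithm nor the modified PBR algorithm of the paper; the row-sum and column-sum weighting, the relaxation factor, the function name sirt, and the toy 2x2 "image" are assumptions chosen for illustration.

```python
def sirt(a, b, iterations=100, relax=1.0):
    """Solve A x ~= b with a SIRT-style simultaneous update.

    Assumed form: x <- x + relax * C A^T R (b - A x), where R and C hold the
    inverse row sums and inverse column sums of A. A is a list of rows with
    non-negative entries and no all-zero row or column.
    """
    rows, cols = len(a), len(a[0])
    row_sums = [sum(row) for row in a]
    col_sums = [sum(a[i][j] for i in range(rows)) for j in range(cols)]
    x = [0.0] * cols
    for _ in range(iterations):
        # Residual of every projection ray, normalized by its row sum.
        residual = [(b[i] - sum(a[i][j] * x[j] for j in range(cols)))
                    / row_sums[i] for i in range(rows)]
        # Back-project all residuals at once (the "simultaneous" part).
        for j in range(cols):
            back = sum(a[i][j] * residual[i] for i in range(rows))
            x[j] += relax * back / col_sums[j]
    return x


# Toy 2x2 image [x0 x1; x2 x3] observed through its row and column sums.
a = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0], [0, 1, 0, 1]]
b = [3.0, 7.0, 4.0, 6.0]
estimate = sirt(a, b)
```

    Every pixel update within an iteration reads only the residuals computed at the start of that iteration, so the pixels can be partitioned across processors, which is the kind of data parallelism the abstract exploits.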

  1. BWM*: A Novel, Provable, Ensemble-based Dynamic Programming Algorithm for Sparse Approximations of Computational Protein Design.

    PubMed

    Jou, Jonathan D; Jain, Swati; Georgiev, Ivelin S; Donald, Bruce R

    2016-06-01

    Sparse energy functions that ignore long range interactions between residue pairs are frequently used by protein design algorithms to reduce computational cost. Current dynamic programming algorithms that fully exploit the optimal substructure produced by these energy functions only compute the GMEC. This disproportionately favors the sequence of a single, static conformation and overlooks better binding sequences with multiple low-energy conformations. Provable, ensemble-based algorithms such as A* avoid this problem, but A* cannot guarantee better performance than exhaustive enumeration. We propose a novel, provable, dynamic programming algorithm called Branch-Width Minimization* (BWM*) to enumerate a gap-free ensemble of conformations in order of increasing energy. Given a branch-decomposition of branch-width w for an n-residue protein design with at most q discrete side-chain conformations per residue, BWM* returns the sparse GMEC in O([Formula: see text]) time and enumerates each additional conformation in merely O([Formula: see text]) time. We define a new measure, Total Effective Search Space (TESS), which can be computed efficiently a priori before BWM* or A* is run. We ran BWM* on 67 protein design problems and found that TESS discriminated between BWM*-efficient and A*-efficient cases with 100% accuracy. As predicted by TESS and validated experimentally, BWM* outperforms A* in 73% of the cases and computes the full ensemble or a close approximation faster than A*, enumerating each additional conformation in milliseconds. Unlike A*, the performance of BWM* can be predicted in polynomial time before running the algorithm, which gives protein designers the power to choose the most efficient algorithm for their particular design problem.

  2. Algorithms for Computation of Fundamental Properties of Seawater. Endorsed by Unesco/SCOR/ICES/IAPSO Joint Panel on Oceanographic Tables and Standards and SCOR Working Group 51. Unesco Technical Papers in Marine Science, No. 44.

    ERIC Educational Resources Information Center

    Fofonoff, N. P.; Millard, R. C., Jr.

    Algorithms for computation of fundamental properties of seawater, based on the practical salinity scale (PSS-78) and the international equation of state for seawater (EOS-80), are compiled in the present report for implementing and standardizing computer programs for oceanographic data processing. Sample FORTRAN subprograms and tables are given…

  3. Aeroelastic Deflection of NURBS Geometry

    NASA Technical Reports Server (NTRS)

    Samareh, Jamshid A.

    1998-01-01

    The purpose of this paper is to present an algorithm for using NonUniform Rational B-Spline (NURBS) representation in an aeroelastic loop. The algorithm is based on creating a least-squares NURBS surface representing the aeroelastic deflection. The resulting NURBS surfaces are used to update either the original Computer-Aided Design (CAD) model, the Computational Structural Mechanics (CSM) grid, or the Computational Fluid Dynamics (CFD) grid. Results are presented for a generic High-Speed Civil Transport (HSCT).

  4. Fast online and index-based algorithms for approximate search of RNA sequence-structure patterns

    PubMed Central

    2013-01-01

    Background: It is well known that the search for homologous RNAs is more effective if both sequence and structure information is incorporated into the search. However, current tools for searching with RNA sequence-structure patterns cannot fully handle mutations occurring on both these levels or are simply not fast enough for searching large sequence databases because of the high computational costs of the underlying sequence-structure alignment problem. Results: We present new fast index-based and online algorithms for approximate matching of RNA sequence-structure patterns supporting a full set of edit operations on single bases and base pairs. Our methods efficiently compute semi-global alignments of structural RNA patterns and substrings of the target sequence whose costs satisfy a user-defined sequence-structure edit distance threshold. For this purpose, we introduce a new computing scheme to optimally reuse the entries of the required dynamic programming matrices for all substrings and combine it with a technique for avoiding the alignment computation of non-matching substrings. Our new index-based methods exploit suffix arrays preprocessed from the target database and achieve running times that are sublinear in the size of the searched sequences. To support the description of RNA molecules that fold into complex secondary structures with multiple ordered sequence-structure patterns, we use fast algorithms for the local or global chaining of approximate sequence-structure pattern matches. The chaining step removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our improved online algorithm is faster than the best previous method by up to a factor of 45. Our best new index-based algorithm achieves a speedup of a factor of 560. Conclusions: The presented methods achieve considerable speedups compared to the best previous method. This, together with the expected sublinear running time of the presented index-based algorithms, allows for the first time approximate matching of RNA sequence-structure patterns in large sequence databases. Beyond the algorithmic contributions, we provide with RaligNAtor a robust and well documented open-source software package implementing the algorithms presented in this manuscript. The RaligNAtor software is available at http://www.zbh.uni-hamburg.de/ralignator. PMID:23865810
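
    The semi-global matching idea in this abstract can be illustrated with a much simpler, sequence-only sketch. It is not the RaligNAtor algorithm: it ignores base pairs and suffix arrays entirely and uses unit edit costs, and the function name `approximate_matches` is hypothetical. It only shows how a dynamic programming column can report every text position where the pattern matches some substring within a cost threshold.

```python
def approximate_matches(pattern, text, max_cost):
    """Return (end_position, cost) for each text position where some
    substring ending there matches `pattern` within `max_cost` edits."""
    m = len(pattern)
    # prev[i] = cheapest cost of aligning pattern[:i] to a substring
    # that ends at the previous text position.
    prev = list(range(m + 1))
    hits = []
    for j, ch in enumerate(text, start=1):
        cur = [0] * (m + 1)  # cur[0] = 0: a match may start anywhere (semi-global)
        for i in range(1, m + 1):
            cur[i] = min(
                prev[i - 1] + (pattern[i - 1] != ch),  # match or replacement
                prev[i] + 1,     # extra text character
                cur[i - 1] + 1,  # pattern character left unmatched
            )
        if cur[m] <= max_cost:
            hits.append((j, cur[m]))
        prev = cur
    return hits
```

    For example, `approximate_matches("GGCAC", "AAGGCACUU", 0)` reports a single exact occurrence ending at position 7 (1-based). Raising `max_cost` also admits occurrences with replaced, inserted, or deleted bases.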

  5. A parallel algorithm for the initial screening of space debris collisions prediction using the SGP4/SDP4 models and GPU acceleration

    NASA Astrophysics Data System (ADS)

    Lin, Mingpei; Xu, Ming; Fu, Xiaoyu

    2017-05-01

    Currently, a tremendous amount of space debris in Earth's orbit imperils operational spacecraft. It is essential to undertake risk assessments of collisions and predict dangerous encounters in space. However, collision predictions for an enormous amount of space debris give rise to large-scale computations. In this paper, a parallel algorithm is established on the Compute Unified Device Architecture (CUDA) platform of NVIDIA Corporation for collision prediction. According to the parallel structure of NVIDIA graphics processors, a block decomposition strategy is adopted in the algorithm. Space debris is divided into batches, and the computation and data transfer operations of adjacent batches overlap. As a consequence, the latency to access shared memory during the entire computing process is significantly reduced, and a higher computing speed is reached. Theoretically, a simulation of collision prediction for space debris of any amount and for any time span can be executed. To verify this algorithm, a simulation example including 1382 pieces of debris, whose operational time scales vary from 1 min to 3 days, is conducted on Tesla C2075 of NVIDIA. The simulation results demonstrate that with the same computational accuracy as that of a CPU, the computing speed of the parallel algorithm on a GPU is 30 times that on a CPU. Based on this algorithm, collision prediction of over 150 Chinese spacecraft for a time span of 3 days can be completed in less than 3 h on a single computer, which meets the timeliness requirement of the initial screening task. Furthermore, the algorithm can be adapted for multiple tasks, including particle filtration, constellation design, and Monte-Carlo simulation of an orbital computation.

  6. Multiscale computations with a wavelet-adaptive algorithm

    NASA Astrophysics Data System (ADS)

    Rastigejev, Yevgenii Anatolyevich

    A wavelet-based adaptive multiresolution algorithm for the numerical solution of multiscale problems governed by partial differential equations is introduced. The main features of the method include fast algorithms for the calculation of wavelet coefficients and approximation of derivatives on nonuniform stencils. The connection between the wavelet order and the size of the stencil is established. The algorithm is based on mathematically well-established wavelet theory. This allows us to provide error estimates of the solution, which are used in conjunction with an appropriate threshold criterion to adapt the collocation grid. Efficient data structures for grid representation, as well as related computational algorithms to support the grid rearrangement procedure, are developed. The algorithm is applied to the simulation of phenomena described by the Navier-Stokes equations. First, we undertake the study of the ignition and subsequent viscous detonation of a H2 : O2 : Ar mixture in a one-dimensional shock tube. Subsequently, we apply the algorithm to solve the two- and three-dimensional benchmark problem of incompressible flow in a lid-driven cavity at large Reynolds numbers. For these cases we show that solutions of accuracy comparable to the benchmarks are obtained with more than an order of magnitude reduction in degrees of freedom. The simulations show the striking ability of the algorithm to adapt to a solution having different scales at different spatial locations so as to produce accurate results at a relatively low computational cost.

  7. Distributed Environment Control Using Wireless Sensor/Actuator Networks for Lighting Applications

    PubMed Central

    Nakamura, Masayuki; Sakurai, Atsushi; Nakamura, Jiro

    2009-01-01

    We propose a decentralized algorithm to calculate the control signals for lights in wireless sensor/actuator networks. This algorithm uses an appropriate step size in the iterative process used for quickly computing the control signals. We demonstrate the accuracy and efficiency of this approach compared with the penalty method by using Mote-based mesh sensor networks. The estimation error of the new approach is one-eighth as large as that of the penalty method with one-fifth of its computation time. In addition, we describe our sensor/actuator node for distributed lighting control based on the decentralized algorithm and demonstrate its practical efficacy. PMID:22291525

  8. Efficient rejection-based simulation of biochemical reactions with stochastic noise and delays

    NASA Astrophysics Data System (ADS)

    Thanh, Vo Hong; Priami, Corrado; Zunino, Roberto

    2014-10-01

    We propose a new exact stochastic rejection-based simulation algorithm for biochemical reactions and extend it to systems with delays. Our algorithm accelerates the simulation by pre-computing reaction propensity bounds to select the next reaction to perform. Exploiting such bounds, we are able to avoid recomputing propensities every time a (delayed) reaction is initiated or finished, as is typically necessary in standard approaches. Propensity updates in our approach are still performed, but only infrequently and limited to a small number of reactions, saving computation time without sacrificing exactness. We evaluate the performance improvement of our algorithm by experimenting with concrete biological models.
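
    The role of propensity bounds can be sketched on a toy system. The code below is an illustration under stated assumptions, not the authors' implementation: it handles only a reversible reaction A -> B, B -> A without delays, and the function name `simulate_rejection` and the bracket width `delta` are hypothetical. Candidates are drawn using upper bounds, and the exact propensity is evaluated only when the cheap lower-bound test fails.

```python
import math
import random


def simulate_rejection(x0, rates, t_end, delta=0.2, seed=1):
    """Toy rejection-based simulation of A -> B (reaction 0) and B -> A (reaction 1).

    Returns the final state and the number of exact propensity evaluations.
    """
    rng = random.Random(seed)
    changes = [(-1, +1), (+1, -1)]  # state change caused by each reaction
    x = list(x0)
    t = 0.0
    evaluations = 0

    def propensities(state):
        # Reaction j consumes one molecule of species j (first-order mass action).
        return [rates[j] * state[j] for j in range(2)]

    while True:
        # Bracket the current state and bound every propensity on that bracket.
        low = [math.floor((1 - delta) * n) for n in x]
        high = [math.ceil((1 + delta) * n) for n in x]
        a_low, a_up = propensities(low), propensities(high)
        total_up = sum(a_up)
        if total_up == 0:
            return x, evaluations
        # Reuse the bounds for as long as the state stays inside the bracket.
        while all(lo <= n <= hi for lo, n, hi in zip(low, x, high)):
            t += rng.expovariate(total_up)  # every trial advances the clock
            if t > t_end:
                return x, evaluations
            j = rng.choices([0, 1], weights=a_up)[0]  # candidate reaction
            threshold = rng.random() * a_up[j]
            accepted = threshold < a_low[j]  # cheap test, no evaluation needed
            if not accepted:
                evaluations += 1
                accepted = threshold < propensities(x)[j]  # exact test
            if accepted:
                x = [n + d for n, d in zip(x, changes[j])]
```

    Calling `simulate_rejection([50, 0], [1.0, 0.5], t_end=5.0)` returns a state whose molecule counts still sum to 50, together with the count of exact evaluations, which is what the bounds are meant to keep small.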

  9. Fast algorithm for bilinear transforms in optics

    NASA Astrophysics Data System (ADS)

    Ostrovsky, Andrey S.; Martinez-Niconoff, Gabriel C.; Ramos Romero, Obdulio; Cortes, Liliana

    2000-10-01

    A fast algorithm for calculating the bilinear transform in an optical system is proposed. The algorithm is based on the coherent-mode representation of the cross-spectral density function of the illumination. It is computationally efficient when the illumination is partially coherent. Numerical examples are studied and compared with the theoretical results.

  10. A review of ocean chlorophyll algorithms and primary production models

    NASA Astrophysics Data System (ADS)

    Li, Jingwen; Zhou, Song; Lv, Nan

    2015-12-01

    This paper introduces five ocean chlorophyll concentration inversion algorithms and three main models for computing ocean primary production based on ocean chlorophyll concentration. By comparing the five inversion algorithms, it sums up their advantages and disadvantages and briefly analyzes the trend of ocean primary production models.

  11. Stable computations with flat radial basis functions using vector-valued rational approximations

    NASA Astrophysics Data System (ADS)

    Wright, Grady B.; Fornberg, Bengt

    2017-02-01

    One commonly finds in applications of smooth radial basis functions (RBFs) that scaling the kernels so they are 'flat' leads to smaller discretization errors. However, the direct numerical approach for computing with flat RBFs (RBF-Direct) is severely ill-conditioned. We present an algorithm for bypassing this ill-conditioning that is based on a new method for rational approximation (RA) of vector-valued analytic functions with the property that all components of the vector share the same singularities. This new algorithm (RBF-RA) is more accurate, robust, and easier to implement than the Contour-Padé method, which is similarly based on vector-valued rational approximation. In contrast to the stable RBF-QR and RBF-GA algorithms, which are based on finding a better conditioned base in the same RBF-space, the new algorithm can be used with any type of smooth radial kernel, and it is also applicable to a wider range of tasks (including calculating Hermite type implicit RBF-FD stencils). We present a series of numerical experiments demonstrating the effectiveness of this new method for computing RBF interpolants in the flat regime. We also demonstrate the flexibility of the method by using it to compute implicit RBF-FD formulas in the flat regime and then using these for solving Poisson's equation in a 3-D spherical shell.

  12. Reactor transient control in support of PFR/TREAT TUCOP experiments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Burrows, D.R.; Larsen, G.R.; Harrison, L.J.

    1984-01-01

    Unique energy deposition and experiment control requirements posed by the PFR/TREAT series of transient undercooling/overpower (TUCOP) experiments resulted in equally unique TREAT reactor operations. New reactor control computer algorithms were written and used with the TREAT reactor control computer system to perform such functions as early power burst generation (based on test train flow conditions), burst generation produced by a step insertion of reactivity following a controlled power ramp, and shutdown (SCRAM) initiators based on both test train conditions and energy deposition. Specialized hardware was constructed to simulate test train inputs to the control computer system so that computer algorithms could be tested in real time without irradiating the experiment.

  13. Parallel Processing Systems for Passive Ranging During Helicopter Flight

    NASA Technical Reports Server (NTRS)

    Sridhar, Bavavar; Suorsa, Raymond E.; Showman, Robert D. (Technical Monitor)

    1994-01-01

    The complexity of rotorcraft missions involving operations close to the ground results in high pilot workload. In order to allow a pilot time to perform mission-oriented tasks, sensor-aiding and automation of some of the guidance and control functions are highly desirable. Images from an electro-optical sensor provide a covert way of detecting objects in the flight path of a low-flying helicopter. Passive ranging consists of processing a sequence of images using techniques based on optical flow computation and recursive estimation. The passive ranging algorithm has to extract obstacle information from imagery at rates varying from five to thirty or more frames per second depending on the helicopter speed. We have implemented and tested the passive ranging algorithm off-line using helicopter-collected images. However, the real-time data and computation requirements of the algorithm are beyond the capability of any off-the-shelf microprocessor or digital signal processor. This paper describes the computational requirements of the algorithm and uses parallel processing technology to meet these requirements. Various issues in the selection of a parallel processing architecture are discussed and four different computer architectures are evaluated regarding their suitability to process the algorithm in real time. Based on this evaluation, we conclude that real-time passive ranging is a realistic goal and can be achieved within a short time.

  14. A depth-first search algorithm to compute elementary flux modes by linear programming.

    PubMed

    Quek, Lake-Ee; Nielsen, Lars K

    2014-07-30

    The decomposition of complex metabolic networks into elementary flux modes (EFMs) provides a useful framework for exploring reaction interactions systematically. Generating a complete set of EFMs for large-scale models, however, is near impossible. Even for moderately-sized models (<400 reactions), existing approaches based on the Double Description method must iterate through a large number of combinatorial candidates, thus imposing an immense processor and memory demand. Based on an alternative elementarity test, we developed a depth-first search algorithm using linear programming (LP) to enumerate EFMs in an exhaustive fashion. Constraints can be introduced to directly generate a subset of EFMs satisfying the set of constraints. The depth-first search algorithm has a constant memory overhead. Using flux constraints, a large LP problem can be massively divided and parallelized into independent sub-jobs for deployment into computing clusters. Since the sub-jobs do not overlap, the approach scales to utilize all available computing nodes with minimal coordination overhead or memory limitations. The speed of the algorithm was comparable to efmtool, a mainstream Double Description method, when enumerating all EFMs; the attrition power gained from performing flux feasibility tests offsets the increased computational demand of running an LP solver. Unlike the Double Description method, the algorithm enables accelerated enumeration of all EFMs satisfying a set of constraints.

  15. Contributed Review: Source-localization algorithms and applications using time of arrival and time difference of arrival measurements

    NASA Astrophysics Data System (ADS)

    Li, Xinya; Deng, Zhiqun Daniel; Rauchenstein, Lynn T.; Carlson, Thomas J.

    2016-04-01

    Locating the position of fixed or mobile sources (i.e., transmitters) based on measurements obtained from sensors (i.e., receivers) is an important research area that is attracting much interest. In this paper, we review several representative localization algorithms that use time of arrivals (TOAs) and time difference of arrivals (TDOAs) to achieve high signal source position estimation accuracy when a transmitter is in the line-of-sight of a receiver. Circular (TOA) and hyperbolic (TDOA) position estimation approaches both use nonlinear equations that relate the known locations of receivers and unknown locations of transmitters. Estimation of the location of transmitters using the standard nonlinear equations may not be very accurate because of receiver location errors, receiver measurement errors, and computational efficiency challenges that result in high computational burdens. Least squares and maximum likelihood based algorithms have become the most popular computational approaches to transmitter location estimation. In this paper, we summarize the computational characteristics and position estimation accuracies of various positioning algorithms. By improving methods for estimating the time-of-arrival of transmissions at receivers and transmitter location estimation algorithms, transmitter location estimation may be applied across a range of applications and technologies such as radar, sonar, the Global Positioning System, wireless sensor networks, underwater animal tracking, mobile communications, and multimedia.
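
    As a concrete instance of the least squares approach mentioned above, the circular (TOA) equations can be linearized by subtracting the first receiver's equation from the others, which leaves a small linear system. The sketch below assumes a 2-D geometry, noise-free ranges already converted from arrival times, and a hypothetical function name `locate_toa`. It is a textbook linearization rather than any specific algorithm from the review.

```python
import math


def locate_toa(receivers, ranges):
    """Estimate a 2-D source position from receiver positions and ranges."""
    (x1, y1), d1 = receivers[0], ranges[0]
    rows, rhs = [], []
    for (xi, yi), di in zip(receivers[1:], ranges[1:]):
        # (x - xi)^2 + (y - yi)^2 = di^2 minus the same equation for receiver 1.
        rows.append((2 * (xi - x1), 2 * (yi - y1)))
        rhs.append(d1 ** 2 - di ** 2 + xi ** 2 + yi ** 2 - x1 ** 2 - y1 ** 2)
    # Normal equations (A^T A) p = A^T b for the two unknowns.
    s11 = sum(a * a for a, _ in rows)
    s12 = sum(a * b for a, b in rows)
    s22 = sum(b * b for _, b in rows)
    t1 = sum(a * r for (a, _), r in zip(rows, rhs))
    t2 = sum(b * r for (_, b), r in zip(rows, rhs))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("receiver geometry is degenerate (collinear receivers)")
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)
```

    With four receivers at the corners of a 10 x 10 square and exact ranges to a source at (3, 4), the function recovers that position up to rounding. With noisy ranges, the same normal equations give a least squares estimate.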

  16. Mathematical detection of aortic valve opening (B point) in impedance cardiography: A comparison of three popular algorithms.

    PubMed

    Árbol, Javier Rodríguez; Perakakis, Pandelis; Garrido, Alba; Mata, José Luis; Fernández-Santaella, M Carmen; Vila, Jaime

    2017-03-01

    The preejection period (PEP) is an index of left ventricle contractility widely used in psychophysiological research. Its computation requires detecting the moment when the aortic valve opens, which coincides with the B point in the first derivative of impedance cardiogram (ICG). Although this operation has been traditionally made via visual inspection, several algorithms based on derivative calculations have been developed to enable an automatic performance of the task. However, despite their popularity, data about their empirical validation are not always available. The present study analyzes the performance in the estimation of the aortic valve opening of three popular algorithms, by comparing their performance with the visual detection of the B point made by two independent scorers. Algorithm 1 is based on the first derivative of the ICG, Algorithm 2 on the second derivative, and Algorithm 3 on the third derivative. Algorithm 3 showed the highest accuracy rate (78.77%), followed by Algorithm 1 (24.57%) and Algorithm 2 (13.82%). In the automatic computation of PEP, Algorithm 2 resulted in significantly more missed cycles (48.57%) than Algorithm 1 (6.3%) and Algorithm 3 (3.5%). Algorithm 2 also estimated a significantly lower average PEP (70 ms), compared with the values obtained by Algorithm 1 (119 ms) and Algorithm 3 (113 ms). Our findings indicate that the algorithm based on the third derivative of the ICG performs significantly better. Nevertheless, a visual inspection of the signal proves indispensable, and this article provides a novel visual guide to facilitate the manual detection of the B point. © 2016 Society for Psychophysiological Research.

  17. Real-time image annotation by manifold-based biased Fisher discriminant analysis

    NASA Astrophysics Data System (ADS)

    Ji, Rongrong; Yao, Hongxun; Wang, Jicheng; Sun, Xiaoshuai; Liu, Xianming

    2008-01-01

    Automatic Linguistic Annotation is a promising solution to bridge the semantic gap in content-based image retrieval. However, two crucial issues are not well addressed in state-of-the-art annotation algorithms: (1) the Small Sample Size (3S) problem in keyword classifier/model learning; (2) most annotation algorithms cannot extend to real-time online usage due to their low computational efficiency. This paper presents a novel Manifold-based Biased Fisher Discriminant Analysis (MBFDA) algorithm to address these two issues by transductive semantic learning and keyword filtering. To address the 3S problem, Co-Training based Manifold learning is adopted for keyword model construction. To achieve real-time annotation, a Bias Fisher Discriminant Analysis (BFDA) based semantic feature reduction algorithm is presented for keyword confidence discrimination and semantic feature reduction. Different from all existing annotation methods, MBFDA views image annotation from a novel Eigen semantic feature (which corresponds to keywords) selection aspect. As demonstrated in experiments, our manifold-based biased Fisher discriminant analysis annotation algorithm outperforms classical and state-of-the-art annotation methods (1. K-NN Expansion; 2. One-to-All SVM; 3. PWC-SVM) in both computational time and annotation accuracy by a large margin.

  18. Implementation and evaluation of various demons deformable image registration algorithms on a GPU.

    PubMed

    Gu, Xuejun; Pan, Hubert; Liang, Yun; Castillo, Richard; Yang, Deshan; Choi, Dongju; Castillo, Edward; Majumdar, Amitava; Guerrero, Thomas; Jiang, Steve B

    2010-01-07

    Online adaptive radiation therapy (ART) promises the ability to deliver an optimal treatment in response to daily patient anatomic variation. A major technical barrier for the clinical implementation of online ART is the requirement of rapid image segmentation. Deformable image registration (DIR) has been used as an automated segmentation method to transfer tumor/organ contours from the planning image to daily images. However, the current computational time of DIR is insufficient for online ART. In this work, this issue is addressed by using computer graphics processing units (GPUs). A gray-scale-based DIR algorithm called demons and five of its variants were implemented on GPUs using the compute unified device architecture (CUDA) programming environment. The spatial accuracy of these algorithms was evaluated over five sets of pulmonary 4D CT images with an average size of 256 x 256 x 100 and more than 1100 expert-determined landmark point pairs each. For all the testing scenarios presented in this paper, the GPU-based DIR computation required around 7 to 11 s to yield an average 3D error ranging from 1.5 to 1.8 mm. It is interesting to find out that the original passive force demons algorithms outperform subsequently proposed variants based on the combination of accuracy, efficiency and ease of implementation.

  19. Speech recognition for embedded automatic positioner for laparoscope

    NASA Astrophysics Data System (ADS)

    Chen, Xiaodong; Yin, Qingyun; Wang, Yi; Yu, Daoyin

    2014-07-01

    In this paper, a novel speech recognition methodology based on the Hidden Markov Model (HMM) is proposed for an embedded Automatic Positioner for Laparoscope (APL), which includes a fixed-point ARM processor as its core. The APL system is designed to assist the doctor in laparoscopic surgery by implementing the specific doctor's vocal control of the laparoscope. Real-time response to voice commands calls for a more efficient speech recognition algorithm for the APL. In order to reduce computation cost without significant loss in recognition accuracy, both arithmetic and algorithmic optimizations are applied in the method presented. First, relying mostly on arithmetic optimizations, a fixed-point front end for speech feature analysis is built according to the ARM processor's characteristics. Then a fast likelihood computation algorithm is used to reduce the computational complexity of the HMM-based recognition algorithm. The experimental results show that the method shortens the recognition time to within 0.5 s while keeping the accuracy higher than 99%, demonstrating its ability to achieve real-time vocal control of the APL.

  20. Vibration extraction based on fast NCC algorithm and high-speed camera.

    PubMed

    Lei, Xiujun; Jin, Yi; Guo, Jie; Zhu, Chang'an

    2015-09-20

    In this study, a high-speed camera system is developed to complete the vibration measurement in real time and to overcome the mass introduced by conventional contact measurements. The proposed system consists of a notebook computer and a high-speed camera that can capture images at up to 1000 frames per second. In order to process the captured images in the computer, the normalized cross-correlation (NCC) template tracking algorithm with subpixel accuracy is introduced. Additionally, a modified local search algorithm based on the NCC is proposed to reduce the computation time and to increase efficiency significantly. The modified algorithm accomplishes one displacement extraction 10 times faster than traditional template matching, without installing any target panel onto the structures. Two experiments were carried out under laboratory and outdoor conditions to validate the accuracy and efficiency of the system in practice. The results demonstrated the high accuracy and efficiency of the camera system in extracting vibration signals.
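
    The NCC template tracking step can be sketched in a few lines. The abstract does not specify the modified local search, so the version below makes an assumption: the search is simply restricted to a window of `radius` pixels around the previous position. The function names `ncc` and `track` are hypothetical, images are plain nested lists, and subpixel refinement is omitted.

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation of two equally sized 2-D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    p_mean, t_mean = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - p_mean) * (b - t_mean) for a, b in zip(p, t))
    den = math.sqrt(sum((a - p_mean) ** 2 for a in p)
                    * sum((b - t_mean) ** 2 for b in t))
    return num / den if den else 0.0  # flat patches carry no information


def track(image, template, previous=None, radius=2):
    """Return ((row, col), score) of the best-matching template position."""
    th, tw = len(template), len(template[0])
    rows = range(len(image) - th + 1)
    cols = range(len(image[0]) - tw + 1)
    if previous is not None:
        # Local search: only test positions close to the last known one.
        r0, c0 = previous
        rows = [r for r in rows if abs(r - r0) <= radius]
        cols = [c for c in cols if abs(c - c0) <= radius]
    best_score, best_pos = -2.0, None
    for r in rows:
        for c in cols:
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score
```

    Passing the position found in the previous frame as `previous` shrinks the number of NCC evaluations from the whole image to at most `(2 * radius + 1) ** 2`, which is the kind of saving a local search is meant to deliver when frame-to-frame displacement is small.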

  1. Ion flux through membrane channels--an enhanced algorithm for the Poisson-Nernst-Planck model.

    PubMed

    Dyrka, Witold; Augousti, Andy T; Kotulska, Malgorzata

    2008-09-01

    A novel algorithmic scheme for numerical solution of the 3D Poisson-Nernst-Planck model is proposed. The algorithmic improvements are universal and independent of the detailed physical model. They include three major steps: an adjustable gradient-based step value, an adjustable relaxation coefficient, and an optimized segmentation of the modeled space. The enhanced algorithm significantly accelerates the computation and reduces the computational demands. The theoretical model was tested on a regular artificial channel and validated on a real protein channel, alpha-hemolysin, proving its efficiency. (c) 2008 Wiley Periodicals, Inc.

  2. A novel track-before-detect algorithm based on optimal nonlinear filtering for detecting and tracking infrared dim target

    NASA Astrophysics Data System (ADS)

    Tian, Yuexin; Gao, Kun; Liu, Ying; Han, Lu

    2015-08-01

    Aiming at the nonlinear and non-Gaussian features of real infrared scenes, an algorithm based on optimal nonlinear filtering is proposed for the infrared dim target track-before-detect application. It uses nonlinear theory to construct the state and observation models and uses the Wiener chaos expansion method, based on a spectral separation scheme, to solve the stochastic differential equation of the constructed models. In order to improve computational efficiency, the most time-consuming operations, which are independent of the observation data, are processed in the pre-observation stage. The remaining fast computations that depend on the observation data are performed subsequently. Simulation results show that the algorithm possesses excellent detection performance and is more suitable for real-time processing.

  3. GPU implementation of prior image constrained compressed sensing (PICCS)

    NASA Astrophysics Data System (ADS)

    Nett, Brian E.; Tang, Jie; Chen, Guang-Hong

    2010-04-01

    The Prior Image Constrained Compressed Sensing (PICCS) algorithm (Med. Phys. 35, pg. 660, 2008) has been applied to several computed tomography applications with both standard CT systems and flat-panel based systems designed for guiding interventional procedures and radiation therapy treatment delivery. The PICCS algorithm typically utilizes a prior image which is reconstructed via the standard Filtered Backprojection (FBP) reconstruction algorithm. The algorithm then iteratively solves for the image volume that matches the measured data, while simultaneously assuring the image is similar to the prior image. The PICCS algorithm has demonstrated utility in several applications including: improved temporal resolution reconstruction, 4D respiratory phase specific reconstructions for radiation therapy, and cardiac reconstruction from data acquired on an interventional C-arm. One disadvantage of the PICCS algorithm, as with other iterative algorithms, is the long computation time typically associated with reconstruction. In order for an algorithm to gain clinical acceptance, reconstruction must be achievable in minutes rather than hours. In this work, the PICCS algorithm has been implemented on the GPU in order to significantly reduce its reconstruction time. The Compute Unified Device Architecture (CUDA) was used in this implementation.

  4. RB Particle Filter Time Synchronization Algorithm Based on the DPM Model.

    PubMed

    Guo, Chunsheng; Shen, Jia; Sun, Yao; Ying, Na

    2015-09-03

    Time synchronization is essential for node localization, target tracking, data fusion, and various other Wireless Sensor Network (WSN) applications. To improve the estimation accuracy of continuous clock offset and skew of mobile nodes in WSNs, we propose a novel time synchronization algorithm, the Rao-Blackwellised (RB) particle filter time synchronization algorithm based on the Dirichlet process mixture (DPM) model. In a state-space equation with a linear substructure, state variables are divided into linear and non-linear variables by the RB particle filter algorithm. These two variables can be estimated using Kalman filter and particle filter, respectively, which improves the computational efficiency more so than if only the particle filter was used. In addition, the DPM model is used to describe the distribution of non-deterministic delays and to automatically adjust the number of Gaussian mixture model components based on the observational data. This improves the estimation accuracy of clock offset and skew, which allows achieving the time synchronization. The time synchronization performance of this algorithm is also validated by computer simulations and experimental measurements. The results show that the proposed algorithm has a higher time synchronization precision than traditional time synchronization algorithms.

  5. An Accelerated Recursive Doubling Algorithm for Block Tridiagonal Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seal, Sudip K

    2014-01-01

    Block tridiagonal systems of linear equations arise in a wide variety of scientific and engineering applications. The recursive doubling algorithm is a well-known prefix computation-based numerical algorithm that requires O(M^3(N/P + log P)) work to compute the solution of a block tridiagonal system with N block rows and block size M on P processors. In real-world applications, solutions of tridiagonal systems are most often sought with multiple, often hundreds and thousands, of different right hand sides but with the same tridiagonal matrix. Here, we show that a recursive doubling algorithm is sub-optimal when computing solutions of block tridiagonal systems with multiple right hand sides and present a novel algorithm, called the accelerated recursive doubling algorithm, that delivers O(R) improvement when solving block tridiagonal systems with R distinct right hand sides. Since R is typically about 100-1000, this improvement translates to very significant speedups in practice. Detailed complexity analyses of the new algorithm with empirical confirmation of runtime improvements are presented. To the best of our knowledge, this algorithm has not been reported before in the literature.
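
    The prefix computation underlying recursive doubling can be shown on a generic associative operator. The sketch below is not the accelerated algorithm of this record and does not solve a block tridiagonal system. It only illustrates the doubling pattern, in which every round combines each element with the one `step` positions earlier, so a scan over n items takes about log2(n) rounds of mutually independent updates. The function name `recursive_doubling_scan` is hypothetical.

```python
def recursive_doubling_scan(items, combine):
    """Inclusive prefix scan; `combine(earlier, later)` must be associative."""
    result = list(items)
    step = 1
    while step < len(result):
        # All updates within a round are independent of each other,
        # so on P processors they could run in parallel.
        result = [
            result[i] if i < step else combine(result[i - step], result[i])
            for i in range(len(result))
        ]
        step *= 2
    return result


def compose_affine(earlier, later):
    """Compose x -> a*x + b maps: apply `earlier` first, then `later`."""
    a1, b1 = earlier
    a2, b2 = later
    return (a2 * a1, a2 * b1 + b2)
```

    With `compose_affine` as the operator, the scan turns the linear recurrence x_i = a_i * x_(i-1) + b_i into a prefix problem: entry i of the result is the single affine map that takes x_0 directly to x_i. Recursive doubling for tridiagonal systems applies the same pattern with matrix-valued operators.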

  6. An Agent Inspired Reconfigurable Computing Implementation of a Genetic Algorithm

    NASA Technical Reports Server (NTRS)

    Weir, John M.; Wells, B. Earl

    2003-01-01

    Many software systems have been successfully implemented using an agent paradigm which employs a number of independent entities that communicate with one another to achieve a common goal. The distributed nature of such a paradigm makes it an excellent candidate for use in high speed reconfigurable computing hardware environments such as those present in modern FPGAs. In this paper, a distributed genetic algorithm that can be applied to the agent-based reconfigurable hardware model is introduced. The effectiveness of this new algorithm is evaluated by comparing the quality of the solutions found by the new algorithm with those found by traditional genetic algorithms. The performance of a reconfigurable hardware implementation of the new algorithm on an FPGA is compared to traditional single processor implementations.

  7. Constraint treatment techniques and parallel algorithms for multibody dynamic analysis. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Chiou, Jin-Chern

    1990-01-01

    Computational procedures for kinematic and dynamic analysis of three-dimensional multibody dynamic (MBD) systems are developed from the differential-algebraic equations (DAE's) viewpoint. Constraint violations during the time integration process are minimized, and penalty constraint stabilization techniques and partitioning schemes are developed. The governing equations of motion are treated with a two-stage staggered explicit-implicit numerical algorithm, which takes advantage of a partitioned solution procedure. A robust and parallelizable integration algorithm is developed. This algorithm uses a two-stage staggered central difference algorithm to integrate the translational coordinates and the angular velocities. The angular orientations of bodies in MBD systems are then obtained by using an implicit algorithm via the kinematic relationship between Euler parameters and angular velocities. It is shown that the combination of the present solution procedures yields a computationally more accurate solution. To speed up the computational procedures, a parallel implementation of the present constraint treatment techniques and the two-stage staggered explicit-implicit numerical algorithm was efficiently carried out. The DAE's and the constraint treatment techniques were transformed into arrowhead matrices, from which the Schur complement form was derived. By fully exploiting sparse matrix structural analysis techniques, a parallel preconditioned conjugate gradient numerical algorithm is used to solve the system equations written in Schur complement form. A software testbed was designed and implemented on both sequential and parallel computers. This testbed was used to demonstrate the robustness and efficiency of the constraint treatment techniques, the accuracy of the two-stage staggered explicit-implicit numerical algorithm, and the speedup of the Schur-complement-based parallel preconditioned conjugate gradient algorithm on a parallel computer.

  8. Fast and accurate computation of system matrix for area integral model-based algebraic reconstruction technique

    NASA Astrophysics Data System (ADS)

    Zhang, Shunli; Zhang, Dinghua; Gong, Hao; Ghasemalizadeh, Omid; Wang, Ge; Cao, Guohua

    2014-11-01

    Iterative algorithms, such as the algebraic reconstruction technique (ART), are popular for image reconstruction. For iterative reconstruction, the area integral model (AIM) is more accurate for better reconstruction quality than the line integral model (LIM). However, the computation of the system matrix for AIM is more complex and time-consuming than that for LIM. Here, we propose a fast and accurate method to compute the system matrix for AIM. First, we calculate the intersection of each boundary line of a narrow fan-beam with pixels in a recursive and efficient manner. Then, by grouping the beam-pixel intersection area into six types according to the slopes of the two boundary lines, we analytically compute the intersection area of the narrow fan-beam with the pixels in a simple algebraic fashion. Overall, experimental results show that our method is about three times faster than the Siddon algorithm and about two times faster than the distance-driven model (DDM) in computation of the system matrix. The reconstruction speed of our AIM-based ART is also faster than the LIM-based ART that uses the Siddon algorithm and DDM-based ART, for one iteration. The fast reconstruction speed of our method was accomplished without compromising the image quality.
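
    To make the role of the system matrix concrete, here is a minimal sketch of a Kaczmarz-style ART sweep. It is not the authors' AIM code: the tiny 2x2 "image", the hand-written ray weights in `A`, and the function name `art_reconstruct` are illustrative assumptions. Each row of the system matrix holds the weights with which one ray (or narrow beam) sees each pixel.

```python
def art_reconstruct(system_matrix, projections, n_sweeps=50, relaxation=1.0):
    """Kaczmarz-style ART: project the estimate onto one ray equation at a time."""
    n_pixels = len(system_matrix[0])
    image = [0.0] * n_pixels
    for _ in range(n_sweeps):
        for row, measured in zip(system_matrix, projections):
            norm_sq = sum(w * w for w in row)
            if norm_sq == 0.0:
                continue  # this ray misses every pixel
            predicted = sum(w * x for w, x in zip(row, image))
            step = relaxation * (measured - predicted) / norm_sq
            image = [x + step * w for x, w in zip(image, row)]
    return image


# Hypothetical toy system: a 2x2 image (pixels x0..x3) seen by five rays:
# two row sums, two column sums and one diagonal.
A = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
]
truth = [1.0, 2.0, 3.0, 4.0]
b = [sum(w * x for w, x in zip(row, truth)) for row in A]
recon = art_reconstruct(A, b, n_sweeps=200)
```

    In a real scanner geometry the matrix is large and sparse, which is why the cost of computing its entries (LIM versus AIM weights) matters as much as the iteration itself.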

  9. Joint estimation of motion and illumination change in a sequence of images

    NASA Astrophysics Data System (ADS)

    Koo, Ja-Keoung; Kim, Hyo-Hun; Hong, Byung-Woo

    2015-09-01

    We present an algorithm that simultaneously computes optical flow and estimates illumination change from an image sequence in a unified framework. We propose an energy functional consisting of conventional optical flow energy based on Horn-Schunck method and an additional constraint that is designed to compensate for illumination changes. Any undesirable illumination change that occurs in the imaging procedure in a sequence while the optical flow is being computed is considered a nuisance factor. In contrast to the conventional optical flow algorithm based on Horn-Schunck functional, which assumes the brightness constancy constraint, our algorithm is shown to be robust with respect to temporal illumination changes in the computation of optical flows. An efficient conjugate gradient descent technique is used in the optimization procedure as a numerical scheme. The experimental results obtained from the Middlebury benchmark dataset demonstrate the robustness and the effectiveness of our algorithm. In addition, comparative analysis of our algorithm and Horn-Schunck algorithm is performed on the additional test dataset that is constructed by applying a variety of synthetic bias fields to the original image sequences in the Middlebury benchmark dataset in order to demonstrate that our algorithm outperforms the Horn-Schunck algorithm. The superior performance of the proposed method is observed in terms of both qualitative visualizations and quantitative accuracy errors when compared to Horn-Schunck optical flow algorithm that easily yields poor results in the presence of small illumination changes leading to violation of the brightness constancy constraint.
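
    For reference, the classical Horn-Schunck functional that the abstract builds on can be written as below. This is the standard textbook form, with $I_x, I_y, I_t$ the image derivatives, $(u, v)$ the flow field and $\alpha$ the smoothness weight. The abstract does not spell out the authors' additional illumination term, so it is not reproduced here.

```latex
E(u, v) = \iint \Big[ (I_x u + I_y v + I_t)^2
        + \alpha^2 \big( \|\nabla u\|^2 + \|\nabla v\|^2 \big) \Big] \, dx \, dy
```

    The first term is the brightness constancy constraint, which is exactly the assumption that a temporal illumination change violates.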

  10. An efficient impedance method for induced field evaluation based on a stabilized Bi-conjugate gradient algorithm.

    PubMed

    Wang, Hua; Liu, Feng; Xia, Ling; Crozier, Stuart

    2008-11-21

    This paper presents a stabilized Bi-conjugate gradient algorithm (BiCGstab) that can significantly improve the performance of the impedance method, which has been widely applied to model low-frequency field induction phenomena in voxel phantoms. The improved impedance method offers remarkable computational advantages in terms of convergence performance and memory consumption over the conventional, successive over-relaxation (SOR)-based algorithm. The scheme has been validated against other numerical/analytical solutions on a lossy, multilayered sphere phantom excited by an ideal coil loop. To demonstrate the computational performance and application capability of the developed algorithm, the induced fields inside a human phantom due to a low-frequency hyperthermia device are evaluated. The simulation results show the numerical accuracy and superior performance of the method.

  11. FPGA-based coprocessor for matrix algorithms implementation

    NASA Astrophysics Data System (ADS)

    Amira, Abbes; Bensaali, Faycal

    2003-03-01

    Matrix algorithms are important in many types of applications including image and signal processing. These areas require enormous computing power. A close examination of the algorithms used in these, and related, applications reveals that many of the fundamental actions involve matrix operations such as matrix multiplication, which has complexity O(N^3) on a sequential computer and O(N^3/p) on a parallel system with p processors. This paper presents an investigation into the design and implementation of different matrix algorithms such as matrix operations, matrix transforms and matrix decompositions using an FPGA based environment. Solutions for the problem of processing large matrices have been proposed. The proposed system architectures are scalable, modular and require less area and time complexity with reduced latency when compared with existing structures.

  12. Normalized Cut Algorithm for Automated Assignment of Protein Domains

    NASA Technical Reports Server (NTRS)

    Samanta, M. P.; Liang, S.; Zha, H.; Biegel, Bryan A. (Technical Monitor)

    2002-01-01

    We present a novel computational method for automatic assignment of protein domains from structural data. At the core of our algorithm lies a recently proposed clustering technique that has been very successful for image-partitioning applications. This graph-theory based clustering method uses the notion of a normalized cut to partition an undirected graph into its strongly-connected components. Computer implementation of our method tested on the standard comparison set of proteins from the literature shows a high success rate (84%), better than most existing alternatives. In addition, several other features of our algorithm, such as reliance on few adjustable parameters, linear run-time with respect to the size of the protein and reduced complexity compared to other graph-theory based algorithms, would make it an attractive tool for structural biologists.

  13. Case for a field-programmable gate array multicore hybrid machine for an image-processing application

    NASA Astrophysics Data System (ADS)

    Rakvic, Ryan N.; Ives, Robert W.; Lira, Javier; Molina, Carlos

    2011-01-01

    General purpose computer designers have recently begun adding cores to their processors in order to increase performance. For example, Intel has adopted a homogeneous quad-core processor as a base for general purpose computing. PlayStation3 (PS3) game consoles contain a multicore heterogeneous processor known as the Cell, which is designed to perform complex image processing algorithms at a high level. Can modern image-processing algorithms utilize these additional cores? On the other hand, modern advancements in configurable hardware, most notably field-programmable gate arrays (FPGAs) have created an interesting question for general purpose computer designers. Is there a reason to combine FPGAs with multicore processors to create an FPGA multicore hybrid general purpose computer? Iris matching, a repeatedly executed portion of a modern iris-recognition algorithm, is parallelized on an Intel-based homogeneous multicore Xeon system, a heterogeneous multicore Cell system, and an FPGA multicore hybrid system. Surprisingly, the cheaper PS3 slightly outperforms the Intel-based multicore on a core-for-core basis. However, both multicore systems are beaten by the FPGA multicore hybrid system by >50%.

  14. High-Performance Mixed Models Based Genome-Wide Association Analysis with omicABEL software

    PubMed Central

    Fabregat-Traver, Diego; Sharapov, Sodbo Zh.; Hayward, Caroline; Rudan, Igor; Campbell, Harry; Aulchenko, Yurii; Bientinesi, Paolo

    2014-01-01

    To raise the power of genome-wide association studies (GWAS) and avoid false-positive results in structured populations, one can rely on mixed model based tests. When large samples are used, and when multiple traits are to be studied in the ’omics’ context, this approach becomes computationally challenging. Here we consider the problem of mixed-model based GWAS for arbitrary number of traits, and demonstrate that for the analysis of single-trait and multiple-trait scenarios different computational algorithms are optimal. We implement these optimal algorithms in a high-performance computing framework that uses state-of-the-art linear algebra kernels, incorporates optimizations, and avoids redundant computations, increasing throughput while reducing memory usage and energy consumption. We show that, compared to existing libraries, our algorithms and software achieve considerable speed-ups. The OmicABEL software described in this manuscript is available under the GNU GPL v. 3 license as part of the GenABEL project for statistical genomics at http://www.genabel.org/packages/OmicABEL. PMID:25717363

  15. High-Performance Mixed Models Based Genome-Wide Association Analysis with omicABEL software.

    PubMed

    Fabregat-Traver, Diego; Sharapov, Sodbo Zh; Hayward, Caroline; Rudan, Igor; Campbell, Harry; Aulchenko, Yurii; Bientinesi, Paolo

    2014-01-01

    To raise the power of genome-wide association studies (GWAS) and avoid false-positive results in structured populations, one can rely on mixed model based tests. When large samples are used, and when multiple traits are to be studied in the 'omics' context, this approach becomes computationally challenging. Here we consider the problem of mixed-model based GWAS for arbitrary number of traits, and demonstrate that for the analysis of single-trait and multiple-trait scenarios different computational algorithms are optimal. We implement these optimal algorithms in a high-performance computing framework that uses state-of-the-art linear algebra kernels, incorporates optimizations, and avoids redundant computations, increasing throughput while reducing memory usage and energy consumption. We show that, compared to existing libraries, our algorithms and software achieve considerable speed-ups. The OmicABEL software described in this manuscript is available under the GNU GPL v. 3 license as part of the GenABEL project for statistical genomics at http://www.genabel.org/packages/OmicABEL.

  16. Computer sciences

    NASA Technical Reports Server (NTRS)

    Smith, Paul H.

    1988-01-01

    The Computer Science Program provides advanced concepts, techniques, system architectures, algorithms, and software for both space and aeronautics information sciences and computer systems. The overall goal is to provide the technical foundation within NASA for the advancement of computing technology in aerospace applications. The research program is improving the state of knowledge of fundamental aerospace computing principles and advancing computing technology in space applications such as software engineering and information extraction from data collected by scientific instruments in space. The program includes the development of special algorithms and techniques to exploit the computing power provided by high performance parallel processors and special purpose architectures. Research is being conducted in the fundamentals of data base logic and improvement techniques for producing reliable computing systems.

  17. A community detection algorithm based on structural similarity

    NASA Astrophysics Data System (ADS)

    Guo, Xuchao; Hao, Xia; Liu, Yaqiong; Zhang, Li; Wang, Lu

    2017-09-01

    In order to further improve the efficiency and accuracy of community detection algorithms, a new algorithm named SSTCA (the community detection algorithm based on structural similarity with threshold) is proposed. In this algorithm, the structural similarities are taken as the weights of edges, and the threshold k is used to remove the edges whose weights are less than the threshold, which improves the computational efficiency. The proposed algorithm was tested on Zachary’s network, the Dolphins’ social network and the Football dataset, and compared with the GN and SSNCA algorithms. The results show that the new algorithm is superior to the other algorithms in accuracy for dense networks and that the operating efficiency is clearly improved.
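
    The weighting-and-pruning idea can be sketched as follows. This is not the SSTCA implementation: the similarity formula (shared closed neighbourhoods divided by the geometric mean of their sizes, as used in SCAN-style methods), the use of connected components as the final grouping step, and all function names are assumptions made for illustration.

```python
from math import sqrt


def structural_similarity(adj, u, v):
    """Overlap of the closed neighbourhoods of u and v (assumed SCAN-style formula)."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / sqrt(len(nu) * len(nv))


def threshold_communities(edges, k):
    """Drop edges whose similarity is below k, then return connected components."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    kept = {node: set() for node in adj}
    for u, v in edges:
        if structural_similarity(adj, u, v) >= k:
            kept[u].add(v)
            kept[v].add(u)
    seen, communities = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(kept[node] - component)
        seen |= component
        communities.append(sorted(component))
    return communities


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
communities = threshold_communities(edges, k=0.7)
```

    In this toy graph the bridge has the lowest similarity, so a moderate threshold separates the two triangles, while a threshold of zero keeps every edge and leaves a single component.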

  18. Flowfield computation of entry vehicles

    NASA Technical Reports Server (NTRS)

    Prabhu, Dinesh K.

    1990-01-01

    The equations governing the multidimensional flow of a reacting mixture of thermally perfect gasses were derived. The modeling procedures for the various terms of the conservation laws are discussed. A numerical algorithm, based on the finite-volume approach, to solve these conservation equations was developed. The advantages and disadvantages of the present numerical scheme are discussed from the point of view of accuracy, computer time, and memory requirements. A simple one-dimensional model problem was solved to prove the feasibility and accuracy of the algorithm. A computer code implementing the above algorithm was developed and is presently being applied to simple geometries and conditions. Once the code is completely debugged and validated, it will be used to compute the complete unsteady flow field around the Aeroassist Flight Experiment (AFE) body.

  19. Computational algorithms for simulations in atmospheric optics.

    PubMed

    Konyaev, P A; Lukin, V P

    2016-04-20

    A computer simulation technique for atmospheric and adaptive optics based on parallel programming is discussed. A parallel propagation algorithm is designed and a modified spectral-phase method for computer generation of 2D time-variant random fields is developed. Temporal power spectra of Laguerre-Gaussian beam fluctuations are considered as an example to illustrate the applications discussed. Implementation of the proposed algorithms using Intel MKL and IPP libraries and NVIDIA CUDA technology is shown to be very fast and accurate. The hardware system for the computer simulation is an off-the-shelf desktop with an Intel Core i7-4790K CPU operating at a turbo-speed frequency up to 5 GHz and an NVIDIA GeForce GTX-960 graphics accelerator with 1024 1.5 GHz processors.

  20. CREKID: A computer code for transient, gas-phase combustion of kinetics

    NASA Technical Reports Server (NTRS)

    Pratt, D. T.; Radhakrishnan, K.

    1984-01-01

    A new algorithm was developed for fast, automatic integration of chemical kinetic rate equations describing homogeneous, gas-phase combustion at constant pressure. Particular attention is paid to the distinguishing physical and computational characteristics of the induction, heat-release and equilibration regimes. The two-part predictor-corrector algorithm, based on an exponentially-fitted trapezoidal rule, includes filtering of ill-posed initial conditions, automatic selection of Newton-Jacobi or Newton iteration for convergence to achieve maximum computational efficiency while observing a prescribed error tolerance. The new algorithm was found to compare favorably with LSODE on two representative test problems drawn from combustion kinetics.

  1. Modification and fixed-point analysis of a Kalman filter for orientation estimation based on 9D inertial measurement unit data.

    PubMed

    Brückner, Hans-Peter; Spindeldreier, Christian; Blume, Holger

    2013-01-01

    A common approach for high accuracy sensor fusion based on 9D inertial measurement unit data is Kalman filtering. State of the art floating-point filter algorithms differ in their computational complexity; nevertheless, real-time operation on a low-power microcontroller at high sampling rates is not possible. This work presents algorithmic modifications to reduce the computational demands of a two-step minimum order Kalman filter. Furthermore, the required bit-width of a fixed-point filter version is explored. For evaluation, real-world data captured using an Xsens MTx inertial sensor are used. Changes in computational latency and orientation estimation accuracy due to the proposed algorithmic modifications and fixed-point number representation are evaluated in detail on a variety of processing platforms enabling on-board processing on wearable sensor platforms.

  2. Image communication scheme based on dynamic visual cryptography and computer generated holography

    NASA Astrophysics Data System (ADS)

    Palevicius, Paulius; Ragulskis, Minvydas

    2015-01-01

    Computer generated holograms are often exploited to implement optical encryption schemes. This paper proposes the integration of dynamic visual cryptography (an optical technique based on the interplay of visual cryptography and time-averaging geometric moiré) with Gerchberg-Saxton algorithm. A stochastic moiré grating is used to embed the secret into a single cover image. The secret can be visually decoded by a naked eye if only the amplitude of harmonic oscillations corresponds to an accurately preselected value. The proposed visual image encryption scheme is based on computer generated holography, optical time-averaging moiré and principles of dynamic visual cryptography. Dynamic visual cryptography is used both for the initial encryption of the secret image and for the final decryption. Phase data of the encrypted image are computed by using Gerchberg-Saxton algorithm. The optical image is decrypted using the computationally reconstructed field of amplitudes.

  3. Optimization-based image reconstruction from sparse-view data in offset-detector CBCT

    NASA Astrophysics Data System (ADS)

    Bian, Junguo; Wang, Jiong; Han, Xiao; Sidky, Emil Y.; Shao, Lingxiong; Pan, Xiaochuan

    2013-01-01

    The field of view (FOV) of a cone-beam computed tomography (CBCT) unit in a single-photon emission computed tomography (SPECT)/CBCT system can be increased by offsetting the CBCT detector. Analytic-based algorithms have been developed for image reconstruction from data collected at a large number of densely sampled views in offset-detector CBCT. However, the radiation dose involved in a large number of projections can be a health concern to the imaged subject. CBCT-imaging dose can be reduced by lowering the number of projections. As analytic-based algorithms are unlikely to reconstruct accurate images from sparse-view data, we investigate and characterize in this work optimization-based algorithms, including an adaptive steepest descent-weighted projection onto convex sets (ASD-WPOCS) algorithm, for image reconstruction from sparse-view data collected in offset-detector CBCT. Using simulated data and real data collected from a physical pelvis phantom and patient, we verify and characterize properties of the algorithms under study. Results of our study suggest that optimization-based algorithms such as ASD-WPOCS may be developed for yielding images of potential utility from a number of projections substantially smaller than those used currently in clinical SPECT/CBCT imaging, thus leading to a dose reduction in CBCT imaging.

  4. Cost versus life cycle assessment-based environmental impact optimization of drinking water production plants.

    PubMed

    Capitanescu, F; Rege, S; Marvuglia, A; Benetto, E; Ahmadi, A; Gutiérrez, T Navarrete; Tiruta-Barna, L

    2016-07-15

    Empowering decision makers with cost-effective solutions for reducing industrial processes environmental burden, at both design and operation stages, is nowadays a major worldwide concern. The paper addresses this issue for the sector of drinking water production plants (DWPPs), seeking for optimal solutions trading-off operation cost and life cycle assessment (LCA)-based environmental impact while satisfying outlet water quality criteria. This leads to a challenging bi-objective constrained optimization problem, which relies on a computationally expensive intricate process-modelling simulator of the DWPP and has to be solved with limited computational budget. Since mathematical programming methods are unusable in this case, the paper examines the performances in tackling these challenges of six off-the-shelf state-of-the-art global meta-heuristic optimization algorithms, suitable for such simulation-based optimization, namely Strength Pareto Evolutionary Algorithm (SPEA2), Non-dominated Sorting Genetic Algorithm (NSGA-II), Indicator-based Evolutionary Algorithm (IBEA), Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D), Differential Evolution (DE), and Particle Swarm Optimization (PSO). The results of optimization reveal that good reduction in both operating cost and environmental impact of the DWPP can be obtained. Furthermore, NSGA-II outperforms the other competing algorithms while MOEA/D and DE perform unexpectedly poorly. Copyright © 2016 Elsevier Ltd. All rights reserved.

  5. Modern meta-heuristics based on nonlinear physics processes: A review of models and design procedures

    NASA Astrophysics Data System (ADS)

    Salcedo-Sanz, S.

    2016-10-01

    Meta-heuristic algorithms are problem-solving methods which try to find good-enough solutions to very hard optimization problems, at a reasonable computation time, where classical approaches fail, or cannot even be applied. Many existing meta-heuristics approaches are nature-inspired techniques, which work by simulating or modeling different natural processes in a computer. Historically, many of the most successful meta-heuristic approaches have had a biological inspiration, such as evolutionary computation or swarm intelligence paradigms, but in the last few years new approaches based on nonlinear physics processes modeling have been proposed and applied with success. Non-linear physics processes, modeled as optimization algorithms, are able to produce completely new search procedures, with extremely effective exploration capabilities in many cases, which are able to outperform existing optimization approaches. In this paper we review the most important optimization algorithms based on nonlinear physics, how they have been constructed from specific modeling of real phenomena, and also their novelty in terms of comparison with alternative existing algorithms for optimization. We first review important concepts on optimization problems, search spaces and problems' difficulty. Then, the usefulness of heuristics and meta-heuristics approaches to face hard optimization problems is introduced, and some of the main existing classical versions of these algorithms are reviewed. The mathematical framework of different nonlinear physics processes is then introduced as a preparatory step to review in detail the most important meta-heuristics based on them. A discussion on the novelty of these approaches, their main computational implementation and design issues, and the evaluation of a novel meta-heuristic based on Strange Attractors mutation will be carried out to complete the review of these techniques. We also describe some of the most important application areas, in a broad sense, of meta-heuristics, and describe freely accessible software frameworks which can be used to make the implementation of these algorithms easier.

  6. Modelling the spread of innovation in wild birds.

    PubMed

    Shultz, Thomas R; Montrey, Marcel; Aplin, Lucy M

    2017-06-01

    We apply three plausible algorithms in agent-based computer simulations to recent experiments on social learning in wild birds. Although some of the phenomena are simulated by all three learning algorithms, several manifestations of social conformity bias are simulated by only the approximate majority (AM) algorithm, which has roots in chemistry, molecular biology and theoretical computer science. The simulations generate testable predictions and provide several explanatory insights into the diffusion of innovation through a population. The AM algorithm's success raises the possibility of its usefulness in studying group dynamics more generally, in several different scientific domains. Our differential-equation model matches simulation results and provides mathematical insights into the dynamics of these algorithms. © 2017 The Author(s).
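
    As a rough illustration of the approximate majority (AM) dynamics, the sketch below uses the three-state population-protocol form of the rule: an agent that meets a disagreeing initiator becomes undecided, and an undecided agent adopts the opinion of the initiator it meets. How the authors map these states onto bird behaviour is not given in the abstract, so the state names, the random pairing scheme and the function name are assumptions.

```python
import random


def approximate_majority(n_x, n_y, rng, max_steps=1_000_000):
    """Run pairwise AM interactions until every agent holds the same opinion.

    States: "x" and "y" are the two opinions, "b" is blank (undecided).
    Returns (winning opinion, number of interactions), or (None, max_steps).
    """
    agents = ["x"] * n_x + ["y"] * n_y
    counts = {"x": n_x, "y": n_y, "b": 0}
    n = len(agents)
    for step in range(max_steps):
        if counts["x"] == n or counts["y"] == n:
            return ("x" if counts["x"] == n else "y"), step
        i, j = rng.sample(range(n), 2)  # i initiates, j responds
        initiator, responder = agents[i], agents[j]
        if initiator == "b":
            continue  # an undecided initiator influences nobody
        if responder == "b":
            new_state = initiator  # undecided responder adopts the opinion
        elif responder != initiator:
            new_state = "b"  # disagreeing responder becomes undecided
        else:
            continue
        counts[responder] -= 1
        counts[new_state] += 1
        agents[j] = new_state
    return None, max_steps


rng = random.Random(0)  # fixed seed so a run is repeatable
winner, interactions = approximate_majority(60, 40, rng)
```

    The only absorbing configurations are the two unanimous ones, which is why the loop can simply stop when one opinion count reaches the population size.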

  7. A hybrid flower pollination algorithm based modified randomized location for multi-threshold medical image segmentation.

    PubMed

    Wang, Rui; Zhou, Yongquan; Zhao, Chengyan; Wu, Haizhou

    2015-01-01

    Multi-threshold image segmentation is a powerful image processing technique that is used for the preprocessing of pattern recognition and computer vision. However, traditional multilevel thresholding methods are computationally expensive because they involve exhaustively searching the optimal thresholds to optimize the objective functions. To overcome this drawback, this paper proposes a flower pollination algorithm with a randomized location modification. The proposed algorithm is used to find optimal threshold values for maximizing Otsu's objective functions with regard to eight medical grayscale images. When benchmarked against other state-of-the-art evolutionary algorithms, the new algorithm proves itself to be robust and effective through numerical experimental results including Otsu's objective values and standard deviations.
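
    The cost that motivates a meta-heuristic search can be seen in a small sketch of the exhaustive baseline. The code below evaluates Otsu's between-class variance for every combination of thresholds on a toy 10-level histogram. The histogram values and function names are illustrative assumptions, and the flower pollination search itself is not shown.

```python
from itertools import combinations


def between_class_variance(hist, thresholds):
    """Otsu objective: weighted squared distance of class means from the global mean.

    A threshold t puts gray levels <= t in the lower class.
    """
    total = sum(hist)
    mean_all = sum(g * c for g, c in enumerate(hist)) / total
    bounds = [0] + [t + 1 for t in thresholds] + [len(hist)]
    variance = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        class_count = sum(hist[lo:hi])
        if class_count == 0:
            continue  # empty class contributes nothing
        mean = sum(g * hist[g] for g in range(lo, hi)) / class_count
        variance += (class_count / total) * (mean - mean_all) ** 2
    return variance


def exhaustive_otsu(hist, n_thresholds):
    """Try every increasing tuple of thresholds; cost grows combinatorially."""
    levels = range(len(hist) - 1)
    return max(
        combinations(levels, n_thresholds),
        key=lambda t: between_class_variance(hist, t),
    )


# Hypothetical histogram with three well-separated modes.
hist = [5, 5, 0, 0, 6, 6, 0, 0, 4, 4]
best = exhaustive_otsu(hist, 2)
```

    For an 8-bit image with several thresholds the number of combinations becomes very large, which is the drawback the abstract refers to.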

  8. An Efficient Virtual Machine Consolidation Scheme for Multimedia Cloud Computing.

    PubMed

    Han, Guangjie; Que, Wenhui; Jia, Gangyong; Shu, Lei

    2016-02-18

    Cloud computing has innovated the IT industry in recent years, as it can deliver subscription-based services to users in the pay-as-you-go model. Meanwhile, multimedia cloud computing is emerging based on cloud computing to provide a variety of media services on the Internet. However, with the growing popularity of multimedia cloud computing, its large energy consumption can not only contribute to greenhouse gas emissions, but also result in the rising of cloud users' costs. Therefore, the multimedia cloud providers should try to minimize its energy consumption as much as possible while satisfying the consumers' resource requirements and guaranteeing quality of service (QoS). In this paper, we have proposed a remaining utilization-aware (RUA) algorithm for virtual machine (VM) placement, and a power-aware algorithm (PA) is proposed to find proper hosts to shut down for energy saving. These two algorithms have been combined and applied to cloud data centers for completing the process of VM consolidation. Simulation results have shown that there exists a trade-off between the cloud data center's energy consumption and service-level agreement (SLA) violations. Besides, the RUA algorithm is able to deal with variable workload to prevent hosts from overloading after VM placement and to reduce the SLA violations dramatically.

  9. An Efficient Virtual Machine Consolidation Scheme for Multimedia Cloud Computing

    PubMed Central

    Han, Guangjie; Que, Wenhui; Jia, Gangyong; Shu, Lei

    2016-01-01

    Cloud computing has innovated the IT industry in recent years, as it can deliver subscription-based services to users in the pay-as-you-go model. Meanwhile, multimedia cloud computing is emerging based on cloud computing to provide a variety of media services on the Internet. However, with the growing popularity of multimedia cloud computing, its large energy consumption can not only contribute to greenhouse gas emissions, but also result in the rising of cloud users’ costs. Therefore, the multimedia cloud providers should try to minimize its energy consumption as much as possible while satisfying the consumers’ resource requirements and guaranteeing quality of service (QoS). In this paper, we have proposed a remaining utilization-aware (RUA) algorithm for virtual machine (VM) placement, and a power-aware algorithm (PA) is proposed to find proper hosts to shut down for energy saving. These two algorithms have been combined and applied to cloud data centers for completing the process of VM consolidation. Simulation results have shown that there exists a trade-off between the cloud data center’s energy consumption and service-level agreement (SLA) violations. Besides, the RUA algorithm is able to deal with variable workload to prevent hosts from overloading after VM placement and to reduce the SLA violations dramatically. PMID:26901201

  10. An acceleration framework for synthetic aperture radar algorithms

    NASA Astrophysics Data System (ADS)

    Kim, Youngsoo; Gloster, Clay S.; Alexander, Winser E.

    2017-04-01

    Algorithms for radar signal processing, such as Synthetic Aperture Radar (SAR), are computationally intensive and require considerable execution time on a general purpose processor. Reconfigurable logic can be used to off-load the primary computational kernel onto a custom computing machine in order to reduce execution time by an order of magnitude as compared to kernel execution on a general purpose processor. Specifically, Field Programmable Gate Arrays (FPGAs) can be used to accelerate these kernels using hardware-based custom logic implementations. In this paper, we demonstrate a framework for algorithm acceleration. We used SAR as a case study to illustrate the potential for algorithm acceleration offered by FPGAs. Initially, we profiled the SAR algorithm and implemented a homomorphic filter using a hardware implementation of the natural logarithm. Experimental results show a linear speedup by adding reasonably small processing elements in a Field Programmable Gate Array (FPGA) as opposed to using a software implementation running on a typical general purpose processor.

  11. A survey of SAT solver

    NASA Astrophysics Data System (ADS)

    Gong, Weiwei; Zhou, Xu

    2017-06-01

    In Computer Science, the Boolean Satisfiability Problem (SAT) is the problem of determining if there exists an interpretation that satisfies a given Boolean formula. SAT was one of the first problems proven to be NP-complete, and it is also fundamental to artificial intelligence, algorithm and hardware design. This paper reviews the main algorithms of the SAT solver in recent years, including serial SAT algorithms, parallel SAT algorithms, SAT algorithms based on GPU, and SAT algorithms based on FPGA. The development of SAT is analyzed comprehensively in this paper. Finally, several possible directions for the development of the SAT problem are proposed.
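
    As a small illustration of what a serial SAT algorithm does, here is a minimal DPLL-style sketch (simplification, unit propagation, then branching). It is a teaching example rather than any solver covered by the survey. Clauses are lists of nonzero integers in the DIMACS convention, where `-3` means "not x3".

```python
def dpll(clauses, assignment=None):
    """Return a satisfying partial assignment {var: bool}, or None if unsatisfiable."""
    assignment = dict(assignment or {})

    # Simplify: drop satisfied clauses and remove falsified literals.
    simplified = []
    for clause in clauses:
        remaining, satisfied = [], False
        for lit in clause:
            var, wanted = abs(lit), lit > 0
            if var in assignment:
                if assignment[var] == wanted:
                    satisfied = True
                    break
            else:
                remaining.append(lit)
        if satisfied:
            continue
        if not remaining:
            return None  # every literal is false: conflict
        simplified.append(remaining)
    if not simplified:
        return assignment  # all clauses satisfied

    # Unit propagation: a one-literal clause forces that literal.
    for clause in simplified:
        if len(clause) == 1:
            lit = clause[0]
            return dpll(simplified, {**assignment, abs(lit): lit > 0})

    # Branch on the first unassigned variable.
    var = abs(simplified[0][0])
    for value in (True, False):
        result = dpll(simplified, {**assignment, var: value})
        if result is not None:
            return result
    return None


# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
formula = [[1, 2], [-1, 3], [-2, -3]]
model = dpll(formula)
```

    Variables missing from the returned dictionary are unconstrained. Modern solvers add clause learning, watched literals and restarts on top of this skeleton.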

  12. A POSTERIORI ERROR ANALYSIS OF TWO STAGE COMPUTATION METHODS WITH APPLICATION TO EFFICIENT DISCRETIZATION AND THE PARAREAL ALGORITHM.

    PubMed

    Chaudhry, Jehanzeb Hameed; Estep, Don; Tavener, Simon; Carey, Varis; Sandelin, Jeff

    2016-01-01

    We consider numerical methods for initial value problems that employ a two stage approach consisting of solution on a relatively coarse discretization followed by solution on a relatively fine discretization. Examples include adaptive error control, parallel-in-time solution schemes, and efficient solution of adjoint problems for computing a posteriori error estimates. We describe a general formulation of two stage computations then perform a general a posteriori error analysis based on computable residuals and solution of an adjoint problem. The analysis accommodates various variations in the two stage computation and in formulation of the adjoint problems. We apply the analysis to compute "dual-weighted" a posteriori error estimates, to develop novel algorithms for efficient solution that take into account cancellation of error, and to the Parareal Algorithm. We test the various results using several numerical examples.
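
    The coarse-then-fine structure can be illustrated with a minimal Parareal sketch for a scalar initial value problem. Forward Euler is used for both the coarse and the fine propagator purely for brevity. The propagator choice, step counts and function names are assumptions, and no error estimation from the paper is included.

```python
def euler(f, y, t0, t1, n_steps):
    """Forward Euler from t0 to t1 in n_steps equal steps."""
    h = (t1 - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        y = y + h * f(t, y)
        t += h
    return y


def parareal(f, y0, t_end, n_windows, n_iters, fine_steps=100):
    """Parareal update: U[i+1] = G(U_new[i]) + F(U_old[i]) - G(U_old[i])."""
    ts = [t_end * i / n_windows for i in range(n_windows + 1)]

    def coarse(y, i):
        return euler(f, y, ts[i], ts[i + 1], 1)

    def fine(y, i):
        return euler(f, y, ts[i], ts[i + 1], fine_steps)

    # Initial guess: one serial sweep with the cheap coarse propagator.
    u = [y0]
    for i in range(n_windows):
        u.append(coarse(u[i], i))

    for _ in range(n_iters):
        # The fine solves are independent per window, so they could run in parallel.
        fine_vals = [fine(u[i], i) for i in range(n_windows)]
        coarse_old = [coarse(u[i], i) for i in range(n_windows)]
        new = [y0]
        for i in range(n_windows):  # cheap serial correction sweep
            new.append(coarse(new[i], i) + fine_vals[i] - coarse_old[i])
        u = new
    return ts, u


def decay(t, y):
    return -y


ts, u = parareal(decay, 1.0, 2.0, n_windows=4, n_iters=4)

# Reference: the fine propagator applied serially over the same windows.
serial = 1.0
for i in range(4):
    serial = euler(decay, serial, ts[i], ts[i + 1], 100)
```

    After as many iterations as there are time windows, the Parareal values coincide with the serial fine solution up to rounding. The method is only worthwhile when far fewer iterations already give an acceptable error.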

  13. Convergence properties of simple genetic algorithms

    NASA Technical Reports Server (NTRS)

    Bethke, A. D.; Zeigler, B. P.; Strauss, D. M.

    1974-01-01

    The essential parameters determining the behaviour of genetic algorithms were investigated. Computer runs were made while systematically varying the parameter values. Results based on the progress curves obtained from these runs are presented along with results based on the variability of the population as the run progresses.

  14. Evaluation of Item-Based Top-N Recommendation Algorithms

    DTIC Science & Technology

    2000-09-15

    Furthermore, one of the advantages of the item-based algorithm is that it has much smaller computational require- ...items, utilized by many e-commerce sites, cannot take advantage of pre-computed user-to-user similarities. Consequently, even though the throughput of...Non-Zeros ecommerce 6667 17491 91222 catalog 50918 39080 435524 ccard 42629 68793 398619 skills 4374 2125 82612 movielens 943 1682 100000 Table 1: The

  15. Rapid Human-Computer Interactive Conceptual Design of Mobile and Manipulative Robot Systems

    DTIC Science & Technology

    2015-05-19

    algorithm based on Age-Fitness Pareto Optimization (AFPO) ([9]) with an additional user preference objective and a neural network-based user model, we...greater than 40, which is about 5 times further than any robot traveled in our experiments. 3.3 Methods The algorithm uses a client-server computational...architecture. The client here is an interactive program which takes a pair of controllers as input, simulates two copies of the robot with

  16. Parallel algorithms for mapping pipelined and parallel computations

    NASA Technical Reports Server (NTRS)

    Nicol, David M.

    1988-01-01

    Many computational problems in image processing, signal processing, and scientific computing are naturally structured for either pipelined or parallel computation. When mapping such problems onto a parallel architecture it is often necessary to aggregate an obvious problem decomposition. Even in this context the general mapping problem is known to be computationally intractable, but recent advances have been made in identifying classes of problems and architectures for which optimal solutions can be found in polynomial time. Among these, the mapping of pipelined or parallel computations onto linear array, shared memory, and host-satellite systems figures prominently. This paper extends that work first by showing how to improve existing serial mapping algorithms. These improvements have significantly lower time and space complexities: in one case a published O(nm^3) time algorithm for mapping m modules onto n processors is reduced to an O(nm log m) time complexity, and its space requirements reduced from O(nm^2) to O(m). Run time complexity is further reduced with parallel mapping algorithms based on these improvements, which run on the architecture for which they create the mappings.
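
    To make the mapping problem concrete, the sketch below assigns a chain of m modules to n processors as contiguous blocks so that the most heavily loaded processor is as light as possible. It is a simple bisection-plus-greedy-probe approach for integer weights, not the algorithm from the paper, and it ignores communication costs; the name `min_bottleneck` is hypothetical.

```python
from typing import List


def min_bottleneck(weights: List[int], n_procs: int) -> int:
    """Smallest achievable maximum load when contiguous modules share processors."""

    def fits(limit: int) -> bool:
        # Greedy probe: fill each processor until the next module would overflow it.
        used, load = 1, 0
        for w in weights:
            if load + w > limit:
                used += 1
                load = 0
            load += w
        return used <= n_procs

    # The answer lies between the heaviest module and the total work.
    lo, hi = max(weights), sum(weights)
    while lo < hi:
        mid = (lo + hi) // 2
        if fits(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo
```

    For example, `min_bottleneck([4, 2, 7, 1, 3, 5], 3)` splits the chain as (4, 2), (7, 1), (3, 5), giving a bottleneck of 8.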

  17. A Secure Alignment Algorithm for Mapping Short Reads to Human Genome.

    PubMed

    Zhao, Yongan; Wang, Xiaofeng; Tang, Haixu

    2018-05-09

    Elastic and inexpensive computing resources such as clouds have been recognized as a useful solution for analyzing massive human genomic data (e.g., acquired by using next-generation sequencers) in biomedical research. However, outsourcing human genome computation to public or commercial clouds has been hindered by privacy concerns: even a small number of human genome sequences contain sufficient information for identifying the donor of the genomic data. This issue cannot be directly addressed by existing security and cryptographic techniques (such as homomorphic encryption), because they are too heavyweight to carry out practical genome computation tasks on massive data. In this article, we present a secure algorithm to accomplish read mapping, one of the most basic tasks in human genomic data analysis, based on a hybrid cloud computing model. Compared with existing approaches, our algorithm delegates most computation to the public cloud, while only performing encryption and decryption on the private cloud, and thus makes maximum use of the computing resources of the public cloud. Furthermore, our algorithm reports results similar to those of nonsecure read mapping algorithms, including the alignment between reads and the reference genome, which can be directly used in downstream analysis such as the inference of genomic variations. We implemented the algorithm in C++ and Python on a hybrid cloud system, in which the public cloud uses an Apache Spark system.

  18. GPU accelerated fuzzy connected image segmentation by using CUDA.

    PubMed

    Zhuge, Ying; Cao, Yong; Miller, Robert W

    2009-01-01

    Image segmentation techniques using fuzzy connectedness principles have shown their effectiveness in segmenting a variety of objects in several large applications in recent years. However, one problem of these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays commodity graphics hardware provides high parallel computing power. In this paper, we present a parallel fuzzy connected image segmentation algorithm on Nvidia's Compute Unified Device Architecture (CUDA) platform for segmenting large medical image data sets. Our experiments based on three data sets with small, medium, and large data size demonstrate the efficiency of the parallel algorithm, which achieves a speed-up factor of 7.2x, 7.3x, and 14.4x, correspondingly, for the three data sets over the sequential implementation of fuzzy connected image segmentation algorithm on CPU.

  19. Tomography by iterative convolution - Empirical study and application to interferometry

    NASA Technical Reports Server (NTRS)

    Vest, C. M.; Prikryl, I.

    1984-01-01

    An algorithm for computer tomography has been developed that is applicable to reconstruction from data having incomplete projections because an opaque object blocks some of the probing radiation as it passes through the object field. The algorithm is based on iteration between the object domain and the projection (Radon transform) domain. Reconstructions are computed during each iteration by the well-known convolution method. Although it is demonstrated that this algorithm does not converge, an empirically justified criterion for terminating the iteration when the most accurate estimate has been computed is presented. The algorithm has been studied by using it to reconstruct several different object fields with several different opaque regions. It also has been used to reconstruct aerodynamic density fields from interferometric data recorded in wind tunnel tests.

  20. Wavelet-based algorithm to the evaluation of contrasted hepatocellular carcinoma in CT-images after transarterial chemoembolization.

    PubMed

    Alvarez, Matheus; de Pina, Diana Rodrigues; Romeiro, Fernando Gomes; Duarte, Sérgio Barbosa; Miranda, José Ricardo de Arruda

    2014-07-26

    Hepatocellular carcinoma is a primary tumor of the liver and involves different treatment modalities according to the tumor stage. After local therapies, the tumor evaluation is based on the mRECIST criteria, which involves the measurement of the maximum diameter of the viable lesion. This paper describes a computational methodology that measures the maximum diameter of the tumor from the contrasted area of the lesions. A total of 63 computed tomography (CT) slices from 23 patients were assessed. Non-contrasted liver and HCC typical nodules were evaluated, and a virtual phantom was developed for this purpose. Optimization of the algorithm detection and quantification was made using the virtual phantom. After that, we compared the algorithm findings of maximum diameter of the target lesions against radiologist measures. Computed results of the maximum diameter are in good agreement with the results obtained by radiologist evaluation, indicating that the algorithm was able to properly detect the tumor limits. A comparison of the estimated maximum diameter by radiologist versus the algorithm revealed differences on the order of 0.25 cm for large-sized tumors (diameter > 5 cm), whereas agreement to within 1.0 cm was found for small-sized tumors. Differences between algorithm and radiologist measures were accurate for small-sized tumors with a trend to a small decrease for tumors greater than 5 cm. Therefore, traditional methods for measuring lesion diameter should be complemented by non-subjective measurement methods, which would allow a more correct evaluation of the contrast-enhanced areas of HCC according to the mRECIST criteria.

  1. A novel measure and significance testing in data analysis of cell image segmentation.

    PubMed

    Wu, Jin Chu; Halter, Michael; Kacker, Raghu N; Elliott, John T; Plant, Anne L

    2017-03-14

    Cell image segmentation (CIS) is an essential part of quantitative imaging of biological cells. Designing a performance measure and conducting significance testing are critical for evaluating and comparing the CIS algorithms for image-based cell assays in cytometry. Many measures and methods have been proposed and implemented to evaluate segmentation methods. However, computing the standard errors (SE) of the measures and their correlation coefficient is not described, and thus the statistical significance of performance differences between CIS algorithms cannot be assessed. We propose the total error rate (TER), a novel performance measure for segmenting all cells in the supervised evaluation. The TER statistically aggregates all misclassification error rates (MER) by taking cell sizes as weights. The MERs are for segmenting each single cell in the population. The TER is fully supported by the pairwise comparisons of MERs using 106 manually segmented ground-truth cells with different sizes and seven CIS algorithms taken from ImageJ. Further, the SE and 95% confidence interval (CI) of TER are computed based on the SE of MER that is calculated using the bootstrap method. An algorithm for computing the correlation coefficient of TERs between two CIS algorithms is also provided. Hence, the 95% CI error bars can be used to classify CIS algorithms. The SEs of TERs and their correlation coefficient can be employed to conduct the hypothesis testing, while the CIs overlap, to determine the statistical significance of the performance differences between CIS algorithms. A novel measure TER of CIS is proposed. The TER's SEs and correlation coefficient are computed. Thereafter, CIS algorithms can be evaluated and compared statistically by conducting the significance testing.
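
    The size-weighted aggregation described above can be sketched as follows. The abstract states only that cell sizes are used as weights, so normalizing by the total size (a weighted mean) is an assumption of this example, as is the function name `total_error_rate`.

```python
from typing import Sequence


def total_error_rate(cell_sizes: Sequence[float], mers: Sequence[float]) -> float:
    """Aggregate per-cell misclassification error rates, weighting each by cell size.

    Assumption: the aggregate is a weighted mean normalized by the total size.
    """
    if len(cell_sizes) != len(mers) or not cell_sizes:
        raise ValueError("need one MER per cell and at least one cell")
    total_size = sum(cell_sizes)
    return sum(s * m for s, m in zip(cell_sizes, mers)) / total_size
```

    With two cells of 100 and 300 pixels and MERs of 0.1 and 0.2, this gives (10 + 60) / 400 = 0.175, so the larger cell dominates the aggregate.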

  2. A fast Fourier transform on multipoles (FFTM) algorithm for solving Helmholtz equation in acoustics analysis.

    PubMed

    Ong, Eng Teo; Lee, Heow Pueh; Lim, Kian Meng

    2004-09-01

    This article presents a fast algorithm for the efficient solution of the Helmholtz equation. The method is based on the translation theory of the multipole expansions. Here, the speedup comes from the convolution nature of the translation operators, which can be evaluated rapidly using fast Fourier transform algorithms. Also, the computations of the translation operators are accelerated by using the recursive formulas developed recently by Gumerov and Duraiswami [SIAM J. Sci. Comput. 25, 1344-1381 (2003)]. It is demonstrated that the algorithm can produce good accuracy with a relatively low order of expansion. Efficiency analyses of the algorithm reveal that it has computational complexities of O(N^a), where a ranges from 1.05 to 1.24. However, this method requires substantially more memory to store the translation operators as compared to the fast multipole method. Hence, despite its simplicity in implementation, this memory requirement issue may limit the application of this algorithm to solving very large-scale problems.

  3. Initialization and Restart in Stochastic Local Search: Computing a Most Probable Explanation in Bayesian Networks

    NASA Technical Reports Server (NTRS)

    Mengshoel, Ole J.; Wilkins, David C.; Roth, Dan

    2010-01-01

    For hard computational problems, stochastic local search has proven to be a competitive approach to finding optimal or approximately optimal problem solutions. Two key research questions for stochastic local search algorithms are: Which algorithms are effective for initialization? When should the search process be restarted? In the present work we investigate these research questions in the context of approximate computation of most probable explanations (MPEs) in Bayesian networks (BNs). We introduce a novel approach, based on the Viterbi algorithm, to explanation initialization in BNs. While the Viterbi algorithm works on sequences and trees, our approach works on BNs with arbitrary topologies. We also give a novel formalization of stochastic local search, with focus on initialization and restart, using probability theory and mixture models. Experimentally, we apply our methods to the problem of MPE computation, using a stochastic local search algorithm known as Stochastic Greedy Search. By carefully optimizing both initialization and restart, we reduce the MPE search time for application BNs by several orders of magnitude compared to using uniform at random initialization without restart. On several BNs from applications, the performance of Stochastic Greedy Search is competitive with clique tree clustering, a state-of-the-art exact algorithm used for MPE computation in BNs.

  4. Improved teaching-learning-based and JAYA optimization algorithms for solving flexible flow shop scheduling problems

    NASA Astrophysics Data System (ADS)

    Buddala, Raviteja; Mahapatra, Siba Sankar

    2017-11-01

    The flexible flow shop (or hybrid flow shop) scheduling problem is an extension of the classical flow shop scheduling problem. In a simple flow shop configuration, a job having 'g' operations is performed on 'g' operation centres (stages), with each stage having only one machine. If any stage contains more than one machine for providing an alternate processing facility, then the problem becomes a flexible flow shop problem (FFSP). The FFSP, which contains all the complexities involved in simple flow shop and parallel machine scheduling problems, is a well-known NP-hard (non-deterministic polynomial-time hard) problem. Owing to the high computational complexity involved in solving these problems, it is not always possible to obtain an optimal solution in a reasonable computation time. To obtain near-optimal solutions in a reasonable computation time, a large variety of meta-heuristics have been proposed in the past. However, tuning algorithm-specific parameters for solving the FFSP is rather tricky and time consuming. To address this limitation, the teaching-learning-based optimization (TLBO) and JAYA algorithms are chosen for the study because they are not only recent meta-heuristics but also do not require tuning of algorithm-specific parameters. Although these algorithms seem to be elegant, they lose solution diversity after a few iterations and get trapped at local optima. To alleviate this drawback, a new local search procedure is proposed in this paper to improve the solution quality. Further, a mutation strategy (inspired by the genetic algorithm) is incorporated in the basic algorithm to maintain solution diversity in the population. Computational experiments have been conducted on standard benchmark problems to calculate makespan and computational time. It is found that the rate of convergence of TLBO is superior to that of JAYA. From the results, it is found that TLBO and JAYA outperform many algorithms reported in the literature and can be treated as efficient methods for solving the FFSP.
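
    For readers unfamiliar with JAYA, the sketch below shows the basic update rule as it is commonly stated for continuous problems: each candidate moves toward the current best solution and away from the current worst, with no algorithm-specific tuning parameters. It minimizes a toy continuous objective rather than a flow shop makespan, and it omits the local search and mutation steps proposed in the paper; the function name `jaya_minimize` and its arguments are assumptions for illustration.

```python
import random
from typing import Callable, List, Tuple


def jaya_minimize(f: Callable[[List[float]], float], dim: int,
                  lower: float, upper: float,
                  pop_size: int = 20, iters: int = 200,
                  seed: int = 0) -> Tuple[List[float], List[float]]:
    """Return the best solution found and the best cost after each iteration."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(pop_size)]
    cost = [f(x) for x in pop]
    history = [min(cost)]
    for _ in range(iters):
        best = pop[cost.index(min(cost))]
        worst = pop[cost.index(max(cost))]
        for i, x in enumerate(pop):
            cand = []
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Move toward the best solution and away from the worst one.
                v = x[j] + r1 * (best[j] - abs(x[j])) - r2 * (worst[j] - abs(x[j]))
                cand.append(min(upper, max(lower, v)))  # clamp to the bounds
            c = f(cand)
            if c < cost[i]:  # greedy acceptance: keep only improvements
                pop[i], cost[i] = cand, c
        history.append(min(cost))
    best_i = cost.index(min(cost))
    return pop[best_i], history
```

    Because a candidate replaces its parent only when it improves, the best cost never gets worse from one iteration to the next. The loss of diversity noted in the abstract shows up here as the population collapsing around a single point.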

  5. Efficient Geometric Sound Propagation Using Visibility Culling

    NASA Astrophysics Data System (ADS)

    Chandak, Anish

    2011-07-01

    Simulating propagation of sound can improve the sense of realism in interactive applications such as video games and can lead to better designs in engineering applications such as architectural acoustics. In this thesis, we present geometric sound propagation techniques which are faster than prior methods and map well to upcoming parallel multi-core CPUs. We model specular reflections by using the image-source method and model finite-edge diffraction by using the well-known Biot-Tolstoy-Medwin (BTM) model. We accelerate the computation of specular reflections by applying novel visibility algorithms, FastV and AD-Frustum, which compute visibility from a point. We accelerate finite-edge diffraction modeling by applying a novel visibility algorithm which computes visibility from a region. Our visibility algorithms are based on frustum tracing and exploit recent advances in fast ray-hierarchy intersections, data-parallel computations, and scalable, multi-core algorithms. The AD-Frustum algorithm adapts its computation to the scene complexity and allows small errors in computing specular reflection paths for higher computational efficiency. FastV and our visibility algorithm from a region are general, object-space, conservative visibility algorithms that together significantly reduce the number of image sources compared to other techniques while preserving the same accuracy. Our geometric propagation algorithms are an order of magnitude faster than prior approaches for modeling specular reflections and two to ten times faster for modeling finite-edge diffraction. Our algorithms are interactive, scale almost linearly on multi-core CPUs, and can handle large, complex, and dynamic scenes. We also compare the accuracy of our sound propagation algorithms with other methods. Once sound propagation is performed, it is desirable to listen to the propagated sound in interactive and engineering applications. 
    We can generate smooth, artifact-free output audio signals by applying efficient audio-processing algorithms. We also present the first efficient audio-processing algorithm for scenarios with a simultaneously moving source and moving receiver (MS-MR), which incurs less than 25% overhead compared to static source and moving receiver (SS-MR) or moving source and static receiver (MS-SR) scenarios.
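
    The image-source method mentioned in this abstract rests on one geometric step: mirroring the source across a reflecting plane, so that the length of the specular path equals the straight-line distance from the mirrored source to the receiver. The sketch below shows only that step for a single infinite plane. It does not check whether the reflection point lies on a finite wall or is occluded, which is the job of the visibility algorithms; the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def mirror_point(source: Sequence[float], plane_point: Sequence[float],
                 normal: Sequence[float]) -> Vec3:
    """Mirror `source` across the plane through `plane_point` with the given normal."""
    length = math.sqrt(sum(c * c for c in normal))
    n = [c / length for c in normal]  # unit normal
    # Signed distance from the source to the plane.
    d = sum((s - p) * c for s, p, c in zip(source, plane_point, n))
    return tuple(s - 2.0 * d * c for s, c in zip(source, n))


def reflection_path_length(source: Sequence[float], receiver: Sequence[float],
                           plane_point: Sequence[float],
                           normal: Sequence[float]) -> float:
    """Length of the first-order specular path source -> plane -> receiver."""
    image = mirror_point(source, plane_point, normal)
    return math.dist(image, receiver)
```

    Higher-order reflections repeat the mirroring on each image source, which is why pruning invisible image sources matters so much for performance.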

  6. Convergence issues in domain decomposition parallel computation of hovering rotor

    NASA Astrophysics Data System (ADS)

    Xiao, Zhongyun; Liu, Gang; Mou, Bin; Jiang, Xiong

    2018-05-01

    The implicit LU-SGS time integration algorithm has been widely used in parallel computation in spite of its lack of information from adjacent domains. When applied to parallel computation of hovering rotor flows in a rotating frame, it brings about convergence issues. To remedy the problem, three LU factorization-based implicit schemes (LU-SGS, DP-LUR and HLU-SGS) are investigated comparatively. A test case of pure grid rotation is designed to verify these algorithms, and it shows that the LU-SGS algorithm introduces errors on boundary cells. When partition boundaries are circumferential, errors arise in proportion to grid speed, accumulate with the rotation, and eventually lead to computational failure. Meanwhile, the DP-LUR and HLU-SGS methods show good convergence owing to their boundary treatment, which is desirable in domain decomposition parallel computations.

  7. Computing the Envelope for Stepwise-Constant Resource Allocations

    NASA Technical Reports Server (NTRS)

    Muscettola, Nicola; Clancy, Daniel (Technical Monitor)

    2002-01-01

    Computing tight resource-level bounds is a fundamental problem in the construction of flexible plans with resource utilization. In this paper we describe an efficient algorithm that builds a resource envelope, the tightest possible such bound. The algorithm is based on transforming the temporal network of resource consuming and producing events into a flow network with nodes equal to the events and edges equal to the necessary predecessor links between events. A staged maximum flow problem on the network is then used to compute the time of occurrence and the height of each step of the resource envelope profile. Each stage has the same computational complexity as solving a maximum flow problem on the entire flow network. This makes the method computationally feasible and promising for use in the inner loop of flexible-time scheduling algorithms.

  8. High-performance computing on GPUs for resistivity logging of oil and gas wells

    NASA Astrophysics Data System (ADS)

    Glinskikh, V.; Dudaev, A.; Nechaev, O.; Surodina, I.

    2017-10-01

    We developed and implemented in software an algorithm for high-performance simulation of electrical logs from oil and gas wells using heterogeneous computing. The numerical solution of the 2D forward problem is based on the finite-element method and the Cholesky decomposition for solving a system of linear algebraic equations (SLAE). Software implementations of the algorithm were built using NVIDIA CUDA technology and computing libraries, allowing us to decompose the SLAE and find its solution on the central processing unit (CPU) and the graphics processing unit (GPU). The calculation time is analyzed as a function of the matrix size and the number of its non-zero elements. We estimated the computing speed on CPU and GPU, including high-performance heterogeneous CPU-GPU computing. Using the developed algorithm, we simulated resistivity data in realistic models.
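
    The Cholesky-based solve named in this abstract can be illustrated with a small dense, CPU-only sketch: factor a symmetric positive-definite matrix A as L L^T, then solve by forward and back substitution. This is not the authors' sparse GPU implementation; `cholesky` and `solve_spd` are names chosen for this example.

```python
import math
from typing import List

Matrix = List[List[float]]


def cholesky(a: Matrix) -> Matrix:
    """Return lower-triangular L with A = L L^T for a symmetric positive-definite A."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                l[i][j] = math.sqrt(d)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def solve_spd(a: Matrix, b: List[float]) -> List[float]:
    """Solve A x = b via Cholesky: L y = b (forward), then L^T x = y (backward)."""
    l = cholesky(a)
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x
```

    Finite-element matrices are sparse, so practical codes use sparse factorizations whose cost depends on the number of non-zero elements, which is why the abstract analyzes time against both matrix size and non-zero count.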

  9. Parallel and Preemptable Dynamically Dimensioned Search Algorithms for Single and Multi-objective Optimization in Water Resources

    NASA Astrophysics Data System (ADS)

    Tolson, B.; Matott, L. S.; Gaffoor, T. A.; Asadzadeh, M.; Shafii, M.; Pomorski, P.; Xu, X.; Jahanpour, M.; Razavi, S.; Haghnegahdar, A.; Craig, J. R.

    2015-12-01

    We introduce asynchronous parallel implementations of the Dynamically Dimensioned Search (DDS) family of algorithms including DDS, discrete DDS, PA-DDS and DDS-AU. These parallel algorithms differ from most existing parallel optimization algorithms in the water resources field in that parallel DDS is asynchronous and does not require an entire population (set of candidate solutions) to be evaluated before generating and then sending a new candidate solution for evaluation. One key advance in this study is developing the first parallel PA-DDS multi-objective optimization algorithm. The other key advance is enhancing the computational efficiency of solving optimization problems (such as model calibration) by combining a parallel optimization algorithm with the deterministic model pre-emption concept. These two efficiency techniques can only be combined because of the asynchronous nature of parallel DDS. Model pre-emption functions to terminate simulation model runs early, prior to completely simulating the model calibration period for example, when intermediate results indicate the candidate solution is so poor that it will definitely have no influence on the generation of further candidate solutions. The computational savings of deterministic model preemption available in serial implementations of population-based algorithms (e.g., PSO) disappear in synchronous parallel implementations as these algorithms. In addition to the key advances above, we implement the algorithms across a range of computation platforms (Windows and Unix-based operating systems from multi-core desktops to a supercomputer system) and package these for future modellers within a model-independent calibration software package called Ostrich as well as MATLAB versions.
    Results across multiple platforms and multiple case studies (from 4 to 64 processors) demonstrate the vast improvement over serial DDS-based algorithms and highlight the important role model pre-emption plays in the performance of parallel, pre-emptable DDS algorithms. Case studies include single- and multiple-objective optimization problems in water resources model calibration, and in many cases linear or near-linear speedups are observed.
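
    The model pre-emption idea can be sketched under one assumption: the calibration objective is a sum of squared errors accumulated over time steps. Such a sum can only grow, so once it exceeds the best objective found so far, the remaining time steps cannot rescue the candidate and the run can stop. The function `sse_with_preemption` and the `simulate_step` callback are hypothetical names, not part of Ostrich.

```python
from typing import Callable, Sequence, Tuple


def sse_with_preemption(simulate_step: Callable[[int], float],
                        observed: Sequence[float],
                        threshold: float) -> Tuple[float, int, bool]:
    """Accumulate squared error step by step, stopping once it exceeds `threshold`.

    Returns (error accumulated so far, time steps simulated, whether pre-empted).
    """
    total = 0.0
    for t, obs in enumerate(observed):
        total += (simulate_step(t) - obs) ** 2
        if total > threshold:
            # The sum is non-decreasing, so this candidate cannot beat the threshold.
            return total, t + 1, True
    return total, len(observed), False
```

    A search algorithm would pass its current best objective value as `threshold`. Because a pre-empted run would have been rejected anyway, the search follows the same trajectory as without pre-emption while simulating fewer time steps.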

  10. Fast algorithms for computing phylogenetic divergence time.

    PubMed

    Crosby, Ralph W; Williams, Tiffani L

    2017-12-06

    The inference of species divergence time is a key step in most phylogenetic studies. Methods have been available for the last ten years to perform the inference, but the performance of the methods does not yet scale well to studies with hundreds of taxa and thousands of DNA base pairs. For example, a study of 349 primate taxa was estimated to require over 9 months of processing time. In this work, we present a new algorithm, AncestralAge, that significantly improves the performance of the divergence time process. As part of AncestralAge, we demonstrate a new method for the computation of phylogenetic likelihood, and our experiments show a 90% improvement in likelihood computation time on the aforementioned dataset of 349 primate taxa with over 60,000 DNA base pairs. Additionally, we show that our new method for the computation of the Bayesian prior on node ages reduces the running time for this computation on the 349 taxa dataset by 99%. Through the use of these new algorithms, we open up the ability to perform divergence time inference on large phylogenetic studies.

  11. GPU-accelerated algorithms for many-particle continuous-time quantum walks

    NASA Astrophysics Data System (ADS)

    Piccinini, Enrico; Benedetti, Claudia; Siloi, Ilaria; Paris, Matteo G. A.; Bordone, Paolo

    2017-06-01

    Many-particle continuous-time quantum walks (CTQWs) represent a resource for several tasks in quantum technology, including quantum search algorithms and universal quantum computation. In order to design and implement CTQWs in a realistic scenario, one needs effective simulation tools for Hamiltonians that take into account static noise and fluctuations in the lattice, i.e. Hamiltonians containing stochastic terms. To this aim, we suggest a parallel algorithm based on the Taylor series expansion of the evolution operator, and compare its performance with that of algorithms based on the exact diagonalization of the Hamiltonian or a 4th order Runge-Kutta integration. We prove that both Taylor-series expansion and Runge-Kutta algorithms are reliable and have a low computational cost, the Taylor-series expansion showing the additional advantage of a memory allocation not depending on the precision of calculation. Both algorithms are also highly parallelizable within the SIMT paradigm, and are thus suitable for GPGPU computing. In turn, we have benchmarked 4 NVIDIA GPUs and 3 quad-core Intel CPUs for a 2-particle system over lattices of increasing dimension, showing that the speedup provided by GPU computing, with respect to the OpenMP parallelization, lies in the range between 8x and (more than) 20x, depending on the frequency of post-processing. GPU-accelerated codes thus allow one to overcome concerns about the execution time, and make possible simulations with many interacting particles on large lattices, limited only by the memory available on the device.
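
    The Taylor-series approach amounts to applying a truncated series for the evolution operator, psi(t) = sum over k of (-iHt)^k / k! applied to psi(0), using only repeated matrix-vector products. The serial sketch below assumes units with hbar = 1 and a small dense Hamiltonian; it is not the authors' GPU code, and `taylor_evolve` is a hypothetical name.

```python
from typing import List, Sequence


def taylor_evolve(h: Sequence[Sequence[complex]], psi0: Sequence[complex],
                  t: float, order: int = 30) -> List[complex]:
    """Approximate exp(-i H t) psi0 with a Taylor series truncated at `order`."""
    n = len(psi0)
    term = [complex(a) for a in psi0]  # k = 0 term of the series
    psi = list(term)
    for k in range(1, order + 1):
        # term_k = (-i t / k) * H * term_{k-1}
        h_term = [sum(h[r][c] * term[c] for c in range(n)) for r in range(n)]
        term = [(-1j * t / k) * v for v in h_term]
        psi = [p + q for p, q in zip(psi, term)]
    return psi
```

    Only the running sum and the current term are stored, whatever the truncation order, which is consistent with the abstract's remark that memory allocation does not depend on the precision of the calculation. For the two-site Hamiltonian [[0, 1], [1, 0]] and t = pi/2, the walker moves entirely from the first site to the second.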

  12. A Parallel Particle Swarm Optimization Algorithm Accelerated by Asynchronous Evaluations

    NASA Technical Reports Server (NTRS)

    Venter, Gerhard; Sobieszczanski-Sobieski, Jaroslaw

    2005-01-01

    A parallel Particle Swarm Optimization (PSO) algorithm is presented. Particle swarm optimization is a fairly recent addition to the family of non-gradient based, probabilistic search algorithms that is based on a simplified social model and is closely tied to swarming theory. Although PSO algorithms present several attractive properties to the designer, they are plagued by high computational cost as measured by elapsed time. One approach to reduce the elapsed time is to make use of coarse-grained parallelization to evaluate the design points. Previous parallel PSO algorithms were mostly implemented in a synchronous manner, where all design points within a design iteration are evaluated before the next iteration is started. This approach leads to poor parallel speedup in cases where a heterogeneous parallel environment is used and/or where the analysis time depends on the design point being analyzed. This paper introduces an asynchronous parallel PSO algorithm that greatly improves the parallel efficiency. The asynchronous algorithm is benchmarked on a cluster assembled of Apple Macintosh G5 desktop computers, using the multi-disciplinary optimization of a typical transport aircraft wing as an example.

  13. Cloud computing-based TagSNP selection algorithm for human genome data.

    PubMed

    Hung, Che-Lun; Chen, Wen-Pei; Hua, Guan-Jie; Zheng, Huiru; Tsai, Suh-Jen Jane; Lin, Yaw-Ling

    2015-01-05

    Single nucleotide polymorphisms (SNPs) play a fundamental role in human genetic variation and are used in medical diagnostics, phylogeny construction, and drug design. They provide the highest-resolution genetic fingerprint for identifying disease associations and human features. Haplotypes are regions of linked genetic variants that are closely spaced on the genome and tend to be inherited together. Genetics research has revealed SNPs within certain haplotype blocks that introduce few distinct common haplotypes into most of the population. Haplotype block structures are used in association-based methods to map disease genes. In this paper, we propose an efficient algorithm for identifying haplotype blocks in the genome. In chromosomal haplotype data retrieved from the HapMap project website, the proposed algorithm identified longer haplotype blocks than an existing algorithm. To enhance its performance, we extended the proposed algorithm into a parallel algorithm that copies data in parallel via the Hadoop MapReduce framework. The proposed MapReduce-paralleled combinatorial algorithm performed well on real-world data obtained from the HapMap dataset; the improvement in computational efficiency was proportional to the number of processors used.

  14. Cloud Computing-Based TagSNP Selection Algorithm for Human Genome Data

    PubMed Central

    Hung, Che-Lun; Chen, Wen-Pei; Hua, Guan-Jie; Zheng, Huiru; Tsai, Suh-Jen Jane; Lin, Yaw-Ling

    2015-01-01

    Single nucleotide polymorphisms (SNPs) play a fundamental role in human genetic variation and are used in medical diagnostics, phylogeny construction, and drug design. They provide the highest-resolution genetic fingerprint for identifying disease associations and human features. Haplotypes are regions of linked genetic variants that are closely spaced on the genome and tend to be inherited together. Genetics research has revealed SNPs within certain haplotype blocks that introduce few distinct common haplotypes into most of the population. Haplotype block structures are used in association-based methods to map disease genes. In this paper, we propose an efficient algorithm for identifying haplotype blocks in the genome. In chromosomal haplotype data retrieved from the HapMap project website, the proposed algorithm identified longer haplotype blocks than an existing algorithm. To enhance its performance, we extended the proposed algorithm into a parallel algorithm that copies data in parallel via the Hadoop MapReduce framework. The proposed MapReduce-paralleled combinatorial algorithm performed well on real-world data obtained from the HapMap dataset; the improvement in computational efficiency was proportional to the number of processors used. PMID:25569088

  15. A vectorized Lanczos eigensolver for high-performance computers

    NASA Technical Reports Server (NTRS)

    Bostic, Susan W.

    1990-01-01

    The computational strategies used to implement a Lanczos-based-method eigensolver on the latest generation of supercomputers are described. Several examples of structural vibration and buckling problems are presented that show the effects of using optimization techniques to increase the vectorization of the computational steps. The data storage and access schemes and the tools and strategies that best exploit the computer resources are presented. The method is implemented on the Convex C220, the Cray 2, and the Cray Y-MP computers. Results show that very good computation rates are achieved for the most computationally intensive steps of the Lanczos algorithm and that the Lanczos algorithm is many times faster than other methods extensively used in the past.

  16. Adjoint-Based Algorithms for Adaptation and Design Optimizations on Unstructured Grids

    NASA Technical Reports Server (NTRS)

    Nielsen, Eric J.

    2006-01-01

    Schemes based on discrete adjoint algorithms present several exciting opportunities for significantly advancing the current state of the art in computational fluid dynamics. Such methods provide an extremely efficient means for obtaining discretely consistent sensitivity information for hundreds of design variables, opening the door to rigorous, automated design optimization of complex aerospace configuration using the Navier-Stokes equation. Moreover, the discrete adjoint formulation provides a mathematically rigorous foundation for mesh adaptation and systematic reduction of spatial discretization error. Error estimates are also an inherent by-product of an adjoint-based approach, valuable information that is virtually non-existent in today's large-scale CFD simulations. An overview of the adjoint-based algorithm work at NASA Langley Research Center is presented, with examples demonstrating the potential impact on complex computational problems related to design optimization as well as mesh adaptation.

  17. Correlation-coefficient-based fast template matching through partial elimination.

    PubMed

    Mahmood, Arif; Khan, Sohaib

    2012-04-01

    Partial computation elimination techniques are often used for fast template matching. At a particular search location, computations are prematurely terminated as soon as it is found that this location cannot compete with an already known best match location. Due to the nonmonotonic growth pattern of the correlation-based similarity measures, partial computation elimination techniques have been traditionally considered inapplicable to speed up these measures. In this paper, we show that partial elimination techniques may be applied to a correlation coefficient by using a monotonic formulation, and we propose basic-mode and extended-mode partial correlation elimination algorithms for fast template matching. The basic-mode algorithm is more efficient on small template sizes, whereas the extended mode is faster on medium and larger templates. We also propose a strategy to decide which algorithm to use for a given data set. To achieve a high speedup, elimination algorithms require an initial guess of the peak correlation value. We propose two initialization schemes including a coarse-to-fine scheme for larger templates and a two-stage technique for small- and medium-sized templates. Our proposed algorithms are exact, i.e., having exhaustive equivalent accuracy, and are compared with the existing fast techniques using real image data sets on a wide variety of template sizes. While the actual speedups are data dependent, in most cases, our proposed algorithms have been found to be significantly faster than the other algorithms.

  18. Efficient Pricing Technique for Resource Allocation Problem in Downlink OFDM Cognitive Radio Networks

    NASA Astrophysics Data System (ADS)

    Abdulghafoor, O. B.; Shaat, M. M. R.; Ismail, M.; Nordin, R.; Yuwono, T.; Alwahedy, O. N. A.

    2017-05-01

    In this paper, the problem of resource allocation in OFDM-based downlink cognitive radio (CR) networks is addressed. The purpose of this research is to decrease the computational complexity of the resource allocation algorithm for downlink CR networks while respecting the interference constraint of the primary network. This objective is achieved by adopting a pricing scheme to develop a power allocation algorithm with the following concerns: (i) reducing the complexity of the proposed algorithm and (ii) providing firm power control over the interference introduced to primary users (PUs). The performance of the proposed algorithm is tested for OFDM-CRNs. The simulation results show that the performance of the proposed algorithm approaches that of the optimal algorithm at a lower computational complexity, i.e., O(NlogN), which makes the proposed algorithm suitable for more practical applications.

  19. Ray Tracing Through Non-Imaging Concentrators

    NASA Astrophysics Data System (ADS)

    Greynolds, Alan W.

    1984-01-01

    A generalized algorithm for tracing rays through both imaging and non-imaging radiation collectors is presented. A computer program based on the algorithm is then applied to analyzing various two-stage Winston concentrators.

  20. The Cyborg Astrobiologist: testing a novelty detection algorithm on two mobile exploration systems at Rivas Vaciamadrid in Spain and at the Mars Desert Research Station in Utah

    NASA Astrophysics Data System (ADS)

    McGuire, P. C.; Gross, C.; Wendt, L.; Bonnici, A.; Souza-Egipsy, V.; Ormö, J.; Díaz-Martínez, E.; Foing, B. H.; Bose, R.; Walter, S.; Oesker, M.; Ontrup, J.; Haschke, R.; Ritter, H.

    2010-01-01

    In previous work, a platform was developed for testing computer-vision algorithms for robotic planetary exploration. This platform consisted of a digital video camera connected to a wearable computer for real-time processing of images at geological and astrobiological field sites. The real-time processing included image segmentation and the generation of interest points based upon uncommonness in the segmentation maps. Also in previous work, this platform for testing computer-vision algorithms has been ported to a more ergonomic alternative platform, consisting of a phone camera connected via the Global System for Mobile Communications (GSM) network to a remote-server computer. The wearable-computer platform has been tested at geological and astrobiological field sites in Spain (Rivas Vaciamadrid and Riba de Santiuste), and the phone camera has been tested at a geological field site in Malta. In this work, we (i) apply a Hopfield neural-network algorithm for novelty detection based upon colour, (ii) integrate a field-capable digital microscope on the wearable computer platform, (iii) test this novelty detection with the digital microscope at Rivas Vaciamadrid, (iv) develop a Bluetooth communication mode for the phone-camera platform, in order to allow access to a mobile processing computer at the field sites, and (v) test the novelty detection on the Bluetooth-enabled phone camera connected to a netbook computer at the Mars Desert Research Station in Utah. This systems engineering and field testing have together allowed us to develop a real-time computer-vision system that is capable, for example, of identifying lichens as novel within a series of images acquired in semi-arid desert environments. We acquired sequences of images of geologic outcrops in Utah and Spain consisting of various rock types and colours to test this algorithm. 
    The algorithm robustly recognized previously observed units by their colour, while requiring only a single image or a few images to learn colours as familiar, demonstrating its fast learning capability.

  1. Vision-based algorithms for near-host object detection and multilane sensing

    NASA Astrophysics Data System (ADS)

    Kenue, Surender K.

    1995-01-01

    Vision-based sensing can be used for lane sensing, adaptive cruise control, collision warning, and driver performance monitoring functions of intelligent vehicles. Current computer vision algorithms are not robust for handling multiple vehicles in highway scenarios. Several new algorithms are proposed for multi-lane sensing, near-host object detection, vehicle cut-in situations, and specifying regions of interest for object tracking. These algorithms were tested successfully on more than 6000 images taken from real-highway scenes under different daytime lighting conditions.

  2. Algorithm for detection the QRS complexes based on support vector machine

    NASA Astrophysics Data System (ADS)

    Van, G. V.; Podmasteryev, K. V.

    2017-11-01

    The efficiency of computer ECG analysis depends on the accurate detection of QRS complexes. This paper presents an algorithm for QRS complex detection based on a support vector machine (SVM). The proposed algorithm is evaluated on annotated standard databases such as the MIT-BIH Arrhythmia database. The QRS detector obtained a sensitivity Se = 98.32% and specificity Sp = 95.46% on the MIT-BIH Arrhythmia database. This algorithm can be used as the basis for software to diagnose electrical activity of the heart.

  3. Efficient rejection-based simulation of biochemical reactions with stochastic noise and delays

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thanh, Vo Hong, E-mail: vo@cosbi.eu; Priami, Corrado, E-mail: priami@cosbi.eu; Department of Mathematics, University of Trento

    2014-10-07

    We propose a new exact stochastic rejection-based simulation algorithm for biochemical reactions and extend it to systems with delays. Our algorithm accelerates the simulation by pre-computing reaction propensity bounds to select the next reaction to perform. Exploiting such bounds, we are able to avoid recomputing propensities every time a (delayed) reaction is initiated or finished, as is typically necessary in standard approaches. Propensity updates in our approach are still performed, but only infrequently and limited to a small number of reactions, saving computation time without sacrificing exactness. We evaluate the performance improvement of our algorithm by experimenting with concrete biological models.
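
    The idea of selecting reactions through propensity bounds can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `rejection_step` and its parameters are hypothetical, and the sketch assumes that `lower[j] <= propensity(j) <= upper[j]` holds for every reaction j. A candidate is drawn in proportion to its upper bound and accepted by a rejection test, so the exact propensity is evaluated only when the cheap lower-bound test fails.

```python
import random


def rejection_step(propensity, lower, upper, rng):
    """Select the next reaction by rejection sampling (illustrative sketch).

    propensity(j) returns the exact propensity of reaction j and is assumed
    to lie in [lower[j], upper[j]]. Returns (reaction index, elapsed time).
    """
    total_upper = sum(upper)
    if total_upper <= 0:
        raise ValueError("at least one upper bound must be positive")
    elapsed = 0.0
    while True:
        # Every trial, accepted or not, consumes an exponential waiting
        # time whose rate is the sum of the upper bounds (thinning).
        elapsed += rng.expovariate(total_upper)
        # Candidate j is drawn with probability upper[j] / total_upper.
        j = rng.choices(range(len(upper)), weights=upper)[0]
        u = rng.random() * upper[j]
        # Cheap test first: below the lower bound means certain acceptance,
        # so the exact propensity is not evaluated (short-circuit `or`).
        if u < lower[j] or u < propensity(j):
            return j, elapsed
```

    As long as the bounds remain valid they can be reused across many steps, which is where a saving of this kind would come from; when a bound is violated it has to be recomputed, a step this sketch leaves out.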

  4. Multiple feature fusion via covariance matrix for visual tracking

    NASA Astrophysics Data System (ADS)

    Jin, Zefenfen; Hou, Zhiqiang; Yu, Wangsheng; Wang, Xin; Sun, Hui

    2018-04-01

    Aiming at the problem of complicated dynamic scenes in visual target tracking, a multi-feature fusion tracking algorithm based on the covariance matrix is proposed to improve the robustness of the tracking algorithm. In the framework of the quantum genetic algorithm, this paper uses the region covariance descriptor to fuse the color, edge and texture features. It also uses a fast covariance intersection algorithm to update the model. The low dimension of the region covariance descriptor, the fast convergence speed and strong global optimization ability of the quantum genetic algorithm, and the fast computation of the fast covariance intersection algorithm are used to improve the computational efficiency of the fusion, matching, and updating processes, so that the algorithm achieves fast and effective multi-feature fusion tracking. The experiments show that the proposed algorithm can not only achieve fast and robust tracking but also effectively handle interference from occlusion, rotation, deformation, motion blur and so on.

  5. A proximity algorithm accelerated by Gauss-Seidel iterations for L1/TV denoising models

    NASA Astrophysics Data System (ADS)

    Li, Qia; Micchelli, Charles A.; Shen, Lixin; Xu, Yuesheng

    2012-09-01

    Our goal in this paper is to improve the computational performance of the proximity algorithms for the L1/TV denoising model. This leads us to a new characterization of all solutions to the L1/TV model via fixed-point equations expressed in terms of the proximity operators. Based upon this observation we develop an algorithm for solving the model and establish its convergence. Furthermore, we demonstrate that the proposed algorithm can be accelerated through the use of the componentwise Gauss-Seidel iteration so that the CPU time consumed is significantly reduced. Numerical experiments using the proposed algorithm for impulsive noise removal are included, with a comparison to three recently developed algorithms. The numerical results show that while the proposed algorithm enjoys a high quality of the restored images, as the other three known algorithms do, it performs significantly better in terms of computational efficiency measured in the CPU time consumed.

  6. A robust method of computing finite difference coefficients based on Vandermonde matrix

    NASA Astrophysics Data System (ADS)

    Zhang, Yijie; Gao, Jinghuai; Peng, Jigen; Han, Weimin

    2018-05-01

    When the finite difference (FD) method is employed to simulate wave propagation, a high-order FD method is preferred in order to achieve better accuracy. However, if the order of the FD scheme is high enough, the coefficient matrix of the formula for calculating finite difference coefficients is close to singular. In this case, when the FD coefficients are computed by the matrix inverse operator of MATLAB, inaccuracy can be produced. In order to overcome this problem, we suggest an algorithm based on the Vandermonde matrix in this paper. After a specified mathematical transformation, the coefficient matrix is transformed into a Vandermonde matrix. Then the FD coefficients of the high-order FD method can be computed by the algorithm for the Vandermonde matrix, which avoids inverting the nearly singular matrix. The dispersion analysis and numerical results of a homogeneous elastic model and a geophysical model of an oil and gas reservoir demonstrate that the algorithm based on the Vandermonde matrix has better accuracy compared with the matrix inverse operator of MATLAB.
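
    For orientation, the link between FD coefficients and a Vandermonde system can be sketched as follows. This is not the paper's algorithm: it sets up the standard Taylor-series conditions sum_j c_j x_j^k = m! if k = m, else 0 (for k = 0..n-1), whose matrix is a transposed Vandermonde matrix in the stencil offsets x_j, and solves it by Gauss-Jordan elimination in exact rational arithmetic so that rounding plays no role. The function name `fd_coefficients` is hypothetical, and unit grid spacing is assumed.

```python
from fractions import Fraction
from math import factorial


def fd_coefficients(offsets, deriv):
    """Exact FD weights for the `deriv`-th derivative on the given stencil.

    offsets: distinct integer (or rational) grid offsets, unit spacing assumed.
    Solves sum_j c_j * x_j**k = deriv! * [k == deriv] for k = 0..n-1.
    """
    n = len(offsets)
    if not 0 <= deriv < n:
        raise ValueError("need 0 <= deriv < number of stencil points")
    # Augmented matrix: row k holds x_0^k ... x_{n-1}^k and the right-hand side.
    rows = [
        [Fraction(x) ** k for x in offsets]
        + [Fraction(factorial(deriv) if k == deriv else 0)]
        for k in range(n)
    ]
    # Gauss-Jordan elimination with exact fractions (no rounding error).
    for col in range(n):
        # Distinct offsets guarantee that a nonzero pivot exists.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[k][n] for k in range(n)]
```

    For the three-point stencil with offsets -1, 0, 1 this reproduces the familiar second-derivative weights 1, -2, 1. Exact arithmetic sidesteps the conditioning problem rather than solving it efficiently; a floating-point method would need a dedicated Vandermonde solver of the kind the abstract describes.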

  7. A GENERAL ALGORITHM FOR THE CONSTRUCTION OF CONTOUR PLOTS

    NASA Technical Reports Server (NTRS)

    Johnson, W.

    1994-01-01

    The graphical presentation of experimentally or theoretically generated data sets frequently involves the construction of contour plots. A general computer algorithm has been developed for the construction of contour plots. The algorithm provides for efficient and accurate contouring with a modular approach which allows flexibility in modifying the algorithm for special applications. The algorithm accepts as input data values at a set of points irregularly distributed over a plane. The algorithm is based on an interpolation scheme in which the points in the plane are connected by straight line segments to form a set of triangles. In general, the data is smoothed using a least-squares-error fit of the data to a bivariate polynomial. To construct the contours, interpolation along the edges of the triangles is performed, using the bivariable polynomial if data smoothing was performed. Once the contour points have been located, the contour may be drawn. This program is written in FORTRAN IV for batch execution and has been implemented on an IBM 360 series computer with a central memory requirement of approximately 100K of 8-bit bytes. This computer algorithm was developed in 1981.
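
    The edge-interpolation step described above can be sketched in a few lines. This is an illustrative Python sketch, not the FORTRAN IV program: it assumes plain linear interpolation along each triangle edge (the unsmoothed case), and the names `edge_crossing` and `triangle_segment` are hypothetical.

```python
def edge_crossing(p0, p1, v0, v1, level):
    """Point where the contour `level` crosses the edge p0-p1, or None.

    p0, p1 are (x, y) tuples; v0, v1 are the data values at those points.
    Linear variation of the data along the edge is assumed.
    """
    # No crossing if both ends lie strictly on the same side of the level,
    # or if the edge is flat (no unique crossing point).
    if (v0 - level) * (v1 - level) > 0 or v0 == v1:
        return None
    t = (level - v0) / (v1 - v0)
    return (p0[0] + t * (p1[0] - p0[0]), p0[1] + t * (p1[1] - p0[1]))


def triangle_segment(points, values, level):
    """Contour points on the three edges of one triangle."""
    crossings = []
    for a, b in ((0, 1), (1, 2), (2, 0)):
        c = edge_crossing(points[a], points[b], values[a], values[b], level)
        if c is not None and c not in crossings:
            crossings.append(c)
    return crossings
```

    Joining the two crossing points found in each triangle gives one straight piece of the contour; repeating this over all triangles traces the full contour line.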

  8. DANoC: An Efficient Algorithm and Hardware Codesign of Deep Neural Networks on Chip.

    PubMed

    Zhou, Xichuan; Li, Shengli; Tang, Fang; Hu, Shengdong; Lin, Zhi; Zhang, Lei

    2017-07-18

    Deep neural networks (NNs) are the state-of-the-art models for understanding the content of images and videos. However, implementing deep NNs in embedded systems is a challenging task, e.g., a typical deep belief network could exhaust gigabytes of memory and result in bandwidth and computational bottlenecks. To address this challenge, this paper presents an algorithm and hardware codesign for efficient deep neural computation. A hardware-oriented deep learning algorithm, named the deep adaptive network, is proposed to explore the sparsity of neural connections. By adaptively removing the majority of neural connections and robustly representing the reserved connections using binary integers, the proposed algorithm could save up to 99.9% memory utility and computational resources without undermining classification accuracy. An efficient sparse-mapping-memory-based hardware architecture is proposed to fully take advantage of the algorithmic optimization. Different from traditional Von Neumann architecture, the deep-adaptive network on chip (DANoC) brings communication and computation in close proximity to avoid power-hungry parameter transfers between on-board memory and on-chip computational units. Experiments over different image classification benchmarks show that the DANoC system achieves competitively high accuracy and efficiency compared with the state-of-the-art approaches.

  9. Local Competition-Based Superpixel Segmentation Algorithm in Remote Sensing

    PubMed Central

    Liu, Jiayin; Tang, Zhenmin; Cui, Ying; Wu, Guoxing

    2017-01-01

    Remote sensing technologies have been widely applied in urban environments’ monitoring, synthesis and modeling. Incorporating spatial information in perceptually coherent regions, superpixel-based approaches can effectively eliminate the “salt and pepper” phenomenon which is common in pixel-wise approaches. Compared with fixed-size windows, superpixels have adaptive sizes and shapes for different spatial structures. Moreover, superpixel-based algorithms can significantly improve computational efficiency owing to the greatly reduced number of image primitives. Hence, the superpixel algorithm, as a preprocessing technique, is more and more popularly used in remote sensing and many other fields. In this paper, we propose a superpixel segmentation algorithm called Superpixel Segmentation with Local Competition (SSLC), which utilizes a local competition mechanism to construct energy terms and label pixels. The local competition mechanism leads to energy terms locality and relativity, and thus, the proposed algorithm is less sensitive to the diversity of image content and scene layout. Consequently, SSLC could achieve consistent performance in different image regions. In addition, the Probability Density Function (PDF), which is estimated by Kernel Density Estimation (KDE) with the Gaussian kernel, is introduced to describe the color distribution of superpixels as a more sophisticated and accurate measure. To reduce computational complexity, a boundary optimization framework is introduced to only handle boundary pixels instead of the whole image. We conduct experiments to benchmark the proposed algorithm with the other state-of-the-art ones on the Berkeley Segmentation Dataset (BSD) and remote sensing images. Results demonstrate that the SSLC algorithm yields the best overall performance, while the computation time-efficiency is still competitive. PMID:28604641

  10. Local Competition-Based Superpixel Segmentation Algorithm in Remote Sensing.

    PubMed

    Liu, Jiayin; Tang, Zhenmin; Cui, Ying; Wu, Guoxing

    2017-06-12

    Remote sensing technologies have been widely applied in urban environments' monitoring, synthesis and modeling. Incorporating spatial information in perceptually coherent regions, superpixel-based approaches can effectively eliminate the "salt and pepper" phenomenon which is common in pixel-wise approaches. Compared with fixed-size windows, superpixels have adaptive sizes and shapes for different spatial structures. Moreover, superpixel-based algorithms can significantly improve computational efficiency owing to the greatly reduced number of image primitives. Hence, the superpixel algorithm, as a preprocessing technique, is more and more popularly used in remote sensing and many other fields. In this paper, we propose a superpixel segmentation algorithm called Superpixel Segmentation with Local Competition (SSLC), which utilizes a local competition mechanism to construct energy terms and label pixels. The local competition mechanism leads to energy terms locality and relativity, and thus, the proposed algorithm is less sensitive to the diversity of image content and scene layout. Consequently, SSLC could achieve consistent performance in different image regions. In addition, the Probability Density Function (PDF), which is estimated by Kernel Density Estimation (KDE) with the Gaussian kernel, is introduced to describe the color distribution of superpixels as a more sophisticated and accurate measure. To reduce computational complexity, a boundary optimization framework is introduced to only handle boundary pixels instead of the whole image. We conduct experiments to benchmark the proposed algorithm with the other state-of-the-art ones on the Berkeley Segmentation Dataset (BSD) and remote sensing images. Results demonstrate that the SSLC algorithm yields the best overall performance, while the computation time-efficiency is still competitive.

  11. Power System Decomposition for Practical Implementation of Bulk-Grid Voltage Control Methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vallem, Mallikarjuna R.; Vyakaranam, Bharat GNVSR; Holzer, Jesse T.

    Power system algorithms such as AC optimal power flow and coordinated volt/var control of the bulk power system are computationally intensive and become difficult to solve in operational time frames. The computational time required to run these algorithms increases exponentially as the size of the power system increases. The solution time for multiple subsystems is less than that for solving the entire system simultaneously, and the local nature of the voltage problem lends itself to such decomposition. This paper describes an algorithm that can be used to perform power system decomposition from the point of view of the voltage control problem. Our approach takes advantage of the dominant localized effect of voltage control and is based on clustering buses according to the electrical distances between them. One of the contributions of the paper is to use multidimensional scaling to compute n-dimensional Euclidean coordinates for each bus based on electrical distance to perform algorithms like K-means clustering. A simple coordinated reactive power control of photovoltaic inverters for voltage regulation is used to demonstrate the effectiveness of the proposed decomposition algorithm and its components. The proposed decomposition method is demonstrated on the IEEE 118-bus system.

  12. Scalable domain decomposition solvers for stochastic PDEs in high performance computing

    DOE PAGES

    Desai, Ajit; Khalil, Mohammad; Pettit, Chris; ...

    2017-09-21

    Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. And though these algorithms exhibit excellent scalabilities, significant algorithmic and implementational challenges exist to extend them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.

  13. A Kirchhoff approach to seismic modeling and prestack depth migration

    NASA Astrophysics Data System (ADS)

    Liu, Zhen-Yue

    1993-05-01

    The Kirchhoff integral provides a robust method for implementing seismic modeling and prestack depth migration, which can handle lateral velocity variation and turning waves. With a little extra computation cost, the Kirchhoff-type migration can obtain multiple outputs that have the same phase but different amplitudes, compared with that of other migration methods. The ratio of these amplitudes is helpful in computing some quantities such as reflection angle. I develop a seismic modeling and prestack depth migration method based on the Kirchhoff integral, that handles both laterally variant velocity and a dip beyond 90 degrees. The method uses a finite-difference algorithm to calculate travel times and WKBJ amplitudes for the Kirchhoff integral. Compared to ray-tracing algorithms, the finite-difference algorithm gives an efficient implementation and single-valued quantities (first arrivals) on output. In my finite difference algorithm, the upwind scheme is used to calculate travel times, and the Crank-Nicolson scheme is used to calculate amplitudes. Moreover, interpolation is applied to save computation cost. The modeling and migration algorithms require a smooth velocity function. I develop a velocity-smoothing technique based on damped least-squares to aid in obtaining a successful migration.

  14. Scalable domain decomposition solvers for stochastic PDEs in high performance computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Desai, Ajit; Khalil, Mohammad; Pettit, Chris

    Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. And though these algorithms exhibit excellent scalabilities, significant algorithmic and implementational challenges exist to extend them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.

  15. PCA-based artifact removal algorithm for stroke detection using UWB radar imaging.

    PubMed

    Ricci, Elisa; di Domenico, Simone; Cianca, Ernestina; Rossi, Tommaso; Diomedi, Marina

    2017-06-01

    Stroke patients should be dispatched at the highest level of care available in the shortest time. In this context, a transportable system in specialized ambulances, able to evaluate the presence of an acute brain lesion in a short time interval (i.e., few minutes), could shorten delay of treatment. UWB radar imaging is an emerging diagnostic branch that has great potential for the implementation of a transportable and low-cost device. Transportability, low cost and short response time pose challenges to the signal processing algorithms of the backscattered signals as they should guarantee good performance with a reasonably low number of antennas and low computational complexity, tightly related to the response time of the device. The paper shows that a PCA-based preprocessing algorithm can: (1) achieve good performance already with a computationally simple beamforming algorithm; (2) outperform state-of-the-art preprocessing algorithms; (3) enable a further improvement in the performance (and/or decrease in the number of antennas) by using a multistatic approach with just a modest increase in computational complexity. This is an important result toward the implementation of such a diagnostic device that could play an important role in emergency scenario.

  16. Dynamic Load-Balancing for Distributed Heterogeneous Computing of Parallel CFD Problems

    NASA Technical Reports Server (NTRS)

    Ecer, A.; Chien, Y. P.; Boenisch, T.; Akay, H. U.

    2000-01-01

    The developed methodology is aimed at improving the efficiency of executing block-structured algorithms on parallel, distributed, heterogeneous computers. The basic approach of these algorithms is to divide the flow domain into many sub-domains called blocks, and solve the governing equations over these blocks. The dynamic load balancing problem is defined as the efficient distribution of the blocks among the available processors over a period of several hours of computations. In environments with computers of different architecture, operating systems, CPU speed, memory size, load, and network speed, balancing the loads and managing the communication between processors becomes crucial. Load balancing software tools for mutually dependent parallel processes have been created to efficiently utilize an advanced computation environment and algorithms. These tools are dynamic in nature because of the changes in the computer environment during execution time. More recently, these tools were extended to a second operating system: NT. In this paper, the problems associated with this application will be discussed. Also, the developed algorithms were combined with the load sharing capability of LSF to efficiently utilize workstation clusters for parallel computing. Finally, results will be presented on running a NASA based code ADPAC to demonstrate the developed tools for dynamic load balancing.

  17. Adaptive multi-time-domain subcycling for crystal plasticity FE modeling of discrete twin evolution

    NASA Astrophysics Data System (ADS)

    Ghosh, Somnath; Cheng, Jiahao

    2018-02-01

    Crystal plasticity finite element (CPFE) models that account for discrete micro-twin nucleation-propagation have been recently developed for studying complex deformation behavior of hexagonal close-packed (HCP) materials (Cheng and Ghosh in Int J Plast 67:148-170, 2015, J Mech Phys Solids 99:512-538, 2016). A major difficulty with conducting high fidelity, image-based CPFE simulations of polycrystalline microstructures with explicit twin formation is the prohibitively high demands on computing time. High strain localization within fast propagating twin bands requires very fine simulation time steps and leads to enormous computational cost. To mitigate this shortcoming and improve the simulation efficiency, this paper proposes a multi-time-domain subcycling algorithm. It is based on adaptive partitioning of the evolving computational domain into twinned and untwinned domains. Based on the local deformation-rate, the algorithm accelerates simulations by adopting different time steps for each sub-domain. The sub-domains are coupled back after coarse time increments using a predictor-corrector algorithm at the interface. The subcycling-augmented CPFEM is validated with a comprehensive set of numerical tests. Significant speed-up is observed with this novel algorithm without any loss of accuracy, which is advantageous for predicting twinning in polycrystalline microstructures.

  18. A 3D inversion for all-space magnetotelluric data with static shift correction

    NASA Astrophysics Data System (ADS)

    Zhang, Kun

    2017-04-01

    Based on previous studies of static shift correction and 3D inversion algorithms, we improve the NLCG 3D inversion method and propose a new static shift correction method that works within the inversion. The static shift correction method is based on 3D theory and real data. The static shift can be detected by quantitative analysis of the apparent parameters (apparent resistivity and impedance phase) of MT in the high-frequency range, and the correction is completed within the inversion. The method is an automatic computer processing technique with no additional cost, and it avoids additional field work and indoor processing while giving good results. The 3D inversion algorithm is improved (Zhang et al., 2013) based on the NLCG method of Newman & Alumbaugh (2000) and Rodi & Mackie (2001). For the algorithm, we added a parallel structure, improved the computational efficiency, reduced the memory requirement, and added topographic and marine factors. As a result, the 3D inversion can run on a general PC with high efficiency and accuracy, and all MT data from surface, seabed, and underground stations can be used in the inversion algorithm.

  19. Compressive Sensing of Foot Gait Signals and Its Application for the Estimation of Clinically Relevant Time Series.

    PubMed

    Pant, Jeevan K; Krishnan, Sridhar

    2016-07-01

    A new signal reconstruction algorithm for compressive sensing based on the minimization of a pseudonorm which promotes block-sparse structure on the first-order difference of the signal is proposed. The optimization involved is carried out by using a sequential version of Fletcher-Reeves' conjugate-gradient algorithm, and the line search is based on Banach's fixed-point theorem. The algorithm is suitable for the reconstruction of foot gait signals which admit block-sparse structure on the first-order difference. An additional algorithm for the estimation of stride-interval, swing-interval, and stance-interval time series from the reconstructed foot gait signals is also proposed. This algorithm is based on finding zero crossing indices of the foot gait signal and using the resulting indices for the computation of time series. Extensive simulation results demonstrate that the proposed signal reconstruction algorithm yields improved signal-to-noise ratio and requires significantly reduced computational effort relative to several competing algorithms over a wide range of compression ratios. For a compression ratio in the range from 88% to 94%, the proposed algorithm is found to offer improved accuracy for the estimation of clinically relevant time-series parameters, namely, the mean value, variance, and spectral index of stride-interval, stance-interval, and swing-interval time series, relative to its nearest competitor algorithm. The improvement in performance for compression ratios as high as 94% indicates that the proposed algorithms would be useful for designing compressive sensing-based systems for long-term telemonitoring of human gait signals.
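    The zero-crossing step described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each negative-to-positive crossing marks the start of a stride and that the signal is uniformly sampled; the function names and the sign convention are hypothetical and are not taken from the paper.

```python
def rising_zero_crossings(signal):
    """Indices i where the signal moves from <= 0 at i-1 to > 0 at i."""
    return [i for i in range(1, len(signal)) if signal[i - 1] <= 0 < signal[i]]


def intervals_from_indices(indices, sample_rate_hz):
    """Convert consecutive crossing indices into a time series of durations (seconds)."""
    return [(b - a) / sample_rate_hz for a, b in zip(indices, indices[1:])]


# Toy gait-like signal sampled at 4 Hz (assumed values, not real data).
toy_signal = [-1, 1, 1, -1, -1, 1, 1, -1, -1, 1]
crossings = rising_zero_crossings(toy_signal)
stride_intervals = intervals_from_indices(crossings, sample_rate_hz=4)
```

    Swing and stance intervals could be derived the same way by pairing rising with falling crossings, under whatever sign convention the sensor uses.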

  20. Feature Selection in Classification of Eye Movements Using Electrooculography for Activity Recognition

    PubMed Central

    Mala, S.; Latha, K.

    2014-01-01

    Activity recognition is needed in various applications, for example, reconnaissance systems, patient monitoring, and human-computer interfaces. Feature selection plays an important role in activity recognition, data mining, and machine learning. To select a subset of features, Differential Evolution (DE), an efficient evolutionary optimizer, is used to find informative features from eye movements recorded using electrooculography (EOG). Many researchers use EOG signals in human-computer interaction with various computational intelligence methods to analyze eye movements. The proposed system involves analysis of EOG signals using clearness based features, minimum redundancy maximum relevance features, and Differential Evolution based features. This work concentrates on the DE-based feature selection algorithm in order to improve the classification for faultless activity recognition. PMID:25574185

  1. Feature selection in classification of eye movements using electrooculography for activity recognition.

    PubMed

    Mala, S; Latha, K

    2014-01-01

    Activity recognition is needed in various applications, for example, reconnaissance systems, patient monitoring, and human-computer interfaces. Feature selection plays an important role in activity recognition, data mining, and machine learning. To select a subset of features, Differential Evolution (DE), an efficient evolutionary optimizer, is used to find informative features from eye movements recorded using electrooculography (EOG). Many researchers use EOG signals in human-computer interaction with various computational intelligence methods to analyze eye movements. The proposed system involves analysis of EOG signals using clearness based features, minimum redundancy maximum relevance features, and Differential Evolution based features. This work concentrates on the DE-based feature selection algorithm in order to improve the classification for faultless activity recognition.

  2. Synthetic aperture radar image formation for the moving-target and near-field bistatic cases

    NASA Astrophysics Data System (ADS)

    Ding, Yu

    This dissertation addresses topics in two areas of synthetic aperture radar (SAR) image formation: time-frequency based SAR imaging of moving targets and a fast backprojection (BP) algorithm for near-field bistatic SAR imaging. SAR imaging of a moving target is a challenging task due to unknown motion of the target. We approach this problem in a theoretical way, by analyzing the Wigner-Ville distribution (WVD) based SAR imaging technique. We derive approximate closed-form expressions for the point-target response of the SAR imaging system, which quantify the image resolution, and show how the blurring in conventional SAR imaging can be eliminated, while the target shift still remains. Our analyses lead to accurate prediction of the target position in the reconstructed images. The derived expressions also enable us to further study additional aspects of WVD-based SAR imaging. Bistatic SAR imaging is more involved than the monostatic SAR case, because of the separation of the transmitter and the receiver, and possibly the changing bistatic geometry. For near-field bistatic SAR imaging, we develop a novel fast BP algorithm, motivated by a newly proposed fast BP algorithm in computed tomography. First we show that the BP algorithm is the spatial-domain counterpart of the benchmark ω-k algorithm in bistatic SAR imaging, yet it avoids the frequency-domain interpolation in the ω-k algorithm, which may cause artifacts in the reconstructed image. We then derive the band-limited property for BP methods in both monostatic and bistatic SAR imaging, which is the basis for developing the fast BP algorithm. We compare our algorithm with other frequency-domain based algorithms, and show that it achieves better reconstructed image quality, while having the same computational complexity as that of the frequency-domain based algorithms.

  3. A special purpose silicon compiler for designing supercomputing VLSI systems

    NASA Technical Reports Server (NTRS)

    Venkateswaran, N.; Murugavel, P.; Kamakoti, V.; Shankarraman, M. J.; Rangarajan, S.; Mallikarjun, M.; Karthikeyan, B.; Prabhakar, T. S.; Satish, V.; Venkatasubramaniam, P. R.

    1991-01-01

    Design of general/special purpose supercomputing VLSI systems for numeric algorithm execution involves tackling two important aspects, namely their computational and communication complexities. Development of software tools for designing such systems itself becomes complex. Hence a novel design methodology has to be developed. For designing such complex systems a special purpose silicon compiler is needed in which: the computational and communicational structures of different numeric algorithms should be taken into account to simplify the silicon compiler design, the approach is macrocell based, and the software tools at different levels (algorithm down to the VLSI circuit layout) should get integrated. In this paper a special purpose silicon (SPS) compiler based on PACUBE macrocell VLSI arrays for designing supercomputing VLSI systems is presented. It is shown that turn-around time and silicon real estate get reduced over the silicon compilers based on PLA's, SLA's, and gate arrays. The first two silicon compiler characteristics mentioned above enable the SPS compiler to perform systolic mapping (at the macrocell level) of algorithms whose computational structures are of GIPOP (generalized inner product outer product) form. Direct systolic mapping on PLA's, SLA's, and gate arrays is very difficult as they are micro-cell based. A novel GIPOP processor is under development using this special purpose silicon compiler.

  4. A general algorithm for the construction of contour plots

    NASA Technical Reports Server (NTRS)

    Johnson, W.; Silva, F.

    1981-01-01

    An algorithm is described that performs the task of drawing equal level contours on a plane, which requires interpolation in two dimensions based on data prescribed at points distributed irregularly over the plane. The approach is described in detail. The computer program that implements the algorithm is documented and listed.

  5. Three-Dimensional Imaging and Numerical Reconstruction of Graphite/Epoxy Composite Microstructure Based on Ultra-High Resolution X-Ray Computed Tomography

    NASA Technical Reports Server (NTRS)

    Czabaj, M. W.; Riccio, M. L.; Whitacre, W. W.

    2014-01-01

    A combined experimental and computational study aimed at high-resolution 3D imaging, visualization, and numerical reconstruction of fiber-reinforced polymer microstructures at the fiber length scale is presented. To this end, a sample of graphite/epoxy composite was imaged at sub-micron resolution using a 3D X-ray computed tomography microscope. Next, a novel segmentation algorithm was developed, based on concepts adopted from computer vision and multi-target tracking, to detect and estimate, with high accuracy, the position of individual fibers in a volume of the imaged composite. In the current implementation, the segmentation algorithm was based on Global Nearest Neighbor data-association architecture, a Kalman filter estimator, and several novel algorithms for virtual-fiber stitching, smoothing, and overlap removal. The segmentation algorithm was used on a sub-volume of the imaged composite, detecting 508 individual fibers. The segmentation data were qualitatively compared to the tomographic data, demonstrating high accuracy of the numerical reconstruction. Moreover, the data were used to quantify a) the relative distribution of individual-fiber cross sections within the imaged sub-volume, and b) the local fiber misorientation relative to the global fiber axis. Finally, the segmentation data were converted using commercially available finite element (FE) software to generate a detailed FE mesh of the composite volume. The methodology described herein demonstrates the feasibility of realizing an FE-based, virtual-testing framework for graphite/fiber composites at the constituent level.

  6. A Comparison of Three Curve Intersection Algorithms

    NASA Technical Reports Server (NTRS)

    Sederberg, T. W.; Parry, S. R.

    1985-01-01

    An empirical comparison is made between three algorithms for computing the points of intersection of two planar Bezier curves. The algorithms compared are: the well known Bezier subdivision algorithm, which is discussed in Lane 80; a subdivision algorithm based on interval analysis due to Koparkar and Mudur; and an algorithm due to Sederberg, Anderson and Goldman which reduces the problem to one of finding the roots of a univariate polynomial. The details of these three algorithms are presented in their respective references.
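    As a rough illustration of the first of these approaches, the sketch below shows the de Casteljau split that Bezier subdivision methods rely on, together with a bounding-box overlap check of the kind commonly used to discard curve pairs that cannot intersect. This is a generic textbook construction, not code from the cited references, and the function names are hypothetical.

```python
def lerp(p, q, t):
    """Linear interpolation between two points given as tuples."""
    return tuple((1 - t) * a + t * b for a, b in zip(p, q))


def subdivide(control_points, t=0.5):
    """Split a Bezier curve at parameter t into two control polygons (de Casteljau)."""
    left, right = [], []
    pts = list(control_points)
    while pts:
        left.append(pts[0])    # first point of each level belongs to the left half
        right.append(pts[-1])  # last point of each level belongs to the right half
        pts = [lerp(p, q, t) for p, q in zip(pts, pts[1:])]
    return left, right[::-1]


def bounding_boxes_overlap(poly_a, poly_b):
    """True if the axis-aligned boxes of two 2D control polygons overlap."""
    ax = [p[0] for p in poly_a]
    ay = [p[1] for p in poly_a]
    bx = [p[0] for p in poly_b]
    by = [p[1] for p in poly_b]
    return (min(ax) <= max(bx) and min(bx) <= max(ax)
            and min(ay) <= max(by) and min(by) <= max(ay))


# Quadratic curve split at its midpoint.
left_half, right_half = subdivide([(0, 0), (1, 2), (2, 0)])
```

    Because a Bezier curve lies inside the convex hull of its control points, two curves whose boxes do not overlap cannot intersect; a subdivision scheme would recurse only on overlapping pairs.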

  7. Optimization Algorithm for Kalman Filter Exploiting the Numerical Characteristics of SINS/GPS Integrated Navigation Systems.

    PubMed

    Hu, Shaoxing; Xu, Shike; Wang, Duhu; Zhang, Aiwu

    2015-11-11

    To address the high computational cost of the traditional Kalman filter in SINS/GPS, a practical optimization algorithm with offline-derivation and parallel processing methods based on the numerical characteristics of the system is presented in this paper. The algorithm exploits the sparseness and/or symmetry of matrices to simplify the computational procedure. Thus many needless operations can be avoided by offline derivation using a block matrix technique. For enhanced efficiency, a new parallel computational mechanism is established by subdividing and restructuring calculation processes after analyzing the extracted "useful" data. As a result, the algorithm saves about 90% of the CPU processing time and 66% of the memory usage needed in a classical Kalman filter. Meanwhile, the method, as a numerical approach, needs no precision-losing transformation or approximation of system modules, and the accuracy suffers little in comparison with the filter before computational optimization. Furthermore, since no complicated matrix theories are needed, the algorithm can be easily transplanted into other modified filters as a secondary optimization method to achieve further efficiency.

  8. A depth-first search algorithm to compute elementary flux modes by linear programming

    PubMed Central

    2014-01-01

    Background: The decomposition of complex metabolic networks into elementary flux modes (EFMs) provides a useful framework for exploring reaction interactions systematically. Generating a complete set of EFMs for large-scale models, however, is near impossible. Even for moderately-sized models (<400 reactions), existing approaches based on the Double Description method must iterate through a large number of combinatorial candidates, thus imposing an immense processor and memory demand. Results: Based on an alternative elementarity test, we developed a depth-first search algorithm using linear programming (LP) to enumerate EFMs in an exhaustive fashion. Constraints can be introduced to directly generate a subset of EFMs satisfying the set of constraints. The depth-first search algorithm has a constant memory overhead. Using flux constraints, a large LP problem can be massively divided and parallelized into independent sub-jobs for deployment into computing clusters. Since the sub-jobs do not overlap, the approach scales to utilize all available computing nodes with minimal coordination overhead or memory limitations. Conclusions: The speed of the algorithm was comparable to efmtool, a mainstream Double Description method, when enumerating all EFMs; the attrition power gained from performing flux feasibility tests offsets the increased computational demand of running an LP solver. Unlike the Double Description method, the algorithm enables accelerated enumeration of all EFMs satisfying a set of constraints. PMID:25074068

  9. Observer-Based Discrete-Time Nonnegative Edge Synchronization of Networked Systems.

    PubMed

    Su, Housheng; Wu, Han; Chen, Xia

    2017-10-01

    This paper studies the multi-input and multi-output discrete-time nonnegative edge synchronization of networked systems based on neighbors' output information. The communication relationship among the edges of networked systems is modeled by the well-known line graph. Two observer-based edge synchronization algorithms are designed, for which some necessary and sufficient synchronization conditions are derived. Moreover, some computable sufficient synchronization conditions are obtained, in which the feedback matrix and the observer matrix are computed by solving linear programming problems. Finally, we design several simulation examples to demonstrate the validity of the given nonnegative edge synchronization algorithms.

  10. Parallelized seeded region growing using CUDA.

    PubMed

    Park, Seongjin; Lee, Jeongjin; Lee, Hyunna; Shin, Juneseuk; Seo, Jinwook; Lee, Kyoung Ho; Shin, Yeong-Gil; Kim, Bohyoung

    2014-01-01

    This paper presents a novel method for parallelizing the seeded region growing (SRG) algorithm using Compute Unified Device Architecture (CUDA) technology, with the intention of overcoming a theoretical weakness of the SRG algorithm: its computation time is directly proportional to the size of the segmented region. The segmentation performance of the proposed CUDA-based SRG is compared with SRG implementations on single-core CPUs, quad-core CPUs, and shader language programming, using synthetic datasets and 20 body CT scans. Based on the experimental results, the CUDA-based SRG outperforms the other three implementations, suggesting that it can substantially assist segmentation during massive CT screening tests.
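    For orientation, a serial seeded region growing pass can be sketched as a breadth-first flood from the seed. This is a plain single-threaded baseline, not the CUDA method of the paper; the homogeneity criterion (absolute difference from the seed intensity within a tolerance), the 4-connectivity, and the function name are all assumptions made for the example.

```python
from collections import deque


def seeded_region_growing(image, seed, tolerance):
    """Grow a 4-connected region from `seed` over a 2D list of intensities.

    A pixel joins the region if its intensity is within `tolerance`
    of the seed intensity (an assumed, simple homogeneity criterion).
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_value) <= tolerance):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


toy_image = [
    [10, 11, 50],
    [12, 10, 52],
    [49, 51, 11],
]
grown = seeded_region_growing(toy_image, seed=(0, 0), tolerance=3)
```

    Each accepted pixel is enqueued exactly once, so the running time grows with the size of the segmented region, which is the weakness the abstract refers to.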

  11. A gossip based information fusion protocol for distributed frequent itemset mining

    NASA Astrophysics Data System (ADS)

    Sohrabi, Mohammad Karim

    2018-07-01

    The computational complexity, huge memory space requirement, and time-consuming nature of the frequent pattern mining process are the most important motivations for distribution and parallelization of this mining process. On the other hand, the emergence of distributed computational and operational environments, which causes the production and maintenance of data on different distributed data sources, makes the parallelization and distribution of the knowledge discovery process inevitable. In this paper, a gossip based distributed itemset mining (GDIM) algorithm is proposed to extract frequent itemsets, which are special types of frequent patterns, in a wireless sensor network environment. In this algorithm, local frequent itemsets of each sensor are extracted using a bit-wise horizontal approach (LHPM) from the nodes, which are clustered using a LEACH-based protocol. Heads of clusters exploit a gossip based protocol to communicate with each other and find the patterns whose global support is equal to or greater than the specified support threshold. Experimental results show that the proposed algorithm outperforms the best existing gossip based algorithm in terms of execution time.

  12. GBA manager: an online tool for querying low-complexity regions in proteins.

    PubMed

    Bandyopadhyay, Nirmalya; Kahveci, Tamer

    2010-01-01

    We developed GBA Manager, an online tool that facilitates the Graph-Based Algorithm (GBA) we proposed in our earlier work. GBA identifies the low-complexity regions (LCR) of protein sequences. GBA exploits a similarity matrix, such as BLOSUM62, to compute the complexity of the subsequences of the input protein sequence. It uses a graph-based algorithm to accurately compute the regions that have low complexities. GBA Manager is a user-friendly web service that enables online querying of protein sequences using GBA. In addition to the querying capabilities of the existing GBA algorithm, GBA Manager computes the p-values of the LCRs identified. The p-value gives an estimate of the probability that the region appears by chance. GBA Manager presents the output in three different understandable formats. GBA Manager is freely accessible at http://bioinformatics.cise.ufl.edu/GBA/GBA.htm .

  13. Incremental update of electrostatic interactions in adaptively restrained particle simulations.

    PubMed

    Edorh, Semeho Prince A; Redon, Stéphane

    2018-04-06

    The computation of long-range potentials is one of the demanding tasks in Molecular Dynamics. During the last decades, an inventive panoply of methods was developed to reduce the CPU time of this task. In this work, we propose a fast method dedicated to the computation of the electrostatic potential in adaptively restrained systems. We exploit the fact that, in such systems, only some particles are allowed to move at each timestep. We developed an incremental algorithm derived from a multigrid-based alternative to traditional Fourier-based methods. Our algorithm was implemented inside LAMMPS, a popular molecular dynamics simulation package. We evaluated the method on different systems. We showed that the new algorithm's computational complexity scales with the number of active particles in the simulated system, and is able to outperform the well-established Particle Particle Particle Mesh (P3M) for adaptively restrained simulations. © 2018 Wiley Periodicals, Inc.

  14. Cut set-based risk and reliability analysis for arbitrarily interconnected networks

    DOEpatents

    Wyss, Gregory D.

    2000-01-01

    Method for computing all-terminal reliability for arbitrarily interconnected networks such as the United States public switched telephone network. The method includes an efficient search algorithm to generate minimal cut sets for nonhierarchical networks directly from the network connectivity diagram. Efficiency of the search algorithm stems in part from its basis on only link failures. The method also includes a novel quantification scheme that likewise reduces computational effort associated with assessing network reliability based on traditional risk importance measures. Vast reductions in computational effort are realized since combinatorial expansion and subsequent Boolean reduction steps are eliminated through analysis of network segmentations using a technique of assuming node failures to occur on only one side of a break in the network, and repeating the technique for all minimal cut sets generated with the search algorithm. The method functions equally well for planar and non-planar networks.

  15. Ant Lion Optimization algorithm for kidney exchanges.

    PubMed

    Hamouda, Eslam; El-Metwally, Sara; Tarek, Mayada

    2018-01-01

    Kidney exchange programs bring new insights to the field of organ transplantation. They make surgery for incompatible patient-donor pairs, previously not allowed, easier to perform on a large scale. Mathematically, the kidney exchange is an optimization problem for the number of possible exchanges among the incompatible pairs in a given pool. Also, the optimization modeling should consider the expected quality-adjusted life of transplant candidates and the shortage of computational and operational hospital resources. In this article, we introduce a bio-inspired stochastic-based Ant Lion Optimization (ALO) algorithm to the kidney exchange space to maximize the number of feasible cycles and chains among the pool pairs. The ALO-based program achieves kidney exchange results comparable to those of deterministic-based approaches like integer programming. Also, ALO outperforms other stochastic-based methods such as the Genetic Algorithm in terms of the efficient usage of computational resources and the quantity of resulting exchanges. The Ant Lion Optimization algorithm can be adopted easily for on-line exchanges and the integration of weights for hard-to-match patients, which will improve the future decisions of kidney exchange programs. A reference implementation of the ALO algorithm for kidney exchanges is written in MATLAB and is GPL licensed. It is available as free open-source software from: https://github.com/SaraEl-Metwally/ALO_algorithm_for_Kidney_Exchanges.

  16. Multipole Algorithms for Molecular Dynamics Simulation on High Performance Computers.

    NASA Astrophysics Data System (ADS)

    Elliott, William Dewey

    1995-01-01

    A fundamental problem in modeling large molecular systems with molecular dynamics (MD) simulations is the underlying N-body problem of computing the interactions between all pairs of N atoms. The simplest algorithm to compute pair-wise atomic interactions scales in runtime O(N^2), making it impractical for interesting biomolecular systems, which can contain millions of atoms. Recently, several algorithms have become available that solve the N-body problem by computing the effects of all pair-wise interactions while scaling in runtime less than O(N^2). One algorithm, which scales O(N) for a uniform distribution of particles, is called the Greengard-Rokhlin Fast Multipole Algorithm (FMA). This work describes an FMA-like algorithm called the Molecular Dynamics Multipole Algorithm (MDMA). The algorithm contains several features that are new to N-body algorithms. MDMA uses new, efficient series expansion equations to compute general 1/r^n potentials to arbitrary accuracy. In particular, the 1/r Coulomb potential and the 1/r^6 portion of the Lennard-Jones potential are implemented. The new equations are based on multivariate Taylor series expansions. In addition, MDMA uses a cell-to-cell interaction region of cells that is closely tied to worst case error bounds. The worst case error bounds for MDMA are derived in this work also. These bounds apply to other multipole algorithms as well. Several implementation enhancements are described which apply to MDMA as well as other N-body algorithms such as FMA and tree codes. The mathematics of the cell-to-cell interactions are converted to the Fourier domain for reduced operation count and faster computation. A relative indexing scheme was devised to locate cells in the interaction region which allows efficient pre-computation of redundant information and prestorage of much of the cell-to-cell interaction. Also, MDMA was integrated into the MD program SIgMA to demonstrate the performance of the program over several simulation timesteps. One MD application described here highlights the utility of including long range contributions to Lennard-Jones potential in constant pressure simulations. Another application shows the time dependence of long range forces in a multiple time step MD simulation.
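    To make the quadratic baseline concrete, the sketch below evaluates the 1/r Coulomb energy by visiting every pair of particles once, which costs N(N-1)/2 pair evaluations. It is the direct summation that multipole methods aim to replace, not MDMA itself; units are arbitrary (the Coulomb constant is taken as 1) and the function name is hypothetical.

```python
import math
from itertools import combinations


def direct_coulomb_energy(positions, charges):
    """Sum q_i * q_j / r_ij over all unordered pairs: O(N^2) work."""
    energy = 0.0
    pair_count = 0
    for (i, ri), (j, rj) in combinations(enumerate(positions), 2):
        energy += charges[i] * charges[j] / math.dist(ri, rj)
        pair_count += 1
    return energy, pair_count


toy_positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
toy_charges = [1.0, -1.0, 2.0]
toy_energy, toy_pairs = direct_coulomb_energy(toy_positions, toy_charges)
```

    Doubling the number of particles roughly quadruples the pair count, which is why systems with millions of atoms call for sub-quadratic schemes.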

  17. Generating Multimodal References

    ERIC Educational Resources Information Center

    van der Sluis, Ielka; Krahmer, Emiel

    2007-01-01

    This article presents a new computational model for the generation of multimodal referring expressions (REs), based on observations in human communication. The algorithm is an extension of the graph-based algorithm proposed by Krahmer, van Erk, and Verleg (2003) and makes use of a so-called Flashlight Model for pointing. The Flashlight Model…

  18. Multi-label spacecraft electrical signal classification method based on DBN and random forest

    PubMed Central

    Li, Ke; Yu, Nan; Li, Pengfei; Song, Shimin; Wu, Yalei; Li, Yang; Liu, Meng

    2017-01-01

    Spacecraft electrical signal characteristic data contain a large amount of data with high-dimensional features, a high degree of computational complexity, and a low identification rate, which cause great difficulty in fault diagnosis of spacecraft electronic load systems. This paper proposes a feature extraction method based on deep belief networks (DBN) and a classification method based on the random forest (RF) algorithm. The proposed algorithm mainly employs a multi-layer neural network to reduce the dimension of the original data, and then classification is applied. Firstly, wavelet denoising is used to pre-process the data. Secondly, the deep belief network is used to reduce the feature dimension and improve the rate of classification for the electrical characteristics data. Finally, the random forest algorithm is used to classify the data, and it is compared with other algorithms. The experimental results show that, compared with other algorithms, the proposed method shows excellent performance in terms of accuracy, computational efficiency, and stability in addressing spacecraft electrical signal data. PMID:28486479

  19. Multi-label spacecraft electrical signal classification method based on DBN and random forest.

    PubMed

    Li, Ke; Yu, Nan; Li, Pengfei; Song, Shimin; Wu, Yalei; Li, Yang; Liu, Meng

    2017-01-01

    Spacecraft electrical signal characteristic data contain a large amount of data with high-dimensional features, a high degree of computational complexity, and a low identification rate, which cause great difficulty in fault diagnosis of spacecraft electronic load systems. This paper proposes a feature extraction method based on deep belief networks (DBN) and a classification method based on the random forest (RF) algorithm. The proposed algorithm mainly employs a multi-layer neural network to reduce the dimension of the original data, and then classification is applied. Firstly, wavelet denoising is used to pre-process the data. Secondly, the deep belief network is used to reduce the feature dimension and improve the rate of classification for the electrical characteristics data. Finally, the random forest algorithm is used to classify the data, and it is compared with other algorithms. The experimental results show that, compared with other algorithms, the proposed method shows excellent performance in terms of accuracy, computational efficiency, and stability in addressing spacecraft electrical signal data.

  20. Phase unwrapping with graph cuts optimization and dual decomposition acceleration for 3D high-resolution MRI data.

    PubMed

    Dong, Jianwu; Chen, Feng; Zhou, Dong; Liu, Tian; Yu, Zhaofei; Wang, Yi

    2017-03-01

    The existence of low SNR regions and rapid phase variations poses challenges to spatial phase unwrapping algorithms. Global optimization-based phase unwrapping methods are widely used, but are significantly slower than greedy methods. In this paper, dual decomposition acceleration is introduced to speed up a three-dimensional graph cut-based phase unwrapping algorithm. The phase unwrapping problem is formulated as a global discrete energy minimization problem, whereas the technique of dual decomposition is used to increase the computational efficiency by splitting the full problem into overlapping subproblems and enforcing the congruence of overlapping variables. Using three dimensional (3D) multiecho gradient echo images from an agarose phantom and five brain hemorrhage patients, we compared this proposed method with an unaccelerated graph cut-based method. Experimental results show up to 18-fold acceleration in computation time. Dual decomposition significantly improves the computational efficiency of 3D graph cut-based phase unwrapping algorithms. Magn Reson Med 77:1353-1358, 2017. © 2016 International Society for Magnetic Resonance in Medicine.

  1. A hardware-algorithm co-design approach to optimize seizure detection algorithms for implantable applications.

    PubMed

    Raghunathan, Shriram; Gupta, Sumeet K; Markandeya, Himanshu S; Roy, Kaushik; Irazoqui, Pedro P

    2010-10-30

    Implantable neural prostheses that deliver focal electrical stimulation upon demand are rapidly emerging as an alternate therapy for roughly a third of the epileptic patient population that is medically refractory. Seizure detection algorithms enable feedback mechanisms to provide focally and temporally specific intervention. Real-time feasibility and computational complexity often limit most reported detection algorithms to implementations using computers for bedside monitoring or external devices communicating with the implanted electrodes. A comparison of algorithms based on detection efficacy does not present a complete picture of the feasibility of the algorithm with limited computational power, as is the case with most battery-powered applications. We present a two-dimensional design optimization approach that takes into account both detection efficacy and hardware cost in evaluating algorithms for their feasibility in an implantable application. Detection features are first compared for their ability to detect electrographic seizures from micro-electrode data recorded from kainate-treated rats. Circuit models are then used to estimate the dynamic and leakage power consumption of the compared features. A score is assigned based on detection efficacy and the hardware cost for each of the features, then plotted on a two-dimensional design space. An optimal combination of compared features is used to construct an algorithm that provides maximal detection efficacy per unit hardware cost. The methods presented in this paper would facilitate the development of a common platform to benchmark seizure detection algorithms for comparison and feasibility analysis in the next generation of implantable neuroprosthetic devices to treat epilepsy. Copyright © 2010 Elsevier B.V. All rights reserved.

  2. Fast computation algorithms for speckle pattern simulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nascov, Victor; Samoilă, Cornel; Ursuţiu, Doru

    2013-11-13

    We present our development of a series of efficient computation algorithms, generally usable to calculate light diffraction and particularly for speckle pattern simulation. We use mainly the scalar diffraction theory in the form of the Rayleigh-Sommerfeld diffraction formula and its Fresnel approximation. Our algorithms are based on a special form of the convolution theorem and the Fast Fourier Transform. They are able to evaluate the diffraction formula much faster than direct computation, and we have circumvented the restrictions on the relative sizes of the input and output domains that are met in commonly used procedures. Moreover, the input and output planes can be tilted with respect to each other and the output domain can be shifted off-axis.
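    The general idea of evaluating a convolution through the Fast Fourier Transform can be sketched in one dimension as follows. This shows only the standard convolution theorem with zero padding, not the special form developed by the authors, and it uses a small radix-2 FFT written for the example rather than an optimized library routine; all names are hypothetical.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two. Unnormalized."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_convolve(a, b):
    """Linear convolution of two real sequences via the convolution theorem."""
    out_len = len(a) + len(b) - 1
    size = 1
    while size < out_len:          # pad to a power of two to avoid wrap-around
        size *= 2
    fa = fft([complex(v) for v in a] + [0j] * (size - len(a)))
    fb = fft([complex(v) for v in b] + [0j] * (size - len(b)))
    product = [p * q for p, q in zip(fa, fb)]
    result = fft(product, inverse=True)
    return [v.real / size for v in result[:out_len]]


convolved = fft_convolve([1, 2, 3], [0, 1, 0.5])
```

    Direct summation of a length-n convolution costs on the order of n^2 operations, while the FFT route costs on the order of n log n, which is the source of the speed-up over direct computation.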

  3. Hybrid-dual-fourier tomographic algorithm for a fast three-dimensional optical image reconstruction in turbid media

    NASA Technical Reports Server (NTRS)

    Alfano, Robert R. (Inventor); Cai, Wei (Inventor)

    2007-01-01

    A reconstruction technique for reducing computation burden in the 3D image processes, wherein the reconstruction procedure comprises an inverse and a forward model. The inverse model uses a hybrid dual Fourier algorithm that combines a 2D Fourier inversion with a 1D matrix inversion to thereby provide high-speed inverse computations. The inverse algorithm uses a hybrid transfer to provide fast Fourier inversion for data of multiple sources and multiple detectors. The forward model is based on an analytical cumulant solution of a radiative transfer equation. The accurate analytical form of the solution to the radiative transfer equation provides an efficient formalism for fast computation of the forward model.

  4. Naturally selecting solutions: the use of genetic algorithms in bioinformatics.

    PubMed

    Manning, Timmy; Sleator, Roy D; Walsh, Paul

    2013-01-01

    For decades, computer scientists have looked to nature for biologically inspired solutions to computational problems ranging from robotic control to scheduling optimization. Paradoxically, as we move deeper into the post-genomics era, the reverse is occurring, as biologists and bioinformaticians look to computational techniques to solve a variety of biological problems. Among the most common biologically inspired techniques are genetic algorithms (GAs), which take the Darwinian concept of natural selection as the driving force behind systems for solving real-world problems, including those in the bioinformatics domain. Herein, we provide an overview of genetic algorithms and survey some of the most recent applications of this approach to bioinformatics-based problems.
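
    As a rough illustration of the selection, crossover and mutation loop that genetic algorithms share, the sketch below evolves fixed-length bit strings. It is not taken from the reviewed paper: the function name `evolve`, the tournament selection, the single-point crossover, the elitism step and all parameter defaults are assumptions chosen for brevity.

```python
import random

def evolve(fitness, n_bits, pop_size=20, generations=40,
           mutation_rate=0.05, seed=0):
    """Toy genetic algorithm over bit strings (hypothetical helper).

    fitness: function mapping a list of 0/1 values to a number to maximize.
    n_bits must be at least 2 so that a crossover point exists.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def pick():
        # Tournament selection: the fitter of two random individuals wins.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        # Elitism: carry the current best individual over unchanged.
        children = [max(pop, key=fitness)[:]]
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_bits)        # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness)
```

    Calling `evolve(sum, 16)` searches for a 16-bit string with as many ones as possible. Because the best individual is always kept, the returned fitness can never be worse than that of the initial random population.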

  5. A novel computer algorithm for modeling and treating mandibular fractures: A pilot study.

    PubMed

    Rizzi, Christopher J; Ortlip, Timothy; Greywoode, Jewel D; Vakharia, Kavita T; Vakharia, Kalpesh T

    2017-02-01

    To describe a novel computer algorithm that can model mandibular fracture repair. To evaluate the algorithm as a tool to model mandibular fracture reduction and hardware selection. Retrospective pilot study combined with cross-sectional survey. A computer algorithm utilizing Aquarius Net (TeraRecon, Inc, Foster City, CA) and Adobe Photoshop CS6 (Adobe Systems, Inc, San Jose, CA) was developed to model mandibular fracture repair. Ten different fracture patterns were selected from nine patients who had already undergone mandibular fracture repair. The preoperative computed tomography (CT) images were processed with the computer algorithm to create virtual images that matched the actual postoperative three-dimensional CT images. A survey comparing the true postoperative image with the virtual postoperative images was created and administered to otolaryngology resident and attending physicians. They were asked to rate on a scale from 0 to 10 (0 = completely different; 10 = identical) the similarity between the two images in terms of the fracture reduction and fixation hardware. Ten mandible fracture cases were analyzed and processed. There were 15 survey respondents. The mean score for overall similarity between the images was 8.41 ± 0.91; the mean score for similarity of fracture reduction was 8.61 ± 0.98; and the mean score for hardware appearance was 8.27 ± 0.97. There were no significant differences between attending and resident responses. There were no significant differences based on fracture location. This computer algorithm can accurately model mandibular fracture repair. Images created by the algorithm are highly similar to true postoperative images. The algorithm can potentially assist a surgeon planning mandibular fracture repair. 4. Laryngoscope, 2016 127:331-336, 2017. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.

  6. Customizing FP-growth algorithm to parallel mining with Charm++ library

    NASA Astrophysics Data System (ADS)

    Puścian, Marek

    2017-08-01

    This paper presents a frequent item mining algorithm that was customized to handle growing data repositories. The proposed solution applies the master-slave scheme to the frequent pattern growth technique. Efficient utilization of available computation units is achieved by dynamic reallocation of tasks. Conditional frequent trees are assigned to parallel workers based on their workload. The proposed enhancements have been successfully implemented using the Charm++ library. This paper discusses the performance of the parallelized FP-growth algorithm on different datasets. The approach is illustrated with many experiments and measurements performed on a multiprocessor, multithreaded computer.

  7. A fast algorithm for computer aided collimation gamma camera (CACAO)

    NASA Astrophysics Data System (ADS)

    Jeanguillaume, C.; Begot, S.; Quartuccio, M.; Douiri, A.; Franck, D.; Pihet, P.; Ballongue, P.

    2000-08-01

    The computer aided collimation gamma camera is aimed at breaking down the resolution-sensitivity trade-off of the conventional parallel hole collimator. It uses larger and longer holes, having an added linear movement at the acquisition sequence. A dedicated algorithm including shift and sum, deconvolution, parabolic filtering and rotation is described. Examples of reconstruction are given. This work shows that a simple and fast algorithm, based on a diagonally dominant approximation of the problem, can be derived. It gives a practical solution to the CACAO reconstruction problem.

  8. Routing performance analysis and optimization within a massively parallel computer

    DOEpatents

    Archer, Charles Jens; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen

    2013-04-16

    An apparatus, program product and method optimize the operation of a massively parallel computer system by, in part, receiving actual performance data concerning an application executed by the plurality of interconnected nodes, and analyzing the actual performance data to identify an actual performance pattern. A desired performance pattern may be determined for the application, and an algorithm may be selected from among a plurality of algorithms stored within a memory, the algorithm being configured to achieve the desired performance pattern based on the actual performance data.

  9. A review on economic emission dispatch problems using quantum computational intelligence

    NASA Astrophysics Data System (ADS)

    Mahdi, Fahad Parvez; Vasant, Pandian; Kallimani, Vish; Abdullah-Al-Wadud, M.

    2016-11-01

    Economic emission dispatch (EED) problems are among the most crucial problems in power systems. Growing energy demand, limited natural resources and global warming have put this topic at the center of discussion and research. This paper reviews the use of Quantum Computational Intelligence (QCI) in solving EED problems. QCI techniques such as the Quantum Genetic Algorithm (QGA) and the Quantum Particle Swarm Optimization (QPSO) algorithm are discussed. The paper aims to encourage researchers to use more QCI-based algorithms to obtain better optimal results for EED problems.

  10. Positive dwell time algorithm with minimum equal extra material removal in deterministic optical surfacing technology.

    PubMed

    Li, Longxiang; Xue, Donglin; Deng, Weijie; Wang, Xu; Bai, Yang; Zhang, Feng; Zhang, Xuejun

    2017-11-10

    In deterministic computer-controlled optical surfacing, accurate dwell time execution by computer numeric control machines is crucial in guaranteeing a high-convergence ratio for the optical surface error. It is necessary to consider the machine dynamics limitations in the numerical dwell time algorithms. In this paper, these constraints on dwell time distribution are analyzed, and a model of the equal extra material removal is established. A positive dwell time algorithm with minimum equal extra material removal is developed. Results of simulations based on deterministic magnetorheological finishing demonstrate the necessity of considering machine dynamics performance and illustrate the validity of the proposed algorithm. Indeed, the algorithm effectively facilitates the determinacy of sub-aperture optical surfacing processes.

  11. Ab initio molecular simulations with numeric atom-centered orbitals

    NASA Astrophysics Data System (ADS)

    Blum, Volker; Gehrke, Ralf; Hanke, Felix; Havu, Paula; Havu, Ville; Ren, Xinguo; Reuter, Karsten; Scheffler, Matthias

    2009-11-01

    We describe a complete set of algorithms for ab initio molecular simulations based on numerically tabulated atom-centered orbitals (NAOs) to capture a wide range of molecular and materials properties from quantum-mechanical first principles. The full algorithmic framework described here is embodied in the Fritz Haber Institute "ab initio molecular simulations" (FHI-aims) computer program package. Its comprehensive description should be relevant to any other first-principles implementation based on NAOs. The focus here is on density-functional theory (DFT) in the local and semilocal (generalized gradient) approximations, but an extension to hybrid functionals, Hartree-Fock theory, and MP2/GW electron self-energies for total energies and excited states is possible within the same underlying algorithms. An all-electron/full-potential treatment that is both computationally efficient and accurate is achieved for periodic and cluster geometries on equal footing, including relaxation and ab initio molecular dynamics. We demonstrate the construction of transferable, hierarchical basis sets, allowing the calculation to range from qualitative tight-binding like accuracy to meV-level total energy convergence with the basis set. Since all basis functions are strictly localized, the otherwise computationally dominant grid-based operations scale as O(N) with system size N. Together with a scalar-relativistic treatment, the basis sets provide access to all elements from light to heavy. Both low-communication parallelization of all real-space grid based algorithms and a ScaLapack-based, customized handling of the linear algebra for all matrix operations are possible, guaranteeing efficient scaling (CPU time and memory) up to massively parallel computer systems with thousands of CPUs.

  12. An open source software for fast grid-based data-mining in spatial epidemiology (FGBASE).

    PubMed

    Baker, David M; Valleron, Alain-Jacques

    2014-10-30

    Examining whether disease cases are clustered in space is an important part of epidemiological research. Another important part of spatial epidemiology is testing whether patients suffering from a disease are more, or less, exposed to environmental factors of interest than adequately defined controls. Both approaches involve determining the number of cases and controls (or population at risk) in specific zones. For cluster searches, this often must be done for millions of different zones. Doing this by calculating distances can lead to very lengthy computations. In this work we discuss the computational advantages of geographical grid-based methods, and introduce an open source software (FGBASE) which we have created for this purpose. Geographical grids based on the Lambert Azimuthal Equal Area projection are well suited for spatial epidemiology because they preserve area: each cell of the grid has the same area. We describe how data is projected onto such a grid, as well as grid-based algorithms for spatial epidemiological data-mining. The software program (FGBASE), that we have developed, implements these grid-based methods. The grid based algorithms perform extremely fast. This is particularly the case for cluster searches. When applied to a cohort of French Type 1 Diabetes (T1D) patients, as an example, the grid based algorithms detected potential clusters in a few seconds on a modern laptop. This compares very favorably to an equivalent cluster search using distance calculations instead of a grid, which took over 4 hours on the same computer. In the case study we discovered 4 potential clusters of T1D cases near the cities of Le Havre, Dunkerque, Toulouse and Nantes. One example of environmental analysis with our software was to study whether a significant association could be found between distance to vineyards with heavy pesticide. None was found. In both examples, the software facilitates the rapid testing of hypotheses. 
Grid-based algorithms for mining spatial epidemiological data provide advantages in terms of computational complexity thus improving the speed of computations. We believe that these methods and this software tool (FGBASE) will lower the computational barriers to entry for those performing epidemiological research.
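
    The computational advantage described above can be sketched as follows: once each point is assigned to a grid cell, counting the cases in a zone becomes a handful of dictionary lookups instead of a distance calculation per point. This is only an illustration, not FGBASE code. It assumes the coordinates have already been projected onto an equal-area plane, and the names `cell_of`, `count_per_cell` and `count_in_window` are hypothetical.

```python
from collections import Counter
from math import floor

def cell_of(x, y, cell_size):
    """Index of the square grid cell that contains the point (x, y)."""
    return (floor(x / cell_size), floor(y / cell_size))

def count_per_cell(points, cell_size):
    """One pass over the points: number of points falling in each cell."""
    return Counter(cell_of(x, y, cell_size) for x, y in points)

def count_in_window(counts, cx, cy, radius_cells):
    """Total count in the square block of cells centred on cell (cx, cy).

    Only cell lookups are needed; no point-to-point distances are computed.
    """
    return sum(
        counts.get((i, j), 0)
        for i in range(cx - radius_cells, cx + radius_cells + 1)
        for j in range(cy - radius_cells, cy + radius_cells + 1)
    )
```

    For a cluster search, `count_per_cell` would be run once for cases and once for controls, after which every candidate zone is evaluated from the two cell tables alone.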

  13. A new parallel DNA algorithm to solve the task scheduling problem based on inspired computational model.

    PubMed

    Wang, Zhaocai; Ji, Zuwen; Wang, Xiaoming; Wu, Tunhua; Huang, Wei

    2017-12-01

    As a promising approach to solving computationally intractable problems, DNA computing is an emerging research area spanning mathematics, computer science and molecular biology. The task scheduling problem, a well-known NP-complete problem, assigns n jobs to m individuals and seeks the minimum execution time of the last individual to finish. In this paper, we use a biologically inspired computational model and describe a new parallel algorithm that solves the task scheduling problem with basic DNA molecular operations. We design flexible-length DNA strands to represent elements of the allocation matrix, apply appropriate biological experiment operations, and obtain solutions of the task scheduling problem in the proper length range with less than O(n^2) time complexity. Copyright © 2017. Published by Elsevier B.V.
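
    To make the problem statement concrete, the sketch below solves a tiny instance of the task scheduling problem by exhaustive search on a conventional computer. It is not the DNA algorithm of the paper; it only enumerates every allocation of n jobs to m individuals, which is the search space that the DNA strands are described as encoding in parallel. The function name `min_makespan` and the sample durations are assumptions.

```python
from itertools import product

def min_makespan(durations, m):
    """Try every assignment of jobs to m workers (m**n candidates).

    Returns (finish time of the last worker, assignment), where
    assignment[j] is the worker that runs job j. Feasible only for
    small n, since the search space grows exponentially.
    """
    best_time, best_assignment = None, None
    for assignment in product(range(m), repeat=len(durations)):
        loads = [0] * m
        for job, worker in enumerate(assignment):
            loads[worker] += durations[job]
        makespan = max(loads)  # time at which the last worker finishes
        if best_time is None or makespan < best_time:
            best_time, best_assignment = makespan, assignment
    return best_time, best_assignment
```

    For durations `[3, 5, 2, 7, 4]` and two workers, the total work is 21, so no schedule can finish before time 11; the search finds an allocation that reaches this bound.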

  14. Object-based media and stream-based computing

    NASA Astrophysics Data System (ADS)

    Bove, V. Michael, Jr.

    1998-03-01

    Object-based media refers to the representation of audiovisual information as a collection of objects - the result of scene-analysis algorithms - and a script describing how they are to be rendered for display. Such multimedia presentations can adapt to viewing circumstances as well as to viewer preferences and behavior, and can provide a richer link between content creator and consumer. With faster networks and processors, such ideas become applicable to live interpersonal communications as well, creating a more natural and productive alternative to traditional videoconferencing. In this paper we outline examples of object-based media algorithms and applications developed by my group, and present new hardware architectures and software methods that we have developed to meet the computational requirements of object-based and other advanced media representations. In particular we describe stream-based processing, which enables automatic run-time parallelization of multidimensional signal processing tasks even given heterogeneous computational resources.

  15. A Novel Color Image Encryption Algorithm Based on Quantum Chaos Sequence

    NASA Astrophysics Data System (ADS)

    Liu, Hui; Jin, Cong

    2017-03-01

    In this paper, a novel image encryption algorithm based on quantum chaos is proposed. The keystreams are generated by the two-dimensional logistic map as initial conditions and parameters. A general Arnold scrambling algorithm with keys is then used to permute the pixels of the color components. In the diffusion process, a novel encryption algorithm, the folding algorithm, is proposed to modify the values of the diffused pixels. To obtain high randomness and complexity, the two-dimensional logistic map and the quantum chaotic map are coupled with nearest-neighboring coupled-map lattices. Theoretical analyses and computer simulations confirm that the proposed algorithm has a high level of security.
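
    The permutation stage can be illustrated with the textbook generalized Arnold map, which moves the pixel at (x, y) of an N x N image to (x + a*y, b*x + (a*b + 1)*y) mod N. The abstract does not state which exact form or key schedule the authors use, so the map, the integer keys `a` and `b`, and the function names below are assumptions. The sketch covers the scrambling step only, not the logistic-map keystream or the folding diffusion.

```python
def arnold_scramble(img, a, b, rounds=1):
    """Permute the pixels of a square image (list of rows).

    The map matrix [[1, a], [b, a*b + 1]] has determinant 1, so the
    mapping is a bijection on the N x N grid for any integers a, b.
    """
    n = len(img)
    for _ in range(rounds):
        out = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                nx = (x + a * y) % n
                ny = (b * x + (a * b + 1) * y) % n
                out[nx][ny] = img[x][y]
        img = out
    return img

def arnold_unscramble(img, a, b, rounds=1):
    """Invert arnold_scramble using the inverse matrix [[a*b + 1, -a], [-b, 1]]."""
    n = len(img)
    for _ in range(rounds):
        out = [[None] * n for _ in range(n)]
        for nx in range(n):
            for ny in range(n):
                x = ((a * b + 1) * nx - a * ny) % n
                y = (-b * nx + ny) % n
                out[x][y] = img[nx][ny]
        img = out
    return img
```

    Scrambling only relocates pixels and leaves their values, and therefore the image histogram, unchanged, which is why schemes of this kind follow it with a diffusion stage.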

  16. Accelerating global optimization of aerodynamic shapes using a new surrogate-assisted parallel genetic algorithm

    NASA Astrophysics Data System (ADS)

    Ebrahimi, Mehdi; Jahangirian, Alireza

    2017-12-01

    An efficient strategy is presented for global shape optimization of wing sections with a parallel genetic algorithm. Several computational techniques are applied to increase the convergence rate and the efficiency of the method. A variable fidelity computational evaluation method is applied in which the expensive Navier-Stokes flow solver is complemented by an inexpensive multi-layer perceptron neural network for the objective function evaluations. A population dispersion method that consists of two phases, of exploration and refinement, is developed to improve the convergence rate and the robustness of the genetic algorithm. Owing to the nature of the optimization problem, a parallel framework based on the master/slave approach is used. The outcomes indicate that the method is able to find the global optimum with significantly lower computational time in comparison to the conventional genetic algorithm.

  17. A General Cross-Layer Cloud Scheduling Framework for Multiple IoT Computer Tasks.

    PubMed

    Wu, Guanlin; Bao, Weidong; Zhu, Xiaomin; Zhang, Xiongtao

    2018-05-23

    The diversity of IoT services and applications brings enormous challenges to improving the performance of multiple computer tasks' scheduling in cross-layer cloud computing systems. Unfortunately, the commonly-employed frameworks fail to adapt to the new patterns on the cross-layer cloud. To solve this issue, we design a new computer task scheduling framework for multiple IoT services in cross-layer cloud computing systems. Specifically, we first analyze the features of the cross-layer cloud and computer tasks. Then, we design the scheduling framework based on the analysis and present detailed models to illustrate the procedures of using the framework. With the proposed framework, the IoT services deployed in cross-layer cloud computing systems can dynamically select suitable algorithms and use resources more effectively to finish computer tasks with different objectives. Finally, the algorithms are given based on the framework, and extensive experiments are also given to validate its effectiveness, as well as its superiority.

  18. On some methods for improving time of reachability sets computation for the dynamic system control problem

    NASA Astrophysics Data System (ADS)

    Zimovets, Artem; Matviychuk, Alexander; Ushakov, Vladimir

    2016-12-01

    The paper presents two approaches to reducing the computation time of reachability sets. The first approach uses different data structures for storing the reachability sets in computer memory for calculation in single-threaded mode. The second approach is based on parallel algorithms that use the data structures from the first approach. Within the framework of this paper, a parallel algorithm for approximate reachability set calculation on a computer with SMP architecture is proposed. The results of numerical modelling are presented in tables that demonstrate the high efficiency of parallel computing and show how computing time depends on the data structure used.

  19. Time-domain analysis of planar microstrip devices using a generalized Yee-algorithm based on unstructured grids

    NASA Technical Reports Server (NTRS)

    Gedney, Stephen D.; Lansing, Faiza

    1993-01-01

    The generalized Yee-algorithm is presented for the temporal full-wave analysis of planar microstrip devices. This algorithm has the significant advantage over the traditional Yee-algorithm in that it is based on unstructured and irregular grids. The robustness of the generalized Yee-algorithm is that structures that contain curved conductors or complex three-dimensional geometries can be more accurately, and much more conveniently, modeled using standard automatic grid generation techniques. This generalized Yee-algorithm is based on the time-marching solution of the discrete form of Maxwell's equations in their integral form. To this end, the electric and magnetic fields are discretized over a dual, irregular, and unstructured grid. The primary grid is assumed to be composed of general fitted polyhedra distributed throughout the volume. The secondary grid (or dual grid) is built up of the closed polyhedra whose edges connect the centroids of adjacent primary cells, penetrating shared faces. Faraday's law and Ampere's law are used to update the fields normal to the primary and secondary grid faces, respectively. Subsequently, a correction scheme is introduced to project the normal fields onto the grid edges. It is shown that this scheme is stable, maintains second-order accuracy, and preserves the divergenceless nature of the flux densities. Finally, for computational efficiency the algorithm is structured as a series of sparse matrix-vector multiplications. Based on this scheme, the generalized Yee-algorithm has been implemented on vector and parallel high performance computers in a highly efficient manner.

  20. Extension of a streamwise upwind algorithm to a moving grid system

    NASA Technical Reports Server (NTRS)

    Obayashi, Shigeru; Goorjian, Peter M.; Guruswamy, Guru P.

    1990-01-01

    A new streamwise upwind algorithm was derived to compute unsteady flow fields with the use of a moving-grid system. The temporally nonconservative LU-ADI (lower-upper-factored, alternating-direction-implicit) method was applied for time marching computations. A comparison of the temporally nonconservative method with a time-conservative implicit upwind method indicates that the solutions are insensitive to the conservative properties of the implicit solvers when practical time steps are used. Using this new method, computations were made for an oscillating wing at a transonic Mach number. The computed results confirm that the present upwind scheme captures the shock motion better than the central-difference scheme based on the Beam-Warming algorithm. The new upwind option of the code allows larger time-steps and thus is more efficient, even though it requires slightly more computational time per time step than the central-difference option.

  1. Experimental comparison of two quantum computing architectures.

    PubMed

    Linke, Norbert M; Maslov, Dmitri; Roetteler, Martin; Debnath, Shantanu; Figgatt, Caroline; Landsman, Kevin A; Wright, Kenneth; Monroe, Christopher

    2017-03-28

    We run a selection of algorithms on two state-of-the-art 5-qubit quantum computers that are based on different technology platforms. One is a publicly accessible superconducting transmon device (www.ibm.com/ibm-q) with limited connectivity, and the other is a fully connected trapped-ion system. Even though the two systems have different native quantum interactions, both can be programed in a way that is blind to the underlying hardware, thus allowing a comparison of identical quantum algorithms between different physical systems. We show that quantum algorithms and circuits that use more connectivity clearly benefit from a better-connected system of qubits. Although the quantum systems here are not yet large enough to eclipse classical computers, this experiment exposes critical factors of scaling quantum computers, such as qubit connectivity and gate expressivity. In addition, the results suggest that codesigning particular quantum applications with the hardware itself will be paramount in successfully using quantum computers in the future.

  2. Continuous-variable quantum Gaussian process regression and quantum singular value decomposition of nonsparse low-rank matrices

    NASA Astrophysics Data System (ADS)

    Das, Siddhartha; Siopsis, George; Weedbrook, Christian

    2018-02-01

    With the significant advancement in quantum computation during the past couple of decades, the exploration of machine-learning subroutines using quantum strategies has become increasingly popular. Gaussian process regression is a widely used technique in supervised classical machine learning. Here we introduce an algorithm for Gaussian process regression using continuous-variable quantum systems that can be realized with technology based on photonic quantum computers under certain assumptions regarding distribution of data and availability of efficient quantum access. Our algorithm shows that by using a continuous-variable quantum computer a dramatic speedup in computing Gaussian process regression can be achieved, i.e., the possibility of exponentially reducing the time to compute. Furthermore, our results also include a continuous-variable quantum-assisted singular value decomposition method of nonsparse low rank matrices and forms an important subroutine in our Gaussian process regression algorithm.

  3. Sensor placement algorithm development to maximize the efficiency of acid gas removal unit for integrated gasification combined cycle (IGCC) power plant with CO2 capture

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Paul, P.; Bhattacharyya, D.; Turton, R.

    2012-01-01

    Future integrated gasification combined cycle (IGCC) power plants with CO2 capture will face stricter operational and environmental constraints. Accurate values of relevant states/outputs/disturbances are needed to satisfy these constraints and to maximize the operational efficiency. Unfortunately, a number of these process variables cannot be measured while a number of them can be measured, but have low precision, reliability, or signal-to-noise ratio. In this work, a sensor placement (SP) algorithm is developed for optimal selection of sensor location, number, and type that can maximize the plant efficiency and result in a desired precision of the relevant measured/unmeasured states. In this work, an SP algorithm is developed for a selective, dual-stage Selexol-based acid gas removal (AGR) unit for an IGCC plant with pre-combustion CO2 capture. A comprehensive nonlinear dynamic model of the AGR unit is developed in Aspen Plus Dynamics® (APD) and used to generate a linear state-space model that is used in the SP algorithm. The SP algorithm is developed with the assumption that an optimal Kalman filter will be implemented in the plant for state and disturbance estimation. The algorithm is developed assuming steady-state Kalman filtering and steady-state operation of the plant. The control system is considered to operate based on the estimated states and thereby captures the effects of the SP algorithm on the overall plant efficiency. The optimization problem is solved by Genetic Algorithm (GA) considering both linear and nonlinear equality and inequality constraints. Due to the very large number of candidate sets available for sensor placement and because of the long time that it takes to solve the constrained optimization problem that includes more than 1000 states, solution of this problem is computationally expensive. 
For reducing the computation time, parallel computing is performed using the Distributed Computing Server (DCS®) and the Parallel Computing® toolbox from Mathworks®. In this presentation, we will share our experience in setting up parallel computing using GA in the MATLAB® environment and present the overall approach for achieving higher computational efficiency in this framework.

  4. Sensor placement algorithm development to maximize the efficiency of acid gas removal unit for integrated gasification combined cycle (IGCC) power plant with CO2 capture

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Paul, P.; Bhattacharyya, D.; Turton, R.

    2012-01-01

    Future integrated gasification combined cycle (IGCC) power plants with CO2 capture will face stricter operational and environmental constraints. Accurate values of relevant states/outputs/disturbances are needed to satisfy these constraints and to maximize the operational efficiency. Unfortunately, a number of these process variables cannot be measured while a number of them can be measured, but have low precision, reliability, or signal-to-noise ratio. In this work, a sensor placement (SP) algorithm is developed for optimal selection of sensor location, number, and type that can maximize the plant efficiency and result in a desired precision of the relevant measured/unmeasured states. In this work, an SP algorithm is developed for a selective, dual-stage Selexol-based acid gas removal (AGR) unit for an IGCC plant with pre-combustion CO2 capture. A comprehensive nonlinear dynamic model of the AGR unit is developed in Aspen Plus Dynamics® (APD) and used to generate a linear state-space model that is used in the SP algorithm. The SP algorithm is developed with the assumption that an optimal Kalman filter will be implemented in the plant for state and disturbance estimation. The algorithm is developed assuming steady-state Kalman filtering and steady-state operation of the plant. The control system is considered to operate based on the estimated states and thereby captures the effects of the SP algorithm on the overall plant efficiency. The optimization problem is solved by Genetic Algorithm (GA) considering both linear and nonlinear equality and inequality constraints. Due to the very large number of candidate sets available for sensor placement and because of the long time that it takes to solve the constrained optimization problem that includes more than 1000 states, solution of this problem is computationally expensive. 
For reducing the computation time, parallel computing is performed using the Distributed Computing Server (DCS®) and the Parallel Computing® toolbox from Mathworks®. In this presentation, we will share our experience in setting up parallel computing using GA in the MATLAB® environment and present the overall approach for achieving higher computational efficiency in this framework.

  5. Fast perceptual image hash based on cascade algorithm

    NASA Astrophysics Data System (ADS)

    Ruchay, Alexey; Kober, Vitaly; Yavtushenko, Evgeniya

    2017-09-01

    In this paper, we propose a perceptual image hash algorithm based on a cascade algorithm, which can be applied in image authentication, retrieval, and indexing. Perceptual image hashing is used for image retrieval in the sense of human perception, tolerating distortions caused by compression, noise, common signal processing and geometrical modifications. The main disadvantage of perceptual hashing is its high time cost. The proposed cascade algorithm starts image retrieval with short hashes, and then a full hash is applied to the processed results. Computer simulation results show that the proposed hash algorithm yields good performance in terms of robustness, discriminability, and time cost.
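
    The two-stage idea can be sketched as follows: a cheap short hash discards most database images, and the more expensive full hash is computed only for the survivors. The abstract does not specify the hash functions, so the block-mean ("average") hash used for both stages, the hash sizes, the Hamming-distance tolerances and all function names here are assumptions for illustration.

```python
def average_hash(img, size):
    """Block-mean hash of a square grayscale image (list of rows).

    The image is reduced to size x size block means; each bit records
    whether a block is brighter than the mean of all blocks.
    Assumes the image side length is a multiple of size.
    """
    step = len(img) // size
    means = []
    for by in range(size):
        for bx in range(size):
            block = [img[y][x]
                     for y in range(by * step, (by + 1) * step)
                     for x in range(bx * step, (bx + 1) * step)]
            means.append(sum(block) / len(block))
    avg = sum(means) / len(means)
    return tuple(int(m > avg) for m in means)

def hamming(a, b):
    """Number of positions at which two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))

def cascade_search(query, database, short_size=2, full_size=4,
                   short_tol=0, full_tol=2):
    """Stage 1 filters by short hash; stage 2 checks the full hash."""
    q_short = average_hash(query, short_size)
    candidates = [name for name, img in database.items()
                  if hamming(q_short, average_hash(img, short_size)) <= short_tol]
    q_full = average_hash(query, full_size)
    return [name for name in candidates
            if hamming(q_full, average_hash(database[name], full_size)) <= full_tol]
```

    In a real system the database hashes would be precomputed and stored, so that a query costs one short-hash comparison per image plus a full-hash comparison for the few remaining candidates.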

  6. A combined direct/inverse three-dimensional transonic wing design method for vector computers

    NASA Technical Reports Server (NTRS)

    Weed, R. A.; Carlson, L. A.; Anderson, W. K.

    1984-01-01

    A three-dimensional transonic-wing design algorithm for vector computers is developed, and the results of sample computations are presented graphically. The method incorporates the direct/inverse scheme of Carlson (1975), a Cartesian grid system with boundary conditions applied at a mean plane, and a potential-flow solver based on the conservative form of the full potential equation and using the ZEBRA II vectorizable solution algorithm of South et al. (1980). The accuracy and consistency of the method with regard to direct and inverse analysis and trailing-edge closure are verified in the test computations.

  7. Computational Approaches to Simulation and Optimization of Global Aircraft Trajectories

    NASA Technical Reports Server (NTRS)

    Ng, Hok Kwan; Sridhar, Banavar

    2016-01-01

    This study examines three possible approaches to improving the speed of generating wind-optimal routes for air traffic at the national or global level: (a) using the resources of a supercomputer, (b) running the computations on multiple commercially available computers, and (c) implementing the same algorithms in NASA's Future ATM Concepts Evaluation Tool (FACET). Each approach is compared to a standard implementation run on a single CPU. Wind-optimal aircraft trajectories are computed using global air traffic schedules. The run time and wait time on the supercomputer for trajectory optimization using various numbers of CPUs ranging from 80 to 10,240 units are compared with the total computational time for running the same computation on a single desktop computer and on multiple commercially available computers for potential computational enhancement through parallel processing on the computer clusters. This study also re-implements the trajectory optimization algorithm for further reduction of computational time through algorithm modifications and integrates that with FACET to facilitate the use of the new features which calculate time-optimal routes between worldwide airport pairs in a wind field for use with existing FACET applications. The implementations of trajectory optimization algorithms use MATLAB, Python, and Java programming languages. The performance evaluations are done by comparing their computational efficiencies and based on the potential application of optimized trajectories. The paper shows that in the absence of special privileges on a supercomputer, a cluster of commercially available computers provides a feasible approach for national and global air traffic system studies.

  8. Computer Applications in Balancing Chemical Equations.

    ERIC Educational Resources Information Center

    Kumar, David D.

    2001-01-01

    Discusses computer-based approaches to balancing chemical equations. Surveys 13 methods: 6 based on matrix methods, 2 interactive programs, 1 stand-alone system, 1 developed as an algorithm in Basic, 1 based on design engineering, 1 written in HyperCard, and 1 prepared for the World Wide Web. (Contains 17 references.) (Author/YDS)
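
    As an illustration of what a matrix-based approach can look like, the sketch below builds an element-by-species matrix and finds its null space with exact rational arithmetic. It is not one of the surveyed programs; the function name `balance` and the dictionary-based input format are assumptions made for this example, and it only handles equations with a single independent solution.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def balance(reactants, products):
    """Each species is a dict mapping element symbol to atom count.
    Returns the smallest integer coefficients, reactants first."""
    species = reactants + products
    elements = sorted({e for s in species for e in s})
    n = len(species)
    # One row per element; product columns are negated so that A @ coeffs = 0.
    rows = [[Fraction(s.get(e, 0)) * (1 if j < len(reactants) else -1)
             for j, s in enumerate(species)] for e in elements]
    # Gauss-Jordan elimination to reduced row echelon form.
    pivots, r = [], 0
    for c in range(n):
        pr = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pr is None:
            continue
        rows[r], rows[pr] = rows[pr], rows[r]
        pv = rows[r][c]
        rows[r] = [v / pv for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free = [c for c in range(n) if c not in pivots]
    if len(free) != 1:
        raise ValueError("expected exactly one free coefficient")
    # Set the free coefficient to 1 and read the others off the reduced rows.
    sol = [Fraction(0)] * n
    sol[free[0]] = Fraction(1)
    for i, c in enumerate(pivots):
        sol[c] = -rows[i][free[0]]
    # Scale to the smallest integer solution.
    scale = reduce(lambda a, b: a * b // gcd(a, b), (x.denominator for x in sol))
    ints = [int(x * scale) for x in sol]
    g = reduce(gcd, ints)
    return [v // g for v in ints]
```

    For example, `balance([{"C": 1, "H": 4}, {"O": 2}], [{"C": 1, "O": 2}, {"H": 2, "O": 1}])` corresponds to CH4 + O2 -> CO2 + H2O.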

  9. A Subspace Pursuit–based Iterative Greedy Hierarchical Solution to the Neuromagnetic Inverse Problem

    PubMed Central

    Babadi, Behtash; Obregon-Henao, Gabriel; Lamus, Camilo; Hämäläinen, Matti S.; Brown, Emery N.; Purdon, Patrick L.

    2013-01-01

    Magnetoencephalography (MEG) is an important non-invasive method for studying activity within the human brain. Source localization methods can be used to estimate spatiotemporal activity from MEG measurements with high temporal resolution, but the spatial resolution of these estimates is poor due to the ill-posed nature of the MEG inverse problem. Recent developments in source localization methodology have emphasized temporal as well as spatial constraints to improve source localization accuracy, but these methods can be computationally intense. Solutions emphasizing spatial sparsity hold tremendous promise, since the underlying neurophysiological processes generating MEG signals are often sparse in nature, whether in the form of focal sources, or distributed sources representing large-scale functional networks. Recent developments in the theory of compressed sensing (CS) provide a rigorous framework to estimate signals with sparse structure. In particular, a class of CS algorithms referred to as greedy pursuit algorithms can provide both high recovery accuracy and low computational complexity. Greedy pursuit algorithms are difficult to apply directly to the MEG inverse problem because of the high-dimensional structure of the MEG source space and the high spatial correlation in MEG measurements. In this paper, we develop a novel greedy pursuit algorithm for sparse MEG source localization that overcomes these fundamental problems. This algorithm, which we refer to as the Subspace Pursuit-based Iterative Greedy Hierarchical (SPIGH) inverse solution, exhibits very low computational complexity while achieving very high localization accuracy. We evaluate the performance of the proposed algorithm using comprehensive simulations, as well as the analysis of human MEG data during spontaneous brain activity and somatosensory stimuli. These studies reveal substantial performance gains provided by the SPIGH algorithm in terms of computational complexity, localization accuracy, and robustness. PMID:24055554

  10. Large scale systems : a study of computer organizations for air traffic control applications.

    DOT National Transportation Integrated Search

    1971-06-01

    Based on current sizing estimates and tracking algorithms, some computer organizations applicable to future air traffic control computing systems are described and assessed. Hardware and software problem areas are defined and solutions are outlined.

  11. Computational Discovery of Materials Using the Firefly Algorithm

    NASA Astrophysics Data System (ADS)

    Avendaño-Franco, Guillermo; Romero, Aldo

    Our current ability to model physical phenomena accurately, the increase in computational power and better algorithms are the driving forces behind the computational discovery and design of novel materials, allowing for virtual characterization before their realization in the laboratory. We present the implementation of a novel firefly algorithm, a population-based algorithm for global optimization, for searching the structure/composition space. This novel computation-intensive approach naturally takes advantage of concurrency and targeted exploration while still keeping enough diversity. We apply the new method to both periodic and non-periodic structures and we present the implementation challenges and solutions to improve efficiency. The implementation makes use of computational materials databases and network analysis to optimize the search and get insights about the geometric structure of local minima on the energy landscape. The method has been implemented in our software PyChemia, an open-source package for materials discovery. We acknowledge the support of DMREF-NSF 1434897 and the Donors of the American Chemical Society Petroleum Research Fund for partial support of this research under Contract 54075-ND10.
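
    For readers unfamiliar with the firefly metaheuristic, a minimal sketch of its textbook continuous form follows. This is not the PyChemia implementation, which searches crystal structure and composition space; the function name `firefly_minimize` and all parameter defaults are assumptions chosen for illustration.

```python
import math
import random


def firefly_minimize(f, dim, bounds, n_fireflies=15, n_iter=50,
                     beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Textbook firefly algorithm: dimmer fireflies move toward brighter ones.
    Lower cost means brighter. Returns (best_position, best_cost)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_fireflies)]
    cost = [f(x) for x in pop]
    best_i = min(range(n_fireflies), key=cost.__getitem__)
    best_x, best_cost = list(pop[best_i]), cost[best_i]
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # j is brighter, so i is attracted to j
                    r2 = sum((a - b) ** 2 for a, b in zip(pop[i], pop[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # attraction decays with distance
                    pop[i] = [min(hi, max(lo, a + beta * (b - a)
                                          + alpha * (rng.random() - 0.5)))
                              for a, b in zip(pop[i], pop[j])]
                    cost[i] = f(pop[i])
                    if cost[i] < best_cost:  # remember the best point seen so far
                        best_x, best_cost = list(pop[i]), cost[i]
        alpha *= 0.97  # gradually reduce the random step
    return best_x, best_cost
```

    Because the inner updates for different fireflies depend only on evaluations of `f`, the expensive cost evaluations are natural candidates for concurrent execution, which is the property the abstract highlights.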

  12. Automatic Blocking Of QR and LU Factorizations for Locality

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yi, Q; Kennedy, K; You, H

    2004-03-26

    QR and LU factorizations for dense matrices are important linear algebra computations that are widely used in scientific applications. To efficiently perform these computations on modern computers, the factorization algorithms need to be blocked when operating on large matrices to effectively exploit the deep cache hierarchy prevalent in today's computer memory systems. Because both QR (based on Householder transformations) and LU factorization algorithms contain complex loop structures, few compilers can fully automate the blocking of these algorithms. Though linear algebra libraries such as LAPACK provide manually blocked implementations of these algorithms, by automatically generating blocked versions of the computations, more benefit can be gained such as automatic adaptation of different blocking strategies. This paper demonstrates how to apply an aggressive loop transformation technique, dependence hoisting, to produce efficient blockings for both QR and LU with partial pivoting. We present different blocking strategies that can be generated by our optimizer and compare the performance of auto-blocked versions with manually tuned versions in LAPACK, both using reference BLAS, ATLAS BLAS and native BLAS specially tuned for the underlying machine architectures.

  13. Parallel fuzzy connected image segmentation on GPU

    PubMed Central

    Zhuge, Ying; Cao, Yong; Udupa, Jayaram K.; Miller, Robert W.

    2011-01-01

    Purpose: Image segmentation techniques using fuzzy connectedness (FC) principles have shown their effectiveness in segmenting a variety of objects in several large applications. However, one challenge in these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays, commodity graphics hardware provides a highly parallel computing environment. In this paper, the authors present a parallel fuzzy connected image segmentation algorithm implementation on NVIDIA’s Compute Unified Device Architecture (CUDA) platform for segmenting medical image data sets. Methods: In the FC algorithm, there are two major computational tasks: (i) computing the fuzzy affinity relations and (ii) computing the fuzzy connectedness relations. These two tasks are implemented as CUDA kernels and executed on GPU. A dramatic improvement in speed for both tasks is achieved as a result. Results: Our experiments based on three data sets of small, medium, and large data size demonstrate the efficiency of the parallel algorithm, which achieves a speed-up factor of 24.4x, 18.1x, and 10.3x, correspondingly, for the three data sets on the NVIDIA Tesla C1060 over the implementation of the algorithm on CPU, and takes 0.25, 0.72, and 15.04 s, correspondingly, for the three data sets. Conclusions: The authors developed a parallel algorithm of the widely used fuzzy connected image segmentation method on the NVIDIA GPUs, which are far more cost- and speed-effective than both cluster of workstations and multiprocessing systems. A near-interactive speed of segmentation has been achieved, even for the large data set. PMID:21859037

  14. Parallel fuzzy connected image segmentation on GPU.

    PubMed

    Zhuge, Ying; Cao, Yong; Udupa, Jayaram K; Miller, Robert W

    2011-07-01

    Image segmentation techniques using fuzzy connectedness (FC) principles have shown their effectiveness in segmenting a variety of objects in several large applications. However, one challenge in these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays, commodity graphics hardware provides a highly parallel computing environment. In this paper, the authors present a parallel fuzzy connected image segmentation algorithm implementation on NVIDIA's Compute Unified Device Architecture (CUDA) platform for segmenting medical image data sets. In the FC algorithm, there are two major computational tasks: (i) computing the fuzzy affinity relations and (ii) computing the fuzzy connectedness relations. These two tasks are implemented as CUDA kernels and executed on GPU. A dramatic improvement in speed for both tasks is achieved as a result. Our experiments based on three data sets of small, medium, and large data size demonstrate the efficiency of the parallel algorithm, which achieves a speed-up factor of 24.4x, 18.1x, and 10.3x, correspondingly, for the three data sets on the NVIDIA Tesla C1060 over the implementation of the algorithm on CPU, and takes 0.25, 0.72, and 15.04 s, correspondingly, for the three data sets. The authors developed a parallel algorithm of the widely used fuzzy connected image segmentation method on the NVIDIA GPUs, which are far more cost- and speed-effective than both cluster of workstations and multiprocessing systems. A near-interactive speed of segmentation has been achieved, even for the large data set.

  15. Correction of Visual Perception Based on Neuro-Fuzzy Learning for the Humanoid Robot TEO.

    PubMed

    Hernandez-Vicen, Juan; Martinez, Santiago; Garcia-Haro, Juan Miguel; Balaguer, Carlos

    2018-03-25

    New applications related to robotic manipulation or transportation tasks, with or without physical grasping, are continuously being developed. To perform these activities, the robot takes advantage of different kinds of perceptions. One of the key perceptions in robotics is vision. However, some problems related to image processing make the application of visual information within robot control algorithms difficult. Camera-based systems have inherent errors that affect the quality and reliability of the information obtained. The need to correct image distortion slows down image parameter computing, which decreases the performance of control algorithms. In this paper, a new approach to correcting several sources of visual distortions on images in only one computing step is proposed. The goal of this system/algorithm is the computation of the tilt angle of an object transported by a robot, minimizing image inherent errors and increasing computing speed. After capturing the image, the computer system extracts the angle using a Fuzzy filter that corrects all possible distortions at the same time, obtaining the real angle in only one processing step. This filter has been developed by means of Neuro-Fuzzy learning techniques, using datasets with information obtained from real experiments. In this way, the computing time has been decreased and the performance of the application has been improved. The resulting algorithm has been tried out experimentally in robot transportation tasks in the humanoid robot TEO (Task Environment Operator) from the University Carlos III of Madrid.

  16. Correction of Visual Perception Based on Neuro-Fuzzy Learning for the Humanoid Robot TEO

    PubMed Central

    2018-01-01

    New applications related to robotic manipulation or transportation tasks, with or without physical grasping, are continuously being developed. To perform these activities, the robot takes advantage of different kinds of perceptions. One of the key perceptions in robotics is vision. However, some problems related to image processing make the application of visual information within robot control algorithms difficult. Camera-based systems have inherent errors that affect the quality and reliability of the information obtained. The need to correct image distortion slows down image parameter computing, which decreases the performance of control algorithms. In this paper, a new approach to correcting several sources of visual distortions on images in only one computing step is proposed. The goal of this system/algorithm is the computation of the tilt angle of an object transported by a robot, minimizing image inherent errors and increasing computing speed. After capturing the image, the computer system extracts the angle using a Fuzzy filter that corrects all possible distortions at the same time, obtaining the real angle in only one processing step. This filter has been developed by means of Neuro-Fuzzy learning techniques, using datasets with information obtained from real experiments. In this way, the computing time has been decreased and the performance of the application has been improved. The resulting algorithm has been tried out experimentally in robot transportation tasks in the humanoid robot TEO (Task Environment Operator) from the University Carlos III of Madrid. PMID:29587392

  17. Research on retailer data clustering algorithm based on Spark

    NASA Astrophysics Data System (ADS)

    Huang, Qiuman; Zhou, Feng

    2017-03-01

    Big data analysis is currently a hot topic in the IT field. Spark is a high-reliability and high-performance distributed parallel computing framework for big data sets. The k-means algorithm is one of the classical partitioning methods in clustering. In this paper, we study the k-means clustering algorithm on Spark. First, the principle of the algorithm is analyzed, and then a clustering analysis of supermarket customers is carried out experimentally to find different shopping patterns. This paper also proposes a parallelization of the k-means algorithm on the Spark distributed computing framework, and gives a concrete design and implementation scheme. The two-year sales data of a supermarket are used to validate the proposed clustering algorithm and to segment customers; the clustering results are then analyzed to help enterprises adopt different marketing strategies for different customer groups and improve sales performance.
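
    The structure that makes k-means easy to parallelize can be seen in a single-machine sketch. The code below is plain Python, not Spark, and is not the authors' implementation; the names `closest` and `kmeans` are hypothetical. The assignment step treats every point independently (a "map"), and the update step aggregates per-cluster sums and counts (a "reduce by key"), which is the pattern a distributed framework can spread across workers.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def closest(p: Sequence[float], centroids: Sequence[Point]) -> int:
    """Index of the centroid with the smallest squared Euclidean distance to p."""
    return min(range(len(centroids)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))


def kmeans(points: Sequence[Point], centroids: Sequence[Point],
           n_iter: int = 20) -> Tuple[List[Point], List[int]]:
    """Lloyd's algorithm starting from the given initial centroids."""
    centroids = [tuple(float(v) for v in c) for c in centroids]
    for _ in range(n_iter):
        # "Map" step: each point independently picks its nearest centroid.
        sums: Dict[int, Tuple[List[float], int]] = {}
        for p in points:
            k = closest(p, centroids)
            s, n = sums.get(k, ([0.0] * len(p), 0))
            sums[k] = ([a + b for a, b in zip(s, p)], n + 1)
        # "Reduce" step: per-cluster sums and counts give the new centroids.
        new = [tuple(v / sums[k][1] for v in sums[k][0]) if k in sums else centroids[k]
               for k in range(len(centroids))]
        if new == centroids:  # converged
            break
        centroids = new
    labels = [closest(p, centroids) for p in points]
    return centroids, labels
```

    With made-up customer features such as (visits per month, average basket size), `kmeans(points, initial_centroids)` returns the final centroids and one cluster label per customer.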

  18. Real-time image dehazing using local adaptive neighborhoods and dark-channel-prior

    NASA Astrophysics Data System (ADS)

    Valderrama, Jesus A.; Díaz-Ramírez, Víctor H.; Kober, Vitaly; Hernandez, Enrique

    2015-09-01

    A real-time algorithm for single image dehazing is presented. The algorithm is based on calculation of local neighborhoods of a hazed image inside a moving window. The local neighborhoods are constructed by computing rank-order statistics. Next the dark-channel-prior approach is applied to the local neighborhoods to estimate the transmission function of the scene. By using the suggested approach there is no need for applying a refining algorithm to the estimated transmission such as the soft matting algorithm. To achieve high-rate signal processing the proposed algorithm is implemented exploiting massive parallelism on a graphics processing unit (GPU). Computer simulation results are carried out to test the performance of the proposed algorithm in terms of dehazing efficiency and speed of processing. These tests are performed using several synthetic and real images. The obtained results are analyzed and compared with those obtained with existing dehazing algorithms.
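
    The dark-channel-prior step mentioned in this abstract can be sketched as follows. This sketch uses a fixed square window, which is the common textbook form and an assumption here; the paper instead builds locally adaptive neighborhoods from rank-order statistics, which is not reproduced. The names `dark_channel` and `transmission` and the default `omega=0.95` are illustrative choices.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]  # (r, g, b) values in [0, 1]


def dark_channel(img: Sequence[Sequence[Pixel]], radius: int) -> List[List[float]]:
    """Minimum over the color channels and over a square window around each pixel."""
    h, w = len(img), len(img[0])
    min_rgb = [[min(px) for px in row] for row in img]
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            row.append(min(min_rgb[j][i] for j in ys for i in xs))
        out.append(row)
    return out


def transmission(img: Sequence[Sequence[Pixel]], atmosphere: Pixel,
                 radius: int = 1, omega: float = 0.95) -> List[List[float]]:
    """Estimate t(x) = 1 - omega * dark_channel(I / A), with A the atmospheric light."""
    normalized = [[tuple(c / a for c, a in zip(px, atmosphere)) for px in row]
                  for row in img]
    return [[1.0 - omega * d for d in row] for row in dark_channel(normalized, radius)]
```

    Each output pixel depends only on a small neighborhood of the input, so the per-pixel loops map naturally onto one GPU thread per pixel.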

  19. Automatic human body modeling for vision-based motion capture system using B-spline parameterization of the silhouette

    NASA Astrophysics Data System (ADS)

    Jaume-i-Capó, Antoni; Varona, Javier; González-Hidalgo, Manuel; Mas, Ramon; Perales, Francisco J.

    2012-02-01

    Human motion capture has a wide variety of applications, and in vision-based motion capture systems a major issue is the human body model and its initialization. We present a computer vision algorithm for building a human body model skeleton in an automatic way. The algorithm is based on the analysis of the human shape. We decompose the body into its main parts by computing the curvature of a B-spline parameterization of the human contour. This algorithm has been applied in a context where the user is standing in front of a camera stereo pair. The process is completed after the user assumes a predefined initial posture so as to identify the main joints and construct the human model. Using this model, the initialization problem of a vision-based markerless motion capture system of the human body is solved.

  20. Video-based depression detection using local Curvelet binary patterns in pairwise orthogonal planes.

    PubMed

    Pampouchidou, Anastasia; Marias, Kostas; Tsiknakis, Manolis; Simos, Panagiotis; Fan Yang; Lemaitre, Guillaume; Meriaudeau, Fabrice

    2016-08-01

    Depression is an increasingly prevalent mood disorder. This is the reason why the field of computer-based depression assessment has been gaining the attention of the research community during the past couple of years. The present work proposes two algorithms for depression detection, one Frame-based and the second Video-based, both employing Curvelet transform and Local Binary Patterns. The main advantage of these methods is that they have significantly lower computational requirements, as the extracted features are of very low dimensionality. This is achieved by modifying the previously proposed algorithm which considers Three-Orthogonal-Planes, to only Pairwise-Orthogonal-Planes. Performance of the algorithms was tested on the benchmark dataset provided by the Audio/Visual Emotion Challenge 2014, with the person-specific system achieving 97.6% classification accuracy, and the person-independent one yielding promising preliminary results of 74.5% accuracy. The paper concludes with open issues, proposed solutions, and future plans.

  1. Peak-to-average power ratio reduction in orthogonal frequency division multiplexing-based visible light communication systems using a modified partial transmit sequence technique

    NASA Astrophysics Data System (ADS)

    Liu, Yan; Deng, Honggui; Ren, Shuang; Tang, Chengying; Qian, Xuewen

    2018-01-01

    We propose an efficient partial transmit sequence technique based on a genetic algorithm and a peak-value optimization algorithm (GAPOA) to reduce the high peak-to-average power ratio (PAPR) in visible light communication systems based on orthogonal frequency division multiplexing (VLC-OFDM). Based on an analysis of the hill-climbing algorithm's pros and cons, we propose the POA, which has excellent local search ability, to further process the signals whose PAPR is still over the threshold after being processed by the genetic algorithm (GA). To verify the effectiveness of the proposed technique and algorithm, we evaluate the PAPR performance and the bit error rate (BER) performance and compare them with the partial transmit sequence (PTS) technique based on GA (GA-PTS), the PTS technique based on genetic and hill-climbing algorithms (GH-PTS), and PTS based on the shuffled frog leaping algorithm and hill-climbing algorithm (SFLAHC-PTS). The results show that our technique and algorithm have not only better PAPR performance but also lower computational complexity and BER than the GA-PTS, GH-PTS, and SFLAHC-PTS techniques.
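
    To make the PAPR and PTS terminology concrete, here is a minimal sketch that computes the PAPR of an OFDM symbol and runs a brute-force PTS search over phase factors. The brute-force search is a baseline stand-in for the GA and POA search of the paper, not a reproduction of it. The sketch also ignores the real-valued, non-negative signal constraint of VLC-OFDM; the function names and the interleaved sub-block partition are assumptions.

```python
import cmath
from itertools import product


def idft(X):
    """Naive inverse DFT: frequency-domain symbols to time-domain samples."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def papr(x):
    """Peak power divided by mean power (linear ratio, not dB)."""
    powers = [abs(v) ** 2 for v in x]
    return max(powers) / (sum(powers) / len(powers))


def pts_exhaustive(X, n_blocks, phases=(1, -1)):
    """Split X into interleaved sub-blocks, then try every phase combination.
    Returns (best_papr, phase_factors). The first factor is fixed to 1."""
    n = len(X)
    # Each sub-block keeps its own subcarriers and is zero elsewhere.
    parts = [idft([X[k] if k % n_blocks == v else 0 for k in range(n)])
             for v in range(n_blocks)]
    best = None
    for combo in product(phases, repeat=n_blocks - 1):
        b = (1,) + combo
        x = [sum(bv * p[t] for bv, p in zip(b, parts)) for t in range(n)]
        candidate = (papr(x), b)
        if best is None or candidate[0] < best[0]:
            best = candidate
    return best
```

    The number of phase combinations grows exponentially with the number of sub-blocks, which is why heuristic searches such as genetic algorithms are used in place of the exhaustive loop.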

  2. Learning-based computing techniques in geoid modeling for precise height transformation

    NASA Astrophysics Data System (ADS)

    Erol, B.; Erol, S.

    2013-03-01

    Precise determination of the local geoid is of particular importance for establishing height control in geodetic GNSS applications, since the classical leveling technique is too laborious. A geoid model can be accurately obtained employing properly distributed benchmarks having GNSS and leveling observations using an appropriate computing algorithm. Besides the classical multivariable polynomial regression equations (MPRE), this study attempts an evaluation of learning-based computing algorithms: artificial neural networks (ANNs), adaptive network-based fuzzy inference system (ANFIS) and especially the wavelet neural networks (WNNs) approach in geoid surface approximation. These algorithms were developed parallel to advances in computer technologies and recently have been used for solving complex nonlinear problems in many applications. However, they are rather new in dealing with the precise modeling problem of the Earth's gravity field. In the scope of the study, these methods were applied to Istanbul GPS Triangulation Network data. The performances of the methods were assessed considering the validation results of the geoid models at the observation points. In conclusion, the ANFIS and WNN revealed higher prediction accuracies compared to the ANN and MPRE methods. Besides the prediction capabilities, these methods were also compared and discussed from the practical point of view in the conclusions.

  3. Computation of Quasi-Periodic Normally Hyperbolic Invariant Tori: Algorithms, Numerical Explorations and Mechanisms of Breakdown

    NASA Astrophysics Data System (ADS)

    Canadell, Marta; Haro, Àlex

    2017-12-01

    We present several algorithms for computing normally hyperbolic invariant tori carrying quasi-periodic motion of a fixed frequency in families of dynamical systems. The algorithms are based on a KAM scheme presented in Canadell and Haro (J Nonlinear Sci, 2016. doi: 10.1007/s00332-017-9389-y), to find the parameterization of the torus with prescribed dynamics by detuning parameters of the model. The algorithms use different hyperbolicity and reducibility properties and, in particular, also compute the invariant bundles and Floquet transformations. We implement these methods in several 2-parameter families of dynamical systems, to compute quasi-periodic arcs, that is, the parameters for which 1D normally hyperbolic invariant tori with a given fixed frequency do exist. The implementation lets us perform the continuations up to the tip of the quasi-periodic arcs, for which the invariant curves break down. Three different mechanisms of breakdown are analyzed, using several observables, leading to several conjectures.

  4. Automated flight path planning for virtual endoscopy.

    PubMed

    Paik, D S; Beaulieu, C F; Jeffrey, R B; Rubin, G D; Napel, S

    1998-05-01

    In this paper, a novel technique for rapid and automatic computation of flight paths for guiding virtual endoscopic exploration of three-dimensional medical images is described. While manually planning flight paths is a tedious and time consuming task, our algorithm is automated and fast. Our method for positioning the virtual camera is based on the medial axis transform but is much more computationally efficient. By iteratively correcting a path toward the medial axis, the necessity of evaluating simple point criteria during morphological thinning is eliminated. The virtual camera is also oriented in a stable viewing direction, avoiding sudden twists and turns. We tested our algorithm on volumetric data sets of eight colons, one aorta and one bronchial tree. The algorithm computed the flight paths in several minutes per volume on an inexpensive workstation with minimal computation time added for multiple paths through branching structures (10%-13% per extra path). The results of our algorithm are smooth, centralized paths that aid in the task of navigation in virtual endoscopic exploration of three-dimensional medical images.

  5. Nonlinear Dynamic Model-Based Multiobjective Sensor Network Design Algorithm for a Plant with an Estimator-Based Control System

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Paul, Prokash; Bhattacharyya, Debangsu; Turton, Richard

    Here, a novel sensor network design (SND) algorithm is developed for maximizing process efficiency while minimizing sensor network cost for a nonlinear dynamic process with an estimator-based control system. The multiobjective optimization problem is solved following a lexicographic approach where the process efficiency is maximized first followed by minimization of the sensor network cost. The partial net present value, which combines the capital cost due to the sensor network and the operating cost due to deviation from the optimal efficiency, is proposed as an alternative objective. The unscented Kalman filter is considered as the nonlinear estimator. The large-scale combinatorial optimization problem is solved using a genetic algorithm. The developed SND algorithm is applied to an acid gas removal (AGR) unit as part of an integrated gasification combined cycle (IGCC) power plant with CO2 capture. Due to the computational expense, a reduced order nonlinear model of the AGR process is identified and parallel computation is performed during implementation.

  6. Nonlinear Dynamic Model-Based Multiobjective Sensor Network Design Algorithm for a Plant with an Estimator-Based Control System

    DOE PAGES

    Paul, Prokash; Bhattacharyya, Debangsu; Turton, Richard; ...

    2017-06-06

    Here, a novel sensor network design (SND) algorithm is developed for maximizing process efficiency while minimizing sensor network cost for a nonlinear dynamic process with an estimator-based control system. The multiobjective optimization problem is solved following a lexicographic approach where the process efficiency is maximized first followed by minimization of the sensor network cost. The partial net present value, which combines the capital cost due to the sensor network and the operating cost due to deviation from the optimal efficiency, is proposed as an alternative objective. The unscented Kalman filter is considered as the nonlinear estimator. The large-scale combinatorial optimization problem is solved using a genetic algorithm. The developed SND algorithm is applied to an acid gas removal (AGR) unit as part of an integrated gasification combined cycle (IGCC) power plant with CO2 capture. Due to the computational expense, a reduced order nonlinear model of the AGR process is identified and parallel computation is performed during implementation.

  7. Optimal Golomb Ruler Sequences Generation for Optical WDM Systems: A Novel Parallel Hybrid Multi-objective Bat Algorithm

    NASA Astrophysics Data System (ADS)

    Bansal, Shonak; Singh, Arun Kumar; Gupta, Neena

    2017-02-01

    Real-life multi-objective engineering design problems are difficult and time-consuming optimization problems because of their high degree of nonlinearity, complexity and inhomogeneity. Nature-inspired multi-objective optimization algorithms are becoming popular for solving such problems. This paper proposes an original multi-objective Bat algorithm (MOBA) and its extended form, a novel parallel hybrid multi-objective Bat algorithm (PHMOBA), to generate shortest-length Golomb rulers, called optimal Golomb ruler (OGR) sequences, in a reasonable computation time. OGRs are applied in optical wavelength division multiplexing (WDM) systems as a channel-allocation algorithm to reduce four-wave mixing (FWM) crosstalk. The performance of both proposed algorithms in generating OGRs for optical WDM channel allocation is compared with that of other existing classical computing and nature-inspired algorithms, including extended quadratic congruence (EQC), search algorithm (SA), genetic algorithms (GAs), biogeography based optimization (BBO) and big bang-big crunch (BB-BC) optimization algorithms. Simulations show that the proposed PHMOBA works efficiently compared with the original MOBA and other existing algorithms in generating OGRs for optical WDM systems. PHMOBA has a higher convergence and success rate than the original MOBA. The efficiency improvement of the proposed PHMOBA in generating OGRs of up to 20 marks, in terms of ruler length and total optical channel bandwidth (TBW), is 100 %, whereas for the original MOBA it is 85 %. Finally, the implications for further research are discussed.
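
    A Golomb ruler is a set of integer marks in which every pairwise difference is distinct, and an optimal ruler is the shortest one for a given number of marks. The sketch below checks this property and finds optimal rulers by brute force for very small mark counts. It is an exhaustive baseline to illustrate the definition, not the Bat-algorithm search of the paper; the function names are hypothetical.

```python
from itertools import combinations


def is_golomb(marks):
    """True if all pairwise differences between marks are distinct."""
    diffs = [b - a for a, b in combinations(sorted(marks), 2)]
    return len(diffs) == len(set(diffs))


def ruler_length(marks):
    """Distance between the smallest and largest mark."""
    return max(marks) - min(marks)


def optimal_golomb(n_marks):
    """Shortest Golomb ruler with n_marks marks (n_marks >= 2), by exhaustive search.
    Only practical for small n_marks, since the search space grows combinatorially."""
    length = n_marks - 1
    while True:
        for inner in combinations(range(1, length), n_marks - 2):
            marks = [0, *inner, length]
            if is_golomb(marks):
                return marks
        length += 1
```

    In the WDM setting the marks play the role of channel positions; distinct differences are what keep mixing products from landing on occupied channels, which is the motivation the abstract gives for using OGRs.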

  8. Comparison of four machine learning algorithms for their applicability in satellite-based optical rainfall retrievals

    NASA Astrophysics Data System (ADS)

    Meyer, Hanna; Kühnlein, Meike; Appelhans, Tim; Nauss, Thomas

    2016-03-01

    Machine learning (ML) algorithms have successfully been demonstrated to be valuable tools in satellite-based rainfall retrievals which show the practicability of using ML algorithms when faced with high dimensional and complex data. Moreover, recent developments in parallel computing with ML present new possibilities for training and prediction speed and therefore make their usage in real-time systems feasible. This study compares four ML algorithms - random forests (RF), neural networks (NNET), averaged neural networks (AVNNET) and support vector machines (SVM) - for rainfall area detection and rainfall rate assignment using MSG SEVIRI data over Germany. Satellite-based proxies for cloud top height, cloud top temperature, cloud phase and cloud water path serve as predictor variables. The results indicate an overestimation of rainfall area delineation regardless of the ML algorithm (averaged bias = 1.8) but a high probability of detection ranging from 81% (SVM) to 85% (NNET). On a 24-hour basis, the performance of the rainfall rate assignment yielded R2 values between 0.39 (SVM) and 0.44 (AVNNET). Though the differences in the algorithms' performance were rather small, NNET and AVNNET were identified as the most suitable algorithms. On average, they demonstrated the best performance in rainfall area delineation as well as in rainfall rate assignment. NNET's computational speed is an additional advantage in work with large datasets such as in remote sensing based rainfall retrievals. However, since no single algorithm performed considerably better than the others we conclude that further research in providing suitable predictors for rainfall is of greater necessity than an optimization through the choice of the ML algorithm.
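
    The bias and probability-of-detection figures quoted above are categorical verification scores. A minimal sketch of their usual contingency-table definitions follows; it is assumed, not confirmed by the abstract, that the study uses these standard forms. The function name `detection_scores` is hypothetical.

```python
def detection_scores(observed, predicted):
    """observed/predicted: sequences of booleans (rain / no rain) per pixel or scene.
    Returns (probability_of_detection, frequency_bias)."""
    hits = sum(1 for o, p in zip(observed, predicted) if o and p)
    misses = sum(1 for o, p in zip(observed, predicted) if o and not p)
    false_alarms = sum(1 for o, p in zip(observed, predicted) if p and not o)
    pod = hits / (hits + misses)                    # share of observed rain that was detected
    bias = (hits + false_alarms) / (hits + misses)  # > 1 means rain area is overestimated
    return pod, bias
```

    Under these definitions a bias of 1.8 would mean that the predicted rain area is 1.8 times the observed one, which is consistent with the overestimation the abstract reports.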

  9. Adaptive control for eye-gaze input system

    NASA Astrophysics Data System (ADS)

    Zhao, Qijie; Tu, Dawei; Yin, Hairong

    2004-01-01

    The characteristics of the vision-based human-computer interaction system have been analyzed, and its practical application and current limiting factors are discussed. Information processing methods are put forward. To make communication flexible and spontaneous, algorithms for adaptive control of the user's head movement have been designed, and event-based methods and an object-oriented programming language are used to develop the system software. Experimental testing showed that, under the given conditions, these methods and algorithms can meet the needs of the HCI.

  10. Research on conceptual/innovative design for the life cycle

    NASA Technical Reports Server (NTRS)

    Cagan, Jonathan; Agogino, Alice M.

    1990-01-01

    The goal of this research is developing and integrating qualitative and quantitative methods for life cycle design. The definition of the problem includes formal computer-based methods limited to final detailing stages of design; CAD data bases do not capture design intent or design history; and life cycle issues were ignored during early stages of design. Viewgraphs outline research in conceptual design; the SYMON (SYmbolic MONotonicity analyzer) algorithm; multistart vector quantization optimization algorithm; intelligent manufacturing: IDES - Influence Diagram Architecture; and 1st PRINCE (FIRST PRINciple Computational Evaluator).

  11. Vehicle routing problem with time windows using natural inspired algorithms

    NASA Astrophysics Data System (ADS)

    Pratiwi, A. B.; Pratama, A.; Sa’diyah, I.; Suprajitno, H.

    2018-03-01

    Distributing goods requires a strategy that minimizes the total cost of operational activities. However, several constraints must be satisfied, namely the capacity of the vehicles and the service time of the customers. The resulting Vehicle Routing Problem with Time Windows (VRPTW) is a complex constrained problem. This paper proposes nature-inspired algorithms for handling the constraints of the VRPTW, involving the Bat Algorithm and Cat Swarm Optimization. The Bat Algorithm is hybridized with Simulated Annealing: the worst solution of the Bat Algorithm is replaced by the solution from Simulated Annealing. Cat Swarm Optimization, an algorithm based on the behavior of cats, is improved using the Crow Search Algorithm to achieve simpler and faster convergence. In the computational results, these algorithms perform well in finding the minimized total distance. A larger population gives better computational performance. The improved Cat Swarm Optimization with Crow Search performs better than the hybridization of the Bat Algorithm and Simulated Annealing when dealing with big data.

  12. An Orthogonal Evolutionary Algorithm With Learning Automata for Multiobjective Optimization.

    PubMed

    Dai, Cai; Wang, Yuping; Ye, Miao; Xue, Xingsi; Liu, Hailin

    2016-12-01

    Research on multiobjective optimization problems has become one of the hottest topics in intelligent computation. In order to improve the search efficiency of an evolutionary algorithm and maintain the diversity of solutions, in this paper, the learning automata (LA) is first used for quantization orthogonal crossover (QOX), and a new fitness function based on decomposition is proposed to achieve these two purposes. Based on these, an orthogonal evolutionary algorithm with LA for complex multiobjective optimization problems with continuous variables is proposed. The experimental results show that in continuous states, the proposed algorithm is able to achieve accurate Pareto-optimal sets and wide Pareto-optimal fronts efficiently. Moreover, comparison with several existing well-known algorithms (nondominated sorting genetic algorithm II, decomposition-based multiobjective evolutionary algorithm, decomposition-based multiobjective evolutionary algorithm with an ensemble of neighborhood sizes, multiobjective optimization by LA, and multiobjective immune algorithm with nondominated neighbor-based selection) on 15 multiobjective benchmark problems shows that the proposed algorithm is able to find more accurate and evenly distributed Pareto-optimal fronts than the compared ones.

  13. A post-processing algorithm for time domain pitch trackers

    NASA Astrophysics Data System (ADS)

    Specker, P.

    1983-01-01

    This paper describes a powerful post-processing algorithm for time-domain pitch trackers. On two successive passes, the post-processing algorithm eliminates errors produced during a first pass by a time-domain pitch tracker. During the second pass, incorrect pitch values are detected as outliers by computing the distribution of values over a sliding 80 msec window. During the third pass (based on artificial intelligence techniques), remaining pitch pulses are used as anchor points to reconstruct the pitch train from the original waveform. The algorithm produced a decrease in the error rate from 21% obtained with the original time domain pitch tracker to 2% for isolated words and sentences produced in an office environment by 3 male and 3 female talkers. In a noisy computer room errors decreased from 52% to 2.9% for the same stimuli produced by 2 male talkers. The algorithm is efficient, accurate, and resistant to noise. The fundamental frequency micro-structure is tracked sufficiently well to be used in extracting phonetic features in a feature-based recognition system.
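
    The outlier-detection pass described above can be illustrated with a sliding-window check. This is only a sketch under stated assumptions: the abstract does not say which statistic of the windowed distribution is used, so the window median and a 20% tolerance are stand-ins, and an 8-frame window corresponds to 80 msec only if frames are 10 msec apart.

```python
from statistics import median


def flag_pitch_outliers(pitch_hz, window=8, tolerance=0.2):
    """Flag pitch values that deviate strongly from their local median.

    window    -- number of frames spanned by the sliding window (assumed)
    tolerance -- allowed relative deviation from the window median (assumed)
    """
    half = window // 2
    flags = []
    for i, value in enumerate(pitch_hz):
        lo = max(0, i - half)
        hi = min(len(pitch_hz), i + half + 1)
        local_median = median(pitch_hz[lo:hi])
        flags.append(abs(value - local_median) > tolerance * local_median)
    return flags


# Toy pitch track in Hz with one doubling and one halving error.
track = [120, 121, 119, 240, 122, 120, 60, 121, 123, 122]
flags = flag_pitch_outliers(track)
outlier_frames = [i for i, is_outlier in enumerate(flags) if is_outlier]
```

    The flagged frames would then be candidates for reconstruction in the later pass; that step is not sketched here.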

  14. Concurrent computation of attribute filters on shared memory parallel machines.

    PubMed

    Wilkinson, Michael H F; Gao, Hui; Hesselink, Wim H; Jonker, Jan-Eppo; Meijster, Arnold

    2008-10-01

    Morphological attribute filters have not previously been parallelized, mainly because they are both global and non-separable. We propose a parallel algorithm that achieves efficient parallelism for a large class of attribute filters, including attribute openings, closings, thinnings and thickenings, based on Salembier's Max-Trees and Min-trees. The image or volume is first partitioned in multiple slices. We then compute the Max-trees of each slice using any sequential Max-Tree algorithm. Subsequently, the Max-trees of the slices can be merged to obtain the Max-tree of the image. A C-implementation yielded good speed-ups on both a 16-processor MIPS 14000 parallel machine, and a dual-core Opteron-based machine. It is shown that the speed-up of the parallel algorithm is a direct measure of the gain with respect to the sequential algorithm used. Furthermore, the concurrent algorithm shows a speed gain of up to 72 percent on a single-core processor, due to reduced cache thrashing.

  15. Acceleration of image-based resolution modelling reconstruction using an expectation maximization nested algorithm.

    PubMed

    Angelis, G I; Reader, A J; Markiewicz, P J; Kotasidis, F A; Lionheart, W R; Matthews, J C

    2013-08-07

    Recent studies have demonstrated the benefits of a resolution model within iterative reconstruction algorithms in an attempt to account for effects that degrade the spatial resolution of the reconstructed images. However, these algorithms suffer from slower convergence rates, compared to algorithms where no resolution model is used, due to the additional need to solve an image deconvolution problem. In this paper, a recently proposed algorithm, which decouples the tomographic and image deconvolution problems within an image-based expectation maximization (EM) framework, was evaluated. This separation is convenient, because more computational effort can be placed on the image deconvolution problem and therefore accelerate convergence. Since the computational cost of solving the image deconvolution problem is relatively small, multiple image-based EM iterations do not significantly increase the overall reconstruction time. The proposed algorithm was evaluated using 2D simulations, as well as measured 3D data acquired on the high-resolution research tomograph. Results showed that bias reduction can be accelerated by interleaving multiple iterations of the image-based EM algorithm solving the resolution model problem, with a single EM iteration solving the tomographic problem. Significant improvements were observed particularly for voxels that were located on the boundaries between regions of high contrast within the object being imaged and for small regions of interest, where resolution recovery is usually more challenging. Minor differences were observed using the proposed nested algorithm, compared to the single iteration normally performed, when an optimal number of iterations are performed for each algorithm. However, using the proposed nested approach convergence is significantly accelerated enabling reconstruction using far fewer tomographic iterations (up to 70% fewer iterations for small regions). 
    Nevertheless, the optimal number of nested image-based EM iterations is hard to define and should be selected according to the given application.

  16. Local spatio-temporal analysis in vision systems

    NASA Astrophysics Data System (ADS)

    Geisler, Wilson S.; Bovik, Alan; Cormack, Lawrence; Ghosh, Joydeep; Gildeen, David

    1994-07-01

    The aims of this project are the following: (1) develop a physiologically and psychophysically based model of low-level human visual processing (a key component of which are local frequency coding mechanisms); (2) develop image models and image-processing methods based upon local frequency coding; (3) develop algorithms for performing certain complex visual tasks based upon local frequency representations, (4) develop models of human performance in certain complex tasks based upon our understanding of low-level processing; and (5) develop a computational testbed for implementing, evaluating and visualizing the proposed models and algorithms, using a massively parallel computer. Progress has been substantial on all aims. The highlights include the following: (1) completion of a number of psychophysical and physiological experiments revealing new, systematic and exciting properties of the primate (human and monkey) visual system; (2) further development of image models that can accurately represent the local frequency structure in complex images; (3) near completion in the construction of the Texas Active Vision Testbed; (4) development and testing of several new computer vision algorithms dealing with shape-from-texture, shape-from-stereo, and depth-from-focus; (5) implementation and evaluation of several new models of human visual performance; and (6) evaluation, purchase and installation of a MasPar parallel computer.

  17. High-Speed GPU-Based Fully Three-Dimensional Diffuse Optical Tomographic System

    PubMed Central

    Saikia, Manob Jyoti; Kanhirodan, Rajan; Mohan Vasu, Ram

    2014-01-01

    We have developed a graphics processor unit (GPU-) based high-speed fully 3D system for diffuse optical tomography (DOT). The reduction in execution time of 3D DOT algorithm, a severely ill-posed problem, is made possible through the use of (1) an algorithmic improvement that uses Broyden approach for updating the Jacobian matrix and thereby updating the parameter matrix and (2) the multinode multithreaded GPU and CUDA (Compute Unified Device Architecture) software architecture. Two different GPU implementations of DOT programs are developed in this study: (1) conventional C language program augmented by GPU CUDA and CULA routines (C GPU), (2) MATLAB program supported by MATLAB parallel computing toolkit for GPU (MATLAB GPU). The computation time of the algorithm on host CPU and the GPU system is presented for C and Matlab implementations. The forward computation uses finite element method (FEM) and the problem domain is discretized into 14610, 30823, and 66514 tetrahedral elements. The reconstruction time, so achieved for one iteration of the DOT reconstruction for 14610 elements, is 0.52 seconds for a C based GPU program for 2-plane measurements. The corresponding MATLAB based GPU program took 0.86 seconds. The maximum number of reconstructed frames so achieved is 2 frames per second. PMID:24891848

  18. High-Speed GPU-Based Fully Three-Dimensional Diffuse Optical Tomographic System.

    PubMed

    Saikia, Manob Jyoti; Kanhirodan, Rajan; Mohan Vasu, Ram

    2014-01-01

    We have developed a graphics processor unit (GPU-) based high-speed fully 3D system for diffuse optical tomography (DOT). The reduction in execution time of 3D DOT algorithm, a severely ill-posed problem, is made possible through the use of (1) an algorithmic improvement that uses Broyden approach for updating the Jacobian matrix and thereby updating the parameter matrix and (2) the multinode multithreaded GPU and CUDA (Compute Unified Device Architecture) software architecture. Two different GPU implementations of DOT programs are developed in this study: (1) conventional C language program augmented by GPU CUDA and CULA routines (C GPU), (2) MATLAB program supported by MATLAB parallel computing toolkit for GPU (MATLAB GPU). The computation time of the algorithm on host CPU and the GPU system is presented for C and Matlab implementations. The forward computation uses finite element method (FEM) and the problem domain is discretized into 14610, 30823, and 66514 tetrahedral elements. The reconstruction time, so achieved for one iteration of the DOT reconstruction for 14610 elements, is 0.52 seconds for a C based GPU program for 2-plane measurements. The corresponding MATLAB based GPU program took 0.86 seconds. The maximum number of reconstructed frames so achieved is 2 frames per second.

  19. Phase-unwrapping algorithm by a rounding-least-squares approach

    NASA Astrophysics Data System (ADS)

    Juarez-Salazar, Rigoberto; Robledo-Sanchez, Carlos; Guerrero-Sanchez, Fermin

    2014-02-01

    A simple and efficient phase-unwrapping algorithm based on a rounding procedure and a global least-squares minimization is proposed. Instead of processing the gradient of the wrapped phase, this algorithm operates over the gradient of the phase jumps by a robust and noniterative scheme. Thus, the residue-spreading and over-smoothing effects are reduced. The algorithm's performance is compared with four well-known phase-unwrapping methods: minimum cost network flow (MCNF), fast Fourier transform (FFT), quality-guided, and branch-cut. A computer simulation and experimental results show that the proposed algorithm reaches a higher accuracy level than the MCNF method with a low computing time similar to that of the FFT phase-unwrapping method. Moreover, since the proposed algorithm is simple, fast, and user-free, it could be used in metrological interferometric and fringe-projection automatic real-time applications.
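
    To illustrate the role of rounding in phase unwrapping, here is a one-dimensional sketch. It is not the two-dimensional rounding-least-squares method of the paper: it simply rounds each wrapped difference to the nearest multiple of 2π to count phase jumps, and it assumes the true phase changes by less than π between neighbouring samples. All names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi


def wrap(phase):
    """Wrap a phase value into the interval [-pi, pi)."""
    return (phase + math.pi) % TWO_PI - math.pi


def unwrap_1d(wrapped):
    """Unwrap a 1D sequence by accumulating rounded phase jumps."""
    unwrapped = [wrapped[0]]
    jumps = 0  # running count of 2*pi jumps seen so far
    for prev, cur in zip(wrapped, wrapped[1:]):
        # A difference near -2*pi (or +2*pi) signals a wrap; rounding finds it.
        jumps -= round((cur - prev) / TWO_PI)
        unwrapped.append(cur + TWO_PI * jumps)
    return unwrapped


# A linear phase ramp with a 0.5 rad step, wrapped and then recovered.
true_phase = [0.5 * n for n in range(30)]
wrapped = [wrap(p) for p in true_phase]
recovered = unwrap_1d(wrapped)
```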

  20. Management of Computer-Based Instruction: Design of an Adaptive Control Strategy.

    ERIC Educational Resources Information Center

    Tennyson, Robert D.; Rothen, Wolfgang

    1979-01-01

    Theoretical and research literature on learner, program, and adaptive control as forms of instructional management are critiqued in reference to the design of computer-based instruction. An adaptive control strategy using an online, iterative algorithmic model is proposed. (RAO)

  1. Fundamentals and Recent Developments in Approximate Bayesian Computation

    PubMed Central

    Lintusaari, Jarno; Gutmann, Michael U.; Dutta, Ritabrata; Kaski, Samuel; Corander, Jukka

    2017-01-01

    Bayesian inference plays an important role in phylogenetics, evolutionary biology, and in many other branches of science. It provides a principled framework for dealing with uncertainty and quantifying how it changes in the light of new evidence. For many complex models and inference problems, however, only approximate quantitative answers are obtainable. Approximate Bayesian computation (ABC) refers to a family of algorithms for approximate inference that makes a minimal set of assumptions by only requiring that sampling from a model is possible. We explain here the fundamentals of ABC, review the classical algorithms, and highlight recent developments. [ABC; approximate Bayesian computation; Bayesian inference; likelihood-free inference; phylogenetics; simulator-based models; stochastic simulation models; tree-based models.] PMID:28175922
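
    The simplest member of the ABC family is rejection sampling, which only needs the ability to simulate from the model. The sketch below is a toy illustration, not code from the paper: the Gaussian simulator, the uniform prior on (-5, 5), the sample mean as summary statistic, and the tolerance `epsilon` are all assumptions chosen for the example.

```python
import random
from statistics import mean


def abc_rejection(observed, n_draws=5000, epsilon=0.2, seed=1):
    """Return prior draws whose simulated summary lies within epsilon of the data's."""
    rng = random.Random(seed)
    target = mean(observed)  # summary statistic of the observed data
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw a candidate from the prior
        simulated = [rng.gauss(theta, 1.0) for _ in observed]  # run the simulator
        if abs(mean(simulated) - target) < epsilon:  # keep only close matches
            accepted.append(theta)
    return accepted


# Synthetic "observed" data generated with a true mean of 2.0.
data_rng = random.Random(0)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(20)]
posterior_sample = abc_rejection(observed)
posterior_mean = mean(posterior_sample)
```

    Shrinking `epsilon` brings the accepted sample closer to the exact posterior at the cost of a lower acceptance rate.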

  2. Machining fixture layout optimization using particle swarm optimization algorithm

    NASA Astrophysics Data System (ADS)

    Dou, Jianping; Wang, Xingsong; Wang, Lei

    2011-05-01

    Optimization of fixture layout (locator and clamp locations) is critical to reduce geometric error of the workpiece during the machining process. In this paper, the application of the particle swarm optimization (PSO) algorithm is presented to minimize the workpiece deformation in the machining region. A PSO-based approach is developed to optimize fixture layout by integrating the ANSYS parametric design language (APDL) of finite element analysis to compute the objective function for a given fixture layout. A particle library approach is used to decrease the total computation time. The computational experiment on a 2D case shows that the number of function evaluations is decreased by about 96%. A case study illustrates the effectiveness and efficiency of the PSO-based optimization approach.
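
    A minimal global-best PSO loop is sketched below to show the kind of search the abstract refers to. It is an illustration under assumptions: the quadratic `toy_deformation` function stands in for the finite element objective computed through APDL, the inertia and acceleration coefficients are commonly used textbook values, and the particle library mentioned in the abstract is not modelled.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iter=200, seed=0):
    """Minimize objective over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    w, c1, c2 = 0.729, 1.494, 1.494  # inertia and acceleration weights (assumed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new coordinate to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_deformation(layout):
    """Hypothetical stand-in objective with its minimum at (0.3, 0.3)."""
    return sum((x - 0.3) ** 2 for x in layout)


best_layout, best_value = pso_minimize(toy_deformation, dim=2, bounds=(0.0, 1.0))
```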

  3. GCALIGNER 1.0: an alignment program to compute a multiple sample comparison data matrix from large eco-chemical datasets obtained by GC.

    PubMed

    Dellicour, Simon; Lecocq, Thomas

    2013-10-01

    GCALIGNER 1.0 is a computer program designed to perform a preliminary data comparison matrix of chemical data obtained by GC without MS information. The alignment algorithm is based on the comparison between the retention times of each detected compound in a sample. In this paper, we test the GCALIGNER efficiency on three datasets of the chemical secretions of bumble bees. The algorithm performs the alignment with a low error rate (<3%). GCALIGNER 1.0 is a useful, simple and free program based on an algorithm that enables the alignment of table-type data from GC. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. Hybrid glowworm swarm optimization for task scheduling in the cloud environment

    NASA Astrophysics Data System (ADS)

    Zhou, Jing; Dong, Shoubin

    2018-06-01

    In recent years many heuristic algorithms have been proposed to solve task scheduling problems in the cloud environment owing to their optimization capability. This article proposes a hybrid glowworm swarm optimization (HGSO) based on glowworm swarm optimization (GSO), which uses a technique of evolutionary computation, a strategy of quantum behaviour based on the principle of neighbourhood, offspring production and random walk, to achieve more efficient scheduling with reasonable scheduling costs. The proposed HGSO reduces the redundant computation and the dependence on the initialization of GSO, accelerates the convergence and more easily escapes from local optima. The conducted experiments and statistical analysis showed that in most cases the proposed HGSO algorithm outperformed previous heuristic algorithms to deal with independent tasks.

  5. Space Object Maneuver Detection Algorithms Using TLE Data

    NASA Astrophysics Data System (ADS)

    Pittelkau, M.

    2016-09-01

    An important aspect of Space Situational Awareness (SSA) is detection of deliberate and accidental orbit changes of space objects. Although space surveillance systems detect orbit maneuvers within their tracking algorithms, maneuver data are not readily disseminated for general use. However, two-line element (TLE) data is available and can be used to detect maneuvers of space objects. This work is an attempt to improve upon existing TLE-based maneuver detection algorithms. Three adaptive maneuver detection algorithms are developed and evaluated: The first is a fading-memory Kalman filter, which is equivalent to the sliding-window least-squares polynomial fit, but computationally more efficient and adaptive to the noise in the TLE data. The second algorithm is based on a sample cumulative distribution function (CDF) computed from a histogram of the magnitude-squared |V|^2 of change-in-velocity vectors (V), which is computed from the TLE data. A maneuver detection threshold is computed from the median estimated from the CDF, or from the CDF and a specified probability of false alarm. The third algorithm is a median filter. The median filter is the simplest of a class of nonlinear filters called order statistics filters, which is within the theory of robust statistics. The output of the median filter is practically insensitive to outliers, or large maneuvers. The median of the |V|^2 data is proportional to the variance of the V, so the variance is estimated from the output of the median filter. A maneuver is detected when the input data exceeds a constant times the estimated variance.
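
    The third algorithm (the median filter) can be sketched in a few lines. This is a simplified illustration, not the author's implementation: the window length and threshold are arbitrary assumptions, and the proportionality constant between the median and the variance is folded into the single `threshold` factor.

```python
from statistics import median


def detect_maneuvers(dv_squared, window=7, threshold=10.0):
    """Flag samples that exceed threshold times the sliding-window median.

    dv_squared -- sequence of squared change-in-velocity magnitudes
    window     -- median filter length in samples (assumed)
    threshold  -- detection constant, absorbing the median-to-variance factor (assumed)
    """
    half = window // 2
    detections = []
    for i, value in enumerate(dv_squared):
        lo = max(0, i - half)
        hi = min(len(dv_squared), i + half + 1)
        scale = median(dv_squared[lo:hi])  # robust estimate of the noise level
        detections.append(value > threshold * scale)
    return detections


# Toy series: noise near 1.0 with one large jump at index 5.
series = [1.0, 0.8, 1.2, 0.9, 1.1, 25.0, 1.0, 0.7, 1.3, 0.9, 1.1]
maneuver_indices = [i for i, hit in enumerate(detect_maneuvers(series)) if hit]
```

    Because the median barely moves when a single large value enters the window, the jump does not inflate its own detection threshold.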

  6. Using Mathematics to Make Computing on Encrypted Data Secure and Practical

    DTIC Science & Technology

    2015-12-01

    LLL) lattice basis reduction algorithm, G-Lattice, Cryptography, Security, Gentry-Szydlo Algorithm, Ring-LWE ...with symmetry be further developed, in order to quantify the security of lattice-based cryptography, including especially the security of homomorphic...the Gentry-Szydlo algorithm, and the ideas should be applicable to a range of questions in cryptography. The new algorithm of Lenstra and Silverberg

  7. Π4U: A high performance computing framework for Bayesian uncertainty quantification of complex models

    NASA Astrophysics Data System (ADS)

    Hadjidoukas, P. E.; Angelikopoulos, P.; Papadimitriou, C.; Koumoutsakos, P.

    2015-03-01

    We present Π4U, an extensible framework, for non-intrusive Bayesian Uncertainty Quantification and Propagation (UQ+P) of complex and computationally demanding physical models, that can exploit massively parallel computer architectures. The framework incorporates Laplace asymptotic approximations as well as stochastic algorithms, along with distributed numerical differentiation and task-based parallelism for heterogeneous clusters. Sampling is based on the Transitional Markov Chain Monte Carlo (TMCMC) algorithm and its variants. The optimization tasks associated with the asymptotic approximations are treated via the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). A modified subset simulation method is used for posterior reliability measurements of rare events. The framework accommodates scheduling of multiple physical model evaluations based on an adaptive load balancing library and shows excellent scalability. In addition to the software framework, we also provide guidelines as to the applicability and efficiency of Bayesian tools when applied to computationally demanding physical models. Theoretical and computational developments are demonstrated with applications drawn from molecular dynamics, structural dynamics and granular flow.

  8. Evaluation of Event-Based Algorithms for Optical Flow with Ground-Truth from Inertial Measurement Sensor

    PubMed Central

    Rueckauer, Bodo; Delbruck, Tobi

    2016-01-01

    In this study we compare nine optical flow algorithms that locally measure the flow normal to edges according to accuracy and computation cost. In contrast to conventional, frame-based motion flow algorithms, our open-source implementations compute optical flow based on address-events from a neuromorphic Dynamic Vision Sensor (DVS). For this benchmarking we created a dataset of two synthesized and three real samples recorded from a 240 × 180 pixel Dynamic and Active-pixel Vision Sensor (DAVIS). This dataset contains events from the DVS as well as conventional frames to support testing state-of-the-art frame-based methods. We introduce a new source for the ground truth: In the special case that the perceived motion stems solely from a rotation of the vision sensor around its three camera axes, the true optical flow can be estimated using gyro data from the inertial measurement unit integrated with the DAVIS camera. This provides a ground-truth to which we can compare algorithms that measure optical flow by means of motion cues. An analysis of error sources led to the use of a refractory period, more accurate numerical derivatives and a Savitzky-Golay filter to achieve significant improvements in accuracy. Our pure Java implementations of two recently published algorithms reduce computational cost by up to 29% compared to the original implementations. Two of the algorithms introduced in this paper further speed up processing by a factor of 10 compared with the original implementations, at equal or better accuracy. On a desktop PC, they run in real-time on dense natural input recorded by a DAVIS camera. PMID:27199639

  9. Quantum Chemistry on Quantum Computers: A Polynomial-Time Quantum Algorithm for Constructing the Wave Functions of Open-Shell Molecules.

    PubMed

    Sugisaki, Kenji; Yamamoto, Satoru; Nakazawa, Shigeaki; Toyota, Kazuo; Sato, Kazunobu; Shiomi, Daisuke; Takui, Takeji

    2016-08-18

    Quantum computers are capable of efficiently performing full configuration interaction (FCI) calculations of atoms and molecules by using the quantum phase estimation (QPE) algorithm. Because the success probability of the QPE depends on the overlap between approximate and exact wave functions, efficient methods to prepare initial guess wave functions accurate enough to have sufficiently large overlap with the exact ones are highly desired. Here, we propose a quantum algorithm to construct the wave function consisting of one configuration state function, which is suitable for the initial guess wave function in QPE-based FCI calculations of open-shell molecules, based on the addition theorem of angular momentum. The proposed quantum algorithm enables us to prepare the wave function consisting of an exponential number of Slater determinants only by a polynomial number of quantum operations.

  10. Image Segmentation Method Using Fuzzy C Mean Clustering Based on Multi-Objective Optimization

    NASA Astrophysics Data System (ADS)

    Chen, Jinlin; Yang, Chunzhi; Xu, Guangkui; Ning, Li

    2018-04-01

    Image segmentation is not only one of the hottest topics in digital image processing, but also an important part of computer vision applications. Among image segmentation algorithms, fuzzy C-means (FCM) clustering is effective and concise. However, the drawback of FCM is that it is sensitive to image noise. To solve this problem, this paper designs a novel fuzzy C-means clustering algorithm based on multi-objective optimization. We add a parameter λ to the fuzzy distance measurement formula to improve the multi-objective optimization. The parameter λ adjusts the weights of the pixel local information. In the algorithm, the local correlation of neighboring pixels is added to the improved multi-objective mathematical model to optimize the clustering centers. Two different experimental results show that the novel fuzzy C-means approach performs well, with efficient computational time, when segmenting images corrupted by different types of noise.

  11. Diagonalization of complex symmetric matrices: Generalized Householder reflections, iterative deflation and implicit shifts

    NASA Astrophysics Data System (ADS)

    Noble, J. H.; Lubasch, M.; Stevens, J.; Jentschura, U. D.

    2017-12-01

    We describe a matrix diagonalization algorithm for complex symmetric (not Hermitian) matrices, $\underline{A} = \underline{A}^{T}$, which is based on a two-step algorithm involving generalized Householder reflections based on the indefinite inner product $\langle \underline{u}, \underline{v} \rangle_{*} = \sum_i u_i v_i$. This inner product is linear in both arguments and avoids complex conjugation. The complex symmetric input matrix is transformed to tridiagonal form using generalized Householder transformations (first step). An iterative, generalized QL decomposition of the tridiagonal matrix employing an implicit shift converges toward diagonal form (second step). The QL algorithm employs iterative deflation techniques when a machine-precision zero is encountered "prematurely" on the super-/sub-diagonal. The algorithm allows for a reliable and computationally efficient computation of resonance and antiresonance energies which emerge from complex-scaled Hamiltonians, and for the numerical determination of the real energy eigenvalues of pseudo-Hermitian and PT-symmetric Hamilton matrices. Numerical reference values are provided.
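
    The conjugation-free inner product and one generalized Householder reflection can be sketched as follows. This is an illustration built on the textbook reflection formula with the conjugate replaced by the plain product; it is not the authors' code, and the function names and the sign choice for `alpha` are assumptions.

```python
import cmath


def bilinear(u, v):
    """Indefinite inner product sum_i u_i * v_i (no complex conjugation)."""
    return sum(a * b for a, b in zip(u, v))


def householder_reflect(x):
    """Reflect x onto alpha * e_1 using H = I - 2 v v^T / <v, v>.

    Returns (H x, alpha) with alpha**2 == <x, x>. Fails for vectors whose
    reflection direction v has zero bilinear norm.
    """
    s = cmath.sqrt(bilinear(x, x))
    # Pick the sign that keeps v[0] = x[0] - alpha away from cancellation.
    alpha = -s if abs(x[0] + s) >= abs(x[0] - s) else s
    v = list(x)
    v[0] -= alpha
    vv = bilinear(v, v)
    if vv == 0:
        raise ZeroDivisionError("reflection vector has zero bilinear norm")
    factor = 2 * bilinear(v, x) / vv
    return [xi - factor * vi for xi, vi in zip(x, v)], alpha


x = [1 + 2j, 3 - 1j, 0.5j]
y, alpha = householder_reflect(x)

# The product is indefinite: a nonzero vector can have zero "norm".
null_norm = bilinear([1, 1j], [1, 1j])
```

    The reflected vector `y` has zeros below its first entry and the same bilinear norm as `x`, which is the property the tridiagonalization step relies on.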

  12. Algorithms for Computing the Magnetic Field, Vector Potential, and Field Derivatives for Circular Current Loops in Cylindrical Coordinates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Walstrom, Peter Lowell

    A numerical algorithm for computing the field components B_r and B_z and their r and z derivatives with open boundaries in cylindrical coordinates for circular current loops is described. An algorithm for computing the vector potential is also described. For the convenience of the reader, derivations of the final expressions from their defining integrals are given in detail, since their derivations (especially for the field derivatives) are not all easily found in textbooks. Numerical calculations are based on evaluation of complete elliptic integrals using the Bulirsch algorithm cel. Since cel can evaluate complete elliptic integrals of a fairly general type, in some cases the elliptic integrals can be evaluated without first reducing them to forms containing standard Legendre forms. The algorithms avoid the numerical difficulties that many of the textbook solutions have for points near the axis because of explicit factors of 1/r or 1/r^2 in some of the expressions.
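
    As a small illustration of field evaluation through complete elliptic integrals, the sketch below computes the axial component B_z of a circular loop. It is not the report's method: K and E are obtained with the arithmetic-geometric mean instead of the Bulirsch cel routine, the standard textbook expression for B_z is used, and all function names are hypothetical. The B_z expression contains no explicit 1/r factor, so it can be evaluated directly on the axis.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in T*m/A (classical value)


def ellip_ke(k):
    """Complete elliptic integrals K(k), E(k) for modulus 0 <= k < 1 via the AGM."""
    a, b, c = 1.0, math.sqrt(1.0 - k * k), k
    total = 0.5 * c * c  # running sum of 2**(n-1) * c_n**2, starting at n = 0
    power = 1.0          # 2**(n-1) for n = 1
    for _ in range(60):  # the AGM converges quadratically; 60 is a safe cap
        if abs(c) < 1e-15:
            break
        a, b, c = 0.5 * (a + b), math.sqrt(a * b), 0.5 * (a - b)
        total += power * c * c
        power *= 2.0
    big_k = math.pi / (2.0 * a)
    return big_k, big_k * (1.0 - total)


def loop_bz(radius, current, r, z):
    """Axial field B_z at cylindrical point (r, z) of a loop centred at the origin."""
    denom = (radius + r) ** 2 + z * z
    k = math.sqrt(4.0 * radius * r / denom)
    big_k, big_e = ellip_ke(k)
    ratio = (radius ** 2 - r ** 2 - z ** 2) / ((radius - r) ** 2 + z * z)
    return MU0 * current / (2.0 * math.pi * math.sqrt(denom)) * (big_k + ratio * big_e)


# Compare with the closed-form on-axis result mu0*I*a^2 / (2*(a^2 + z^2)**1.5).
on_axis = loop_bz(1.0, 1.0, 0.0, 0.5)
closed_form = MU0 * 1.0 * 1.0 ** 2 / (2.0 * (1.0 ** 2 + 0.5 ** 2) ** 1.5)
```

    The expression is singular on the wire itself (r equal to the loop radius at z = 0), so that point must be avoided.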

  13. User's guide to the Fault Inferring Nonlinear Detection System (FINDS) computer program

    NASA Technical Reports Server (NTRS)

    Caglayan, A. K.; Godiwala, P. M.; Satz, H. S.

    1988-01-01

    Described are the operation and internal structure of the computer program FINDS (Fault Inferring Nonlinear Detection System). The FINDS algorithm is designed to provide reliable estimates for aircraft position, velocity, attitude, and horizontal winds to be used for guidance and control laws in the presence of possible failures in the avionics sensors. The FINDS algorithm was developed with the use of a digital simulation of a commercial transport aircraft and tested with flight recorded data. The algorithm was then modified to meet the size constraints and real-time execution requirements on a flight computer. For the real-time operation, a multi-rate implementation of the FINDS algorithm has been partitioned to execute on a dual parallel processor configuration: one based on the translational dynamics and the other on the rotational kinematics. The report presents an overview of the FINDS algorithm, the implemented equations, the flow charts for the key subprograms, the input and output files, program variable indexing convention, subprogram descriptions, and the common block descriptions used in the program.

  14. Differential-Evolution Control Parameter Optimization for Unmanned Aerial Vehicle Path Planning

    PubMed Central

    Kok, Kai Yit; Rajendran, Parvathy

    2016-01-01

    The differential evolution algorithm has been widely applied on unmanned aerial vehicle (UAV) path planning. At present, four random tuning parameters exist for differential evolution algorithm, namely, population size, differential weight, crossover, and generation number. These tuning parameters are required, together with user setting on path and computational cost weightage. However, the optimum settings of these tuning parameters vary according to application. Instead of trial and error, this paper presents an optimization method of differential evolution algorithm for tuning the parameters of UAV path planning. The parameters that this research focuses on are population size, differential weight, crossover, and generation number. The developed algorithm enables the user to simply define the weightage desired between the path and computational cost to converge with the minimum generation required based on user requirement. In conclusion, the proposed optimization of tuning parameters in differential evolution algorithm for UAV path planning expedites and improves the final output path and computational cost. PMID:26943630

  15. Efficient Fourier-based algorithms for time-periodic unsteady problems

    NASA Astrophysics Data System (ADS)

    Gopinath, Arathi Kamath

    2007-12-01

    This dissertation work proposes two algorithms for the simulation of time-periodic unsteady problems via the solution of Unsteady Reynolds-Averaged Navier-Stokes (URANS) equations. These algorithms use a Fourier representation in time and hence solve for the periodic state directly without resolving transients (which consume most of the resources in a time-accurate scheme). In contrast to conventional Fourier-based techniques which solve the governing equations in frequency space, the new algorithms perform all the calculations in the time domain, and hence require minimal modifications to an existing solver. The complete space-time solution is obtained by iterating in a fifth pseudo-time dimension. Various time-periodic problems such as helicopter rotors, wind turbines, turbomachinery and flapping-wings can be simulated using the Time Spectral method. The algorithm is first validated using pitching airfoil/wing test cases. The method is further extended to turbomachinery problems, and computational results verified by comparison with a time-accurate calculation. The technique can be very memory intensive for large problems, since the solution is computed (and hence stored) simultaneously at all time levels. Often, the blade counts of a turbomachine are rescaled such that a periodic fraction of the annulus can be solved. This approximation enables the solution to be obtained at a fraction of the cost of a full-scale time-accurate solution. For a viscous computation over a three-dimensional single-stage rescaled compressor, an order of magnitude savings is achieved. The second algorithm, the reduced-order Harmonic Balance method is applicable only to turbomachinery flows, and offers even larger computational savings than the Time Spectral method. It simulates the true geometry of the turbomachine using only one blade passage per blade row as the computational domain. 
In each blade row of the turbomachine, only the dominant frequencies are resolved, namely, combinations of neighbor's blade passing. An appropriate set of frequencies can be chosen by the analyst/designer based on a trade-off between accuracy and computational resources available. A cost comparison with a time-accurate computation for an Euler calculation on a two-dimensional multi-stage compressor obtained an order of magnitude savings, and a RANS calculation on a three-dimensional single-stage compressor achieved two orders of magnitude savings, with comparable accuracy.
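
    The idea of using a Fourier representation in time while computing entirely in the time domain can be illustrated with the standard Fourier collocation derivative matrix, which couples all time samples of one period. The sketch below assumes that this textbook operator is the kind of time-derivative coupling meant here; the names `time_spectral_matrix` and `matvec` are hypothetical, and no flow solver is involved.

```python
import math

def time_spectral_matrix(n, period):
    """Fourier collocation derivative matrix for n equally spaced samples."""
    scale = 2.0 * math.pi / period
    d = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            if j == k:
                continue  # diagonal entries are zero
            angle = math.pi * (j - k) / n
            sign = -1.0 if (j - k) % 2 else 1.0
            if n % 2:   # odd number of samples: cosecant form
                d[j][k] = scale * 0.5 * sign / math.sin(angle)
            else:       # even number of samples: cotangent form
                d[j][k] = scale * 0.5 * sign / math.tan(angle)
    return d

def matvec(matrix, values):
    """Dense matrix-vector product."""
    return [sum(m * v for m, v in zip(row, values)) for row in matrix]

period = 2.0 * math.pi
n = 5
times = [period * j / n for j in range(n)]
samples = [math.sin(t) for t in times]
# Time derivative at every sample, obtained without any time marching.
derivative = matvec(time_spectral_matrix(n, period), samples)
```

    Because every sample is coupled to every other sample, all time levels must be held in memory at once, which is consistent with the memory cost noted in the abstract.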

  16. Parallel O(log n) algorithms for open- and closed-chain rigid multibody systems based on a new mass matrix factorization technique

    NASA Technical Reports Server (NTRS)

    Fijany, Amir

    1993-01-01

    In this paper, parallel O(log n) algorithms for computation of rigid multibody dynamics are developed. These parallel algorithms are derived by parallelization of new O(n) algorithms for the problem. The underlying feature of these O(n) algorithms is a drastically different strategy for decomposition of interbody force which leads to a new factorization of the mass matrix (M). Specifically, it is shown that a factorization of the inverse of the mass matrix in the form of the Schur Complement is derived as $M^{-1} = C - B^{*} A^{-1} B$, wherein matrices C, A, and B are block tridiagonal matrices. The new O(n) algorithm is then derived as a recursive implementation of this factorization of $M^{-1}$. For the closed-chain systems, similar factorizations and O(n) algorithms for computation of Operational Space Mass Matrix $\Lambda$ and its inverse $\Lambda^{-1}$ are also derived. It is shown that these O(n) algorithms are strictly parallel, that is, they are less efficient than other algorithms for serial computation of the problem. But, to our knowledge, they are the only known algorithms that can be parallelized and that lead to both time- and processor-optimal parallel algorithms for the problem, i.e., parallel O(log n) algorithms with O(n) processors. The developed parallel algorithms, in addition to their theoretical significance, are also practical from an implementation point of view due to their simple architectural requirements.

  17. Robust Vision-Based Pose Estimation Algorithm for an UAV with Known Gravity Vector

    NASA Astrophysics Data System (ADS)

    Kniaz, V. V.

    2016-06-01

    Accurate estimation of camera external orientation with respect to a known object is one of the central problems in photogrammetry and computer vision. In recent years this problem has been gaining increasing attention in the field of UAV autonomous flight. Such an application requires real-time performance and robustness of the external orientation estimation algorithm. The accuracy of the solution is strongly dependent on the number of reference points visible in the given image. The problem only has an analytical solution if 3 or more reference points are visible. However, in limited visibility conditions it is often necessary to perform external orientation with only 2 visible reference points. In such a case the solution can be found if the gravity vector direction in the camera coordinate system is known. A number of algorithms for external orientation estimation for the case of 2 known reference points and a gravity vector have been developed to date. Most of these algorithms provide an analytical solution in the form of a polynomial equation that is subject to large errors in the case of complex reference point configurations. This paper is focused on the development of a new computationally effective and robust algorithm for external orientation based on the positions of 2 known reference points and a gravity vector. The algorithm implementation for guidance of a Parrot AR.Drone 2.0 micro-UAV is discussed. The experimental evaluation of the algorithm proved its computational efficiency and robustness against errors in reference point positions and complex configurations.

  18. SMV⊥: Simplex of maximal volume based upon the Gram-Schmidt process

    NASA Astrophysics Data System (ADS)

    Salazar-Vazquez, Jairo; Mendez-Vazquez, Andres

    2015-10-01

    In recent years, different algorithms for Hyperspectral Image (HI) analysis have been introduced. The high spectral resolution of these images allows the development of different algorithms for target detection, material mapping, and material identification for applications in Agriculture, Security and Defense, Industry, etc. Therefore, from the point of view of computer science, there is a fertile field of research for improving and developing algorithms in HI analysis. In some applications, the spectral pixels of an HI can be classified using laboratory spectral signatures. Nevertheless, for many others, there is not enough available prior information or spectral signatures, making any analysis a difficult task. One of the most popular algorithms for HI analysis is the N-FINDR because it is easy to understand and provides a way to unmix the original HI into the respective material compositions. The N-FINDR is computationally expensive and its performance depends on a random initialization process. This paper proposes a novel idea to reduce the complexity of the N-FINDR by implementing a bottom-up approach based on an observation from linear algebra and the use of the Gram-Schmidt process. Therefore, the Simplex of Maximal Volume Perpendicular (SMV⊥) algorithm is proposed for fast endmember extraction in hyperspectral imagery. This novel algorithm has complexity O(n) with respect to the number of pixels. In addition, the evidence shows that SMV⊥ calculates a bigger volume, and has lower computational time complexity than other popular algorithms on synthetic and real scenarios.
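
    The general idea of growing a simplex bottom-up with Gram-Schmidt steps can be sketched as follows. This is not the published SMV⊥ algorithm: the start rule (the pixel farthest from the data mean), the function name `grow_simplex`, and the toy two-band pixels are assumptions made only to show how one orthogonalization pass per new vertex keeps the cost linear in the number of pixels.

```python
import math

def grow_simplex(pixels, count, tol=1e-12):
    """Greedily pick `count` pixel indices that span a large simplex."""
    dim = len(pixels[0])
    mean = [sum(p[d] for p in pixels) / len(pixels) for d in range(dim)]

    def sq_dist_to_mean(i):
        return sum((pixels[i][d] - mean[d]) ** 2 for d in range(dim))

    # Assumed start rule: the pixel farthest from the data mean.
    first = max(range(len(pixels)), key=sq_dist_to_mean)
    chosen = [first]
    # Residuals hold each pixel's component orthogonal to the simplex so far.
    residuals = [[p[d] - pixels[first][d] for d in range(dim)] for p in pixels]
    while len(chosen) < count:
        norms = [math.sqrt(sum(v * v for v in r)) for r in residuals]
        best = max(range(len(pixels)), key=norms.__getitem__)
        if norms[best] <= tol:
            break  # every remaining pixel already lies in the spanned subspace
        chosen.append(best)
        unit = [v / norms[best] for v in residuals[best]]
        # One Gram-Schmidt step: remove the new direction from every residual.
        for r in residuals:
            dot = sum(a * b for a, b in zip(r, unit))
            for d in range(dim):
                r[d] -= dot * unit[d]
    return chosen

# Toy data: three triangle corners followed by three interior mixtures.
pixels = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (1.0, 1.0), (2.0, 0.5), (0.5, 1.0)]
vertices = grow_simplex(pixels, 3)
```

    Each new vertex is the pixel with the largest orthogonal height above the current simplex, so the volume grows by the largest available factor at every step, and each step touches every pixel exactly once.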

  19. Lattice surgery on the Raussendorf lattice

    NASA Astrophysics Data System (ADS)

    Herr, Daniel; Paler, Alexandru; Devitt, Simon J.; Nori, Franco

    2018-07-01

    Lattice surgery is a method to perform quantum computation fault-tolerantly by using operations on boundary qubits between different patches of the planar code. This technique allows for universal planar code computation without eliminating the intrinsic two-dimensional nearest-neighbor properties of the surface code that ease physical hardware implementations. Lattice surgery approaches to algorithmic compilation and optimization have been demonstrated to be more resource efficient for resource-intensive components of a fault-tolerant algorithm, and consequently may be preferable over braid-based logic. Lattice surgery can be extended to the Raussendorf lattice, providing a measurement-based approach to the surface code. In this paper we describe how lattice surgery can be performed on the Raussendorf lattice and therefore give a viable alternative to computation using braiding in measurement-based implementations of topological codes.

  20. Information-based management mode based on value network analysis for livestock enterprises

    NASA Astrophysics Data System (ADS)

    Liu, Haoqi; Lee, Changhoon; Han, Mingming; Su, Zhongbin; Padigala, Varshinee Anu; Shen, Weizheng

    2018-01-01

    With the development of computer and IT technologies, enterprise management has gradually become information-based management. Moreover, due to poor technical competence and non-uniform management, most breeding enterprises show a lack of organisation in data collection and management. In addition, low levels of efficiency result in increasing production costs. This paper adopts 'struts2' in order to construct an information-based management system for standardised and normalised management within the process of production in beef cattle breeding enterprises. We present a radio-frequency identification system by studying multiple-tag anti-collision via a dynamic grouping ALOHA algorithm. This algorithm is based on the existing ALOHA algorithm and uses an improved packet dynamic of this algorithm, which is characterised by a high-throughput rate. This new algorithm can reach a throughput 42% higher than that of the general ALOHA algorithm. With a change in the number of tags, the system throughput is relatively stable.
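
    As background for the throughput comparison, the sketch below simulates a single frame of plain framed slotted ALOHA, in which every tag answers in one randomly chosen slot and a slot is useful only if exactly one tag picks it. It is a baseline illustration under that assumption, not the dynamic grouping variant proposed in the paper; the function names and the tag and frame counts are hypothetical.

```python
import random

def frame_throughput(num_tags, frame_size, rng):
    """Fraction of slots in one frame that carry exactly one tag reply."""
    slots = [0] * frame_size
    for _ in range(num_tags):
        slots[rng.randrange(frame_size)] += 1
    return sum(1 for count in slots if count == 1) / frame_size

def expected_throughput(num_tags, frame_size):
    """Analytic expectation: (n / L) * (1 - 1/L) ** (n - 1)."""
    return (num_tags / frame_size) * (1.0 - 1.0 / frame_size) ** (num_tags - 1)

rng = random.Random(1)
simulated = frame_throughput(1000, 1000, rng)
predicted = expected_throughput(1000, 1000)
```

    When the frame size matches the number of tags, the expected throughput approaches 1/e (about 0.368), and it falls off when the frame is much smaller or much larger than the tag population, which is the usual motivation for splitting tags into groups or resizing frames.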

  1. A method of optimized neural network by L-M algorithm to transformer winding hot spot temperature forecasting

    NASA Astrophysics Data System (ADS)

    Wei, B. G.; Wu, X. Y.; Yao, Z. F.; Huang, H.

    2017-11-01

    Transformers are essential devices of the power system. The accurate computation of the hot-spot temperature (HST), the highest temperature of a transformer’s windings, is very significant, because the HST is a fundamental parameter in controlling the load operation mode and influencing the lifetime of the insulation. Based on an analysis of the heat transfer processes and the thermal characteristics inside transformers, the influence of factors such as sunshine and external wind speed on oil-immersed transformers is taken into consideration. Experimental data and a neural network are used for modeling and forecasting the HST, and furthermore, investigations are conducted on the optimization of the structure and algorithms of the neural network. The measured values are compared with the values calculated using the recommended algorithm of IEC60076 and using the neural network algorithm proposed by the authors; the comparison shows that the value computed with the neural network algorithm approximates the measured value better than the value computed with the algorithm proposed by IEC60076.

  2. An Efficient Biometric-Based Algorithm Using Heart Rate Variability for Securing Body Sensor Networks

    PubMed Central

    Pirbhulal, Sandeep; Zhang, Heye; Mukhopadhyay, Subhas Chandra; Li, Chunyue; Wang, Yumei; Li, Guanglin; Wu, Wanqing; Zhang, Yuan-Ting

    2015-01-01

    Body Sensor Network (BSN) is a network of several associated sensor nodes on, inside or around the human body to monitor vital signals, such as Electroencephalogram (EEG), Photoplethysmography (PPG), Electrocardiogram (ECG), etc. Each sensor node in BSN delivers major information; therefore, it is very significant to provide data confidentiality and security. All existing approaches to secure BSN are based on complex cryptographic key generation procedures, which not only demand high resource utilization and computation time, but also consume a large amount of energy, power and memory during data transmission. Therefore, it is indispensable to put forward an energy-efficient and computationally less complex authentication technique for BSN. In this paper, a novel biometric-based algorithm is proposed, which utilizes Heart Rate Variability (HRV) for a simple key generation process to secure BSN. Our proposed algorithm is compared with three data authentication techniques, namely Physiological Signal based Key Agreement (PSKA), Data Encryption Standard (DES) and Rivest Shamir Adleman (RSA). Simulation is performed in Matlab and results suggest that the proposed algorithm is quite efficient in terms of transmission time utilization, average remaining energy and total power consumption. PMID:26131666

  3. An Efficient Biometric-Based Algorithm Using Heart Rate Variability for Securing Body Sensor Networks.

    PubMed

    Pirbhulal, Sandeep; Zhang, Heye; Mukhopadhyay, Subhas Chandra; Li, Chunyue; Wang, Yumei; Li, Guanglin; Wu, Wanqing; Zhang, Yuan-Ting

    2015-06-26

    Body Sensor Network (BSN) is a network of several associated sensor nodes on, inside or around the human body to monitor vital signals, such as Electroencephalogram (EEG), Photoplethysmography (PPG), Electrocardiogram (ECG), etc. Each sensor node in BSN delivers major information; therefore, it is very significant to provide data confidentiality and security. All existing approaches to secure BSN are based on complex cryptographic key generation procedures, which not only demand high resource utilization and computation time, but also consume a large amount of energy, power and memory during data transmission. Therefore, it is indispensable to put forward an energy-efficient and computationally less complex authentication technique for BSN. In this paper, a novel biometric-based algorithm is proposed, which utilizes Heart Rate Variability (HRV) for a simple key generation process to secure BSN. Our proposed algorithm is compared with three data authentication techniques, namely Physiological Signal based Key Agreement (PSKA), Data Encryption Standard (DES) and Rivest Shamir Adleman (RSA). Simulation is performed in Matlab and results suggest that the proposed algorithm is quite efficient in terms of transmission time utilization, average remaining energy and total power consumption.

  4. Minimalist ensemble algorithms for genome-wide protein localization prediction.

    PubMed

    Lin, Jhih-Rong; Mondal, Ananda Mohan; Liu, Rong; Hu, Jianjun

    2012-07-03

    Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms. This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature selection based filter and a logistic regression classifier. Using a novel concept of contribution scores, we analyzed issues of algorithm redundancy, consensus mistakes, and algorithm complementarity in designing ensemble algorithms. We applied the proposed minimalist logistic regression (LR) ensemble algorithm to two genome-wide datasets of Yeast and Human and compared its performance with current ensemble algorithms. Experimental results showed that the minimalist ensemble algorithm can achieve high prediction accuracy with only 1/3 to 1/2 of individual predictors of current ensemble algorithms, which greatly reduces computational complexity and running time. It was found that the high performance ensemble algorithms are usually composed of the predictors that together cover most of available features. Compared to the best individual predictor, our ensemble algorithm improved the prediction accuracy from AUC score of 0.558 to 0.707 for the Yeast dataset and from 0.628 to 0.646 for the Human dataset. Compared with popular weighted voting based ensemble algorithms, our classifier-based ensemble algorithms achieved much better performance without suffering from inclusion of too many individual predictors. 
We proposed a method for rational design of minimalist ensemble algorithms using feature selection and classifiers. The proposed minimalist ensemble algorithm based on logistic regression can achieve equal or better prediction performance while using only half or one-third of individual predictors compared to other ensemble algorithms. The results also suggested that meta-predictors that take advantage of a variety of features by combining individual predictors tend to achieve the best performance. The LR ensemble server and related benchmark datasets are available at http://mleg.cse.sc.edu/LRensemble/cgi-bin/predict.cgi.

  5. Minimalist ensemble algorithms for genome-wide protein localization prediction

    PubMed Central

    2012-01-01

    Background Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms. Results This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature selection based filter and a logistic regression classifier. Using a novel concept of contribution scores, we analyzed issues of algorithm redundancy, consensus mistakes, and algorithm complementarity in designing ensemble algorithms. We applied the proposed minimalist logistic regression (LR) ensemble algorithm to two genome-wide datasets of Yeast and Human and compared its performance with current ensemble algorithms. Experimental results showed that the minimalist ensemble algorithm can achieve high prediction accuracy with only 1/3 to 1/2 of individual predictors of current ensemble algorithms, which greatly reduces computational complexity and running time. It was found that the high performance ensemble algorithms are usually composed of the predictors that together cover most of available features. Compared to the best individual predictor, our ensemble algorithm improved the prediction accuracy from AUC score of 0.558 to 0.707 for the Yeast dataset and from 0.628 to 0.646 for the Human dataset. Compared with popular weighted voting based ensemble algorithms, our classifier-based ensemble algorithms achieved much better performance without suffering from inclusion of too many individual predictors. 
Conclusions We proposed a method for rational design of minimalist ensemble algorithms using feature selection and classifiers. The proposed minimalist ensemble algorithm based on logistic regression can achieve equal or better prediction performance while using only half or one-third of individual predictors compared to other ensemble algorithms. The results also suggested that meta-predictors that take advantage of a variety of features by combining individual predictors tend to achieve the best performance. The LR ensemble server and related benchmark datasets are available at http://mleg.cse.sc.edu/LRensemble/cgi-bin/predict.cgi. PMID:22759391

  6. Improved parallel data partitioning by nested dissection with applications to information retrieval.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wolf, Michael M.; Chevalier, Cedric; Boman, Erik Gunnar

    The computational work in many information retrieval and analysis algorithms is based on sparse linear algebra. Sparse matrix-vector multiplication is a common kernel in many of these computations. Thus, an important related combinatorial problem in parallel computing is how to distribute the matrix and the vectors among processors so as to minimize the communication cost. We focus on minimizing the total communication volume while keeping the computation balanced across processes. In [1], the first two authors presented a new 2D partitioning method, the nested dissection partitioning algorithm. In this paper, we improve on that algorithm and show that it is a good option for data partitioning in information retrieval. We also show partitioning time can be substantially reduced by using the SCOTCH software, and quality improves in some cases, too.
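
    The quantity being minimized can be made concrete with a small sketch that counts the communication volume of y = A x. It assumes a simpler setting than the paper's 2D method: a 1D row partition of a square matrix in which the process that owns row i also owns x[i] and y[i]. The function name `communication_volume` and the 4x4 tridiagonal pattern are hypothetical.

```python
def communication_volume(nonzeros, owner):
    """Total vector entries sent for y = A x under a 1D row partition.

    nonzeros: iterable of (row, col) positions of a square sparse matrix.
    owner:    owner[i] is the process holding row i, x[i] and y[i].
    """
    needed_by = {}
    for row, col in nonzeros:
        # The process owning this row needs x[col] to form its part of y.
        needed_by.setdefault(col, set()).add(owner[row])
    # x[col] travels once to every process that needs it but does not own it.
    return sum(len(parts - {owner[col]}) for col, parts in needed_by.items())

# Nonzero pattern of a 4x4 tridiagonal matrix.
nonzeros = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2),
            (2, 1), (2, 2), (2, 3), (3, 2), (3, 3)]
contiguous = communication_volume(nonzeros, [0, 0, 1, 1])
interleaved = communication_volume(nonzeros, [0, 1, 0, 1])
```

    Both partitions give each process two rows and five nonzeros, so the computation is balanced in either case; only the communication volume differs, which is why the partitioner's choice matters.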

  7. Design of an optimum computer vision-based automatic abalone (Haliotis discus hannai) grading algorithm.

    PubMed

    Lee, Donggil; Lee, Kyounghoon; Kim, Seonghun; Yang, Yongsu

    2015-04-01

    An automatic abalone grading algorithm that estimates abalone weights on the basis of computer vision using 2D images is developed and tested. The algorithm overcomes the problems experienced by conventional abalone grading methods that utilize manual sorting and mechanical automatic grading. To design an optimal algorithm, a regression formula and R² value were investigated by performing a regression analysis for each of total length, body width, thickness, view area, and actual volume against abalone weights. The R² value between the actual volume and abalone weight was 0.999, showing a relatively high correlation. As a result, to easily estimate the actual volumes of abalones based on computer vision, the volumes were calculated under the assumption that abalone shapes are half-oblate ellipsoids, and a regression formula was derived to estimate the volumes of abalones through linear regression analysis between the calculated and actual volumes. The final automatic abalone grading algorithm is designed using the abalone volume estimation regression formula derived from test results, and the actual volumes and abalone weights regression formula. In the range of abalones weighing from 16.51 to 128.01 g, the results of evaluation of the performance of the algorithm via cross-validation indicate root mean square and worst-case prediction errors of 2.8 and ±8 g, respectively. © 2015 Institute of Food Technologists®
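
    The two-stage idea (a geometric volume estimate followed by linear regression) can be sketched as below. The mapping of the measurements onto the ellipsoid is an assumption, not taken from the paper: half the total length and half the body width are used as the horizontal semi-axes and the full thickness as the vertical semi-axis. The sample measurements are made up for illustration and are not the study's data.

```python
import math

def half_ellipsoid_volume(length, width, thickness):
    """Volume of a half ellipsoid with semi-axes length/2, width/2, thickness."""
    return (2.0 / 3.0) * math.pi * (length / 2.0) * (width / 2.0) * thickness

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Made-up measurements (length cm, width cm, thickness cm, weight g).
samples = [(6.0, 4.0, 1.5, 20.0), (8.0, 5.5, 2.0, 48.0), (10.0, 7.0, 2.5, 95.0)]
volumes = [half_ellipsoid_volume(l, w, t) for l, w, t, _ in samples]
weights = [g for _, _, _, g in samples]
slope, intercept = fit_line(volumes, weights)
fit_quality = r_squared(volumes, weights, slope, intercept)
```

    In a vision pipeline the three dimensions would come from image measurements, and the fitted line would convert the geometric volume into a weight estimate used for grading.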

  8. Air traffic surveillance and control using hybrid estimation and protocol-based conflict resolution

    NASA Astrophysics Data System (ADS)

    Hwang, Inseok

    The continued growth of air travel and recent advances in new technologies for navigation, surveillance, and communication have led to proposals by the Federal Aviation Administration (FAA) to provide reliable and efficient tools to aid Air Traffic Control (ATC) in performing their tasks. In this dissertation, we address four problems frequently encountered in air traffic surveillance and control: multiple target tracking and identity management, conflict detection, conflict resolution, and safety verification. We develop a set of algorithms and tools to aid ATC; these algorithms have the provable properties of safety, computational efficiency, and convergence. Firstly, we develop a multiple-maneuvering-target tracking and identity management algorithm which can keep track of maneuvering aircraft in noisy environments and of their identities. Secondly, we propose a hybrid probabilistic conflict detection algorithm between multiple aircraft which uses flight mode estimates as well as aircraft current state estimates. Our algorithm is based on hybrid models of aircraft, which incorporate both continuous dynamics and discrete mode switching. Thirdly, we develop an algorithm for multiple (greater than two) aircraft conflict avoidance that is based on a closed-form analytic solution and thus provides guarantees of safety. Finally, we consider the problem of safety verification of control laws for safety critical systems, with application to air traffic control systems. We approach safety verification through reachability analysis, which is a computationally expensive problem. We develop an over-approximate method for reachable set computation using polytopic approximation methods and dynamic optimization. These algorithms may be used either in a fully autonomous way, or as supporting tools to increase controllers' situational awareness and to reduce their work load.

  9. Practical Sub-Nyquist Sampling via Array-Based Compressed Sensing Receiver Architecture

    DTIC Science & Technology

    2016-07-10

    different array elements at different sub-Nyquist sampling rates. Signal processing inspired by the sparse fast Fourier transform allows for signal...reconstruction algorithms can be computationally demanding (REF). The related sparse Fourier transform algorithms aim to reduce the processing time necessary to...compute the DFT of frequency-sparse signals [7]. In particular, the sparse fast Fourier transform (sFFT) achieves processing time better than the

  10. A memetic optimization algorithm for multi-constrained multicast routing in ad hoc networks

    PubMed Central

    Hammad, Karim; El Bakly, Ahmed M.

    2018-01-01

    A mobile ad hoc network is a conventional self-configuring network where the routing optimization problem—subject to various Quality-of-Service (QoS) constraints—represents a major challenge. Unlike previously proposed solutions, in this paper, we propose a memetic algorithm (MA) employing an adaptive mutation parameter, to solve the multicast routing problem with higher search ability and computational efficiency. The proposed algorithm utilizes an updated scheme, based on statistical analysis, to estimate the best values for all MA parameters and enhance MA performance. The numerical results show that the proposed MA improved the delay and jitter of the network, while reducing computational complexity as compared to existing algorithms. PMID:29509760

  11. A memetic optimization algorithm for multi-constrained multicast routing in ad hoc networks.

    PubMed

    Ramadan, Rahab M; Gasser, Safa M; El-Mahallawy, Mohamed S; Hammad, Karim; El Bakly, Ahmed M

    2018-01-01

    A mobile ad hoc network is a conventional self-configuring network where the routing optimization problem, subject to various Quality-of-Service (QoS) constraints, represents a major challenge. Unlike previously proposed solutions, in this paper, we propose a memetic algorithm (MA) employing an adaptive mutation parameter, to solve the multicast routing problem with higher search ability and computational efficiency. The proposed algorithm utilizes an updated scheme, based on statistical analysis, to estimate the best values for all MA parameters and enhance MA performance. The numerical results show that the proposed MA improved the delay and jitter of the network, while reducing computational complexity as compared to existing algorithms.

  12. Pressure algorithm for elliptic flow calculations with the PDF method

    NASA Technical Reports Server (NTRS)

    Anand, M. S.; Pope, S. B.; Mongia, H. C.

    1991-01-01

    An algorithm to determine the mean pressure field for elliptic flow calculations with the probability density function (PDF) method is developed and applied. The PDF method is a most promising approach for the computation of turbulent reacting flows. Previous computations of elliptic flows with the method were in conjunction with conventional finite volume based calculations that provided the mean pressure field. The algorithm developed and described here permits the mean pressure field to be determined within the PDF calculations. The PDF method incorporating the pressure algorithm is applied to the flow past a backward-facing step. The results are in good agreement with data for the reattachment length, mean velocities, and turbulence quantities including triple correlations.

  13. Optimization of view weighting in tilted-plane-based reconstruction algorithms to minimize helical artifacts in multi-slice helical CT

    NASA Astrophysics Data System (ADS)

    Tang, Xiangyang

    2003-05-01

    In multi-slice helical CT, the single-tilted-plane-based reconstruction algorithm has been proposed to combat helical and cone beam artifacts by tilting a reconstruction plane to fit a helical source trajectory optimally. Furthermore, to improve the noise characteristics or dose efficiency of the single-tilted-plane-based reconstruction algorithm, the multi-tilted-plane-based reconstruction algorithm has been proposed, in which the reconstruction plane deviates from the pose globally optimized due to an extra rotation along the 3rd axis. As a result, the capability of suppressing helical and cone beam artifacts in the multi-tilted-plane-based reconstruction algorithm is compromised. An optimized tilted-plane-based reconstruction algorithm is proposed in this paper, in which a matched view weighting strategy is proposed to optimize the capability of suppressing helical and cone beam artifacts and noise characteristics. A helical body phantom is employed to quantitatively evaluate the imaging performance of the matched view weighting approach by tabulating artifact index and noise characteristics, showing that the matched view weighting improves both the helical artifact suppression and noise characteristics or dose efficiency significantly in comparison to the case in which non-matched view weighting is applied. Finally, it is believed that the matched view weighting approach is of practical importance in the development of multi-slice helical CT, because it maintains the computational structure of fan beam filtered backprojection and demands no extra computational services.

  14. Optimization algorithms for large-scale multireservoir hydropower systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hiew, K.L.

    Five optimization algorithms were vigorously evaluated based on applications on a hypothetical five-reservoir hydropower system. These algorithms are incremental dynamic programming (IDP), successive linear programming (SLP), feasible direction method (FDM), optimal control theory (OCT) and objective-space dynamic programming (OSDP). The performance of these algorithms was comparatively evaluated using unbiased, objective criteria which include accuracy of results, rate of convergence, smoothness of resulting storage and release trajectories, computer time and memory requirements, robustness and other pertinent secondary considerations. Results have shown that all the algorithms, with the exception of OSDP, converge to optimum objective values within 1.0% difference from one another. The highest objective value is obtained by IDP, followed closely by OCT. Computer time required by these algorithms, however, differs by more than two orders of magnitude, ranging from 10 seconds in the case of OCT to a maximum of about 2000 seconds for IDP. With a well-designed penalty scheme to deal with state-space constraints, OCT proves to be the most efficient algorithm based on its overall performance. SLP, FDM, and OCT were applied to the case study of Mahaweli project, a ten-powerplant system in Sri Lanka.

  15. A new root-based direction-finding algorithm

    NASA Astrophysics Data System (ADS)

    Wasylkiwskyj, Wasyl; Kopriva, Ivica; Doroslovački, Miloš; Zaghloul, Amir I.

    2007-04-01

    Polynomial rooting direction-finding (DF) algorithms are a computationally efficient alternative to search-based DF algorithms and are particularly suitable for uniform linear arrays of physically identical elements provided that mutual interaction among the array elements can be either neglected or compensated for. A popular algorithm in such situations is Root Multiple Signal Classification (Root MUSIC (RM)), wherein the estimation of the directions of arrival (DOA) requires the computation of the roots of a (2N - 2)-order polynomial, where N represents the number of array elements. The DOA are estimated from the L pairs of roots closest to the unit circle, where L represents the number of sources. In this paper we derive a modified root polynomial (MRP) algorithm requiring the calculation of only L roots in order to estimate the L DOA. We evaluate the performance of the MRP algorithm numerically and show that it is as accurate as the RM algorithm but with a significantly simpler algebraic structure. In order to demonstrate that the theoretically predicted performance can be achieved in an experimental setting, a decoupled array is emulated in hardware using phase shifters. The results are in excellent agreement with theory.
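
    The baseline Root MUSIC procedure described in this abstract can be sketched as follows. This is a minimal illustration of the standard (2N - 2)-order rooting step only, not the paper's MRP algorithm, whose derivation the abstract does not give. The function names, the half-wavelength element spacing, and the simulated data are assumptions made for the example.

```python
import numpy as np

def root_music(X, num_sources, spacing=0.5):
    """Estimate DOAs (degrees) from array snapshots X of shape (N, T).

    spacing is the element spacing in wavelengths (assumed 0.5 here).
    """
    N, T = X.shape
    R = X @ X.conj().T / T                      # sample covariance matrix
    _, eigvecs = np.linalg.eigh(R)              # eigenvalues in ascending order
    En = eigvecs[:, : N - num_sources]          # noise-subspace eigenvectors
    C = En @ En.conj().T
    # Coefficient of z^k is the sum of the k-th diagonal of C; listing k from
    # N-1 down to -(N-1) gives a polynomial of order 2N - 2 for np.roots.
    coeffs = [np.trace(C, offset=k) for k in range(N - 1, -N, -1)]
    roots = np.roots(coeffs)
    # Roots come in reciprocal-conjugate pairs; keep one member of each pair
    # (inside the unit circle) and pick the L closest to the circle.
    roots = roots[np.abs(roots) < 1.0]
    roots = roots[np.argsort(1.0 - np.abs(roots))][:num_sources]
    sin_theta = np.angle(roots) / (2.0 * np.pi * spacing)
    return np.sort(np.degrees(np.arcsin(np.clip(sin_theta, -1.0, 1.0))))

def simulate_ula(angles_deg, N=8, T=200, noise_std=0.3, spacing=0.5, seed=0):
    """Hypothetical test data: uncorrelated sources on an ideal uniform linear array."""
    rng = np.random.default_rng(seed)
    angles = np.radians(np.asarray(angles_deg, dtype=float))
    n = np.arange(N)[:, None]
    A = np.exp(2j * np.pi * spacing * n * np.sin(angles)[None, :])  # steering matrix
    L = len(angles)
    S = (rng.standard_normal((L, T)) + 1j * rng.standard_normal((L, T))) / np.sqrt(2)
    W = (rng.standard_normal((N, T)) + 1j * rng.standard_normal((N, T))) / np.sqrt(2)
    return A @ S + noise_std * W

estimates = root_music(simulate_ula([-20.0, 30.0]), num_sources=2)
```

    Here all 2N - 2 roots are computed and L of them are kept, which is the cost the MRP algorithm is said to reduce by requiring only L roots.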

  16. Kalman/Map filtering-aided fast normalized cross correlation-based Wi-Fi fingerprinting location sensing.

    PubMed

    Sun, Yongliang; Xu, Yubin; Li, Cheng; Ma, Lin

    2013-11-13

    A Kalman/map filtering (KMF)-aided fast normalized cross correlation (FNCC)-based Wi-Fi fingerprinting location sensing system is proposed in this paper. Compared with conventional neighbor selection algorithms that calculate localization results with received signal strength (RSS) mean samples, the proposed FNCC algorithm makes use of all the on-line RSS samples and reference point RSS variations to achieve higher fingerprinting accuracy. The FNCC computes efficiently while maintaining the same accuracy as the basic normalized cross correlation. Additionally, a KMF is also proposed to process fingerprinting localization results. It employs a new map matching algorithm to nonlinearize the linear location prediction process of Kalman filtering (KF) that takes advantage of spatial proximities of consecutive localization results. With a calibration model integrated into an indoor map, the map matching algorithm corrects unreasonable prediction locations of the KF according to the building interior structure. Thus, more accurate prediction locations are obtained. Using these locations, the KMF considerably improves fingerprinting algorithm performance. Experimental results demonstrate that the FNCC algorithm with reduced computational complexity outperforms other neighbor selection algorithms and the KMF effectively improves location sensing accuracy by using indoor map information and spatial proximities of consecutive localization results.
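
    As a rough illustration of correlation-based fingerprint matching, the sketch below scores an on-line RSS vector against each reference point with the basic normalized cross correlation and averages the positions of the best matches. It is not the paper's FNCC or KMF; the function names, the k-best weighting, and the toy radio map are assumptions made for the example.

```python
import numpy as np

def ncc(a, b):
    """Basic normalized cross correlation between two RSS vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def locate(online_rss, radio_map, positions, k=3):
    """Weighted average of the k reference points with the highest NCC score.

    radio_map: (num_reference_points, num_access_points) mean RSS values in dBm
    positions: (num_reference_points, 2) reference point coordinates
    """
    radio_map = np.asarray(radio_map, dtype=float)
    positions = np.asarray(positions, dtype=float)
    scores = np.array([ncc(online_rss, fp) for fp in radio_map])
    top = np.argsort(scores)[-k:]               # indices of the k best matches
    weights = np.clip(scores[top], 0.0, None)   # ignore negative correlations
    if weights.sum() == 0:
        weights = np.ones(len(top))
    return weights @ positions[top] / weights.sum()

# Toy radio map: three reference points, four access points (hypothetical values).
radio_map = [[-40, -60, -70, -80],
             [-70, -45, -60, -75],
             [-80, -70, -50, -45]]
positions = [[0.0, 0.0], [5.0, 0.0], [5.0, 5.0]]
online = np.array(radio_map[1]) + 3.0           # same pattern, constant offset
estimate = locate(online, radio_map, positions, k=1)
```

    Because NCC removes the mean and normalizes the scale, a constant RSS offset between devices does not change the best match in this toy case.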

  17. Kalman/Map Filtering-Aided Fast Normalized Cross Correlation-Based Wi-Fi Fingerprinting Location Sensing

    PubMed Central

    Sun, Yongliang; Xu, Yubin; Li, Cheng; Ma, Lin

    2013-01-01

    A Kalman/map filtering (KMF)-aided fast normalized cross correlation (FNCC)-based Wi-Fi fingerprinting location sensing system is proposed in this paper. Compared with conventional neighbor selection algorithms that calculate localization results with received signal strength (RSS) mean samples, the proposed FNCC algorithm makes use of all the on-line RSS samples and reference point RSS variations to achieve higher fingerprinting accuracy. The FNCC computes efficiently while maintaining the same accuracy as the basic normalized cross correlation. Additionally, a KMF is also proposed to process fingerprinting localization results. It employs a new map matching algorithm to nonlinearize the linear location prediction process of Kalman filtering (KF) that takes advantage of spatial proximities of consecutive localization results. With a calibration model integrated into an indoor map, the map matching algorithm corrects unreasonable prediction locations of the KF according to the building interior structure. Thus, more accurate prediction locations are obtained. Using these locations, the KMF considerably improves fingerprinting algorithm performance. Experimental results demonstrate that the FNCC algorithm with reduced computational complexity outperforms other neighbor selection algorithms and the KMF effectively improves location sensing accuracy by using indoor map information and spatial proximities of consecutive localization results. PMID:24233027

  18. A-VCI: A flexible method to efficiently compute vibrational spectra

    NASA Astrophysics Data System (ADS)

    Odunlami, Marc; Le Bris, Vincent; Bégué, Didier; Baraille, Isabelle; Coulaud, Olivier

    2017-06-01

    The adaptive vibrational configuration interaction algorithm has been introduced as a new method to efficiently reduce the dimension of the set of basis functions used in a vibrational configuration interaction process. It is based on the construction of nested bases for the discretization of the Hamiltonian operator according to a theoretical criterion that ensures the convergence of the method. In the present work, the Hamiltonian is written as a sum of products of operators. The purpose of this paper is to study the properties and outline the performance details of the main steps of the algorithm. New parameters have been incorporated to increase flexibility, and their influence has been thoroughly investigated. The robustness and reliability of the method are demonstrated for the computation of the vibrational spectrum up to 3000 cm⁻¹ of a widely studied 6-atom molecule (acetonitrile). Our results are compared to the most accurate up to date computation; we also give a new reference calculation for future work on this system. The algorithm has also been applied to a more challenging 7-atom molecule (ethylene oxide). The computed spectrum up to 3200 cm⁻¹ is the most accurate computation that exists today on such systems.

  19. A-VCI: A flexible method to efficiently compute vibrational spectra.

    PubMed

    Odunlami, Marc; Le Bris, Vincent; Bégué, Didier; Baraille, Isabelle; Coulaud, Olivier

    2017-06-07

    The adaptive vibrational configuration interaction algorithm has been introduced as a new method to efficiently reduce the dimension of the set of basis functions used in a vibrational configuration interaction process. It is based on the construction of nested bases for the discretization of the Hamiltonian operator according to a theoretical criterion that ensures the convergence of the method. In the present work, the Hamiltonian is written as a sum of products of operators. The purpose of this paper is to study the properties and outline the performance details of the main steps of the algorithm. New parameters have been incorporated to increase flexibility, and their influence has been thoroughly investigated. The robustness and reliability of the method are demonstrated for the computation of the vibrational spectrum up to 3000 cm⁻¹ of a widely studied 6-atom molecule (acetonitrile). Our results are compared to the most accurate up to date computation; we also give a new reference calculation for future work on this system. The algorithm has also been applied to a more challenging 7-atom molecule (ethylene oxide). The computed spectrum up to 3200 cm⁻¹ is the most accurate computation that exists today on such systems.

  20. Fast semivariogram computation using FPGA architectures

    NASA Astrophysics Data System (ADS)

    Lagadapati, Yamuna; Shirvaikar, Mukul; Dong, Xuanliang

    2015-02-01

    The semivariogram is a statistical measure of the spatial distribution of data and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. The semivariogram is a plot of semivariances for different lag distances between pixels. A semivariance, γ(h), is defined as half of the expected squared difference of pixel values between any two data locations with a lag distance of h. Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O(n^2). Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz, but they can perform tens of thousands of calculations per clock cycle while operating in the low range of power. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. The design consists of several modules dedicated to the constituent computational tasks. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. Anisotropic semivariogram implementation is anticipated to be an extension of the current architecture, ostensibly based on refinements to the current modules. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development kit, which utilizes the Virtex5 FPGA. Medical image data from MRI scans are utilized for the experiments. Computational speedup is measured with respect to a MATLAB implementation on a personal computer with an Intel i7 multi-core processor. Preliminary simulation results indicate that a significant advantage in speed can be attained by the architectures, making the algorithm viable for implementation in medical devices.
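
    The semivariance definition in this abstract, and the pairwise cost it implies, can be sketched in software as follows. This is a plain reference computation, not the FPGA design; the function name and the choice to bin pairs by Euclidean distance rounded to the nearest integer lag are assumptions made for the example.

```python
import numpy as np

def semivariogram(image, max_lag):
    """Brute-force isotropic semivariogram of a 2-D image.

    Returns gamma, where gamma[h] is half the mean squared difference of pixel
    values over all pixel pairs whose rounded distance is h (NaN if no pairs).
    Every pixel pair is visited once, so the cost grows as O(n^2) in the
    number of pixels n.
    """
    img = np.asarray(image, dtype=float)
    rows, cols = np.indices(img.shape)
    coords = np.column_stack([rows.ravel(), cols.ravel()])
    values = img.ravel()
    sums = np.zeros(max_lag + 1)
    counts = np.zeros(max_lag + 1, dtype=int)
    for i in range(len(values) - 1):
        # Distances and squared differences from pixel i to all later pixels.
        delta = coords[i + 1:] - coords[i]
        lag = np.rint(np.hypot(delta[:, 0], delta[:, 1])).astype(int)
        sq_diff = (values[i + 1:] - values[i]) ** 2
        keep = lag <= max_lag
        np.add.at(sums, lag[keep], sq_diff[keep])
        np.add.at(counts, lag[keep], 1)
    gamma = np.full(max_lag + 1, np.nan)
    has_pairs = counts > 0
    gamma[has_pairs] = sums[has_pairs] / (2.0 * counts[has_pairs])
    return gamma

# A 1 x 4 ramp: lag 1 has three pairs differing by 1, lag 2 two pairs differing by 2, etc.
gamma = semivariogram([[0, 1, 2, 3]], max_lag=3)
```

    The inner accumulation over pixel pairs is independent per pair, which is the property the abstract's replicated processing units exploit.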
