developed computational algorithms: Topics by Science.gov

Sample records for developed computational algorithms

Computation of Symmetric Discrete Cosine Transform Using Bakhvalov's Algorithm

NASA Technical Reports Server (NTRS)

Aburdene, Maurice F.; Strojny, Brian C.; Dorband, John E.

2005-01-01

A number of algorithms for recursive computation of the discrete cosine transform (DCT) have been developed recently. This paper presents a new method for computing the discrete cosine transform and its inverse using Bakhvalov's algorithm, a method developed for evaluation of a polynomial at a point. In this paper, we will focus on both the application of the algorithm to the computation of the DCT-I and its complexity. In addition, Bakhvalov s algorithm is compared with Clenshaw s algorithm for the computation of the DCT.
A class of parallel algorithms for computation of the manipulator inertia matrix

NASA Technical Reports Server (NTRS)

Fijany, Amir; Bejczy, Antal K.

1989-01-01

Parallel and parallel/pipeline algorithms for computation of the manipulator inertia matrix are presented. An algorithm based on composite rigid-body spatial inertia method, which provides better features for parallelization, is used for the computation of the inertia matrix. Two parallel algorithms are developed which achieve the time lower bound in computation. Also described is the mapping of these algorithms with topological variation on a two-dimensional processor array, with nearest-neighbor connection, and with cardinality variation on a linear processor array. An efficient parallel/pipeline algorithm for the linear array was also developed, but at significantly higher efficiency.
Development and application of unified algorithms for problems in computational science

NASA Technical Reports Server (NTRS)

Shankar, Vijaya; Chakravarthy, Sukumar

1987-01-01

A framework is presented for developing computationally unified numerical algorithms for solving nonlinear equations that arise in modeling various problems in mathematical physics. The concept of computational unification is an attempt to encompass efficient solution procedures for computing various nonlinear phenomena that may occur in a given problem. For example, in Computational Fluid Dynamics (CFD), a unified algorithm will be one that allows for solutions to subsonic (elliptic), transonic (mixed elliptic-hyperbolic), and supersonic (hyperbolic) flows for both steady and unsteady problems. The objectives are: development of superior unified algorithms emphasizing accuracy and efficiency aspects; development of codes based on selected algorithms leading to validation; application of mature codes to realistic problems; and extension/application of CFD-based algorithms to problems in other areas of mathematical physics. The ultimate objective is to achieve integration of multidisciplinary technologies to enhance synergism in the design process through computational simulation. Specific unified algorithms for a hierarchy of gas dynamics equations and their applications to two other areas: electromagnetic scattering, and laser-materials interaction accounting for melting.
CAD system for footwear design based on whole real 3D data of last surface

NASA Astrophysics Data System (ADS)

Song, Wanzhong; Su, Xianyu

2000-10-01

Two major parts of application of CAD in footwear design are studied: the development of last surface; computer-aided design of planar shoe-template. A new quasi-experiential development algorithm of last surface based on triangulation approximation is presented. This development algorithm consumes less time and does not need any interactive operation for precisely development compared with other development algorithm of last surface. Based on this algorithm, a software, SHOEMAKERTM, which contains computer aided automatic measurement, automatic development of last surface and computer aide design of shoe-template has been developed.
Efficient conjugate gradient algorithms for computation of the manipulator forward dynamics

NASA Technical Reports Server (NTRS)

Fijany, Amir; Scheid, Robert E.

1989-01-01

The applicability of conjugate gradient algorithms for computation of the manipulator forward dynamics is investigated. The redundancies in the previously proposed conjugate gradient algorithm are analyzed. A new version is developed which, by avoiding these redundancies, achieves a significantly greater efficiency. A preconditioned conjugate gradient algorithm is also presented. A diagonal matrix whose elements are the diagonal elements of the inertia matrix is proposed as the preconditioner. In order to increase the computational efficiency, an algorithm is developed which exploits the synergism between the computation of the diagonal elements of the inertia matrix and that required by the conjugate gradient algorithm.
Parallel algorithms for computation of the manipulator inertia matrix

NASA Technical Reports Server (NTRS)

Amin-Javaheri, Masoud; Orin, David E.

1989-01-01

The development of an O(log2N) parallel algorithm for the manipulator inertia matrix is presented. It is based on the most efficient serial algorithm which uses the composite rigid body method. Recursive doubling is used to reformulate the linear recurrence equations which are required to compute the diagonal elements of the matrix. It results in O(log2N) levels of computation. Computation of the off-diagonal elements involves N linear recurrences of varying-size and a new method, which avoids redundant computation of position and orientation transforms for the manipulator, is developed. The O(log2N) algorithm is presented in both equation and graphic forms which clearly show the parallelism inherent in the algorithm.
Signal and image processing algorithm performance in a virtual and elastic computing environment

NASA Astrophysics Data System (ADS)

Bennett, Kelly W.; Robertson, James

2013-05-01

The U.S. Army Research Laboratory (ARL) supports the development of classification, detection, tracking, and localization algorithms using multiple sensing modalities including acoustic, seismic, E-field, magnetic field, PIR, and visual and IR imaging. Multimodal sensors collect large amounts of data in support of algorithm development. The resulting large amount of data, and their associated high-performance computing needs, increases and challenges existing computing infrastructures. Purchasing computer power as a commodity using a Cloud service offers low-cost, pay-as-you-go pricing models, scalability, and elasticity that may provide solutions to develop and optimize algorithms without having to procure additional hardware and resources. This paper provides a detailed look at using a commercial cloud service provider, such as Amazon Web Services (AWS), to develop and deploy simple signal and image processing algorithms in a cloud and run the algorithms on a large set of data archived in the ARL Multimodal Signatures Database (MMSDB). Analytical results will provide performance comparisons with existing infrastructure. A discussion on using cloud computing with government data will discuss best security practices that exist within cloud services, such as AWS.
Parallel conjugate gradient algorithms for manipulator dynamic simulation

NASA Technical Reports Server (NTRS)

Fijany, Amir; Scheld, Robert E.

1989-01-01

Parallel conjugate gradient algorithms for the computation of multibody dynamics are developed for the specialized case of a robot manipulator. For an n-dimensional positive-definite linear system, the Classical Conjugate Gradient (CCG) algorithms are guaranteed to converge in n iterations, each with a computation cost of O(n); this leads to a total computational cost of O(n sq) on a serial processor. A conjugate gradient algorithms is presented that provide greater efficiency using a preconditioner, which reduces the number of iterations required, and by exploiting parallelism, which reduces the cost of each iteration. Two Preconditioned Conjugate Gradient (PCG) algorithms are proposed which respectively use a diagonal and a tridiagonal matrix, composed of the diagonal and tridiagonal elements of the mass matrix, as preconditioners. Parallel algorithms are developed to compute the preconditioners and their inversions in O(log sub 2 n) steps using n processors. A parallel algorithm is also presented which, on the same architecture, achieves the computational time of O(log sub 2 n) for each iteration. Simulation results for a seven degree-of-freedom manipulator are presented. Variants of the proposed algorithms are also developed which can be efficiently implemented on the Robot Mathematics Processor (RMP).
GeoBuilder: a geometric algorithm visualization and debugging system for 2D and 3D geometric computing.

PubMed

Wei, Jyh-Da; Tsai, Ming-Hung; Lee, Gen-Cher; Huang, Jeng-Hung; Lee, Der-Tsai

2009-01-01

Algorithm visualization is a unique research topic that integrates engineering skills such as computer graphics, system programming, database management, computer networks, etc., to facilitate algorithmic researchers in testing their ideas, demonstrating new findings, and teaching algorithm design in the classroom. Within the broad applications of algorithm visualization, there still remain performance issues that deserve further research, e.g., system portability, collaboration capability, and animation effect in 3D environments. Using modern technologies of Java programming, we develop an algorithm visualization and debugging system, dubbed GeoBuilder, for geometric computing. The GeoBuilder system features Java's promising portability, engagement of collaboration in algorithm development, and automatic camera positioning for tracking 3D geometric objects. In this paper, we describe the design of the GeoBuilder system and demonstrate its applications.
Computer-aided US diagnosis of breast lesions by using cell-based contour grouping.

PubMed

Cheng, Jie-Zhi; Chou, Yi-Hong; Huang, Chiun-Sheng; Chang, Yeun-Chung; Tiu, Chui-Mei; Chen, Kuei-Wu; Chen, Chung-Ming

2010-06-01

To develop a computer-aided diagnostic algorithm with automatic boundary delineation for differential diagnosis of benign and malignant breast lesions at ultrasonography (US) and investigate the effect of boundary quality on the performance of a computer-aided diagnostic algorithm. This was an institutional review board-approved retrospective study with waiver of informed consent. A cell-based contour grouping (CBCG) segmentation algorithm was used to delineate the lesion boundaries automatically. Seven morphologic features were extracted. The classifier was a logistic regression function. Five hundred twenty breast US scans were obtained from 520 subjects (age range, 15-89 years), including 275 benign (mean size, 15 mm; range, 5-35 mm) and 245 malignant (mean size, 18 mm; range, 8-29 mm) lesions. The newly developed computer-aided diagnostic algorithm was evaluated on the basis of boundary quality and differentiation performance. The segmentation algorithms and features in two conventional computer-aided diagnostic algorithms were used for comparative study. The CBCG-generated boundaries were shown to be comparable with the manually delineated boundaries. The area under the receiver operating characteristic curve (AUC) and differentiation accuracy were 0.968 +/- 0.010 and 93.1% +/- 0.7, respectively, for all 520 breast lesions. At the 5% significance level, the newly developed algorithm was shown to be superior to the use of the boundaries and features of the two conventional computer-aided diagnostic algorithms in terms of AUC (0.974 +/- 0.007 versus 0.890 +/- 0.008 and 0.788 +/- 0.024, respectively). The newly developed computer-aided diagnostic algorithm that used a CBCG segmentation method to measure boundaries achieved a high differentiation performance. Copyright RSNA, 2010
Computer architecture for efficient algorithmic executions in real-time systems: New technology for avionics systems and advanced space vehicles

NASA Technical Reports Server (NTRS)

Carroll, Chester C.; Youngblood, John N.; Saha, Aindam

1987-01-01

Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel system to increase system performance. Research conducted in development of specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processing elements. This allocation is based on the critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, tasks definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.
Computer architecture for efficient algorithmic executions in real-time systems: new technology for avionics systems and advanced space vehicles

DOE Office of Scientific and Technical Information (OSTI.GOV)

Carroll, C.C.; Youngblood, J.N.; Saha, A.

1987-12-01

Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel system to increase system performance. Research conducted in development of specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processingmore » elements. This allocation is based on the critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, tasks definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.« less
Computing rank-revealing QR factorizations of dense matrices.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bischof, C. H.; Quintana-Orti, G.; Mathematics and Computer Science

1998-06-01

We develop algorithms and implementations for computing rank-revealing QR (RRQR) factorizations of dense matrices. First, we develop an efficient block algorithm for approximating an RRQR factorization, employing a windowed version of the commonly used Golub pivoting strategy, aided by incremental condition estimation. Second, we develop efficiently implementable variants of guaranteed reliable RRQR algorithms for triangular matrices originally suggested by Chandrasekaran and Ipsen and by Pan and Tang. We suggest algorithmic improvements with respect to condition estimation, termination criteria, and Givens updating. By combining the block algorithm with one of the triangular postprocessing steps, we arrive at an efficient and reliablemore » algorithm for computing an RRQR factorization of a dense matrix. Experimental results on IBM RS/6000 SGI R8000 platforms show that this approach performs up to three times faster that the less reliable QR factorization with column pivoting as it is currently implemented in LAPACK, and comes within 15% of the performance of the LAPACK block algorithm for computing a QR factorization without any column exchanges. Thus, we expect this routine to be useful in may circumstances where numerical rank deficiency cannot be ruled out, but currently has been ignored because of the computational cost of dealing with it.« less
Development of Online Cognitive and Algorithm Tests as Assessment Tools in Introductory Computer Science Courses

ERIC Educational Resources Information Center

Avancena, Aimee Theresa; Nishihara, Akinori; Vergara, John Paul

2012-01-01

This paper presents the online cognitive and algorithm tests, which were developed in order to determine if certain cognitive factors and fundamental algorithms correlate with the performance of students in their introductory computer science course. The tests were implemented among Management Information Systems majors from the Philippines and…
Fast algorithm for computing complex number-theoretic transforms

NASA Technical Reports Server (NTRS)

Reed, I. S.; Liu, K. Y.; Truong, T. K.

1977-01-01

A high-radix FFT algorithm for computing transforms over FFT, where q is a Mersenne prime, is developed to implement fast circular convolutions. This new algorithm requires substantially fewer multiplications than the conventional FFT.
CCOMP: An efficient algorithm for complex roots computation of determinantal equations

NASA Astrophysics Data System (ADS)

Zouros, Grigorios P.

2018-01-01

In this paper a free Python algorithm, entitled CCOMP (Complex roots COMPutation), is developed for the efficient computation of complex roots of determinantal equations inside a prescribed complex domain. The key to the method presented is the efficient determination of the candidate points inside the domain which, in their close neighborhood, a complex root may lie. Once these points are detected, the algorithm proceeds to a two-dimensional minimization problem with respect to the minimum modulus eigenvalue of the system matrix. In the core of CCOMP exist three sub-algorithms whose tasks are the efficient estimation of the minimum modulus eigenvalues of the system matrix inside the prescribed domain, the efficient computation of candidate points which guarantee the existence of minima, and finally, the computation of minima via bound constrained minimization algorithms. Theoretical results and heuristics support the development and the performance of the algorithm, which is discussed in detail. CCOMP supports general complex matrices, and its efficiency, applicability and validity is demonstrated to a variety of microwave applications.
Combinatorial Algorithms to Enable Computational Science and Engineering: Work from the CSCAPES Institute

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boman, Erik G.; Catalyurek, Umit V.; Chevalier, Cedric

2015-01-16

This final progress report summarizes the work accomplished at the Combinatorial Scientific Computing and Petascale Simulations Institute. We developed Zoltan, a parallel mesh partitioning library that made use of accurate hypergraph models to provide load balancing in mesh-based computations. We developed several graph coloring algorithms for computing Jacobian and Hessian matrices and organized them into a software package called ColPack. We developed parallel algorithms for graph coloring and graph matching problems, and also designed multi-scale graph algorithms. Three PhD students graduated, six more are continuing their PhD studies, and four postdoctoral scholars were advised. Six of these students and Fellowsmore » have joined DOE Labs (Sandia, Berkeley), as staff scientists or as postdoctoral scientists. We also organized the SIAM Workshop on Combinatorial Scientific Computing (CSC) in 2007, 2009, and 2011 to continue to foster the CSC community.« less
Performance comparison of heuristic algorithms for task scheduling in IaaS cloud computing environment.

PubMed

Madni, Syed Hamid Hussain; Abd Latiff, Muhammad Shafie; Abdullahi, Mohammed; Abdulhamid, Shafi'i Muhammad; Usman, Mohammed Joda

2017-01-01

Cloud computing infrastructure is suitable for meeting computational needs of large task sizes. Optimal scheduling of tasks in cloud computing environment has been proved to be an NP-complete problem, hence the need for the application of heuristic methods. Several heuristic algorithms have been developed and used in addressing this problem, but choosing the appropriate algorithm for solving task assignment problem of a particular nature is difficult since the methods are developed under different assumptions. Therefore, six rule based heuristic algorithms are implemented and used to schedule autonomous tasks in homogeneous and heterogeneous environments with the aim of comparing their performance in terms of cost, degree of imbalance, makespan and throughput. First Come First Serve (FCFS), Minimum Completion Time (MCT), Minimum Execution Time (MET), Max-min, Min-min and Sufferage are the heuristic algorithms considered for the performance comparison and analysis of task scheduling in cloud computing.
Performance comparison of heuristic algorithms for task scheduling in IaaS cloud computing environment

PubMed Central

Madni, Syed Hamid Hussain; Abd Latiff, Muhammad Shafie; Abdullahi, Mohammed; Usman, Mohammed Joda

2017-01-01

Cloud computing infrastructure is suitable for meeting computational needs of large task sizes. Optimal scheduling of tasks in cloud computing environment has been proved to be an NP-complete problem, hence the need for the application of heuristic methods. Several heuristic algorithms have been developed and used in addressing this problem, but choosing the appropriate algorithm for solving task assignment problem of a particular nature is difficult since the methods are developed under different assumptions. Therefore, six rule based heuristic algorithms are implemented and used to schedule autonomous tasks in homogeneous and heterogeneous environments with the aim of comparing their performance in terms of cost, degree of imbalance, makespan and throughput. First Come First Serve (FCFS), Minimum Completion Time (MCT), Minimum Execution Time (MET), Max-min, Min-min and Sufferage are the heuristic algorithms considered for the performance comparison and analysis of task scheduling in cloud computing. PMID:28467505
Bulk refrigeration of fruits and vegetables. Part 2: Computer algorithm for heat loads and moisture loss

DOE Office of Scientific and Technical Information (OSTI.GOV)

Becker, B.; Misra, A.; Fricke, B.A.

1997-12-31

A computer algorithm was developed that estimates the latent and sensible heat loads due to the bulk refrigeration of fruits and vegetables. The algorithm also predicts the commodity moisture loss and temperature distribution which occurs during refrigeration. Part 1 focused upon the thermophysical properties of commodities and the flowfield parameters which govern the heat and mass transfer from fresh fruits and vegetables. This paper, Part 2, discusses the modeling methodology utilized in the current computer algorithm and describes the development of the heat and mass transfer models. Part 2 also compares the results of the computer algorithm to experimental datamore » taken from the literature and describes a parametric study which was performed with the algorithm. In addition, this paper also reviews existing numerical models for determining the heat and mass transfer in bulk loads of fruits and vegetables.« less

Computer methods for sampling from the gamma distribution

DOE Office of Scientific and Technical Information (OSTI.GOV)

Johnson, M.E.; Tadikamalla, P.R.

1978-01-01

Considerable attention has recently been directed at developing ever faster algorithms for generating gamma random variates on digital computers. This paper surveys the current state of the art including the leading algorithms of Ahrens and Dieter, Atkinson, Cheng, Fishman, Marsaglia, Tadikamalla, and Wallace. General random variate generation techniques are explained with reference to these gamma algorithms. Computer simulation experiments on IBM and CDC computers are reported.
Algorithm implementation on the Navier-Stokes computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Krist, S.E.; Zang, T.A.

1987-03-01

The Navier-Stokes Computer is a multi-purpose parallel-processing supercomputer which is currently under development at Princeton University. It consists of multiple local memory parallel processors, called Nodes, which are interconnected in a hypercube network. Details of the procedures involved in implementing an algorithm on the Navier-Stokes computer are presented. The particular finite difference algorithm considered in this analysis was developed for simulation of laminar-turbulent transition in wall bounded shear flows. Projected timing results for implementing this algorithm indicate that operation rates in excess of 42 GFLOPS are feasible on a 128 Node machine.
Algorithm implementation on the Navier-Stokes computer

NASA Technical Reports Server (NTRS)

Krist, Steven E.; Zang, Thomas A.

1987-01-01

The Navier-Stokes Computer is a multi-purpose parallel-processing supercomputer which is currently under development at Princeton University. It consists of multiple local memory parallel processors, called Nodes, which are interconnected in a hypercube network. Details of the procedures involved in implementing an algorithm on the Navier-Stokes computer are presented. The particular finite difference algorithm considered in this analysis was developed for simulation of laminar-turbulent transition in wall bounded shear flows. Projected timing results for implementing this algorithm indicate that operation rates in excess of 42 GFLOPS are feasible on a 128 Node machine.
Design of automata theory of cubical complexes with applications to diagnosis and algorithmic description

NASA Technical Reports Server (NTRS)

Roth, J. P.

1972-01-01

The following problems are considered: (1) methods for development of logic design together with algorithms, so that it is possible to compute a test for any failure in the logic design, if such a test exists, and developing algorithms and heuristics for the purpose of minimizing the computation for tests; and (2) a method of design of logic for ultra LSI (large scale integration). It was discovered that the so-called quantum calculus can be extended to render it possible: (1) to describe the functional behavior of a mechanism component by component, and (2) to compute tests for failures, in the mechanism, using the diagnosis algorithm. The development of an algorithm for the multioutput two-level minimization problem is presented and the program MIN 360 was written for this algorithm. The program has options of mode (exact minimum or various approximations), cost function, cost bound, etc., providing flexibility.
Automated Development of Accurate Algorithms and Efficient Codes for Computational Aeroacoustics

NASA Technical Reports Server (NTRS)

Goodrich, John W.; Dyson, Rodger W.

1999-01-01

The simulation of sound generation and propagation in three space dimensions with realistic aircraft components is a very large time dependent computation with fine details. Simulations in open domains with embedded objects require accurate and robust algorithms for propagation, for artificial inflow and outflow boundaries, and for the definition of geometrically complex objects. The development, implementation, and validation of methods for solving these demanding problems is being done to support the NASA pillar goals for reducing aircraft noise levels. Our goal is to provide algorithms which are sufficiently accurate and efficient to produce usable results rapidly enough to allow design engineers to study the effects on sound levels of design changes in propulsion systems, and in the integration of propulsion systems with airframes. There is a lack of design tools for these purposes at this time. Our technical approach to this problem combines the development of new, algorithms with the use of Mathematica and Unix utilities to automate the algorithm development, code implementation, and validation. We use explicit methods to ensure effective implementation by domain decomposition for SPMD parallel computing. There are several orders of magnitude difference in the computational efficiencies of the algorithms which we have considered. We currently have new artificial inflow and outflow boundary conditions that are stable, accurate, and unobtrusive, with implementations that match the accuracy and efficiency of the propagation methods. The artificial numerical boundary treatments have been proven to have solutions which converge to the full open domain problems, so that the error from the boundary treatments can be driven as low as is required. The purpose of this paper is to briefly present a method for developing highly accurate algorithms for computational aeroacoustics, the use of computer automation in this process, and a brief survey of the algorithms that have resulted from this work. A review of computational aeroacoustics has recently been given by Lele.
Parallel computer vision

DOE Office of Scientific and Technical Information (OSTI.GOV)

Uhr, L.

1987-01-01

This book is written by research scientists involved in the development of massively parallel, but hierarchically structured, algorithms, architectures, and programs for image processing, pattern recognition, and computer vision. The book gives an integrated picture of the programs and algorithms that are being developed, and also of the multi-computer hardware architectures for which these systems are designed.
Intelligent fuzzy approach for fast fractal image compression

NASA Astrophysics Data System (ADS)

Nodehi, Ali; Sulong, Ghazali; Al-Rodhaan, Mznah; Al-Dhelaan, Abdullah; Rehman, Amjad; Saba, Tanzila

2014-12-01

Fractal image compression (FIC) is recognized as a NP-hard problem, and it suffers from a high number of mean square error (MSE) computations. In this paper, a two-phase algorithm was proposed to reduce the MSE computation of FIC. In the first phase, based on edge property, range and domains are arranged. In the second one, imperialist competitive algorithm (ICA) is used according to the classified blocks. For maintaining the quality of the retrieved image and accelerating algorithm operation, we divided the solutions into two groups: developed countries and undeveloped countries. Simulations were carried out to evaluate the performance of the developed approach. Promising results thus achieved exhibit performance better than genetic algorithm (GA)-based and Full-search algorithms in terms of decreasing the number of MSE computations. The number of MSE computations was reduced by the proposed algorithm for 463 times faster compared to the Full-search algorithm, although the retrieved image quality did not have a considerable change.
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

NASA Technical Reports Server (NTRS)

Sun, Xian-He

1997-01-01

Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2OO, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is 1) developing highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, 3) incorporate newly developed algorithms into actual simulation packages. The work plan has well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) Adopting a mathematical geometry which has a better capacity to describe the fluid, (2) Using compact scheme to gain high order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.
Computations involving differential operators and their actions on functions

NASA Technical Reports Server (NTRS)

Crouch, Peter E.; Grossman, Robert; Larson, Richard

1991-01-01

The algorithms derived by Grossmann and Larson (1989) are further developed for rewriting expressions involving differential operators. The differential operators involved arise in the local analysis of nonlinear dynamical systems. These algorithms are extended in two different directions: the algorithms are generalized so that they apply to differential operators on groups and the data structures and algorithms are developed to compute symbolically the action of differential operators on functions. Both of these generalizations are needed for applications.
Constraint treatment techniques and parallel algorithms for multibody dynamic analysis. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Chiou, Jin-Chern

1990-01-01

Computational procedures for kinematic and dynamic analysis of three-dimensional multibody dynamic (MBD) systems are developed from the differential-algebraic equations (DAE's) viewpoint. Constraint violations during the time integration process are minimized and penalty constraint stabilization techniques and partitioning schemes are developed. The governing equations of motion, a two-stage staggered explicit-implicit numerical algorithm, are treated which takes advantage of a partitioned solution procedure. A robust and parallelizable integration algorithm is developed. This algorithm uses a two-stage staggered central difference algorithm to integrate the translational coordinates and the angular velocities. The angular orientations of bodies in MBD systems are then obtained by using an implicit algorithm via the kinematic relationship between Euler parameters and angular velocities. It is shown that the combination of the present solution procedures yields a computationally more accurate solution. To speed up the computational procedures, parallel implementation of the present constraint treatment techniques, the two-stage staggered explicit-implicit numerical algorithm was efficiently carried out. The DAE's and the constraint treatment techniques were transformed into arrowhead matrices to which Schur complement form was derived. By fully exploiting the sparse matrix structural analysis techniques, a parallel preconditioned conjugate gradient numerical algorithm is used to solve the systems equations written in Schur complement form. A software testbed was designed and implemented in both sequential and parallel computers. This testbed was used to demonstrate the robustness and efficiency of the constraint treatment techniques, the accuracy of the two-stage staggered explicit-implicit numerical algorithm, and the speed up of the Schur-complement-based parallel preconditioned conjugate gradient algorithm on a parallel computer.
Algorithms Bridging Quantum Computation and Chemistry

NASA Astrophysics Data System (ADS)

McClean, Jarrod Ryan

The design of new materials and chemicals derived entirely from computation has long been a goal of computational chemistry, and the governing equation whose solution would permit this dream is known. Unfortunately, the exact solution to this equation has been far too expensive and clever approximations fail in critical situations. Quantum computers offer a novel solution to this problem. In this work, we develop not only new algorithms to use quantum computers to study hard problems in chemistry, but also explore how such algorithms can help us to better understand and improve our traditional approaches. In particular, we first introduce a new method, the variational quantum eigensolver, which is designed to maximally utilize the quantum resources available in a device to solve chemical problems. We apply this method in a real quantum photonic device in the lab to study the dissociation of the helium hydride (HeH+) molecule. We also enhance this methodology with architecture specific optimizations on ion trap computers and show how linear-scaling techniques from traditional quantum chemistry can be used to improve the outlook of similar algorithms on quantum computers. We then show how studying quantum algorithms such as these can be used to understand and enhance the development of classical algorithms. In particular we use a tool from adiabatic quantum computation, Feynman's Clock, to develop a new discrete time variational principle and further establish a connection between real-time quantum dynamics and ground state eigenvalue problems. We use these tools to develop two novel parallel-in-time quantum algorithms that outperform competitive algorithms as well as offer new insights into the connection between the fermion sign problem of ground states and the dynamical sign problem of quantum dynamics. Finally we use insights gained in the study of quantum circuits to explore a general notion of sparsity in many-body quantum systems. In particular we use developments from the field of compressed sensing to find compact representations of ground states. As an application we study electronic systems and find solutions dramatically more compact than traditional configuration interaction expansions, offering hope to extend this methodology to challenging systems in chemical and material design.
An efficient parallel termination detection algorithm

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, A. H.; Crivelli, S.; Jessup, E. R.

2004-05-27

Information local to any one processor is insufficient to monitor the overall progress of most distributed computations. Typically, a second distributed computation for detecting termination of the main computation is necessary. In order to be a useful computational tool, the termination detection routine must operate concurrently with the main computation, adding minimal overhead, and it must promptly and correctly detect termination when it occurs. In this paper, we present a new algorithm for detecting the termination of a parallel computation on distributed-memory MIMD computers that satisfies all of those criteria. A variety of termination detection algorithms have been devised. Ofmore » these, the algorithm presented by Sinha, Kale, and Ramkumar (henceforth, the SKR algorithm) is unique in its ability to adapt to the load conditions of the system on which it runs, thereby minimizing the impact of termination detection on performance. Because their algorithm also detects termination quickly, we consider it to be the most efficient practical algorithm presently available. The termination detection algorithm presented here was developed for use in the PMESC programming library for distributed-memory MIMD computers. Like the SKR algorithm, our algorithm adapts to system loads and imposes little overhead. Also like the SKR algorithm, ours is tree-based, and it does not depend on any assumptions about the physical interconnection topology of the processors or the specifics of the distributed computation. In addition, our algorithm is easier to implement and requires only half as many tree traverses as does the SKR algorithm. This paper is organized as follows. In section 2, we define our computational model. In section 3, we review the SKR algorithm. We introduce our new algorithm in section 4, and prove its correctness in section 5. We discuss its efficiency and present experimental results in section 6.« less
A new fast algorithm for computing a complex number: Theoretic transforms

NASA Technical Reports Server (NTRS)

Reed, I. S.; Liu, K. Y.; Truong, T. K.

1977-01-01

A high-radix fast Fourier transformation (FFT) algorithm for computing transforms over GF(sq q), where q is a Mersenne prime, is developed to implement fast circular convolutions. This new algorithm requires substantially fewer multiplications than the conventional FFT.
Image restoration for three-dimensional fluorescence microscopy using an orthonormal basis for efficient representation of depth-variant point-spread functions

PubMed Central

Patwary, Nurmohammed; Preza, Chrysanthe

2015-01-01

A depth-variant (DV) image restoration algorithm for wide field fluorescence microscopy, using an orthonormal basis decomposition of DV point-spread functions (PSFs), is investigated in this study. The efficient PSF representation is based on a previously developed principal component analysis (PCA), which is computationally intensive. We present an approach developed to reduce the number of DV PSFs required for the PCA computation, thereby making the PCA-based approach computationally tractable for thick samples. Restoration results from both synthetic and experimental images show consistency and that the proposed algorithm addresses efficiently depth-induced aberration using a small number of principal components. Comparison of the PCA-based algorithm with a previously-developed strata-based DV restoration algorithm demonstrates that the proposed method improves performance by 50% in terms of accuracy and simultaneously reduces the processing time by 64% using comparable computational resources. PMID:26504634
A rapid algorithm for realistic human reaching and its use in a virtual reality system

NASA Technical Reports Server (NTRS)

Aldridge, Ann; Pandya, Abhilash; Goldsby, Michael; Maida, James

1994-01-01

The Graphics Analysis Facility (GRAF) at JSC has developed a rapid algorithm for computing realistic human reaching. The algorithm was applied to GRAF's anthropometrically correct human model and used in a 3D computer graphics system and a virtual reality system. The nature of the algorithm and its uses are discussed.
Rapid execution of fan beam image reconstruction algorithms using efficient computational techniques and special-purpose processors

NASA Astrophysics Data System (ADS)

Gilbert, B. K.; Robb, R. A.; Chu, A.; Kenue, S. K.; Lent, A. H.; Swartzlander, E. E., Jr.

1981-02-01

Rapid advances during the past ten years of several forms of computer-assisted tomography (CT) have resulted in the development of numerous algorithms to convert raw projection data into cross-sectional images. These reconstruction algorithms are either 'iterative,' in which a large matrix algebraic equation is solved by successive approximation techniques; or 'closed form'. Continuing evolution of the closed form algorithms has allowed the newest versions to produce excellent reconstructed images in most applications. This paper will review several computer software and special-purpose digital hardware implementations of closed form algorithms, either proposed during the past several years by a number of workers or actually implemented in commercial or research CT scanners. The discussion will also cover a number of recently investigated algorithmic modifications which reduce the amount of computation required to execute the reconstruction process, as well as several new special-purpose digital hardware implementations under development in laboratories at the Mayo Clinic.
Job-shop scheduling applied to computer vision

NASA Astrophysics Data System (ADS)

Sebastian y Zuniga, Jose M.; Torres-Medina, Fernando; Aracil, Rafael; Reinoso, Oscar; Jimenez, Luis M.; Garcia, David

1997-09-01

This paper presents a method for minimizing the total elapsed time spent by n tasks running on m differents processors working in parallel. The developed algorithm not only minimizes the total elapsed time but also reduces the idle time and waiting time of in-process tasks. This condition is very important in some applications of computer vision in which the time to finish the total process is particularly critical -- quality control in industrial inspection, real- time computer vision, guided robots. The scheduling algorithm is based on the use of two matrices, obtained from the precedence relationships between tasks, and the data obtained from the two matrices. The developed scheduling algorithm has been tested in one application of quality control using computer vision. The results obtained have been satisfactory in the application of different image processing algorithms.
Design requirements and development of an airborne descent path definition algorithm for time navigation

NASA Technical Reports Server (NTRS)

Izumi, K. H.; Thompson, J. L.; Groce, J. L.; Schwab, R. W.

1986-01-01

The design requirements for a 4D path definition algorithm are described. These requirements were developed for the NASA ATOPS as an extension of the Local Flow Management/Profile Descent algorithm. They specify the processing flow, functional and data architectures, and system input requirements, and recommended the addition of a broad path revision (reinitialization) function capability. The document also summarizes algorithm design enhancements and the implementation status of the algorithm on an in-house PDP-11/70 computer. Finally, the requirements for the pilot-computer interfaces, the lateral path processor, and guidance and steering function are described.
Natural Inspired Intelligent Visual Computing and Its Application to Viticulture.

PubMed

Ang, Li Minn; Seng, Kah Phooi; Ge, Feng Lu

2017-05-23

This paper presents an investigation of natural inspired intelligent computing and its corresponding application towards visual information processing systems for viticulture. The paper has three contributions: (1) a review of visual information processing applications for viticulture; (2) the development of natural inspired computing algorithms based on artificial immune system (AIS) techniques for grape berry detection; and (3) the application of the developed algorithms towards real-world grape berry images captured in natural conditions from vineyards in Australia. The AIS algorithms in (2) were developed based on a nature-inspired clonal selection algorithm (CSA) which is able to detect the arcs in the berry images with precision, based on a fitness model. The arcs detected are then extended to perform the multiple arcs and ring detectors information processing for the berry detection application. The performance of the developed algorithms were compared with traditional image processing algorithms like the circular Hough transform (CHT) and other well-known circle detection methods. The proposed AIS approach gave a Fscore of 0.71 compared with Fscores of 0.28 and 0.30 for the CHT and a parameter-free circle detection technique (RPCD) respectively.
Bio-inspired algorithms applied to molecular docking simulations.

PubMed

Heberlé, G; de Azevedo, W F

2011-01-01

Nature as a source of inspiration has been shown to have a great beneficial impact on the development of new computational methodologies. In this scenario, analyses of the interactions between a protein target and a ligand can be simulated by biologically inspired algorithms (BIAs). These algorithms mimic biological systems to create new paradigms for computation, such as neural networks, evolutionary computing, and swarm intelligence. This review provides a description of the main concepts behind BIAs applied to molecular docking simulations. Special attention is devoted to evolutionary algorithms, guided-directed evolutionary algorithms, and Lamarckian genetic algorithms. Recent applications of these methodologies to protein targets identified in the Mycobacterium tuberculosis genome are described.

Dynamic Load-Balancing for Distributed Heterogeneous Computing of Parallel CFD Problems

NASA Technical Reports Server (NTRS)

Ecer, A.; Chien, Y. P.; Boenisch, T.; Akay, H. U.

2000-01-01

The developed methodology is aimed at improving the efficiency of executing block-structured algorithms on parallel, distributed, heterogeneous computers. The basic approach of these algorithms is to divide the flow domain into many sub- domains called blocks, and solve the governing equations over these blocks. Dynamic load balancing problem is defined as the efficient distribution of the blocks among the available processors over a period of several hours of computations. In environments with computers of different architecture, operating systems, CPU speed, memory size, load, and network speed, balancing the loads and managing the communication between processors becomes crucial. Load balancing software tools for mutually dependent parallel processes have been created to efficiently utilize an advanced computation environment and algorithms. These tools are dynamic in nature because of the chances in the computer environment during execution time. More recently, these tools were extended to a second operating system: NT. In this paper, the problems associated with this application will be discussed. Also, the developed algorithms were combined with the load sharing capability of LSF to efficiently utilize workstation clusters for parallel computing. Finally, results will be presented on running a NASA based code ADPAC to demonstrate the developed tools for dynamic load balancing.
A cross-disciplinary introduction to quantum annealing-based algorithms

NASA Astrophysics Data System (ADS)

Venegas-Andraca, Salvador E.; Cruz-Santos, William; McGeoch, Catherine; Lanzagorta, Marco

2018-04-01

A central goal in quantum computing is the development of quantum hardware and quantum algorithms in order to analyse challenging scientific and engineering problems. Research in quantum computation involves contributions from both physics and computer science; hence this article presents a concise introduction to basic concepts from both fields that are used in annealing-based quantum computation, an alternative to the more familiar quantum gate model. We introduce some concepts from computer science required to define difficult computational problems and to realise the potential relevance of quantum algorithms to find novel solutions to those problems. We introduce the structure of quantum annealing-based algorithms as well as two examples of this kind of algorithms for solving instances of the max-SAT and Minimum Multicut problems. An overview of the quantum annealing systems manufactured by D-Wave Systems is also presented.
Computational Algorithmization: Limitations in Problem Solving Skills in Computational Sciences Majors at University of Oriente

ERIC Educational Resources Information Center

Castillo, Antonio S.; Berenguer, Isabel A.; Sánchez, Alexander G.; Álvarez, Tomás R. R.

2017-01-01

This paper analyzes the results of a diagnostic study carried out with second year students of the computational sciences majors at University of Oriente, Cuba, to determine the limitations that they present in computational algorithmization. An exploratory research was developed using quantitative and qualitative methods. The results allowed…
Computer algorithm for coding gain

NASA Technical Reports Server (NTRS)

Dodd, E. E.

1974-01-01

Development of a computer algorithm for coding gain for use in an automated communications link design system. Using an empirical formula which defines coding gain as used in space communications engineering, an algorithm is constructed on the basis of available performance data for nonsystematic convolutional encoding with soft-decision (eight-level) Viterbi decoding.
Infrared Algorithm Development for Ocean Observations with EOS/MODIS

NASA Technical Reports Server (NTRS)

Brown, Otis B.

1997-01-01

Efforts continue under this contract to develop algorithms for the computation of sea surface temperature (SST) from MODIS infrared measurements. This effort includes radiative transfer modeling, comparison of in situ and satellite observations, development and evaluation of processing and networking methodologies for algorithm computation and data accession, evaluation of surface validation approaches for IR radiances, development of experimental instrumentation, and participation in MODIS (project) related activities. Activities in this contract period have focused on radiative transfer modeling, evaluation of atmospheric correction methodologies, undertake field campaigns, analysis of field data, and participation in MODIS meetings.
Developing Subdomain Allocation Algorithms Based on Spatial and Communicational Constraints to Accelerate Dust Storm Simulation

PubMed Central

Gui, Zhipeng; Yu, Manzhu; Yang, Chaowei; Jiang, Yunfeng; Chen, Songqing; Xia, Jizhe; Huang, Qunying; Liu, Kai; Li, Zhenlong; Hassan, Mohammed Anowarul; Jin, Baoxuan

2016-01-01

Dust storm has serious disastrous impacts on environment, human health, and assets. The developments and applications of dust storm models have contributed significantly to better understand and predict the distribution, intensity and structure of dust storms. However, dust storm simulation is a data and computing intensive process. To improve the computing performance, high performance computing has been widely adopted by dividing the entire study area into multiple subdomains and allocating each subdomain on different computing nodes in a parallel fashion. Inappropriate allocation may introduce imbalanced task loads and unnecessary communications among computing nodes. Therefore, allocation is a key factor that may impact the efficiency of parallel process. An allocation algorithm is expected to consider the computing cost and communication cost for each computing node to minimize total execution time and reduce overall communication cost for the entire simulation. This research introduces three algorithms to optimize the allocation by considering the spatial and communicational constraints: 1) an Integer Linear Programming (ILP) based algorithm from combinational optimization perspective; 2) a K-Means and Kernighan-Lin combined heuristic algorithm (K&K) integrating geometric and coordinate-free methods by merging local and global partitioning; 3) an automatic seeded region growing based geometric and local partitioning algorithm (ASRG). The performance and effectiveness of the three algorithms are compared based on different factors. Further, we adopt the K&K algorithm as the demonstrated algorithm for the experiment of dust model simulation with the non-hydrostatic mesoscale model (NMM-dust) and compared the performance with the MPI default sequential allocation. The results demonstrate that K&K method significantly improves the simulation performance with better subdomain allocation. This method can also be adopted for other relevant atmospheric and numerical modeling. PMID:27044039
Parallel Directionally Split Solver Based on Reformulation of Pipelined Thomas Algorithm

NASA Technical Reports Server (NTRS)

Povitsky, A.

1998-01-01

In this research an efficient parallel algorithm for 3-D directionally split problems is developed. The proposed algorithm is based on a reformulated version of the pipelined Thomas algorithm that starts the backward step computations immediately after the completion of the forward step computations for the first portion of lines This algorithm has data available for other computational tasks while processors are idle from the Thomas algorithm. The proposed 3-D directionally split solver is based on the static scheduling of processors where local and non-local, data-dependent and data-independent computations are scheduled while processors are idle. A theoretical model of parallelization efficiency is used to define optimal parameters of the algorithm, to show an asymptotic parallelization penalty and to obtain an optimal cover of a global domain with subdomains. It is shown by computational experiments and by the theoretical model that the proposed algorithm reduces the parallelization penalty about two times over the basic algorithm for the range of the number of processors (subdomains) considered and the number of grid nodes per subdomain.
Parallel Algorithms for the Exascale Era

DOE Office of Scientific and Technical Information (OSTI.GOV)

Robey, Robert W.

New parallel algorithms are needed to reach the Exascale level of parallelism with millions of cores. We look at some of the research developed by students in projects at LANL. The research blends ideas from the early days of computing while weaving in the fresh approach brought by students new to the field of high performance computing. We look at reproducibility of global sums and why it is important to parallel computing. Next we look at how the concept of hashing has led to the development of more scalable algorithms suitable for next-generation parallel computers. Nearly all of this workmore » has been done by undergraduates and published in leading scientific journals.« less
Massively parallel algorithms for real-time wavefront control of a dense adaptive optics system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fijany, A.; Milman, M.; Redding, D.

1994-12-31

In this paper massively parallel algorithms and architectures for real-time wavefront control of a dense adaptive optic system (SELENE) are presented. The authors have already shown that the computation of a near optimal control algorithm for SELENE can be reduced to the solution of a discrete Poisson equation on a regular domain. Although, this represents an optimal computation, due the large size of the system and the high sampling rate requirement, the implementation of this control algorithm poses a computationally challenging problem since it demands a sustained computational throughput of the order of 10 GFlops. They develop a novel algorithm,more » designated as Fast Invariant Imbedding algorithm, which offers a massive degree of parallelism with simple communication and synchronization requirements. Due to these features, this algorithm is significantly more efficient than other Fast Poisson Solvers for implementation on massively parallel architectures. The authors also discuss two massively parallel, algorithmically specialized, architectures for low-cost and optimal implementation of the Fast Invariant Imbedding algorithm.« less
Sensor placement algorithm development to maximize the efficiency of acid gas removal unit for integrated gasifiction combined sycle (IGCC) power plant with CO2 capture

DOE Office of Scientific and Technical Information (OSTI.GOV)

Paul, P.; Bhattacharyya, D.; Turton, R.

2012-01-01

Future integrated gasification combined cycle (IGCC) power plants with CO{sub 2} capture will face stricter operational and environmental constraints. Accurate values of relevant states/outputs/disturbances are needed to satisfy these constraints and to maximize the operational efficiency. Unfortunately, a number of these process variables cannot be measured while a number of them can be measured, but have low precision, reliability, or signal-to-noise ratio. In this work, a sensor placement (SP) algorithm is developed for optimal selection of sensor location, number, and type that can maximize the plant efficiency and result in a desired precision of the relevant measured/unmeasured states. In thismore » work, an SP algorithm is developed for an selective, dual-stage Selexol-based acid gas removal (AGR) unit for an IGCC plant with pre-combustion CO{sub 2} capture. A comprehensive nonlinear dynamic model of the AGR unit is developed in Aspen Plus Dynamics® (APD) and used to generate a linear state-space model that is used in the SP algorithm. The SP algorithm is developed with the assumption that an optimal Kalman filter will be implemented in the plant for state and disturbance estimation. The algorithm is developed assuming steady-state Kalman filtering and steady-state operation of the plant. The control system is considered to operate based on the estimated states and thereby, captures the effects of the SP algorithm on the overall plant efficiency. The optimization problem is solved by Genetic Algorithm (GA) considering both linear and nonlinear equality and inequality constraints. Due to the very large number of candidate sets available for sensor placement and because of the long time that it takes to solve the constrained optimization problem that includes more than 1000 states, solution of this problem is computationally expensive. For reducing the computation time, parallel computing is performed using the Distributed Computing Server (DCS®) and the Parallel Computing® toolbox from Mathworks®. In this presentation, we will share our experience in setting up parallel computing using GA in the MATLAB® environment and present the overall approach for achieving higher computational efficiency in this framework.« less
Sensor placement algorithm development to maximize the efficiency of acid gas removal unit for integrated gasification combined cycle (IGCC) power plant with CO{sub 2} capture

DOE Office of Scientific and Technical Information (OSTI.GOV)

Paul, P.; Bhattacharyya, D.; Turton, R.

2012-01-01

Future integrated gasification combined cycle (IGCC) power plants with CO{sub 2} capture will face stricter operational and environmental constraints. Accurate values of relevant states/outputs/disturbances are needed to satisfy these constraints and to maximize the operational efficiency. Unfortunately, a number of these process variables cannot be measured while a number of them can be measured, but have low precision, reliability, or signal-to-noise ratio. In this work, a sensor placement (SP) algorithm is developed for optimal selection of sensor location, number, and type that can maximize the plant efficiency and result in a desired precision of the relevant measured/unmeasured states. In thismore » work, an SP algorithm is developed for an selective, dual-stage Selexol-based acid gas removal (AGR) unit for an IGCC plant with pre-combustion CO{sub 2} capture. A comprehensive nonlinear dynamic model of the AGR unit is developed in Aspen Plus Dynamics® (APD) and used to generate a linear state-space model that is used in the SP algorithm. The SP algorithm is developed with the assumption that an optimal Kalman filter will be implemented in the plant for state and disturbance estimation. The algorithm is developed assuming steady-state Kalman filtering and steady-state operation of the plant. The control system is considered to operate based on the estimated states and thereby, captures the effects of the SP algorithm on the overall plant efficiency. The optimization problem is solved by Genetic Algorithm (GA) considering both linear and nonlinear equality and inequality constraints. Due to the very large number of candidate sets available for sensor placement and because of the long time that it takes to solve the constrained optimization problem that includes more than 1000 states, solution of this problem is computationally expensive. For reducing the computation time, parallel computing is performed using the Distributed Computing Server (DCS®) and the Parallel Computing® toolbox from Mathworks®. In this presentation, we will share our experience in setting up parallel computing using GA in the MATLAB® environment and present the overall approach for achieving higher computational efficiency in this framework.« less
Volumetric visualization algorithm development for an FPGA-based custom computing machine

NASA Astrophysics Data System (ADS)

Sallinen, Sami J.; Alakuijala, Jyrki; Helminen, Hannu; Laitinen, Joakim

1998-05-01

Rendering volumetric medical images is a burdensome computational task for contemporary computers due to the large size of the data sets. Custom designed reconfigurable hardware could considerably speed up volume visualization if an algorithm suitable for the platform is used. We present an algorithm and speedup techniques for visualizing volumetric medical CT and MR images with a custom-computing machine based on a Field Programmable Gate Array (FPGA). We also present simulated performance results of the proposed algorithm calculated with a software implementation running on a desktop PC. Our algorithm is capable of generating perspective projection renderings of single and multiple isosurfaces with transparency, simulated X-ray images, and Maximum Intensity Projections (MIP). Although more speedup techniques exist for parallel projection than for perspective projection, we have constrained ourselves to perspective viewing, because of its importance in the field of radiotherapy. The algorithm we have developed is based on ray casting, and the rendering is sped up by three different methods: shading speedup by gradient precalculation, a new generalized version of Ray-Acceleration by Distance Coding (RADC), and background ray elimination by speculative ray selection.
Supercomputing resources empowering superstack with interactive and integrated systems

NASA Astrophysics Data System (ADS)

Rückemann, Claus-Peter

2012-09-01

This paper presents the results from the development and implementation of Superstack algorithms to be dynamically used with integrated systems and supercomputing resources. Processing of geophysical data, thus named geoprocessing, is an essential part of the analysis of geoscientific data. The theory of Superstack algorithms and the practical application on modern computing architectures was inspired by developments introduced with processing of seismic data on mainframes and within the last years leading to high end scientific computing applications. There are several stacking algorithms known but with low signal to noise ratio in seismic data the use of iterative algorithms like the Superstack can support analysis and interpretation. The new Superstack algorithms are in use with wave theory and optical phenomena on highly performant computing resources for huge data sets as well as for sophisticated application scenarios in geosciences and archaeology.
Toward a computational psycholinguistics of reference production.

PubMed

van Deemter, Kees; Gatt, Albert; van Gompel, Roger P G; Krahmer, Emiel

2012-04-01

This article introduces the topic ''Production of Referring Expressions: Bridging the Gap between Computational and Empirical Approaches to Reference'' of the journal Topics in Cognitive Science. We argue that computational and psycholinguistic approaches to reference production can benefit from closer interaction, and that this is likely to result in the construction of algorithms that differ markedly from the ones currently known in the computational literature. We focus particularly on determinism, the feature of existing algorithms that is perhaps most clearly at odds with psycholinguistic results, discussing how future algorithms might include non-determinism, and how new psycholinguistic experiments could inform the development of such algorithms. Copyright © 2012 Cognitive Science Society, Inc.
Reversible Data Hiding Based on DNA Computing

PubMed Central

Xie, Yingjie

2017-01-01

Biocomputing, especially DNA, computing has got great development. It is widely used in information security. In this paper, a novel algorithm of reversible data hiding based on DNA computing is proposed. Inspired by the algorithm of histogram modification, which is a classical algorithm for reversible data hiding, we combine it with DNA computing to realize this algorithm based on biological technology. Compared with previous results, our experimental results have significantly improved the ER (Embedding Rate). Furthermore, some PSNR (peak signal-to-noise ratios) of test images are also improved. Experimental results show that it is suitable for protecting the copyright of cover image in DNA-based information security. PMID:28280504
Development of generalized pressure velocity coupling scheme for the analysis of compressible and incompressible combusting flows

NASA Technical Reports Server (NTRS)

Chen, C. P.; Wu, S. T.

1992-01-01

The objective of this investigation has been to develop an algorithm (or algorithms) for the improvement of the accuracy and efficiency of the computer fluid dynamics (CFD) models to study the fundamental physics of combustion chamber flows, which are necessary ultimately for the design of propulsion systems such as SSME and STME. During this three year study (May 19, 1978 - May 18, 1992), a unique algorithm was developed for all speed flows. This newly developed algorithm basically consists of two pressure-based algorithms (i.e. PISOC and MFICE). This PISOC is a non-iterative scheme and the FICE is an iterative scheme where PISOC has the characteristic advantages on low and high speed flows and the modified FICE has shown its efficiency and accuracy to compute the flows in the transonic region. A new algorithm is born from a combination of these two algorithms. This newly developed algorithm has general application in both time-accurate and steady state flows, and also was tested extensively for various flow conditions, such as turbulent flows, chemically reacting flows, and multiphase flows.
A New Augmentation Based Algorithm for Extracting Maximal Chordal Subgraphs.

PubMed

Bhowmick, Sanjukta; Chen, Tzu-Yi; Halappanavar, Mahantesh

2015-02-01

A graph is chordal if every cycle of length greater than three contains an edge between non-adjacent vertices. Chordal graphs are of interest both theoretically, since they admit polynomial time solutions to a range of NP-hard graph problems, and practically, since they arise in many applications including sparse linear algebra, computer vision, and computational biology. A maximal chordal subgraph is a chordal subgraph that is not a proper subgraph of any other chordal subgraph. Existing algorithms for computing maximal chordal subgraphs depend on dynamically ordering the vertices, which is an inherently sequential process and therefore limits the algorithms' parallelizability. In this paper we explore techniques to develop a scalable parallel algorithm for extracting a maximal chordal subgraph. We demonstrate that an earlier attempt at developing a parallel algorithm may induce a non-optimal vertex ordering and is therefore not guaranteed to terminate with a maximal chordal subgraph. We then give a new algorithm that first computes and then repeatedly augments a spanning chordal subgraph. After proving that the algorithm terminates with a maximal chordal subgraph, we then demonstrate that this algorithm is more amenable to parallelization and that the parallel version also terminates with a maximal chordal subgraph. That said, the complexity of the new algorithm is higher than that of the previous parallel algorithm, although the earlier algorithm computes a chordal subgraph which is not guaranteed to be maximal. We experimented with our augmentation-based algorithm on both synthetic and real-world graphs. We provide scalability results and also explore the effect of different choices for the initial spanning chordal subgraph on both the running time and on the number of edges in the maximal chordal subgraph.
High performance transcription factor-DNA docking with GPU computing

PubMed Central

2012-01-01

Background Protein-DNA docking is a very challenging problem in structural bioinformatics and has important implications in a number of applications, such as structure-based prediction of transcription factor binding sites and rational drug design. Protein-DNA docking is very computational demanding due to the high cost of energy calculation and the statistical nature of conformational sampling algorithms. More importantly, experiments show that the docking quality depends on the coverage of the conformational sampling space. It is therefore desirable to accelerate the computation of the docking algorithm, not only to reduce computing time, but also to improve docking quality. Methods In an attempt to accelerate the sampling process and to improve the docking performance, we developed a graphics processing unit (GPU)-based protein-DNA docking algorithm. The algorithm employs a potential-based energy function to describe the binding affinity of a protein-DNA pair, and integrates Monte-Carlo simulation and a simulated annealing method to search through the conformational space. Algorithmic techniques were developed to improve the computation efficiency and scalability on GPU-based high performance computing systems. Results The effectiveness of our approach is tested on a non-redundant set of 75 TF-DNA complexes and a newly developed TF-DNA docking benchmark. We demonstrated that the GPU-based docking algorithm can significantly accelerate the simulation process and thereby improving the chance of finding near-native TF-DNA complex structures. This study also suggests that further improvement in protein-DNA docking research would require efforts from two integral aspects: improvement in computation efficiency and energy function design. Conclusions We present a high performance computing approach for improving the prediction accuracy of protein-DNA docking. The GPU-based docking algorithm accelerates the search of the conformational space and thus increases the chance of finding more near-native structures. To the best of our knowledge, this is the first ad hoc effort of applying GPU or GPU clusters to the protein-DNA docking problem. PMID:22759575
A fast D.F.T. algorithm using complex integer transforms

NASA Technical Reports Server (NTRS)

Reed, I. S.; Truong, T. K.

1978-01-01

Winograd (1976) has developed a new class of algorithms which depend heavily on the computation of a cyclic convolution for computing the conventional DFT (discrete Fourier transform); this new algorithm, for a few hundred transform points, requires substantially fewer multiplications than the conventional FFT algorithm. Reed and Truong have defined a special class of finite Fourier-like transforms over GF(q squared), where q = 2 to the p power minus 1 is a Mersenne prime for p = 2, 3, 5, 7, 13, 17, 19, 31, 61. In the present paper it is shown that Winograd's algorithm can be combined with the aforementioned Fourier-like transform to yield a new algorithm for computing the DFT. A fast method for accurately computing the DFT of a sequence of complex numbers of very long transform-lengths is thus obtained.
Parallel, stochastic measurement of molecular surface area.

PubMed

Juba, Derek; Varshney, Amitabh

2008-08-01

Biochemists often wish to compute surface areas of proteins. A variety of algorithms have been developed for this task, but they are designed for traditional single-processor architectures. The current trend in computer hardware is towards increasingly parallel architectures for which these algorithms are not well suited. We describe a parallel, stochastic algorithm for molecular surface area computation that maps well to the emerging multi-core architectures. Our algorithm is also progressive, providing a rough estimate of surface area immediately and refining this estimate as time goes on. Furthermore, the algorithm generates points on the molecular surface which can be used for point-based rendering. We demonstrate a GPU implementation of our algorithm and show that it compares favorably with several existing molecular surface computation programs, giving fast estimates of the molecular surface area with good accuracy.

Correlation signatures of wet soils and snows. [algorithm development and computer programming

NASA Technical Reports Server (NTRS)

Phillips, M. R.

1972-01-01

Interpretation, analysis, and development of algorithms have provided the necessary computational programming tools for soil data processing, data handling and analysis. Algorithms that have been developed thus far, are adequate and have been proven successful for several preliminary and fundamental applications such as software interfacing capabilities, probability distributions, grey level print plotting, contour plotting, isometric data displays, joint probability distributions, boundary mapping, channel registration and ground scene classification. A description of an Earth Resources Flight Data Processor, (ERFDP), which handles and processes earth resources data under a users control is provided.
Novel hybrid GPU-CPU implementation of parallelized Monte Carlo parametric expectation maximization estimation method for population pharmacokinetic data analysis.

PubMed

Ng, C M

2013-10-01

The development of a population PK/PD model, an essential component for model-based drug development, is both time- and labor-intensive. A graphical-processing unit (GPU) computing technology has been proposed and used to accelerate many scientific computations. The objective of this study was to develop a hybrid GPU-CPU implementation of parallelized Monte Carlo parametric expectation maximization (MCPEM) estimation algorithm for population PK data analysis. A hybrid GPU-CPU implementation of the MCPEM algorithm (MCPEMGPU) and identical algorithm that is designed for the single CPU (MCPEMCPU) were developed using MATLAB in a single computer equipped with dual Xeon 6-Core E5690 CPU and a NVIDIA Tesla C2070 GPU parallel computing card that contained 448 stream processors. Two different PK models with rich/sparse sampling design schemes were used to simulate population data in assessing the performance of MCPEMCPU and MCPEMGPU. Results were analyzed by comparing the parameter estimation and model computation times. Speedup factor was used to assess the relative benefit of parallelized MCPEMGPU over MCPEMCPU in shortening model computation time. The MCPEMGPU consistently achieved shorter computation time than the MCPEMCPU and can offer more than 48-fold speedup using a single GPU card. The novel hybrid GPU-CPU implementation of parallelized MCPEM algorithm developed in this study holds a great promise in serving as the core for the next-generation of modeling software for population PK/PD analysis.
Quantum rendering

NASA Astrophysics Data System (ADS)

Lanzagorta, Marco O.; Gomez, Richard B.; Uhlmann, Jeffrey K.

2003-08-01

In recent years, computer graphics has emerged as a critical component of the scientific and engineering process, and it is recognized as an important computer science research area. Computer graphics are extensively used for a variety of aerospace and defense training systems and by Hollywood's special effects companies. All these applications require the computer graphics systems to produce high quality renderings of extremely large data sets in short periods of time. Much research has been done in "classical computing" toward the development of efficient methods and techniques to reduce the rendering time required for large datasets. Quantum Computing's unique algorithmic features offer the possibility of speeding up some of the known rendering algorithms currently used in computer graphics. In this paper we discuss possible implementations of quantum rendering algorithms. In particular, we concentrate on the implementation of Grover's quantum search algorithm for Z-buffering, ray-tracing, radiosity, and scene management techniques. We also compare the theoretical performance between the classical and quantum versions of the algorithms.
Iterative algorithms for large sparse linear systems on parallel computers

NASA Technical Reports Server (NTRS)

Adams, L. M.

1982-01-01

Algorithms for assembling in parallel the sparse system of linear equations that result from finite difference or finite element discretizations of elliptic partial differential equations, such as those that arise in structural engineering are developed. Parallel linear stationary iterative algorithms and parallel preconditioned conjugate gradient algorithms are developed for solving these systems. In addition, a model for comparing parallel algorithms on array architectures is developed and results of this model for the algorithms are given.
Hardware Acceleration of Adaptive Neural Algorithms.

DOE Office of Scientific and Technical Information (OSTI.GOV)

James, Conrad D.

As tradit ional numerical computing has faced challenges, researchers have turned towards alternative computing approaches to reduce power - per - computation metrics and improve algorithm performance. Here, we describe an approach towards non - conventional computing that strengthens the connection between machine learning and neuroscience concepts. The Hardware Acceleration of Adaptive Neural Algorithms (HAANA) project ha s develop ed neural machine learning algorithms and hardware for applications in image processing and cybersecurity. While machine learning methods are effective at extracting relevant features from many types of data, the effectiveness of these algorithms degrades when subjected to real - worldmore » conditions. Our team has generated novel neural - inspired approa ches to improve the resiliency and adaptability of machine learning algorithms. In addition, we have also designed and fabricated hardware architectures and microelectronic devices specifically tuned towards the training and inference operations of neural - inspired algorithms. Finally, our multi - scale simulation framework allows us to assess the impact of microelectronic device properties on algorithm performance.« less
A Discussion of Using a Reconfigurable Processor to Implement the Discrete Fourier Transform

NASA Technical Reports Server (NTRS)

White, Michael J.

2004-01-01

This paper presents the design and implementation of the Discrete Fourier Transform (DFT) algorithm on a reconfigurable processor system. While highly applicable to many engineering problems, the DFT is an extremely computationally intensive algorithm. Consequently, the eventual goal of this work is to enhance the execution of a floating-point precision DFT algorithm by off loading the algorithm from the computing system. This computing system, within the context of this research, is a typical high performance desktop computer with an may of field programmable gate arrays (FPGAs). FPGAs are hardware devices that are configured by software to execute an algorithm. If it is desired to change the algorithm, the software is changed to reflect the modification, then download to the FPGA, which is then itself modified. This paper will discuss methodology for developing the DFT algorithm to be implemented on the FPGA. We will discuss the algorithm, the FPGA code effort, and the results to date.
The Burn Medical Assistant: Developing Machine Learning Algorithms to Aid in the Estimation of Burn Wound Size

DTIC Science & Technology

2017-10-01

hypothesis that a computer machine learning algorithm can analyze and classify burn injures using multispectral imaging within 5% of an expert clinician...morbidity. In response to these challenges, the USAISR developed and obtained FDA 510(k) clearance of the Burn Navigator™, a computer decision support... computer decision support software (CDSS), can significantly change the CDSS algorithm’s recommendations and thus the total fluid administered to a
Conjugate-Gradient Algorithms For Dynamics Of Manipulators

NASA Technical Reports Server (NTRS)

Fijany, Amir; Scheid, Robert E.

1993-01-01

Algorithms for serial and parallel computation of forward dynamics of multiple-link robotic manipulators by conjugate-gradient method developed. Parallel algorithms have potential for speedup of computations on multiple linked, specialized processors implemented in very-large-scale integrated circuits. Such processors used to stimulate dynamics, possibly faster than in real time, for purposes of planning and control.
A new augmentation based algorithm for extracting maximal chordal subgraphs

DOE PAGES

Bhowmick, Sanjukta; Chen, Tzu-Yi; Halappanavar, Mahantesh

2014-10-18

If every cycle of a graph is chordal length greater than three then it contains an edge between non-adjacent vertices. Chordal graphs are of interest both theoretically, since they admit polynomial time solutions to a range of NP-hard graph problems, and practically, since they arise in many applications including sparse linear algebra, computer vision, and computational biology. A maximal chordal subgraph is a chordal subgraph that is not a proper subgraph of any other chordal subgraph. Existing algorithms for computing maximal chordal subgraphs depend on dynamically ordering the vertices, which is an inherently sequential process and therefore limits the algorithms’more » parallelizability. In our paper we explore techniques to develop a scalable parallel algorithm for extracting a maximal chordal subgraph. We demonstrate that an earlier attempt at developing a parallel algorithm may induce a non-optimal vertex ordering and is therefore not guaranteed to terminate with a maximal chordal subgraph. We then give a new algorithm that first computes and then repeatedly augments a spanning chordal subgraph. After proving that the algorithm terminates with a maximal chordal subgraph, we then demonstrate that this algorithm is more amenable to parallelization and that the parallel version also terminates with a maximal chordal subgraph. That said, the complexity of the new algorithm is higher than that of the previous parallel algorithm, although the earlier algorithm computes a chordal subgraph which is not guaranteed to be maximal. Finally, we experimented with our augmentation-based algorithm on both synthetic and real-world graphs. We provide scalability results and also explore the effect of different choices for the initial spanning chordal subgraph on both the running time and on the number of edges in the maximal chordal subgraph.« less
Demonstration of essentiality of entanglement in a Deutsch-like quantum algorithm

NASA Astrophysics Data System (ADS)

Huang, He-Liang; Goswami, Ashutosh K.; Bao, Wan-Su; Panigrahi, Prasanta K.

2018-06-01

Quantum algorithms can be used to efficiently solve certain classically intractable problems by exploiting quantum parallelism. However, the effectiveness of quantum entanglement in quantum computing remains a question of debate. This study presents a new quantum algorithm that shows entanglement could provide advantages over both classical algorithms and quantum algo- rithms without entanglement. Experiments are implemented to demonstrate the proposed algorithm using superconducting qubits. Results show the viability of the algorithm and suggest that entanglement is essential in obtaining quantum speedup for certain problems in quantum computing. The study provides reliable and clear guidance for developing useful quantum algorithms.
Flowfield computation of entry vehicles

NASA Technical Reports Server (NTRS)

Prabhu, Dinesh K.

1990-01-01

The equations governing the multidimensional flow of a reacting mixture of thermally perfect gasses were derived. The modeling procedures for the various terms of the conservation laws are discussed. A numerical algorithm, based on the finite-volume approach, to solve these conservation equations was developed. The advantages and disadvantages of the present numerical scheme are discussed from the point of view of accuracy, computer time, and memory requirements. A simple one-dimensional model problem was solved to prove the feasibility and accuracy of the algorithm. A computer code implementing the above algorithm was developed and is presently being applied to simple geometries and conditions. Once the code is completely debugged and validated, it will be used to compute the complete unsteady flow field around the Aeroassist Flight Experiment (AFE) body.
Redundancy management for efficient fault recovery in NASA's distributed computing system

NASA Technical Reports Server (NTRS)

Malek, Miroslaw; Pandya, Mihir; Yau, Kitty

1991-01-01

The management of redundancy in computer systems was studied and guidelines were provided for the development of NASA's fault-tolerant distributed systems. Fault recovery and reconfiguration mechanisms were examined. A theoretical foundation was laid for redundancy management by efficient reconfiguration methods and algorithmic diversity. Algorithms were developed to optimize the resources for embedding of computational graphs of tasks in the system architecture and reconfiguration of these tasks after a failure has occurred. The computational structure represented by a path and the complete binary tree was considered and the mesh and hypercube architectures were targeted for their embeddings. The innovative concept of Hybrid Algorithm Technique was introduced. This new technique provides a mechanism for obtaining fault tolerance while exhibiting improved performance.
Semiannual Report, April 1, 1989 through September 30, 1989 (Institute for Computer Applications in Science and Engineering)

DTIC Science & Technology

1990-02-01

noise. Tobias B. Orloff Work began on developing a high quality rendering algorithm based on the radiosity method. The algorithm is similar to...previous progressive radiosity algorithms except for the following improvements: 1. At each iteration vertex radiosities are computed using a modified scan...line approach, thus eliminating the quadratic cost associated with a ray tracing computation of vortex radiosities . 2. At each iteration the scene is
VLSI architectures for computing multiplications and inverses in GF(2m)

NASA Technical Reports Server (NTRS)

Wang, C. C.; Truong, T. K.; Shao, H. M.; Deutsch, L. J.; Omura, J. K.

1985-01-01

Finite field arithmetic logic is central in the implementation of Reed-Solomon coders and in some cryptographic algorithms. There is a need for good multiplication and inversion algorithms that are easily realized on VLSI chips. Massey and Omura recently developed a new multiplication algorithm for Galois fields based on a normal basis representation. A pipeline structure is developed to realize the Massey-Omura multiplier in the finite field GF(2m). With the simple squaring property of the normal-basis representation used together with this multiplier, a pipeline architecture is also developed for computing inverse elements in GF(2m). The designs developed for the Massey-Omura multiplier and the computation of inverse elements are regular, simple, expandable and, therefore, naturally suitable for VLSI implementation.
VLSI architectures for computing multiplications and inverses in GF(2-m)

NASA Technical Reports Server (NTRS)

Wang, C. C.; Truong, T. K.; Shao, H. M.; Deutsch, L. J.; Omura, J. K.; Reed, I. S.

1983-01-01

Finite field arithmetic logic is central in the implementation of Reed-Solomon coders and in some cryptographic algorithms. There is a need for good multiplication and inversion algorithms that are easily realized on VLSI chips. Massey and Omura recently developed a new multiplication algorithm for Galois fields based on a normal basis representation. A pipeline structure is developed to realize the Massey-Omura multiplier in the finite field GF(2m). With the simple squaring property of the normal-basis representation used together with this multiplier, a pipeline architecture is also developed for computing inverse elements in GF(2m). The designs developed for the Massey-Omura multiplier and the computation of inverse elements are regular, simple, expandable and, therefore, naturally suitable for VLSI implementation.
VLSI architectures for computing multiplications and inverses in GF(2m).

PubMed

Wang, C C; Truong, T K; Shao, H M; Deutsch, L J; Omura, J K; Reed, I S

1985-08-01

Finite field arithmetic logic is central in the implementation of Reed-Solomon coders and in some cryptographic algorithms. There is a need for good multiplication and inversion algorithms that can be easily realized on VLSI chips. Massey and Omura recently developed a new multiplication algorithm for Galois fields based on a normal basis representation. In this paper, a pipeline structure is developed to realize the Massey-Omura multiplier in the finite field GF(2m). With the simple squaring property of the normal basis representation used together with this multiplier, a pipeline architecture is developed for computing inverse elements in GF(2m). The designs developed for the Massey-Omura multiplier and the computation of inverse elements are regular, simple, expandable, and therefore, naturally suitable for VLSI implementation.
A novel clinical decision support algorithm for constructing complete medication histories.

PubMed

Long, Ju; Yuan, Michael Juntao

2017-07-01

A patient's complete medication history is a crucial element for physicians to develop a full understanding of the patient's medical conditions and treatment options. However, due to the fragmented nature of medical data, this process can be very time-consuming and often impossible for physicians to construct a complete medication history for complex patients. In this paper, we describe an accurate, computationally efficient and scalable algorithm to construct a medication history timeline. The algorithm is developed and validated based on 1 million random prescription records from a large national prescription data aggregator. Our evaluation shows that the algorithm can be scaled horizontally on-demand, making it suitable for future delivery in a cloud-computing environment. We also propose that this cloud-based medication history computation algorithm could be integrated into Electronic Medical Records, enabling informed clinical decision-making at the point of care. Copyright © 2017 Elsevier B.V. All rights reserved.
Design of a Performance-Responsive Drill and Practice Algorithm for Computer-Based Training.

ERIC Educational Resources Information Center

Vazquez-Abad, Jesus; LaFleur, Marc

1990-01-01

Reviews criticisms of the use of drill and practice programs in educational computing and describes potentials for its use in instruction. Topics discussed include guidelines for developing computer-based drill and practice; scripted training courseware; item format design; item bank design; and a performance-responsive algorithm for item…
Adaptive Control Strategies for Flexible Robotic Arm

NASA Technical Reports Server (NTRS)

Bialasiewicz, Jan T.

1996-01-01

The control problem of a flexible robotic arm has been investigated. The control strategies that have been developed have a wide application in approaching the general control problem of flexible space structures. The following control strategies have been developed and evaluated: neural self-tuning control algorithm, neural-network-based fuzzy logic control algorithm, and adaptive pole assignment algorithm. All of the above algorithms have been tested through computer simulation. In addition, the hardware implementation of a computer control system that controls the tip position of a flexible arm clamped on a rigid hub mounted directly on the vertical shaft of a dc motor, has been developed. An adaptive pole assignment algorithm has been applied to suppress vibrations of the described physical model of flexible robotic arm and has been successfully tested using this testbed.
Dynamic displacement measurement of large-scale structures based on the Lucas-Kanade template tracking algorithm

NASA Astrophysics Data System (ADS)

Guo, Jie; Zhu, Chang`an

2016-01-01

The development of optics and computer technologies enables the application of the vision-based technique that uses digital cameras to the displacement measurement of large-scale structures. Compared with traditional contact measurements, vision-based technique allows for remote measurement, has a non-intrusive characteristic, and does not necessitate mass introduction. In this study, a high-speed camera system is developed to complete the displacement measurement in real time. The system consists of a high-speed camera and a notebook computer. The high-speed camera can capture images at a speed of hundreds of frames per second. To process the captured images in computer, the Lucas-Kanade template tracking algorithm in the field of computer vision is introduced. Additionally, a modified inverse compositional algorithm is proposed to reduce the computing time of the original algorithm and improve the efficiency further. The modified algorithm can rapidly accomplish one displacement extraction within 1 ms without having to install any pre-designed target panel onto the structures in advance. The accuracy and the efficiency of the system in the remote measurement of dynamic displacement are demonstrated in the experiments on motion platform and sound barrier on suspension viaduct. Experimental results show that the proposed algorithm can extract accurate displacement signal and accomplish the vibration measurement of large-scale structures.

Approximate Algorithms for Computing Spatial Distance Histograms with Accuracy Guarantees

PubMed Central

Grupcev, Vladimir; Yuan, Yongke; Tu, Yi-Cheng; Huang, Jin; Chen, Shaoping; Pandit, Sagar; Weng, Michael

2014-01-01

Particle simulation has become an important research tool in many scientific and engineering fields. Data generated by such simulations impose great challenges to database storage and query processing. One of the queries against particle simulation data, the spatial distance histogram (SDH) query, is the building block of many high-level analytics, and requires quadratic time to compute using a straightforward algorithm. Previous work has developed efficient algorithms that compute exact SDHs. While beating the naive solution, such algorithms are still not practical in processing SDH queries against large-scale simulation data. In this paper, we take a different path to tackle this problem by focusing on approximate algorithms with provable error bounds. We first present a solution derived from the aforementioned exact SDH algorithm, and this solution has running time that is unrelated to the system size N. We also develop a mathematical model to analyze the mechanism that leads to errors in the basic approximate algorithm. Our model provides insights on how the algorithm can be improved to achieve higher accuracy and efficiency. Such insights give rise to a new approximate algorithm with improved time/accuracy tradeoff. Experimental results confirm our analysis. PMID:24693210
Fast and stable algorithms for computing the principal square root of a complex matrix

NASA Technical Reports Server (NTRS)

Shieh, Leang S.; Lian, Sui R.; Mcinnis, Bayliss C.

1987-01-01

This note presents recursive algorithms that are rapidly convergent and more stable for finding the principal square root of a complex matrix. Also, the developed algorithms are utilized to derive the fast and stable matrix sign algorithms which are useful in developing applications to control system problems.
Computational nuclear quantum many-body problem: The UNEDF project

NASA Astrophysics Data System (ADS)

Bogner, S.; Bulgac, A.; Carlson, J.; Engel, J.; Fann, G.; Furnstahl, R. J.; Gandolfi, S.; Hagen, G.; Horoi, M.; Johnson, C.; Kortelainen, M.; Lusk, E.; Maris, P.; Nam, H.; Navratil, P.; Nazarewicz, W.; Ng, E.; Nobre, G. P. A.; Ormand, E.; Papenbrock, T.; Pei, J.; Pieper, S. C.; Quaglioni, S.; Roche, K. J.; Sarich, J.; Schunck, N.; Sosonkina, M.; Terasaki, J.; Thompson, I.; Vary, J. P.; Wild, S. M.

2013-10-01

The UNEDF project was a large-scale collaborative effort that applied high-performance computing to the nuclear quantum many-body problem. The primary focus of the project was on constructing, validating, and applying an optimized nuclear energy density functional, which entailed a wide range of pioneering developments in microscopic nuclear structure and reactions, algorithms, high-performance computing, and uncertainty quantification. UNEDF demonstrated that close associations among nuclear physicists, mathematicians, and computer scientists can lead to novel physics outcomes built on algorithmic innovations and computational developments. This review showcases a wide range of UNEDF science results to illustrate this interplay.
Computational mechanics analysis tools for parallel-vector supercomputers

NASA Technical Reports Server (NTRS)

Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.; Qin, J.

1993-01-01

Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigen-solution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization algorithm and domain decomposition. The source code for many of these algorithms is available from NASA Langley.
High-performance computing on GPUs for resistivity logging of oil and gas wells

NASA Astrophysics Data System (ADS)

Glinskikh, V.; Dudaev, A.; Nechaev, O.; Surodina, I.

2017-10-01

We developed and implemented into software an algorithm for high-performance simulation of electrical logs from oil and gas wells using high-performance heterogeneous computing. The numerical solution of the 2D forward problem is based on the finite-element method and the Cholesky decomposition for solving a system of linear algebraic equations (SLAE). Software implementations of the algorithm used the NVIDIA CUDA technology and computing libraries are made, allowing us to perform decomposition of SLAE and find its solution on central processor unit (CPU) and graphics processor unit (GPU). The calculation time is analyzed depending on the matrix size and number of its non-zero elements. We estimated the computing speed on CPU and GPU, including high-performance heterogeneous CPU-GPU computing. Using the developed algorithm, we simulated resistivity data in realistic models.
Computing border bases using mutant strategies

NASA Astrophysics Data System (ADS)

Ullah, E.; Abbas Khan, S.

2014-01-01

Border bases, a generalization of Gröbner bases, have actively been addressed during recent years due to their applicability to industrial problems. In cryptography and coding theory a useful application of border based is to solve zero-dimensional systems of polynomial equations over finite fields, which motivates us for developing optimizations of the algorithms that compute border bases. In 2006, Kehrein and Kreuzer formulated the Border Basis Algorithm (BBA), an algorithm which allows the computation of border bases that relate to a degree compatible term ordering. In 2007, J. Ding et al. introduced mutant strategies bases on finding special lower degree polynomials in the ideal. The mutant strategies aim to distinguish special lower degree polynomials (mutants) from the other polynomials and give them priority in the process of generating new polynomials in the ideal. In this paper we develop hybrid algorithms that use the ideas of J. Ding et al. involving the concept of mutants to optimize the Border Basis Algorithm for solving systems of polynomial equations over finite fields. In particular, we recall a version of the Border Basis Algorithm which is actually called the Improved Border Basis Algorithm and propose two hybrid algorithms, called MBBA and IMBBA. The new mutants variants provide us space efficiency as well as time efficiency. The efficiency of these newly developed hybrid algorithms is discussed using standard cryptographic examples.
Multi-threaded Sparse Matrix Sparse Matrix Multiplication for Many-Core and GPU Architectures.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deveci, Mehmet; Trott, Christian Robert; Rajamanickam, Sivasankaran

Sparse Matrix-Matrix multiplication is a key kernel that has applications in several domains such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix- matrix multiplication with a focus on performance portability across different high performance computing architectures. The performance of these algorithms depend on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and datamore » structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.« less
Multi-threaded Sparse Matrix-Matrix Multiplication for Many-Core and GPU Architectures.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deveci, Mehmet; Rajamanickam, Sivasankaran; Trott, Christian Robert

Sparse Matrix-Matrix multiplication is a key kernel that has applications in several domains such as scienti c computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix-matrix multiplication with a focus on performance portability across different high performance computing architectures. The performance of these algorithms depend on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and datamore » structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.« less
The application of dynamic programming in production planning

NASA Astrophysics Data System (ADS)

Wu, Run

2017-05-01

Nowadays, with the popularity of the computers, various industries and fields are widely applying computer information technology, which brings about huge demand for a variety of application software. In order to develop software meeting various needs with most economical cost and best quality, programmers must design efficient algorithms. A superior algorithm can not only soul up one thing, but also maximize the benefits and generate the smallest overhead. As one of the common algorithms, dynamic programming algorithms are used to solving problems with some sort of optimal properties. When solving problems with a large amount of sub-problems that needs repetitive calculations, the ordinary sub-recursive method requires to consume exponential time, and dynamic programming algorithm can reduce the time complexity of the algorithm to the polynomial level, according to which we can conclude that dynamic programming algorithm is a very efficient compared to other algorithms reducing the computational complexity and enriching the computational results. In this paper, we expound the concept, basic elements, properties, core, solving steps and difficulties of the dynamic programming algorithm besides, establish the dynamic programming model of the production planning problem.
Development of seismic tomography software for hybrid supercomputers

NASA Astrophysics Data System (ADS)

Nikitin, Alexandr; Serdyukov, Alexandr; Duchkov, Anton

2015-04-01

Seismic tomography is a technique used for computing velocity model of geologic structure from first arrival travel times of seismic waves. The technique is used in processing of regional and global seismic data, in seismic exploration for prospecting and exploration of mineral and hydrocarbon deposits, and in seismic engineering for monitoring the condition of engineering structures and the surrounding host medium. As a consequence of development of seismic monitoring systems and increasing volume of seismic data, there is a growing need for new, more effective computational algorithms for use in seismic tomography applications with improved performance, accuracy and resolution. To achieve this goal, it is necessary to use modern high performance computing systems, such as supercomputers with hybrid architecture that use not only CPUs, but also accelerators and co-processors for computation. The goal of this research is the development of parallel seismic tomography algorithms and software package for such systems, to be used in processing of large volumes of seismic data (hundreds of gigabytes and more). These algorithms and software package will be optimized for the most common computing devices used in modern hybrid supercomputers, such as Intel Xeon CPUs, NVIDIA Tesla accelerators and Intel Xeon Phi co-processors. In this work, the following general scheme of seismic tomography is utilized. Using the eikonal equation solver, arrival times of seismic waves are computed based on assumed velocity model of geologic structure being analyzed. In order to solve the linearized inverse problem, tomographic matrix is computed that connects model adjustments with travel time residuals, and the resulting system of linear equations is regularized and solved to adjust the model. The effectiveness of parallel implementations of existing algorithms on target architectures is considered. During the first stage of this work, algorithms were developed for execution on supercomputers using multicore CPUs only, with preliminary performance tests showing good parallel efficiency on large numerical grids. Porting of the algorithms to hybrid supercomputers is currently ongoing.
Computational Fluid Dynamics. [numerical methods and algorithm development

NASA Technical Reports Server (NTRS)

1992-01-01

This collection of papers was presented at the Computational Fluid Dynamics (CFD) Conference held at Ames Research Center in California on March 12 through 14, 1991. It is an overview of CFD activities at NASA Lewis Research Center. The main thrust of computational work at Lewis is aimed at propulsion systems. Specific issues related to propulsion CFD and associated modeling will also be presented. Examples of results obtained with the most recent algorithm development will also be presented.
A GENERAL ALGORITHM FOR THE CONSTRUCTION OF CONTOUR PLOTS

NASA Technical Reports Server (NTRS)

Johnson, W.

1994-01-01

The graphical presentation of experimentally or theoretically generated data sets frequently involves the construction of contour plots. A general computer algorithm has been developed for the construction of contour plots. The algorithm provides for efficient and accurate contouring with a modular approach which allows flexibility in modifying the algorithm for special applications. The algorithm accepts as input data values at a set of points irregularly distributed over a plane. The algorithm is based on an interpolation scheme in which the points in the plane are connected by straight line segments to form a set of triangles. In general, the data is smoothed using a least-squares-error fit of the data to a bivariate polynomial. To construct the contours, interpolation along the edges of the triangles is performed, using the bivariable polynomial if data smoothing was performed. Once the contour points have been located, the contour may be drawn. This program is written in FORTRAN IV for batch execution and has been implemented on an IBM 360 series computer with a central memory requirement of approximately 100K of 8-bit bytes. This computer algorithm was developed in 1981.
New Parallel Algorithms for Landscape Evolution Model

NASA Astrophysics Data System (ADS)

Jin, Y.; Zhang, H.; Shi, Y.

2017-12-01

Most landscape evolution models (LEM) developed in the last two decades solve the diffusion equation to simulate the transportation of surface sediments. This numerical approach is difficult to parallelize due to the computation of drainage area for each node, which needs huge amount of communication if run in parallel. In order to overcome this difficulty, we developed two parallel algorithms for LEM with a stream net. One algorithm handles the partition of grid with traditional methods and applies an efficient global reduction algorithm to do the computation of drainage areas and transport rates for the stream net; the other algorithm is based on a new partition algorithm, which partitions the nodes in catchments between processes first, and then partitions the cells according to the partition of nodes. Both methods focus on decreasing communication between processes and take the advantage of massive computing techniques, and numerical experiments show that they are both adequate to handle large scale problems with millions of cells. We implemented the two algorithms in our program based on the widely used finite element library deal.II, so that it can be easily coupled with ASPECT.
A lateral guidance algorithm to reduce the post-aerobraking burn requirements for a lift-modulated orbital transfer vehicle. M.S. Thesis

NASA Technical Reports Server (NTRS)

Herman, G. C.

1986-01-01

A lateral guidance algorithm which controls the location of the line of intersection between the actual and desired orbital planes (the hinge line) is developed for the aerobraking phase of a lift-modulated orbital transfer vehicle. The on-board targeting algorithm associated with this lateral guidance algorithm is simple and concise which is very desirable since computation time and space are limited on an on-board flight computer. A variational equation which describes the movement of the hinge line is derived. Simple relationships between the plane error, the desired hinge line position, the position out-of-plane error, and the velocity out-of-plane error are found. A computer simulation is developed to test the lateral guidance algorithm for a variety of operating conditions. The algorithm does reduce the total burn magnitude needed to achieve the desired orbit by allowing the plane correction and perigee-raising burn to be combined in a single maneuver. The algorithm performs well under vacuum perigee dispersions, pot-hole density disturbance, and thick atmospheres. The results for many different operating conditions are presented.
Renewable energy in electric utility capacity planning: a decomposition approach with application to a Mexican utility

DOE Office of Scientific and Technical Information (OSTI.GOV)

Staschus, K.

1985-01-01

In this dissertation, efficient algorithms for electric-utility capacity expansion planning with renewable energy are developed. The algorithms include a deterministic phase that quickly finds a near-optimal expansion plan using derating and a linearized approximation to the time-dependent availability of nondispatchable energy sources. A probabilistic second phase needs comparatively few computer-time consuming probabilistic simulation iterations to modify this solution towards the optimal expansion plan. For the deterministic first phase, two algorithms, based on a Lagrangian Dual decomposition and a Generalized Benders Decomposition, are developed. The probabilistic second phase uses a Generalized Benders Decomposition approach. Extensive computational tests of the algorithms aremore » reported. Among the deterministic algorithms, the one based on Lagrangian Duality proves fastest. The two-phase approach is shown to save up to 80% in computing time as compared to a purely probabilistic algorithm. The algorithms are applied to determine the optimal expansion plan for the Tijuana-Mexicali subsystem of the Mexican electric utility system. A strong recommendation to push conservation programs in the desert city of Mexicali results from this implementation.« less
Hybrid massively parallel fast sweeping method for static Hamilton-Jacobi equations

NASA Astrophysics Data System (ADS)

Detrixhe, Miles; Gibou, Frédéric

2016-10-01

The fast sweeping method is a popular algorithm for solving a variety of static Hamilton-Jacobi equations. Fast sweeping algorithms for parallel computing have been developed, but are severely limited. In this work, we present a multilevel, hybrid parallel algorithm that combines the desirable traits of two distinct parallel methods. The fine and coarse grained components of the algorithm take advantage of heterogeneous computer architecture common in high performance computing facilities. We present the algorithm and demonstrate its effectiveness on a set of example problems including optimal control, dynamic games, and seismic wave propagation. We give results for convergence, parallel scaling, and show state-of-the-art speedup values for the fast sweeping method.
Strategic Control Algorithm Development : Volume 3. Strategic Algorithm Report.

DOT National Transportation Integrated Search

1974-08-01

The strategic algorithm report presents a detailed description of the functional basic strategic control arrival algorithm. This description is independent of a particular computer or language. Contained in this discussion are the geometrical and env...
A proximity algorithm accelerated by Gauss-Seidel iterations for L1/TV denoising models

NASA Astrophysics Data System (ADS)

Li, Qia; Micchelli, Charles A.; Shen, Lixin; Xu, Yuesheng

2012-09-01

Our goal in this paper is to improve the computational performance of the proximity algorithms for the L1/TV denoising model. This leads us to a new characterization of all solutions to the L1/TV model via fixed-point equations expressed in terms of the proximity operators. Based upon this observation we develop an algorithm for solving the model and establish its convergence. Furthermore, we demonstrate that the proposed algorithm can be accelerated through the use of the componentwise Gauss-Seidel iteration so that the CPU time consumed is significantly reduced. Numerical experiments using the proposed algorithm for impulsive noise removal are included, with a comparison to three recently developed algorithms. The numerical results show that while the proposed algorithm enjoys a high quality of the restored images, as the other three known algorithms do, it performs significantly better in terms of computational efficiency measured in the CPU time consumed.
LAWS simulation: Sampling strategies and wind computation algorithms

NASA Technical Reports Server (NTRS)

Emmitt, G. D. A.; Wood, S. A.; Houston, S. H.

1989-01-01

In general, work has continued on developing and evaluating algorithms designed to manage the Laser Atmospheric Wind Sounder (LAWS) lidar pulses and to compute the horizontal wind vectors from the line-of-sight (LOS) measurements. These efforts fall into three categories: Improvements to the shot management and multi-pair algorithms (SMA/MPA); observing system simulation experiments; and ground-based simulations of LAWS.
Faster than Real-Time Dynamic Simulation for Large-Size Power System with Detailed Dynamic Models using High-Performance Computing Platform

DOE Office of Scientific and Technical Information (OSTI.GOV)

Huang, Renke; Jin, Shuangshuang; Chen, Yousu

This paper presents a faster-than-real-time dynamic simulation software package that is designed for large-size power system dynamic simulation. It was developed on the GridPACKTM high-performance computing (HPC) framework. The key features of the developed software package include (1) faster-than-real-time dynamic simulation for a WECC system (17,000 buses) with different types of detailed generator, controller, and relay dynamic models, (2) a decoupled parallel dynamic simulation algorithm with optimized computation architecture to better leverage HPC resources and technologies, (3) options for HPC-based linear and iterative solvers, (4) hidden HPC details, such as data communication and distribution, to enable development centered on mathematicalmore » models and algorithms rather than on computational details for power system researchers, and (5) easy integration of new dynamic models and related algorithms into the software package.« less

A ridge tracking algorithm and error estimate for efficient computation of Lagrangian coherent structures.

PubMed

Lipinski, Doug; Mohseni, Kamran

2010-03-01

A ridge tracking algorithm for the computation and extraction of Lagrangian coherent structures (LCS) is developed. This algorithm takes advantage of the spatial coherence of LCS by tracking the ridges which form LCS to avoid unnecessary computations away from the ridges. We also make use of the temporal coherence of LCS by approximating the time dependent motion of the LCS with passive tracer particles. To justify this approximation, we provide an estimate of the difference between the motion of the LCS and that of tracer particles which begin on the LCS. In addition to the speedup in computational time, the ridge tracking algorithm uses less memory and results in smaller output files than the standard LCS algorithm. Finally, we apply our ridge tracking algorithm to two test cases, an analytically defined double gyre as well as the more complicated example of the numerical simulation of a swimming jellyfish. In our test cases, we find up to a 35 times speedup when compared with the standard LCS algorithm.
Computing Principal Eigenvectors of Large Web Graphs: Algorithms and Accelerations Related to PageRank and HITS

ERIC Educational Resources Information Center

Nagasinghe, Iranga

2010-01-01

This thesis investigates and develops a few acceleration techniques for the search engine algorithms used in PageRank and HITS computations. PageRank and HITS methods are two highly successful applications of modern Linear Algebra in computer science and engineering. They constitute the essential technologies accounted for the immense growth and…
Euler/Navier-Stokes calculations of transonic flow past fixed- and rotary-wing aircraft configurations

NASA Technical Reports Server (NTRS)

Deese, J. E.; Agarwal, R. K.

1989-01-01

Computational fluid dynamics has an increasingly important role in the design and analysis of aircraft as computer hardware becomes faster and algorithms become more efficient. Progress is being made in two directions: more complex and realistic configurations are being treated and algorithms based on higher approximations to the complete Navier-Stokes equations are being developed. The literature indicates that linear panel methods can model detailed, realistic aircraft geometries in flow regimes where this approximation is valid. As algorithms including higher approximations to the Navier-Stokes equations are developed, computer resource requirements increase rapidly. Generation of suitable grids become more difficult and the number of grid points required to resolve flow features of interest increases. Recently, the development of large vector computers has enabled researchers to attempt more complex geometries with Euler and Navier-Stokes algorithms. The results of calculations for transonic flow about a typical transport and fighter wing-body configuration using thin layer Navier-Stokes equations are described along with flow about helicopter rotor blades using both Euler/Navier-Stokes equations.
An exact computational method for performance analysis of sequential test algorithms for detecting network intrusions

NASA Astrophysics Data System (ADS)

Chen, Xinjia; Lacy, Fred; Carriere, Patrick

2015-05-01

Sequential test algorithms are playing increasingly important roles for quick detecting network intrusions such as portscanners. In view of the fact that such algorithms are usually analyzed based on intuitive approximation or asymptotic analysis, we develop an exact computational method for the performance analysis of such algorithms. Our method can be used to calculate the probability of false alarm and average detection time up to arbitrarily pre-specified accuracy.
Convex optimization problem prototyping for image reconstruction in computed tomography with the Chambolle-Pock algorithm

PubMed Central

Sidky, Emil Y.; Jørgensen, Jakob H.; Pan, Xiaochuan

2012-01-01

The primal-dual optimization algorithm developed in Chambolle and Pock (CP), 2011 is applied to various convex optimization problems of interest in computed tomography (CT) image reconstruction. This algorithm allows for rapid prototyping of optimization problems for the purpose of designing iterative image reconstruction algorithms for CT. The primal-dual algorithm is briefly summarized in the article, and its potential for prototyping is demonstrated by explicitly deriving CP algorithm instances for many optimization problems relevant to CT. An example application modeling breast CT with low-intensity X-ray illumination is presented. PMID:22538474
Algorithms for computing the geopotential using a simple density layer

NASA Technical Reports Server (NTRS)

Morrison, F.

1976-01-01

Several algorithms have been developed for computing the potential and attraction of a simple density layer. These are numerical cubature, Taylor series, and a mixed analytic and numerical integration using a singularity-matching technique. A computer program has been written to combine these techniques for computing the disturbing acceleration on an artificial earth satellite. A total of 1640 equal-area, constant surface density blocks on an oblate spheroid are used. The singularity-matching algorithm is used in the subsatellite region, Taylor series in the surrounding zone, and numerical cubature on the rest of the earth.
Data association approaches in bearings-only multi-target tracking

NASA Astrophysics Data System (ADS)

Xu, Benlian; Wang, Zhiquan

2008-03-01

According to requirements of time computation complexity and correctness of data association of the multi-target tracking, two algorithms are suggested in this paper. The proposed Algorithm 1 is developed from the modified version of dual Simplex method, and it has the advantage of direct and explicit form of the optimal solution. The Algorithm 2 is based on the idea of Algorithm 1 and rotational sort method, it combines not only advantages of Algorithm 1, but also reduces the computational burden, whose complexity is only 1/ N times that of Algorithm 1. Finally, numerical analyses are carried out to evaluate the performance of the two data association algorithms.
A Benders based rolling horizon algorithm for a dynamic facility location problem

DOE PAGES

Marufuzzaman,, Mohammad; Gedik, Ridvan; Roni, Mohammad S.

2016-06-28

This study presents a well-known capacitated dynamic facility location problem (DFLP) that satisfies the customer demand at a minimum cost by determining the time period for opening, closing, or retaining an existing facility in a given location. To solve this challenging NP-hard problem, this paper develops a unique hybrid solution algorithm that combines a rolling horizon algorithm with an accelerated Benders decomposition algorithm. Extensive computational experiments are performed on benchmark test instances to evaluate the hybrid algorithm’s efficiency and robustness in solving the DFLP problem. Computational results indicate that the hybrid Benders based rolling horizon algorithm consistently offers high qualitymore » feasible solutions in a much shorter computational time period than the standalone rolling horizon and accelerated Benders decomposition algorithms in the experimental range.« less
Group implicit concurrent algorithms in nonlinear structural dynamics

NASA Technical Reports Server (NTRS)

Ortiz, M.; Sotelino, E. D.

1989-01-01

During the 70's and 80's, considerable effort was devoted to developing efficient and reliable time stepping procedures for transient structural analysis. Mathematically, the equations governing this type of problems are generally stiff, i.e., they exhibit a wide spectrum in the linear range. The algorithms best suited to this type of applications are those which accurately integrate the low frequency content of the response without necessitating the resolution of the high frequency modes. This means that the algorithms must be unconditionally stable, which in turn rules out explicit integration. The most exciting possibility in the algorithms development area in recent years has been the advent of parallel computers with multiprocessing capabilities. So, this work is mainly concerned with the development of parallel algorithms in the area of structural dynamics. A primary objective is to devise unconditionally stable and accurate time stepping procedures which lend themselves to an efficient implementation in concurrent machines. Some features of the new computer architecture are summarized. A brief survey of current efforts in the area is presented. A new class of concurrent procedures, or Group Implicit algorithms is introduced and analyzed. The numerical simulation shows that GI algorithms hold considerable promise for application in coarse grain as well as medium grain parallel computers.
Architecutres, Models, Algorithms, and Software Tools for Configurable Computing

DTIC Science & Technology

2000-03-06

and J.G. Nash. The gated interconnection network for dynamic programming. Plenum, 1988 . [18] Ju wook Jang, Heonchul Park, and Viktor K. Prasanna. A ...Sep. 1997. [2] C. Ebeling, D. C. Cronquist , P. Franklin and C. Fisher, "RaPiD - A configurable computing architecture for compute-intensive...ABSTRACT (Maximum 200 words) The Models, Algorithms, and Architectures for Reconfigurable Computing (MAARC) project developed a sound framework for
Efficient sparse matrix-matrix multiplication for computing periodic responses by shooting method on Intel Xeon Phi

NASA Astrophysics Data System (ADS)

Stoykov, S.; Atanassov, E.; Margenov, S.

2016-10-01

Many of the scientific applications involve sparse or dense matrix operations, such as solving linear systems, matrix-matrix products, eigensolvers, etc. In what concerns structural nonlinear dynamics, the computations of periodic responses and the determination of stability of the solution are of primary interest. Shooting method iswidely used for obtaining periodic responses of nonlinear systems. The method involves simultaneously operations with sparse and dense matrices. One of the computationally expensive operations in the method is multiplication of sparse by dense matrices. In the current work, a new algorithm for sparse matrix by dense matrix products is presented. The algorithm takes into account the structure of the sparse matrix, which is obtained by space discretization of the nonlinear Mindlin's plate equation of motion by the finite element method. The algorithm is developed to use the vector engine of Intel Xeon Phi coprocessors. It is compared with the standard sparse matrix by dense matrix algorithm and the one developed by Intel MKL and it is shown that by considering the properties of the sparse matrix better algorithms can be developed.
Computational Workbench for Multibody Dynamics

NASA Technical Reports Server (NTRS)

Edmonds, Karina

2007-01-01

PyCraft is a computer program that provides an interactive, workbenchlike computing environment for developing and testing algorithms for multibody dynamics. Examples of multibody dynamic systems amenable to analysis with the help of PyCraft include land vehicles, spacecraft, robots, and molecular models. PyCraft is based on the Spatial-Operator- Algebra (SOA) formulation for multibody dynamics. The SOA operators enable construction of simple and compact representations of complex multibody dynamical equations. Within the Py-Craft computational workbench, users can, essentially, use the high-level SOA operator notation to represent the variety of dynamical quantities and algorithms and to perform computations interactively. PyCraft provides a Python-language interface to underlying C++ code. Working with SOA concepts, a user can create and manipulate Python-level operator classes in order to implement and evaluate new dynamical quantities and algorithms. During use of PyCraft, virtually all SOA-based algorithms are available for computational experiments.
Computational Fluid Dynamics: Past, Present, And Future

NASA Technical Reports Server (NTRS)

Kutler, Paul

1988-01-01

Paper reviews development of computational fluid dynamics and explores future prospects of technology. Report covers such topics as computer technology, turbulence, development of solution methodology, developemnt of algorithms, definition of flow geometries, generation of computational grids, and pre- and post-data processing.
Interactive visualization of Earth and Space Science computations

NASA Technical Reports Server (NTRS)

Hibbard, William L.; Paul, Brian E.; Santek, David A.; Dyer, Charles R.; Battaiola, Andre L.; Voidrot-Martinez, Marie-Francoise

1994-01-01

Computers have become essential tools for scientists simulating and observing nature. Simulations are formulated as mathematical models but are implemented as computer algorithms to simulate complex events. Observations are also analyzed and understood in terms of mathematical models, but the number of these observations usually dictates that we automate analyses with computer algorithms. In spite of their essential role, computers are also barriers to scientific understanding. Unlike hand calculations, automated computations are invisible and, because of the enormous numbers of individual operations in automated computations, the relation between an algorithm's input and output is often not intuitive. This problem is illustrated by the behavior of meteorologists responsible for forecasting weather. Even in this age of computers, many meteorologists manually plot weather observations on maps, then draw isolines of temperature, pressure, and other fields by hand (special pads of maps are printed for just this purpose). Similarly, radiologists use computers to collect medical data but are notoriously reluctant to apply image-processing algorithms to that data. To these scientists with life-and-death responsibilities, computer algorithms are black boxes that increase rather than reduce risk. The barrier between scientists and their computations can be bridged by techniques that make the internal workings of algorithms visible and that allow scientists to experiment with their computations. Here we describe two interactive systems developed at the University of Wisconsin-Madison Space Science and Engineering Center (SSEC) that provide these capabilities to Earth and space scientists.
Hybrid massively parallel fast sweeping method for static Hamilton–Jacobi equations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Detrixhe, Miles, E-mail: mdetrixhe@engineering.ucsb.edu; University of California Santa Barbara, Santa Barbara, CA, 93106; Gibou, Frédéric, E-mail: fgibou@engineering.ucsb.edu

The fast sweeping method is a popular algorithm for solving a variety of static Hamilton–Jacobi equations. Fast sweeping algorithms for parallel computing have been developed, but are severely limited. In this work, we present a multilevel, hybrid parallel algorithm that combines the desirable traits of two distinct parallel methods. The fine and coarse grained components of the algorithm take advantage of heterogeneous computer architecture common in high performance computing facilities. We present the algorithm and demonstrate its effectiveness on a set of example problems including optimal control, dynamic games, and seismic wave propagation. We give results for convergence, parallel scaling,more » and show state-of-the-art speedup values for the fast sweeping method.« less
Development of an algorithm for controlling a multilevel three-phase converter

NASA Astrophysics Data System (ADS)

Taissariyeva, Kyrmyzy; Ilipbaeva, Lyazzat

2017-08-01

This work is devoted to the development of an algorithm for controlling transistors in a three-phase multilevel conversion system. The developed algorithm allows to organize a correct operation and describes the state of transistors at each moment of time when constructing a computer model of a three-phase multilevel converter. The developed algorithm of operation of transistors provides in-phase of a three-phase converter and obtaining a sinusoidal voltage curve at the converter output.
A parallel implementation of an off-lattice individual-based model of multicellular populations

NASA Astrophysics Data System (ADS)

Harvey, Daniel G.; Fletcher, Alexander G.; Osborne, James M.; Pitt-Francis, Joe

2015-07-01

As computational models of multicellular populations include ever more detailed descriptions of biophysical and biochemical processes, the computational cost of simulating such models limits their ability to generate novel scientific hypotheses and testable predictions. While developments in microchip technology continue to increase the power of individual processors, parallel computing offers an immediate increase in available processing power. To make full use of parallel computing technology, it is necessary to develop specialised algorithms. To this end, we present a parallel algorithm for a class of off-lattice individual-based models of multicellular populations. The algorithm divides the spatial domain between computing processes and comprises communication routines that ensure the model is correctly simulated on multiple processors. The parallel algorithm is shown to accurately reproduce the results of a deterministic simulation performed using a pre-existing serial implementation. We test the scaling of computation time, memory use and load balancing as more processes are used to simulate a cell population of fixed size. We find approximate linear scaling of both speed-up and memory consumption on up to 32 processor cores. Dynamic load balancing is shown to provide speed-up for non-regular spatial distributions of cells in the case of a growing population.
High-Performance Computing for the Electromagnetic Modeling and Simulation of Interconnects

NASA Technical Reports Server (NTRS)

Schutt-Aine, Jose E.

1996-01-01

The electromagnetic modeling of packages and interconnects plays a very important role in the design of high-speed digital circuits, and is most efficiently performed by using computer-aided design algorithms. In recent years, packaging has become a critical area in the design of high-speed communication systems and fast computers, and the importance of the software support for their development has increased accordingly. Throughout this project, our efforts have focused on the development of modeling and simulation techniques and algorithms that permit the fast computation of the electrical parameters of interconnects and the efficient simulation of their electrical performance.
Parallel O(log n) algorithms for open- and closed-chain rigid multibody systems based on a new mass matrix factorization technique

NASA Technical Reports Server (NTRS)

Fijany, Amir

1993-01-01

In this paper, parallel O(log n) algorithms for computation of rigid multibody dynamics are developed. These parallel algorithms are derived by parallelization of new O(n) algorithms for the problem. The underlying feature of these O(n) algorithms is a drastically different strategy for decomposition of interbody force which leads to a new factorization of the mass matrix (M). Specifically, it is shown that a factorization of the inverse of the mass matrix in the form of the Schur Complement is derived as M(exp -1) = C - B(exp *)A(exp -1)B, wherein matrices C, A, and B are block tridiagonal matrices. The new O(n) algorithm is then derived as a recursive implementation of this factorization of M(exp -1). For the closed-chain systems, similar factorizations and O(n) algorithms for computation of Operational Space Mass Matrix lambda and its inverse lambda(exp -1) are also derived. It is shown that these O(n) algorithms are strictly parallel, that is, they are less efficient than other algorithms for serial computation of the problem. But, to our knowledge, they are the only known algorithms that can be parallelized and that lead to both time- and processor-optimal parallel algorithms for the problem, i.e., parallel O(log n) algorithms with O(n) processors. The developed parallel algorithms, in addition to their theoretical significance, are also practical from an implementation point of view due to their simple architectural requirements.
The science of computing - Parallel computation

NASA Technical Reports Server (NTRS)

Denning, P. J.

1985-01-01

Although parallel computation architectures have been known for computers since the 1920s, it was only in the 1970s that microelectronic components technologies advanced to the point where it became feasible to incorporate multiple processors in one machine. Concommitantly, the development of algorithms for parallel processing also lagged due to hardware limitations. The speed of computing with solid-state chips is limited by gate switching delays. The physical limit implies that a 1 Gflop operational speed is the maximum for sequential processors. A computer recently introduced features a 'hypercube' architecture with 128 processors connected in networks at 5, 6 or 7 points per grid, depending on the design choice. Its computing speed rivals that of supercomputers, but at a fraction of the cost. The added speed with less hardware is due to parallel processing, which utilizes algorithms representing different parts of an equation that can be broken into simpler statements and processed simultaneously. Present, highly developed computer languages like FORTRAN, PASCAL, COBOL, etc., rely on sequential instructions. Thus, increased emphasis will now be directed at parallel processing algorithms to exploit the new architectures.

Detecting chaos in irregularly sampled time series.

PubMed

Kulp, C W

2013-09-01

Recently, Wiebe and Virgin [Chaos 22, 013136 (2012)] developed an algorithm which detects chaos by analyzing a time series' power spectrum which is computed using the Discrete Fourier Transform (DFT). Their algorithm, like other time series characterization algorithms, requires that the time series be regularly sampled. Real-world data, however, are often irregularly sampled, thus, making the detection of chaotic behavior difficult or impossible with those methods. In this paper, a characterization algorithm is presented, which effectively detects chaos in irregularly sampled time series. The work presented here is a modification of Wiebe and Virgin's algorithm and uses the Lomb-Scargle Periodogram (LSP) to compute a series' power spectrum instead of the DFT. The DFT is not appropriate for irregularly sampled time series. However, the LSP is capable of computing the frequency content of irregularly sampled data. Furthermore, a new method of analyzing the power spectrum is developed, which can be useful for differentiating between chaotic and non-chaotic behavior. The new characterization algorithm is successfully applied to irregularly sampled data generated by a model as well as data consisting of observations of variable stars.
Measuring exertion time, duty cycle and hand activity level for industrial tasks using computer vision.

PubMed

Akkas, Oguz; Lee, Cheng Hsien; Hu, Yu Hen; Harris Adamson, Carisa; Rempel, David; Radwin, Robert G

2017-12-01

Two computer vision algorithms were developed to automatically estimate exertion time, duty cycle (DC) and hand activity level (HAL) from videos of workers performing 50 industrial tasks. The average DC difference between manual frame-by-frame analysis and the computer vision DC was -5.8% for the Decision Tree (DT) algorithm, and 1.4% for the Feature Vector Training (FVT) algorithm. The average HAL difference was 0.5 for the DT algorithm and 0.3 for the FVT algorithm. A sensitivity analysis, conducted to examine the influence that deviations in DC have on HAL, found it remained unaffected when DC error was less than 5%. Thus, a DC error less than 10% will impact HAL less than 0.5 HAL, which is negligible. Automatic computer vision HAL estimates were therefore comparable to manual frame-by-frame estimates. Practitioner Summary: Computer vision was used to automatically estimate exertion time, duty cycle and hand activity level from videos of workers performing industrial tasks.
Parallelization of Nullspace Algorithm for the computation of metabolic pathways

PubMed Central

Jevremović, Dimitrije; Trinh, Cong T.; Srienc, Friedrich; Sosa, Carlos P.; Boley, Daniel

2011-01-01

Elementary mode analysis is a useful metabolic pathway analysis tool in understanding and analyzing cellular metabolism, since elementary modes can represent metabolic pathways with unique and minimal sets of enzyme-catalyzed reactions of a metabolic network under steady state conditions. However, computation of the elementary modes of a genome- scale metabolic network with 100–1000 reactions is very expensive and sometimes not feasible with the commonly used serial Nullspace Algorithm. In this work, we develop a distributed memory parallelization of the Nullspace Algorithm to handle efficiently the computation of the elementary modes of a large metabolic network. We give an implementation in C++ language with the support of MPI library functions for the parallel communication. Our proposed algorithm is accompanied with an analysis of the complexity and identification of major bottlenecks during computation of all possible pathways of a large metabolic network. The algorithm includes methods to achieve load balancing among the compute-nodes and specific communication patterns to reduce the communication overhead and improve efficiency. PMID:22058581
Image/Time Series Mining Algorithms: Applications to Developmental Biology, Document Processing and Data Streams

ERIC Educational Resources Information Center

Tataw, Oben Moses

2013-01-01

Interdisciplinary research in computer science requires the development of computational techniques for practical application in different domains. This usually requires careful integration of different areas of technical expertise. This dissertation presents image and time series analysis algorithms, with practical interdisciplinary applications…
Ndarts

NASA Technical Reports Server (NTRS)

Jain, Abhinandan

2011-01-01

Ndarts software provides algorithms for computing quantities associated with the dynamics of articulated, rigid-link, multibody systems. It is designed as a general-purpose dynamics library that can be used for the modeling of robotic platforms, space vehicles, molecular dynamics, and other such applications. The architecture and algorithms in Ndarts are based on the Spatial Operator Algebra (SOA) theory for computational multibody and robot dynamics developed at JPL. It uses minimal, internal coordinate models. The algorithms are low-order, recursive scatter/ gather algorithms. In comparison with the earlier Darts++ software, this version has a more general and cleaner design needed to support a larger class of computational dynamics needs. It includes a frames infrastructure, allows algorithms to operate on subgraphs of the system, and implements lazy and deferred computation for better efficiency. Dynamics modeling modules such as Ndarts are core building blocks of control and simulation software for space, robotic, mechanism, bio-molecular, and material systems modeling.
Petascale self-consistent electromagnetic computations using scalable and accurate algorithms for complex structures

NASA Astrophysics Data System (ADS)

Cary, John R.; Abell, D.; Amundson, J.; Bruhwiler, D. L.; Busby, R.; Carlsson, J. A.; Dimitrov, D. A.; Kashdan, E.; Messmer, P.; Nieter, C.; Smithe, D. N.; Spentzouris, P.; Stoltz, P.; Trines, R. M.; Wang, H.; Werner, G. R.

2006-09-01

As the size and cost of particle accelerators escalate, high-performance computing plays an increasingly important role; optimization through accurate, detailed computermodeling increases performance and reduces costs. But consequently, computer simulations face enormous challenges. Early approximation methods, such as expansions in distance from the design orbit, were unable to supply detailed accurate results, such as in the computation of wake fields in complex cavities. Since the advent of message-passing supercomputers with thousands of processors, earlier approximations are no longer necessary, and it is now possible to compute wake fields, the effects of dampers, and self-consistent dynamics in cavities accurately. In this environment, the focus has shifted towards the development and implementation of algorithms that scale to large numbers of processors. So-called charge-conserving algorithms evolve the electromagnetic fields without the need for any global solves (which are difficult to scale up to many processors). Using cut-cell (or embedded) boundaries, these algorithms can simulate the fields in complex accelerator cavities with curved walls. New implicit algorithms, which are stable for any time-step, conserve charge as well, allowing faster simulation of structures with details small compared to the characteristic wavelength. These algorithmic and computational advances have been implemented in the VORPAL7 Framework, a flexible, object-oriented, massively parallel computational application that allows run-time assembly of algorithms and objects, thus composing an application on the fly.
A systematic investigation of computation models for predicting Adverse Drug Reactions (ADRs).

PubMed

Kuang, Qifan; Wang, MinQi; Li, Rong; Dong, YongCheng; Li, Yizhou; Li, Menglong

2014-01-01

Early and accurate identification of adverse drug reactions (ADRs) is critically important for drug development and clinical safety. Computer-aided prediction of ADRs has attracted increasing attention in recent years, and many computational models have been proposed. However, because of the lack of systematic analysis and comparison of the different computational models, there remain limitations in designing more effective algorithms and selecting more useful features. There is therefore an urgent need to review and analyze previous computation models to obtain general conclusions that can provide useful guidance to construct more effective computational models to predict ADRs. In the current study, the main work is to compare and analyze the performance of existing computational methods to predict ADRs, by implementing and evaluating additional algorithms that have been earlier used for predicting drug targets. Our results indicated that topological and intrinsic features were complementary to an extent and the Jaccard coefficient had an important and general effect on the prediction of drug-ADR associations. By comparing the structure of each algorithm, final formulas of these algorithms were all converted to linear model in form, based on this finding we propose a new algorithm called the general weighted profile method and it yielded the best overall performance among the algorithms investigated in this paper. Several meaningful conclusions and useful findings regarding the prediction of ADRs are provided for selecting optimal features and algorithms.
GPU-accelerated computing for Lagrangian coherent structures of multi-body gravitational regimes

NASA Astrophysics Data System (ADS)

Lin, Mingpei; Xu, Ming; Fu, Xiaoyu

2017-04-01

Based on a well-established theoretical foundation, Lagrangian Coherent Structures (LCSs) have elicited widespread research on the intrinsic structures of dynamical systems in many fields, including the field of astrodynamics. Although the application of LCSs in dynamical problems seems straightforward theoretically, its associated computational cost is prohibitive. We propose a block decomposition algorithm developed on Compute Unified Device Architecture (CUDA) platform for the computation of the LCSs of multi-body gravitational regimes. In order to take advantage of GPU's outstanding computing properties, such as Shared Memory, Constant Memory, and Zero-Copy, the algorithm utilizes a block decomposition strategy to facilitate computation of finite-time Lyapunov exponent (FTLE) fields of arbitrary size and timespan. Simulation results demonstrate that this GPU-based algorithm can satisfy double-precision accuracy requirements and greatly decrease the time needed to calculate final results, increasing speed by approximately 13 times. Additionally, this algorithm can be generalized to various large-scale computing problems, such as particle filters, constellation design, and Monte-Carlo simulation.
Computer algorithms and applications used to assist the evaluation and treatment of adolescent idiopathic scoliosis: a review of published articles 2000-2009.

PubMed

Phan, Philippe; Mezghani, Neila; Aubin, Carl-Éric; de Guise, Jacques A; Labelle, Hubert

2011-07-01

Adolescent idiopathic scoliosis (AIS) is a complex spinal deformity whose assessment and treatment present many challenges. Computer applications have been developed to assist clinicians. A literature review on computer applications used in AIS evaluation and treatment has been undertaken. The algorithms used, their accuracy and clinical usability were analyzed. Computer applications have been used to create new classifications for AIS based on 2D and 3D features, assess scoliosis severity or risk of progression and assist bracing and surgical treatment. It was found that classification accuracy could be improved using computer algorithms that AIS patient follow-up and screening could be done using surface topography thereby limiting radiation and that bracing and surgical treatment could be optimized using simulations. Yet few computer applications are routinely used in clinics. With the development of 3D imaging and databases, huge amounts of clinical and geometrical data need to be taken into consideration when researching and managing AIS. Computer applications based on advanced algorithms will be able to handle tasks that could otherwise not be done which can possibly improve AIS patients' management. Clinically oriented applications and evidence that they can improve current care will be required for their integration in the clinical setting.
Sublattice parallel replica dynamics.

PubMed

Martínez, Enrique; Uberuaga, Blas P; Voter, Arthur F

2014-06-01

Exascale computing presents a challenge for the scientific community as new algorithms must be developed to take full advantage of the new computing paradigm. Atomistic simulation methods that offer full fidelity to the underlying potential, i.e., molecular dynamics (MD) and parallel replica dynamics, fail to use the whole machine speedup, leaving a region in time and sample size space that is unattainable with current algorithms. In this paper, we present an extension of the parallel replica dynamics algorithm [A. F. Voter, Phys. Rev. B 57, R13985 (1998)] by combining it with the synchronous sublattice approach of Shim and Amar [ and , Phys. Rev. B 71, 125432 (2005)], thereby exploiting event locality to improve the algorithm scalability. This algorithm is based on a domain decomposition in which events happen independently in different regions in the sample. We develop an analytical expression for the speedup given by this sublattice parallel replica dynamics algorithm and compare it with parallel MD and traditional parallel replica dynamics. We demonstrate how this algorithm, which introduces a slight additional approximation of event locality, enables the study of physical systems unreachable with traditional methodologies and promises to better utilize the resources of current high performance and future exascale computers.
Computational mechanics analysis tools for parallel-vector supercomputers

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.; Nguyen, Duc T.; Baddourah, Majdi; Qin, Jiangning

1993-01-01

Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigensolution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization search analysis and domain decomposition. The source code for many of these algorithms is available.
Pressure algorithm for elliptic flow calculations with the PDF method

NASA Technical Reports Server (NTRS)

Anand, M. S.; Pope, S. B.; Mongia, H. C.

1991-01-01

An algorithm to determine the mean pressure field for elliptic flow calculations with the probability density function (PDF) method is developed and applied. The PDF method is a most promising approach for the computation of turbulent reacting flows. Previous computations of elliptic flows with the method were in conjunction with conventional finite volume based calculations that provided the mean pressure field. The algorithm developed and described here permits the mean pressure field to be determined within the PDF calculations. The PDF method incorporating the pressure algorithm is applied to the flow past a backward-facing step. The results are in good agreement with data for the reattachment length, mean velocities, and turbulence quantities including triple correlations.
Flight data acquisition methodology for validation of passive ranging algorithms for obstacle avoidance

NASA Technical Reports Server (NTRS)

Smith, Phillip N.

1990-01-01

The automation of low-altitude rotorcraft flight depends on the ability to detect, locate, and navigate around obstacles lying in the rotorcraft's intended flightpath. Computer vision techniques provide a passive method of obstacle detection and range estimation, for obstacle avoidance. Several algorithms based on computer vision methods have been developed for this purpose using laboratory data; however, further development and validation of candidate algorithms require data collected from rotorcraft flight. A data base containing low-altitude imagery augmented with the rotorcraft and sensor parameters required for passive range estimation is not readily available. Here, the emphasis is on the methodology used to develop such a data base from flight-test data consisting of imagery, rotorcraft and sensor parameters, and ground-truth range measurements. As part of the data preparation, a technique for obtaining the sensor calibration parameters is described. The data base will enable the further development of algorithms for computer vision-based obstacle detection and passive range estimation, as well as provide a benchmark for verification of range estimates against ground-truth measurements.
A study of real-time computer graphic display technology for aeronautical applications

NASA Technical Reports Server (NTRS)

Rajala, S. A.

1981-01-01

The development, simulation, and testing of an algorithm for anti-aliasing vector drawings is discussed. The pseudo anti-aliasing line drawing algorithm is an extension to Bresenham's algorithm for computer control of a digital plotter. The algorithm produces a series of overlapping line segments where the display intensity shifts from one segment to the other in this overlap (transition region). In this algorithm the length of the overlap and the intensity shift are essentially constants because the transition region is an aid to the eye in integrating the segments into a single smooth line.
Real time target allocation in cooperative unmanned aerial vehicles

NASA Astrophysics Data System (ADS)

Kudleppanavar, Ganesh

The prolific development of Unmanned Aerial Vehicles (UAV's) in recent years has the potential to provide tremendous advantages in military, commercial and law enforcement applications. While safety and performance take precedence in the development lifecycle, autonomous operations and, in particular, cooperative missions have the ability to significantly enhance the usability of these vehicles. The success of cooperative missions relies on the optimal allocation of targets while taking into consideration the resource limitation of each vehicle. The task allocation process can be centralized or decentralized. This effort presents the development of a real time target allocation algorithm that considers available stored energy in each vehicle while minimizing the communication between each UAV. The algorithm utilizes a nearest neighbor search algorithm to locate new targets with respect to existing targets. Simulations show that this novel algorithm compares favorably to the mixed integer linear programming method, which is computationally more expensive. The implementation of this algorithm on Arduino and Xbee wireless modules shows the capability of the algorithm to execute efficiently on hardware with minimum computation complexity.
Strategies for concurrent processing of complex algorithms in data driven architectures

NASA Technical Reports Server (NTRS)

Stoughton, John W.; Mielke, Roland R.

1988-01-01

The purpose is to document research to develop strategies for concurrent processing of complex algorithms in data driven architectures. The problem domain consists of decision-free algorithms having large-grained, computationally complex primitive operations. Such are often found in signal processing and control applications. The anticipated multiprocessor environment is a data flow architecture containing between two and twenty computing elements. Each computing element is a processor having local program memory, and which communicates with a common global data memory. A new graph theoretic model called ATAMM which establishes rules for relating a decomposed algorithm to its execution in a data flow architecture is presented. The ATAMM model is used to determine strategies to achieve optimum time performance and to develop a system diagnostic software tool. In addition, preliminary work on a new multiprocessor operating system based on the ATAMM specifications is described.
A fast bottom-up algorithm for computing the cut sets of noncoherent fault trees

DOE Office of Scientific and Technical Information (OSTI.GOV)

Corynen, G.C.

1987-11-01

An efficient procedure for finding the cut sets of large fault trees has been developed. Designed to address coherent or noncoherent systems, dependent events, shared or common-cause events, the method - called SHORTCUT - is based on a fast algorithm for transforming a noncoherent tree into a quasi-coherent tree (COHERE), and on a new algorithm for reducing cut sets (SUBSET). To assure sufficient clarity and precision, the procedure is discussed in the language of simple sets, which is also developed in this report. Although the new method has not yet been fully implemented on the computer, we report theoretical worst-casemore » estimates of its computational complexity. 12 refs., 10 figs.« less
X-Ray Radiography of Gas Turbine Ceramics.

DTIC Science & Technology

1979-10-20

Microfocus X-ray equipment. 1a4ihe definition of equipment concepts for a computer assisted tomography ( CAT ) system; and 4ffthe development of a CAT ...were obtained from these test coupons using Microfocus X-ray and image en- hancement techniques. A Computer Assisted Tomography ( CAT ) design concept...monitor. Computer reconstruction algorithms were investigated with respect to CAT and a preferred approach was determined. An appropriate CAT algorithm
Sigint Application for Polymorphous Computing Architecture (PCA): Wideband DF

DTIC Science & Technology

2006-08-01

Polymorphous Computing Architecture (PCA) program as stated by Robert Graybill is to Develop the computing foundation for agile systems by establishing...ubiquitous MUSIC algorithm rely upon an underlying narrowband signal model [8]. In this case, narrowband means that the signal bandwidth is less than...a wideband DF algorithm is needed to compensate for this model inadequacy. Among the various wideband DF techniques available, the coherent signal
The Cyborg Astrobiologist: testing a novelty detection algorithm on two mobile exploration systems at Rivas Vaciamadrid in Spain and at the Mars Desert Research Station in Utah

NASA Astrophysics Data System (ADS)

McGuire, P. C.; Gross, C.; Wendt, L.; Bonnici, A.; Souza-Egipsy, V.; Ormö, J.; Díaz-Martínez, E.; Foing, B. H.; Bose, R.; Walter, S.; Oesker, M.; Ontrup, J.; Haschke, R.; Ritter, H.

2010-01-01

In previous work, a platform was developed for testing computer-vision algorithms for robotic planetary exploration. This platform consisted of a digital video camera connected to a wearable computer for real-time processing of images at geological and astrobiological field sites. The real-time processing included image segmentation and the generation of interest points based upon uncommonness in the segmentation maps. Also in previous work, this platform for testing computer-vision algorithms has been ported to a more ergonomic alternative platform, consisting of a phone camera connected via the Global System for Mobile Communications (GSM) network to a remote-server computer. The wearable-computer platform has been tested at geological and astrobiological field sites in Spain (Rivas Vaciamadrid and Riba de Santiuste), and the phone camera has been tested at a geological field site in Malta. In this work, we (i) apply a Hopfield neural-network algorithm for novelty detection based upon colour, (ii) integrate a field-capable digital microscope on the wearable computer platform, (iii) test this novelty detection with the digital microscope at Rivas Vaciamadrid, (iv) develop a Bluetooth communication mode for the phone-camera platform, in order to allow access to a mobile processing computer at the field sites, and (v) test the novelty detection on the Bluetooth-enabled phone camera connected to a netbook computer at the Mars Desert Research Station in Utah. This systems engineering and field testing have together allowed us to develop a real-time computer-vision system that is capable, for example, of identifying lichens as novel within a series of images acquired in semi-arid desert environments. We acquired sequences of images of geologic outcrops in Utah and Spain consisting of various rock types and colours to test this algorithm. The algorithm robustly recognized previously observed units by their colour, while requiring only a single image or a few images to learn colours as familiar, demonstrating its fast learning capability.

A Decentralized Eigenvalue Computation Method for Spectrum Sensing Based on Average Consensus

NASA Astrophysics Data System (ADS)

Mohammadi, Jafar; Limmer, Steffen; Stańczak, Sławomir

2016-07-01

This paper considers eigenvalue estimation for the decentralized inference problem for spectrum sensing. We propose a decentralized eigenvalue computation algorithm based on the power method, which is referred to as generalized power method GPM; it is capable of estimating the eigenvalues of a given covariance matrix under certain conditions. Furthermore, we have developed a decentralized implementation of GPM by splitting the iterative operations into local and global computation tasks. The global tasks require data exchange to be performed among the nodes. For this task, we apply an average consensus algorithm to efficiently perform the global computations. As a special case, we consider a structured graph that is a tree with clusters of nodes at its leaves. For an accelerated distributed implementation, we propose to use computation over multiple access channel (CoMAC) as a building block of the algorithm. Numerical simulations are provided to illustrate the performance of the two algorithms.
The Construction of 3-d Neutral Density for Arbitrary Data Sets

NASA Astrophysics Data System (ADS)

Riha, S.; McDougall, T. J.; Barker, P. M.

2014-12-01

The Neutral Density variable allows inference of water pathways from thermodynamic properties in the global ocean, and is therefore an essential component of global ocean circulation analysis. The widely used algorithm for the computation of Neutral Density yields accurate results for data sets which are close to the observed climatological ocean. Long-term numerical climate simulations, however, often generate a significant drift from present-day climate, which renders the existing algorithm inaccurate. To remedy this problem, new algorithms which operate on arbitrary data have been developed, which may potentially be used to compute Neutral Density during runtime of a numerical model.We review existing approaches for the construction of Neutral Density in arbitrary data sets, detail their algorithmic structure, and present an analysis of the computational cost for implementations on a single-CPU computer. We discuss possible strategies for the implementation in state-of-the-art numerical models, with a focus on distributed computing environments.
A Scheduling Algorithm for Computational Grids that Minimizes Centralized Processing in Genome Assembly of Next-Generation Sequencing Data

PubMed Central

Lima, Jakelyne; Cerdeira, Louise Teixeira; Bol, Erick; Schneider, Maria Paula Cruz; Silva, Artur; Azevedo, Vasco; Abelém, Antônio Jorge Gomes

2012-01-01

Improvements in genome sequencing techniques have resulted in generation of huge volumes of data. As a consequence of this progress, the genome assembly stage demands even more computational power, since the incoming sequence files contain large amounts of data. To speed up the process, it is often necessary to distribute the workload among a group of machines. However, this requires hardware and software solutions specially configured for this purpose. Grid computing try to simplify this process of aggregate resources, but do not always offer the best performance possible due to heterogeneity and decentralized management of its resources. Thus, it is necessary to develop software that takes into account these peculiarities. In order to achieve this purpose, we developed an algorithm aimed to optimize the functionality of de novo assembly software ABySS in order to optimize its operation in grids. We run ABySS with and without the algorithm we developed in the grid simulator SimGrid. Tests showed that our algorithm is viable, flexible, and scalable even on a heterogeneous environment, which improved the genome assembly time in computational grids without changing its quality. PMID:22461785
An Improved Clustering Algorithm of Tunnel Monitoring Data for Cloud Computing

PubMed Central

Zhong, Luo; Tang, KunHao; Li, Lin; Yang, Guang; Ye, JingJing

2014-01-01

With the rapid development of urban construction, the number of urban tunnels is increasing and the data they produce become more and more complex. It results in the fact that the traditional clustering algorithm cannot handle the mass data of the tunnel. To solve this problem, an improved parallel clustering algorithm based on k-means has been proposed. It is a clustering algorithm using the MapReduce within cloud computing that deals with data. It not only has the advantage of being used to deal with mass data but also is more efficient. Moreover, it is able to compute the average dissimilarity degree of each cluster in order to clean the abnormal data. PMID:24982971
Space shuttle propulsion parameter estimation using optional estimation techniques

NASA Technical Reports Server (NTRS)

1983-01-01

A regression analyses on tabular aerodynamic data provided. A representative aerodynamic model for coefficient estimation. It also reduced the storage requirements for the "normal' model used to check out the estimation algorithms. The results of the regression analyses are presented. The computer routines for the filter portion of the estimation algorithm and the :"bringing-up' of the SRB predictive program on the computer was developed. For the filter program, approximately 54 routines were developed. The routines were highly subsegmented to facilitate overlaying program segments within the partitioned storage space on the computer.
Simulated parallel annealing within a neighborhood for optimization of biomechanical systems.

PubMed

Higginson, J S; Neptune, R R; Anderson, F C

2005-09-01

Optimization problems for biomechanical systems have become extremely complex. Simulated annealing (SA) algorithms have performed well in a variety of test problems and biomechanical applications; however, despite advances in computer speed, convergence to optimal solutions for systems of even moderate complexity has remained prohibitive. The objective of this study was to develop a portable parallel version of a SA algorithm for solving optimization problems in biomechanics. The algorithm for simulated parallel annealing within a neighborhood (SPAN) was designed to minimize interprocessor communication time and closely retain the heuristics of the serial SA algorithm. The computational speed of the SPAN algorithm scaled linearly with the number of processors on different computer platforms for a simple quadratic test problem and for a more complex forward dynamic simulation of human pedaling.
A Computational Algorithm for Functional Clustering of Proteome Dynamics During Development

PubMed Central

Wang, Yaqun; Wang, Ningtao; Hao, Han; Guo, Yunqian; Zhen, Yan; Shi, Jisen; Wu, Rongling

2014-01-01

Phenotypic traits, such as seed development, are a consequence of complex biochemical interactions among genes, proteins and metabolites, but the underlying mechanisms that operate in a coordinated and sequential manner remain elusive. Here, we address this issue by developing a computational algorithm to monitor proteome changes during the course of trait development. The algorithm is built within the mixture-model framework in which each mixture component is modeled by a specific group of proteins that display a similar temporal pattern of expression in trait development. A nonparametric approach based on Legendre orthogonal polynomials was used to fit dynamic changes of protein expression, increasing the power and flexibility of protein clustering. By analyzing a dataset of proteomic dynamics during early embryogenesis of the Chinese fir, the algorithm has successfully identified several distinct types of proteins that coordinate with each other to determine seed development in this forest tree commercially and environmentally important to China. The algorithm will find its immediate applications for the characterization of mechanistic underpinnings for any other biological processes in which protein abundance plays a key role. PMID:24955031
Strategic Control Algorithm Development : Volume 4A. Computer Program Report.

DOT National Transportation Integrated Search

1974-08-01

A description of the strategic algorithm evaluation model is presented, both at the user and programmer levels. The model representation of an airport configuration, environmental considerations, the strategic control algorithm logic, and the airplan...
Strategic Control Algorithm Development : Volume 4B. Computer Program Report (Concluded)

DOT National Transportation Integrated Search

1974-08-01

A description of the strategic algorithm evaluation model is presented, both at the user and programmer levels. The model representation of an airport configuration, environmental considerations, the strategic control algorithm logic, and the airplan...
State-Space System Realization with Input- and Output-Data Correlation

NASA Technical Reports Server (NTRS)

Juang, Jer-Nan

1997-01-01

This paper introduces a general version of the information matrix consisting of the autocorrelation and cross-correlation matrices of the shifted input and output data. Based on the concept of data correlation, a new system realization algorithm is developed to create a model directly from input and output data. The algorithm starts by computing a special type of correlation matrix derived from the information matrix. The special correlation matrix provides information on the system-observability matrix and the state-vector correlation. A system model is then developed from the observability matrix in conjunction with other algebraic manipulations. This approach leads to several different algorithms for computing system matrices for use in representing the system model. The relationship of the new algorithms with other realization algorithms in the time and frequency domains is established with matrix factorization of the information matrix. Several examples are given to illustrate the validity and usefulness of these new algorithms.
Terascale Optimal PDE Simulations (TOPS) Center

DOE Office of Scientific and Technical Information (OSTI.GOV)

Professor Olof B. Widlund

2007-07-09

Our work has focused on the development and analysis of domain decomposition algorithms for a variety of problems arising in continuum mechanics modeling. In particular, we have extended and analyzed FETI-DP and BDDC algorithms; these iterative solvers were first introduced and studied by Charbel Farhat and his collaborators, see [11, 45, 12], and by Clark Dohrmann of SANDIA, Albuquerque, see [43, 2, 1], respectively. These two closely related families of methods are of particular interest since they are used more extensively than other iterative substructuring methods to solve very large and difficult problems. Thus, the FETI algorithms are part ofmore » the SALINAS system developed by the SANDIA National Laboratories for very large scale computations, and as already noted, BDDC was first developed by a SANDIA scientist, Dr. Clark Dohrmann. The FETI algorithms are also making inroads in commercial engineering software systems. We also note that the analysis of these algorithms poses very real mathematical challenges. The success in developing this theory has, in several instances, led to significant improvements in the performance of these algorithms. A very desirable feature of these iterative substructuring and other domain decomposition algorithms is that they respect the memory hierarchy of modern parallel and distributed computing systems, which is essential for approaching peak floating point performance. The development of improved methods, together with more powerful computer systems, is making it possible to carry out simulations in three dimensions, with quite high resolution, relatively easily. This work is supported by high quality software systems, such as Argonne's PETSc library, which facilitates code development as well as the access to a variety of parallel and distributed computer systems. The success in finding scalable and robust domain decomposition algorithms for very large number of processors and very large finite element problems is, e.g., illustrated in [24, 25, 26]. This work is based on [29, 31]. Our work over these five and half years has, in our opinion, helped advance the knowledge of domain decomposition methods significantly. We see these methods as providing valuable alternatives to other iterative methods, in particular, those based on multi-grid. In our opinion, our accomplishments also match the goals of the TOPS project quite closely.« less
Algorithm for Surface of Translation Attached Radiators (A-STAR). Volume 2. Users manual

NASA Astrophysics Data System (ADS)

Medgyesimitschang, L. N.; Putnam, J. M.

1982-05-01

A hierarchy of computer programs implementing the method of moments for bodies of translation (MM/BOT) is described. The algorithm treats the far-field radiation from off-surface and aperture antennas on finite-length open or closed bodies of arbitrary cross section. The near fields and antenna coupling on such bodies are computed. The theoretical development underlying the algorithm is described in Volume 1 of this report.
The development of a scalable parallel 3-D CFD algorithm for turbomachinery. M.S. Thesis Final Report

NASA Technical Reports Server (NTRS)

Luke, Edward Allen

1993-01-01

Two algorithms capable of computing a transonic 3-D inviscid flow field about rotating machines are considered for parallel implementation. During the study of these algorithms, a significant new method of measuring the performance of parallel algorithms is developed. The theory that supports this new method creates an empirical definition of scalable parallel algorithms that is used to produce quantifiable evidence that a scalable parallel application was developed. The implementation of the parallel application and an automated domain decomposition tool are also discussed.
Implementation of a fully-balanced periodic tridiagonal solver on a parallel distributed memory architecture

NASA Technical Reports Server (NTRS)

Eidson, T. M.; Erlebacher, G.

1994-01-01

While parallel computers offer significant computational performance, it is generally necessary to evaluate several programming strategies. Two programming strategies for a fairly common problem - a periodic tridiagonal solver - are developed and evaluated. Simple model calculations as well as timing results are presented to evaluate the various strategies. The particular tridiagonal solver evaluated is used in many computational fluid dynamic simulation codes. The feature that makes this algorithm unique is that these simulation codes usually require simultaneous solutions for multiple right-hand-sides (RHS) of the system of equations. Each RHS solutions is independent and thus can be computed in parallel. Thus a Gaussian elimination type algorithm can be used in a parallel computation and the more complicated approaches such as cyclic reduction are not required. The two strategies are a transpose strategy and a distributed solver strategy. For the transpose strategy, the data is moved so that a subset of all the RHS problems is solved on each of the several processors. This usually requires significant data movement between processor memories across a network. The second strategy attempts to have the algorithm allow the data across processor boundaries in a chained manner. This usually requires significantly less data movement. An approach to accomplish this second strategy in a near-perfect load-balanced manner is developed. In addition, an algorithm will be shown to directly transform a sequential Gaussian elimination type algorithm into the parallel chained, load-balanced algorithm.
Near real-time, on-the-move software PED using VPEF

NASA Astrophysics Data System (ADS)

Green, Kevin; Geyer, Chris; Burnette, Chris; Agarwal, Sanjeev; Swett, Bruce; Phan, Chung; Deterline, Diane

2015-05-01

The scope of the Micro-Cloud for Operational, Vehicle-Based EO-IR Reconnaissance System (MOVERS) development effort, managed by the Night Vision and Electronic Sensors Directorate (NVESD), is to develop, integrate, and demonstrate new sensor technologies and algorithms that improve improvised device/mine detection using efficient and effective exploitation and fusion of sensor data and target cues from existing and future Route Clearance Package (RCP) sensor systems. Unfortunately, the majority of forward looking Full Motion Video (FMV) and computer vision processing, exploitation, and dissemination (PED) algorithms are often developed using proprietary, incompatible software. This makes the insertion of new algorithms difficult due to the lack of standardized processing chains. In order to overcome these limitations, EOIR developed the Government off-the-shelf (GOTS) Video Processing and Exploitation Framework (VPEF) to be able to provide standardized interfaces (e.g., input/output video formats, sensor metadata, and detected objects) for exploitation software and to rapidly integrate and test computer vision algorithms. EOIR developed a vehicle-based computing framework within the MOVERS and integrated it with VPEF. VPEF was further enhanced for automated processing, detection, and publishing of detections in near real-time, thus improving the efficiency and effectiveness of RCP sensor systems.
The Research and Implementation of MUSER CLEAN Algorithm Based on OpenCL

NASA Astrophysics Data System (ADS)

Feng, Y.; Chen, K.; Deng, H.; Wang, F.; Mei, Y.; Wei, S. L.; Dai, W.; Yang, Q. P.; Liu, Y. B.; Wu, J. P.

2017-03-01

It's urgent to carry out high-performance data processing with a single machine in the development of astronomical software. However, due to the different configuration of the machine, traditional programming techniques such as multi-threading, and CUDA (Compute Unified Device Architecture)+GPU (Graphic Processing Unit) have obvious limitations in portability and seamlessness between different operation systems. The OpenCL (Open Computing Language) used in the development of MUSER (MingantU SpEctral Radioheliograph) data processing system is introduced. And the Högbom CLEAN algorithm is re-implemented into parallel CLEAN algorithm by the Python language and PyOpenCL extended package. The experimental results show that the CLEAN algorithm based on OpenCL has approximately equally operating efficiency compared with the former CLEAN algorithm based on CUDA. More important, the data processing in merely CPU (Central Processing Unit) environment of this system can also achieve high performance, which has solved the problem of environmental dependence of CUDA+GPU. Overall, the research improves the adaptability of the system with emphasis on performance of MUSER image clean computing. In the meanwhile, the realization of OpenCL in MUSER proves its availability in scientific data processing. In view of the high-performance computing features of OpenCL in heterogeneous environment, it will probably become the preferred technology in the future high-performance astronomical software development.
Parallel Optimization of 3D Cardiac Electrophysiological Model Using GPU

PubMed Central

Xia, Yong; Zhang, Henggui

2015-01-01

Large-scale 3D virtual heart model simulations are highly demanding in computational resources. This imposes a big challenge to the traditional computation resources based on CPU environment, which already cannot meet the requirement of the whole computation demands or are not easily available due to expensive costs. GPU as a parallel computing environment therefore provides an alternative to solve the large-scale computational problems of whole heart modeling. In this study, using a 3D sheep atrial model as a test bed, we developed a GPU-based simulation algorithm to simulate the conduction of electrical excitation waves in the 3D atria. In the GPU algorithm, a multicellular tissue model was split into two components: one is the single cell model (ordinary differential equation) and the other is the diffusion term of the monodomain model (partial differential equation). Such a decoupling enabled realization of the GPU parallel algorithm. Furthermore, several optimization strategies were proposed based on the features of the virtual heart model, which enabled a 200-fold speedup as compared to a CPU implementation. In conclusion, an optimized GPU algorithm has been developed that provides an economic and powerful platform for 3D whole heart simulations. PMID:26581957
Parallel Optimization of 3D Cardiac Electrophysiological Model Using GPU.

PubMed

Xia, Yong; Wang, Kuanquan; Zhang, Henggui

2015-01-01

Large-scale 3D virtual heart model simulations are highly demanding in computational resources. This imposes a big challenge to the traditional computation resources based on CPU environment, which already cannot meet the requirement of the whole computation demands or are not easily available due to expensive costs. GPU as a parallel computing environment therefore provides an alternative to solve the large-scale computational problems of whole heart modeling. In this study, using a 3D sheep atrial model as a test bed, we developed a GPU-based simulation algorithm to simulate the conduction of electrical excitation waves in the 3D atria. In the GPU algorithm, a multicellular tissue model was split into two components: one is the single cell model (ordinary differential equation) and the other is the diffusion term of the monodomain model (partial differential equation). Such a decoupling enabled realization of the GPU parallel algorithm. Furthermore, several optimization strategies were proposed based on the features of the virtual heart model, which enabled a 200-fold speedup as compared to a CPU implementation. In conclusion, an optimized GPU algorithm has been developed that provides an economic and powerful platform for 3D whole heart simulations.
Research progress on quantum informatics and quantum computation

NASA Astrophysics Data System (ADS)

Zhao, Yusheng

2018-03-01

Quantum informatics is an emerging interdisciplinary subject developed by the combination of quantum mechanics, information science, and computer science in the 1980s. The birth and development of quantum information science has far-reaching significance in science and technology. At present, the application of quantum information technology has become the direction of people’s efforts. The preparation, storage, purification and regulation, transmission, quantum coding and decoding of quantum state have become the hotspot of scientists and technicians, which have a profound impact on the national economy and the people’s livelihood, technology and defense technology. This paper first summarizes the background of quantum information science and quantum computer and the current situation of domestic and foreign research, and then introduces the basic knowledge and basic concepts of quantum computing. Finally, several quantum algorithms are introduced in detail, including Quantum Fourier transform, Deutsch-Jozsa algorithm, Shor’s quantum algorithm, quantum phase estimation.
Real-time dynamic simulation of the Cassini spacecraft using DARTS. Part 2: Parallel/vectorized real-time implementation

NASA Technical Reports Server (NTRS)

Fijany, A.; Roberts, J. A.; Jain, A.; Man, G. K.

1993-01-01

Part 1 of this paper presented the requirements for the real-time simulation of Cassini spacecraft along with some discussion of the DARTS algorithm. Here, in Part 2 we discuss the development and implementation of parallel/vectorized DARTS algorithm and architecture for real-time simulation. Development of the fast algorithms and architecture for real-time hardware-in-the-loop simulation of spacecraft dynamics is motivated by the fact that it represents a hard real-time problem, in the sense that the correctness of the simulation depends on both the numerical accuracy and the exact timing of the computation. For a given model fidelity, the computation should be computed within a predefined time period. Further reduction in computation time allows increasing the fidelity of the model (i.e., inclusion of more flexible modes) and the integration routine.

Automated planning of computer assisted mosaic arthroplasty.

PubMed

Inoue, Jiro; Kunz, Manuela; Hurtig, Mark B; Waldman, Stephen D; Stewart, A James

2011-01-01

We describe and evaluate a computer algorithm that automatically develops a surgical plan for computer assisted mosaic arthroplasty, a technically demanding procedure in which a set of osteochondral plugs are transplanted from a non-load-bearing area of the joint to the site of a cartilage defect. We found that the algorithm produced plans that were at least as good as a human expert, had less variability, and took less time.
Advanced biologically plausible algorithms for low-level image processing

NASA Astrophysics Data System (ADS)

Gusakova, Valentina I.; Podladchikova, Lubov N.; Shaposhnikov, Dmitry G.; Markin, Sergey N.; Golovan, Alexander V.; Lee, Seong-Whan

1999-08-01

At present, in computer vision, the approach based on modeling the biological vision mechanisms is extensively developed. However, up to now, real world image processing has no effective solution in frameworks of both biologically inspired and conventional approaches. Evidently, new algorithms and system architectures based on advanced biological motivation should be developed for solution of computational problems related to this visual task. Basic problems that should be solved for creation of effective artificial visual system to process real world imags are a search for new algorithms of low-level image processing that, in a great extent, determine system performance. In the present paper, the result of psychophysical experiments and several advanced biologically motivated algorithms for low-level processing are presented. These algorithms are based on local space-variant filter, context encoding visual information presented in the center of input window, and automatic detection of perceptually important image fragments. The core of latter algorithm are using local feature conjunctions such as noncolinear oriented segment and composite feature map formation. Developed algorithms were integrated into foveal active vision model, the MARR. It is supposed that proposed algorithms may significantly improve model performance while real world image processing during memorizing, search, and recognition.
A Systematic Investigation of Computation Models for Predicting Adverse Drug Reactions (ADRs)

PubMed Central

Kuang, Qifan; Wang, MinQi; Li, Rong; Dong, YongCheng; Li, Yizhou; Li, Menglong

2014-01-01

Background Early and accurate identification of adverse drug reactions (ADRs) is critically important for drug development and clinical safety. Computer-aided prediction of ADRs has attracted increasing attention in recent years, and many computational models have been proposed. However, because of the lack of systematic analysis and comparison of the different computational models, there remain limitations in designing more effective algorithms and selecting more useful features. There is therefore an urgent need to review and analyze previous computation models to obtain general conclusions that can provide useful guidance to construct more effective computational models to predict ADRs. Principal Findings In the current study, the main work is to compare and analyze the performance of existing computational methods to predict ADRs, by implementing and evaluating additional algorithms that have been earlier used for predicting drug targets. Our results indicated that topological and intrinsic features were complementary to an extent and the Jaccard coefficient had an important and general effect on the prediction of drug-ADR associations. By comparing the structure of each algorithm, final formulas of these algorithms were all converted to linear model in form, based on this finding we propose a new algorithm called the general weighted profile method and it yielded the best overall performance among the algorithms investigated in this paper. Conclusion Several meaningful conclusions and useful findings regarding the prediction of ADRs are provided for selecting optimal features and algorithms. PMID:25180585
A real time microcomputer implementation of sensor failure detection for turbofan engines

NASA Technical Reports Server (NTRS)

Delaat, John C.; Merrill, Walter C.

1989-01-01

An algorithm was developed which detects, isolates, and accommodates sensor failures using analytical redundancy. The performance of this algorithm was demonstrated on a full-scale F100 turbofan engine. The algorithm was implemented in real-time on a microprocessor-based controls computer which includes parallel processing and high order language programming. Parallel processing was used to achieve the required computational power for the real-time implementation. High order language programming was used in order to reduce the programming and maintenance costs of the algorithm implementation software. The sensor failure algorithm was combined with an existing multivariable control algorithm to give a complete control implementation with sensor analytical redundancy. The real-time microprocessor implementation of the algorithm which resulted in the successful completion of the algorithm engine demonstration, is described.
Gradient gravitational search: An efficient metaheuristic algorithm for global optimization.

PubMed

Dash, Tirtharaj; Sahu, Prabhat K

2015-05-30

The adaptation of novel techniques developed in the field of computational chemistry to solve the concerned problems for large and flexible molecules is taking the center stage with regard to efficient algorithm, computational cost and accuracy. In this article, the gradient-based gravitational search (GGS) algorithm, using analytical gradients for a fast minimization to the next local minimum has been reported. Its efficiency as metaheuristic approach has also been compared with Gradient Tabu Search and others like: Gravitational Search, Cuckoo Search, and Back Tracking Search algorithms for global optimization. Moreover, the GGS approach has also been applied to computational chemistry problems for finding the minimal value potential energy of two-dimensional and three-dimensional off-lattice protein models. The simulation results reveal the relative stability and physical accuracy of protein models with efficient computational cost. © 2015 Wiley Periodicals, Inc.
Using Computer-Based "Experiments" in the Analysis of Chemical Reaction Equilibria

ERIC Educational Resources Information Center

Li, Zhao; Corti, David S.

2018-01-01

The application of the Reaction Monte Carlo (RxMC) algorithm to standard textbook problems in chemical reaction equilibria is discussed. The RxMC method is a molecular simulation algorithm for studying the equilibrium properties of reactive systems, and therefore provides the opportunity to develop computer-based "experiments" for the…
An Empirical Generative Framework for Computational Modeling of Language Acquisition

ERIC Educational Resources Information Center

Waterfall, Heidi R.; Sandbank, Ben; Onnis, Luca; Edelman, Shimon

2010-01-01

This paper reports progress in developing a computer model of language acquisition in the form of (1) a generative grammar that is (2) algorithmically learnable from realistic corpus data, (3) viable in its large-scale quantitative performance and (4) psychologically real. First, we describe new algorithmic methods for unsupervised learning of…
Integration of symbolic and algorithmic hardware and software for the automation of space station subsystems

NASA Technical Reports Server (NTRS)

Gregg, Hugh; Healey, Kathleen; Hack, Edmund; Wong, Carla

1987-01-01

Expert systems that require access to data bases, complex simulations and real time instrumentation have both symbolic as well as algorithmic computing needs. These needs could both be met using a general computing workstation running both symbolic and algorithmic code, or separate, specialized computers networked together. The later approach was chosen to implement TEXSYS, the thermal expert system, developed to demonstrate the ability of an expert system to autonomously control the thermal control system of the space station. TEXSYS has been implemented on a Symbolics workstation, and will be linked to a microVAX computer that will control a thermal test bed. Integration options are explored and several possible solutions are presented.
Developments in the application of the geometrical theory of diffraction and computer graphics to aircraft inter-antenna coupling analysis

NASA Astrophysics Data System (ADS)

Bogusz, Michael

1993-01-01

The need for a systematic methodology for the analysis of aircraft electromagnetic compatibility (EMC) problems is examined. The available computer aids used in aircraft EMC analysis are assessed and a theoretical basis is established for the complex algorithms which identify and quantify electromagnetic interactions. An overview is presented of one particularly well established aircraft antenna to antenna EMC analysis code, the Aircraft Inter-Antenna Propagation with Graphics (AAPG) Version 07 software. The specific new algorithms created to compute cone geodesics and their associated path losses and to graph the physical coupling path are discussed. These algorithms are validated against basic principles. Loss computations apply the uniform geometrical theory of diffraction and are subsequently compared to measurement data. The increased modelling and analysis capabilities of the newly developed AAPG Version 09 are compared to those of Version 07. Several models of real aircraft, namely the Electronic Systems Trainer Challenger, are generated and provided as a basis for this preliminary comparative assessment. Issues such as software reliability, algorithm stability, and quality of hardcopy output are also discussed.
QRS detection based ECG quality assessment.

PubMed

Hayn, Dieter; Jammerbund, Bernhard; Schreier, Günter

2012-09-01

Although immediate feedback concerning ECG signal quality during recording is useful, up to now not much literature describing quality measures is available. We have implemented and evaluated four ECG quality measures. Empty lead criterion (A), spike detection criterion (B) and lead crossing point criterion (C) were calculated from basic signal properties. Measure D quantified the robustness of QRS detection when applied to the signal. An advanced Matlab-based algorithm combining all four measures and a simplified algorithm for Android platforms, excluding measure D, were developed. Both algorithms were evaluated by taking part in the Computing in Cardiology Challenge 2011. Each measure's accuracy and computing time was evaluated separately. During the challenge, the advanced algorithm correctly classified 93.3% of the ECGs in the training-set and 91.6 % in the test-set. Scores for the simplified algorithm were 0.834 in event 2 and 0.873 in event 3. Computing time for measure D was almost five times higher than for other measures. Required accuracy levels depend on the application and are related to computing time. While our simplified algorithm may be accurate for real-time feedback during ECG self-recordings, QRS detection based measures can further increase the performance if sufficient computing power is available.
Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment.

PubMed

Lee, Wei-Po; Hsiao, Yu-Ting; Hwang, Wei-Che

2014-01-16

To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it becomes a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: most popular is the method of MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use a well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. By coupling the parallel model population-based optimization method and the parallel computational framework, high quality solutions can be obtained within relatively short time. This integrated approach is a promising way for inferring large networks.
Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment

PubMed Central

2014-01-01

Background To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it becomes a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: most popular is the method of MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. Results This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use a well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Conclusions Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. By coupling the parallel model population-based optimization method and the parallel computational framework, high quality solutions can be obtained within relatively short time. This integrated approach is a promising way for inferring large networks. PMID:24428926
Histopathological Image Analysis: A Review

PubMed Central

Gurcan, Metin N.; Boucheron, Laura; Can, Ali; Madabhushi, Anant; Rajpoot, Nasir; Yener, Bulent

2010-01-01

Over the past decade, dramatic increases in computational power and improvement in image analysis algorithms have allowed the development of powerful computer-assisted analytical approaches to radiological data. With the recent advent of whole slide digital scanners, tissue histopathology slides can now be digitized and stored in digital image form. Consequently, digitized tissue histopathology has now become amenable to the application of computerized image analysis and machine learning techniques. Analogous to the role of computer-assisted diagnosis (CAD) algorithms in medical imaging to complement the opinion of a radiologist, CAD algorithms have begun to be developed for disease detection, diagnosis, and prognosis prediction to complement to the opinion of the pathologist. In this paper, we review the recent state of the art CAD technology for digitized histopathology. This paper also briefly describes the development and application of novel image analysis technology for a few specific histopathology related problems being pursued in the United States and Europe. PMID:20671804
Real-time optical flow estimation on a GPU for a skied-steered mobile robot

NASA Astrophysics Data System (ADS)

Kniaz, V. V.

2016-04-01

Accurate egomotion estimation is required for mobile robot navigation. Often the egomotion is estimated using optical flow algorithms. For an accurate estimation of optical flow most of modern algorithms require high memory resources and processor speed. However simple single-board computers that control the motion of the robot usually do not provide such resources. On the other hand, most of modern single-board computers are equipped with an embedded GPU that could be used in parallel with a CPU to improve the performance of the optical flow estimation algorithm. This paper presents a new Z-flow algorithm for efficient computation of an optical flow using an embedded GPU. The algorithm is based on the phase correlation optical flow estimation and provide a real-time performance on a low cost embedded GPU. The layered optical flow model is used. Layer segmentation is performed using graph-cut algorithm with a time derivative based energy function. Such approach makes the algorithm both fast and robust in low light and low texture conditions. The algorithm implementation for a Raspberry Pi Model B computer is discussed. For evaluation of the algorithm the computer was mounted on a Hercules mobile skied-steered robot equipped with a monocular camera. The evaluation was performed using a hardware-in-the-loop simulation and experiments with Hercules mobile robot. Also the algorithm was evaluated using KITTY Optical Flow 2015 dataset. The resulting endpoint error of the optical flow calculated with the developed algorithm was low enough for navigation of the robot along the desired trajectory.
Probabilistic analysis algorithm for UA slope software program.

DOT National Transportation Integrated Search

2013-12-01

A reliability-based computational algorithm for using a single row and equally spaced drilled shafts to : stabilize an unstable slope has been developed in this research. The Monte-Carlo simulation (MCS) : technique was used in the previously develop...
A unified algorithm for predicting partition coefficients for PBPK modeling of drugs and environmental chemicals

DOE Office of Scientific and Technical Information (OSTI.GOV)

Peyret, Thomas; Poulin, Patrick; Krishnan, Kannan, E-mail: kannan.krishnan@umontreal.ca

The algorithms in the literature focusing to predict tissue:blood PC (P{sub tb}) for environmental chemicals and tissue:plasma PC based on total (K{sub p}) or unbound concentration (K{sub pu}) for drugs differ in their consideration of binding to hemoglobin, plasma proteins and charged phospholipids. The objective of the present study was to develop a unified algorithm such that P{sub tb}, K{sub p} and K{sub pu} for both drugs and environmental chemicals could be predicted. The development of the unified algorithm was accomplished by integrating all mechanistic algorithms previously published to compute the PCs. Furthermore, the algorithm was structured in such amore » way as to facilitate predictions of the distribution of organic compounds at the macro (i.e. whole tissue) and micro (i.e. cells and fluids) levels. The resulting unified algorithm was applied to compute the rat P{sub tb}, K{sub p} or K{sub pu} of muscle (n = 174), liver (n = 139) and adipose tissue (n = 141) for acidic, neutral, zwitterionic and basic drugs as well as ketones, acetate esters, alcohols, aliphatic hydrocarbons, aromatic hydrocarbons and ethers. The unified algorithm reproduced adequately the values predicted previously by the published algorithms for a total of 142 drugs and chemicals. The sensitivity analysis demonstrated the relative importance of the various compound properties reflective of specific mechanistic determinants relevant to prediction of PC values of drugs and environmental chemicals. Overall, the present unified algorithm uniquely facilitates the computation of macro and micro level PCs for developing organ and cellular-level PBPK models for both chemicals and drugs.« less
A generalized Condat's algorithm of 1D total variation regularization

NASA Astrophysics Data System (ADS)

Makovetskii, Artyom; Voronin, Sergei; Kober, Vitaly

2017-09-01

A common way for solving the denosing problem is to utilize the total variation (TV) regularization. Many efficient numerical algorithms have been developed for solving the TV regularization problem. Condat described a fast direct algorithm to compute the processed 1D signal. Also there exists a direct algorithm with a linear time for 1D TV denoising referred to as the taut string algorithm. The Condat's algorithm is based on a dual problem to the 1D TV regularization. In this paper, we propose a variant of the Condat's algorithm based on the direct 1D TV regularization problem. The usage of the Condat's algorithm with the taut string approach leads to a clear geometric description of the extremal function. Computer simulation results are provided to illustrate the performance of the proposed algorithm for restoration of degraded signals.
Quantum simulator review

NASA Astrophysics Data System (ADS)

Bednar, Earl; Drager, Steven L.

2007-04-01

Quantum information processing's objective is to utilize revolutionary computing capability based on harnessing the paradigm shift offered by quantum computing to solve classically hard and computationally challenging problems. Some of our computationally challenging problems of interest include: the capability for rapid image processing, rapid optimization of logistics, protecting information, secure distributed simulation, and massively parallel computation. Currently, one important problem with quantum information processing is that the implementation of quantum computers is difficult to realize due to poor scalability and great presence of errors. Therefore, we have supported the development of Quantum eXpress and QuIDD Pro, two quantum computer simulators running on classical computers for the development and testing of new quantum algorithms and processes. This paper examines the different methods used by these two quantum computing simulators. It reviews both simulators, highlighting each simulators background, interface, and special features. It also demonstrates the implementation of current quantum algorithms on each simulator. It concludes with summary comments on both simulators.
A Subspace Pursuit–based Iterative Greedy Hierarchical Solution to the Neuromagnetic Inverse Problem

PubMed Central

Babadi, Behtash; Obregon-Henao, Gabriel; Lamus, Camilo; Hämäläinen, Matti S.; Brown, Emery N.; Purdon, Patrick L.

2013-01-01

Magnetoencephalography (MEG) is an important non-invasive method for studying activity within the human brain. Source localization methods can be used to estimate spatiotemporal activity from MEG measurements with high temporal resolution, but the spatial resolution of these estimates is poor due to the ill-posed nature of the MEG inverse problem. Recent developments in source localization methodology have emphasized temporal as well as spatial constraints to improve source localization accuracy, but these methods can be computationally intense. Solutions emphasizing spatial sparsity hold tremendous promise, since the underlying neurophysiological processes generating MEG signals are often sparse in nature, whether in the form of focal sources, or distributed sources representing large-scale functional networks. Recent developments in the theory of compressed sensing (CS) provide a rigorous framework to estimate signals with sparse structure. In particular, a class of CS algorithms referred to as greedy pursuit algorithms can provide both high recovery accuracy and low computational complexity. Greedy pursuit algorithms are difficult to apply directly to the MEG inverse problem because of the high-dimensional structure of the MEG source space and the high spatial correlation in MEG measurements. In this paper, we develop a novel greedy pursuit algorithm for sparse MEG source localization that overcomes these fundamental problems. This algorithm, which we refer to as the Subspace Pursuit-based Iterative Greedy Hierarchical (SPIGH) inverse solution, exhibits very low computational complexity while achieving very high localization accuracy. We evaluate the performance of the proposed algorithm using comprehensive simulations, as well as the analysis of human MEG data during spontaneous brain activity and somatosensory stimuli. These studies reveal substantial performance gains provided by the SPIGH algorithm in terms of computational complexity, localization accuracy, and robustness. PMID:24055554
Algorithms for Zonal Methods and Development of Three Dimensional Mesh Generation Procedures.

DTIC Science & Technology

1984-02-01

a r-re complete set of equations is used, but their effect is imposed by means of a right hand side forcing function, not by means of a zonal boundary...modifications of flow-simulation algorithms The explicit finite-difference code of Magnus and are discussed. Computational tests in two dimensions...used to simplify the task of grid generation without an adverse achieve computational efficiency. More recently, effect on flow-field algorithms and

Vectorization of transport and diffusion computations on the CDC Cyber 205

DOE Office of Scientific and Technical Information (OSTI.GOV)

Abu-Shumays, I.K.

1986-01-01

The development and testing of alternative numerical methods and computational algorithms specifically designed for the vectorization of transport and diffusion computations on a Control Data Corporation (CDC) Cyber 205 vector computer are described. Two solution methods for the discrete ordinates approximation to the transport equation are summarized and compared. Factors of 4 to 7 reduction in run times for certain large transport problems were achieved on a Cyber 205 as compared with run times on a CDC-7600. The solution of tridiagonal systems of linear equations, central to several efficient numerical methods for multidimensional diffusion computations and essential for fluid flowmore » and other physics and engineering problems, is also dealt with. Among the methods tested, a combined odd-even cyclic reduction and modified Cholesky factorization algorithm for solving linear symmetric positive definite tridiagonal systems is found to be the most effective for these systems on a Cyber 205. For large tridiagonal systems, computation with this algorithm is an order of magnitude faster on a Cyber 205 than computation with the best algorithm for tridiagonal systems on a CDC-7600.« less
The remote sensing image segmentation mean shift algorithm parallel processing based on MapReduce

NASA Astrophysics Data System (ADS)

Chen, Xi; Zhou, Liqing

2015-12-01

With the development of satellite remote sensing technology and the remote sensing image data, traditional remote sensing image segmentation technology cannot meet the massive remote sensing image processing and storage requirements. This article put cloud computing and parallel computing technology in remote sensing image segmentation process, and build a cheap and efficient computer cluster system that uses parallel processing to achieve MeanShift algorithm of remote sensing image segmentation based on the MapReduce model, not only to ensure the quality of remote sensing image segmentation, improved split speed, and better meet the real-time requirements. The remote sensing image segmentation MeanShift algorithm parallel processing algorithm based on MapReduce shows certain significance and a realization of value.
The challenge of computer mathematics.

PubMed

Barendregt, Henk; Wiedijk, Freek

2005-10-15

Progress in the foundations of mathematics has made it possible to formulate all thinkable mathematical concepts, algorithms and proofs in one language and in an impeccable way. This is not in spite of, but partially based on the famous results of Gödel and Turing. In this way statements are about mathematical objects and algorithms, proofs show the correctness of statements and computations, and computations are dealing with objects and proofs. Interactive computer systems for a full integration of defining, computing and proving are based on this. The human defines concepts, constructs algorithms and provides proofs, while the machine checks that the definitions are well formed and the proofs and computations are correct. Results formalized so far demonstrate the feasibility of this 'computer mathematics'. Also there are very good applications. The challenge is to make the systems more mathematician-friendly, by building libraries and tools. The eventual goal is to help humans to learn, develop, communicate, referee and apply mathematics.
Methodologies and systems for heterogeneous concurrent computing

NASA Technical Reports Server (NTRS)

Sunderam, V. S.

1994-01-01

Heterogeneous concurrent computing is gaining increasing acceptance as an alternative or complementary paradigm to multiprocessor-based parallel processing as well as to conventional supercomputing. While algorithmic and programming aspects of heterogeneous concurrent computing are similar to their parallel processing counterparts, system issues, partitioning and scheduling, and performance aspects are significantly different. In this paper, we discuss critical design and implementation issues in heterogeneous concurrent computing, and describe techniques for enhancing its effectiveness. In particular, we highlight the system level infrastructures that are required, aspects of parallel algorithm development that most affect performance, system capabilities and limitations, and tools and methodologies for effective computing in heterogeneous networked environments. We also present recent developments and experiences in the context of the PVM system and comment on ongoing and future work.
ICASE Computer Science Program

NASA Technical Reports Server (NTRS)

1985-01-01

The Institute for Computer Applications in Science and Engineering computer science program is discussed in outline form. Information is given on such topics as problem decomposition, algorithm development, programming languages, and parallel architectures.
Design of automata theory of cubical complexes with applications to diagnosis and algorithmic description

NASA Technical Reports Server (NTRS)

Roth, J. P.

1972-01-01

Methods for development of logic design together with algorithms for failure testing, a method for design of logic for ultra-large-scale integration, extension of quantum calculus to describe the functional behavior of a mechanism component-by-component and to computer tests for failures in the mechanism using the diagnosis algorithm, and the development of an algorithm for the multi-output 2-level minimization problem are discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Farrell, Kathryn, E-mail: kfarrell@ices.utexas.edu; Oden, J. Tinsley, E-mail: oden@ices.utexas.edu; Faghihi, Danial, E-mail: danial@ices.utexas.edu

A general adaptive modeling algorithm for selection and validation of coarse-grained models of atomistic systems is presented. A Bayesian framework is developed to address uncertainties in parameters, data, and model selection. Algorithms for computing output sensitivities to parameter variances, model evidence and posterior model plausibilities for given data, and for computing what are referred to as Occam Categories in reference to a rough measure of model simplicity, make up components of the overall approach. Computational results are provided for representative applications.
Local spatio-temporal analysis in vision systems

NASA Astrophysics Data System (ADS)

Geisler, Wilson S.; Bovik, Alan; Cormack, Lawrence; Ghosh, Joydeep; Gildeen, David

1994-07-01

The aims of this project are the following: (1) develop a physiologically and psychophysically based model of low-level human visual processing (a key component of which are local frequency coding mechanisms); (2) develop image models and image-processing methods based upon local frequency coding; (3) develop algorithms for performing certain complex visual tasks based upon local frequency representations, (4) develop models of human performance in certain complex tasks based upon our understanding of low-level processing; and (5) develop a computational testbed for implementing, evaluating and visualizing the proposed models and algorithms, using a massively parallel computer. Progress has been substantial on all aims. The highlights include the following: (1) completion of a number of psychophysical and physiological experiments revealing new, systematic and exciting properties of the primate (human and monkey) visual system; (2) further development of image models that can accurately represent the local frequency structure in complex images; (3) near completion in the construction of the Texas Active Vision Testbed; (4) development and testing of several new computer vision algorithms dealing with shape-from-texture, shape-from-stereo, and depth-from-focus; (5) implementation and evaluation of several new models of human visual performance; and (6) evaluation, purchase and installation of a MasPar parallel computer.
AdaBoost-based algorithm for network intrusion detection.

PubMed

Hu, Weiming; Hu, Wei; Maybank, Steve

2008-04-01

Network intrusion detection aims at distinguishing the attacks on the Internet from normal use of the Internet. It is an indispensable part of the information security system. Due to the variety of network behaviors and the rapid development of attack fashions, it is necessary to develop fast machine-learning-based intrusion detection algorithms with high detection rates and low false-alarm rates. In this correspondence, we propose an intrusion detection algorithm based on the AdaBoost algorithm. In the algorithm, decision stumps are used as weak classifiers. The decision rules are provided for both categorical and continuous features. By combining the weak classifiers for continuous features and the weak classifiers for categorical features into a strong classifier, the relations between these two different types of features are handled naturally, without any forced conversions between continuous and categorical features. Adaptable initial weights and a simple strategy for avoiding overfitting are adopted to improve the performance of the algorithm. Experimental results show that our algorithm has low computational complexity and error rates, as compared with algorithms of higher computational complexity, as tested on the benchmark sample data.
Generic algorithms for high performance scalable geocomputing

NASA Astrophysics Data System (ADS)

de Jong, Kor; Schmitz, Oliver; Karssenberg, Derek

2016-04-01

During the last decade, the characteristics of computing hardware have changed a lot. For example, instead of a single general purpose CPU core, personal computers nowadays contain multiple cores per CPU and often general purpose accelerators, like GPUs. Additionally, compute nodes are often grouped together to form clusters or a supercomputer, providing enormous amounts of compute power. For existing earth simulation models to be able to use modern hardware platforms, their compute intensive parts must be rewritten. This can be a major undertaking and may involve many technical challenges. Compute tasks must be distributed over CPU cores, offloaded to hardware accelerators, or distributed to different compute nodes. And ideally, all of this should be done in such a way that the compute task scales well with the hardware resources. This presents two challenges: 1) how to make good use of all the compute resources and 2) how to make these compute resources available for developers of simulation models, who may not (want to) have the required technical background for distributing compute tasks. The first challenge requires the use of specialized technology (e.g.: threads, OpenMP, MPI, OpenCL, CUDA). The second challenge requires the abstraction of the logic handling the distribution of compute tasks from the model-specific logic, hiding the technical details from the model developer. To assist the model developer, we are developing a C++ software library (called Fern) containing algorithms that can use all CPU cores available in a single compute node (distributing tasks over multiple compute nodes will be done at a later stage). The algorithms are grid-based (finite difference) and include local and spatial operations such as convolution filters. The algorithms handle distribution of the compute tasks to CPU cores internally. In the resulting model the low-level details of how this is done is separated from the model-specific logic representing the modeled system. This contrasts with practices in which code for distributing of compute tasks is mixed with model-specific code, and results in a better maintainable model. For flexibility and efficiency, the algorithms are configurable at compile-time with the respect to the following aspects: data type, value type, no-data handling, input value domain handling, and output value range handling. This makes the algorithms usable in very different contexts, without the need for making intrusive changes to existing models when using them. Applications that benefit from using the Fern library include the construction of forward simulation models in (global) hydrology (e.g. PCR-GLOBWB (Van Beek et al. 2011)), ecology, geomorphology, or land use change (e.g. PLUC (Verstegen et al. 2014)) and manipulation of hyper-resolution land surface data such as digital elevation models and remote sensing data. Using the Fern library, we have also created an add-on to the PCRaster Python Framework (Karssenberg et al. 2010) allowing its users to speed up their spatio-temporal models, sometimes by changing just a single line of Python code in their model. In our presentation we will give an overview of the design of the algorithms, providing examples of different contexts where they can be used to replace existing sequential algorithms, including the PCRaster environmental modeling software (www.pcraster.eu). We will show how the algorithms can be configured to behave differently when necessary. References Karssenberg, D., Schmitz, O., Salamon, P., De Jong, K. and Bierkens, M.F.P., 2010, A software framework for construction of process-based stochastic spatio-temporal models and data assimilation. Environmental Modelling & Software, 25, pp. 489-502, Link. Best Paper Award 2010: Software and Decision Support. Van Beek, L. P. H., Y. Wada, and M. F. P. Bierkens. 2011. Global monthly water stress: 1. Water balance and water availability. Water Resources Research. 47. Verstegen, J. A., D. Karssenberg, F. van der Hilst, and A. P. C. Faaij. 2014. Identifying a land use change cellular automaton by Bayesian data assimilation. Environmental Modelling & Software 53:121-136.
Automatic image analysis and spot classification for detection of pathogenic Escherichia coli on glass slide DNA microarrays

USDA-ARS?s Scientific Manuscript database

A computer algorithm was created to inspect scanned images from DNA microarray slides developed to rapidly detect and genotype E. Coli O157 virulent strains. The algorithm computes centroid locations for signal and background pixels in RGB space and defines a plane perpendicular to the line connect...
Satellite Doppler data processing using a microcomputer

NASA Technical Reports Server (NTRS)

Schmid, P. E.; Lynn, J. J.

1977-01-01

A microcomputer which was developed to compute ground radio beacon position locations using satellite measurements of Doppler frequency shift is described. Both the computational algorithms and the microcomputer hardware incorporating these algorithms were discussed. Results are presented where the microcomputer in conjunction with the NIMBUS-6 random access measurement system provides real time calculation of beacon latitude and longitude.
A Branch-and-Bound Algorithm for Fitting Anti-Robinson Structures to Symmetric Dissimilarity Matrices.

ERIC Educational Resources Information Center

Brusco, Michael J.

2002-01-01

Developed a branch-and-bound algorithm that can be used to seriate a symmetric dissimilarity matrix by identifying a reordering of rows and columns of the matrix optimizing an anti-Robinson criterion. Computational results suggest that with respect to computational efficiency, the approach is generally competitive with dynamic programming. (SLD)
Monkey search algorithm for ECE components partitioning

NASA Astrophysics Data System (ADS)

Kuliev, Elmar; Kureichik, Vladimir; Kureichik, Vladimir, Jr.

2018-05-01

The paper considers one of the important design problems – a partitioning of electronic computer equipment (ECE) components (blocks). It belongs to the NP-hard class of problems and has a combinatorial and logic nature. In the paper, a partitioning problem formulation can be found as a partition of graph into parts. To solve the given problem, the authors suggest using a bioinspired approach based on a monkey search algorithm. Based on the developed software, computational experiments were carried out that show the algorithm efficiency, as well as its recommended settings for obtaining more effective solutions in comparison with a genetic algorithm.
Lower bound on the time complexity of local adiabatic evolution

NASA Astrophysics Data System (ADS)

Chen, Zhenghao; Koh, Pang Wei; Zhao, Yan

2006-11-01

The adiabatic theorem of quantum physics has been, in recent times, utilized in the design of local search quantum algorithms, and has been proven to be equivalent to standard quantum computation, that is, the use of unitary operators [D. Aharonov in Proceedings of the 45th Annual Symposium on the Foundations of Computer Science, 2004, Rome, Italy (IEEE Computer Society Press, New York, 2004), pp. 42-51]. Hence, the study of the time complexity of adiabatic evolution algorithms gives insight into the computational power of quantum algorithms. In this paper, we present two different approaches of evaluating the time complexity for local adiabatic evolution using time-independent parameters, thus providing effective tests (not requiring the evaluation of the entire time-dependent gap function) for the time complexity of newly developed algorithms. We further illustrate our tests by displaying results from the numerical simulation of some problems, viz. specially modified instances of the Hamming weight problem.
Computational Intelligence in Early Diabetes Diagnosis: A Review

PubMed Central

Shankaracharya; Odedra, Devang; Samanta, Subir; Vidyarthi, Ambarish S.

2010-01-01

The development of an effective diabetes diagnosis system by taking advantage of computational intelligence is regarded as a primary goal nowadays. Many approaches based on artificial network and machine learning algorithms have been developed and tested against diabetes datasets, which were mostly related to individuals of Pima Indian origin. Yet, despite high accuracies of up to 99% in predicting the correct diabetes diagnosis, none of these approaches have reached clinical application so far. One reason for this failure may be that diabetologists or clinical investigators are sparsely informed about, or trained in the use of, computational diagnosis tools. Therefore, this article aims at sketching out an outline of the wide range of options, recent developments, and potentials in machine learning algorithms as diabetes diagnosis tools. One focus is on supervised and unsupervised methods, which have made significant impacts in the detection and diagnosis of diabetes at primary and advanced stages. Particular attention is paid to algorithms that show promise in improving diabetes diagnosis. A key advance has been the development of a more in-depth understanding and theoretical analysis of critical issues related to algorithmic construction and learning theory. These include trade-offs for maximizing generalization performance, use of physically realistic constraints, and incorporation of prior knowledge and uncertainty. The review presents and explains the most accurate algorithms, and discusses advantages and pitfalls of methodologies. This should provide a good resource for researchers from all backgrounds interested in computational intelligence-based diabetes diagnosis methods, and allows them to extend their knowledge into this kind of research. PMID:21713313
Computational intelligence in early diabetes diagnosis: a review.

PubMed

Shankaracharya; Odedra, Devang; Samanta, Subir; Vidyarthi, Ambarish S

2010-01-01

The development of an effective diabetes diagnosis system by taking advantage of computational intelligence is regarded as a primary goal nowadays. Many approaches based on artificial network and machine learning algorithms have been developed and tested against diabetes datasets, which were mostly related to individuals of Pima Indian origin. Yet, despite high accuracies of up to 99% in predicting the correct diabetes diagnosis, none of these approaches have reached clinical application so far. One reason for this failure may be that diabetologists or clinical investigators are sparsely informed about, or trained in the use of, computational diagnosis tools. Therefore, this article aims at sketching out an outline of the wide range of options, recent developments, and potentials in machine learning algorithms as diabetes diagnosis tools. One focus is on supervised and unsupervised methods, which have made significant impacts in the detection and diagnosis of diabetes at primary and advanced stages. Particular attention is paid to algorithms that show promise in improving diabetes diagnosis. A key advance has been the development of a more in-depth understanding and theoretical analysis of critical issues related to algorithmic construction and learning theory. These include trade-offs for maximizing generalization performance, use of physically realistic constraints, and incorporation of prior knowledge and uncertainty. The review presents and explains the most accurate algorithms, and discusses advantages and pitfalls of methodologies. This should provide a good resource for researchers from all backgrounds interested in computational intelligence-based diabetes diagnosis methods, and allows them to extend their knowledge into this kind of research.
Aspects of Unstructured Grids and Finite-Volume Solvers for the Euler and Navier-Stokes Equations

NASA Technical Reports Server (NTRS)

Barth, Timothy J.

1992-01-01

One of the major achievements in engineering science has been the development of computer algorithms for solving nonlinear differential equations such as the Navier-Stokes equations. In the past, limited computer resources have motivated the development of efficient numerical schemes in computational fluid dynamics (CFD) utilizing structured meshes. The use of structured meshes greatly simplifies the implementation of CFD algorithms on conventional computers. Unstructured grids on the other hand offer an alternative to modeling complex geometries. Unstructured meshes have irregular connectivity and usually contain combinations of triangles, quadrilaterals, tetrahedra, and hexahedra. The generation and use of unstructured grids poses new challenges in CFD. The purpose of this note is to present recent developments in the unstructured grid generation and flow solution technology.
A return mapping algorithm for isotropic and anisotropic plasticity models using a line search method

DOE PAGES

Scherzinger, William M.

2016-05-01

The numerical integration of constitutive models in computational solid mechanics codes allows for the solution of boundary value problems involving complex material behavior. Metal plasticity models, in particular, have been instrumental in the development of these codes. Here, most plasticity models implemented in computational codes use an isotropic von Mises yield surface. The von Mises, of J 2, yield surface has a simple predictor-corrector algorithm - the radial return algorithm - to integrate the model.
Numerical simulation of steady supersonic flow. [spatial marching

NASA Technical Reports Server (NTRS)

Schiff, L. B.; Steger, J. L.

1981-01-01

A noniterative, implicit, space-marching, finite-difference algorithm was developed for the steady thin-layer Navier-Stokes equations in conservation-law form. The numerical algorithm is applicable to steady supersonic viscous flow over bodies of arbitrary shape. In addition, the same code can be used to compute supersonic inviscid flow or three-dimensional boundary layers. Computed results from two-dimensional and three-dimensional versions of the numerical algorithm are in good agreement with those obtained from more costly time-marching techniques.

Gamma-Weighted Discrete Ordinate Two-Stream Approximation for Computation of Domain Averaged Solar Irradiance

NASA Technical Reports Server (NTRS)

Kato, S.; Smith, G. L.; Barker, H. W.

2001-01-01

An algorithm is developed for the gamma-weighted discrete ordinate two-stream approximation that computes profiles of domain-averaged shortwave irradiances for horizontally inhomogeneous cloudy atmospheres. The algorithm assumes that frequency distributions of cloud optical depth at unresolved scales can be represented by a gamma distribution though it neglects net horizontal transport of radiation. This algorithm is an alternative to the one used in earlier studies that adopted the adding method. At present, only overcast cloudy layers are permitted.
Bioinformatics algorithm based on a parallel implementation of a machine learning approach using transducers

NASA Astrophysics Data System (ADS)

Roche-Lima, Abiel; Thulasiram, Ruppa K.

2012-02-01

Finite automata, in which each transition is augmented with an output label in addition to the familiar input label, are considered finite-state transducers. Transducers have been used to analyze some fundamental issues in bioinformatics. Weighted finite-state transducers have been proposed to pairwise alignments of DNA and protein sequences; as well as to develop kernels for computational biology. Machine learning algorithms for conditional transducers have been implemented and used for DNA sequence analysis. Transducer learning algorithms are based on conditional probability computation. It is calculated by using techniques, such as pair-database creation, normalization (with Maximum-Likelihood normalization) and parameters optimization (with Expectation-Maximization - EM). These techniques are intrinsically costly for computation, even worse when are applied to bioinformatics, because the databases sizes are large. In this work, we describe a parallel implementation of an algorithm to learn conditional transducers using these techniques. The algorithm is oriented to bioinformatics applications, such as alignments, phylogenetic trees, and other genome evolution studies. Indeed, several experiences were developed using the parallel and sequential algorithm on Westgrid (specifically, on the Breeze cluster). As results, we obtain that our parallel algorithm is scalable, because execution times are reduced considerably when the data size parameter is increased. Another experience is developed by changing precision parameter. In this case, we obtain smaller execution times using the parallel algorithm. Finally, number of threads used to execute the parallel algorithm on the Breezy cluster is changed. In this last experience, we obtain as result that speedup is considerably increased when more threads are used; however there is a convergence for number of threads equal to or greater than 16.
An algorithm for calculi segmentation on ureteroscopic images.

PubMed

Rosa, Benoît; Mozer, Pierre; Szewczyk, Jérôme

2011-03-01

The purpose of the study is to develop an algorithm for the segmentation of renal calculi on ureteroscopic images. In fact, renal calculi are common source of urological obstruction, and laser lithotripsy during ureteroscopy is a possible therapy. A laser-based system to sweep the calculus surface and vaporize it was developed to automate a very tedious manual task. The distal tip of the ureteroscope is directed using image guidance, and this operation is not possible without an efficient segmentation of renal calculi on the ureteroscopic images. We proposed and developed a region growing algorithm to segment renal calculi on ureteroscopic images. Using real video images to compute ground truth and compare our segmentation with a reference segmentation, we computed statistics on different image metrics, such as Precision, Recall, and Yasnoff Measure, for comparison with ground truth. The algorithm and its parameters were established for the most likely clinical scenarii. The segmentation results are encouraging: the developed algorithm was able to correctly detect more than 90% of the surface of the calculi, according to an expert observer. Implementation of an algorithm for the segmentation of calculi on ureteroscopic images is feasible. The next step is the integration of our algorithm in the command scheme of a motorized system to build a complete operating prototype.
An efficient parallel algorithm: Poststack and prestack Kirchhoff 3D depth migration using flexi-depth iterations

NASA Astrophysics Data System (ADS)

Rastogi, Richa; Srivastava, Abhishek; Khonde, Kiran; Sirasala, Kirannmayi M.; Londhe, Ashutosh; Chavhan, Hitesh

2015-07-01

This paper presents an efficient parallel 3D Kirchhoff depth migration algorithm suitable for current class of multicore architecture. The fundamental Kirchhoff depth migration algorithm exhibits inherent parallelism however, when it comes to 3D data migration, as the data size increases the resource requirement of the algorithm also increases. This challenges its practical implementation even on current generation high performance computing systems. Therefore a smart parallelization approach is essential to handle 3D data for migration. The most compute intensive part of Kirchhoff depth migration algorithm is the calculation of traveltime tables due to its resource requirements such as memory/storage and I/O. In the current research work, we target this area and develop a competent parallel algorithm for post and prestack 3D Kirchhoff depth migration, using hybrid MPI+OpenMP programming techniques. We introduce a concept of flexi-depth iterations while depth migrating data in parallel imaging space, using optimized traveltime table computations. This concept provides flexibility to the algorithm by migrating data in a number of depth iterations, which depends upon the available node memory and the size of data to be migrated during runtime. Furthermore, it minimizes the requirements of storage, I/O and inter-node communication, thus making it advantageous over the conventional parallelization approaches. The developed parallel algorithm is demonstrated and analysed on Yuva II, a PARAM series of supercomputers. Optimization, performance and scalability experiment results along with the migration outcome show the effectiveness of the parallel algorithm.
Computer sciences

NASA Technical Reports Server (NTRS)

Smith, Paul H.

1988-01-01

The Computer Science Program provides advanced concepts, techniques, system architectures, algorithms, and software for both space and aeronautics information sciences and computer systems. The overall goal is to provide the technical foundation within NASA for the advancement of computing technology in aerospace applications. The research program is improving the state of knowledge of fundamental aerospace computing principles and advancing computing technology in space applications such as software engineering and information extraction from data collected by scientific instruments in space. The program includes the development of special algorithms and techniques to exploit the computing power provided by high performance parallel processors and special purpose architectures. Research is being conducted in the fundamentals of data base logic and improvement techniques for producing reliable computing systems.
Computational algorithms for simulations in atmospheric optics.

PubMed

Konyaev, P A; Lukin, V P

2016-04-20

A computer simulation technique for atmospheric and adaptive optics based on parallel programing is discussed. A parallel propagation algorithm is designed and a modified spectral-phase method for computer generation of 2D time-variant random fields is developed. Temporal power spectra of Laguerre-Gaussian beam fluctuations are considered as an example to illustrate the applications discussed. Implementation of the proposed algorithms using Intel MKL and IPP libraries and NVIDIA CUDA technology is shown to be very fast and accurate. The hardware system for the computer simulation is an off-the-shelf desktop with an Intel Core i7-4790K CPU operating at a turbo-speed frequency up to 5 GHz and an NVIDIA GeForce GTX-960 graphics accelerator with 1024 1.5 GHz processors.
Advances in Artificial Neural Networks - Methodological Development and Application

USDA-ARS?s Scientific Manuscript database

Artificial neural networks as a major soft-computing technology have been extensively studied and applied during the last three decades. Research on backpropagation training algorithms for multilayer perceptron networks has spurred development of other neural network training algorithms for other ne...
A conservative finite difference algorithm for the unsteady transonic potential equation in generalized coordinates

NASA Technical Reports Server (NTRS)

Bridgeman, J. O.; Steger, J. L.; Caradonna, F. X.

1982-01-01

An implicit, approximate-factorization, finite-difference algorithm has been developed for the computation of unsteady, inviscid transonic flows in two and three dimensions. The computer program solves the full-potential equation in generalized coordinates in conservation-law form in order to properly capture shock-wave position and speed. A body-fitted coordinate system is employed for the simple and accurate treatment of boundary conditions on the body surface. The time-accurate algorithm is modified to a conventional ADI relaxation scheme for steady-state computations. Results from two- and three-dimensional steady and two-dimensional unsteady calculations are compared with existing methods.
The computation of pi to 29,360,000 decimal digits using Borweins' quartically convergent algorithm

NASA Technical Reports Server (NTRS)

Bailey, David H.

1988-01-01

The quartically convergent numerical algorithm developed by Borwein and Borwein (1987) for 1/pi is implemented via a prime-modulus-transform multiprecision technique on the NASA Ames Cray-2 supercomputer to compute the first 2.936 x 10 to the 7th digits of the decimal expansion of pi. The history of pi computations is briefly recalled; the most recent algorithms are characterized; the implementation procedures are described; and samples of the output listing are presented. Statistical analyses show that the present decimal expansion is completely random, with only acceptable numbers of long repeating strings and single-digit runs.
CREKID: A computer code for transient, gas-phase combustion of kinetics

NASA Technical Reports Server (NTRS)

Pratt, D. T.; Radhakrishnan, K.

1984-01-01

A new algorithm was developed for fast, automatic integration of chemical kinetic rate equations describing homogeneous, gas-phase combustion at constant pressure. Particular attention is paid to the distinguishing physical and computational characteristics of the induction, heat-release and equilibration regimes. The two-part predictor-corrector algorithm, based on an exponentially-fitted trapezoidal rule, includes filtering of ill-posed initial conditions, automatic selection of Newton-Jacobi or Newton iteration for convergence to achieve maximum computational efficiency while observing a prescribed error tolerance. The new algorithm was found to compare favorably with LSODE on two representative test problems drawn from combustion kinetics.
Field-Programmable Gate Array Computer in Structural Analysis: An Initial Exploration

NASA Technical Reports Server (NTRS)

Singleterry, Robert C., Jr.; Sobieszczanski-Sobieski, Jaroslaw; Brown, Samuel

2002-01-01

This paper reports on an initial assessment of using a Field-Programmable Gate Array (FPGA) computational device as a new tool for solving structural mechanics problems. A FPGA is an assemblage of binary gates arranged in logical blocks that are interconnected via software in a manner dependent on the algorithm being implemented and can be reprogrammed thousands of times per second. In effect, this creates a computer specialized for the problem that automatically exploits all the potential for parallel computing intrinsic in an algorithm. This inherent parallelism is the most important feature of the FPGA computational environment. It is therefore important that if a problem offers a choice of different solution algorithms, an algorithm of a higher degree of inherent parallelism should be selected. It is found that in structural analysis, an 'analog computer' style of programming, which solves problems by direct simulation of the terms in the governing differential equations, yields a more favorable solution algorithm than current solution methods. This style of programming is facilitated by a 'drag-and-drop' graphic programming language that is supplied with the particular type of FPGA computer reported in this paper. Simple examples in structural dynamics and statics illustrate the solution approach used. The FPGA system also allows linear scalability in computing capability. As the problem grows, the number of FPGA chips can be increased with no loss of computing efficiency due to data flow or algorithmic latency that occurs when a single problem is distributed among many conventional processors that operate in parallel. This initial assessment finds the FPGA hardware and software to be in their infancy in regard to the user conveniences; however, they have enormous potential for shrinking the elapsed time of structural analysis solutions if programmed with algorithms that exhibit inherent parallelism and linear scalability. This potential warrants further development of FPGA-tailored algorithms for structural analysis.
From evolutionary computation to the evolution of things.

PubMed

Eiben, Agoston E; Smith, Jim

2015-05-28

Evolution has provided a source of inspiration for algorithm designers since the birth of computers. The resulting field, evolutionary computation, has been successful in solving engineering tasks ranging in outlook from the molecular to the astronomical. Today, the field is entering a new phase as evolutionary algorithms that take place in hardware are developed, opening up new avenues towards autonomous machines that can adapt to their environment. We discuss how evolutionary computation compares with natural evolution and what its benefits are relative to other computing approaches, and we introduce the emerging area of artificial evolution in physical systems.
Evolving Non-Dominated Parameter Sets for Computational Models from Multiple Experiments

NASA Astrophysics Data System (ADS)

Lane, Peter C. R.; Gobet, Fernand

2013-03-01

Creating robust, reproducible and optimal computational models is a key challenge for theorists in many sciences. Psychology and cognitive science face particular challenges as large amounts of data are collected and many models are not amenable to analytical techniques for calculating parameter sets. Particular problems are to locate the full range of acceptable model parameters for a given dataset, and to confirm the consistency of model parameters across different datasets. Resolving these problems will provide a better understanding of the behaviour of computational models, and so support the development of general and robust models. In this article, we address these problems using evolutionary algorithms to develop parameters for computational models against multiple sets of experimental data; in particular, we propose the `speciated non-dominated sorting genetic algorithm' for evolving models in several theories. We discuss the problem of developing a model of categorisation using twenty-nine sets of data and models drawn from four different theories. We find that the evolutionary algorithms generate high quality models, adapted to provide a good fit to all available data.
Parallel Optimization of Polynomials for Large-scale Problems in Stability and Control

NASA Astrophysics Data System (ADS)

Kamyar, Reza

In this thesis, we focus on some of the NP-hard problems in control theory. Thanks to the converse Lyapunov theory, these problems can often be modeled as optimization over polynomials. To avoid the problem of intractability, we establish a trade off between accuracy and complexity. In particular, we develop a sequence of tractable optimization problems --- in the form of Linear Programs (LPs) and/or Semi-Definite Programs (SDPs) --- whose solutions converge to the exact solution of the NP-hard problem. However, the computational and memory complexity of these LPs and SDPs grow exponentially with the progress of the sequence - meaning that improving the accuracy of the solutions requires solving SDPs with tens of thousands of decision variables and constraints. Setting up and solving such problems is a significant challenge. The existing optimization algorithms and software are only designed to use desktop computers or small cluster computers --- machines which do not have sufficient memory for solving such large SDPs. Moreover, the speed-up of these algorithms does not scale beyond dozens of processors. This in fact is the reason we seek parallel algorithms for setting-up and solving large SDPs on large cluster- and/or super-computers. We propose parallel algorithms for stability analysis of two classes of systems: 1) Linear systems with a large number of uncertain parameters; 2) Nonlinear systems defined by polynomial vector fields. First, we develop a distributed parallel algorithm which applies Polya's and/or Handelman's theorems to some variants of parameter-dependent Lyapunov inequalities with parameters defined over the standard simplex. The result is a sequence of SDPs which possess a block-diagonal structure. We then develop a parallel SDP solver which exploits this structure in order to map the computation, memory and communication to a distributed parallel environment. Numerical tests on a supercomputer demonstrate the ability of the algorithm to efficiently utilize hundreds and potentially thousands of processors, and analyze systems with 100+ dimensional state-space. Furthermore, we extend our algorithms to analyze robust stability over more complicated geometries such as hypercubes and arbitrary convex polytopes. Our algorithms can be readily extended to address a wide variety of problems in control such as Hinfinity synthesis for systems with parametric uncertainty and computing control Lyapunov functions.
Quantitative Imaging Biomarkers: A Review of Statistical Methods for Computer Algorithm Comparisons

PubMed Central

2014-01-01

Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research. PMID:24919829
A POSTERIORI ERROR ANALYSIS OF TWO STAGE COMPUTATION METHODS WITH APPLICATION TO EFFICIENT DISCRETIZATION AND THE PARAREAL ALGORITHM.

PubMed

Chaudhry, Jehanzeb Hameed; Estep, Don; Tavener, Simon; Carey, Varis; Sandelin, Jeff

2016-01-01

We consider numerical methods for initial value problems that employ a two stage approach consisting of solution on a relatively coarse discretization followed by solution on a relatively fine discretization. Examples include adaptive error control, parallel-in-time solution schemes, and efficient solution of adjoint problems for computing a posteriori error estimates. We describe a general formulation of two stage computations then perform a general a posteriori error analysis based on computable residuals and solution of an adjoint problem. The analysis accommodates various variations in the two stage computation and in formulation of the adjoint problems. We apply the analysis to compute "dual-weighted" a posteriori error estimates, to develop novel algorithms for efficient solution that take into account cancellation of error, and to the Parareal Algorithm. We test the various results using several numerical examples.
Light reflection models for computer graphics.

PubMed

Greenberg, D P

1989-04-14

During the past 20 years, computer graphic techniques for simulating the reflection of light have progressed so that today images of photorealistic quality can be produced. Early algorithms considered direct lighting only, but global illumination phenomena with indirect lighting, surface interreflections, and shadows can now be modeled with ray tracing, radiosity, and Monte Carlo simulations. This article describes the historical development of computer graphic algorithms for light reflection and pictorially illustrates what will be commonly available in the near future.
Algorithms and Libraries

NASA Technical Reports Server (NTRS)

Dongarra, Jack

1998-01-01

This exploratory study initiated our inquiry into algorithms and applications that would benefit by latency tolerant approach to algorithm building, including the construction of new algorithms where appropriate. In a multithreaded execution, when a processor reaches a point where remote memory access is necessary, the request is sent out on the network and a context--switch occurs to a new thread of computation. This effectively masks a long and unpredictable latency due to remote loads, thereby providing tolerance to remote access latency. We began to develop standards to profile various algorithm and application parameters, such as the degree of parallelism, granularity, precision, instruction set mix, interprocessor communication, latency etc. These tools will continue to develop and evolve as the Information Power Grid environment matures. To provide a richer context for this research, the project also focused on issues of fault-tolerance and computation migration of numerical algorithms and software. During the initial phase we tried to increase our understanding of the bottlenecks in single processor performance. Our work began by developing an approach for the automatic generation and optimization of numerical software for processors with deep memory hierarchies and pipelined functional units. Based on the results we achieved in this study we are planning to study other architectures of interest, including development of cost models, and developing code generators appropriate to these architectures.
al3c: high-performance software for parameter inference using Approximate Bayesian Computation.

PubMed

Stram, Alexander H; Marjoram, Paul; Chen, Gary K

2015-11-01

The development of Approximate Bayesian Computation (ABC) algorithms for parameter inference which are both computationally efficient and scalable in parallel computing environments is an important area of research. Monte Carlo rejection sampling, a fundamental component of ABC algorithms, is trivial to distribute over multiple processors but is inherently inefficient. While development of algorithms such as ABC Sequential Monte Carlo (ABC-SMC) help address the inherent inefficiencies of rejection sampling, such approaches are not as easily scaled on multiple processors. As a result, current Bayesian inference software offerings that use ABC-SMC lack the ability to scale in parallel computing environments. We present al3c, a C++ framework for implementing ABC-SMC in parallel. By requiring only that users define essential functions such as the simulation model and prior distribution function, al3c abstracts the user from both the complexities of parallel programming and the details of the ABC-SMC algorithm. By using the al3c framework, the user is able to scale the ABC-SMC algorithm in parallel computing environments for his or her specific application, with minimal programming overhead. al3c is offered as a static binary for Linux and OS-X computing environments. The user completes an XML configuration file and C++ plug-in template for the specific application, which are used by al3c to obtain the desired results. Users can download the static binaries, source code, reference documentation and examples (including those in this article) by visiting https://github.com/ahstram/al3c. astram@usc.edu Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Assessment of the information content of patterns: an algorithm

NASA Astrophysics Data System (ADS)

Daemi, M. Farhang; Beurle, R. L.

1991-12-01

A preliminary investigation confirmed the possibility of assessing the translational and rotational information content of simple artificial images. The calculation is tedious, and for more realistic patterns it is essential to implement the method on a computer. This paper describes an algorithm developed for this purpose which confirms the results of the preliminary investigation. Use of the algorithm facilitates much more comprehensive analysis of the combined effect of continuous rotation and fine translation, and paves the way for analysis of more realistic patterns. Owing to the volume of calculation involved in these algorithms, extensive computing facilities were necessary. The major part of the work was carried out using an ICL 3900 series mainframe computer as well as other powerful workstations such as a RISC architecture MIPS machine.

Tomography by iterative convolution - Empirical study and application to interferometry

NASA Technical Reports Server (NTRS)

Vest, C. M.; Prikryl, I.

1984-01-01

An algorithm for computer tomography has been developed that is applicable to reconstruction from data having incomplete projections because an opaque object blocks some of the probing radiation as it passes through the object field. The algorithm is based on iteration between the object domain and the projection (Radon transform) domain. Reconstructions are computed during each iteration by the well-known convolution method. Although it is demonstrated that this algorithm does not converge, an empirically justified criterion for terminating the iteration when the most accurate estimate has been computed is presented. The algorithm has been studied by using it to reconstruct several different object fields with several different opaque regions. It also has been used to reconstruct aerodynamic density fields from interferometric data recorded in wind tunnel tests.
Evaluation of Semantic Web Technologies for Storing Computable Definitions of Electronic Health Records Phenotyping Algorithms.

PubMed

Papež, Václav; Denaxas, Spiros; Hemingway, Harry

2017-01-01

Electronic Health Records are electronic data generated during or as a byproduct of routine patient care. Structured, semi-structured and unstructured EHR offer researchers unprecedented phenotypic breadth and depth and have the potential to accelerate the development of precision medicine approaches at scale. A main EHR use-case is defining phenotyping algorithms that identify disease status, onset and severity. Phenotyping algorithms utilize diagnoses, prescriptions, laboratory tests, symptoms and other elements in order to identify patients with or without a specific trait. No common standardized, structured, computable format exists for storing phenotyping algorithms. The majority of algorithms are stored as human-readable descriptive text documents making their translation to code challenging due to their inherent complexity and hinders their sharing and re-use across the community. In this paper, we evaluate the two key Semantic Web Technologies, the Web Ontology Language and the Resource Description Framework, for enabling computable representations of EHR-driven phenotyping algorithms.
A fast Fourier transform on multipoles (FFTM) algorithm for solving Helmholtz equation in acoustics analysis.

PubMed

Ong, Eng Teo; Lee, Heow Pueh; Lim, Kian Meng

2004-09-01

This article presents a fast algorithm for the efficient solution of the Helmholtz equation. The method is based on the translation theory of the multipole expansions. Here, the speedup comes from the convolution nature of the translation operators, which can be evaluated rapidly using fast Fourier transform algorithms. Also, the computations of the translation operators are accelerated by using the recursive formulas developed recently by Gumerov and Duraiswami [SIAM J. Sci. Comput. 25, 1344-1381(2003)]. It is demonstrated that the algorithm can produce good accuracy with a relatively low order of expansion. Efficiency analyses of the algorithm reveal that it has computational complexities of O(Na), where a ranges from 1.05 to 1.24. However, this method requires substantially more memory to store the translation operators as compared to the fast multipole method. Hence, despite its simplicity in implementation, this memory requirement issue may limit the application of this algorithm to solving very large-scale problems.
A survey of SAT solver

NASA Astrophysics Data System (ADS)

Gong, Weiwei; Zhou, Xu

2017-06-01

In Computer Science, the Boolean Satisfiability Problem(SAT) is the problem of determining if there exists an interpretation that satisfies a given Boolean formula. SAT is one of the first problems that was proven to be NP-complete, which is also fundamental to artificial intelligence, algorithm and hardware design. This paper reviews the main algorithms of the SAT solver in recent years, including serial SAT algorithms, parallel SAT algorithms, SAT algorithms based on GPU, and SAT algorithms based on FPGA. The development of SAT is analyzed comprehensively in this paper. Finally, several possible directions for the development of the SAT problem are proposed.
Coordinate Systems, Numerical Objects and Algorithmic Operations of Computational Experiment in Fluid Mechanics

NASA Astrophysics Data System (ADS)

Degtyarev, Alexander; Khramushin, Vasily

2016-02-01

The paper deals with the computer implementation of direct computational experiments in fluid mechanics, constructed on the basis of the approach developed by the authors. The proposed approach allows the use of explicit numerical scheme, which is an important condition for increasing the effciency of the algorithms developed by numerical procedures with natural parallelism. The paper examines the main objects and operations that let you manage computational experiments and monitor the status of the computation process. Special attention is given to a) realization of tensor representations of numerical schemes for direct simulation; b) realization of representation of large particles of a continuous medium motion in two coordinate systems (global and mobile); c) computing operations in the projections of coordinate systems, direct and inverse transformation in these systems. Particular attention is paid to the use of hardware and software of modern computer systems.
A Comparative Analysis of Community Detection Algorithms on Artificial Networks

PubMed Central

Yang, Zhao; Algesheimer, René; Tessone, Claudio J.

2016-01-01

Many community detection algorithms have been developed to uncover the mesoscopic properties of complex networks. However how good an algorithm is, in terms of accuracy and computing time, remains still open. Testing algorithms on real-world network has certain restrictions which made their insights potentially biased: the networks are usually small, and the underlying communities are not defined objectively. In this study, we employ the Lancichinetti-Fortunato-Radicchi benchmark graph to test eight state-of-the-art algorithms. We quantify the accuracy using complementary measures and algorithms’ computing time. Based on simple network properties and the aforementioned results, we provide guidelines that help to choose the most adequate community detection algorithm for a given network. Moreover, these rules allow uncovering limitations in the use of specific algorithms given macroscopic network properties. Our contribution is threefold: firstly, we provide actual techniques to determine which is the most suited algorithm in most circumstances based on observable properties of the network under consideration. Secondly, we use the mixing parameter as an easily measurable indicator of finding the ranges of reliability of the different algorithms. Finally, we study the dependency with network size focusing on both the algorithm’s predicting power and the effective computing time. PMID:27476470
Desiderata for computable representations of electronic health records-driven phenotype algorithms

PubMed Central

Mo, Huan; Thompson, William K; Rasmussen, Luke V; Pacheco, Jennifer A; Jiang, Guoqian; Kiefer, Richard; Zhu, Qian; Xu, Jie; Montague, Enid; Carrell, David S; Lingren, Todd; Mentch, Frank D; Ni, Yizhao; Wehbe, Firas H; Peissig, Peggy L; Tromp, Gerard; Larson, Eric B; Chute, Christopher G; Pathak, Jyotishman; Speltz, Peter; Kho, Abel N; Jarvik, Gail P; Bejan, Cosmin A; Williams, Marc S; Borthwick, Kenneth; Kitchner, Terrie E; Roden, Dan M; Harris, Paul A

2015-01-01

Background Electronic health records (EHRs) are increasingly used for clinical and translational research through the creation of phenotype algorithms. Currently, phenotype algorithms are most commonly represented as noncomputable descriptive documents and knowledge artifacts that detail the protocols for querying diagnoses, symptoms, procedures, medications, and/or text-driven medical concepts, and are primarily meant for human comprehension. We present desiderata for developing a computable phenotype representation model (PheRM). Methods A team of clinicians and informaticians reviewed common features for multisite phenotype algorithms published in PheKB.org and existing phenotype representation platforms. We also evaluated well-known diagnostic criteria and clinical decision-making guidelines to encompass a broader category of algorithms. Results We propose 10 desired characteristics for a flexible, computable PheRM: (1) structure clinical data into queryable forms; (2) recommend use of a common data model, but also support customization for the variability and availability of EHR data among sites; (3) support both human-readable and computable representations of phenotype algorithms; (4) implement set operations and relational algebra for modeling phenotype algorithms; (5) represent phenotype criteria with structured rules; (6) support defining temporal relations between events; (7) use standardized terminologies and ontologies, and facilitate reuse of value sets; (8) define representations for text searching and natural language processing; (9) provide interfaces for external software algorithms; and (10) maintain backward compatibility. Conclusion A computable PheRM is needed for true phenotype portability and reliability across different EHR products and healthcare systems. These desiderata are a guide to inform the establishment and evolution of EHR phenotype algorithm authoring platforms and languages. PMID:26342218
Air traffic surveillance and control using hybrid estimation and protocol-based conflict resolution

NASA Astrophysics Data System (ADS)

Hwang, Inseok

The continued growth of air travel and recent advances in new technologies for navigation, surveillance, and communication have led to proposals by the Federal Aviation Administration (FAA) to provide reliable and efficient tools to aid Air Traffic Control (ATC) in performing their tasks. In this dissertation, we address four problems frequently encountered in air traffic surveillance and control; multiple target tracking and identity management, conflict detection, conflict resolution, and safety verification. We develop a set of algorithms and tools to aid ATC; These algorithms have the provable properties of safety, computational efficiency, and convergence. Firstly, we develop a multiple-maneuvering-target tracking and identity management algorithm which can keep track of maneuvering aircraft in noisy environments and of their identities. Secondly, we propose a hybrid probabilistic conflict detection algorithm between multiple aircraft which uses flight mode estimates as well as aircraft current state estimates. Our algorithm is based on hybrid models of aircraft, which incorporate both continuous dynamics and discrete mode switching. Thirdly, we develop an algorithm for multiple (greater than two) aircraft conflict avoidance that is based on a closed-form analytic solution and thus provides guarantees of safety. Finally, we consider the problem of safety verification of control laws for safety critical systems, with application to air traffic control systems. We approach safety verification through reachability analysis, which is a computationally expensive problem. We develop an over-approximate method for reachable set computation using polytopic approximation methods and dynamic optimization. These algorithms may be used either in a fully autonomous way, or as supporting tools to increase controllers' situational awareness and to reduce their work load.
Computational Proficiency and Algorithmic Learning of Sixth-Graders Using DMP Materials: A Formative Evaluation. Technical Report No. 401.

ERIC Educational Resources Information Center

Wiles, Clyde

Two questions were investigated in this study: (1) How did the computational proficiency of sixth graders who had one year's experience with Developing Mathematical Processes (DMP) materials compare with an equivalent group of students who used the usual textbook program; and (2) What occurs when sixth graders study algorithms as sequences of rule…
Research in Parallel Algorithms and Software for Computational Aerosciences

DOT National Transportation Integrated Search

1996-04-01

Phase I is complete for the development of a Computational Fluid Dynamics : with automatic grid generation and adaptation for the Euler : analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian : grid code developed at Lockheed...
Optimal design of low-density SNP arrays for genomic prediction: algorithm and applications

USDA-ARS?s Scientific Manuscript database

Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for their optimal design. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optim...
SIMULATION OF DISPERSION OF A POWER PLANT PLUME USING AN ADAPTIVE GRID ALGORITHM

EPA Science Inventory

A new dynamic adaptive grid algorithm has been developed for use in air quality modeling. This algorithm uses a higher order numerical scheme?the piecewise parabolic method (PPM)?for computing advective solution fields; a weight function capable of promoting grid node clustering ...
Efficient Pricing Technique for Resource Allocation Problem in Downlink OFDM Cognitive Radio Networks

NASA Astrophysics Data System (ADS)

Abdulghafoor, O. B.; Shaat, M. M. R.; Ismail, M.; Nordin, R.; Yuwono, T.; Alwahedy, O. N. A.

2017-05-01

In this paper, the problem of resource allocation in OFDM-based downlink cognitive radio (CR) networks has been proposed. The purpose of this research is to decrease the computational complexity of the resource allocation algorithm for downlink CR network while concerning the interference constraint of primary network. The objective has been secured by adopting pricing scheme to develop power allocation algorithm with the following concerns: (i) reducing the complexity of the proposed algorithm and (ii) providing firm power control to the interference introduced to primary users (PUs). The performance of the proposed algorithm is tested for OFDM- CRNs. The simulation results show that the performance of the proposed algorithm approached the performance of the optimal algorithm at a lower computational complexity, i.e., O(NlogN), which makes the proposed algorithm suitable for more practical applications.
Active control of impulsive noise with symmetric α-stable distribution based on an improved step-size normalized adaptive algorithm

NASA Astrophysics Data System (ADS)

Zhou, Yali; Zhang, Qizhi; Yin, Yixin

2015-05-01

In this paper, active control of impulsive noise with symmetric α-stable (SαS) distribution is studied. A general step-size normalized filtered-x Least Mean Square (FxLMS) algorithm is developed based on the analysis of existing algorithms, and the Gaussian distribution function is used to normalize the step size. Compared with existing algorithms, the proposed algorithm needs neither the parameter selection and thresholds estimation nor the process of cost function selection and complex gradient computation. Computer simulations have been carried out to suggest that the proposed algorithm is effective for attenuating SαS impulsive noise, and then the proposed algorithm has been implemented in an experimental ANC system. Experimental results show that the proposed scheme has good performance for SαS impulsive noise attenuation.
An accurate and computationally efficient algorithm for ground peak identification in large footprint waveform LiDAR data

NASA Astrophysics Data System (ADS)

Zhuang, Wei; Mountrakis, Giorgos

2014-09-01

Large footprint waveform LiDAR sensors have been widely used for numerous airborne studies. Ground peak identification in a large footprint waveform is a significant bottleneck in exploring full usage of the waveform datasets. In the current study, an accurate and computationally efficient algorithm was developed for ground peak identification, called Filtering and Clustering Algorithm (FICA). The method was evaluated on Land, Vegetation, and Ice Sensor (LVIS) waveform datasets acquired over Central NY. FICA incorporates a set of multi-scale second derivative filters and a k-means clustering algorithm in order to avoid detecting false ground peaks. FICA was tested in five different land cover types (deciduous trees, coniferous trees, shrub, grass and developed area) and showed more accurate results when compared to existing algorithms. More specifically, compared with Gaussian decomposition, the RMSE ground peak identification by FICA was 2.82 m (5.29 m for GD) in deciduous plots, 3.25 m (4.57 m for GD) in coniferous plots, 2.63 m (2.83 m for GD) in shrub plots, 0.82 m (0.93 m for GD) in grass plots, and 0.70 m (0.51 m for GD) in plots of developed areas. FICA performance was also relatively consistent under various slope and canopy coverage (CC) conditions. In addition, FICA showed better computational efficiency compared to existing methods. FICA's major computational and accuracy advantage is a result of the adopted multi-scale signal processing procedures that concentrate on local portions of the signal as opposed to the Gaussian decomposition that uses a curve-fitting strategy applied in the entire signal. The FICA algorithm is a good candidate for large-scale implementation on future space-borne waveform LiDAR sensors.
A different Deutsch-Jozsa

NASA Astrophysics Data System (ADS)

Bera, Debajyoti

2015-06-01

One of the early achievements of quantum computing was demonstrated by Deutsch and Jozsa (Proc R Soc Lond A Math Phys Sci 439(1907):553, 1992) regarding classification of a particular type of Boolean functions. Their solution demonstrated an exponential speedup compared to classical approaches to the same problem; however, their solution was the only known quantum algorithm for that specific problem so far. This paper demonstrates another quantum algorithm for the same problem, with the same exponential advantage compared to classical algorithms. The novelty of this algorithm is the use of quantum amplitude amplification, a technique that is the key component of another celebrated quantum algorithm developed by Grover (Proceedings of the twenty-eighth annual ACM symposium on theory of computing, ACM Press, New York, 1996). A lower bound for randomized (classical) algorithms is also presented which establishes a sound gap between the effectiveness of our quantum algorithm and that of any randomized algorithm with similar efficiency.
Workflow of the Grover algorithm simulation incorporating CUDA and GPGPU

NASA Astrophysics Data System (ADS)

Lu, Xiangwen; Yuan, Jiabin; Zhang, Weiwei

2013-09-01

The Grover quantum search algorithm, one of only a few representative quantum algorithms, can speed up many classical algorithms that use search heuristics. No true quantum computer has yet been developed. For the present, simulation is one effective means of verifying the search algorithm. In this work, we focus on the simulation workflow using a compute unified device architecture (CUDA). Two simulation workflow schemes are proposed. These schemes combine the characteristics of the Grover algorithm and the parallelism of general-purpose computing on graphics processing units (GPGPU). We also analyzed the optimization of memory space and memory access from this perspective. We implemented four programs on CUDA to evaluate the performance of schemes and optimization. Through experimentation, we analyzed the organization of threads suited to Grover algorithm simulations, compared the storage costs of the four programs, and validated the effectiveness of optimization. Experimental results also showed that the distinguished program on CUDA outperformed the serial program of libquantum on a CPU with a speedup of up to 23 times (12 times on average), depending on the scale of the simulation.
The Algorithms of Euclid and Jacobi

ERIC Educational Resources Information Center

Johnson, R. W.; Waterman, M. S.

1976-01-01

In a thesis written for the Doctor of Arts in Mathematics, the connection between Euclid's algorithm and continued fractions is developed and extended to n dimensions. Applications to computer sciences are noted. (SD)
Grid generation in three dimensions by Poisson equations with control of cell size and skewness at boundary surfaces

NASA Technical Reports Server (NTRS)

Sorenson, R. L.; Steger, J. L.

1983-01-01

An algorithm for generating computational grids about arbitrary three-dimensional bodies is developed. The elliptic partial differential equation (PDE) approach developed by Steger and Sorenson and used in the NASA computer program GRAPE is extended from two to three dimensions. Forcing functions which are found automatically by the algorithm give the user the ability to control mesh cell size and skewness at boundary surfaces. This algorithm, as is typical of PDE grid generators, gives smooth grid lines and spacing in the interior of the grid. The method is applied to a rectilinear wind-tunnel case and to two body shapes in spherical coordinates.
Logic circuits based on molecular spider systems.

PubMed

Mo, Dandan; Lakin, Matthew R; Stefanovic, Darko

2016-08-01

Spatial locality brings the advantages of computation speed-up and sequence reuse to molecular computing. In particular, molecular walkers that undergo localized reactions are of interest for implementing logic computations at the nanoscale. We use molecular spider walkers to implement logic circuits. We develop an extended multi-spider model with a dynamic environment wherein signal transmission is triggered via localized reactions, and use this model to implement three basic gates (AND, OR, NOT) and a cascading mechanism. We develop an algorithm to automatically generate the layout of the circuit. We use a kinetic Monte Carlo algorithm to simulate circuit computations, and we analyze circuit complexity: our design scales linearly with formula size and has a logarithmic time complexity. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

EUROPLANET-RI modelling service for the planetary science community: European Modelling and Data Analysis Facility (EMDAF)

NASA Astrophysics Data System (ADS)

Khodachenko, Maxim; Miller, Steven; Stoeckler, Robert; Topf, Florian

2010-05-01

Computational modeling and observational data analysis are two major aspects of the modern scientific research. Both appear nowadays under extensive development and application. Many of the scientific goals of planetary space missions require robust models of planetary objects and environments as well as efficient data analysis algorithms, to predict conditions for mission planning and to interpret the experimental data. Europe has great strength in these areas, but it is insufficiently coordinated; individual groups, models, techniques and algorithms need to be coupled and integrated. Existing level of scientific cooperation and the technical capabilities for operative communication, allow considerable progress in the development of a distributed international Research Infrastructure (RI) which is based on the existing in Europe computational modelling and data analysis centers, providing the scientific community with dedicated services in the fields of their computational and data analysis expertise. These services will appear as a product of the collaborative communication and joint research efforts of the numerical and data analysis experts together with planetary scientists. The major goal of the EUROPLANET-RI / EMDAF is to make computational models and data analysis algorithms associated with particular national RIs and teams, as well as their outputs, more readily available to their potential user community and more tailored to scientific user requirements, without compromising front-line specialized research on model and data analysis algorithms development and software implementation. This objective will be met through four keys subdivisions/tasks of EMAF: 1) an Interactive Catalogue of Planetary Models; 2) a Distributed Planetary Modelling Laboratory; 3) a Distributed Data Analysis Laboratory, and 4) enabling Models and Routines for High Performance Computing Grids. Using the advantages of the coordinated operation and efficient communication between the involved computational modelling, research and data analysis expert teams and their related research infrastructures, EMDAF will provide a 1) flexible, 2) scientific user oriented, 3) continuously developing and fast upgrading computational and data analysis service to support and intensify the European planetary scientific research. At the beginning EMDAF will create a set of demonstrators and operational tests of this service in key areas of European planetary science. This work will aim at the following objectives: (a) Development and implementation of tools for distant interactive communication between the planetary scientists and computing experts (including related RIs); (b) Development of standard routine packages, and user-friendly interfaces for operation of the existing numerical codes and data analysis algorithms by the specialized planetary scientists; (c) Development of a prototype of numerical modelling services "on demand" for space missions and planetary researchers; (d) Development of a prototype of data analysis services "on demand" for space missions and planetary researchers; (e) Development of a prototype of coordinated interconnected simulations of planetary phenomena and objects (global multi-model simulators); (f) Providing the demonstrators of a coordinated use of high performance computing facilities (super-computer networks), done in cooperation with European HPC Grid DEISA.
A supportive architecture for CFD-based design optimisation

NASA Astrophysics Data System (ADS)

Li, Ni; Su, Zeya; Bi, Zhuming; Tian, Chao; Ren, Zhiming; Gong, Guanghong

2014-03-01

Multi-disciplinary design optimisation (MDO) is one of critical methodologies to the implementation of enterprise systems (ES). MDO requiring the analysis of fluid dynamics raises a special challenge due to its extremely intensive computation. The rapid development of computational fluid dynamic (CFD) technique has caused a rise of its applications in various fields. Especially for the exterior designs of vehicles, CFD has become one of the three main design tools comparable to analytical approaches and wind tunnel experiments. CFD-based design optimisation is an effective way to achieve the desired performance under the given constraints. However, due to the complexity of CFD, integrating with CFD analysis in an intelligent optimisation algorithm is not straightforward. It is a challenge to solve a CFD-based design problem, which is usually with high dimensions, and multiple objectives and constraints. It is desirable to have an integrated architecture for CFD-based design optimisation. However, our review on existing works has found that very few researchers have studied on the assistive tools to facilitate CFD-based design optimisation. In the paper, a multi-layer architecture and a general procedure are proposed to integrate different CFD toolsets with intelligent optimisation algorithms, parallel computing technique and other techniques for efficient computation. In the proposed architecture, the integration is performed either at the code level or data level to fully utilise the capabilities of different assistive tools. Two intelligent algorithms are developed and embedded with parallel computing. These algorithms, together with the supportive architecture, lay a solid foundation for various applications of CFD-based design optimisation. To illustrate the effectiveness of the proposed architecture and algorithms, the case studies on aerodynamic shape design of a hypersonic cruising vehicle are provided, and the result has shown that the proposed architecture and developed algorithms have performed successfully and efficiently in dealing with the design optimisation with over 200 design variables.
A Feature Selection Algorithm to Compute Gene Centric Methylation from Probe Level Methylation Data.

PubMed

Baur, Brittany; Bozdag, Serdar

2016-01-01

DNA methylation is an important epigenetic event that effects gene expression during development and various diseases such as cancer. Understanding the mechanism of action of DNA methylation is important for downstream analysis. In the Illumina Infinium HumanMethylation 450K array, there are tens of probes associated with each gene. Given methylation intensities of all these probes, it is necessary to compute which of these probes are most representative of the gene centric methylation level. In this study, we developed a feature selection algorithm based on sequential forward selection that utilized different classification methods to compute gene centric DNA methylation using probe level DNA methylation data. We compared our algorithm to other feature selection algorithms such as support vector machines with recursive feature elimination, genetic algorithms and ReliefF. We evaluated all methods based on the predictive power of selected probes on their mRNA expression levels and found that a K-Nearest Neighbors classification using the sequential forward selection algorithm performed better than other algorithms based on all metrics. We also observed that transcriptional activities of certain genes were more sensitive to DNA methylation changes than transcriptional activities of other genes. Our algorithm was able to predict the expression of those genes with high accuracy using only DNA methylation data. Our results also showed that those DNA methylation-sensitive genes were enriched in Gene Ontology terms related to the regulation of various biological processes.
Machine Learning and Computer Vision System for Phenotype Data Acquisition and Analysis in Plants.

PubMed

Navarro, Pedro J; Pérez, Fernando; Weiss, Julia; Egea-Cortines, Marcos

2016-05-05

Phenomics is a technology-driven approach with promising future to obtain unbiased data of biological systems. Image acquisition is relatively simple. However data handling and analysis are not as developed compared to the sampling capacities. We present a system based on machine learning (ML) algorithms and computer vision intended to solve the automatic phenotype data analysis in plant material. We developed a growth-chamber able to accommodate species of various sizes. Night image acquisition requires near infrared lightning. For the ML process, we tested three different algorithms: k-nearest neighbour (kNN), Naive Bayes Classifier (NBC), and Support Vector Machine. Each ML algorithm was executed with different kernel functions and they were trained with raw data and two types of data normalisation. Different metrics were computed to determine the optimal configuration of the machine learning algorithms. We obtained a performance of 99.31% in kNN for RGB images and a 99.34% in SVM for NIR. Our results show that ML techniques can speed up phenomic data analysis. Furthermore, both RGB and NIR images can be segmented successfully but may require different ML algorithms for segmentation.
A Kirchhoff approach to seismic modeling and prestack depth migration

NASA Astrophysics Data System (ADS)

Liu, Zhen-Yue

1993-05-01

The Kirchhoff integral provides a robust method for implementing seismic modeling and prestack depth migration, which can handle lateral velocity variation and turning waves. With a little extra computation cost, the Kirchoff-type migration can obtain multiple outputs that have the same phase but different amplitudes, compared with that of other migration methods. The ratio of these amplitudes is helpful in computing some quantities such as reflection angle. I develop a seismic modeling and prestack depth migration method based on the Kirchhoff integral, that handles both laterally variant velocity and a dip beyond 90 degrees. The method uses a finite-difference algorithm to calculate travel times and WKBJ amplitudes for the Kirchhoff integral. Compared to ray-tracing algorithms, the finite-difference algorithm gives an efficient implementation and single-valued quantities (first arrivals) on output. In my finite difference algorithm, the upwind scheme is used to calculate travel times, and the Crank-Nicolson scheme is used to calculate amplitudes. Moreover, interpolation is applied to save computation cost. The modeling and migration algorithms require a smooth velocity function. I develop a velocity-smoothing technique based on damped least-squares to aid in obtaining a successful migration.
Internet (WWW) based system of ultrasonic image processing tools for remote image analysis.

PubMed

Zeng, Hong; Fei, Ding-Yu; Fu, Cai-Ting; Kraft, Kenneth A

2003-07-01

Ultrasonic Doppler color imaging can provide anatomic information and simultaneously render flow information within blood vessels for diagnostic purpose. Many researchers are currently developing ultrasound image processing algorithms in order to provide physicians with accurate clinical parameters from the images. Because researchers use a variety of computer languages and work on different computer platforms to implement their algorithms, it is difficult for other researchers and physicians to access those programs. A system has been developed using World Wide Web (WWW) technologies and HTTP communication protocols to publish our ultrasonic Angle Independent Doppler Color Image (AIDCI) processing algorithm and several general measurement tools on the Internet, where authorized researchers and physicians can easily access the program using web browsers to carry out remote analysis of their local ultrasonic images or images provided from the database. In order to overcome potential incompatibility between programs and users' computer platforms, ActiveX technology was used in this project. The technique developed may also be used for other research fields.
Architecture and data processing alternatives for the TSE computer. Volume 3: Execution of a parallel counting algorithm using array logic (Tse) devices

NASA Technical Reports Server (NTRS)

Metcalfe, A. G.; Bodenheimer, R. E.

1976-01-01

A parallel algorithm for counting the number of logic-l elements in a binary array or image developed during preliminary investigation of the Tse concept is described. The counting algorithm is implemented using a basic combinational structure. Modifications which improve the efficiency of the basic structure are also presented. A programmable Tse computer structure is proposed, along with a hardware control unit, Tse instruction set, and software program for execution of the counting algorithm. Finally, a comparison is made between the different structures in terms of their more important characteristics.
Fast parallel approach for 2-D DHT-based real-valued discrete Gabor transform.

PubMed

Tao, Liang; Kwan, Hon Keung

2009-12-01

Two-dimensional fast Gabor transform algorithms are useful for real-time applications due to the high computational complexity of the traditional 2-D complex-valued discrete Gabor transform (CDGT). This paper presents two block time-recursive algorithms for 2-D DHT-based real-valued discrete Gabor transform (RDGT) and its inverse transform and develops a fast parallel approach for the implementation of the two algorithms. The computational complexity of the proposed parallel approach is analyzed and compared with that of the existing 2-D CDGT algorithms. The results indicate that the proposed parallel approach is attractive for real time image processing.
Communications oriented programming of parallel iterative solutions of sparse linear systems

NASA Technical Reports Server (NTRS)

Patrick, M. L.; Pratt, T. W.

1986-01-01

Parallel algorithms are developed for a class of scientific computational problems by partitioning the problems into smaller problems which may be solved concurrently. The effectiveness of the resulting parallel solutions is determined by the amount and frequency of communication and synchronization and the extent to which communication can be overlapped with computation. Three different parallel algorithms for solving the same class of problems are presented, and their effectiveness is analyzed from this point of view. The algorithms are programmed using a new programming environment. Run-time statistics and experience obtained from the execution of these programs assist in measuring the effectiveness of these algorithms.
Interaction sorting method for molecular dynamics on multi-core SIMD CPU architecture.

PubMed

Matvienko, Sergey; Alemasov, Nikolay; Fomin, Eduard

2015-02-01

Molecular dynamics (MD) is widely used in computational biology for studying binding mechanisms of molecules, molecular transport, conformational transitions, protein folding, etc. The method is computationally expensive; thus, the demand for the development of novel, much more efficient algorithms is still high. Therefore, the new algorithm designed in 2007 and called interaction sorting (IS) clearly attracted interest, as it outperformed the most efficient MD algorithms. In this work, a new IS modification is proposed which allows the algorithm to utilize SIMD processor instructions. This paper shows that the improvement provides an additional gain in performance, 9% to 45% in comparison to the original IS method.
Detection and Tracking of Moving Objects with Real-Time Onboard Vision System

NASA Astrophysics Data System (ADS)

Erokhin, D. Y.; Feldman, A. B.; Korepanov, S. E.

2017-05-01

Detection of moving objects in video sequence received from moving video sensor is a one of the most important problem in computer vision. The main purpose of this work is developing set of algorithms, which can detect and track moving objects in real time computer vision system. This set includes three main parts: the algorithm for estimation and compensation of geometric transformations of images, an algorithm for detection of moving objects, an algorithm to tracking of the detected objects and prediction their position. The results can be claimed to create onboard vision systems of aircraft, including those relating to small and unmanned aircraft.
Recursive dynamics for flexible multibody systems using spatial operators

NASA Technical Reports Server (NTRS)

Jain, A.; Rodriguez, G.

1990-01-01

Due to their structural flexibility, spacecraft and space manipulators are multibody systems with complex dynamics and possess a large number of degrees of freedom. Here the spatial operator algebra methodology is used to develop a new dynamics formulation and spatially recursive algorithms for such flexible multibody systems. A key feature of the formulation is that the operator description of the flexible system dynamics is identical in form to the corresponding operator description of the dynamics of rigid multibody systems. A significant advantage of this unifying approach is that it allows ideas and techniques for rigid multibody systems to be easily applied to flexible multibody systems. The algorithms use standard finite-element and assumed modes models for the individual body deformation. A Newton-Euler Operator Factorization of the mass matrix of the multibody system is first developed. It forms the basis for recursive algorithms such as for the inverse dynamics, the computation of the mass matrix, and the composite body forward dynamics for the system. Subsequently, an alternative Innovations Operator Factorization of the mass matrix, each of whose factors is invertible, is developed. It leads to an operator expression for the inverse of the mass matrix, and forms the basis for the recursive articulated body forward dynamics algorithm for the flexible multibody system. For simplicity, most of the development here focuses on serial chain multibody systems. However, extensions of the algorithms to general topology flexible multibody systems are described. While the computational cost of the algorithms depends on factors such as the topology and the amount of flexibility in the multibody system, in general, it appears that in contrast to the rigid multibody case, the articulated body forward dynamics algorithm is the more efficient algorithm for flexible multibody systems containing even a small number of flexible bodies. The variety of algorithms described here permits a user to choose the algorithm which is optimal for the multibody system at hand. The availability of a number of algorithms is even more important for real-time applications, where implementation on parallel processors or custom computing hardware is often necessary to maximize speed.
Parallel Computational Protein Design.

PubMed

Zhou, Yichao; Donald, Bruce R; Zeng, Jianyang

2017-01-01

Computational structure-based protein design (CSPD) is an important problem in computational biology, which aims to design or improve a prescribed protein function based on a protein structure template. It provides a practical tool for real-world protein engineering applications. A popular CSPD method that guarantees to find the global minimum energy solution (GMEC) is to combine both dead-end elimination (DEE) and A* tree search algorithms. However, in this framework, the A* search algorithm can run in exponential time in the worst case, which may become the computation bottleneck of large-scale computational protein design process. To address this issue, we extend and add a new module to the OSPREY program that was previously developed in the Donald lab (Gainza et al., Methods Enzymol 523:87, 2013) to implement a GPU-based massively parallel A* algorithm for improving protein design pipeline. By exploiting the modern GPU computational framework and optimizing the computation of the heuristic function for A* search, our new program, called gOSPREY, can provide up to four orders of magnitude speedups in large protein design cases with a small memory overhead comparing to the traditional A* search algorithm implementation, while still guaranteeing the optimality. In addition, gOSPREY can be configured to run in a bounded-memory mode to tackle the problems in which the conformation space is too large and the global optimal solution cannot be computed previously. Furthermore, the GPU-based A* algorithm implemented in the gOSPREY program can be combined with the state-of-the-art rotamer pruning algorithms such as iMinDEE (Gainza et al., PLoS Comput Biol 8:e1002335, 2012) and DEEPer (Hallen et al., Proteins 81:18-39, 2013) to also consider continuous backbone and side-chain flexibility.
Rapid Calculation of Max-Min Fair Rates for Multi-Commodity Flows in Fat-Tree Networks

DOE PAGES

Mollah, Md Atiqul; Yuan, Xin; Pakin, Scott; ...

2017-08-29

Max-min fairness is often used in the performance modeling of interconnection networks. Existing methods to compute max-min fair rates for multi-commodity flows have high complexity and are computationally infeasible for large networks. In this paper, we show that by considering topological features, this problem can be solved efficiently for the fat-tree topology that is widely used in data centers and high performance compute clusters. Several efficient new algorithms are developed for this problem, including a parallel algorithm that can take advantage of multi-core and shared-memory architectures. Using these algorithms, we demonstrate that it is possible to find the max-min fairmore » rate allocation for multi-commodity flows in fat-tree networks that support tens of thousands of nodes. We evaluate the run-time performance of the proposed algorithms and show improvement in orders of magnitude over the previously best known method. Finally, we further demonstrate a new application of max-min fair rate allocation that is only computationally feasible using our new algorithms.« less
Rapid Calculation of Max-Min Fair Rates for Multi-Commodity Flows in Fat-Tree Networks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mollah, Md Atiqul; Yuan, Xin; Pakin, Scott

Max-min fairness is often used in the performance modeling of interconnection networks. Existing methods to compute max-min fair rates for multi-commodity flows have high complexity and are computationally infeasible for large networks. In this paper, we show that by considering topological features, this problem can be solved efficiently for the fat-tree topology that is widely used in data centers and high performance compute clusters. Several efficient new algorithms are developed for this problem, including a parallel algorithm that can take advantage of multi-core and shared-memory architectures. Using these algorithms, we demonstrate that it is possible to find the max-min fairmore » rate allocation for multi-commodity flows in fat-tree networks that support tens of thousands of nodes. We evaluate the run-time performance of the proposed algorithms and show improvement in orders of magnitude over the previously best known method. Finally, we further demonstrate a new application of max-min fair rate allocation that is only computationally feasible using our new algorithms.« less
Marc Henry de Frahan | NREL

Science.gov Websites

Computing Project, Marc develops high-fidelity turbulence models to enhance simulation accuracy and efficient numerical algorithms for future high performance computing hardware architectures. Research Interests High performance computing High order numerical methods for computational fluid dynamics Fluid
Optimizing Approximate Weighted Matching on Nvidia Kepler K40

DOE Office of Scientific and Technical Information (OSTI.GOV)

Naim, Md; Manne, Fredrik; Halappanavar, Mahantesh

Matching is a fundamental graph problem with numerous applications in science and engineering. While algorithms for computing optimal matchings are difficult to parallelize, approximation algorithms on the other hand generally compute high quality solutions and are amenable to parallelization. In this paper, we present efficient implementations of the current best algorithm for half-approximate weighted matching, the Suitor algorithm, on Nvidia Kepler K-40 platform. We develop four variants of the algorithm that exploit hardware features to address key challenges for a GPU implementation. We also experiment with different combinations of work assigned to a warp. Using an exhaustive set ofmore » $269$ inputs, we demonstrate that the new implementation outperforms the previous best GPU algorithm by $10$ to $$100\\times$$ for over $100$ instances, and from $100$ to $$1000\\times$$ for $15$ instances. We also demonstrate up to $$20\\times$$ speedup relative to $2$ threads, and up to $$5\\times$$ relative to $16$ threads on Intel Xeon platform with $16$ cores for the same algorithm. The new algorithms and implementations provided in this paper will have a direct impact on several applications that repeatedly use matching as a key compute kernel. Further, algorithm designs and insights provided in this paper will benefit other researchers implementing graph algorithms on modern GPU architectures.« less
Computation of multi-dimensional viscous supersonic jet flow

NASA Technical Reports Server (NTRS)

Kim, Y. N.; Buggeln, R. C.; Mcdonald, H.

1986-01-01

A new method has been developed for two- and three-dimensional computations of viscous supersonic flows with embedded subsonic regions adjacent to solid boundaries. The approach employs a reduced form of the Navier-Stokes equations which allows solution as an initial-boundary value problem in space, using an efficient noniterative forward marching algorithm. Numerical instability associated with forward marching algorithms for flows with embedded subsonic regions is avoided by approximation of the reduced form of the Navier-Stokes equations in the subsonic regions of the boundary layers. Supersonic and subsonic portions of the flow field are simultaneously calculated by a consistently split linearized block implicit computational algorithm. The results of computations for a series of test cases relevant to internal supersonic flow is presented and compared with data. Comparison between data and computation are in general excellent thus indicating that the computational technique has great promise as a tool for calculating supersonic flow with embedded subsonic regions. Finally, a User's Manual is presented for the computer code used to perform the calculations.
SIMULATION OF DISPERSION OF A POWER PLANT PLUME USING AN ADAPTIVE GRID ALGORITHM. (R827028)

EPA Science Inventory

A new dynamic adaptive grid algorithm has been developed for use in air quality modeling. This algorithm uses a higher order numerical scheme––the piecewise parabolic method (PPM)––for computing advective solution fields; a weight function capable o...
A novel computer algorithm for modeling and treating mandibular fractures: A pilot study.

PubMed

Rizzi, Christopher J; Ortlip, Timothy; Greywoode, Jewel D; Vakharia, Kavita T; Vakharia, Kalpesh T

2017-02-01

To describe a novel computer algorithm that can model mandibular fracture repair. To evaluate the algorithm as a tool to model mandibular fracture reduction and hardware selection. Retrospective pilot study combined with cross-sectional survey. A computer algorithm utilizing Aquarius Net (TeraRecon, Inc, Foster City, CA) and Adobe Photoshop CS6 (Adobe Systems, Inc, San Jose, CA) was developed to model mandibular fracture repair. Ten different fracture patterns were selected from nine patients who had already undergone mandibular fracture repair. The preoperative computed tomography (CT) images were processed with the computer algorithm to create virtual images that matched the actual postoperative three-dimensional CT images. A survey comparing the true postoperative image with the virtual postoperative images was created and administered to otolaryngology resident and attending physicians. They were asked to rate on a scale from 0 to 10 (0 = completely different; 10 = identical) the similarity between the two images in terms of the fracture reduction and fixation hardware. Ten mandible fracture cases were analyzed and processed. There were 15 survey respondents. The mean score for overall similarity between the images was 8.41 ± 0.91; the mean score for similarity of fracture reduction was 8.61 ± 0.98; and the mean score for hardware appearance was 8.27 ± 0.97. There were no significant differences between attending and resident responses. There were no significant differences based on fracture location. This computer algorithm can accurately model mandibular fracture repair. Images created by the algorithm are highly similar to true postoperative images. The algorithm can potentially assist a surgeon planning mandibular fracture repair. 4. Laryngoscope, 2016 127:331-336, 2017. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.

U.S. Army Research Laboratory (ARL) multimodal signatures database

NASA Astrophysics Data System (ADS)

Bennett, Kelly

2008-04-01

The U.S. Army Research Laboratory (ARL) Multimodal Signatures Database (MMSDB) is a centralized collection of sensor data of various modalities that are co-located and co-registered. The signatures include ground and air vehicles, personnel, mortar, artillery, small arms gunfire from potential sniper weapons, explosives, and many other high value targets. This data is made available to Department of Defense (DoD) and DoD contractors, Intel agencies, other government agencies (OGA), and academia for use in developing target detection, tracking, and classification algorithms and systems to protect our Soldiers. A platform independent Web interface disseminates the signatures to researchers and engineers within the scientific community. Hierarchical Data Format 5 (HDF5) signature models provide an excellent solution for the sharing of complex multimodal signature data for algorithmic development and database requirements. Many open source tools for viewing and plotting HDF5 signatures are available over the Web. Seamless integration of HDF5 signatures is possible in both proprietary computational environments, such as MATLAB, and Free and Open Source Software (FOSS) computational environments, such as Octave and Python, for performing signal processing, analysis, and algorithm development. Future developments include extending the Web interface into a portal system for accessing ARL algorithms and signatures, High Performance Computing (HPC) resources, and integrating existing database and signature architectures into sensor networking environments.
Incremental update of electrostatic interactions in adaptively restrained particle simulations.

PubMed

Edorh, Semeho Prince A; Redon, Stéphane

2018-04-06

The computation of long-range potentials is one of the demanding tasks in Molecular Dynamics. During the last decades, an inventive panoply of methods was developed to reduce the CPU time of this task. In this work, we propose a fast method dedicated to the computation of the electrostatic potential in adaptively restrained systems. We exploit the fact that, in such systems, only some particles are allowed to move at each timestep. We developed an incremental algorithm derived from a multigrid-based alternative to traditional Fourier-based methods. Our algorithm was implemented inside LAMMPS, a popular molecular dynamics simulation package. We evaluated the method on different systems. We showed that the new algorithm's computational complexity scales with the number of active particles in the simulated system, and is able to outperform the well-established Particle Particle Particle Mesh (P3M) for adaptively restrained simulations. © 2018 Wiley Periodicals, Inc. © 2018 Wiley Periodicals, Inc.
Fast computation algorithms for speckle pattern simulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nascov, Victor; Samoilă, Cornel; Ursuţiu, Doru

2013-11-13

We present our development of a series of efficient computation algorithms, generally usable to calculate light diffraction and particularly for speckle pattern simulation. We use mainly the scalar diffraction theory in the form of Rayleigh-Sommerfeld diffraction formula and its Fresnel approximation. Our algorithms are based on a special form of the convolution theorem and the Fast Fourier Transform. They are able to evaluate the diffraction formula much faster than by direct computation and we have circumvented the restrictions regarding the relative sizes of the input and output domains, met on commonly used procedures. Moreover, the input and output planes canmore » be tilted each to other and the output domain can be off-axis shifted.« less
Hybrid Nested Partitions and Math Programming Framework for Large-scale Combinatorial Optimization

DTIC Science & Technology

2010-03-31

optimization problems: 1) exact algorithms and 2) metaheuristic algorithms . This project will integrate concepts from these two technologies to develop...optimal solutions within an acceptable amount of computation time, and 2) metaheuristic algorithms such as genetic algorithms , tabu search, and the...integer programming decomposition approaches, such as Dantzig Wolfe decomposition and Lagrangian relaxation, and metaheuristics such as the Nested
Positive dwell time algorithm with minimum equal extra material removal in deterministic optical surfacing technology.

PubMed

Li, Longxiang; Xue, Donglin; Deng, Weijie; Wang, Xu; Bai, Yang; Zhang, Feng; Zhang, Xuejun

2017-11-10

In deterministic computer-controlled optical surfacing, accurate dwell time execution by computer numeric control machines is crucial in guaranteeing a high-convergence ratio for the optical surface error. It is necessary to consider the machine dynamics limitations in the numerical dwell time algorithms. In this paper, these constraints on dwell time distribution are analyzed, and a model of the equal extra material removal is established. A positive dwell time algorithm with minimum equal extra material removal is developed. Results of simulations based on deterministic magnetorheological finishing demonstrate the necessity of considering machine dynamics performance and illustrate the validity of the proposed algorithm. Indeed, the algorithm effectively facilitates the determinacy of sub-aperture optical surfacing processes.
A computational model of the human hand 93-ERI-053

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hollerbach, K.; Axelrod, T.

1996-03-01

The objectives of the Computational Hand Modeling project were to prove the feasibility of the Laboratory`s NIKE3D finite element code to orthopaedic problems. Because of the great complexity of anatomical structures and the nonlinearity of their behavior, we have focused on a subset of joints of the hand and lower extremity and have developed algorithms to model their behavior. The algorithms developed here solve fundamental problems in computational biomechanics and can be expanded to describe any other joints of the human body. This kind of computational modeling has never successfully been attempted before, due in part to a lack ofmore » biomaterials data and a lack of computational resources. With the computational resources available at the National Laboratories and the collaborative relationships we have established with experimental and other modeling laboratories, we have been in a position to pursue our innovative approach to biomechanical and orthopedic modeling.« less
Parallel Algorithms and Patterns

DOE Office of Scientific and Technical Information (OSTI.GOV)

Robey, Robert W.

2016-06-16

This is a powerpoint presentation on parallel algorithms and patterns. A parallel algorithm is a well-defined, step-by-step computational procedure that emphasizes concurrency to solve a problem. Examples of problems include: Sorting, searching, optimization, matrix operations. A parallel pattern is a computational step in a sequence of independent, potentially concurrent operations that occurs in diverse scenarios with some frequency. Examples are: Reductions, prefix scans, ghost cell updates. We only touch on parallel patterns in this presentation. It really deserves its own detailed discussion which Gabe Rockefeller would like to develop.
Solution of partial differential equations on vector and parallel computers

NASA Technical Reports Server (NTRS)

Ortega, J. M.; Voigt, R. G.

1985-01-01

The present status of numerical methods for partial differential equations on vector and parallel computers was reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.
Development of Parallel Architectures for Sensor Array Processing. Volume 1

DTIC Science & Technology

1993-08-01

required for the DOA estimation [ 1-7]. The Multiple Signal Classification ( MUSIC ) [ 1] and the Estimation of Signal Parameters by Rotational...manifold and the estimated subspace. Although MUSIC is a high resolution algorithm, it has several drawbacks including the fact that complete knowledge of...thoroughly, MUSIC algorithm was selected to develop special purpose hardware for real time computation. Summary of the MUSIC algorithm is as follows
An Algorithm for Converting Static Earth Sensor Measurements into Earth Observation Vectors

NASA Technical Reports Server (NTRS)

Harman, R.; Hashmall, Joseph A.; Sedlak, Joseph

2004-01-01

An algorithm has been developed that converts penetration angles reported by Static Earth Sensors (SESs) into Earth observation vectors. This algorithm allows compensation for variation in the horizon height including that caused by Earth oblateness. It also allows pitch and roll to be computed using any number (greater than 1) of simultaneous sensor penetration angles simplifying processing during periods of Sun and Moon interference. The algorithm computes body frame unit vectors through each SES cluster. It also computes GCI vectors from the spacecraft to the position on the Earth's limb where each cluster detects the Earth's limb. These body frame vectors are used as sensor observation vectors and the GCI vectors are used as reference vectors in an attitude solution. The attitude, with the unobservable yaw discarded, is iteratively refined to provide the Earth observation vector solution.
Current algorithmic solutions for peptide-based proteomics data generation and identification.

PubMed

Hoopmann, Michael R; Moritz, Robert L

2013-02-01

Peptide-based proteomic data sets are ever increasing in size and complexity. These data sets provide computational challenges when attempting to quickly analyze spectra and obtain correct protein identifications. Database search and de novo algorithms must consider high-resolution MS/MS spectra and alternative fragmentation methods. Protein inference is a tricky problem when analyzing large data sets of degenerate peptide identifications. Combining multiple algorithms for improved peptide identification puts significant strain on computational systems when investigating large data sets. This review highlights some of the recent developments in peptide and protein identification algorithms for analyzing shotgun mass spectrometry data when encountering the aforementioned hurdles. Also explored are the roles that analytical pipelines, public spectral libraries, and cloud computing play in the evolution of peptide-based proteomics. Copyright © 2012 Elsevier Ltd. All rights reserved.
Accelerating global optimization of aerodynamic shapes using a new surrogate-assisted parallel genetic algorithm

NASA Astrophysics Data System (ADS)

Ebrahimi, Mehdi; Jahangirian, Alireza

2017-12-01

An efficient strategy is presented for global shape optimization of wing sections with a parallel genetic algorithm. Several computational techniques are applied to increase the convergence rate and the efficiency of the method. A variable fidelity computational evaluation method is applied in which the expensive Navier-Stokes flow solver is complemented by an inexpensive multi-layer perceptron neural network for the objective function evaluations. A population dispersion method that consists of two phases, of exploration and refinement, is developed to improve the convergence rate and the robustness of the genetic algorithm. Owing to the nature of the optimization problem, a parallel framework based on the master/slave approach is used. The outcomes indicate that the method is able to find the global optimum with significantly lower computational time in comparison to the conventional genetic algorithm.
Ocean Models and Proper Orthogonal Decomposition

NASA Astrophysics Data System (ADS)

Salas-de-Leon, D. A.

2007-05-01

The increasing computational developments and the better understanding of mathematical and physical systems resulted in an increasing number of ocean models. Long time ago, modelers were like a secret organization and recognize each other by using secret codes and languages that only a select group of people was able to recognize and understand. The access to computational systems was reduced, on one hand equipment and the using time of computers were expensive and restricted, and on the other hand, they required an advance computational languages that not everybody wanted to learn. Now a days most college freshman own a personal computer (PC or laptop), and/or have access to more sophisticated computational systems than those available for research in the early 80's. The resource availability resulted in a mayor access to all kind models. Today computer speed and time and the algorithms does not seem to be a problem, even though some models take days to run in small computational systems. Almost every oceanographic institution has their own model, what is more, in the same institution from one office to the next there are different models for the same phenomena, developed by different research member, the results does not differ substantially since the equations are the same, and the solving algorithms are similar. The algorithms and the grids, constructed with algorithms, can be found in text books and/or over the internet. Every year more sophisticated models are constructed. The Proper Orthogonal Decomposition is a technique that allows the reduction of the number of variables to solve keeping the model properties, for which it can be a very useful tool in diminishing the processes that have to be solved using "small" computational systems, making sophisticated models available for a greater community.
Computing Quantitative Characteristics of Finite-State Real-Time Systems

DTIC Science & Technology

1994-05-04

Current methods for verifying real - time systems are essentially decision procedures that establish whether the system model satisfies a given...specification. We present a general method for computing quantitative information about finite-state real - time systems . We have developed algorithms that...our technique can be extended to a more general representation of real - time systems , namely, timed transition graphs. The algorithms presented in this
A Systolic VLSI Design of a Pipeline Reed-solomon Decoder

NASA Technical Reports Server (NTRS)

Shao, H. M.; Truong, T. K.; Deutsch, L. J.; Yuen, J. H.; Reed, I. S.

1984-01-01

A pipeline structure of a transform decoder similar to a systolic array was developed to decode Reed-Solomon (RS) codes. An important ingredient of this design is a modified Euclidean algorithm for computing the error locator polynomial. The computation of inverse field elements is completely avoided in this modification of Euclid's algorithm. The new decoder is regular and simple, and naturally suitable for VLSI implementation.
A VLSI design of a pipeline Reed-Solomon decoder

NASA Technical Reports Server (NTRS)

Shao, H. M.; Truong, T. K.; Deutsch, L. J.; Yuen, J. H.; Reed, I. S.

1985-01-01

A pipeline structure of a transform decoder similar to a systolic array was developed to decode Reed-Solomon (RS) codes. An important ingredient of this design is a modified Euclidean algorithm for computing the error locator polynomial. The computation of inverse field elements is completely avoided in this modification of Euclid's algorithm. The new decoder is regular and simple, and naturally suitable for VLSI implementation.
Communication Avoiding and Overlapping for Numerical Linear Algebra

DTIC Science & Technology

2012-05-08

future exascale systems, communication cost must be avoided or overlapped. Communication-avoiding 2.5D algorithms improve scalability by reducing...linear algebra problems to future exascale systems, communication cost must be avoided or overlapped. Communication-avoiding 2.5D algorithms improve...will continue to grow relative to the cost of computation. With exascale computing as the long-term goal, the community needs to develop techniques
Advanced Dynamically Adaptive Algorithms for Stochastic Simulations on Extreme Scales

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xiu, Dongbin

2017-03-03

The focus of the project is the development of mathematical methods and high-performance computational tools for stochastic simulations, with a particular emphasis on computations on extreme scales. The core of the project revolves around the design of highly efficient and scalable numerical algorithms that can adaptively and accurately, in high dimensional spaces, resolve stochastic problems with limited smoothness, even containing discontinuities.
Challenges facing developers of CAD/CAM models that seek to predict human working postures

NASA Astrophysics Data System (ADS)

Wiker, Steven F.

2005-11-01

This paper outlines the need for development of human posture prediction models for Computer Aided Design (CAD) and Computer Aided Manufacturing (CAM) design applications in product, facility and work design. Challenges facing developers of posture prediction algorithms are presented and discussed.
Nuclear IHC enumeration: A digital phantom to evaluate the performance of automated algorithms in digital pathology.

PubMed

Niazi, Muhammad Khalid Khan; Abas, Fazly Salleh; Senaras, Caglar; Pennell, Michael; Sahiner, Berkman; Chen, Weijie; Opfer, John; Hasserjian, Robert; Louissaint, Abner; Shana'ah, Arwa; Lozanski, Gerard; Gurcan, Metin N

2018-01-01

Automatic and accurate detection of positive and negative nuclei from images of immunostained tissue biopsies is critical to the success of digital pathology. The evaluation of most nuclei detection algorithms relies on manually generated ground truth prepared by pathologists, which is unfortunately time-consuming and suffers from inter-pathologist variability. In this work, we developed a digital immunohistochemistry (IHC) phantom that can be used for evaluating computer algorithms for enumeration of IHC positive cells. Our phantom development consists of two main steps, 1) extraction of the individual as well as nuclei clumps of both positive and negative nuclei from real WSI images, and 2) systematic placement of the extracted nuclei clumps on an image canvas. The resulting images are visually similar to the original tissue images. We created a set of 42 images with different concentrations of positive and negative nuclei. These images were evaluated by four board certified pathologists in the task of estimating the ratio of positive to total number of nuclei. The resulting concordance correlation coefficients (CCC) between the pathologist and the true ratio range from 0.86 to 0.95 (point estimates). The same ratio was also computed by an automated computer algorithm, which yielded a CCC value of 0.99. Reading the phantom data with known ground truth, the human readers show substantial variability and lower average performance than the computer algorithm in terms of CCC. This shows the limitation of using a human reader panel to establish a reference standard for the evaluation of computer algorithms, thereby highlighting the usefulness of the phantom developed in this work. Using our phantom images, we further developed a function that can approximate the true ratio from the area of the positive and negative nuclei, hence avoiding the need to detect individual nuclei. The predicted ratios of 10 held-out images using the function (trained on 32 images) are within ±2.68% of the true ratio. Moreover, we also report the evaluation of a computerized image analysis method on the synthetic tissue dataset.

Predicting protein structures with a multiplayer online game.

PubMed

Cooper, Seth; Khatib, Firas; Treuille, Adrien; Barbero, Janos; Lee, Jeehyung; Beenen, Michael; Leaver-Fay, Andrew; Baker, David; Popović, Zoran; Players, Foldit

2010-08-05

People exert large amounts of problem-solving effort playing computer games. Simple image- and text-recognition tasks have been successfully 'crowd-sourced' through games, but it is not clear if more complex scientific problems can be solved with human-directed computing. Protein structure prediction is one such problem: locating the biologically relevant native conformation of a protein is a formidable computational challenge given the very large size of the search space. Here we describe Foldit, a multiplayer online game that engages non-scientists in solving hard prediction problems. Foldit players interact with protein structures using direct manipulation tools and user-friendly versions of algorithms from the Rosetta structure prediction methodology, while they compete and collaborate to optimize the computed energy. We show that top-ranked Foldit players excel at solving challenging structure refinement problems in which substantial backbone rearrangements are necessary to achieve the burial of hydrophobic residues. Players working collaboratively develop a rich assortment of new strategies and algorithms; unlike computational approaches, they explore not only the conformational space but also the space of possible search strategies. The integration of human visual problem-solving and strategy development capabilities with traditional computational algorithms through interactive multiplayer games is a powerful new approach to solving computationally-limited scientific problems.
In Praise of Numerical Computation

NASA Astrophysics Data System (ADS)

Yap, Chee K.

Theoretical Computer Science has developed an almost exclusively discrete/algebraic persona. We have effectively shut ourselves off from half of the world of computing: a host of problems in Computational Science & Engineering (CS&E) are defined on the continuum, and, for them, the discrete viewpoint is inadequate. The computational techniques in such problems are well-known to numerical analysis and applied mathematics, but are rarely discussed in theoretical algorithms: iteration, subdivision and approximation. By various case studies, I will indicate how our discrete/algebraic view of computing has many shortcomings in CS&E. We want embrace the continuous/analytic view, but in a new synthesis with the discrete/algebraic view. I will suggest a pathway, by way of an exact numerical model of computation, that allows us to incorporate iteration and approximation into our algorithms’ design. Some recent results give a peek into how this view of algorithmic development might look like, and its distinctive form suggests the name “numerical computational geometry” for such activities.
Petri nets SM-cover-based on heuristic coloring algorithm

NASA Astrophysics Data System (ADS)

Tkacz, Jacek; Doligalski, Michał

2015-09-01

In the paper, coloring heuristic algorithm of interpreted Petri nets is presented. Coloring is used to determine the State Machines (SM) subnets. The present algorithm reduces the Petri net in order to reduce the computational complexity and finds one of its possible State Machines cover. The proposed algorithm uses elements of interpretation of Petri nets. The obtained result may not be the best, but it is sufficient for use in rapid prototyping of logic controllers. Found SM-cover will be also used in the development of algorithms for decomposition, and modular synthesis and implementation of parallel logic controllers. Correctness developed heuristic algorithm was verified using Gentzen formal reasoning system.
Parallel fuzzy connected image segmentation on GPU

PubMed Central

Zhuge, Ying; Cao, Yong; Udupa, Jayaram K.; Miller, Robert W.

2011-01-01

Purpose: Image segmentation techniques using fuzzy connectedness (FC) principles have shown their effectiveness in segmenting a variety of objects in several large applications. However, one challenge in these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays, commodity graphics hardware provides a highly parallel computing environment. In this paper, the authors present a parallel fuzzy connected image segmentation algorithm implementation on NVIDIA’s compute unified device Architecture (cuda) platform for segmenting medical image data sets. Methods: In the FC algorithm, there are two major computational tasks: (i) computing the fuzzy affinity relations and (ii) computing the fuzzy connectedness relations. These two tasks are implemented as cuda kernels and executed on GPU. A dramatic improvement in speed for both tasks is achieved as a result. Results: Our experiments based on three data sets of small, medium, and large data size demonstrate the efficiency of the parallel algorithm, which achieves a speed-up factor of 24.4x, 18.1x, and 10.3x, correspondingly, for the three data sets on the NVIDIA Tesla C1060 over the implementation of the algorithm on CPU, and takes 0.25, 0.72, and 15.04 s, correspondingly, for the three data sets. Conclusions: The authors developed a parallel algorithm of the widely used fuzzy connected image segmentation method on the NVIDIA GPUs, which are far more cost- and speed-effective than both cluster of workstations and multiprocessing systems. A near-interactive speed of segmentation has been achieved, even for the large data set. PMID:21859037
Parallel fuzzy connected image segmentation on GPU.

PubMed

Zhuge, Ying; Cao, Yong; Udupa, Jayaram K; Miller, Robert W

2011-07-01

Image segmentation techniques using fuzzy connectedness (FC) principles have shown their effectiveness in segmenting a variety of objects in several large applications. However, one challenge in these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays, commodity graphics hardware provides a highly parallel computing environment. In this paper, the authors present a parallel fuzzy connected image segmentation algorithm implementation on NVIDIA's compute unified device Architecture (CUDA) platform for segmenting medical image data sets. In the FC algorithm, there are two major computational tasks: (i) computing the fuzzy affinity relations and (ii) computing the fuzzy connectedness relations. These two tasks are implemented as CUDA kernels and executed on GPU. A dramatic improvement in speed for both tasks is achieved as a result. Our experiments based on three data sets of small, medium, and large data size demonstrate the efficiency of the parallel algorithm, which achieves a speed-up factor of 24.4x, 18.1x, and 10.3x, correspondingly, for the three data sets on the NVIDIA Tesla C1060 over the implementation of the algorithm on CPU, and takes 0.25, 0.72, and 15.04 s, correspondingly, for the three data sets. The authors developed a parallel algorithm of the widely used fuzzy connected image segmentation method on the NVIDIA GPUs, which are far more cost- and speed-effective than both cluster of workstations and multiprocessing systems. A near-interactive speed of segmentation has been achieved, even for the large data set.
Consistent and efficient processing of ADCP streamflow measurements

USGS Publications Warehouse

Mueller, David S.; Constantinescu, George; Garcia, Marcelo H.; Hanes, Dan

2016-01-01

The use of Acoustic Doppler Current Profilers (ADCPs) from a moving boat is a commonly used method for measuring streamflow. Currently, the algorithms used to compute the average depth, compute edge discharge, identify invalid data, and estimate velocity and discharge for invalid data vary among manufacturers. These differences could result in different discharges being computed from identical data. Consistent computational algorithm, automated filtering, and quality assessment of ADCP streamflow measurements that are independent of the ADCP manufacturer are being developed in a software program that can process ADCP moving-boat discharge measurements independent of the ADCP used to collect the data.
An algorithm for automatic reduction of complex signal flow graphs

NASA Technical Reports Server (NTRS)

Young, K. R.; Hoberock, L. L.; Thompson, J. G.

1976-01-01

A computer algorithm is developed that provides efficient means to compute transmittances directly from a signal flow graph or a block diagram. Signal flow graphs are cast as directed graphs described by adjacency matrices. Nonsearch computation, designed for compilers without symbolic capability, is used to identify all arcs that are members of simple cycles for use with Mason's gain formula. The routine does not require the visual acumen of an interpreter to reduce the topology of the graph, and it is particularly useful for analyzing control systems described for computer analyses by means of interactive graphics.
A combined direct/inverse three-dimensional transonic wing design method for vector computers

NASA Technical Reports Server (NTRS)

Weed, R. A.; Carlson, L. A.; Anderson, W. K.

1984-01-01

A three-dimensional transonic-wing design algorithm for vector computers is developed, and the results of sample computations are presented graphically. The method incorporates the direct/inverse scheme of Carlson (1975), a Cartesian grid system with boundary conditions applied at a mean plane, and a potential-flow solver based on the conservative form of the full potential equation and using the ZEBRA II vectorizable solution algorithm of South et al. (1980). The accuracy and consistency of the method with regard to direct and inverse analysis and trailing-edge closure are verified in the test computations.
Benchmarking neuromorphic vision: lessons learnt from computer vision

PubMed Central

Tan, Cheston; Lallee, Stephane; Orchard, Garrick

2015-01-01

Neuromorphic Vision sensors have improved greatly since the first silicon retina was presented almost three decades ago. They have recently matured to the point where they are commercially available and can be operated by laymen. However, despite improved availability of sensors, there remains a lack of good datasets, while algorithms for processing spike-based visual data are still in their infancy. On the other hand, frame-based computer vision algorithms are far more mature, thanks in part to widely accepted datasets which allow direct comparison between algorithms and encourage competition. We are presented with a unique opportunity to shape the development of Neuromorphic Vision benchmarks and challenges by leveraging what has been learnt from the use of datasets in frame-based computer vision. Taking advantage of this opportunity, in this paper we review the role that benchmarks and challenges have played in the advancement of frame-based computer vision, and suggest guidelines for the creation of Neuromorphic Vision benchmarks and challenges. We also discuss the unique challenges faced when benchmarking Neuromorphic Vision algorithms, particularly when attempting to provide direct comparison with frame-based computer vision. PMID:26528120
SAMSAN- MODERN NUMERICAL METHODS FOR CLASSICAL SAMPLED SYSTEM ANALYSIS

NASA Technical Reports Server (NTRS)

Frisch, H. P.

1994-01-01

SAMSAN was developed to aid the control system analyst by providing a self consistent set of computer algorithms that support large order control system design and evaluation studies, with an emphasis placed on sampled system analysis. Control system analysts have access to a vast array of published algorithms to solve an equally large spectrum of controls related computational problems. The analyst usually spends considerable time and effort bringing these published algorithms to an integrated operational status and often finds them less general than desired. SAMSAN reduces the burden on the analyst by providing a set of algorithms that have been well tested and documented, and that can be readily integrated for solving control system problems. Algorithm selection for SAMSAN has been biased toward numerical accuracy for large order systems with computational speed and portability being considered important but not paramount. In addition to containing relevant subroutines from EISPAK for eigen-analysis and from LINPAK for the solution of linear systems and related problems, SAMSAN contains the following not so generally available capabilities: 1) Reduction of a real non-symmetric matrix to block diagonal form via a real similarity transformation matrix which is well conditioned with respect to inversion, 2) Solution of the generalized eigenvalue problem with balancing and grading, 3) Computation of all zeros of the determinant of a matrix of polynomials, 4) Matrix exponentiation and the evaluation of integrals involving the matrix exponential, with option to first block diagonalize, 5) Root locus and frequency response for single variable transfer functions in the S, Z, and W domains, 6) Several methods of computing zeros for linear systems, and 7) The ability to generate documentation "on demand". All matrix operations in the SAMSAN algorithms assume non-symmetric matrices with real double precision elements. There is no fixed size limit on any matrix in any SAMSAN algorithm; however, it is generally agreed by experienced users, and in the numerical error analysis literature, that computation with non-symmetric matrices of order greater than about 200 should be avoided or treated with extreme care. SAMSAN attempts to support the needs of application oriented analysis by providing: 1) a methodology with unlimited growth potential, 2) a methodology to insure that associated documentation is current and available "on demand", 3) a foundation of basic computational algorithms that most controls analysis procedures are based upon, 4) a set of check out and evaluation programs which demonstrate usage of the algorithms on a series of problems which are structured to expose the limits of each algorithm's applicability, and 5) capabilities which support both a priori and a posteriori error analysis for the computational algorithms provided. The SAMSAN algorithms are coded in FORTRAN 77 for batch or interactive execution and have been implemented on a DEC VAX computer under VMS 4.7. An effort was made to assure that the FORTRAN source code was portable and thus SAMSAN may be adaptable to other machine environments. The documentation is included on the distribution tape or can be purchased separately at the price below. SAMSAN version 2.0 was developed in 1982 and updated to version 3.0 in 1988.
PROCESS SIMULATION OF COLD PRESSING OF ARMSTRONG CP-Ti POWDERS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sabau, Adrian S; Gorti, Sarma B; Peter, William H

A computational methodology is presented for the process simulation of cold pressing of Armstrong CP-Ti Powders. The computational model was implemented in the commercial finite element program ABAQUSTM. Since the powder deformation and consolidation is governed by specific pressure-dependent constitutive equations, several solution algorithms were developed for the ABAQUS user material subroutine, UMAT. The solution algorithms were developed for computing the plastic strain increments based on an implicit integration of the nonlinear yield function, flow rule, and hardening equations that describe the evolution of the state variables. Since ABAQUS requires the use of a full Newton-Raphson algorithm for the stress-strainmore » equations, an algorithm for obtaining the tangent/linearization moduli, which is consistent with the return-mapping algorithm, also was developed. Numerical simulation results are presented for the cold compaction of the Ti powders. Several simulations were conducted for cylindrical samples with different aspect ratios. The numerical simulation results showed that for the disk samples, the minimum von Mises stress was approximately half than its maximum value. The hydrostatic stress distribution exhibits a variation smaller than that of the von Mises stress. It was found that for the disk and cylinder samples the minimum hydrostatic stresses were approximately 23 and 50% less than its maximum value, respectively. It was also found that the minimum density was noticeably affected by the sample height.« less
A historical survey of algorithms and hardware architectures for neural-inspired and neuromorphic computing applications

DOE PAGES

James, Conrad D.; Aimone, James B.; Miner, Nadine E.; ...

2017-01-04

In this study, biological neural networks continue to inspire new developments in algorithms and microelectronic hardware to solve challenging data processing and classification problems. Here in this research, we survey the history of neural-inspired and neuromorphic computing in order to examine the complex and intertwined trajectories of the mathematical theory and hardware developed in this field. Early research focused on adapting existing hardware to emulate the pattern recognition capabilities of living organisms. Contributions from psychologists, mathematicians, engineers, neuroscientists, and other professions were crucial to maturing the field from narrowly-tailored demonstrations to more generalizable systems capable of addressing difficult problem classesmore » such as object detection and speech recognition. Algorithms that leverage fundamental principles found in neuroscience such as hierarchical structure, temporal integration, and robustness to error have been developed, and some of these approaches are achieving world-leading performance on particular data classification tasks. Additionally, novel microelectronic hardware is being developed to perform logic and to serve as memory in neuromorphic computing systems with optimized system integration and improved energy efficiency. Key to such advancements was the incorporation of new discoveries in neuroscience research, the transition away from strict structural replication and towards the functional replication of neural systems, and the use of mathematical theory frameworks to guide algorithm and hardware developments.« less
A historical survey of algorithms and hardware architectures for neural-inspired and neuromorphic computing applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

James, Conrad D.; Aimone, James B.; Miner, Nadine E.

In this study, biological neural networks continue to inspire new developments in algorithms and microelectronic hardware to solve challenging data processing and classification problems. Here in this research, we survey the history of neural-inspired and neuromorphic computing in order to examine the complex and intertwined trajectories of the mathematical theory and hardware developed in this field. Early research focused on adapting existing hardware to emulate the pattern recognition capabilities of living organisms. Contributions from psychologists, mathematicians, engineers, neuroscientists, and other professions were crucial to maturing the field from narrowly-tailored demonstrations to more generalizable systems capable of addressing difficult problem classesmore » such as object detection and speech recognition. Algorithms that leverage fundamental principles found in neuroscience such as hierarchical structure, temporal integration, and robustness to error have been developed, and some of these approaches are achieving world-leading performance on particular data classification tasks. Additionally, novel microelectronic hardware is being developed to perform logic and to serve as memory in neuromorphic computing systems with optimized system integration and improved energy efficiency. Key to such advancements was the incorporation of new discoveries in neuroscience research, the transition away from strict structural replication and towards the functional replication of neural systems, and the use of mathematical theory frameworks to guide algorithm and hardware developments.« less
Applying Computer Models to Realize Closed-Loop Neonatal Oxygen Therapy.

PubMed

Morozoff, Edmund; Smyth, John A; Saif, Mehrdad

2017-01-01

Within the context of automating neonatal oxygen therapy, this article describes the transformation of an idea verified by a computer model into a device actuated by a computer model. Computer modeling of an entire neonatal oxygen therapy system can facilitate the development of closed-loop control algorithms by providing a verification platform and speeding up algorithm development. In this article, we present a method of mathematically modeling the system's components: the oxygen transport within the patient, the oxygen blender, the controller, and the pulse oximeter. Furthermore, within the constraints of engineering a product, an idealized model of the neonatal oxygen transport component may be integrated effectively into the control algorithm of a device, referred to as the adaptive model. Manual and closed-loop oxygen therapy performance were defined in this article by 3 criteria in the following order of importance: percent duration of SpO2 spent in normoxemia (target SpO2 ± 2.5%), hypoxemia (less than normoxemia), and hyperoxemia (more than normoxemia); number of 60-second periods <85% SpO2 and >95% SpO2; and number of manual adjustments. Results from a clinical evaluation that compared the performance of 3 closed-loop control algorithms (state machine, proportional-integral-differential, and adaptive model) with manual oxygen therapy on 7 low-birth-weight ventilated preterm babies, are presented. Compared with manual therapy, all closed-loop control algorithms significantly increased the patients' duration in normoxemia and reduced hyperoxemia (P < 0.05). The number of manual adjustments was also significantly reduced by all of the closed-loop control algorithms (P < 0.05). Although the performance of the 3 control algorithms was equivalent, it is suggested that the adaptive model, with its ease of use, may have the best utility.
A proposed study of multiple scattering through clouds up to 1 THz

NASA Technical Reports Server (NTRS)

Gerace, G. C.; Smith, E. K.

1992-01-01

A rigorous computation of the electromagnetic field scattered from an atmospheric liquid water cloud is proposed. The recent development of a fast recursive algorithm (Chew algorithm) for computing the fields scattered from numerous scatterers now makes a rigorous computation feasible. A method is presented for adapting this algorithm to a general case where there are an extremely large number of scatterers. It is also proposed to extend a new binary PAM channel coding technique (El-Khamy coding) to multiple levels with non-square pulse shapes. The Chew algorithm can be used to compute the transfer function of a cloud channel. Then the transfer function can be used to design an optimum El-Khamy code. In principle, these concepts can be applied directly to the realistic case of a time-varying cloud (adaptive channel coding and adaptive equalization). A brief review is included of some preliminary work on cloud dispersive effects on digital communication signals and on cloud liquid water spectra and correlations.
The application of generalized, cyclic, and modified numerical integration algorithms to problems of satellite orbit computation

NASA Technical Reports Server (NTRS)

Chesler, L.; Pierce, S.

1971-01-01

Generalized, cyclic, and modified multistep numerical integration methods are developed and evaluated for application to problems of satellite orbit computation. Generalized methods are compared with the presently utilized Cowell methods; new cyclic methods are developed for special second-order differential equations; and several modified methods are developed and applied to orbit computation problems. Special computer programs were written to generate coefficients for these methods, and subroutines were written which allow use of these methods with NASA's GEOSTAR computer program.
A conservative implicit finite difference algorithm for the unsteady transonic full potential equation

NASA Technical Reports Server (NTRS)

Steger, J. L.; Caradonna, F. X.

1980-01-01

An implicit finite difference procedure is developed to solve the unsteady full potential equation in conservation law form. Computational efficiency is maintained by use of approximate factorization techniques. The numerical algorithm is first order in time and second order in space. A circulation model and difference equations are developed for lifting airfoils in unsteady flow; however, thin airfoil body boundary conditions have been used with stretching functions to simplify the development of the numerical algorithm.
Desiderata for computable representations of electronic health records-driven phenotype algorithms.

PubMed

Mo, Huan; Thompson, William K; Rasmussen, Luke V; Pacheco, Jennifer A; Jiang, Guoqian; Kiefer, Richard; Zhu, Qian; Xu, Jie; Montague, Enid; Carrell, David S; Lingren, Todd; Mentch, Frank D; Ni, Yizhao; Wehbe, Firas H; Peissig, Peggy L; Tromp, Gerard; Larson, Eric B; Chute, Christopher G; Pathak, Jyotishman; Denny, Joshua C; Speltz, Peter; Kho, Abel N; Jarvik, Gail P; Bejan, Cosmin A; Williams, Marc S; Borthwick, Kenneth; Kitchner, Terrie E; Roden, Dan M; Harris, Paul A

2015-11-01

Electronic health records (EHRs) are increasingly used for clinical and translational research through the creation of phenotype algorithms. Currently, phenotype algorithms are most commonly represented as noncomputable descriptive documents and knowledge artifacts that detail the protocols for querying diagnoses, symptoms, procedures, medications, and/or text-driven medical concepts, and are primarily meant for human comprehension. We present desiderata for developing a computable phenotype representation model (PheRM). A team of clinicians and informaticians reviewed common features for multisite phenotype algorithms published in PheKB.org and existing phenotype representation platforms. We also evaluated well-known diagnostic criteria and clinical decision-making guidelines to encompass a broader category of algorithms. We propose 10 desired characteristics for a flexible, computable PheRM: (1) structure clinical data into queryable forms; (2) recommend use of a common data model, but also support customization for the variability and availability of EHR data among sites; (3) support both human-readable and computable representations of phenotype algorithms; (4) implement set operations and relational algebra for modeling phenotype algorithms; (5) represent phenotype criteria with structured rules; (6) support defining temporal relations between events; (7) use standardized terminologies and ontologies, and facilitate reuse of value sets; (8) define representations for text searching and natural language processing; (9) provide interfaces for external software algorithms; and (10) maintain backward compatibility. A computable PheRM is needed for true phenotype portability and reliability across different EHR products and healthcare systems. These desiderata are a guide to inform the establishment and evolution of EHR phenotype algorithm authoring platforms and languages. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
Performance evaluation for volumetric segmentation of multiple sclerosis lesions using MATLAB and computing engine in the graphical processing unit (GPU)

NASA Astrophysics Data System (ADS)

Le, Anh H.; Park, Young W.; Ma, Kevin; Jacobs, Colin; Liu, Brent J.

2010-03-01

Multiple Sclerosis (MS) is a progressive neurological disease affecting myelin pathways in the brain. Multiple lesions in the white matter can cause paralysis and severe motor disabilities of the affected patient. To solve the issue of inconsistency and user-dependency in manual lesion measurement of MRI, we have proposed a 3-D automated lesion quantification algorithm to enable objective and efficient lesion volume tracking. The computer-aided detection (CAD) of MS, written in MATLAB, utilizes K-Nearest Neighbors (KNN) method to compute the probability of lesions on a per-voxel basis. Despite the highly optimized algorithm of imaging processing that is used in CAD development, MS CAD integration and evaluation in clinical workflow is technically challenging due to the requirement of high computation rates and memory bandwidth in the recursive nature of the algorithm. In this paper, we present the development and evaluation of using a computing engine in the graphical processing unit (GPU) with MATLAB for segmentation of MS lesions. The paper investigates the utilization of a high-end GPU for parallel computing of KNN in the MATLAB environment to improve algorithm performance. The integration is accomplished using NVIDIA's CUDA developmental toolkit for MATLAB. The results of this study will validate the practicality and effectiveness of the prototype MS CAD in a clinical setting. The GPU method may allow MS CAD to rapidly integrate in an electronic patient record or any disease-centric health care system.
Correction of Visual Perception Based on Neuro-Fuzzy Learning for the Humanoid Robot TEO.

PubMed

Hernandez-Vicen, Juan; Martinez, Santiago; Garcia-Haro, Juan Miguel; Balaguer, Carlos

2018-03-25

New applications related to robotic manipulation or transportation tasks, with or without physical grasping, are continuously being developed. To perform these activities, the robot takes advantage of different kinds of perceptions. One of the key perceptions in robotics is vision. However, some problems related to image processing makes the application of visual information within robot control algorithms difficult. Camera-based systems have inherent errors that affect the quality and reliability of the information obtained. The need of correcting image distortion slows down image parameter computing, which decreases performance of control algorithms. In this paper, a new approach to correcting several sources of visual distortions on images in only one computing step is proposed. The goal of this system/algorithm is the computation of the tilt angle of an object transported by a robot, minimizing image inherent errors and increasing computing speed. After capturing the image, the computer system extracts the angle using a Fuzzy filter that corrects at the same time all possible distortions, obtaining the real angle in only one processing step. This filter has been developed by the means of Neuro-Fuzzy learning techniques, using datasets with information obtained from real experiments. In this way, the computing time has been decreased and the performance of the application has been improved. The resulting algorithm has been tried out experimentally in robot transportation tasks in the humanoid robot TEO (Task Environment Operator) from the University Carlos III of Madrid.

Correction of Visual Perception Based on Neuro-Fuzzy Learning for the Humanoid Robot TEO

PubMed Central

2018-01-01

New applications related to robotic manipulation or transportation tasks, with or without physical grasping, are continuously being developed. To perform these activities, the robot takes advantage of different kinds of perceptions. One of the key perceptions in robotics is vision. However, some problems related to image processing makes the application of visual information within robot control algorithms difficult. Camera-based systems have inherent errors that affect the quality and reliability of the information obtained. The need of correcting image distortion slows down image parameter computing, which decreases performance of control algorithms. In this paper, a new approach to correcting several sources of visual distortions on images in only one computing step is proposed. The goal of this system/algorithm is the computation of the tilt angle of an object transported by a robot, minimizing image inherent errors and increasing computing speed. After capturing the image, the computer system extracts the angle using a Fuzzy filter that corrects at the same time all possible distortions, obtaining the real angle in only one processing step. This filter has been developed by the means of Neuro-Fuzzy learning techniques, using datasets with information obtained from real experiments. In this way, the computing time has been decreased and the performance of the application has been improved. The resulting algorithm has been tried out experimentally in robot transportation tasks in the humanoid robot TEO (Task Environment Operator) from the University Carlos III of Madrid. PMID:29587392
A Scalable O(N) Algorithm for Large-Scale Parallel First-Principles Molecular Dynamics Simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Osei-Kuffuor, Daniel; Fattebert, Jean-Luc

2014-01-01

Traditional algorithms for first-principles molecular dynamics (FPMD) simulations only gain a modest capability increase from current petascale computers, due to their O(N 3) complexity and their heavy use of global communications. To address this issue, we are developing a truly scalable O(N) complexity FPMD algorithm, based on density functional theory (DFT), which avoids global communications. The computational model uses a general nonorthogonal orbital formulation for the DFT energy functional, which requires knowledge of selected elements of the inverse of the associated overlap matrix. We present a scalable algorithm for approximately computing selected entries of the inverse of the overlap matrix,more » based on an approximate inverse technique, by inverting local blocks corresponding to principal submatrices of the global overlap matrix. The new FPMD algorithm exploits sparsity and uses nearest neighbor communication to provide a computational scheme capable of extreme scalability. Accuracy is controlled by the mesh spacing of the finite difference discretization, the size of the localization regions in which the electronic orbitals are confined, and a cutoff beyond which the entries of the overlap matrix can be omitted when computing selected entries of its inverse. We demonstrate the algorithm's excellent parallel scaling for up to O(100K) atoms on O(100K) processors, with a wall-clock time of O(1) minute per molecular dynamics time step.« less
Mathematical model of a rotational bioreactor for the dynamic cultivation of scaffold-adhered human mesenchymal stem cells for bone regeneration

NASA Astrophysics Data System (ADS)

Ganimedov, V. L.; Papaeva, E. O.; Maslov, N. A.; Larionov, P. M.

2017-09-01

Development of cell-mediated scaffold technologies for the treatment of critical bone defects is very important for the purpose of reparative bone regeneration. Today the properties of the bioreactor for cell-seeded scaffold cultivation are the subject of intensive research. We used the mathematical modeling of rotational reactor and construct computational algorithm with the help of ANSYS software package to develop this new procedure. The solution obtained with the help of the constructed computational algorithm is in good agreement with the analytical solution of Couette for the task of two coaxial cylinders. The series of flow computations for different rotation frequencies (1, 0.75, 0.5, 0.33, 1.125 Hz) was performed for the laminar flow regime approximation with the help of computational algorithm. It was found that Taylor vortices appear in the annular gap between the cylinders in a simulated bioreactor. It was obtained that shear stress in the range of interest (0.002-0.1 Pa) arise on outer surface of inner cylinder when it rotates with the frequency not exceeding 0.8 Hz. So the constructed mathematical model and the created computational algorithm for calculating the flow parameters allow predicting the shear stress and pressure values depending on the rotation frequency and geometric parameters, as well as optimizing the operating mode of the bioreactor.
Multi-source Geospatial Data Analysis with Google Earth Engine

NASA Astrophysics Data System (ADS)

Erickson, T.

2014-12-01

The Google Earth Engine platform is a cloud computing environment for data analysis that combines a public data catalog with a large-scale computational facility optimized for parallel processing of geospatial data. The data catalog is a multi-petabyte archive of georeferenced datasets that include images from Earth observing satellite and airborne sensors (examples: USGS Landsat, NASA MODIS, USDA NAIP), weather and climate datasets, and digital elevation models. Earth Engine supports both a just-in-time computation model that enables real-time preview and debugging during algorithm development for open-ended data exploration, and a batch computation mode for applying algorithms over large spatial and temporal extents. The platform automatically handles many traditionally-onerous data management tasks, such as data format conversion, reprojection, and resampling, which facilitates writing algorithms that combine data from multiple sensors and/or models. Although the primary use of Earth Engine, to date, has been the analysis of large Earth observing satellite datasets, the computational platform is generally applicable to a wide variety of use cases that require large-scale geospatial data analyses. This presentation will focus on how Earth Engine facilitates the analysis of geospatial data streams that originate from multiple separate sources (and often communities) and how it enables collaboration during algorithm development and data exploration. The talk will highlight current projects/analyses that are enabled by this functionality.https://earthengine.google.org
Advanced processing for high-bandwidth sensor systems

NASA Astrophysics Data System (ADS)

Szymanski, John J.; Blain, Phil C.; Bloch, Jeffrey J.; Brislawn, Christopher M.; Brumby, Steven P.; Cafferty, Maureen M.; Dunham, Mark E.; Frigo, Janette R.; Gokhale, Maya; Harvey, Neal R.; Kenyon, Garrett; Kim, Won-Ha; Layne, J.; Lavenier, Dominique D.; McCabe, Kevin P.; Mitchell, Melanie; Moore, Kurt R.; Perkins, Simon J.; Porter, Reid B.; Robinson, S.; Salazar, Alfonso; Theiler, James P.; Young, Aaron C.

2000-11-01

Compute performance and algorithm design are key problems of image processing and scientific computing in general. For example, imaging spectrometers are capable of producing data in hundreds of spectral bands with millions of pixels. These data sets show great promise for remote sensing applications, but require new and computationally intensive processing. The goal of the Deployable Adaptive Processing Systems (DAPS) project at Los Alamos National Laboratory is to develop advanced processing hardware and algorithms for high-bandwidth sensor applications. The project has produced electronics for processing multi- and hyper-spectral sensor data, as well as LIDAR data, while employing processing elements using a variety of technologies. The project team is currently working on reconfigurable computing technology and advanced feature extraction techniques, with an emphasis on their application to image and RF signal processing. This paper presents reconfigurable computing technology and advanced feature extraction algorithm work and their application to multi- and hyperspectral image processing. Related projects on genetic algorithms as applied to image processing will be introduced, as will the collaboration between the DAPS project and the DARPA Adaptive Computing Systems program. Further details are presented in other talks during this conference and in other conferences taking place during this symposium.
Multilevel Iterative Methods in Nonlinear Computational Plasma Physics

NASA Astrophysics Data System (ADS)

Knoll, D. A.; Finn, J. M.

1997-11-01

Many applications in computational plasma physics involve the implicit numerical solution of coupled systems of nonlinear partial differential equations or integro-differential equations. Such problems arise in MHD, systems of Vlasov-Fokker-Planck equations, edge plasma fluid equations. We have been developing matrix-free Newton-Krylov algorithms for such problems and have applied these algorithms to the edge plasma fluid equations [1,2] and to the Vlasov-Fokker-Planck equation [3]. Recently we have found that with increasing grid refinement, the number of Krylov iterations required per Newton iteration has grown unmanageable [4]. This has led us to the study of multigrid methods as a means of preconditioning matrix-free Newton-Krylov methods. In this poster we will give details of the general multigrid preconditioned Newton-Krylov algorithm, as well as algorithm performance details on problems of interest in the areas of magnetohydrodynamics and edge plasma physics. Work supported by US DoE 1. Knoll and McHugh, J. Comput. Phys., 116, pg. 281 (1995) 2. Knoll and McHugh, Comput. Phys. Comm., 88, pg. 141 (1995) 3. Mousseau and Knoll, J. Comput. Phys. (1997) (to appear) 4. Knoll and McHugh, SIAM J. Sci. Comput. 19, (1998) (to appear)
Computational complexity of algorithms for sequence comparison, short-read assembly and genome alignment.

PubMed

Baichoo, Shakuntala; Ouzounis, Christos A

A multitude of algorithms for sequence comparison, short-read assembly and whole-genome alignment have been developed in the general context of molecular biology, to support technology development for high-throughput sequencing, numerous applications in genome biology and fundamental research on comparative genomics. The computational complexity of these algorithms has been previously reported in original research papers, yet this often neglected property has not been reviewed previously in a systematic manner and for a wider audience. We provide a review of space and time complexity of key sequence analysis algorithms and highlight their properties in a comprehensive manner, in order to identify potential opportunities for further research in algorithm or data structure optimization. The complexity aspect is poised to become pivotal as we will be facing challenges related to the continuous increase of genomic data on unprecedented scales and complexity in the foreseeable future, when robust biological simulation at the cell level and above becomes a reality. Copyright © 2017 Elsevier B.V. All rights reserved.
Integration of Libration Point Orbit Dynamics into a Universal 3-D Autonomous Formation Flying Algorithm

NASA Technical Reports Server (NTRS)

Folta, David; Bauer, Frank H. (Technical Monitor)

2001-01-01

The autonomous formation flying control algorithm developed by the Goddard Space Flight Center (GSFC) for the New Millennium Program (NMP) Earth Observing-1 (EO-1) mission is investigated for applicability to libration point orbit formations. In the EO-1 formation-flying algorithm, control is accomplished via linearization about a reference transfer orbit with a state transition matrix (STM) computed from state inputs. The effect of libration point orbit dynamics on this algorithm architecture is explored via computation of STMs using the flight proven code, a monodromy matrix developed from a N-body model of a libration orbit, and a standard STM developed from the gravitational and coriolis effects as measured at the libration point. A comparison of formation flying Delta-Vs calculated from these methods is made to a standard linear quadratic regulator (LQR) method. The universal 3-D approach is optimal in the sense that it can be accommodated as an open-loop or closed-loop control using only state information.
Modeling and optimum time performance for concurrent processing

NASA Technical Reports Server (NTRS)

Mielke, Roland R.; Stoughton, John W.; Som, Sukhamoy

1988-01-01

The development of a new graph theoretic model for describing the relation between a decomposed algorithm and its execution in a data flow environment is presented. Called ATAMM, the model consists of a set of Petri net marked graphs useful for representing decision-free algorithms having large-grained, computationally complex primitive operations. Performance time measures which determine computing speed and throughput capacity are defined, and the ATAMM model is used to develop lower bounds for these times. A concurrent processing operating strategy for achieving optimum time performance is presented and illustrated by example.
An Analysis of the RCA Price-S Cost Estimation Model as it Relates to Current Air Force Computer Software Acquisition and Management.

DTIC Science & Technology

1979-12-01

because of the use of complex computational algorithms (Ref 25). Another important factor effecting the cost of soft- ware is the size of the development...involved the alignment and navigational algorithm portions of the software. The second avionics system application was the development of an inertial...001 1 COAT CONL CREA CINT CMAT CSTR COPR CAPP New Code .001 .001 .001 .001 1001 ,OO .00 Device TDAT T03NL TREA TINT Types o * Quantity OGAT OONL OREA
Application of machine learning methods in bioinformatics

NASA Astrophysics Data System (ADS)

Yang, Haoyu; An, Zheng; Zhou, Haotian; Hou, Yawen

2018-05-01

Faced with the development of bioinformatics, high-throughput genomic technology have enabled biology to enter the era of big data. [1] Bioinformatics is an interdisciplinary, including the acquisition, management, analysis, interpretation and application of biological information, etc. It derives from the Human Genome Project. The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets.[2]. This paper analyzes and compares various algorithms of machine learning and their applications in bioinformatics.
Development of numerical methods for overset grids with applications for the integrated Space Shuttle vehicle

NASA Technical Reports Server (NTRS)

Chan, William M.

1995-01-01

Algorithms and computer code developments were performed for the overset grid approach to solving computational fluid dynamics problems. The techniques developed are applicable to compressible Navier-Stokes flow for any general complex configurations. The computer codes developed were tested on different complex configurations with the Space Shuttle launch vehicle configuration as the primary test bed. General, efficient and user-friendly codes were produced for grid generation, flow solution and force and moment computation.
Optimization Techniques for Analysis of Biological and Social Networks

DTIC Science & Technology

2012-03-28

analyzing a new metaheuristic technique, variable objective search. 3. Experimentation and application: Implement the proposed algorithms , test and fine...alternative mathematical programming formulations, their theoretical analysis, the development of exact algorithms , and heuristics. Originally, clusters...systematic fashion under a unifying theoretical and algorithmic framework. Optimization, Complex Networks, Social Network Analysis, Computational
Algorithm Development for the Multi-Fluid Plasma Model

DTIC Science & Technology

2011-05-30

392, Sep 1995. [13] L Chacon , DC Barnes, DA Knoll, and GH Miley. An implicit energy- conservative 2D Fokker-Planck algorithm. Journal of Computational...Physics, 157(2):618–653, 2000. [14] L Chacon , DC Barnes, DA Knoll, and GH Miley. An implicit energy- conservative 2D Fokker-Planck algorithm - II
Quantitative imaging biomarkers: a review of statistical methods for computer algorithm comparisons.

PubMed

Obuchowski, Nancy A; Reeves, Anthony P; Huang, Erich P; Wang, Xiao-Feng; Buckler, Andrew J; Kim, Hyun J Grace; Barnhart, Huiman X; Jackson, Edward F; Giger, Maryellen L; Pennello, Gene; Toledano, Alicia Y; Kalpathy-Cramer, Jayashree; Apanasovich, Tatiyana V; Kinahan, Paul E; Myers, Kyle J; Goldgof, Dmitry B; Barboriak, Daniel P; Gillies, Robert J; Schwartz, Lawrence H; Sullivan, Daniel C

2015-02-01

Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research. © The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Tumuluru, Jaya Shankar; McCulloch, Richard Chet James

In this work a new hybrid genetic algorithm was developed which combines a rudimentary adaptive steepest ascent hill climbing algorithm with a sophisticated evolutionary algorithm in order to optimize complex multivariate design problems. By combining a highly stochastic algorithm (evolutionary) with a simple deterministic optimization algorithm (adaptive steepest ascent) computational resources are conserved and the solution converges rapidly when compared to either algorithm alone. In genetic algorithms natural selection is mimicked by random events such as breeding and mutation. In the adaptive steepest ascent algorithm each variable is perturbed by a small amount and the variable that caused the mostmore » improvement is incremented by a small step. If the direction of most benefit is exactly opposite of the previous direction with the most benefit then the step size is reduced by a factor of 2, thus the step size adapts to the terrain. A graphical user interface was created in MATLAB to provide an interface between the hybrid genetic algorithm and the user. Additional features such as bounding the solution space and weighting the objective functions individually are also built into the interface. The algorithm developed was tested to optimize the functions developed for a wood pelleting process. Using process variables (such as feedstock moisture content, die speed, and preheating temperature) pellet properties were appropriately optimized. Specifically, variables were found which maximized unit density, bulk density, tapped density, and durability while minimizing pellet moisture content and specific energy consumption. The time and computational resources required for the optimization were dramatically decreased using the hybrid genetic algorithm when compared to MATLAB's native evolutionary optimization tool.« less
User's guide to the Fault Inferring Nonlinear Detection System (FINDS) computer program

NASA Technical Reports Server (NTRS)

Caglayan, A. K.; Godiwala, P. M.; Satz, H. S.

1988-01-01

Described are the operation and internal structure of the computer program FINDS (Fault Inferring Nonlinear Detection System). The FINDS algorithm is designed to provide reliable estimates for aircraft position, velocity, attitude, and horizontal winds to be used for guidance and control laws in the presence of possible failures in the avionics sensors. The FINDS algorithm was developed with the use of a digital simulation of a commercial transport aircraft and tested with flight recorded data. The algorithm was then modified to meet the size constraints and real-time execution requirements on a flight computer. For the real-time operation, a multi-rate implementation of the FINDS algorithm has been partitioned to execute on a dual parallel processor configuration: one based on the translational dynamics and the other on the rotational kinematics. The report presents an overview of the FINDS algorithm, the implemented equations, the flow charts for the key subprograms, the input and output files, program variable indexing convention, subprogram descriptions, and the common block descriptions used in the program.
Differential-Evolution Control Parameter Optimization for Unmanned Aerial Vehicle Path Planning

PubMed Central

Kok, Kai Yit; Rajendran, Parvathy

2016-01-01

The differential evolution algorithm has been widely applied on unmanned aerial vehicle (UAV) path planning. At present, four random tuning parameters exist for differential evolution algorithm, namely, population size, differential weight, crossover, and generation number. These tuning parameters are required, together with user setting on path and computational cost weightage. However, the optimum settings of these tuning parameters vary according to application. Instead of trial and error, this paper presents an optimization method of differential evolution algorithm for tuning the parameters of UAV path planning. The parameters that this research focuses on are population size, differential weight, crossover, and generation number. The developed algorithm enables the user to simply define the weightage desired between the path and computational cost to converge with the minimum generation required based on user requirement. In conclusion, the proposed optimization of tuning parameters in differential evolution algorithm for UAV path planning expedites and improves the final output path and computational cost. PMID:26943630
Using Mathematics to Make Computing on Encrypted Data Secure and Practical

DTIC Science & Technology

2015-12-01

LLL) lattice basis reduction algorithm, G-Lattice, Cryptography , Security, Gentry-Szydlo Algorithm, Ring-LWE 16. SECURITY CLASSIFICATION OF: 17...with symmetry be further developed, in order to quantify the security of lattice-based cryptography , including especially the security of homomorphic...the Gentry-Szydlo algorithm, and the ideas should be applicable to a range of questions in cryptography . The new algorithm of Lenstra and Silverberg
On the development of efficient algorithms for three dimensional fluid flow

NASA Technical Reports Server (NTRS)

Maccormack, R. W.

1988-01-01

The difficulties of constructing efficient algorithms for three-dimensional flow are discussed. Reasonable candidates are analyzed and tested, and most are found to have obvious shortcomings. Yet, there is promise that an efficient class of algorithms exist between the severely time-step sized-limited explicit or approximately factored algorithms and the computationally intensive direct inversion of large sparse matrices by Gaussian elimination.

Efficient 3D geometric and Zernike moments computation from unstructured surface meshes.

PubMed

Pozo, José María; Villa-Uriol, Maria-Cruz; Frangi, Alejandro F

2011-03-01

This paper introduces and evaluates a fast exact algorithm and a series of faster approximate algorithms for the computation of 3D geometric moments from an unstructured surface mesh of triangles. Being based on the object surface reduces the computational complexity of these algorithms with respect to volumetric grid-based algorithms. In contrast, it can only be applied for the computation of geometric moments of homogeneous objects. This advantage and restriction is shared with other proposed algorithms based on the object boundary. The proposed exact algorithm reduces the computational complexity for computing geometric moments up to order N with respect to previously proposed exact algorithms, from N(9) to N(6). The approximate series algorithm appears as a power series on the rate between triangle size and object size, which can be truncated at any desired degree. The higher the number and quality of the triangles, the better the approximation. This approximate algorithm reduces the computational complexity to N(3). In addition, the paper introduces a fast algorithm for the computation of 3D Zernike moments from the computed geometric moments, with a computational complexity N(4), while the previously proposed algorithm is of order N(6). The error introduced by the proposed approximate algorithms is evaluated in different shapes and the cost-benefit ratio in terms of error, and computational time is analyzed for different moment orders.
Algorithm and code development for unsteady three-dimensional Navier-Stokes equations

NASA Technical Reports Server (NTRS)

Obayashi, Shigeru

1991-01-01

A streamwise upwind algorithm for solving the unsteady 3-D Navier-Stokes equations was extended to handle the moving grid system. It is noted that the finite volume concept is essential to extend the algorithm. The resulting algorithm is conservative for any motion of the coordinate system. Two extensions to an implicit method were considered and the implicit extension that makes the algorithm computationally efficient is implemented into Ames's aeroelasticity code, ENSAERO. The new flow solver has been validated through the solution of test problems. Test cases include three-dimensional problems with fixed and moving grids. The first test case shown is an unsteady viscous flow over an F-5 wing, while the second test considers the motion of the leading edge vortex as well as the motion of the shock wave for a clipped delta wing. The resulting algorithm has been implemented into ENSAERO. The upwind version leads to higher accuracy in both steady and unsteady computations than the previously used central-difference method does, while the increase in the computational time is small.
Fundamentals and Recent Developments in Approximate Bayesian Computation

PubMed Central

Lintusaari, Jarno; Gutmann, Michael U.; Dutta, Ritabrata; Kaski, Samuel; Corander, Jukka

2017-01-01

Abstract Bayesian inference plays an important role in phylogenetics, evolutionary biology, and in many other branches of science. It provides a principled framework for dealing with uncertainty and quantifying how it changes in the light of new evidence. For many complex models and inference problems, however, only approximate quantitative answers are obtainable. Approximate Bayesian computation (ABC) refers to a family of algorithms for approximate inference that makes a minimal set of assumptions by only requiring that sampling from a model is possible. We explain here the fundamentals of ABC, review the classical algorithms, and highlight recent developments. [ABC; approximate Bayesian computation; Bayesian inference; likelihood-free inference; phylogenetics; simulator-based models; stochastic simulation models; tree-based models.] PMID:28175922
Computation of multi-dimensional viscous supersonic flow

NASA Technical Reports Server (NTRS)

Buggeln, R. C.; Kim, Y. N.; Mcdonald, H.

1986-01-01

A method has been developed for two- and three-dimensional computations of viscous supersonic jet flows interacting with an external flow. The approach employs a reduced form of the Navier-Stokes equations which allows solution as an initial-boundary value problem in space, using an efficient noniterative forward marching algorithm. Numerical instability associated with forward marching algorithms for flows with embedded subsonic regions is avoided by approximation of the reduced form of the Navier-Stokes equations in the subsonic regions of the boundary layers. Supersonic and subsonic portions of the flow field are simultaneously calculated by a consistently split linearized block implicit computational algorithm. The results of computations for a series of test cases associated with supersonic jet flow is presented and compared with other calculations for axisymmetric cases. Demonstration calculations indicate that the computational technique has great promise as a tool for calculating a wide range of supersonic flow problems including jet flow. Finally, a User's Manual is presented for the computer code used to perform the calculations.
Face pose tracking using the four-point algorithm

NASA Astrophysics Data System (ADS)

Fung, Ho Yin; Wong, Kin Hong; Yu, Ying Kin; Tsui, Kwan Pang; Kam, Ho Chuen

2017-06-01

In this paper, we have developed an algorithm to track the pose of a human face robustly and efficiently. Face pose estimation is very useful in many applications such as building virtual reality systems and creating an alternative input method for the disabled. Firstly, we have modified a face detection toolbox called DLib for the detection of a face in front of a camera. The detected face features are passed to a pose estimation method, known as the four-point algorithm, for pose computation. The theory applied and the technical problems encountered during system development are discussed in the paper. It is demonstrated that the system is able to track the pose of a face in real time using a consumer grade laptop computer.
Robust Vision-Based Pose Estimation Algorithm for AN Uav with Known Gravity Vector

NASA Astrophysics Data System (ADS)

Kniaz, V. V.

2016-06-01

Accurate estimation of camera external orientation with respect to a known object is one of the central problems in photogrammetry and computer vision. In recent years this problem is gaining an increasing attention in the field of UAV autonomous flight. Such application requires a real-time performance and robustness of the external orientation estimation algorithm. The accuracy of the solution is strongly dependent on the number of reference points visible on the given image. The problem only has an analytical solution if 3 or more reference points are visible. However, in limited visibility conditions it is often needed to perform external orientation with only 2 visible reference points. In such case the solution could be found if the gravity vector direction in the camera coordinate system is known. A number of algorithms for external orientation estimation for the case of 2 known reference points and a gravity vector were developed to date. Most of these algorithms provide analytical solution in the form of polynomial equation that is subject to large errors in the case of complex reference points configurations. This paper is focused on the development of a new computationally effective and robust algorithm for external orientation based on positions of 2 known reference points and a gravity vector. The algorithm implementation for guidance of a Parrot AR.Drone 2.0 micro-UAV is discussed. The experimental evaluation of the algorithm proved its computational efficiency and robustness against errors in reference points positions and complex configurations.
Using advanced computer vision algorithms on small mobile robots

NASA Astrophysics Data System (ADS)

Kogut, G.; Birchmore, F.; Biagtan Pacis, E.; Everett, H. R.

2006-05-01

The Technology Transfer project employs a spiral development process to enhance the functionality and autonomy of mobile robot systems in the Joint Robotics Program (JRP) Robotic Systems Pool by converging existing component technologies onto a transition platform for optimization. An example of this approach is the implementation of advanced computer vision algorithms on small mobile robots. We demonstrate the implementation and testing of the following two algorithms useful on mobile robots: 1) object classification using a boosted Cascade of classifiers trained with the Adaboost training algorithm, and 2) human presence detection from a moving platform. Object classification is performed with an Adaboost training system developed at the University of California, San Diego (UCSD) Computer Vision Lab. This classification algorithm has been used to successfully detect the license plates of automobiles in motion in real-time. While working towards a solution to increase the robustness of this system to perform generic object recognition, this paper demonstrates an extension to this application by detecting soda cans in a cluttered indoor environment. The human presence detection from a moving platform system uses a data fusion algorithm which combines results from a scanning laser and a thermal imager. The system is able to detect the presence of humans while both the humans and the robot are moving simultaneously. In both systems, the two aforementioned algorithms were implemented on embedded hardware and optimized for use in real-time. Test results are shown for a variety of environments.
SMV⊥: Simplex of maximal volume based upon the Gram-Schmidt process

NASA Astrophysics Data System (ADS)

Salazar-Vazquez, Jairo; Mendez-Vazquez, Andres

2015-10-01

In recent years, different algorithms for Hyperspectral Image (HI) analysis have been introduced. The high spectral resolution of these images allows to develop different algorithms for target detection, material mapping, and material identification for applications in Agriculture, Security and Defense, Industry, etc. Therefore, from the computer science's point of view, there is fertile field of research for improving and developing algorithms in HI analysis. In some applications, the spectral pixels of a HI can be classified using laboratory spectral signatures. Nevertheless, for many others, there is no enough available prior information or spectral signatures, making any analysis a difficult task. One of the most popular algorithms for the HI analysis is the N-FINDR because it is easy to understand and provides a way to unmix the original HI in the respective material compositions. The N-FINDR is computationally expensive and its performance depends on a random initialization process. This paper proposes a novel idea to reduce the complexity of the N-FINDR by implementing a bottom-up approach based in an observation from linear algebra and the use of the Gram-Schmidt process. Therefore, the Simplex of Maximal Volume Perpendicular (SMV⊥) algorithm is proposed for fast endmember extraction in hyperspectral imagery. This novel algorithm has complexity O(n) with respect to the number of pixels. In addition, the evidence shows that SMV⊥ calculates a bigger volume, and has lower computational time complexity than other poular algorithms on synthetic and real scenarios.
Parallel Algorithms for Computational Models of Geophysical Systems

NASA Astrophysics Data System (ADS)

Carrillo Ledesma, A.; Herrera, I.; de la Cruz, L. M.; Hernández, G.; Grupo de Modelacion Matematica y Computacional

2013-05-01

Mathematical models of many systems of interest, including very important continuous systems of Earth Sciences and Engineering, lead to a great variety of partial differential equations (PDEs) whose solution methods are based on the computational processing of large-scale algebraic systems. Furthermore, the incredible expansion experienced by the existing computational hardware and software has made amenable to effective treatment problems of an ever increasing diversity and complexity, posed by scientific and engineering applications. Parallel computing is outstanding among the new computational tools and, in order to effectively use the most advanced computers available today, massively parallel software is required. Domain decomposition methods (DDMs) have been developed precisely for effectively treating PDEs in paralle. Ideally, the main objective of domain decomposition research is to produce algorithms capable of 'obtaining the global solution by exclusively solving local problems', but up-to-now this has only been an aspiration; that is, a strong desire for achieving such a property and so we call it 'the DDM-paradigm'. In recent times, numerically competitive DDM-algorithms are non-overlapping, preconditioned and necessarily incorporate constraints which pose an additional challenge for achieving the DDM-paradigm. Recently a group of four algorithms, referred to as the 'DVS-algorithms', which fulfill the DDM-paradigm, was developed. To derive them a new discretization method, which uses a non-overlapping system of nodes (the derived-nodes), was introduced. This discretization procedure can be applied to any boundary-value problem, or system of such equations. In turn, the resulting system of discrete equations can be treated using any available DDM-algorithm. In particular, two of the four DVS-algorithms mentioned above were obtained by application of the well-known and very effective algorithms BDDC and FETI-DP; these will be referred to as the DVS-BDDC and DVS-FETI-DP algorithms. The other two, which will be referred to as the DVS-PRIMAL and DVS-DUAL algorithms, were obtained by application of two new algorithms that had not been previously reported in the literature. As said before, the four DVS-algorithms constitute a group of preconditioned and constrained algorithms that, for the first time, fulfill the DDM-paradigm. Both, BDDC and FETI-DP, are very well-known; and both are highly efficient. Recently, it was established that these two methods are closely related and its numerical performance is quite similar. On the other hand, through numerical experiments, we have established that the numerical performances of each one of the members of DVS-algorithms group (DVS-BDDC, DVS-FETI-DP, DVS-PRIMAL and DVS-DUAL) are very similar too. Furthermore, we have carried out comparisons of the performances of the standard versions of BDDC and FETI-DP with DVS-BDDC and DVS-FETI-DP, and in all such numerical experiments the DVS algorithms have performed significantly better.
Quantum computation in the analysis of hyperspectral data

NASA Astrophysics Data System (ADS)

Gomez, Richard B.; Ghoshal, Debabrata; Jayanna, Anil

2004-08-01

Recent research on the topic of quantum computation provides us with some quantum algorithms with higher efficiency and speedup compared to their classical counterparts. In this paper, it is our intent to provide the results of our investigation of several applications of such quantum algorithms - especially the Grover's Search algorithm - in the analysis of Hyperspectral Data. We found many parallels with Grover's method in existing data processing work that make use of classical spectral matching algorithms. Our efforts also included the study of several methods dealing with hyperspectral image analysis work where classical computation methods involving large data sets could be replaced with quantum computation methods. The crux of the problem in computation involving a hyperspectral image data cube is to convert the large amount of data in high dimensional space to real information. Currently, using the classical model, different time consuming methods and steps are necessary to analyze these data including: Animation, Minimum Noise Fraction Transform, Pixel Purity Index algorithm, N-dimensional scatter plot, Identification of Endmember spectra - are such steps. If a quantum model of computation involving hyperspectral image data can be developed and formalized - it is highly likely that information retrieval from hyperspectral image data cubes would be a much easier process and the final information content would be much more meaningful and timely. In this case, dimensionality would not be a curse, but a blessing.
The diffusive finite state projection algorithm for efficient simulation of the stochastic reaction-diffusion master equation.

PubMed

Drawert, Brian; Lawson, Michael J; Petzold, Linda; Khammash, Mustafa

2010-02-21

We have developed a computational framework for accurate and efficient simulation of stochastic spatially inhomogeneous biochemical systems. The new computational method employs a fractional step hybrid strategy. A novel formulation of the finite state projection (FSP) method, called the diffusive FSP method, is introduced for the efficient and accurate simulation of diffusive transport. Reactions are handled by the stochastic simulation algorithm.
Treecode with a Special-Purpose Processor

NASA Astrophysics Data System (ADS)

Makino, Junichiro

1991-08-01

We describe an implementation of the modified Barnes-Hut tree algorithm for a gravitational N-body calculation on a GRAPE (GRAvity PipE) backend processor. GRAPE is a special-purpose computer for N-body calculations. It receives the positions and masses of particles from a host computer and then calculates the gravitational force at each coordinate specified by the host. To use this GRAPE processor with the hierarchical tree algorithm, the host computer must maintain a list of all nodes that exert force on a particle. If we create this list for each particle of the system at each timestep, the number of floating-point operations on the host and that on GRAPE would become comparable, and the increased speed obtained by using GRAPE would be small. In our modified algorithm, we create a list of nodes for many particles. Thus, the amount of the work required of the host is significantly reduced. This algorithm was originally developed by Barnes in order to vectorize the force calculation on a Cyber 205. With this algorithm, the computing time of the force calculation becomes comparable to that of the tree construction, if the GRAPE backend processor is sufficiently fast. The obtained speed-up factor is 30 to 50 for a RISC-based host computer and GRAPE-1A with a peak speed of 240 Mflops.
Automatic target detection using binary template matching

NASA Astrophysics Data System (ADS)

Jun, Dong-San; Sun, Sun-Gu; Park, HyunWook

2005-03-01

This paper presents a new automatic target detection (ATD) algorithm to detect targets such as battle tanks and armored personal carriers in ground-to-ground scenarios. Whereas most ATD algorithms were developed for forward-looking infrared (FLIR) images, we have developed an ATD algorithm for charge-coupled device (CCD) images, which have superior quality to FLIR images in daylight. The proposed algorithm uses fast binary template matching with an adaptive binarization, which is robust to various light conditions in CCD images and saves computation time. Experimental results show that the proposed method has good detection performance.
Binary mesh partitioning for cache-efficient visualization.

PubMed

Tchiboukdjian, Marc; Danjean, Vincent; Raffin, Bruno

2010-01-01

One important bottleneck when visualizing large data sets is the data transfer between processor and memory. Cache-aware (CA) and cache-oblivious (CO) algorithms take into consideration the memory hierarchy to design cache efficient algorithms. CO approaches have the advantage to adapt to unknown and varying memory hierarchies. Recent CA and CO algorithms developed for 3D mesh layouts significantly improve performance of previous approaches, but they lack of theoretical performance guarantees. We present in this paper a {\\schmi O}(N\\log N) algorithm to compute a CO layout for unstructured but well shaped meshes. We prove that a coherent traversal of a N-size mesh in dimension d induces less than N/B+{\\schmi O}(N/M;{1/d}) cache-misses where B and M are the block size and the cache size, respectively. Experiments show that our layout computation is faster and significantly less memory consuming than the best known CO algorithm. Performance is comparable to this algorithm for classical visualization algorithm access patterns, or better when the BSP tree produced while computing the layout is used as an acceleration data structure adjusted to the layout. We also show that cache oblivious approaches lead to significant performance increases on recent GPU architectures.
Scalable and fault tolerant orthogonalization based on randomized distributed data aggregation

PubMed Central

Gansterer, Wilfried N.; Niederbrucker, Gerhard; Straková, Hana; Schulze Grotthoff, Stefan

2013-01-01

The construction of distributed algorithms for matrix computations built on top of distributed data aggregation algorithms with randomized communication schedules is investigated. For this purpose, a new aggregation algorithm for summing or averaging distributed values, the push-flow algorithm, is developed, which achieves superior resilience properties with respect to failures compared to existing aggregation methods. It is illustrated that on a hypercube topology it asymptotically requires the same number of iterations as the optimal all-to-all reduction operation and that it scales well with the number of nodes. Orthogonalization is studied as a prototypical matrix computation task. A new fault tolerant distributed orthogonalization method rdmGS, which can produce accurate results even in the presence of node failures, is built on top of distributed data aggregation algorithms. PMID:24748902
Numerical Algorithms for Acoustic Integrals - The Devil is in the Details

NASA Technical Reports Server (NTRS)

Brentner, Kenneth S.

1996-01-01

The accurate prediction of the aeroacoustic field generated by aerospace vehicles or nonaerospace machinery is necessary for designers to control and reduce source noise. Powerful computational aeroacoustic methods, based on various acoustic analogies (primarily the Lighthill acoustic analogy) and Kirchhoff methods, have been developed for prediction of noise from complicated sources, such as rotating blades. Both methods ultimately predict the noise through a numerical evaluation of an integral formulation. In this paper, we consider three generic acoustic formulations and several numerical algorithms that have been used to compute the solutions to these formulations. Algorithms for retarded-time formulations are the most efficient and robust, but they are difficult to implement for supersonic-source motion. Collapsing-sphere and emission-surface formulations are good alternatives when supersonic-source motion is present, but the numerical implementations of these formulations are more computationally demanding. New algorithms - which utilize solution adaptation to provide a specified error level - are needed.
Computational electromagnetics: the physics of smooth versus oscillatory fields.

PubMed

Chew, W C

2004-03-15

This paper starts by discussing the difference in the physics between solutions to Laplace's equation (static) and Maxwell's equations for dynamic problems (Helmholtz equation). Their differing physical characters are illustrated by how the two fields convey information away from their source point. The paper elucidates the fact that their differing physical characters affect the use of Laplacian field and Helmholtz field in imaging. They also affect the design of fast computational algorithms for electromagnetic scattering problems. Specifically, a comparison is made between fast algorithms developed using wavelets, the simple fast multipole method, and the multi-level fast multipole algorithm for electrodynamics. The impact of the physical characters of the dynamic field on the parallelization of the multi-level fast multipole algorithm is also discussed. The relationship of diagonalization of translators to group theory is presented. Finally, future areas of research for computational electromagnetics are described.
Radiation and scattering from bodies of translation. Volume 2: User's manual, computer program documentation

NASA Astrophysics Data System (ADS)

Medgyesi-Mitschang, L. N.; Putnam, J. M.

1980-04-01

A hierarchy of computer programs implementing the method of moments for bodies of translation (MM/BOT) is described. The algorithm treats the far-field radiation and scattering from finite-length open cylinders of arbitrary cross section as well as the near fields and aperture-coupled fields for rectangular apertures on such bodies. The theoretical development underlying the algorithm is described in Volume 1. The structure of the computer algorithm is such that no a priori knowledge of the method of moments technique or detailed FORTRAN experience are presupposed for the user. A set of carefully drawn example problems illustrates all the options of the algorithm. For more detailed understanding of the workings of the codes, special cross referencing to the equations in Volume 1 is provided. For additional clarity, comment statements are liberally interspersed in the code listings, summarized in the present volume.
A Bayesian framework for adaptive selection, calibration, and validation of coarse-grained models of atomistic systems

NASA Astrophysics Data System (ADS)

Farrell, Kathryn; Oden, J. Tinsley; Faghihi, Danial

2015-08-01

A general adaptive modeling algorithm for selection and validation of coarse-grained models of atomistic systems is presented. A Bayesian framework is developed to address uncertainties in parameters, data, and model selection. Algorithms for computing output sensitivities to parameter variances, model evidence and posterior model plausibilities for given data, and for computing what are referred to as Occam Categories in reference to a rough measure of model simplicity, make up components of the overall approach. Computational results are provided for representative applications.
Evaluation of a computer-aided detection algorithm for timely diagnosis of small acute intracranial hemorrhage on computed tomography in a critical care environment

NASA Astrophysics Data System (ADS)

Lee, Joon K.; Chan, Tao; Liu, Brent J.; Huang, H. K.

2009-02-01

Detection of acute intracranial hemorrhage (AIH) is a primary task in the interpretation of computed tomography (CT) brain scans of patients suffering from acute neurological disturbances or after head trauma. Interpretation can be difficult especially when the lesion is inconspicuous or the reader is inexperienced. We have previously developed a computeraided detection (CAD) algorithm to detect small AIH. One hundred and thirty five small AIH CT studies from the Los Angeles County (LAC) + USC Hospital were identified and matched by age and sex with one hundred and thirty five normal studies. These cases were then processed using our AIH CAD system to evaluate the efficacy and constraints of the algorithm.

Towards a computational- and algorithmic-level account of concept blending using analogies and amalgams

NASA Astrophysics Data System (ADS)

Besold, Tarek R.; Kühnberger, Kai-Uwe; Plaza, Enric

2017-10-01

Concept blending - a cognitive process which allows for the combination of certain elements (and their relations) from originally distinct conceptual spaces into a new unified space combining these previously separate elements, and enables reasoning and inference over the combination - is taken as a key element of creative thought and combinatorial creativity. In this article, we summarise our work towards the development of a computational-level and algorithmic-level account of concept blending, combining approaches from computational analogy-making and case-based reasoning (CBR). We present the theoretical background, as well as an algorithmic proposal integrating higher-order anti-unification matching and generalisation from analogy with amalgams from CBR. The feasibility of the approach is then exemplified in two case studies.
The development of an explicit thermochemical nonequilibrium algorithm and its application to compute three dimensional AFE flowfields

NASA Technical Reports Server (NTRS)

Palmer, Grant

1989-01-01

This study presents a three-dimensional explicit, finite-difference, shock-capturing numerical algorithm applied to viscous hypersonic flows in thermochemical nonequilibrium. The algorithm employs a two-temperature physical model. Equations governing the finite-rate chemical reactions are fully-coupled to the gas dynamic equations using a novel coupling technique. The new coupling method maintains stability in the explicit, finite-rate formulation while allowing relatively large global time steps. The code uses flux-vector accuracy. Comparisons with experimental data and other numerical computations verify the accuracy of the present method. The code is used to compute the three-dimensional flowfield over the Aeroassist Flight Experiment (AFE) vehicle at one of its trajectory points.
Colour based fire detection method with temporal intensity variation filtration

NASA Astrophysics Data System (ADS)

Trambitckii, K.; Anding, K.; Musalimov, V.; Linß, G.

2015-02-01

Development of video, computing technologies and computer vision gives a possibility of automatic fire detection on video information. Under that project different algorithms was implemented to find more efficient way of fire detection. In that article colour based fire detection algorithm is described. But it is not enough to use only colour information to detect fire properly. The main reason of this is that in the shooting conditions may be a lot of things having colour similar to fire. A temporary intensity variation of pixels is used to separate them from the fire. These variations are averaged over the series of several frames. This algorithm shows robust work and was realised as a computer program by using of the OpenCV library.
Fragmenting networks by targeting collective influencers at a mesoscopic level.

PubMed

Kobayashi, Teruyoshi; Masuda, Naoki

2016-11-25

A practical approach to protecting networks against epidemic processes such as spreading of infectious diseases, malware, and harmful viral information is to remove some influential nodes beforehand to fragment the network into small components. Because determining the optimal order to remove nodes is a computationally hard problem, various approximate algorithms have been proposed to efficiently fragment networks by sequential node removal. Morone and Makse proposed an algorithm employing the non-backtracking matrix of given networks, which outperforms various existing algorithms. In fact, many empirical networks have community structure, compromising the assumption of local tree-like structure on which the original algorithm is based. We develop an immunization algorithm by synergistically combining the Morone-Makse algorithm and coarse graining of the network in which we regard a community as a supernode. In this way, we aim to identify nodes that connect different communities at a reasonable computational cost. The proposed algorithm works more efficiently than the Morone-Makse and other algorithms on networks with community structure.
Fragmenting networks by targeting collective influencers at a mesoscopic level

NASA Astrophysics Data System (ADS)

Kobayashi, Teruyoshi; Masuda, Naoki

2016-11-01

A practical approach to protecting networks against epidemic processes such as spreading of infectious diseases, malware, and harmful viral information is to remove some influential nodes beforehand to fragment the network into small components. Because determining the optimal order to remove nodes is a computationally hard problem, various approximate algorithms have been proposed to efficiently fragment networks by sequential node removal. Morone and Makse proposed an algorithm employing the non-backtracking matrix of given networks, which outperforms various existing algorithms. In fact, many empirical networks have community structure, compromising the assumption of local tree-like structure on which the original algorithm is based. We develop an immunization algorithm by synergistically combining the Morone-Makse algorithm and coarse graining of the network in which we regard a community as a supernode. In this way, we aim to identify nodes that connect different communities at a reasonable computational cost. The proposed algorithm works more efficiently than the Morone-Makse and other algorithms on networks with community structure.
Fragmenting networks by targeting collective influencers at a mesoscopic level

PubMed Central

Kobayashi, Teruyoshi; Masuda, Naoki

2016-01-01

A practical approach to protecting networks against epidemic processes such as spreading of infectious diseases, malware, and harmful viral information is to remove some influential nodes beforehand to fragment the network into small components. Because determining the optimal order to remove nodes is a computationally hard problem, various approximate algorithms have been proposed to efficiently fragment networks by sequential node removal. Morone and Makse proposed an algorithm employing the non-backtracking matrix of given networks, which outperforms various existing algorithms. In fact, many empirical networks have community structure, compromising the assumption of local tree-like structure on which the original algorithm is based. We develop an immunization algorithm by synergistically combining the Morone-Makse algorithm and coarse graining of the network in which we regard a community as a supernode. In this way, we aim to identify nodes that connect different communities at a reasonable computational cost. The proposed algorithm works more efficiently than the Morone-Makse and other algorithms on networks with community structure. PMID:27886251
Efficient storage, computation, and exposure of computer-generated holograms by electron-beam lithography.

PubMed

Newman, D M; Hawley, R W; Goeckel, D L; Crawford, R D; Abraham, S; Gallagher, N C

1993-05-10

An efficient storage format was developed for computer-generated holograms for use in electron-beam lithography. This method employs run-length encoding and Lempel-Ziv-Welch compression and succeeds in exposing holograms that were previously infeasible owing to the hologram's tremendous pattern-data file size. These holograms also require significant computation; thus the algorithm was implemented on a parallel computer, which improved performance by 2 orders of magnitude. The decompression algorithm was integrated into the Cambridge electron-beam machine's front-end processor.Although this provides much-needed ability, some hardware enhancements will be required in the future to overcome inadequacies in the current front-end processor that result in a lengthy exposure time.
FPGA-based real-time embedded system for RISS/GPS integrated navigation.

PubMed

Abdelfatah, Walid Farid; Georgy, Jacques; Iqbal, Umar; Noureldin, Aboelmagd

2012-01-01

Navigation algorithms integrating measurements from multi-sensor systems overcome the problems that arise from using GPS navigation systems in standalone mode. Algorithms which integrate the data from 2D low-cost reduced inertial sensor system (RISS), consisting of a gyroscope and an odometer or wheel encoders, along with a GPS receiver via a Kalman filter has proved to be worthy in providing a consistent and more reliable navigation solution compared to standalone GPS receivers. It has been also shown to be beneficial, especially in GPS-denied environments such as urban canyons and tunnels. The main objective of this paper is to narrow the idea-to-implementation gap that follows the algorithm development by realizing a low-cost real-time embedded navigation system capable of computing the data-fused positioning solution. The role of the developed system is to synchronize the measurements from the three sensors, relative to the pulse per second signal generated from the GPS, after which the navigation algorithm is applied to the synchronized measurements to compute the navigation solution in real-time. Employing a customizable soft-core processor on an FPGA in the kernel of the navigation system, provided the flexibility for communicating with the various sensors and the computation capability required by the Kalman filter integration algorithm.
FPGA-Based Real-Time Embedded System for RISS/GPS Integrated Navigation

PubMed Central

Abdelfatah, Walid Farid; Georgy, Jacques; Iqbal, Umar; Noureldin, Aboelmagd

2012-01-01

Navigation algorithms integrating measurements from multi-sensor systems overcome the problems that arise from using GPS navigation systems in standalone mode. Algorithms which integrate the data from 2D low-cost reduced inertial sensor system (RISS), consisting of a gyroscope and an odometer or wheel encoders, along with a GPS receiver via a Kalman filter has proved to be worthy in providing a consistent and more reliable navigation solution compared to standalone GPS receivers. It has been also shown to be beneficial, especially in GPS-denied environments such as urban canyons and tunnels. The main objective of this paper is to narrow the idea-to-implementation gap that follows the algorithm development by realizing a low-cost real-time embedded navigation system capable of computing the data-fused positioning solution. The role of the developed system is to synchronize the measurements from the three sensors, relative to the pulse per second signal generated from the GPS, after which the navigation algorithm is applied to the synchronized measurements to compute the navigation solution in real-time. Employing a customizable soft-core processor on an FPGA in the kernel of the navigation system, provided the flexibility for communicating with the various sensors and the computation capability required by the Kalman filter integration algorithm. PMID:22368460
Scaling Watershed Models: Modern Approaches to Science Computation with MapReduce, Parallelization, and Cloud Optimization

EPA Science Inventory

Environmental models are products of the computer architecture and software tools available at the time of development. Scientifically sound algorithms may persist in their original state even as system architectures and software development approaches evolve and progress. Dating...
A modern control theory based algorithm for control of the NASA/JPL 70-meter antenna axis servos

NASA Technical Reports Server (NTRS)

Hill, R. E.

1987-01-01

A digital computer-based state variable controller was designed and applied to the 70-m antenna axis servos. The general equations and structure of the algorithm and provisions for alternate position error feedback modes to accommodate intertarget slew, encoder referenced tracking, and precision tracking modes are descibed. Development of the discrete time domain control model and computation of estimator and control gain parameters based on closed loop pole placement criteria are discussed. The new algorithm was successfully implemented and tested in the 70-m antenna at Deep Space Network station 63 in Spain.
Simulation of an enhanced TCAS 2 system in operation

NASA Technical Reports Server (NTRS)

Rojas, R. G.; Law, P.; Burnside, W. D.

1987-01-01

Described is a computer simulation of a Boeing 737 aircraft equipped with an enhanced Traffic and Collision Avoidance System (TCAS II). In particular, an algorithm is developed which permits the computer simulation of the tracking of a target airplane by a Boeing 373 which has a TCAS II array mounted on top of its fuselage. This algorithm has four main components: namely, the target path, the noise source, the alpha-beta filter, and threat detection. The implementation of each of these four components is described. Furthermore, the areas where the present algorithm needs to be improved are also mentioned.
Research of the effectiveness of parallel multithreaded realizations of interpolation methods for scaling raster images

NASA Astrophysics Data System (ADS)

Vnukov, A. A.; Shershnev, M. B.

2018-01-01

The aim of this work is the software implementation of three image scaling algorithms using parallel computations, as well as the development of an application with a graphical user interface for the Windows operating system to demonstrate the operation of algorithms and to study the relationship between system performance, algorithm execution time and the degree of parallelization of computations. Three methods of interpolation were studied, formalized and adapted to scale images. The result of the work is a program for scaling images by different methods. Comparison of the quality of scaling by different methods is given.
Algorithms for the Computation of Debris Risk

NASA Technical Reports Server (NTRS)

Matney, Mark J.

2017-01-01

Determining the risks from space debris involve a number of statistical calculations. These calculations inevitably involve assumptions about geometry - including the physical geometry of orbits and the geometry of satellites. A number of tools have been developed in NASA’s Orbital Debris Program Office to handle these calculations; many of which have never been published before. These include algorithms that are used in NASA’s Orbital Debris Engineering Model ORDEM 3.0, as well as other tools useful for computing orbital collision rates and ground casualty risks. This paper presents an introduction to these algorithms and the assumptions upon which they are based.
Algorithms for the Computation of Debris Risks

NASA Technical Reports Server (NTRS)

Matney, Mark

2017-01-01

Determining the risks from space debris involve a number of statistical calculations. These calculations inevitably involve assumptions about geometry - including the physical geometry of orbits and the geometry of non-spherical satellites. A number of tools have been developed in NASA's Orbital Debris Program Office to handle these calculations; many of which have never been published before. These include algorithms that are used in NASA's Orbital Debris Engineering Model ORDEM 3.0, as well as other tools useful for computing orbital collision rates and ground casualty risks. This paper will present an introduction to these algorithms and the assumptions upon which they are based.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Manteuffel, T.A.

The objective of this project is the development of numerical solution techniques for deterministic models of the transport of neutral and charged particles and the demonstration of their effectiveness in both a production environment and on advanced architecture computers. The primary focus is on various versions of the linear Boltzman equation. These equations are fundamental in many important applications. This project is an attempt to integrate the development of numerical algorithms with the process of developing production software. A major thrust of this reject will be the implementation of these algorithms on advanced architecture machines that reside at the Advancedmore » Computing Laboratory (ACL) at Los Alamos National Laboratories (LANL).« less
Automatic computation for optimum height planning of apartment buildings to improve solar access

DOE Office of Scientific and Technical Information (OSTI.GOV)

Seong, Yoon-Bok; Kim, Yong-Yee; Seok, Ho-Tae

2011-01-15

The objective of this study is to suggest a mathematical model and an optimal algorithm for determining the height of apartment buildings to satisfy the solar rights of survey buildings or survey housing units. The objective is also to develop an automatic computation model for the optimum height of apartment buildings and then to clarify the performance and expected effects. To accomplish the objective of this study, the following procedures were followed: (1) The necessity of the height planning of obstruction buildings to satisfy the solar rights of survey buildings or survey housing units is demonstrated by analyzing through amore » literature review the recent trend of disputes related to solar rights and to examining the social requirements in terms of solar rights. In addition, the necessity of the automatic computation system for height planning of apartment buildings is demonstrated and a suitable analysis method for this system is chosen by investigating the characteristics of analysis methods for solar rights assessment. (2) A case study on the process of height planning of apartment buildings will be briefly described and the problems occurring in this process will then be examined carefully. (3) To develop an automatic computation model for height planning of apartment buildings, geometrical elements forming apartment buildings are defined by analyzing the geometrical characteristics of apartment buildings. In addition, design factors and regulations required in height planning of apartment buildings are investigated. Based on this knowledge, the methodology and mathematical algorithm to adjust the height of apartment buildings by automatic computation are suggested and probable problems and the ways to resolve these problems are discussed. Finally, the methodology and algorithm for the optimization are suggested. (4) Based on the suggested methodology and mathematical algorithm, the automatic computation model for optimum height of apartment buildings is developed and the developed system is verified through the application of some cases. The effects of the suggested model are then demonstrated quantitatively and qualitatively. (author)« less
A special purpose silicon compiler for designing supercomputing VLSI systems

NASA Technical Reports Server (NTRS)

Venkateswaran, N.; Murugavel, P.; Kamakoti, V.; Shankarraman, M. J.; Rangarajan, S.; Mallikarjun, M.; Karthikeyan, B.; Prabhakar, T. S.; Satish, V.; Venkatasubramaniam, P. R.

1991-01-01

Design of general/special purpose supercomputing VLSI systems for numeric algorithm execution involves tackling two important aspects, namely their computational and communication complexities. Development of software tools for designing such systems itself becomes complex. Hence a novel design methodology has to be developed. For designing such complex systems a special purpose silicon compiler is needed in which: the computational and communicational structures of different numeric algorithms should be taken into account to simplify the silicon compiler design, the approach is macrocell based, and the software tools at different levels (algorithm down to the VLSI circuit layout) should get integrated. In this paper a special purpose silicon (SPS) compiler based on PACUBE macrocell VLSI arrays for designing supercomputing VLSI systems is presented. It is shown that turn-around time and silicon real estate get reduced over the silicon compilers based on PLA's, SLA's, and gate arrays. The first two silicon compiler characteristics mentioned above enable the SPS compiler to perform systolic mapping (at the macrocell level) of algorithms whose computational structures are of GIPOP (generalized inner product outer product) form. Direct systolic mapping on PLA's, SLA's, and gate arrays is very difficult as they are micro-cell based. A novel GIPOP processor is under development using this special purpose silicon compiler.
Robust linearized image reconstruction for multifrequency EIT of the breast.

PubMed

Boverman, Gregory; Kao, Tzu-Jen; Kulkarni, Rujuta; Kim, Bong Seok; Isaacson, David; Saulnier, Gary J; Newell, Jonathan C

2008-10-01

Electrical impedance tomography (EIT) is a developing imaging modality that is beginning to show promise for detecting and characterizing tumors in the breast. At Rensselaer Polytechnic Institute, we have developed a combined EIT-tomosynthesis system that allows for the coregistered and simultaneous analysis of the breast using EIT and X-ray imaging. A significant challenge in EIT is the design of computationally efficient image reconstruction algorithms which are robust to various forms of model mismatch. Specifically, we have implemented a scaling procedure that is robust to the presence of a thin highly-resistive layer of skin at the boundary of the breast and we have developed an algorithm to detect and exclude from the image reconstruction electrodes that are in poor contact with the breast. In our initial clinical studies, it has been difficult to ensure that all electrodes make adequate contact with the breast, and thus procedures for the use of data sets containing poorly contacting electrodes are particularly important. We also present a novel, efficient method to compute the Jacobian matrix for our linearized image reconstruction algorithm by reducing the computation of the sensitivity for each voxel to a quadratic form. Initial clinical results are presented, showing the potential of our algorithms to detect and localize breast tumors.
Using parallel computing for the display and simulation of the space debris environment

NASA Astrophysics Data System (ADS)

Möckel, M.; Wiedemann, C.; Flegel, S.; Gelhaus, J.; Vörsmann, P.; Klinkrad, H.; Krag, H.

2011-07-01

Parallelism is becoming the leading paradigm in today's computer architectures. In order to take full advantage of this development, new algorithms have to be specifically designed for parallel execution while many old ones have to be upgraded accordingly. One field in which parallel computing has been firmly established for many years is computer graphics. Calculating and displaying three-dimensional computer generated imagery in real time requires complex numerical operations to be performed at high speed on a large number of objects. Since most of these objects can be processed independently, parallel computing is applicable in this field. Modern graphics processing units (GPUs) have become capable of performing millions of matrix and vector operations per second on multiple objects simultaneously. As a side project, a software tool is currently being developed at the Institute of Aerospace Systems that provides an animated, three-dimensional visualization of both actual and simulated space debris objects. Due to the nature of these objects it is possible to process them individually and independently from each other. Therefore, an analytical orbit propagation algorithm has been implemented to run on a GPU. By taking advantage of all its processing power a huge performance increase, compared to its CPU-based counterpart, could be achieved. For several years efforts have been made to harness this computing power for applications other than computer graphics. Software tools for the simulation of space debris are among those that could profit from embracing parallelism. With recently emerged software development tools such as OpenCL it is possible to transfer the new algorithms used in the visualization outside the field of computer graphics and implement them, for example, into the space debris simulation environment. This way they can make use of parallel hardware such as GPUs and Multi-Core-CPUs for faster computation. In this paper the visualization software will be introduced, including a comparison between the serial and the parallel method of orbit propagation. Ways of how to use the benefits of the latter method for space debris simulation will be discussed. An introduction to OpenCL will be given as well as an exemplary algorithm from the field of space debris simulation.

Using parallel computing for the display and simulation of the space debris environment

NASA Astrophysics Data System (ADS)

Moeckel, Marek; Wiedemann, Carsten; Flegel, Sven Kevin; Gelhaus, Johannes; Klinkrad, Heiner; Krag, Holger; Voersmann, Peter

Parallelism is becoming the leading paradigm in today's computer architectures. In order to take full advantage of this development, new algorithms have to be specifically designed for parallel execution while many old ones have to be upgraded accordingly. One field in which parallel computing has been firmly established for many years is computer graphics. Calculating and displaying three-dimensional computer generated imagery in real time requires complex numerical operations to be performed at high speed on a large number of objects. Since most of these objects can be processed independently, parallel computing is applicable in this field. Modern graphics processing units (GPUs) have become capable of performing millions of matrix and vector operations per second on multiple objects simultaneously. As a side project, a software tool is currently being developed at the Institute of Aerospace Systems that provides an animated, three-dimensional visualization of both actual and simulated space debris objects. Due to the nature of these objects it is possible to process them individually and independently from each other. Therefore, an analytical orbit propagation algorithm has been implemented to run on a GPU. By taking advantage of all its processing power a huge performance increase, compared to its CPU-based counterpart, could be achieved. For several years efforts have been made to harness this computing power for applications other than computer graphics. Software tools for the simulation of space debris are among those that could profit from embracing parallelism. With recently emerged software development tools such as OpenCL it is possible to transfer the new algorithms used in the visualization outside the field of computer graphics and implement them, for example, into the space debris simulation environment. This way they can make use of parallel hardware such as GPUs and Multi-Core-CPUs for faster computation. In this paper the visualization software will be introduced, including a comparison between the serial and the parallel method of orbit propagation. Ways of how to use the benefits of the latter method for space debris simulation will be discussed. An introduction of OpenCL will be given as well as an exemplary algorithm from the field of space debris simulation.
Type Theory, Computation and Interactive Theorem Proving

DTIC Science & Technology

2015-09-01

postdoc Cody Roux, to develop new methods of verifying real-valued inequalities automatically. They developed a prototype implementation in Python [8] (an...he has developed new heuristic, geometric methods of verifying real-valued inequalities. A python -based implementation has performed surprisingly...express complex mathematical and computational assertions. In this project, Avigad and Harper developed type-theoretic algorithms and formalisms that
Computational path planner for product assembly in complex environments

NASA Astrophysics Data System (ADS)

Shang, Wei; Liu, Jianhua; Ning, Ruxin; Liu, Mi

2013-03-01

Assembly path planning is a crucial problem in assembly related design and manufacturing processes. Sampling based motion planning algorithms are used for computational assembly path planning. However, the performance of such algorithms may degrade much in environments with complex product structure, narrow passages or other challenging scenarios. A computational path planner for automatic assembly path planning in complex 3D environments is presented. The global planning process is divided into three phases based on the environment and specific algorithms are proposed and utilized in each phase to solve the challenging issues. A novel ray test based stochastic collision detection method is proposed to evaluate the intersection between two polyhedral objects. This method avoids fake collisions in conventional methods and degrades the geometric constraint when a part has to be removed with surface contact with other parts. A refined history based rapidly-exploring random tree (RRT) algorithm which bias the growth of the tree based on its planning history is proposed and employed in the planning phase where the path is simple but the space is highly constrained. A novel adaptive RRT algorithm is developed for the path planning problem with challenging scenarios and uncertain environment. With extending values assigned on each tree node and extending schemes applied, the tree can adapts its growth to explore complex environments more efficiently. Experiments on the key algorithms are carried out and comparisons are made between the conventional path planning algorithms and the presented ones. The comparing results show that based on the proposed algorithms, the path planner can compute assembly path in challenging complex environments more efficiently and with higher success. This research provides the references to the study of computational assembly path planning under complex environments.
On the Hilbert-Huang Transform Theoretical Foundation

NASA Technical Reports Server (NTRS)

Kizhner, Semion; Blank, Karin; Huang, Norden E.

2004-01-01

The Hilbert-Huang Transform [HHT] is a novel empirical method for spectrum analysis of non-linear and non-stationary signals. The HHT is a recent development and much remains to be done to establish the theoretical foundation of the HHT algorithms. This paper develops the theoretical foundation for the convergence of the HHT sifting algorithm and it proves that the finest spectrum scale will always be the first generated by the HHT Empirical Mode Decomposition (EMD) algorithm. The theoretical foundation for cutting an extrema data points set into two parts is also developed. This then allows parallel signal processing for the HHT computationally complex sifting algorithm and its optimization in hardware.
Distributed Matrix Completion: Applications to Cooperative Positioning in Noisy Environments

DTIC Science & Technology

2013-12-11

positioning, and a gossip version of low-rank approximation were developed. A convex relaxation for positioning in the presence of noise was shown...computing the leading eigenvectors of a large data matrix through gossip algorithms. A new algorithm is proposed that amounts to iteratively multiplying...generalization of gossip algorithms for consensus. The algorithms outperform state-of-the-art methods in a communication-limited scenario. Positioning via
An adaptive replacement algorithm for paged-memory computer systems.

NASA Technical Reports Server (NTRS)

Thorington, J. M., Jr.; Irwin, J. D.

1972-01-01

A general class of adaptive replacement schemes for use in paged memories is developed. One such algorithm, called SIM, is simulated using a probability model that generates memory traces, and the results of the simulation of this adaptive scheme are compared with those obtained using the best nonlookahead algorithms. A technique for implementing this type of adaptive replacement algorithm with state of the art digital hardware is also presented.
Computer-aided diagnosis workstation and network system for chest diagnosis based on multislice CT images

NASA Astrophysics Data System (ADS)

Satoh, Hitoshi; Niki, Noboru; Eguchi, Kenji; Moriyama, Noriyuki; Ohmatsu, Hironobu; Masuda, Hideo; Machida, Suguru

2008-03-01

Mass screening based on multi-helical CT images requires a considerable number of images to be read. It is this time-consuming step that makes the use of helical CT for mass screening impractical at present. To overcome this problem, we have provided diagnostic assistance methods to medical screening specialists by developing a lung cancer screening algorithm that automatically detects suspected lung cancers in helical CT images, a coronary artery calcification screening algorithm that automatically detects suspected coronary artery calcification and a vertebra body analysis algorithm for quantitative evaluation of osteoporosis likelihood by using helical CT scanner for the lung cancer mass screening. The function to observe suspicious shadow in detail are provided in computer-aided diagnosis workstation with these screening algorithms. We also have developed the telemedicine network by using Web medical image conference system with the security improvement of images transmission, Biometric fingerprint authentication system and Biometric face authentication system. Biometric face authentication used on site of telemedicine makes "Encryption of file" and Success in login" effective. As a result, patients' private information is protected. Based on these diagnostic assistance methods, we have developed a new computer-aided workstation and a new telemedicine network that can display suspected lesions three-dimensionally in a short time. The results of this study indicate that our radiological information system without film by using computer-aided diagnosis workstation and our telemedicine network system can increase diagnostic speed, diagnostic accuracy and security improvement of medical information.
Algorithms and software used in selecting structure of machine-training cluster based on neurocomputers

NASA Astrophysics Data System (ADS)

Romanchuk, V. A.; Lukashenko, V. V.

2018-05-01

The technique of functioning of a control system by a computing cluster based on neurocomputers is proposed. Particular attention is paid to the method of choosing the structure of the computing cluster due to the fact that the existing methods are not effective because of a specialized hardware base - neurocomputers, which are highly parallel computer devices with an architecture different from the von Neumann architecture. A developed algorithm for choosing the computational structure of a cloud cluster is described, starting from the direction of data transfer in the flow control graph of the program and its adjacency matrix.
Fast computational scheme of image compression for 32-bit microprocessors

NASA Technical Reports Server (NTRS)

Kasperovich, Leonid

1994-01-01

This paper presents a new computational scheme of image compression based on the discrete cosine transform (DCT), underlying JPEG and MPEG International Standards. The algorithm for the 2-d DCT computation uses integer operations (register shifts and additions / subtractions only); its computational complexity is about 8 additions per image pixel. As a meaningful example of an on-board image compression application we consider the software implementation of the algorithm for the Mars Rover (Marsokhod, in Russian) imaging system being developed as a part of Mars-96 International Space Project. It's shown that fast software solution for 32-bit microprocessors may compete with the DCT-based image compression hardware.
Parameterized Algorithmics for Finding Exact Solutions of NP-Hard Biological Problems.

PubMed

Hüffner, Falk; Komusiewicz, Christian; Niedermeier, Rolf; Wernicke, Sebastian

2017-01-01

Fixed-parameter algorithms are designed to efficiently find optimal solutions to some computationally hard (NP-hard) problems by identifying and exploiting "small" problem-specific parameters. We survey practical techniques to develop such algorithms. Each technique is introduced and supported by case studies of applications to biological problems, with additional pointers to experimental results.
A genetic algorithm for replica server placement

NASA Astrophysics Data System (ADS)

Eslami, Ghazaleh; Toroghi Haghighat, Abolfazl

2012-01-01

Modern distribution systems use replication to improve communication delay experienced by their clients. Some techniques have been developed for web server replica placement. One of the previous studies was Greedy algorithm proposed by Qiu et al, that needs knowledge about network topology. In This paper, first we introduce a genetic algorithm for web server replica placement. Second, we compare our algorithm with Greedy algorithm proposed by Qiu et al, and Optimum algorithm. We found that our approach can achieve better results than Greedy algorithm proposed by Qiu et al but it's computational time is more than Greedy algorithm.
A genetic algorithm for replica server placement

NASA Astrophysics Data System (ADS)

Eslami, Ghazaleh; Toroghi Haghighat, Abolfazl

2011-12-01

Modern distribution systems use replication to improve communication delay experienced by their clients. Some techniques have been developed for web server replica placement. One of the previous studies was Greedy algorithm proposed by Qiu et al, that needs knowledge about network topology. In This paper, first we introduce a genetic algorithm for web server replica placement. Second, we compare our algorithm with Greedy algorithm proposed by Qiu et al, and Optimum algorithm. We found that our approach can achieve better results than Greedy algorithm proposed by Qiu et al but it's computational time is more than Greedy algorithm.
Algorithm development for Maxwell's equations for computational electromagnetism

NASA Technical Reports Server (NTRS)

Goorjian, Peter M.

1990-01-01

A new algorithm has been developed for solving Maxwell's equations for the electromagnetic field. It solves the equations in the time domain with central, finite differences. The time advancement is performed implicitly, using an alternating direction implicit procedure. The space discretization is performed with finite volumes, using curvilinear coordinates with electromagnetic components along those directions. Sample calculations are presented of scattering from a metal pin, a square and a circle to demonstrate the capabilities of the new algorithm.
Rapid prototyping of update algorithm of discrete Fourier transform for real-time signal processing

NASA Astrophysics Data System (ADS)

Kakad, Yogendra P.; Sherlock, Barry G.; Chatapuram, Krishnan V.; Bishop, Stephen

2001-10-01

An algorithm is developed in the companion paper, to update the existing DFT to represent the new data series that results when a new signal point is received. Updating the DFT in this way uses less computation than directly evaluating the DFT using the FFT algorithm, This reduces the computational order by a factor of log2 N. The algorithm is able to work in the presence of data window function, for use with rectangular window, the split triangular, Hanning, Hamming, and Blackman windows. In this paper, a hardware implementation of this algorithm, using FPGA technology, is outlined. Unlike traditional fully customized VLSI circuits, FPGAs represent a technical break through in the corresponding industry. The FPGA implements thousands of gates of logic in a single IC chip and it can be programmed by users at their site in a few seconds or less depending on the type of device used. The risk is low and the development time is short. The advantages have made FPGAs very popular for rapid prototyping of algorithms in the area of digital communication, digital signal processing, and image processing. Our paper addresses the related issues of implementation using hardware descriptive language in the development of the design and the subsequent downloading on the programmable hardware chip.
Academic consortium for the evaluation of computer-aided diagnosis (CADx) in mammography

NASA Astrophysics Data System (ADS)

Mun, Seong K.; Freedman, Matthew T.; Wu, Chris Y.; Lo, Shih-Chung B.; Floyd, Carey E., Jr.; Lo, Joseph Y.; Chan, Heang-Ping; Helvie, Mark A.; Petrick, Nicholas; Sahiner, Berkman; Wei, Datong; Chakraborty, Dev P.; Clarke, Laurence P.; Kallergi, Maria; Clark, Bob; Kim, Yongmin

1995-04-01

Computer aided diagnosis (CADx) is a promising technology for the detection of breast cancer in screening mammography. A number of different approaches have been developed for CADx research that have achieved significant levels of performance. Research teams now recognize the need for a careful and detailed evaluation study of approaches to accelerate the development of CADx, to make CADx more clinically relevant and to optimize the CADx algorithms based on unbiased evaluations. The results of such a comparative study may provide each of the participating teams with new insights into the optimization of their individual CADx algorithms. This consortium of experienced CADx researchers is working as a group to compare results of the algorithms and to optimize the performance of CADx algorithms by learning from each other. Each institution will be contributing an equal number of cases that will be collected under a standard protocol for case selection, truth determination, and data acquisition to establish a common and unbiased database for the evaluation study. An evaluation procedure for the comparison studies are being developed to analyze the results of individual algorithms for each of the test cases in the common database. Optimization of individual CADx algorithms can be made based on the comparison studies. The consortium effort is expected to accelerate the eventual clinical implementation of CADx algorithms at participating institutions.
A discrete Fourier transform for virtual memory machines

NASA Technical Reports Server (NTRS)

Galant, David C.

1992-01-01

An algebraic theory of the Discrete Fourier Transform is developed in great detail. Examination of the details of the theory leads to a computationally efficient fast Fourier transform for the use on computers with virtual memory. Such an algorithm is of great use on modern desktop machines. A FORTRAN coded version of the algorithm is given for the case when the sequence of numbers to be transformed is a power of two.
Resolution Study of a Hyperspectral Sensor using Computed Tomography in the Presence of Noise

DTIC Science & Technology

2012-06-14

diffraction efficiency is dependent on wavelength. Compared to techniques developed by later work, simple algebraic reconstruction techniques were used...spectral di- mension, using computed tomography (CT) techniques with only a finite number of diverse images. CTHIS require a reconstruction algorithm in...many frames are needed to reconstruct the spectral cube of a simple object using a theoretical lower bound. In this research a new algorithm is derived
Development and application of the GIM code for the Cyber 203 computer

NASA Technical Reports Server (NTRS)

Stainaker, J. F.; Robinson, M. A.; Rawlinson, E. G.; Anderson, P. G.; Mayne, A. W.; Spradley, L. W.

1982-01-01

The GIM computer code for fluid dynamics research was developed. Enhancement of the computer code, implicit algorithm development, turbulence model implementation, chemistry model development, interactive input module coding and wing/body flowfield computation are described. The GIM quasi-parabolic code development was completed, and the code used to compute a number of example cases. Turbulence models, algebraic and differential equations, were added to the basic viscous code. An equilibrium reacting chemistry model and implicit finite difference scheme were also added. Development was completed on the interactive module for generating the input data for GIM. Solutions for inviscid hypersonic flow over a wing/body configuration are also presented.
Satellite orbit computation methods

NASA Technical Reports Server (NTRS)

1977-01-01

Mathematical and algorithmical techniques for solution of problems in satellite dynamics were developed, along with solutions to satellite orbit motion. Dynamical analysis of shuttle on-orbit operations were conducted. Computer software routines for use in shuttle mission planning were developed and analyzed, while mathematical models of atmospheric density were formulated.
QPSO-Based Adaptive DNA Computing Algorithm

PubMed Central

Karakose, Mehmet; Cigdem, Ugur

2013-01-01

DNA (deoxyribonucleic acid) computing that is a new computation model based on DNA molecules for information storage has been increasingly used for optimization and data analysis in recent years. However, DNA computing algorithm has some limitations in terms of convergence speed, adaptability, and effectiveness. In this paper, a new approach for improvement of DNA computing is proposed. This new approach aims to perform DNA computing algorithm with adaptive parameters towards the desired goal using quantum-behaved particle swarm optimization (QPSO). Some contributions provided by the proposed QPSO based on adaptive DNA computing algorithm are as follows: (1) parameters of population size, crossover rate, maximum number of operations, enzyme and virus mutation rate, and fitness function of DNA computing algorithm are simultaneously tuned for adaptive process, (2) adaptive algorithm is performed using QPSO algorithm for goal-driven progress, faster operation, and flexibility in data, and (3) numerical realization of DNA computing algorithm with proposed approach is implemented in system identification. Two experiments with different systems were carried out to evaluate the performance of the proposed approach with comparative results. Experimental results obtained with Matlab and FPGA demonstrate ability to provide effective optimization, considerable convergence speed, and high accuracy according to DNA computing algorithm. PMID:23935409

Algorithm To Architecture Mapping Model (ATAMM) multicomputer operating system functional specification

NASA Technical Reports Server (NTRS)

Mielke, R.; Stoughton, J.; Som, S.; Obando, R.; Malekpour, M.; Mandala, B.

1990-01-01

A functional description of the ATAMM Multicomputer Operating System is presented. ATAMM (Algorithm to Architecture Mapping Model) is a marked graph model which describes the implementation of large grained, decomposed algorithms on data flow architectures. AMOS, the ATAMM Multicomputer Operating System, is an operating system which implements the ATAMM rules. A first generation version of AMOS which was developed for the Advanced Development Module (ADM) is described. A second generation version of AMOS being developed for the Generic VHSIC Spaceborne Computer (GVSC) is also presented.
A reductionist approach to the analysis of learning in brain-computer interfaces.

PubMed

Danziger, Zachary

2014-04-01

The complexity and scale of brain-computer interface (BCI) studies limit our ability to investigate how humans learn to use BCI systems. It also limits our capacity to develop adaptive algorithms needed to assist users with their control. Adaptive algorithm development is forced offline and typically uses static data sets. But this is a poor substitute for the online, dynamic environment where algorithms are ultimately deployed and interact with an adapting user. This work evaluates a paradigm that simulates the control problem faced by human subjects when controlling a BCI, but which avoids the many complications associated with full-scale BCI studies. Biological learners can be studied in a reductionist way as they solve BCI-like control problems, and machine learning algorithms can be developed and tested in closed loop with the subjects before being translated to full BCIs. The method is to map 19 joint angles of the hand (representing neural signals) to the position of a 2D cursor which must be piloted to displayed targets (a typical BCI task). An investigation is presented on how closely the joint angle method emulates BCI systems; a novel learning algorithm is evaluated, and a performance difference between genders is discussed.
Accelerating Dust Storm Simulation by Balancing Task Allocation in Parallel Computing Environment

NASA Astrophysics Data System (ADS)

Gui, Z.; Yang, C.; XIA, J.; Huang, Q.; YU, M.

2013-12-01

Dust storm has serious negative impacts on environment, human health, and assets. The continuing global climate change has increased the frequency and intensity of dust storm in the past decades. To better understand and predict the distribution, intensity and structure of dust storm, a series of dust storm models have been developed, such as Dust Regional Atmospheric Model (DREAM), the NMM meteorological module (NMM-dust) and Chinese Unified Atmospheric Chemistry Environment for Dust (CUACE/Dust). The developments and applications of these models have contributed significantly to both scientific research and our daily life. However, dust storm simulation is a data and computing intensive process. Normally, a simulation for a single dust storm event may take several days or hours to run. It seriously impacts the timeliness of prediction and potential applications. To speed up the process, high performance computing is widely adopted. By partitioning a large study area into small subdomains according to their geographic location and executing them on different computing nodes in a parallel fashion, the computing performance can be significantly improved. Since spatiotemporal correlations exist in the geophysical process of dust storm simulation, each subdomain allocated to a node need to communicate with other geographically adjacent subdomains to exchange data. Inappropriate allocations may introduce imbalance task loads and unnecessary communications among computing nodes. Therefore, task allocation method is the key factor, which may impact the feasibility of the paralleling. The allocation algorithm needs to carefully leverage the computing cost and communication cost for each computing node to minimize total execution time and reduce overall communication cost for the entire system. This presentation introduces two algorithms for such allocation and compares them with evenly distributed allocation method. Specifically, 1) In order to get optimized solutions, a quadratic programming based modeling method is proposed. This algorithm performs well with small amount of computing tasks. However, its efficiency decreases significantly as the subdomain number and computing node number increase. 2) To compensate performance decreasing for large scale tasks, a K-Means clustering based algorithm is introduced. Instead of dedicating to get optimized solutions, this method can get relatively good feasible solutions within acceptable time. However, it may introduce imbalance communication for nodes or node-isolated subdomains. This research shows both two algorithms have their own strength and weakness for task allocation. A combination of the two algorithms is under study to obtain a better performance. Keywords: Scheduling; Parallel Computing; Load Balance; Optimization; Cost Model
Simple algorithms for digital pulse-shape discrimination with liquid scintillation detectors

NASA Astrophysics Data System (ADS)

Alharbi, T.

2015-01-01

The development of compact, battery-powered digital liquid scintillation neutron detection systems for field applications requires digital pulse processing (DPP) algorithms with minimum computational overhead. To meet this demand, two DPP algorithms for the discrimination of neutron and γ-rays with liquid scintillation detectors were developed and examined by using a NE213 liquid scintillation detector in a mixed radiation field. The first algorithm is based on the relation between the amplitude of a current pulse at the output of a photomultiplier tube and the amount of charge contained in the pulse. A figure-of-merit (FOM) value of 0.98 with 450 keVee (electron equivalent energy) energy threshold was achieved with this method when pulses were sampled at 250 MSample/s and with 8-bit resolution. Compared to the similar method of charge-comparison this method requires only a single integration window, thereby reducing the amount of computations by approximately 40%. The second approach is a digital version of the trailing-edge constant-fraction discrimination method. A FOM value of 0.84 with an energy threshold of 450 keVee was achieved with this method. In comparison with the similar method of rise-time discrimination this method requires a single time pick-off, thereby reducing the amount of computations by approximately 50%. The algorithms described in this work are useful for developing portable detection systems for applications such as homeland security, radiation dosimetry and environmental monitoring.
The Quantitative Analysis of User Behavior Online - Data, Models and Algorithms

NASA Astrophysics Data System (ADS)

Raghavan, Prabhakar

By blending principles from mechanism design, algorithms, machine learning and massive distributed computing, the search industry has become good at optimizing monetization on sound scientific principles. This represents a successful and growing partnership between computer science and microeconomics. When it comes to understanding how online users respond to the content and experiences presented to them, we have more of a lacuna in the collaboration between computer science and certain social sciences. We will use a concrete technical example from image search results presentation, developing in the process some algorithmic and machine learning problems of interest in their own right. We then use this example to motivate the kinds of studies that need to grow between computer science and the social sciences; a critical element of this is the need to blend large-scale data analysis with smaller-scale eye-tracking and "individualized" lab studies.
West Virginia US Department of Energy experimental program to stimulate competitive research. Section 2: Human resource development; Section 3: Carbon-based structural materials research cluster; Section 3: Data parallel algorithms for scientific computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1994-02-02

This report consists of three separate but related reports. They are (1) Human Resource Development, (2) Carbon-based Structural Materials Research Cluster, and (3) Data Parallel Algorithms for Scientific Computing. To meet the objectives of the Human Resource Development plan, the plan includes K--12 enrichment activities, undergraduate research opportunities for students at the state`s two Historically Black Colleges and Universities, graduate research through cluster assistantships and through a traineeship program targeted specifically to minorities, women and the disabled, and faculty development through participation in research clusters. One research cluster is the chemistry and physics of carbon-based materials. The objective of thismore » cluster is to develop a self-sustaining group of researchers in carbon-based materials research within the institutions of higher education in the state of West Virginia. The projects will involve analysis of cokes, graphites and other carbons in order to understand the properties that provide desirable structural characteristics including resistance to oxidation, levels of anisotropy and structural characteristics of the carbons themselves. In the proposed cluster on parallel algorithms, research by four WVU faculty and three state liberal arts college faculty are: (1) modeling of self-organized critical systems by cellular automata; (2) multiprefix algorithms and fat-free embeddings; (3) offline and online partitioning of data computation; and (4) manipulating and rendering three dimensional objects. This cluster furthers the state Experimental Program to Stimulate Competitive Research plan by building on existing strengths at WVU in parallel algorithms.« less
BIBLIO: A Reprint File Management Algorithm

ERIC Educational Resources Information Center

Zelnio, Robert N.; And Others

1977-01-01

The development of a simple computer algorithm designed for use by the individual educator or researcher in maintaining and searching reprint files is reported. Called BIBLIO, the system is inexpensive and easy to operate and maintain without sacrificing flexibility and utility. (LBH)
Applications of Parallel Computation in Micro-Mechanics and Finite Element Method

NASA Technical Reports Server (NTRS)

Tan, Hui-Qian

1996-01-01

This project discusses the application of parallel computations related with respect to material analyses. Briefly speaking, we analyze some kind of material by elements computations. We call an element a cell here. A cell is divided into a number of subelements called subcells and all subcells in a cell have the identical structure. The detailed structure will be given later in this paper. It is obvious that the problem is "well-structured". SIMD machine would be a better choice. In this paper we try to look into the potentials of SIMD machine in dealing with finite element computation by developing appropriate algorithms on MasPar, a SIMD parallel machine. In section 2, the architecture of MasPar will be discussed. A brief review of the parallel programming language MPL also is given in that section. In section 3, some general parallel algorithms which might be useful to the project will be proposed. And, combining with the algorithms, some features of MPL will be discussed in more detail. In section 4, the computational structure of cell/subcell model will be given. The idea of designing the parallel algorithm for the model will be demonstrated. Finally in section 5, a summary will be given.
A programmable five qubit quantum computer using trapped atomic ions

NASA Astrophysics Data System (ADS)

Debnath, Shantanu

2017-04-01

In order to harness the power of quantum information processing, several candidate systems have been investigated, and tailored to demonstrate only specific computations. In my thesis work, we construct a general-purpose multi-qubit device using a linear chain of trapped ion qubits, which in principle can be programmed to run any quantum algorithm. To achieve such flexibility, we develop a pulse shaping technique to realize a set of fully connected two-qubit rotations that entangle arbitrary pairs of qubits using multiple motional modes of the chain. Following a computation architecture, such highly expressive two-qubit gates along with arbitrary single-qubit rotations can be used to compile modular universal logic gates that are effected by targeted optical fields and hence can be reconfigured according to any algorithm circuit programmed in the software. As a demonstration, we run the Deutsch-Jozsa and Bernstein-Vazirani algorithm, and a fully coherent quantum Fourier transform, that we use to solve the `period finding' and `quantum phase estimation' problem. Combining these results with recent demonstrations of quantum fault-tolerance, Grover's search algorithm, and simulation of boson hopping establishes the versatility of such a computation module that can potentially be connected to other modules for future large-scale computations.
Automated Software Acceleration in Programmable Logic for an Efficient NFFT Algorithm Implementation: A Case Study.

PubMed

Rodríguez, Manuel; Magdaleno, Eduardo; Pérez, Fernando; García, Cristhian

2017-03-28

Non-equispaced Fast Fourier transform (NFFT) is a very important algorithm in several technological and scientific areas such as synthetic aperture radar, computational photography, medical imaging, telecommunications, seismic analysis and so on. However, its computation complexity is high. In this paper, we describe an efficient NFFT implementation with a hardware coprocessor using an All-Programmable System-on-Chip (APSoC). This is a hybrid device that employs an Advanced RISC Machine (ARM) as Processing System with Programmable Logic for high-performance digital signal processing through parallelism and pipeline techniques. The algorithm has been coded in C language with pragma directives to optimize the architecture of the system. We have used the very novel Software Develop System-on-Chip (SDSoC) evelopment tool that simplifies the interface and partitioning between hardware and software. This provides shorter development cycles and iterative improvements by exploring several architectures of the global system. The computational results shows that hardware acceleration significantly outperformed the software based implementation.
Automated Software Acceleration in Programmable Logic for an Efficient NFFT Algorithm Implementation: A Case Study

PubMed Central

Rodríguez, Manuel; Magdaleno, Eduardo; Pérez, Fernando; García, Cristhian

2017-01-01

Non-equispaced Fast Fourier transform (NFFT) is a very important algorithm in several technological and scientific areas such as synthetic aperture radar, computational photography, medical imaging, telecommunications, seismic analysis and so on. However, its computation complexity is high. In this paper, we describe an efficient NFFT implementation with a hardware coprocessor using an All-Programmable System-on-Chip (APSoC). This is a hybrid device that employs an Advanced RISC Machine (ARM) as Processing System with Programmable Logic for high-performance digital signal processing through parallelism and pipeline techniques. The algorithm has been coded in C language with pragma directives to optimize the architecture of the system. We have used the very novel Software Develop System-on-Chip (SDSoC) evelopment tool that simplifies the interface and partitioning between hardware and software. This provides shorter development cycles and iterative improvements by exploring several architectures of the global system. The computational results shows that hardware acceleration significantly outperformed the software based implementation. PMID:28350358
Automating software design system DESTA

NASA Technical Reports Server (NTRS)

Lovitsky, Vladimir A.; Pearce, Patricia D.

1992-01-01

'DESTA' is the acronym for the Dialogue Evolutionary Synthesizer of Turnkey Algorithms by means of a natural language (Russian or English) functional specification of algorithms or software being developed. DESTA represents the computer-aided and/or automatic artificial intelligence 'forgiving' system which provides users with software tools support for algorithm and/or structured program development. The DESTA system is intended to provide support for the higher levels and earlier stages of engineering design of software in contrast to conventional Computer Aided Design (CAD) systems which provide low level tools for use at a stage when the major planning and structuring decisions have already been taken. DESTA is a knowledge-intensive system. The main features of the knowledge are procedures, functions, modules, operating system commands, batch files, their natural language specifications, and their interlinks. The specific domain for the DESTA system is a high level programming language like Turbo Pascal 6.0. The DESTA system is operational and runs on an IBM PC computer.
Numerical computation of solar neutrino flux attenuated by the MSW mechanism

NASA Astrophysics Data System (ADS)

Kim, Jai Sam; Chae, Yoon Sang; Kim, Jung Dae

1999-07-01

We compute the survival probability of an electron neutrino in its flight through the solar core experiencing the Mikheyev-Smirnov-Wolfenstein effect with all three neutrino species considered. We adopted a hybrid method that uses an accurate approximation formula in the non-resonance region and numerical integration in the non-adiabatic resonance region. The key of our algorithm is to use the importance sampling method for sampling the neutrino creation energy and position and to find the optimum radii to start and stop numerical integration. We further developed a parallel algorithm for a message passing parallel computer. By using an idea of job token, we have developed a dynamical load balancing mechanism which is effective under any irregular load distributions
Shared Memory Parallelization of an Implicit ADI-type CFD Code

NASA Technical Reports Server (NTRS)

Hauser, Th.; Huang, P. G.

1999-01-01

A parallelization study designed for ADI-type algorithms is presented using the OpenMP specification for shared-memory multiprocessor programming. Details of optimizations specifically addressed to cache-based computer architectures are described and performance measurements for the single and multiprocessor implementation are summarized. The paper demonstrates that optimization of memory access on a cache-based computer architecture controls the performance of the computational algorithm. A hybrid MPI/OpenMP approach is proposed for clusters of shared memory machines to further enhance the parallel performance. The method is applied to develop a new LES/DNS code, named LESTool. A preliminary DNS calculation of a fully developed channel flow at a Reynolds number of 180, Re(sub tau) = 180, has shown good agreement with existing data.
Three-dimensional monochromatic x-ray computed tomography using synchrotron radiation

NASA Astrophysics Data System (ADS)

Saito, Tsuneo; Kudo, Hiroyuki; Takeda, Tohoru; Itai, Yuji; Tokumori, Kenji; Toyofuku, Fukai; Hyodo, Kazuyuki; Ando, Masami; Nishimura, Katsuyuki; Uyama, Chikao

1998-08-01

We describe a technique of 3D computed tomography (3D CT) using monochromatic x rays generated by synchrotron radiation, which performs a direct reconstruction of a 3D volume image of an object from its cone-beam projections. For the development, we propose a practical scanning orbit of the x-ray source to obtain complete 3D information on an object, and its corresponding 3D image reconstruction algorithm. The validity and usefulness of the proposed scanning orbit and reconstruction algorithm were confirmed by computer simulation studies. Based on these investigations, we have developed a prototype 3D monochromatic x-ray CT using synchrotron radiation, which provides exact 3D reconstruction and material-selective imaging by using the K-edge energy subtraction technique.
Algorithmics - Is There Hope for a Unified Theory?

NASA Astrophysics Data System (ADS)

Hromkovič, Juraj

Computer science was born with the formal definition of the notion of an algorithm. This definition provides clear limits of automatization, separating problems into algorithmically solvable problems and algorithmically unsolvable ones. The second big bang of computer science was the development of the concept of computational complexity. People recognized that problems that do not admit efficient algorithms are not solvable in practice. The search for a reasonable, clear and robust definition of the class of practically solvable algorithmic tasks started with the notion of the class {P} and of {NP}-completeness. In spite of the fact that this robust concept is still fundamental for judging the hardness of computational problems, a variety of approaches was developed for solving instances of {NP}-hard problems in many applications. Our 40-years short attempt to fix the fuzzy border between the practically solvable problems and the practically unsolvable ones partially reminds of the never-ending search for the definition of "life" in biology or for the definitions of matter and energy in physics. Can the search for the formal notion of "practical solvability" also become a never-ending story or is there hope for getting a well-accepted, robust definition of it? Hopefully, it is not surprising that we are not able to answer this question in this invited talk. But to deal with this question is of crucial importance, because only due to enormous effort scientists get a better and better feeling of what the fundamental notions of science like life and energy mean. In the flow of numerous technical results, we must not forget the fact that most of the essential revolutionary contributions to science were done by defining new concepts and notions.
Image processing via VLSI: A concept paper

NASA Technical Reports Server (NTRS)

Nathan, R.

1982-01-01

Implementing specific image processing algorithms via very large scale integrated systems offers a potent solution to the problem of handling high data rates. Two algorithms stand out as being particularly critical -- geometric map transformation and filtering or correlation. These two functions form the basis for data calibration, registration and mosaicking. VLSI presents itself as an inexpensive ancillary function to be added to almost any general purpose computer and if the geometry and filter algorithms are implemented in VLSI, the processing rate bottleneck would be significantly relieved. A set of image processing functions that limit present systems to deal with future throughput needs, translates these functions to algorithms, implements via VLSI technology and interfaces the hardware to a general purpose digital computer is developed.
Towards developing robust algorithms for solving partial differential equations on MIMD machines

NASA Technical Reports Server (NTRS)

Saltz, Joel H.; Naik, Vijay K.

1988-01-01

Methods for efficient computation of numerical algorithms on a wide variety of MIMD machines are proposed. These techniques reorganize the data dependency patterns to improve the processor utilization. The model problem finds the time-accurate solution to a parabolic partial differential equation discretized in space and implicitly marched forward in time. The algorithms are extensions of Jacobi and SOR. The extensions consist of iterating over a window of several timesteps, allowing efficient overlap of computation with communication. The methods increase the degree to which work can be performed while data are communicated between processors. The effect of the window size and of domain partitioning on the system performance is examined both by implementing the algorithm on a simulated multiprocessor system.
Towards developing robust algorithms for solving partial differential equations on MIMD machines

NASA Technical Reports Server (NTRS)

Saltz, J. H.; Naik, V. K.

1985-01-01

Methods for efficient computation of numerical algorithms on a wide variety of MIMD machines are proposed. These techniques reorganize the data dependency patterns to improve the processor utilization. The model problem finds the time-accurate solution to a parabolic partial differential equation discretized in space and implicitly marched forward in time. The algorithms are extensions of Jacobi and SOR. The extensions consist of iterating over a window of several timesteps, allowing efficient overlap of computation with communication. The methods increase the degree to which work can be performed while data are communicated between processors. The effect of the window size and of domain partitioning on the system performance is examined both by implementing the algorithm on a simulated multiprocessor system.
Computing all hybridization networks for multiple binary phylogenetic input trees.

PubMed

Albrecht, Benjamin

2015-07-30

The computation of phylogenetic trees on the same set of species that are based on different orthologous genes can lead to incongruent trees. One possible explanation for this behavior are interspecific hybridization events recombining genes of different species. An important approach to analyze such events is the computation of hybridization networks. This work presents the first algorithm computing the hybridization number as well as a set of representative hybridization networks for multiple binary phylogenetic input trees on the same set of taxa. To improve its practical runtime, we show how this algorithm can be parallelized. Moreover, we demonstrate the efficiency of the software Hybroscale, containing an implementation of our algorithm, by comparing it to PIRNv2.0, which is so far the best available software computing the exact hybridization number for multiple binary phylogenetic trees on the same set of taxa. The algorithm is part of the software Hybroscale, which was developed specifically for the investigation of hybridization networks including their computation and visualization. Hybroscale is freely available(1) and runs on all three major operating systems. Our simulation study indicates that our approach is on average 100 times faster than PIRNv2.0. Moreover, we show how Hybroscale improves the interpretation of the reported hybridization networks by adding certain features to its graphical representation.

Jacobi spectral Galerkin method for elliptic Neumann problems

NASA Astrophysics Data System (ADS)

Doha, E.; Bhrawy, A.; Abd-Elhameed, W.

2009-01-01

This paper is concerned with fast spectral-Galerkin Jacobi algorithms for solving one- and two-dimensional elliptic equations with homogeneous and nonhomogeneous Neumann boundary conditions. The paper extends the algorithms proposed by Shen (SIAM J Sci Comput 15:1489-1505, 1994) and Auteri et al. (J Comput Phys 185:427-444, 2003), based on Legendre polynomials, to Jacobi polynomials with arbitrary α and β. The key to the efficiency of our algorithms is to construct appropriate basis functions with zero slope at the endpoints, which lead to systems with sparse matrices for the discrete variational formulations. The direct solution algorithm developed for the homogeneous Neumann problem in two-dimensions relies upon a tensor product process. Nonhomogeneous Neumann data are accounted for by means of a lifting. Numerical results indicating the high accuracy and effectiveness of these algorithms are presented.
Geometric modeling for computer aided design

NASA Technical Reports Server (NTRS)

Schwing, James L.

1993-01-01

Over the past several years, it has been the primary goal of this grant to design and implement software to be used in the conceptual design of aerospace vehicles. The work carried out under this grant was performed jointly with members of the Vehicle Analysis Branch (VAB) of NASA LaRC, Computer Sciences Corp., and Vigyan Corp. This has resulted in the development of several packages and design studies. Primary among these are the interactive geometric modeling tool, the Solid Modeling Aerospace Research Tool (smart), and the integration and execution tools provided by the Environment for Application Software Integration and Execution (EASIE). In addition, it is the purpose of the personnel of this grant to provide consultation in the areas of structural design, algorithm development, and software development and implementation, particularly in the areas of computer aided design, geometric surface representation, and parallel algorithms.
Supersonic reacting internal flowfields

NASA Astrophysics Data System (ADS)

Drummond, J. P.

The national program to develop a trans-atmospheric vehicle has kindled a renewed interest in the modeling of supersonic reacting flows. A supersonic combustion ramjet, or scramjet, has been proposed to provide the propulsion system for this vehicle. The development of computational techniques for modeling supersonic reacting flowfields, and the application of these techniques to an increasingly difficult set of combustion problems are studied. Since the scramjet problem has been largely responsible for motivating this computational work, a brief history is given of hypersonic vehicles and their propulsion systems. A discussion is also given of some early modeling efforts applied to high speed reacting flows. Current activities to develop accurate and efficient algorithms and improved physical models for modeling supersonic combustion is then discussed. Some new problems where computer codes based on these algorithms and models are being applied are described.
Supersonic reacting internal flow fields

NASA Technical Reports Server (NTRS)

Drummond, J. Philip

1989-01-01

The national program to develop a trans-atmospheric vehicle has kindled a renewed interest in the modeling of supersonic reacting flows. A supersonic combustion ramjet, or scramjet, has been proposed to provide the propulsion system for this vehicle. The development of computational techniques for modeling supersonic reacting flow fields, and the application of these techniques to an increasingly difficult set of combustion problems are studied. Since the scramjet problem has been largely responsible for motivating this computational work, a brief history is given of hypersonic vehicles and their propulsion systems. A discussion is also given of some early modeling efforts applied to high speed reacting flows. Current activities to develop accurate and efficient algorithms and improved physical models for modeling supersonic combustion is then discussed. Some new problems where computer codes based on these algorithms and models are being applied are described.
Implementation of real-time digital signal processing systems

NASA Technical Reports Server (NTRS)

Narasimha, M.; Peterson, A.; Narayan, S.

1978-01-01

Special purpose hardware implementation of DFT Computers and digital filters is considered in the light of newly introduced algorithms and IC devices. Recent work by Winograd on high-speed convolution techniques for computing short length DFT's, has motivated the development of more efficient algorithms, compared to the FFT, for evaluating the transform of longer sequences. Among these, prime factor algorithms appear suitable for special purpose hardware implementations. Architectural considerations in designing DFT computers based on these algorithms are discussed. With the availability of monolithic multiplier-accumulators, a direct implementation of IIR and FIR filters, using random access memories in place of shift registers, appears attractive. The memory addressing scheme involved in such implementations is discussed. A simple counter set-up to address the data memory in the realization of FIR filters is also described. The combination of a set of simple filters (weighting network) and a DFT computer is shown to realize a bank of uniform bandpass filters. The usefulness of this concept in arriving at a modular design for a million channel spectrum analyzer, based on microprocessors, is discussed.
Quantum plug n’ play: modular computation in the quantum regime

NASA Astrophysics Data System (ADS)

Thompson, Jayne; Modi, Kavan; Vedral, Vlatko; Gu, Mile

2018-01-01

Classical computation is modular. It exploits plug n’ play architectures which allow us to use pre-fabricated circuits without knowing their construction. This bestows advantages such as allowing parts of the computational process to be outsourced, and permitting individual circuit components to be exchanged and upgraded. Here, we introduce a formal framework to describe modularity in the quantum regime. We demonstrate a ‘no-go’ theorem, stipulating that it is not always possible to make use of quantum circuits without knowing their construction. This has significant consequences for quantum algorithms, forcing the circuit implementation of certain quantum algorithms to be rebuilt almost entirely from scratch after incremental changes in the problem—such as changing the number being factored in Shor’s algorithm. We develop a workaround capable of restoring modularity, and apply it to design a modular version of Shor’s algorithm that exhibits increased versatility and reduced complexity. In doing so we pave the way to a realistic framework whereby ‘quantum chips’ and remote servers can be invoked (or assembled) to implement various parts of a more complex quantum computation.
A Novel Latin Hypercube Algorithm via Translational Propagation

PubMed Central

Pan, Guang; Ye, Pengcheng

2014-01-01

Metamodels have been widely used in engineering design to facilitate analysis and optimization of complex systems that involve computationally expensive simulation programs. The accuracy of metamodels is directly related to the experimental designs used. Optimal Latin hypercube designs are frequently used and have been shown to have good space-filling and projective properties. However, the high cost in constructing them limits their use. In this paper, a methodology for creating novel Latin hypercube designs via translational propagation and successive local enumeration algorithm (TPSLE) is developed without using formal optimization. TPSLE algorithm is based on the inspiration that a near optimal Latin Hypercube design can be constructed by a simple initial block with a few points generated by algorithm SLE as a building block. In fact, TPSLE algorithm offers a balanced trade-off between the efficiency and sampling performance. The proposed algorithm is compared to two existing algorithms and is found to be much more efficient in terms of the computation time and has acceptable space-filling and projective properties. PMID:25276844
Decision support methods for the detection of adverse events in post-marketing data.

PubMed

Hauben, M; Bate, A

2009-04-01

Spontaneous reporting is a crucial component of post-marketing drug safety surveillance despite its significant limitations. The size and complexity of some spontaneous reporting system databases represent a challenge for drug safety professionals who traditionally have relied heavily on the scientific and clinical acumen of the prepared mind. Computer algorithms that calculate statistical measures of reporting frequency for huge numbers of drug-event combinations are increasingly used to support pharamcovigilance analysts screening large spontaneous reporting system databases. After an overview of pharmacovigilance and spontaneous reporting systems, we discuss the theory and application of contemporary computer algorithms in regular use, those under development, and the practical considerations involved in the implementation of computer algorithms within a comprehensive and holistic drug safety signal detection program.
SU-F-I-45: An Automated Technique to Measure Image Contrast in Clinical CT Images

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sanders, J; Abadi, E; Meng, B

Purpose: To develop and validate an automated technique for measuring image contrast in chest computed tomography (CT) exams. Methods: An automated computer algorithm was developed to measure the distribution of Hounsfield units (HUs) inside four major organs: the lungs, liver, aorta, and bones. These organs were first segmented or identified using computer vision and image processing techniques. Regions of interest (ROIs) were automatically placed inside the lungs, liver, and aorta and histograms of the HUs inside the ROIs were constructed. The mean and standard deviation of each histogram were computed for each CT dataset. Comparison of the mean and standardmore » deviation of the HUs in the different organs provides different contrast values. The ROI for the bones is simply the segmentation mask of the bones. Since the histogram for bones does not follow a Gaussian distribution, the 25th and 75th percentile were computed instead of the mean. The sensitivity and accuracy of the algorithm was investigated by comparing the automated measurements with manual measurements. Fifteen contrast enhanced and fifteen non-contrast enhanced chest CT clinical datasets were examined in the validation procedure. Results: The algorithm successfully measured the histograms of the four organs in both contrast and non-contrast enhanced chest CT exams. The automated measurements were in agreement with manual measurements. The algorithm has sufficient sensitivity as indicated by the near unity slope of the automated versus manual measurement plots. Furthermore, the algorithm has sufficient accuracy as indicated by the high coefficient of determination, R2, values ranging from 0.879 to 0.998. Conclusion: Patient-specific image contrast can be measured from clinical datasets. The algorithm can be run on both contrast enhanced and non-enhanced clinical datasets. The method can be applied to automatically assess the contrast characteristics of clinical chest CT images and quantify dependencies that may not be captured in phantom data.« less
Task scheduling in dataflow computer architectures

NASA Technical Reports Server (NTRS)

Katsinis, Constantine

1994-01-01

Dataflow computers provide a platform for the solution of a large class of computational problems, which includes digital signal processing and image processing. Many typical applications are represented by a set of tasks which can be repetitively executed in parallel as specified by an associated dataflow graph. Research in this area aims to model these architectures, develop scheduling procedures, and predict the transient and steady state performance. Researchers at NASA have created a model and developed associated software tools which are capable of analyzing a dataflow graph and predicting its runtime performance under various resource and timing constraints. These models and tools were extended and used in this work. Experiments using these tools revealed certain properties of such graphs that require further study. Specifically, the transient behavior at the beginning of the execution of a graph can have a significant effect on the steady state performance. Transformation and retiming of the application algorithm and its initial conditions can produce a different transient behavior and consequently different steady state performance. The effect of such transformations on the resource requirements or under resource constraints requires extensive study. Task scheduling to obtain maximum performance (based on user-defined criteria), or to satisfy a set of resource constraints, can also be significantly affected by a transformation of the application algorithm. Since task scheduling is performed by heuristic algorithms, further research is needed to determine if new scheduling heuristics can be developed that can exploit such transformations. This work has provided the initial development for further long-term research efforts. A simulation tool was completed to provide insight into the transient and steady state execution of a dataflow graph. A set of scheduling algorithms was completed which can operate in conjunction with the modeling and performance tools previously developed. Initial studies on the performance of these algorithms were done to examine the effects of application algorithm transformations as measured by such quantities as number of processors, time between outputs, time between input and output, communication time, and memory size.
A structure preserving Lanczos algorithm for computing the optical absorption spectrum

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shao, Meiyue; Jornada, Felipe H. da; Lin, Lin

2016-11-16

We present a new structure preserving Lanczos algorithm for approximating the optical absorption spectrum in the context of solving full Bethe-Salpeter equation without Tamm-Dancoff approximation. The new algorithm is based on a structure preserving Lanczos procedure, which exploits the special block structure of Bethe-Salpeter Hamiltonian matrices. A recently developed technique of generalized averaged Gauss quadrature is incorporated to accelerate the convergence. We also establish the connection between our structure preserving Lanczos procedure with several existing Lanczos procedures developed in different contexts. Numerical examples are presented to demonstrate the effectiveness of our Lanczos algorithm.
Implicit, nonswitching, vector-oriented algorithm for steady transonic flow

NASA Technical Reports Server (NTRS)

Lottati, I.

1983-01-01

A rapid computation of a sequence of transonic flow solutions has to be performed in many areas of aerodynamic technology. The employment of low-cost vector array processors makes the conduction of such calculations economically feasible. However, for a full utilization of the new hardware, the developed algorithms must take advantage of the special characteristics of the vector array processor. The present investigation has the objective to develop an efficient algorithm for solving transonic flow problems governed by mixed partial differential equations on an array processor.
Agent-Based Multicellular Modeling for Predictive Toxicology

EPA Science Inventory

Biological modeling is a rapidly growing field that has benefited significantly from recent technological advances, expanding traditional methods with greater computing power, parameter-determination algorithms, and the development of novel computational approaches to modeling bi...
A multi-emitter fitting algorithm for potential live cell super-resolution imaging over a wide range of molecular densities.

PubMed

Takeshima, T; Takahashi, T; Yamashita, J; Okada, Y; Watanabe, S

2018-05-25

Multi-emitter fitting algorithms have been developed to improve the temporal resolution of single-molecule switching nanoscopy, but the molecular density range they can analyse is narrow and the computation required is intensive, significantly limiting their practical application. Here, we propose a computationally fast method, wedged template matching (WTM), an algorithm that uses a template matching technique to localise molecules at any overlapping molecular density from sparse to ultrahigh density with subdiffraction resolution. WTM achieves the localization of overlapping molecules at densities up to 600 molecules μm -2 with a high detection sensitivity and fast computational speed. WTM also shows localization precision comparable with that of DAOSTORM (an algorithm for high-density super-resolution microscopy), at densities up to 20 molecules μm -2 , and better than DAOSTORM at higher molecular densities. The application of WTM to a high-density biological sample image demonstrated that it resolved protein dynamics from live cell images with subdiffraction resolution and a temporal resolution of several hundred milliseconds or less through a significant reduction in the number of camera images required for a high-density reconstruction. WTM algorithm is a computationally fast, multi-emitter fitting algorithm that can analyse over a wide range of molecular densities. The algorithm is available through the website. https://doi.org/10.17632/bf3z6xpn5j.1. © 2018 The Authors. Journal of Microscopy published by JohnWiley & Sons Ltd on behalf of Royal Microscopical Society.
Algorithmic trends in computational fluid dynamics; The Institute for Computer Applications in Science and Engineering (ICASE)/LaRC Workshop, NASA Langley Research Center, Hampton, VA, US, Sep. 15-17, 1991

NASA Technical Reports Server (NTRS)

Hussaini, M. Y. (Editor); Kumar, A. (Editor); Salas, M. D. (Editor)

1993-01-01

The purpose here is to assess the state of the art in the areas of numerical analysis that are particularly relevant to computational fluid dynamics (CFD), to identify promising new developments in various areas of numerical analysis that will impact CFD, and to establish a long-term perspective focusing on opportunities and needs. Overviews are given of discretization schemes, computational fluid dynamics, algorithmic trends in CFD for aerospace flow field calculations, simulation of compressible viscous flow, and massively parallel computation. Also discussed are accerelation methods, spectral and high-order methods, multi-resolution and subcell resolution schemes, and inherently multidimensional schemes.
A real-time implementation of an advanced sensor failure detection, isolation, and accommodation algorithm

NASA Technical Reports Server (NTRS)

Delaat, J. C.; Merrill, W. C.

1983-01-01

A sensor failure detection, isolation, and accommodation algorithm was developed which incorporates analytic sensor redundancy through software. This algorithm was implemented in a high level language on a microprocessor based controls computer. Parallel processing and state-of-the-art 16-bit microprocessors are used along with efficient programming practices to achieve real-time operation.
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.

Code of Federal Regulations, 2011 CFR

2011-10-01

... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.

Code of Federal Regulations, 2012 CFR

2012-10-01

... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.

Code of Federal Regulations, 2014 CFR

2014-10-01

... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.

Code of Federal Regulations, 2010 CFR

2010-10-01

... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar data produced for...

Improved algorithm for computerized detection and quantification of pulmonary emphysema at high-resolution computed tomography (HRCT)

NASA Astrophysics Data System (ADS)

Tylen, Ulf; Friman, Ola; Borga, Magnus; Angelhed, Jan-Erik

2001-05-01

Emphysema is characterized by destruction of lung tissue with development of small or large holes within the lung. These areas will have Hounsfield values (HU) approaching -1000. It is possible to detect and quantificate such areas using simple density mask technique. The edge enhancement reconstruction algorithm, gravity and motion of the heart and vessels during scanning causes artefacts, however. The purpose of our work was to construct an algorithm that detects such image artefacts and corrects them. The first step is to apply inverse filtering to the image removing much of the effect of the edge enhancement reconstruction algorithm. The next step implies computation of the antero-posterior density gradient caused by gravity and correction for that. Motion artefacts are in a third step corrected for by use of normalized averaging, thresholding and region growing. Twenty healthy volunteers were investigated, 10 with slight emphysema and 10 without. Using simple density mask technique it was not possible to separate persons with disease from those without. Our algorithm improved separation of the two groups considerably. Our algorithm needs further refinement, but may form a basis for further development of methods for computerized diagnosis and quantification of emphysema by HRCT.
RGCA: A Reliable GPU Cluster Architecture for Large-Scale Internet of Things Computing Based on Effective Performance-Energy Optimization

PubMed Central

Chen, Qingkui; Zhao, Deyu; Wang, Jingjuan

2017-01-01

This paper aims to develop a low-cost, high-performance and high-reliability computing system to process large-scale data using common data mining algorithms in the Internet of Things (IoT) computing environment. Considering the characteristics of IoT data processing, similar to mainstream high performance computing, we use a GPU (Graphics Processing Unit) cluster to achieve better IoT services. Firstly, we present an energy consumption calculation method (ECCM) based on WSNs. Then, using the CUDA (Compute Unified Device Architecture) Programming model, we propose a Two-level Parallel Optimization Model (TLPOM) which exploits reasonable resource planning and common compiler optimization techniques to obtain the best blocks and threads configuration considering the resource constraints of each node. The key to this part is dynamic coupling Thread-Level Parallelism (TLP) and Instruction-Level Parallelism (ILP) to improve the performance of the algorithms without additional energy consumption. Finally, combining the ECCM and the TLPOM, we use the Reliable GPU Cluster Architecture (RGCA) to obtain a high-reliability computing system considering the nodes’ diversity, algorithm characteristics, etc. The results show that the performance of the algorithms significantly increased by 34.1%, 33.96% and 24.07% for Fermi, Kepler and Maxwell on average with TLPOM and the RGCA ensures that our IoT computing system provides low-cost and high-reliability services. PMID:28777325
RGCA: A Reliable GPU Cluster Architecture for Large-Scale Internet of Things Computing Based on Effective Performance-Energy Optimization.

PubMed

Fang, Yuling; Chen, Qingkui; Xiong, Neal N; Zhao, Deyu; Wang, Jingjuan

2017-08-04

This paper aims to develop a low-cost, high-performance and high-reliability computing system to process large-scale data using common data mining algorithms in the Internet of Things (IoT) computing environment. Considering the characteristics of IoT data processing, similar to mainstream high performance computing, we use a GPU (Graphics Processing Unit) cluster to achieve better IoT services. Firstly, we present an energy consumption calculation method (ECCM) based on WSNs. Then, using the CUDA (Compute Unified Device Architecture) Programming model, we propose a Two-level Parallel Optimization Model (TLPOM) which exploits reasonable resource planning and common compiler optimization techniques to obtain the best blocks and threads configuration considering the resource constraints of each node. The key to this part is dynamic coupling Thread-Level Parallelism (TLP) and Instruction-Level Parallelism (ILP) to improve the performance of the algorithms without additional energy consumption. Finally, combining the ECCM and the TLPOM, we use the Reliable GPU Cluster Architecture (RGCA) to obtain a high-reliability computing system considering the nodes' diversity, algorithm characteristics, etc. The results show that the performance of the algorithms significantly increased by 34.1%, 33.96% and 24.07% for Fermi, Kepler and Maxwell on average with TLPOM and the RGCA ensures that our IoT computing system provides low-cost and high-reliability services.
An improved shuffled frog leaping algorithm based evolutionary framework for currency exchange rate prediction

NASA Astrophysics Data System (ADS)

Dash, Rajashree

2017-11-01

Forecasting purchasing power of one currency with respect to another currency is always an interesting topic in the field of financial time series prediction. Despite the existence of several traditional and computational models for currency exchange rate forecasting, there is always a need for developing simpler and more efficient model, which will produce better prediction capability. In this paper, an evolutionary framework is proposed by using an improved shuffled frog leaping (ISFL) algorithm with a computationally efficient functional link artificial neural network (CEFLANN) for prediction of currency exchange rate. The model is validated by observing the monthly prediction measures obtained for three currency exchange data sets such as USD/CAD, USD/CHF, and USD/JPY accumulated within same period of time. The model performance is also compared with two other evolutionary learning techniques such as Shuffled frog leaping algorithm and Particle Swarm optimization algorithm. Practical analysis of results suggest that, the proposed model developed using the ISFL algorithm with CEFLANN network is a promising predictor model for currency exchange rate prediction compared to other models included in the study.
Large-scale image region documentation for fully automated image biomarker algorithm development and evaluation.

PubMed

Reeves, Anthony P; Xie, Yiting; Liu, Shuang

2017-04-01

With the advent of fully automated image analysis and modern machine learning methods, there is a need for very large image datasets having documented segmentations for both computer algorithm training and evaluation. This paper presents a method and implementation for facilitating such datasets that addresses the critical issue of size scaling for algorithm validation and evaluation; current evaluation methods that are usually used in academic studies do not scale to large datasets. This method includes protocols for the documentation of many regions in very large image datasets; the documentation may be incrementally updated by new image data and by improved algorithm outcomes. This method has been used for 5 years in the context of chest health biomarkers from low-dose chest CT images that are now being used with increasing frequency in lung cancer screening practice. The lung scans are segmented into over 100 different anatomical regions, and the method has been applied to a dataset of over 20,000 chest CT images. Using this framework, the computer algorithms have been developed to achieve over 90% acceptable image segmentation on the complete dataset.
Superior Generalization Capability of Hardware-Learing Algorithm Developed for Self-Learning Neuron-MOS Neural Networks

NASA Astrophysics Data System (ADS)

Kondo, Shuhei; Shibata, Tadashi; Ohmi, Tadahiro

1995-02-01

We have investigated the learning performance of the hardware backpropagation (HBP) algorithm, a hardware-oriented learning algorithm developed for the self-learning architecture of neural networks constructed using neuron MOS (metal-oxide-semiconductor) transistors. The solution to finding a mirror symmetry axis in a 4×4 binary pixel array was tested by computer simulation based on the HBP algorithm. Despite the inherent restrictions imposed on the hardware-learning algorithm, HBP exhibits equivalent learning performance to that of the original backpropagation (BP) algorithm when all the pertinent parameters are optimized. Very importantly, we have found that HBP has a superior generalization capability over BP; namely, HBP exhibits higher performance in solving problems that the network has not yet learnt.
Parallel solution of sparse one-dimensional dynamic programming problems

NASA Technical Reports Server (NTRS)

Nicol, David M.

1989-01-01

Parallel computation offers the potential for quickly solving large computational problems. However, it is often a non-trivial task to effectively use parallel computers. Solution methods must sometimes be reformulated to exploit parallelism; the reformulations are often more complex than their slower serial counterparts. We illustrate these points by studying the parallelization of sparse one-dimensional dynamic programming problems, those which do not obviously admit substantial parallelization. We propose a new method for parallelizing such problems, develop analytic models which help us to identify problems which parallelize well, and compare the performance of our algorithm with existing algorithms on a multiprocessor.
On the application of a fast polynomial transform and the Chinese remainder theorem to compute a two-dimensional convolution

NASA Technical Reports Server (NTRS)

Truong, T. K.; Lipes, R.; Reed, I. S.; Wu, C.

1980-01-01

A fast algorithm is developed to compute two dimensional convolutions of an array of d sub 1 X d sub 2 complex number points, where d sub 2 = 2(M) and d sub 1 = 2(m-r+) for some 1 or = r or = m. This algorithm requires fewer multiplications and about the same number of additions as the conventional fast fourier transform method for computing the two dimensional convolution. It also has the advantage that the operation of transposing the matrix of data can be avoided.
Improving Quantum Gate Simulation using a GPU

NASA Astrophysics Data System (ADS)

Gutierrez, Eladio; Romero, Sergio; Trenas, Maria A.; Zapata, Emilio L.

2008-11-01

Due to the increasing computing power of the graphics processing units (GPU), they are becoming more and more popular when solving general purpose algorithms. As the simulation of quantum computers results on a problem with exponential complexity, it is advisable to perform a parallel computation, such as the one provided by the SIMD multiprocessors present in recent GPUs. In this paper, we focus on an important quantum algorithm, the quantum Fourier transform (QTF), in order to evaluate different parallelization strategies on a novel GPU architecture. Our implementation makes use of the new CUDA software/hardware architecture developed recently by NVIDIA.
Radio Synthesis Imaging - A High Performance Computing and Communications Project

NASA Astrophysics Data System (ADS)

Crutcher, Richard M.

The National Science Foundation has funded a five-year High Performance Computing and Communications project at the National Center for Supercomputing Applications (NCSA) for the direct implementation of several of the computing recommendations of the Astronomy and Astrophysics Survey Committee (the "Bahcall report"). This paper is a summary of the project goals and a progress report. The project will implement a prototype of the next generation of astronomical telescope systems - remotely located telescopes connected by high-speed networks to very high performance, scalable architecture computers and on-line data archives, which are accessed by astronomers over Gbit/sec networks. Specifically, a data link has been installed between the BIMA millimeter-wave synthesis array at Hat Creek, California and NCSA at Urbana, Illinois for real-time transmission of data to NCSA. Data are automatically archived, and may be browsed and retrieved by astronomers using the NCSA Mosaic software. In addition, an on-line digital library of processed images will be established. BIMA data will be processed on a very high performance distributed computing system, with I/O, user interface, and most of the software system running on the NCSA Convex C3880 supercomputer or Silicon Graphics Onyx workstations connected by HiPPI to the high performance, massively parallel Thinking Machines Corporation CM-5. The very computationally intensive algorithms for calibration and imaging of radio synthesis array observations will be optimized for the CM-5 and new algorithms which utilize the massively parallel architecture will be developed. Code running simultaneously on the distributed computers will communicate using the Data Transport Mechanism developed by NCSA. The project will also use the BLANCA Gbit/s testbed network between Urbana and Madison, Wisconsin to connect an Onyx workstation in the University of Wisconsin Astronomy Department to the NCSA CM-5, for development of long-distance distributed computing. Finally, the project is developing 2D and 3D visualization software as part of the international AIPS++ project. This research and development project is being carried out by a team of experts in radio astronomy, algorithm development for massively parallel architectures, high-speed networking, database management, and Thinking Machines Corporation personnel. The development of this complete software, distributed computing, and data archive and library solution to the radio astronomy computing problem will advance our expertise in high performance computing and communications technology and the application of these techniques to astronomical data processing.
A parallel implementation of the network identification by multiple regression (NIR) algorithm to reverse-engineer regulatory gene networks.

PubMed

Gregoretti, Francesco; Belcastro, Vincenzo; di Bernardo, Diego; Oliva, Gennaro

2010-04-21

The reverse engineering of gene regulatory networks using gene expression profile data has become crucial to gain novel biological knowledge. Large amounts of data that need to be analyzed are currently being produced due to advances in microarray technologies. Using current reverse engineering algorithms to analyze large data sets can be very computational-intensive. These emerging computational requirements can be met using parallel computing techniques. It has been shown that the Network Identification by multiple Regression (NIR) algorithm performs better than the other ready-to-use reverse engineering software. However it cannot be used with large networks with thousands of nodes--as is the case in biological networks--due to the high time and space complexity. In this work we overcome this limitation by designing and developing a parallel version of the NIR algorithm. The new implementation of the algorithm reaches a very good accuracy even for large gene networks, improving our understanding of the gene regulatory networks that is crucial for a wide range of biomedical applications.
Loci-STREAM Version 0.9

NASA Technical Reports Server (NTRS)

Wright, Jeffrey; Thakur, Siddharth

2006-01-01

Loci-STREAM is an evolving computational fluid dynamics (CFD) software tool for simulating possibly chemically reacting, possibly unsteady flows in diverse settings, including rocket engines, turbomachines, oil refineries, etc. Loci-STREAM implements a pressure- based flow-solving algorithm that utilizes unstructured grids. (The benefit of low memory usage by pressure-based algorithms is well recognized by experts in the field.) The algorithm is robust for flows at all speeds from zero to hypersonic. The flexibility of arbitrary polyhedral grids enables accurate, efficient simulation of flows in complex geometries, including those of plume-impingement problems. The present version - Loci-STREAM version 0.9 - includes an interface with the Portable, Extensible Toolkit for Scientific Computation (PETSc) library for access to enhanced linear-equation-solving programs therein that accelerate convergence toward a solution. The name "Loci" reflects the creation of this software within the Loci computational framework, which was developed at Mississippi State University for the primary purpose of simplifying the writing of complex multidisciplinary application programs to run in distributed-memory computing environments including clusters of personal computers. Loci has been designed to relieve application programmers of the details of programming for distributed-memory computers.
Accuracy of patient-specific organ dose estimates obtained using an automated image segmentation algorithm.

PubMed

Schmidt, Taly Gilat; Wang, Adam S; Coradi, Thomas; Haas, Benjamin; Star-Lack, Josh

2016-10-01

The overall goal of this work is to develop a rapid, accurate, and automated software tool to estimate patient-specific organ doses from computed tomography (CT) scans using simulations to generate dose maps combined with automated segmentation algorithms. This work quantified the accuracy of organ dose estimates obtained by an automated segmentation algorithm. We hypothesized that the autosegmentation algorithm is sufficiently accurate to provide organ dose estimates, since small errors delineating organ boundaries will have minimal effect when computing mean organ dose. A leave-one-out validation study of the automated algorithm was performed with 20 head-neck CT scans expertly segmented into nine regions. Mean organ doses of the automatically and expertly segmented regions were computed from Monte Carlo-generated dose maps and compared. The automated segmentation algorithm estimated the mean organ dose to be within 10% of the expert segmentation for regions other than the spinal canal, with the median error for each organ region below 2%. In the spinal canal region, the median error was [Formula: see text], with a maximum absolute error of 28% for the single-atlas approach and 11% for the multiatlas approach. The results demonstrate that the automated segmentation algorithm can provide accurate organ dose estimates despite some segmentation errors.
Accuracy of patient-specific organ dose estimates obtained using an automated image segmentation algorithm

PubMed Central

Schmidt, Taly Gilat; Wang, Adam S.; Coradi, Thomas; Haas, Benjamin; Star-Lack, Josh

2016-01-01

Abstract. The overall goal of this work is to develop a rapid, accurate, and automated software tool to estimate patient-specific organ doses from computed tomography (CT) scans using simulations to generate dose maps combined with automated segmentation algorithms. This work quantified the accuracy of organ dose estimates obtained by an automated segmentation algorithm. We hypothesized that the autosegmentation algorithm is sufficiently accurate to provide organ dose estimates, since small errors delineating organ boundaries will have minimal effect when computing mean organ dose. A leave-one-out validation study of the automated algorithm was performed with 20 head-neck CT scans expertly segmented into nine regions. Mean organ doses of the automatically and expertly segmented regions were computed from Monte Carlo-generated dose maps and compared. The automated segmentation algorithm estimated the mean organ dose to be within 10% of the expert segmentation for regions other than the spinal canal, with the median error for each organ region below 2%. In the spinal canal region, the median error was −7%, with a maximum absolute error of 28% for the single-atlas approach and 11% for the multiatlas approach. The results demonstrate that the automated segmentation algorithm can provide accurate organ dose estimates despite some segmentation errors. PMID:27921070
Discrete-State Simulated Annealing For Traveling-Wave Tube Slow-Wave Circuit Optimization

NASA Technical Reports Server (NTRS)

Wilson, Jeffrey D.; Bulson, Brian A.; Kory, Carol L.; Williams, W. Dan (Technical Monitor)

2001-01-01

Algorithms based on the global optimization technique of simulated annealing (SA) have proven useful in designing traveling-wave tube (TWT) slow-wave circuits for high RF power efficiency. The characteristic of SA that enables it to determine a globally optimized solution is its ability to accept non-improving moves in a controlled manner. In the initial stages of the optimization, the algorithm moves freely through configuration space, accepting most of the proposed designs. This freedom of movement allows non-intuitive designs to be explored rather than restricting the optimization to local improvement upon the initial configuration. As the optimization proceeds, the rate of acceptance of non-improving moves is gradually reduced until the algorithm converges to the optimized solution. The rate at which the freedom of movement is decreased is known as the annealing or cooling schedule of the SA algorithm. The main disadvantage of SA is that there is not a rigorous theoretical foundation for determining the parameters of the cooling schedule. The choice of these parameters is highly problem dependent and the designer needs to experiment in order to determine values that will provide a good optimization in a reasonable amount of computational time. This experimentation can absorb a large amount of time especially when the algorithm is being applied to a new type of design. In order to eliminate this disadvantage, a variation of SA known as discrete-state simulated annealing (DSSA), was recently developed. DSSA provides the theoretical foundation for a generic cooling schedule which is problem independent, Results of similar quality to SA can be obtained, but without the extra computational time required to tune the cooling parameters. Two algorithm variations based on DSSA were developed and programmed into a Microsoft Excel spreadsheet graphical user interface (GUI) to the two-dimensional nonlinear multisignal helix traveling-wave amplifier analysis program TWA3. The algorithms were used to optimize the computed RF efficiency of a TWT by determining the phase velocity profile of the slow-wave circuit. The mathematical theory and computational details of the DSSA algorithms will be presented and results will be compared to those obtained with a SA algorithm.
Accelerated fast iterative shrinkage thresholding algorithms for sparsity-regularized cone-beam CT image reconstruction

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xu, Qiaofeng; Sawatzky, Alex; Anastasio, Mark A., E-mail: anastasio@wustl.edu

Purpose: The development of iterative image reconstruction algorithms for cone-beam computed tomography (CBCT) remains an active and important research area. Even with hardware acceleration, the overwhelming majority of the available 3D iterative algorithms that implement nonsmooth regularizers remain computationally burdensome and have not been translated for routine use in time-sensitive applications such as image-guided radiation therapy (IGRT). In this work, two variants of the fast iterative shrinkage thresholding algorithm (FISTA) are proposed and investigated for accelerated iterative image reconstruction in CBCT. Methods: Algorithm acceleration was achieved by replacing the original gradient-descent step in the FISTAs by a subproblem that ismore » solved by use of the ordered subset simultaneous algebraic reconstruction technique (OS-SART). Due to the preconditioning matrix adopted in the OS-SART method, two new weighted proximal problems were introduced and corresponding fast gradient projection-type algorithms were developed for solving them. We also provided efficient numerical implementations of the proposed algorithms that exploit the massive data parallelism of multiple graphics processing units. Results: The improved rates of convergence of the proposed algorithms were quantified in computer-simulation studies and by use of clinical projection data corresponding to an IGRT study. The accelerated FISTAs were shown to possess dramatically improved convergence properties as compared to the standard FISTAs. For example, the number of iterations to achieve a specified reconstruction error could be reduced by an order of magnitude. Volumetric images reconstructed from clinical data were produced in under 4 min. Conclusions: The FISTA achieves a quadratic convergence rate and can therefore potentially reduce the number of iterations required to produce an image of a specified image quality as compared to first-order methods. We have proposed and investigated accelerated FISTAs for use with two nonsmooth penalty functions that will lead to further reductions in image reconstruction times while preserving image quality. Moreover, with the help of a mixed sparsity-regularization, better preservation of soft-tissue structures can be potentially obtained. The algorithms were systematically evaluated by use of computer-simulated and clinical data sets.« less
Accelerated fast iterative shrinkage thresholding algorithms for sparsity-regularized cone-beam CT image reconstruction.

PubMed

Xu, Qiaofeng; Yang, Deshan; Tan, Jun; Sawatzky, Alex; Anastasio, Mark A

2016-04-01

The development of iterative image reconstruction algorithms for cone-beam computed tomography (CBCT) remains an active and important research area. Even with hardware acceleration, the overwhelming majority of the available 3D iterative algorithms that implement nonsmooth regularizers remain computationally burdensome and have not been translated for routine use in time-sensitive applications such as image-guided radiation therapy (IGRT). In this work, two variants of the fast iterative shrinkage thresholding algorithm (FISTA) are proposed and investigated for accelerated iterative image reconstruction in CBCT. Algorithm acceleration was achieved by replacing the original gradient-descent step in the FISTAs by a subproblem that is solved by use of the ordered subset simultaneous algebraic reconstruction technique (OS-SART). Due to the preconditioning matrix adopted in the OS-SART method, two new weighted proximal problems were introduced and corresponding fast gradient projection-type algorithms were developed for solving them. We also provided efficient numerical implementations of the proposed algorithms that exploit the massive data parallelism of multiple graphics processing units. The improved rates of convergence of the proposed algorithms were quantified in computer-simulation studies and by use of clinical projection data corresponding to an IGRT study. The accelerated FISTAs were shown to possess dramatically improved convergence properties as compared to the standard FISTAs. For example, the number of iterations to achieve a specified reconstruction error could be reduced by an order of magnitude. Volumetric images reconstructed from clinical data were produced in under 4 min. The FISTA achieves a quadratic convergence rate and can therefore potentially reduce the number of iterations required to produce an image of a specified image quality as compared to first-order methods. We have proposed and investigated accelerated FISTAs for use with two nonsmooth penalty functions that will lead to further reductions in image reconstruction times while preserving image quality. Moreover, with the help of a mixed sparsity-regularization, better preservation of soft-tissue structures can be potentially obtained. The algorithms were systematically evaluated by use of computer-simulated and clinical data sets.
Accelerated fast iterative shrinkage thresholding algorithms for sparsity-regularized cone-beam CT image reconstruction

PubMed Central

Xu, Qiaofeng; Yang, Deshan; Tan, Jun; Sawatzky, Alex; Anastasio, Mark A.

2016-01-01

Purpose: The development of iterative image reconstruction algorithms for cone-beam computed tomography (CBCT) remains an active and important research area. Even with hardware acceleration, the overwhelming majority of the available 3D iterative algorithms that implement nonsmooth regularizers remain computationally burdensome and have not been translated for routine use in time-sensitive applications such as image-guided radiation therapy (IGRT). In this work, two variants of the fast iterative shrinkage thresholding algorithm (FISTA) are proposed and investigated for accelerated iterative image reconstruction in CBCT. Methods: Algorithm acceleration was achieved by replacing the original gradient-descent step in the FISTAs by a subproblem that is solved by use of the ordered subset simultaneous algebraic reconstruction technique (OS-SART). Due to the preconditioning matrix adopted in the OS-SART method, two new weighted proximal problems were introduced and corresponding fast gradient projection-type algorithms were developed for solving them. We also provided efficient numerical implementations of the proposed algorithms that exploit the massive data parallelism of multiple graphics processing units. Results: The improved rates of convergence of the proposed algorithms were quantified in computer-simulation studies and by use of clinical projection data corresponding to an IGRT study. The accelerated FISTAs were shown to possess dramatically improved convergence properties as compared to the standard FISTAs. For example, the number of iterations to achieve a specified reconstruction error could be reduced by an order of magnitude. Volumetric images reconstructed from clinical data were produced in under 4 min. Conclusions: The FISTA achieves a quadratic convergence rate and can therefore potentially reduce the number of iterations required to produce an image of a specified image quality as compared to first-order methods. We have proposed and investigated accelerated FISTAs for use with two nonsmooth penalty functions that will lead to further reductions in image reconstruction times while preserving image quality. Moreover, with the help of a mixed sparsity-regularization, better preservation of soft-tissue structures can be potentially obtained. The algorithms were systematically evaluated by use of computer-simulated and clinical data sets. PMID:27036582
A study of hydrogen diffusion flames using PDF turbulence model

NASA Technical Reports Server (NTRS)

Hsu, Andrew T.

1991-01-01

The application of probability density function (pdf) turbulence models is addressed. For the purpose of accurate prediction of turbulent combustion, an algorithm that combines a conventional computational fluid dynamic (CFD) flow solver with the Monte Carlo simulation of the pdf evolution equation was developed. The algorithm was validated using experimental data for a heated turbulent plane jet. The study of H2-F2 diffusion flames was carried out using this algorithm. Numerical results compared favorably with experimental data. The computations show that the flame center shifts as the equivalence ratio changes, and that for the same equivalence ratio, similarity solutions for flames exist.
Computationally efficient method for Fourier transform of highly chirped pulses for laser and parametric amplifier modeling.

PubMed

Andrianov, Alexey; Szabo, Aron; Sergeev, Alexander; Kim, Arkady; Chvykov, Vladimir; Kalashnikov, Mikhail

2016-11-14

We developed an improved approach to calculate the Fourier transform of signals with arbitrary large quadratic phase which can be efficiently implemented in numerical simulations utilizing Fast Fourier transform. The proposed algorithm significantly reduces the computational cost of Fourier transform of a highly chirped and stretched pulse by splitting it into two separate transforms of almost transform limited pulses, thereby reducing the required grid size roughly by a factor of the pulse stretching. The application of our improved Fourier transform algorithm in the split-step method for numerical modeling of CPA and OPCPA shows excellent agreement with standard algorithms.

Digital SAR processing using a fast polynomial transform

NASA Technical Reports Server (NTRS)

Butman, S.; Lipes, R.; Rubin, A.; Truong, T. K.

1981-01-01

A new digital processing algorithm based on the fast polynomial transform is developed for producing images from Synthetic Aperture Radar data. This algorithm enables the computation of the two dimensional cyclic correlation of the raw echo data with the impulse response of a point target, thereby reducing distortions inherent in one dimensional transforms. This SAR processing technique was evaluated on a general-purpose computer and an actual Seasat SAR image was produced. However, regular production runs will require a dedicated facility. It is expected that such a new SAR processing algorithm could provide the basis for a real-time SAR correlator implementation in the Deep Space Network.
Analysis of Orientations of Collagen Fibers by Novel Fiber-Tracking Software

NASA Astrophysics Data System (ADS)

Wu, Jun; Rajwa, Bartlomiej; Filmer, David L.; Hoffmann, Christoph M.; Yuan, Bo; Chiang, Ching-Shoei; Sturgis, Jennie; Robinson, J. Paul

2003-12-01

Recent evidence supports the notion that biological functions of extracellular matrix (ECM) are highly correlated to not only its composition but also its structure. This article integrates confocal microscopy imaging and image-processing techniques to analyze the microstructural properties of ECM. This report describes a two- and three-dimensional fiber middle-line tracing algorithm that may be used to quantify collagen fibril organization. We utilized computer simulation and statistical analysis to validate the developed algorithm. These algorithms were applied to confocal images of collagen gels made with reconstituted bovine collagen type I, to demonstrate the computation of orientations of individual fibers.
Final Technical Report [Scalable methods for electronic excitations and optical responses of nanostructures: mathematics to algorithms to observables

DOE Office of Scientific and Technical Information (OSTI.GOV)

Saad, Yousef

2014-03-19

The master project under which this work is funded had as its main objective to develop computational methods for modeling electronic excited-state and optical properties of various nanostructures. The specific goals of the computer science group were primarily to develop effective numerical algorithms in Density Functional Theory (DFT) and Time Dependent Density Functional Theory (TDDFT). There were essentially four distinct stated objectives. The first objective was to study and develop effective numerical algorithms for solving large eigenvalue problems such as those that arise in Density Functional Theory (DFT) methods. The second objective was to explore so-called linear scaling methods ormore » Methods that avoid diagonalization. The third was to develop effective approaches for Time-Dependent DFT (TDDFT). Our fourth and final objective was to examine effective solution strategies for other problems in electronic excitations, such as the GW/Bethe-Salpeter method, and quantum transport problems.« less
A parallel algorithm for the eigenvalues and eigenvectors for a general complex matrix

NASA Technical Reports Server (NTRS)

Shroff, Gautam

1989-01-01

A new parallel Jacobi-like algorithm is developed for computing the eigenvalues of a general complex matrix. Most parallel methods for this parallel typically display only linear convergence. Sequential norm-reducing algorithms also exit and they display quadratic convergence in most cases. The new algorithm is a parallel form of the norm-reducing algorithm due to Eberlein. It is proven that the asymptotic convergence rate of this algorithm is quadratic. Numerical experiments are presented which demonstrate the quadratic convergence of the algorithm and certain situations where the convergence is slow are also identified. The algorithm promises to be very competitive on a variety of parallel architectures.
A multiarchitecture parallel-processing development environment

NASA Technical Reports Server (NTRS)

Townsend, Scott; Blech, Richard; Cole, Gary

1993-01-01

A description is given of the hardware and software of a multiprocessor test bed - the second generation Hypercluster system. The Hypercluster architecture consists of a standard hypercube distributed-memory topology, with multiprocessor shared-memory nodes. By using standard, off-the-shelf hardware, the system can be upgraded to use rapidly improving computer technology. The Hypercluster's multiarchitecture nature makes it suitable for researching parallel algorithms in computational field simulation applications (e.g., computational fluid dynamics). The dedicated test-bed environment of the Hypercluster and its custom-built software allows experiments with various parallel-processing concepts such as message passing algorithms, debugging tools, and computational 'steering'. Such research would be difficult, if not impossible, to achieve on shared, commercial systems.
The updated algorithm of the Energy Consumption Program (ECP): A computer model simulating heating and cooling energy loads in buildings

NASA Technical Reports Server (NTRS)

Lansing, F. L.; Strain, D. M.; Chai, V. W.; Higgins, S.

1979-01-01

The energy Comsumption Computer Program was developed to simulate building heating and cooling loads and compute thermal and electric energy consumption and cost. This article reports on the new additional algorithms and modifications made in an effort to widen the areas of application. The program structure was rewritten accordingly to refine and advance the building model and to further reduce the processing time and cost. The program is noted for its very low cost and ease of use compared to other available codes. The accuracy of computations is not sacrificed however, since the results are expected to lie within + or - 10% of actual energy meter readings.
Design and Construction of Detector and Data Acquisition Elements for Proton Computed Tomography

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fermi Research Alliance; Northern Illinois University

2015-07-15

Proton computed tomography (pCT) offers an alternative to x-ray imaging with potential for three-dimensional imaging, reduced radiation exposure, and in-situ imaging. Northern Illinois University (NIU) is developing a second-generation proton computed tomography system with a goal of demonstrating the feasibility of three-dimensional imaging within clinically realistic imaging times. The second-generation pCT system is comprised of a tracking system, a calorimeter, data acquisition, a computing farm, and software algorithms. The proton beam encounters the upstream tracking detectors, the patient or phantom, the downstream tracking detectors, and a calorimeter. The schematic layout of the PCT system is shown. The data acquisition sendsmore » the proton scattering information to an offline computing farm. Major innovations of the second generation pCT project involve an increased data acquisition rate ( MHz range) and development of three-dimensional imaging algorithms. The Fermilab Particle Physics Division and Northern Illinois Center for Accelerator and Detector Development at Northern Illinois University worked together to design and construct the tracking detectors, calorimeter, readout electronics and detector mounting system.« less
Using parallel computing methods to improve log surface defect detection methods

Treesearch

R. Edward Thomas; Liya Thomas

2013-01-01

Determining the size and location of surface defects is crucial to evaluating the potential yield and value of hardwood logs. Recently a surface defect detection algorithm was developed using the Java language. This algorithm was developed around an earlier laser scanning system that had poor resolution along the length of the log (15 scan lines per foot). A newer...
A Computer Program for the Calculation of Three-Dimensional Transonic Nacelle/Inlet Flowfields

NASA Technical Reports Server (NTRS)

Vadyak, J.; Atta, E. H.

1983-01-01

A highly efficient computer analysis was developed for predicting transonic nacelle/inlet flowfields. This algorithm can compute the three dimensional transonic flowfield about axisymmetric (or asymmetric) nacelle/inlet configurations at zero or nonzero incidence. The flowfield is determined by solving the full-potential equation in conservative form on a body-fitted curvilinear computational mesh. The difference equations are solved using the AF2 approximate factorization scheme. This report presents a discussion of the computational methods used to both generate the body-fitted curvilinear mesh and to obtain the inviscid flow solution. Computed results and correlations with existing methods and experiment are presented. Also presented are discussions on the organization of the grid generation (NGRIDA) computer program and the flow solution (NACELLE) computer program, descriptions of the respective subroutines, definitions of the required input parameters for both algorithms, a brief discussion on interpretation of the output, and sample cases to illustrate application of the analysis.
A New Algorithm with Plane Waves and Wavelets for Random Velocity Fields with Many Spatial Scales

NASA Astrophysics Data System (ADS)

Elliott, Frank W.; Majda, Andrew J.

1995-03-01

A new Monte Carlo algorithm for constructing and sampling stationary isotropic Gaussian random fields with power-law energy spectrum, infrared divergence, and fractal self-similar scaling is developed here. The theoretical basis for this algorithm involves the fact that such a random field is well approximated by a superposition of random one-dimensional plane waves involving a fixed finite number of directions. In general each one-dimensional plane wave is the sum of a random shear layer and a random acoustical wave. These one-dimensional random plane waves are then simulated by a wavelet Monte Carlo method for a single space variable developed recently by the authors. The computational results reported in this paper demonstrate remarkable low variance and economical representation of such Gaussian random fields through this new algorithm. In particular, the velocity structure function for an imcorepressible isotropic Gaussian random field in two space dimensions with the Kolmogoroff spectrum can be simulated accurately over 12 decades with only 100 realizations of the algorithm with the scaling exponent accurate to 1.1% and the constant prefactor accurate to 6%; in fact, the exponent of the velocity structure function can be computed over 12 decades within 3.3% with only 10 realizations. Furthermore, only 46,592 active computational elements are utilized in each realization to achieve these results for 12 decades of scaling behavior.
Multiscale computations with a wavelet-adaptive algorithm

NASA Astrophysics Data System (ADS)

Rastigejev, Yevgenii Anatolyevich

A wavelet-based adaptive multiresolution algorithm for the numerical solution of multiscale problems governed by partial differential equations is introduced. The main features of the method include fast algorithms for the calculation of wavelet coefficients and approximation of derivatives on nonuniform stencils. The connection between the wavelet order and the size of the stencil is established. The algorithm is based on the mathematically well established wavelet theory. This allows us to provide error estimates of the solution which are used in conjunction with an appropriate threshold criteria to adapt the collocation grid. The efficient data structures for grid representation as well as related computational algorithms to support grid rearrangement procedure are developed. The algorithm is applied to the simulation of phenomena described by Navier-Stokes equations. First, we undertake the study of the ignition and subsequent viscous detonation of a H2 : O2 : Ar mixture in a one-dimensional shock tube. Subsequently, we apply the algorithm to solve the two- and three-dimensional benchmark problem of incompressible flow in a lid-driven cavity at large Reynolds numbers. For these cases we show that solutions of comparable accuracy as the benchmarks are obtained with more than an order of magnitude reduction in degrees of freedom. The simulations show the striking ability of the algorithm to adapt to a solution having different scales at different spatial locations so as to produce accurate results at a relatively low computational cost.
ESTIMATION OF PHYSIOCHEMICAL PROPERTIES OF ORGANIC COMPOUNDS BY SPARC

EPA Science Inventory

The computer program SPARC (SPARC Performs Automated Reasoning in Chemistry) has been under development for several years to estimate physical properties and chemical reactivity parameters of organic compounds strictly from molecular structure. SPARC uses computational algorithms...
Nonlinear Dynamic Model-Based Multiobjective Sensor Network Design Algorithm for a Plant with an Estimator-Based Control System

DOE Office of Scientific and Technical Information (OSTI.GOV)

Paul, Prokash; Bhattacharyya, Debangsu; Turton, Richard

Here, a novel sensor network design (SND) algorithm is developed for maximizing process efficiency while minimizing sensor network cost for a nonlinear dynamic process with an estimator-based control system. The multiobjective optimization problem is solved following a lexicographic approach where the process efficiency is maximized first followed by minimization of the sensor network cost. The partial net present value, which combines the capital cost due to the sensor network and the operating cost due to deviation from the optimal efficiency, is proposed as an alternative objective. The unscented Kalman filter is considered as the nonlinear estimator. The large-scale combinatorial optimizationmore » problem is solved using a genetic algorithm. The developed SND algorithm is applied to an acid gas removal (AGR) unit as part of an integrated gasification combined cycle (IGCC) power plant with CO 2 capture. Due to the computational expense, a reduced order nonlinear model of the AGR process is identified and parallel computation is performed during implementation.« less
Development of an upwind, finite-volume code with finite-rate chemistry

NASA Technical Reports Server (NTRS)

Molvik, Gregory A.

1994-01-01

Under this grant, two numerical algorithms were developed to predict the flow of viscous, hypersonic, chemically reacting gases over three-dimensional bodies. Both algorithms take advantage of the benefits of upwind differencing, total variation diminishing techniques, and a finite-volume framework, but obtain their solution in two separate manners. The first algorithm is a zonal, time-marching scheme, and is generally used to obtain solutions in the subsonic portions of the flow field. The second algorithm is a much less expensive, space-marching scheme and can be used for the computation of the larger, supersonic portion of the flow field. Both codes compute their interface fluxes with a temporal Riemann solver and the resulting schemes are made fully implicit including the chemical source terms and boundary conditions. Strong coupling is used between the fluid dynamic, chemical, and turbulence equations. These codes have been validated on numerous hypersonic test cases and have provided excellent comparison with existing data.
Nonlinear Dynamic Model-Based Multiobjective Sensor Network Design Algorithm for a Plant with an Estimator-Based Control System

DOE PAGES

Paul, Prokash; Bhattacharyya, Debangsu; Turton, Richard; ...

2017-06-06

Here, a novel sensor network design (SND) algorithm is developed for maximizing process efficiency while minimizing sensor network cost for a nonlinear dynamic process with an estimator-based control system. The multiobjective optimization problem is solved following a lexicographic approach where the process efficiency is maximized first followed by minimization of the sensor network cost. The partial net present value, which combines the capital cost due to the sensor network and the operating cost due to deviation from the optimal efficiency, is proposed as an alternative objective. The unscented Kalman filter is considered as the nonlinear estimator. The large-scale combinatorial optimizationmore » problem is solved using a genetic algorithm. The developed SND algorithm is applied to an acid gas removal (AGR) unit as part of an integrated gasification combined cycle (IGCC) power plant with CO 2 capture. Due to the computational expense, a reduced order nonlinear model of the AGR process is identified and parallel computation is performed during implementation.« less
A random walk approach to quantum algorithms.

PubMed

Kendon, Vivien M

2006-12-15

The development of quantum algorithms based on quantum versions of random walks is placed in the context of the emerging field of quantum computing. Constructing a suitable quantum version of a random walk is not trivial; pure quantum dynamics is deterministic, so randomness only enters during the measurement phase, i.e. when converting the quantum information into classical information. The outcome of a quantum random walk is very different from the corresponding classical random walk owing to the interference between the different possible paths. The upshot is that quantum walkers find themselves further from their starting point than a classical walker on average, and this forms the basis of a quantum speed up, which can be exploited to solve problems faster. Surprisingly, the effect of making the walk slightly less than perfectly quantum can optimize the properties of the quantum walk for algorithmic applications. Looking to the future, even with a small quantum computer available, the development of quantum walk algorithms might proceed more rapidly than it has, especially for solving real problems.
Post-processing interstitialcy diffusion from molecular dynamics simulations

NASA Astrophysics Data System (ADS)

Bhardwaj, U.; Bukkuru, S.; Warrier, M.

2016-01-01

An algorithm to rigorously trace the interstitialcy diffusion trajectory in crystals is developed. The algorithm incorporates unsupervised learning and graph optimization which obviate the need to input extra domain specific information depending on crystal or temperature of the simulation. The algorithm is implemented in a flexible framework as a post-processor to molecular dynamics (MD) simulations. We describe in detail the reduction of interstitialcy diffusion into known computational problems of unsupervised clustering and graph optimization. We also discuss the steps, computational efficiency and key components of the algorithm. Using the algorithm, thermal interstitialcy diffusion from low to near-melting point temperatures is studied. We encapsulate the algorithms in a modular framework with functionality to calculate diffusion coefficients, migration energies and other trajectory properties. The study validates the algorithm by establishing the conformity of output parameters with experimental values and provides detailed insights for the interstitialcy diffusion mechanism. The algorithm along with the help of supporting visualizations and analysis gives convincing details and a new approach to quantifying diffusion jumps, jump-lengths, time between jumps and to identify interstitials from lattice atoms.
Exact and Heuristic Algorithms for Runway Scheduling

NASA Technical Reports Server (NTRS)

Malik, Waqar A.; Jung, Yoon C.

2016-01-01

This paper explores the Single Runway Scheduling (SRS) problem with arrivals, departures, and crossing aircraft on the airport surface. Constraints for wake vortex separations, departure area navigation separations and departure time window restrictions are explicitly considered. The main objective of this research is to develop exact and heuristic based algorithms that can be used in real-time decision support tools for Air Traffic Control Tower (ATCT) controllers. The paper provides a multi-objective dynamic programming (DP) based algorithm that finds the exact solution to the SRS problem, but may prove unusable for application in real-time environment due to large computation times for moderate sized problems. We next propose a second algorithm that uses heuristics to restrict the search space for the DP based algorithm. A third algorithm based on a combination of insertion and local search (ILS) heuristics is then presented. Simulation conducted for the east side of Dallas/Fort Worth International Airport allows comparison of the three proposed algorithms and indicates that the ILS algorithm performs favorably in its ability to find efficient solutions and its computation times.
Post-processing interstitialcy diffusion from molecular dynamics simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhardwaj, U., E-mail: haptork@gmail.com; Bukkuru, S.; Warrier, M.

2016-01-15

An algorithm to rigorously trace the interstitialcy diffusion trajectory in crystals is developed. The algorithm incorporates unsupervised learning and graph optimization which obviate the need to input extra domain specific information depending on crystal or temperature of the simulation. The algorithm is implemented in a flexible framework as a post-processor to molecular dynamics (MD) simulations. We describe in detail the reduction of interstitialcy diffusion into known computational problems of unsupervised clustering and graph optimization. We also discuss the steps, computational efficiency and key components of the algorithm. Using the algorithm, thermal interstitialcy diffusion from low to near-melting point temperatures ismore » studied. We encapsulate the algorithms in a modular framework with functionality to calculate diffusion coefficients, migration energies and other trajectory properties. The study validates the algorithm by establishing the conformity of output parameters with experimental values and provides detailed insights for the interstitialcy diffusion mechanism. The algorithm along with the help of supporting visualizations and analysis gives convincing details and a new approach to quantifying diffusion jumps, jump-lengths, time between jumps and to identify interstitials from lattice atoms. -- Graphical abstract:.« less
Multiscale solvers and systematic upscaling in computational physics

NASA Astrophysics Data System (ADS)

Brandt, A.

2005-07-01

Multiscale algorithms can overcome the scale-born bottlenecks that plague most computations in physics. These algorithms employ separate processing at each scale of the physical space, combined with interscale iterative interactions, in ways which use finer scales very sparingly. Having been developed first and well known as multigrid solvers for partial differential equations, highly efficient multiscale techniques have more recently been developed for many other types of computational tasks, including: inverse PDE problems; highly indefinite (e.g., standing wave) equations; Dirac equations in disordered gauge fields; fast computation and updating of large determinants (as needed in QCD); fast integral transforms; integral equations; astrophysics; molecular dynamics of macromolecules and fluids; many-atom electronic structures; global and discrete-state optimization; practical graph problems; image segmentation and recognition; tomography (medical imaging); fast Monte-Carlo sampling in statistical physics; and general, systematic methods of upscaling (accurate numerical derivation of large-scale equations from microscopic laws).

Integrated Multiscale Modeling of Molecular Computing Devices

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gregory Beylkin

2012-03-23

Significant advances were made on all objectives of the research program. We have developed fast multiresolution methods for performing electronic structure calculations with emphasis on constructing efficient representations of functions and operators. We extended our approach to problems of scattering in solids, i.e. constructing fast algorithms for computing above the Fermi energy level. Part of the work was done in collaboration with Robert Harrison and George Fann at ORNL. Specific results (in part supported by this grant) are listed here and are described in greater detail. (1) We have implemented a fast algorithm to apply the Green's function for themore » free space (oscillatory) Helmholtz kernel. The algorithm maintains its speed and accuracy when the kernel is applied to functions with singularities. (2) We have developed a fast algorithm for applying periodic and quasi-periodic, oscillatory Green's functions and those with boundary conditions on simple domains. Importantly, the algorithm maintains its speed and accuracy when applied to functions with singularities. (3) We have developed a fast algorithm for obtaining and applying multiresolution representations of periodic and quasi-periodic Green's functions and Green's functions with boundary conditions on simple domains. (4) We have implemented modifications to improve the speed of adaptive multiresolution algorithms for applying operators which are represented via a Gaussian expansion. (5) We have constructed new nearly optimal quadratures for the sphere that are invariant under the icosahedral rotation group. (6) We obtained new results on approximation of functions by exponential sums and/or rational functions, one of the key methods that allows us to construct separated representations for Green's functions. (7) We developed a new fast and accurate reduction algorithm for obtaining optimal approximation of functions by exponential sums and/or their rational representations.« less
A depth-first search algorithm to compute elementary flux modes by linear programming.

PubMed

Quek, Lake-Ee; Nielsen, Lars K

2014-07-30

The decomposition of complex metabolic networks into elementary flux modes (EFMs) provides a useful framework for exploring reaction interactions systematically. Generating a complete set of EFMs for large-scale models, however, is near impossible. Even for moderately-sized models (<400 reactions), existing approaches based on the Double Description method must iterate through a large number of combinatorial candidates, thus imposing an immense processor and memory demand. Based on an alternative elementarity test, we developed a depth-first search algorithm using linear programming (LP) to enumerate EFMs in an exhaustive fashion. Constraints can be introduced to directly generate a subset of EFMs satisfying the set of constraints. The depth-first search algorithm has a constant memory overhead. Using flux constraints, a large LP problem can be massively divided and parallelized into independent sub-jobs for deployment into computing clusters. Since the sub-jobs do not overlap, the approach scales to utilize all available computing nodes with minimal coordination overhead or memory limitations. The speed of the algorithm was comparable to efmtool, a mainstream Double Description method, when enumerating all EFMs; the attrition power gained from performing flux feasibility tests offsets the increased computational demand of running an LP solver. Unlike the Double Description method, the algorithm enables accelerated enumeration of all EFMs satisfying a set of constraints.
Scalable Parallel Density-based Clustering and Applications

NASA Astrophysics Data System (ADS)

Patwary, Mostofa Ali

2014-04-01

Recently, density-based clustering algorithms (DBSCAN and OPTICS) have gotten significant attention of the scientific community due to their unique capability of discovering arbitrary shaped clusters and eliminating noise data. These algorithms have several applications, which require high performance computing, including finding halos and subhalos (clusters) from massive cosmology data in astrophysics, analyzing satellite images, X-ray crystallography, and anomaly detection. However, parallelization of these algorithms are extremely challenging as they exhibit inherent sequential data access order, unbalanced workload resulting in low parallel efficiency. To break the data access sequentiality and to achieve high parallelism, we develop new parallel algorithms, both for DBSCAN and OPTICS, designed using graph algorithmic techniques. For example, our parallel DBSCAN algorithm exploits the similarities between DBSCAN and computing connected components. Using datasets containing up to a billion floating point numbers, we show that our parallel density-based clustering algorithms significantly outperform the existing algorithms, achieving speedups up to 27.5 on 40 cores on shared memory architecture and speedups up to 5,765 using 8,192 cores on distributed memory architecture. In our experiments, we found that while achieving the scalability, our algorithms produce clustering results with comparable quality to the classical algorithms.
A finite element solver for 3-D compressible viscous flows

NASA Technical Reports Server (NTRS)

Reddy, K. C.; Reddy, J. N.; Nayani, S.

1990-01-01

Computation of the flow field inside a space shuttle main engine (SSME) requires the application of state of the art computational fluid dynamic (CFD) technology. Several computer codes are under development to solve 3-D flow through the hot gas manifold. Some algorithms were designed to solve the unsteady compressible Navier-Stokes equations, either by implicit or explicit factorization methods, using several hundred or thousands of time steps to reach a steady state solution. A new iterative algorithm is being developed for the solution of the implicit finite element equations without assembling global matrices. It is an efficient iteration scheme based on a modified nonlinear Gauss-Seidel iteration with symmetric sweeps. The algorithm is analyzed for a model equation and is shown to be unconditionally stable. Results from a series of test problems are presented. The finite element code was tested for couette flow, which is flow under a pressure gradient between two parallel plates in relative motion. Another problem that was solved is viscous laminar flow over a flat plate. The general 3-D finite element code was used to compute the flow in an axisymmetric turnaround duct at low Mach numbers.
Regularization and computational methods for precise solution of perturbed orbit transfer problems

NASA Astrophysics Data System (ADS)

Woollands, Robyn Michele

The author has developed a suite of algorithms for solving the perturbed Lambert's problem in celestial mechanics. These algorithms have been implemented as a parallel computation tool that has broad applicability. This tool is composed of four component algorithms and each provides unique benefits for solving a particular type of orbit transfer problem. The first one utilizes a Keplerian solver (a-iteration) for solving the unperturbed Lambert's problem. This algorithm not only provides a "warm start" for solving the perturbed problem but is also used to identify which of several perturbed solvers is best suited for the job. The second algorithm solves the perturbed Lambert's problem using a variant of the modified Chebyshev-Picard iteration initial value solver that solves two-point boundary value problems. This method converges over about one third of an orbit and does not require a Newton-type shooting method and thus no state transition matrix needs to be computed. The third algorithm makes use of regularization of the differential equations through the Kustaanheimo-Stiefel transformation and extends the domain of convergence over which the modified Chebyshev-Picard iteration two-point boundary value solver will converge, from about one third of an orbit to almost a full orbit. This algorithm also does not require a Newton-type shooting method. The fourth algorithm uses the method of particular solutions and the modified Chebyshev-Picard iteration initial value solver to solve the perturbed two-impulse Lambert problem over multiple revolutions. The method of particular solutions is a shooting method but differs from the Newton-type shooting methods in that it does not require integration of the state transition matrix. The mathematical developments that underlie these four algorithms are derived in the chapters of this dissertation. For each of the algorithms, some orbit transfer test cases are included to provide insight on accuracy and efficiency of these individual algorithms. Following this discussion, the combined parallel algorithm, known as the unified Lambert tool, is presented and an explanation is given as to how it automatically selects which of the three perturbed solvers to compute the perturbed solution for a particular orbit transfer. The unified Lambert tool may be used to determine a single orbit transfer or for generating of an extremal field map. A case study is presented for a mission that is required to rendezvous with two pieces of orbit debris (spent rocket boosters). The unified Lambert tool software developed in this dissertation is already being utilized by several industrial partners and we are confident that it will play a significant role in practical applications, including solution of Lambert problems that arise in the current applications focused on enhanced space situational awareness.
Edge Pushing is Equivalent to Vertex Elimination for Computing Hessians

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Mu; Pothen, Alex; Hovland, Paul

We prove the equivalence of two different Hessian evaluation algorithms in AD. The first is the Edge Pushing algorithm of Gower and Mello, which may be viewed as a second order Reverse mode algorithm for computing the Hessian. In earlier work, we have derived the Edge Pushing algorithm by exploiting a Reverse mode invariant based on the concept of live variables in compiler theory. The second algorithm is based on eliminating vertices in a computational graph of the gradient, in which intermediate variables are successively eliminated from the graph, and the weights of the edges are updated suitably. We provemore » that if the vertices are eliminated in a reverse topological order while preserving symmetry in the computational graph of the gradient, then the Vertex Elimination algorithm and the Edge Pushing algorithm perform identical computations. In this sense, the two algorithms are equivalent. This insight that unifies two seemingly disparate approaches to Hessian computations could lead to improved algorithms and implementations for computing Hessians. Read More: http://epubs.siam.org/doi/10.1137/1.9781611974690.ch11« less
Development of X-TOOLSS: Preliminary Design of Space Systems Using Evolutionary Computation

NASA Technical Reports Server (NTRS)

Schnell, Andrew R.; Hull, Patrick V.; Turner, Mike L.; Dozier, Gerry; Alverson, Lauren; Garrett, Aaron; Reneau, Jarred

2008-01-01

Evolutionary computational (EC) techniques such as genetic algorithms (GA) have been identified as promising methods to explore the design space of mechanical and electrical systems at the earliest stages of design. In this paper the authors summarize their research in the use of evolutionary computation to develop preliminary designs for various space systems. An evolutionary computational solver developed over the course of the research, X-TOOLSS (Exploration Toolset for the Optimization of Launch and Space Systems) is discussed. With the success of early, low-fidelity example problems, an outline of work involving more computationally complex models is discussed.
Alignment-free detection of horizontal gene transfer between closely related bacterial genomes.

PubMed

Domazet-Lošo, Mirjana; Haubold, Bernhard

2011-09-01

Bacterial epidemics are often caused by strains that have acquired their increased virulence through horizontal gene transfer. Due to this association with disease, the detection of horizontal gene transfer continues to receive attention from microbiologists and bioinformaticians alike. Most software for detecting transfer events is based on alignments of sets of genes or of entire genomes. But despite great advances in the design of algorithms and computer programs, genome alignment remains computationally challenging. We have therefore developed an alignment-free algorithm for rapidly detecting horizontal gene transfer between closely related bacterial genomes. Our implementation of this algorithm is called alfy for "ALignment Free local homologY" and is freely available from http://guanine.evolbio.mpg.de/alfy/. In this comment we demonstrate the application of alfy to the genomes of Staphylococcus aureus. We also argue that-contrary to popular belief and in spite of increasing computer speed-algorithmic optimization is becoming more, not less, important if genome data continues to accumulate at the present rate.
Evaluation of stochastic algorithms for financial mathematics problems from point of view of energy-efficiency

DOE Office of Scientific and Technical Information (OSTI.GOV)

Atanassov, E.; Dimitrov, D., E-mail: d.slavov@bas.bg, E-mail: emanouil@parallel.bas.bg, E-mail: gurov@bas.bg; Gurov, T.

2015-10-28

The recent developments in the area of high-performance computing are driven not only by the desire for ever higher performance but also by the rising costs of electricity. The use of various types of accelerators like GPUs, Intel Xeon Phi has become mainstream and many algorithms and applications have been ported to make use of them where available. In Financial Mathematics the question of optimal use of computational resources should also take into account the limitations on space, because in many use cases the servers are deployed close to the exchanges. In this work we evaluate various algorithms for optionmore » pricing that we have implemented for different target architectures in terms of their energy and space efficiency. Since it has been established that low-discrepancy sequences may be better than pseudorandom numbers for these types of algorithms, we also test the Sobol and Halton sequences. We present the raw results, the computed metrics and conclusions from our tests.« less
Vibration extraction based on fast NCC algorithm and high-speed camera.

PubMed

Lei, Xiujun; Jin, Yi; Guo, Jie; Zhu, Chang'an

2015-09-20

In this study, a high-speed camera system is developed to complete the vibration measurement in real time and to overcome the mass introduced by conventional contact measurements. The proposed system consists of a notebook computer and a high-speed camera which can capture the images as many as 1000 frames per second. In order to process the captured images in the computer, the normalized cross-correlation (NCC) template tracking algorithm with subpixel accuracy is introduced. Additionally, a modified local search algorithm based on the NCC is proposed to reduce the computation time and to increase efficiency significantly. The modified algorithm can rapidly accomplish one displacement extraction 10 times faster than the traditional template matching without installing any target panel onto the structures. Two experiments were carried out under laboratory and outdoor conditions to validate the accuracy and efficiency of the system performance in practice. The results demonstrated the high accuracy and efficiency of the camera system in extracting vibrating signals.
A Linked List-Based Algorithm for Blob Detection on Embedded Vision-Based Sensors.

PubMed

Acevedo-Avila, Ricardo; Gonzalez-Mendoza, Miguel; Garcia-Garcia, Andres

2016-05-28

Blob detection is a common task in vision-based applications. Most existing algorithms are aimed at execution on general purpose computers; while very few can be adapted to the computing restrictions present in embedded platforms. This paper focuses on the design of an algorithm capable of real-time blob detection that minimizes system memory consumption. The proposed algorithm detects objects in one image scan; it is based on a linked-list data structure tree used to label blobs depending on their shape and node information. An example application showing the results of a blob detection co-processor has been built on a low-powered field programmable gate array hardware as a step towards developing a smart video surveillance system. The detection method is intended for general purpose application. As such, several test cases focused on character recognition are also examined. The results obtained present a fair trade-off between accuracy and memory requirements; and prove the validity of the proposed approach for real-time implementation on resource-constrained computing platforms.
Privacy Preserving Nearest Neighbor Search

NASA Astrophysics Data System (ADS)

Shaneck, Mark; Kim, Yongdae; Kumar, Vipin

Data mining is frequently obstructed by privacy concerns. In many cases data is distributed, and bringing the data together in one place for analysis is not possible due to privacy laws (e.g. HIPAA) or policies. Privacy preserving data mining techniques have been developed to address this issue by providing mechanisms to mine the data while giving certain privacy guarantees. In this chapter we address the issue of privacy preserving nearest neighbor search, which forms the kernel of many data mining applications. To this end, we present a novel algorithm based on secure multiparty computation primitives to compute the nearest neighbors of records in horizontally distributed data. We show how this algorithm can be used in three important data mining algorithms, namely LOF outlier detection, SNN clustering, and kNN classification. We prove the security of these algorithms under the semi-honest adversarial model, and describe methods that can be used to optimize their performance. Keywords: Privacy Preserving Data Mining, Nearest Neighbor Search, Outlier Detection, Clustering, Classification, Secure Multiparty Computation
Evaluation of stochastic algorithms for financial mathematics problems from point of view of energy-efficiency

NASA Astrophysics Data System (ADS)

Atanassov, E.; Dimitrov, D.; Gurov, T.

2015-10-01

The recent developments in the area of high-performance computing are driven not only by the desire for ever higher performance but also by the rising costs of electricity. The use of various types of accelerators like GPUs, Intel Xeon Phi has become mainstream and many algorithms and applications have been ported to make use of them where available. In Financial Mathematics the question of optimal use of computational resources should also take into account the limitations on space, because in many use cases the servers are deployed close to the exchanges. In this work we evaluate various algorithms for option pricing that we have implemented for different target architectures in terms of their energy and space efficiency. Since it has been established that low-discrepancy sequences may be better than pseudorandom numbers for these types of algorithms, we also test the Sobol and Halton sequences. We present the raw results, the computed metrics and conclusions from our tests.
Separation of left and right lungs using 3-dimensional information of sequential computed tomography images and a guided dynamic programming algorithm.

PubMed

Park, Sang Cheol; Leader, Joseph Ken; Tan, Jun; Lee, Guee Sang; Kim, Soo Hyung; Na, In Seop; Zheng, Bin

2011-01-01

This article presents a new computerized scheme that aims to accurately and robustly separate left and right lungs on computed tomography (CT) examinations. We developed and tested a method to separate the left and right lungs using sequential CT information and a guided dynamic programming algorithm using adaptively and automatically selected start point and end point with especially severe and multiple connections. The scheme successfully identified and separated all 827 connections on the total 4034 CT images in an independent testing data set of CT examinations. The proposed scheme separated multiple connections regardless of their locations, and the guided dynamic programming algorithm reduced the computation time to approximately 4.6% in comparison with the traditional dynamic programming and avoided the permeation of the separation boundary into normal lung tissue. The proposed method is able to robustly and accurately disconnect all connections between left and right lungs, and the guided dynamic programming algorithm is able to remove redundant processing.
Rapid algorithm prototyping and implementation for power quality measurement

NASA Astrophysics Data System (ADS)

Kołek, Krzysztof; Piątek, Krzysztof

2015-12-01

This article presents a Model-Based Design (MBD) approach to rapidly implement power quality (PQ) metering algorithms. Power supply quality is a very important aspect of modern power systems and will become even more important in future smart grids. In this case, maintaining the PQ parameters at the desired level will require efficient implementation methods of the metering algorithms. Currently, the development of new, advanced PQ metering algorithms requires new hardware with adequate computational capability and time intensive, cost-ineffective manual implementations. An alternative, considered here, is an MBD approach. The MBD approach focuses on the modelling and validation of the model by simulation, which is well-supported by a Computer-Aided Engineering (CAE) packages. This paper presents two algorithms utilized in modern PQ meters: a phase-locked loop based on an Enhanced Phase Locked Loop (EPLL), and the flicker measurement according to the IEC 61000-4-15 standard. The algorithms were chosen because of their complexity and non-trivial development. They were first modelled in the MATLAB/Simulink package, then tested and validated in a simulation environment. The models, in the form of Simulink diagrams, were next used to automatically generate C code. The code was compiled and executed in real-time on the Zynq Xilinx platform that combines a reconfigurable Field Programmable Gate Array (FPGA) with a dual-core processor. The MBD development of PQ algorithms, automatic code generation, and compilation form a rapid algorithm prototyping and implementation path for PQ measurements. The main advantage of this approach is the ability to focus on the design, validation, and testing stages while skipping over implementation issues. The code generation process renders production-ready code that can be easily used on the target hardware. This is especially important when standards for PQ measurement are in constant development, and the PQ issues in emerging smart grids will require tools for rapid development and implementation of such algorithms.
High-resolution computational algorithms for simulating offshore wind turbines and farms: Model development and validation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Calderer, Antoni; Yang, Xiaolei; Angelidis, Dionysios

2015-10-30

The present project involves the development of modeling and analysis design tools for assessing offshore wind turbine technologies. The computational tools developed herein are able to resolve the effects of the coupled interaction of atmospheric turbulence and ocean waves on aerodynamic performance and structural stability and reliability of offshore wind turbines and farms. Laboratory scale experiments have been carried out to derive data sets for validating the computational models.
Algorithmic Mechanism Design of Evolutionary Computation.

PubMed

Pei, Yan

2015-01-01

We consider algorithmic design, enhancement, and improvement of evolutionary computation as a mechanism design problem. All individuals or several groups of individuals can be considered as self-interested agents. The individuals in evolutionary computation can manipulate parameter settings and operations by satisfying their own preferences, which are defined by an evolutionary computation algorithm designer, rather than by following a fixed algorithm rule. Evolutionary computation algorithm designers or self-adaptive methods should construct proper rules and mechanisms for all agents (individuals) to conduct their evolution behaviour correctly in order to definitely achieve the desired and preset objective(s). As a case study, we propose a formal framework on parameter setting, strategy selection, and algorithmic design of evolutionary computation by considering the Nash strategy equilibrium of a mechanism design in the search process. The evaluation results present the efficiency of the framework. This primary principle can be implemented in any evolutionary computation algorithm that needs to consider strategy selection issues in its optimization process. The final objective of our work is to solve evolutionary computation design as an algorithmic mechanism design problem and establish its fundamental aspect by taking this perspective. This paper is the first step towards achieving this objective by implementing a strategy equilibrium solution (such as Nash equilibrium) in evolutionary computation algorithm.
Algorithmic Mechanism Design of Evolutionary Computation

PubMed Central

2015-01-01

We consider algorithmic design, enhancement, and improvement of evolutionary computation as a mechanism design problem. All individuals or several groups of individuals can be considered as self-interested agents. The individuals in evolutionary computation can manipulate parameter settings and operations by satisfying their own preferences, which are defined by an evolutionary computation algorithm designer, rather than by following a fixed algorithm rule. Evolutionary computation algorithm designers or self-adaptive methods should construct proper rules and mechanisms for all agents (individuals) to conduct their evolution behaviour correctly in order to definitely achieve the desired and preset objective(s). As a case study, we propose a formal framework on parameter setting, strategy selection, and algorithmic design of evolutionary computation by considering the Nash strategy equilibrium of a mechanism design in the search process. The evaluation results present the efficiency of the framework. This primary principle can be implemented in any evolutionary computation algorithm that needs to consider strategy selection issues in its optimization process. The final objective of our work is to solve evolutionary computation design as an algorithmic mechanism design problem and establish its fundamental aspect by taking this perspective. This paper is the first step towards achieving this objective by implementing a strategy equilibrium solution (such as Nash equilibrium) in evolutionary computation algorithm. PMID:26257777
An Automatic Registration Algorithm for 3D Maxillofacial Model

NASA Astrophysics Data System (ADS)

Qiu, Luwen; Zhou, Zhongwei; Guo, Jixiang; Lv, Jiancheng

2016-09-01

3D image registration aims at aligning two 3D data sets in a common coordinate system, which has been widely used in computer vision, pattern recognition and computer assisted surgery. One challenging problem in 3D registration is that point-wise correspondences between two point sets are often unknown apriori. In this work, we develop an automatic algorithm for 3D maxillofacial models registration including facial surface model and skull model. Our proposed registration algorithm can achieve a good alignment result between partial and whole maxillofacial model in spite of ambiguous matching, which has a potential application in the oral and maxillofacial reparative and reconstructive surgery. The proposed algorithm includes three steps: (1) 3D-SIFT features extraction and FPFH descriptors construction; (2) feature matching using SAC-IA; (3) coarse rigid alignment and refinement by ICP. Experiments on facial surfaces and mandible skull models demonstrate the efficiency and robustness of our algorithm.
Crashworthiness simulations with DYNA3D

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schauer, D.A.; Hoover, C.G.; Kay, G.J.

1996-04-01

Current progress in parallel algorithm research and applications in vehicle crash simulation is described for the explicit, finite element algorithms in DYNA3D. Problem partitioning methods and parallel algorithms for contact at material interfaces are the two challenging algorithm research problems that are addressed. Two prototype parallel contact algorithms have been developed for treating the cases of local and arbitrary contact. Demonstration problems for local contact are crashworthiness simulations with 222 locally defined contact surfaces and a vehicle/barrier collision modeled with arbitrary contact. A simulation of crash tests conducted for a vehicle impacting a U-channel small sign post embedded in soilmore » has been run on both the serial and parallel versions of DYNA3D. A significant reduction in computational time has been observed when running these problems on the parallel version. However, to achieve maximum efficiency, complex problems must be appropriately partitioned, especially when contact dominates the computation.« less

SHARPEN-systematic hierarchical algorithms for rotamers and proteins on an extended network.

PubMed

Loksha, Ilya V; Maiolo, James R; Hong, Cheng W; Ng, Albert; Snow, Christopher D

2009-04-30

Algorithms for discrete optimization of proteins play a central role in recent advances in protein structure prediction and design. We wish to improve the resources available for computational biologists to rapidly prototype such algorithms and to easily scale these algorithms to many processors. To that end, we describe the implementation and use of two new open source resources, citing potential benefits over existing software. We discuss CHOMP, a new object-oriented library for macromolecular optimization, and SHARPEN, a framework for scaling CHOMP scripts to many computers. These tools allow users to develop new algorithms for a variety of applications including protein repacking, protein-protein docking, loop rebuilding, or homology model remediation. Particular care was taken to allow modular energy function design; protein conformations may currently be scored using either the OPLSaa molecular mechanical energy function or an all-atom semiempirical energy function employed by Rosetta. (c) 2009 Wiley Periodicals, Inc.
The Ordered Clustered Travelling Salesman Problem: A Hybrid Genetic Algorithm

PubMed Central

Ahmed, Zakir Hussain

2014-01-01

The ordered clustered travelling salesman problem is a variation of the usual travelling salesman problem in which a set of vertices (except the starting vertex) of the network is divided into some prespecified clusters. The objective is to find the least cost Hamiltonian tour in which vertices of any cluster are visited contiguously and the clusters are visited in the prespecified order. The problem is NP-hard, and it arises in practical transportation and sequencing problems. This paper develops a hybrid genetic algorithm using sequential constructive crossover, 2-opt search, and a local search for obtaining heuristic solution to the problem. The efficiency of the algorithm has been examined against two existing algorithms for some asymmetric and symmetric TSPLIB instances of various sizes. The computational results show that the proposed algorithm is very effective in terms of solution quality and computational time. Finally, we present solution to some more symmetric TSPLIB instances. PMID:24701148
A review on quantum search algorithms

NASA Astrophysics Data System (ADS)

Giri, Pulak Ranjan; Korepin, Vladimir E.

2017-12-01

The use of superposition of states in quantum computation, known as quantum parallelism, has significant advantage in terms of speed over the classical computation. It is evident from the early invented quantum algorithms such as Deutsch's algorithm, Deutsch-Jozsa algorithm and its variation as Bernstein-Vazirani algorithm, Simon algorithm, Shor's algorithms, etc. Quantum parallelism also significantly speeds up the database search algorithm, which is important in computer science because it comes as a subroutine in many important algorithms. Quantum database search of Grover achieves the task of finding the target element in an unsorted database in a time quadratically faster than the classical computer. We review Grover's quantum search algorithms for a singe and multiple target elements in a database. The partial search algorithm of Grover and Radhakrishnan and its optimization by Korepin called GRK algorithm are also discussed.
An efficient impedance method for induced field evaluation based on a stabilized Bi-conjugate gradient algorithm.

PubMed

Wang, Hua; Liu, Feng; Xia, Ling; Crozier, Stuart

2008-11-21

This paper presents a stabilized Bi-conjugate gradient algorithm (BiCGstab) that can significantly improve the performance of the impedance method, which has been widely applied to model low-frequency field induction phenomena in voxel phantoms. The improved impedance method offers remarkable computational advantages in terms of convergence performance and memory consumption over the conventional, successive over-relaxation (SOR)-based algorithm. The scheme has been validated against other numerical/analytical solutions on a lossy, multilayered sphere phantom excited by an ideal coil loop. To demonstrate the computational performance and application capability of the developed algorithm, the induced fields inside a human phantom due to a low-frequency hyperthermia device is evaluated. The simulation results show the numerical accuracy and superior performance of the method.
An Analysis of Navigation Algorithms for Smartphones Using J2ME

NASA Astrophysics Data System (ADS)

Santos, André C.; Tarrataca, Luís; Cardoso, João M. P.

Embedded systems are considered one of the most potential areas for future innovations. Two embedded fields that will most certainly take a primary role in future innovations are mobile robotics and mobile computing. Mobile robots and smartphones are growing in number and functionalities, becoming a presence in our daily life. In this paper, we study the current feasibility of a smartphone to execute navigation algorithms. As a test case, we use a smartphone to control an autonomous mobile robot. We tested three navigation problems: Mapping, Localization and Path Planning. For each of these problems, an algorithm has been chosen, developed in J2ME, and tested on the field. Results show the current mobile Java capacity for executing computationally demanding algorithms and reveal the real possibility of using smartphones for autonomous navigation.
Quantum speedup in solving the maximal-clique problem

NASA Astrophysics Data System (ADS)

Chang, Weng-Long; Yu, Qi; Li, Zhaokai; Chen, Jiahui; Peng, Xinhua; Feng, Mang

2018-03-01

The maximal-clique problem, to find the maximally sized clique in a given graph, is classically an NP-complete computational problem, which has potential applications ranging from electrical engineering, computational chemistry, and bioinformatics to social networks. Here we develop a quantum algorithm to solve the maximal-clique problem for any graph G with n vertices with quadratic speedup over its classical counterparts, where the time and spatial complexities are reduced to, respectively, O (√{2n}) and O (n2) . With respect to oracle-related quantum algorithms for the NP-complete problems, we identify our algorithm as optimal. To justify the feasibility of the proposed quantum algorithm, we successfully solve a typical clique problem for a graph G with two vertices and one edge by carrying out a nuclear magnetic resonance experiment involving four qubits.
Un algorithme efficace d'intégration plastique pour un matériau obéissant au critère anisotrope de Hill

NASA Astrophysics Data System (ADS)

Titeux, Isabelle; Li, Yuming M.; Debray, Karl; Guo, Ying-Qiao

2004-11-01

This Note deals with an efficient algorithm to carry out the plastic integration and compute the stresses due to large strains for materials satisfying the Hill's anisotropic yield criterion. The classical algorithm of plastic integration such as 'Return Mapping Method' is largely used for nonlinear analyses of structures and numerical simulations of forming processes, but it requires an iterative schema and may have convergence problems. A new direct algorithm based on a scalar method is developed which allows us to directly obtain the plastic multiplier without an iteration procedure; thus the computation time is largely reduced and the numerical problems are avoided. To cite this article: I. Titeux et al., C. R. Mecanique 332 (2004).
NPLOT: an Interactive Plotting Program for NASTRAN Finite Element Models

NASA Technical Reports Server (NTRS)

Jones, G. K.; Mcentire, K. J.

1985-01-01

The NPLOT (NASTRAN Plot) is an interactive computer graphics program for plotting undeformed and deformed NASTRAN finite element models. Developed at NASA's Goddard Space Flight Center, the program provides flexible element selection and grid point, ASET and SPC degree of freedom labelling. It is easy to use and provides a combination menu and command driven user interface. NPLOT also provides very fast hidden line and haloed line algorithms. The hidden line algorithm in NPLOT proved to be both very accurate and several times faster than other existing hidden line algorithms. A fast spatial bucket sort and horizon edge computation are used to achieve this high level of performance. The hidden line and the haloed line algorithms are the primary features that make NPLOT different from other plotting programs.
Survey of new vector computers: The CRAY 1S from CRAY research; the CYBER 205 from CDC and the parallel computer from ICL - architecture and programming

NASA Technical Reports Server (NTRS)

Gentzsch, W.

1982-01-01

Problems which can arise with vector and parallel computers are discussed in a user oriented context. Emphasis is placed on the algorithms used and the programming techniques adopted. Three recently developed supercomputers are examined and typical application examples are given in CRAY FORTRAN, CYBER 205 FORTRAN and DAP (distributed array processor) FORTRAN. The systems performance is compared. The addition of parts of two N x N arrays is considered. The influence of the architecture on the algorithms and programming language is demonstrated. Numerical analysis of magnetohydrodynamic differential equations by an explicit difference method is illustrated, showing very good results for all three systems. The prognosis for supercomputer development is assessed.
Development of an Output-based Adaptive Method for Multi-Dimensional Euler and Navier-Stokes Simulations

NASA Technical Reports Server (NTRS)

Darmofal, David L.

2003-01-01

The use of computational simulations in the prediction of complex aerodynamic flows is becoming increasingly prevalent in the design process within the aerospace industry. Continuing advancements in both computing technology and algorithmic development are ultimately leading to attempts at simulating ever-larger, more complex problems. However, by increasing the reliance on computational simulations in the design cycle, we must also increase the accuracy of these simulations in order to maintain or improve the reliability arid safety of the resulting aircraft. At the same time, large-scale computational simulations must be made more affordable so that their potential benefits can be fully realized within the design cycle. Thus, a continuing need exists for increasing the accuracy and efficiency of computational algorithms such that computational fluid dynamics can become a viable tool in the design of more reliable, safer aircraft. The objective of this research was the development of an error estimation and grid adaptive strategy for reducing simulation errors in integral outputs (functionals) such as lift or drag from from multi-dimensional Euler and Navier-Stokes simulations. In this final report, we summarize our work during this grant.
Particle transport in the human respiratory tract: formulation of a nodal inverse distance weighted Eulerian-Lagrangian transport and implementation of the Wind-Kessel algorithm for an oral delivery.

PubMed

Kannan, Ravishekar; Guo, Peng; Przekwas, Andrzej

2016-06-01

This paper is the first in a series wherein efficient computational methods are developed and implemented to accurately quantify the transport, deposition, and clearance of the microsized particles (range of interest: 2 to 10 µm) in the human respiratory tract. In particular, this paper (part I) deals with (i) development of a detailed 3D computational finite volume mesh comprising of the NOPL (nasal, oral, pharyngeal and larynx), trachea and several airway generations; (ii) use of CFD Research Corporation's finite volume Computational Biology (CoBi) flow solver to obtain the flow physics for an oral inhalation simulation; (iii) implement a novel and accurate nodal inverse distance weighted Eulerian-Lagrangian formulation to accurately obtain the deposition, and (iv) development of Wind-Kessel boundary condition algorithm. This new Wind-Kessel boundary condition algorithm allows the 'escaped' particles to reenter the airway through the outlets, thereby to an extent accounting for the drawbacks of having a finite number of lung generations in the computational mesh. The deposition rates in the NOPL, trachea, the first and second bifurcation were computed, and they were in reasonable accord with the Typical Path Length model. The quantitatively validated results indicate that these developments will be useful for (i) obtaining depositions in diseased lungs (because of asthma and COPD), for which there are no empirical models, and (ii) obtaining the secondary clearance (mucociliary clearance) of the deposited particles. Copyright © 2015 John Wiley & Sons, Ltd. Copyright © 2015 John Wiley & Sons, Ltd.
Efficient Craig Interpolation for Linear Diophantine (Dis)Equations and Linear Modular Equations

DTIC Science & Technology

2008-02-01

Craig interpolants has enabled the development of powerful hardware and software model checking techniques. Efficient algorithms are known for computing...interpolants in rational and real linear arithmetic. We focus on subsets of integer linear arithmetic. Our main results are polynomial time algorithms ...congruences), and linear diophantine disequations. We show the utility of the proposed interpolation algorithms for discovering modular/divisibility predicates
A Real Time Controller For Applications In Smart Structures

NASA Astrophysics Data System (ADS)

Ahrens, Christian P.; Claus, Richard O.

1990-02-01

Research in smart structures, especially the area of vibration suppression, has warranted the investigation of advanced computing environments. Real time PC computing power has limited development of high order control algorithms. This paper presents a simple Real Time Embedded Control System (RTECS) in an application of Intelligent Structure Monitoring by way of modal domain sensing for vibration control. It is compared to a PC AT based system for overall functionality and speed. The system employs a novel Reduced Instruction Set Computer (RISC) microcontroller capable of 15 million instructions per second (MIPS) continuous performance and burst rates of 40 MIPS. Advanced Complimentary Metal Oxide Semiconductor (CMOS) circuits are integrated on a single 100 mm by 160 mm printed circuit board requiring only 1 Watt of power. An operating system written in Forth provides high speed operation and short development cycles. The system allows for implementation of Input/Output (I/O) intensive algorithms and provides capability for advanced system development.
Automated Quantification of Pneumothorax in CT

PubMed Central

Do, Synho; Salvaggio, Kristen; Gupta, Supriya; Kalra, Mannudeep; Ali, Nabeel U.; Pien, Homer

2012-01-01

An automated, computer-aided diagnosis (CAD) algorithm for the quantification of pneumothoraces from Multidetector Computed Tomography (MDCT) images has been developed. Algorithm performance was evaluated through comparison to manual segmentation by expert radiologists. A combination of two-dimensional and three-dimensional processing techniques was incorporated to reduce required processing time by two-thirds (as compared to similar techniques). Volumetric measurements on relative pneumothorax size were obtained and the overall performance of the automated method shows an average error of just below 1%. PMID:23082091
A mathematical model for computer image tracking.

PubMed

Legters, G R; Young, T Y

1982-06-01

A mathematical model using an operator formulation for a moving object in a sequence of images is presented. Time-varying translation and rotation operators are derived to describe the motion. A variational estimation algorithm is developed to track the dynamic parameters of the operators. The occlusion problem is alleviated by using a predictive Kalman filter to keep the tracking on course during severe occlusion. The tracking algorithm (variational estimation in conjunction with Kalman filter) is implemented to track moving objects with occasional occlusion in computer-simulated binary images.
Implementation of a block Lanczos algorithm for Eigenproblem solution of gyroscopic systems

NASA Technical Reports Server (NTRS)

Gupta, Kajal K.; Lawson, Charles L.

1987-01-01

The details of implementation of a general numerical procedure developed for the accurate and economical computation of natural frequencies and associated modes of any elastic structure rotating along an arbitrary axis are described. A block version of the Lanczos algorithm is derived for the solution that fully exploits associated matrix sparsity and employs only real numbers in all relevant computations. It is also capable of determining multiple roots and proves to be most efficient when compared to other, similar, exisiting techniques.
Motion Cueing Algorithm Development: New Motion Cueing Program Implementation and Tuning

NASA Technical Reports Server (NTRS)

Houck, Jacob A. (Technical Monitor); Telban, Robert J.; Cardullo, Frank M.; Kelly, Lon C.

2005-01-01

A computer program has been developed for the purpose of driving the NASA Langley Research Center Visual Motion Simulator (VMS). This program includes two new motion cueing algorithms, the optimal algorithm and the nonlinear algorithm. A general description of the program is given along with a description and flowcharts for each cueing algorithm, and also descriptions and flowcharts for subroutines used with the algorithms. Common block variable listings and a program listing are also provided. The new cueing algorithms have a nonlinear gain algorithm implemented that scales each aircraft degree-of-freedom input with a third-order polynomial. A description of the nonlinear gain algorithm is given along with past tuning experience and procedures for tuning the gain coefficient sets for each degree-of-freedom to produce the desired piloted performance. This algorithm tuning will be needed when the nonlinear motion cueing algorithm is implemented on a new motion system in the Cockpit Motion Facility (CMF) at the NASA Langley Research Center.
Investigation, Development, and Evaluation of Performance Proving for Fault-tolerant Computers

NASA Technical Reports Server (NTRS)

Levitt, K. N.; Schwartz, R.; Hare, D.; Moore, J. S.; Melliar-Smith, P. M.; Shostak, R. E.; Boyer, R. S.; Green, M. W.; Elliott, W. D.

1983-01-01

A number of methodologies for verifying systems and computer based tools that assist users in verifying their systems were developed. These tools were applied to verify in part the SIFT ultrareliable aircraft computer. Topics covered included: STP theorem prover; design verification of SIFT; high level language code verification; assembly language level verification; numerical algorithm verification; verification of flight control programs; and verification of hardware logic.
Algorithms used for read-out optical system pointing to multiplexed computer generated 1D-Fourier holograms and decoding the encrypted information

NASA Astrophysics Data System (ADS)

Donchenko, Sergey S.; Odinokov, Sergey B.; Betin, Alexandr U.; Hanevich, Pavel; Semishko, Sergey; Zlokazov, Evgenii Y.

2017-05-01

The holographic disk reading device for recovery of CGFH is described. Principle of its work is shown. Analyzed approaches for developing algorithms, used in this device: guidance and decoding. Listed results of experimental researches.
Constructing high-quality bounding volume hierarchies for N-body computation using the acceptance volume heuristic

NASA Astrophysics Data System (ADS)

Olsson, O.

2018-01-01

We present a novel heuristic derived from a probabilistic cost model for approximate N-body simulations. We show that this new heuristic can be used to guide tree construction towards higher quality trees with improved performance over current N-body codes. This represents an important step beyond the current practice of using spatial partitioning for N-body simulations, and enables adoption of a range of state-of-the-art algorithms developed for computer graphics applications to yield further improvements in N-body simulation performance. We outline directions for further developments and review the most promising such algorithms.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.