computer algorithm calculates: Topics by Science.gov

Sample records for computer algorithm calculates

Parallelizing flow-accumulation calculations on graphics processing units—From iterative DEM preprocessing algorithm to recursive multiple-flow-direction algorithm

NASA Astrophysics Data System (ADS)

Qin, Cheng-Zhi; Zhan, Lijun

2012-06-01

As one of the important tasks in digital terrain analysis, the calculation of flow accumulations from gridded digital elevation models (DEMs) usually involves two steps in a real application: (1) using an iterative DEM preprocessing algorithm to remove the depressions and flat areas commonly contained in real DEMs, and (2) using a recursive flow-direction algorithm to calculate the flow accumulation for every cell in the DEM. Because both algorithms are computationally intensive, quick calculation of the flow accumulations from a DEM (especially for a large area) presents a practical challenge to personal computer (PC) users. In recent years, rapid increases in hardware capacity of the graphics processing units (GPUs) provided in modern PCs have made it possible to meet this challenge in a PC environment. Parallel computing on GPUs using a compute-unified-device-architecture (CUDA) programming model has been explored to speed up the execution of the single-flow-direction algorithm (SFD). However, the parallel implementation on a GPU of the multiple-flow-direction (MFD) algorithm, which generally performs better than the SFD algorithm, has not been reported. Moreover, GPU-based parallelization of the DEM preprocessing step in the flow-accumulation calculations has not been addressed. This paper proposes a parallel approach to calculate flow accumulations (including both iterative DEM preprocessing and a recursive MFD algorithm) on a CUDA-compatible GPU. For the parallelization of an MFD algorithm (MFD-md), two different parallelization strategies using a GPU are explored. The first parallelization strategy, which has been used in the existing parallel SFD algorithm on GPU, has the problem of computing redundancy. Therefore, we designed a parallelization strategy based on graph theory. The application results show that the proposed parallel approach to calculate flow accumulations on a GPU performs much faster than either sequential algorithms or other parallel GPU-based algorithms based on existing parallelization strategies.
Algorithm Calculates Cumulative Poisson Distribution

NASA Technical Reports Server (NTRS)

Bowerman, Paul N.; Nolty, Robert C.; Scheuer, Ernest M.

1992-01-01

Algorithm calculates accurate values of cumulative Poisson distribution under conditions where other algorithms fail because numbers are so small (underflow) or so large (overflow) that computer cannot process them. Factors inserted temporarily to prevent underflow and overflow. Implemented in CUMPOIS computer program described in "Cumulative Poisson Distribution Program" (NPO-17714).
Computation of multi-dimensional viscous supersonic flow

NASA Technical Reports Server (NTRS)

Buggeln, R. C.; Kim, Y. N.; Mcdonald, H.

1986-01-01

A method has been developed for two- and three-dimensional computations of viscous supersonic jet flows interacting with an external flow. The approach employs a reduced form of the Navier-Stokes equations which allows solution as an initial-boundary value problem in space, using an efficient noniterative forward marching algorithm. Numerical instability associated with forward marching algorithms for flows with embedded subsonic regions is avoided by approximation of the reduced form of the Navier-Stokes equations in the subsonic regions of the boundary layers. Supersonic and subsonic portions of the flow field are simultaneously calculated by a consistently split linearized block implicit computational algorithm. The results of computations for a series of test cases associated with supersonic jet flow is presented and compared with other calculations for axisymmetric cases. Demonstration calculations indicate that the computational technique has great promise as a tool for calculating a wide range of supersonic flow problems including jet flow. Finally, a User's Manual is presented for the computer code used to perform the calculations.
Treecode with a Special-Purpose Processor

NASA Astrophysics Data System (ADS)

Makino, Junichiro

1991-08-01

We describe an implementation of the modified Barnes-Hut tree algorithm for a gravitational N-body calculation on a GRAPE (GRAvity PipE) backend processor. GRAPE is a special-purpose computer for N-body calculations. It receives the positions and masses of particles from a host computer and then calculates the gravitational force at each coordinate specified by the host. To use this GRAPE processor with the hierarchical tree algorithm, the host computer must maintain a list of all nodes that exert force on a particle. If we create this list for each particle of the system at each timestep, the number of floating-point operations on the host and that on GRAPE would become comparable, and the increased speed obtained by using GRAPE would be small. In our modified algorithm, we create a list of nodes for many particles. Thus, the amount of the work required of the host is significantly reduced. This algorithm was originally developed by Barnes in order to vectorize the force calculation on a Cyber 205. With this algorithm, the computing time of the force calculation becomes comparable to that of the tree construction, if the GRAPE backend processor is sufficiently fast. The obtained speed-up factor is 30 to 50 for a RISC-based host computer and GRAPE-1A with a peak speed of 240 Mflops.
Minimum time acceleration of aircraft turbofan engines by using an algorithm based on nonlinear programming

NASA Technical Reports Server (NTRS)

Teren, F.

1977-01-01

Minimum time accelerations of aircraft turbofan engines are presented. The calculation of these accelerations was made by using a piecewise linear engine model, and an algorithm based on nonlinear programming. Use of this model and algorithm allows such trajectories to be readily calculated on a digital computer with a minimal expenditure of computer time.
Pressure algorithm for elliptic flow calculations with the PDF method

NASA Technical Reports Server (NTRS)

Anand, M. S.; Pope, S. B.; Mongia, H. C.

1991-01-01

An algorithm to determine the mean pressure field for elliptic flow calculations with the probability density function (PDF) method is developed and applied. The PDF method is a most promising approach for the computation of turbulent reacting flows. Previous computations of elliptic flows with the method were in conjunction with conventional finite volume based calculations that provided the mean pressure field. The algorithm developed and described here permits the mean pressure field to be determined within the PDF calculations. The PDF method incorporating the pressure algorithm is applied to the flow past a backward-facing step. The results are in good agreement with data for the reattachment length, mean velocities, and turbulence quantities including triple correlations.
Computational efficiency for the surface renewal method

NASA Astrophysics Data System (ADS)

Kelley, Jason; Higgins, Chad

2018-04-01

Measuring surface fluxes using the surface renewal (SR) method requires programmatic algorithms for tabulation, algebraic calculation, and data quality control. A number of different methods have been published describing automated calibration of SR parameters. Because the SR method utilizes high-frequency (10 Hz+) measurements, some steps in the flux calculation are computationally expensive, especially when automating SR to perform many iterations of these calculations. Several new algorithms were written that perform the required calculations more efficiently and rapidly, and that tested for sensitivity to length of flux averaging period, ability to measure over a large range of lag timescales, and overall computational efficiency. These algorithms utilize signal processing techniques and algebraic simplifications that demonstrate simple modifications that dramatically improve computational efficiency. The results here complement efforts by other authors to standardize a robust and accurate computational SR method. Increased speed of computation time grants flexibility to implementing the SR method, opening new avenues for SR to be used in research, for applied monitoring, and in novel field deployments.
The Impact of Monte Carlo Dose Calculations on Intensity-Modulated Radiation Therapy

NASA Astrophysics Data System (ADS)

Siebers, J. V.; Keall, P. J.; Mohan, R.

The effect of dose calculation accuracy for IMRT was studied by comparing different dose calculation algorithms. A head and neck IMRT plan was optimized using a superposition dose calculation algorithm. Dose was re-computed for the optimized plan using both Monte Carlo and pencil beam dose calculation algorithms to generate patient and phantom dose distributions. Tumor control probabilities (TCP) and normal tissue complication probabilities (NTCP) were computed to estimate the plan outcome. For the treatment plan studied, Monte Carlo best reproduces phantom dose measurements, the TCP was slightly lower than the superposition and pencil beam results, and the NTCP values differed little.
Impedance computed tomography using an adaptive smoothing coefficient algorithm.

PubMed

Suzuki, A; Uchiyama, A

2001-01-01

In impedance computed tomography, a fixed coefficient regularization algorithm has been frequently used to improve the ill-conditioning problem of the Newton-Raphson algorithm. However, a lot of experimental data and a long period of computation time are needed to determine a good smoothing coefficient because a good smoothing coefficient has to be manually chosen from a number of coefficients and is a constant for each iteration calculation. Thus, sometimes the fixed coefficient regularization algorithm distorts the information or fails to obtain any effect. In this paper, a new adaptive smoothing coefficient algorithm is proposed. This algorithm automatically calculates the smoothing coefficient from the eigenvalue of the ill-conditioned matrix. Therefore, the effective images can be obtained within a short computation time. Also the smoothing coefficient is automatically adjusted by the information related to the real resistivity distribution and the data collection method. In our impedance system, we have reconstructed the resistivity distributions of two phantoms using this algorithm. As a result, this algorithm only needs one-fifth the computation time compared to the fixed coefficient regularization algorithm. When compared to the fixed coefficient regularization algorithm, it shows that the image is obtained more rapidly and applicable in real-time monitoring of the blood vessel.
On some methods for improving time of reachability sets computation for the dynamic system control problem

NASA Astrophysics Data System (ADS)

Zimovets, Artem; Matviychuk, Alexander; Ushakov, Vladimir

2016-12-01

The paper presents two different approaches to reduce the time of computer calculation of reachability sets. First of these two approaches use different data structures for storing the reachability sets in the computer memory for calculation in single-threaded mode. Second approach is based on using parallel algorithms with reference to the data structures from the first approach. Within the framework of this paper parallel algorithm of approximate reachability set calculation on computer with SMP-architecture is proposed. The results of numerical modelling are presented in the form of tables which demonstrate high efficiency of parallel computing technology and also show how computing time depends on the used data structure.
Computation of multi-dimensional viscous supersonic jet flow

NASA Technical Reports Server (NTRS)

Kim, Y. N.; Buggeln, R. C.; Mcdonald, H.

1986-01-01

A new method has been developed for two- and three-dimensional computations of viscous supersonic flows with embedded subsonic regions adjacent to solid boundaries. The approach employs a reduced form of the Navier-Stokes equations which allows solution as an initial-boundary value problem in space, using an efficient noniterative forward marching algorithm. Numerical instability associated with forward marching algorithms for flows with embedded subsonic regions is avoided by approximation of the reduced form of the Navier-Stokes equations in the subsonic regions of the boundary layers. Supersonic and subsonic portions of the flow field are simultaneously calculated by a consistently split linearized block implicit computational algorithm. The results of computations for a series of test cases relevant to internal supersonic flow is presented and compared with data. Comparison between data and computation are in general excellent thus indicating that the computational technique has great promise as a tool for calculating supersonic flow with embedded subsonic regions. Finally, a User's Manual is presented for the computer code used to perform the calculations.
An S N Algorithm for Modern Architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, Randal Scott

2016-08-29

LANL discrete ordinates transport packages are required to perform large, computationally intensive time-dependent calculations on massively parallel architectures, where even a single such calculation may need many months to complete. While KBA methods scale out well to very large numbers of compute nodes, we are limited by practical constraints on the number of such nodes we can actually apply to any given calculation. Instead, we describe a modified KBA algorithm that allows realization of the reductions in solution time offered by both the current, and future, architectural changes within a compute node.
A projected preconditioned conjugate gradient algorithm for computing many extreme eigenpairs of a Hermitian matrix [A projected preconditioned conjugate gradient algorithm for computing a large eigenspace of a Hermitian matrix

DOE PAGES

Vecharynski, Eugene; Yang, Chao; Pask, John E.

2015-02-25

Here, we present an iterative algorithm for computing an invariant subspace associated with the algebraically smallest eigenvalues of a large sparse or structured Hermitian matrix A. We are interested in the case in which the dimension of the invariant subspace is large (e.g., over several hundreds or thousands) even though it may still be small relative to the dimension of A. These problems arise from, for example, density functional theory (DFT) based electronic structure calculations for complex materials. The key feature of our algorithm is that it performs fewer Rayleigh–Ritz calculations compared to existing algorithms such as the locally optimalmore » block preconditioned conjugate gradient or the Davidson algorithm. It is a block algorithm, and hence can take advantage of efficient BLAS3 operations and be implemented with multiple levels of concurrency. We discuss a number of practical issues that must be addressed in order to implement the algorithm efficiently on a high performance computer.« less
Algorithms for the Computation of Debris Risk

NASA Technical Reports Server (NTRS)

Matney, Mark J.

2017-01-01

Determining the risks from space debris involve a number of statistical calculations. These calculations inevitably involve assumptions about geometry - including the physical geometry of orbits and the geometry of satellites. A number of tools have been developed in NASA’s Orbital Debris Program Office to handle these calculations; many of which have never been published before. These include algorithms that are used in NASA’s Orbital Debris Engineering Model ORDEM 3.0, as well as other tools useful for computing orbital collision rates and ground casualty risks. This paper presents an introduction to these algorithms and the assumptions upon which they are based.
Algorithms for the Computation of Debris Risks

NASA Technical Reports Server (NTRS)

Matney, Mark

2017-01-01

Determining the risks from space debris involve a number of statistical calculations. These calculations inevitably involve assumptions about geometry - including the physical geometry of orbits and the geometry of non-spherical satellites. A number of tools have been developed in NASA's Orbital Debris Program Office to handle these calculations; many of which have never been published before. These include algorithms that are used in NASA's Orbital Debris Engineering Model ORDEM 3.0, as well as other tools useful for computing orbital collision rates and ground casualty risks. This paper will present an introduction to these algorithms and the assumptions upon which they are based.
Gas flow calculation method of a ramjet engine

NASA Astrophysics Data System (ADS)

Kostyushin, Kirill; Kagenov, Anuar; Eremin, Ivan; Zhiltsov, Konstantin; Shuvarikov, Vladimir

2017-11-01

At the present study calculation methodology of gas dynamics equations in ramjet engine is presented. The algorithm is based on Godunov`s scheme. For realization of calculation algorithm, the system of data storage is offered, the system does not depend on mesh topology, and it allows using the computational meshes with arbitrary number of cell faces. The algorithm of building a block-structured grid is given. Calculation algorithm in the software package "FlashFlow" is implemented. Software package is verified on the calculations of simple configurations of air intakes and scramjet models.
Regional-scale calculation of the LS factor using parallel processing

NASA Astrophysics Data System (ADS)

Liu, Kai; Tang, Guoan; Jiang, Ling; Zhu, A.-Xing; Yang, Jianyi; Song, Xiaodong

2015-05-01

With the increase of data resolution and the increasing application of USLE over large areas, the existing serial implementation of algorithms for computing the LS factor is becoming a bottleneck. In this paper, a parallel processing model based on message passing interface (MPI) is presented for the calculation of the LS factor, so that massive datasets at a regional scale can be processed efficiently. The parallel model contains algorithms for calculating flow direction, flow accumulation, drainage network, slope, slope length and the LS factor. According to the existence of data dependence, the algorithms are divided into local algorithms and global algorithms. Parallel strategy are designed according to the algorithm characters including the decomposition method for maintaining the integrity of the results, optimized workflow for reducing the time taken for exporting the unnecessary intermediate data and a buffer-communication-computation strategy for improving the communication efficiency. Experiments on a multi-node system show that the proposed parallel model allows efficient calculation of the LS factor at a regional scale with a massive dataset.
Parallel Implementation of the Terrain Masking Algorithm

DTIC Science & Technology

1994-03-01

contains behavior rules which can define a computation or an algorithm. It can communicate with other process nodes, it can contain local data, and it can...terrain maskirg calculation is being performed. It is this algorithm that comsumes about seventy percent of the total terrain masking calculation time
Real-time polarization imaging algorithm for camera-based polarization navigation sensors.

PubMed

Lu, Hao; Zhao, Kaichun; You, Zheng; Huang, Kaoli

2017-04-10

Biologically inspired polarization navigation is a promising approach due to its autonomous nature, high precision, and robustness. Many researchers have built point source-based and camera-based polarization navigation prototypes in recent years. Camera-based prototypes can benefit from their high spatial resolution but incur a heavy computation load. The pattern recognition algorithm in most polarization imaging algorithms involves several nonlinear calculations that impose a significant computation burden. In this paper, the polarization imaging and pattern recognition algorithms are optimized through reduction to several linear calculations by exploiting the orthogonality of the Stokes parameters without affecting precision according to the features of the solar meridian and the patterns of the polarized skylight. The algorithm contains a pattern recognition algorithm with a Hough transform as well as orientation measurement algorithms. The algorithm was loaded and run on a digital signal processing system to test its computational complexity. The test showed that the running time decreased to several tens of milliseconds from several thousand milliseconds. Through simulations and experiments, it was found that the algorithm can measure orientation without reducing precision. It can hence satisfy the practical demands of low computational load and high precision for use in embedded systems.
Universal algorithms and programs for calculating the motion parameters in the two-body problem

NASA Technical Reports Server (NTRS)

Bakhshiyan, B. T.; Sukhanov, A. A.

1979-01-01

The algorithms and FORTRAN programs for computing positions and velocities, orbital elements and first and second partial derivatives in the two-body problem are presented. The algorithms are applicable for any value of eccentricity and are convenient for computing various navigation parameters.

Hardware architecture design of image restoration based on time-frequency domain computation

NASA Astrophysics Data System (ADS)

Wen, Bo; Zhang, Jing; Jiao, Zipeng

2013-10-01

The image restoration algorithms based on time-frequency domain computation is high maturity and applied widely in engineering. To solve the high-speed implementation of these algorithms, the TFDC hardware architecture is proposed. Firstly, the main module is designed, by analyzing the common processing and numerical calculation. Then, to improve the commonality, the iteration control module is planed for iterative algorithms. In addition, to reduce the computational cost and memory requirements, the necessary optimizations are suggested for the time-consuming module, which include two-dimensional FFT/IFFT and the plural calculation. Eventually, the TFDC hardware architecture is adopted for hardware design of real-time image restoration system. The result proves that, the TFDC hardware architecture and its optimizations can be applied to image restoration algorithms based on TFDC, with good algorithm commonality, hardware realizability and high efficiency.
Creation of parallel algorithms for the solution of problems of gas dynamics on multi-core computers and GPU

NASA Astrophysics Data System (ADS)

Rybakin, B.; Bogatencov, P.; Secrieru, G.; Iliuha, N.

2013-10-01

The paper deals with a parallel algorithm for calculations on multiprocessor computers and GPU accelerators. The calculations of shock waves interaction with low-density bubble results and the problem of the gas flow with the forces of gravity are presented. This algorithm combines a possibility to capture a high resolution of shock waves, the second-order accuracy for TVD schemes, and a possibility to observe a low-level diffusion of the advection scheme. Many complex problems of continuum mechanics are numerically solved on structured or unstructured grids. To improve the accuracy of the calculations is necessary to choose a sufficiently small grid (with a small cell size). This leads to the drawback of a substantial increase of computation time. Therefore, for the calculations of complex problems it is reasonable to use the method of Adaptive Mesh Refinement. That is, the grid refinement is performed only in the areas of interest of the structure, where, e.g., the shock waves are generated, or a complex geometry or other such features exist. Thus, the computing time is greatly reduced. In addition, the execution of the application on the resulting sequence of nested, decreasing nets can be parallelized. Proposed algorithm is based on the AMR method. Utilization of AMR method can significantly improve the resolution of the difference grid in areas of high interest, and from other side to accelerate the processes of the multi-dimensional problems calculating. Parallel algorithms of the analyzed difference models realized for the purpose of calculations on graphic processors using the CUDA technology [1].
A case study of view-factor rectification procedures for diffuse-gray radiation enclosure computations

NASA Technical Reports Server (NTRS)

Taylor, Robert P.; Luck, Rogelio

1995-01-01

The view factors which are used in diffuse-gray radiation enclosure calculations are often computed by approximate numerical integrations. These approximately calculated view factors will usually not satisfy the important physical constraints of reciprocity and closure. In this paper several view-factor rectification algorithms are reviewed and a rectification algorithm based on a least-squares numerical filtering scheme is proposed with both weighted and unweighted classes. A Monte-Carlo investigation is undertaken to study the propagation of view-factor and surface-area uncertainties into the heat transfer results of the diffuse-gray enclosure calculations. It is found that the weighted least-squares algorithm is vastly superior to the other rectification schemes for the reduction of the heat-flux sensitivities to view-factor uncertainties. In a sample problem, which has proven to be very sensitive to uncertainties in view factor, the heat transfer calculations with weighted least-squares rectified view factors are very good with an original view-factor matrix computed to only one-digit accuracy. All of the algorithms had roughly equivalent effects on the reduction in sensitivity to area uncertainty in this case study.
BCM: toolkit for Bayesian analysis of Computational Models using samplers.

PubMed

Thijssen, Bram; Dijkstra, Tjeerd M H; Heskes, Tom; Wessels, Lodewyk F A

2016-10-21

Computational models in biology are characterized by a large degree of uncertainty. This uncertainty can be analyzed with Bayesian statistics, however, the sampling algorithms that are frequently used for calculating Bayesian statistical estimates are computationally demanding, and each algorithm has unique advantages and disadvantages. It is typically unclear, before starting an analysis, which algorithm will perform well on a given computational model. We present BCM, a toolkit for the Bayesian analysis of Computational Models using samplers. It provides efficient, multithreaded implementations of eleven algorithms for sampling from posterior probability distributions and for calculating marginal likelihoods. BCM includes tools to simplify the process of model specification and scripts for visualizing the results. The flexible architecture allows it to be used on diverse types of biological computational models. In an example inference task using a model of the cell cycle based on ordinary differential equations, BCM is significantly more efficient than existing software packages, allowing more challenging inference problems to be solved. BCM represents an efficient one-stop-shop for computational modelers wishing to use sampler-based Bayesian statistics.
Massively parallel algorithm and implementation of RI-MP2 energy calculation for peta-scale many-core supercomputers.

PubMed

Katouda, Michio; Naruse, Akira; Hirano, Yukihiko; Nakajima, Takahito

2016-11-15

A new parallel algorithm and its implementation for the RI-MP2 energy calculation utilizing peta-flop-class many-core supercomputers are presented. Some improvements from the previous algorithm (J. Chem. Theory Comput. 2013, 9, 5373) have been performed: (1) a dual-level hierarchical parallelization scheme that enables the use of more than 10,000 Message Passing Interface (MPI) processes and (2) a new data communication scheme that reduces network communication overhead. A multi-node and multi-GPU implementation of the present algorithm is presented for calculations on a central processing unit (CPU)/graphics processing unit (GPU) hybrid supercomputer. Benchmark results of the new algorithm and its implementation using the K computer (CPU clustering system) and TSUBAME 2.5 (CPU/GPU hybrid system) demonstrate high efficiency. The peak performance of 3.1 PFLOPS is attained using 80,199 nodes of the K computer. The peak performance of the multi-node and multi-GPU implementation is 514 TFLOPS using 1349 nodes and 4047 GPUs of TSUBAME 2.5. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
Assessment of the information content of patterns: an algorithm

NASA Astrophysics Data System (ADS)

Daemi, M. Farhang; Beurle, R. L.

1991-12-01

A preliminary investigation confirmed the possibility of assessing the translational and rotational information content of simple artificial images. The calculation is tedious, and for more realistic patterns it is essential to implement the method on a computer. This paper describes an algorithm developed for this purpose which confirms the results of the preliminary investigation. Use of the algorithm facilitates much more comprehensive analysis of the combined effect of continuous rotation and fine translation, and paves the way for analysis of more realistic patterns. Owing to the volume of calculation involved in these algorithms, extensive computing facilities were necessary. The major part of the work was carried out using an ICL 3900 series mainframe computer as well as other powerful workstations such as a RISC architecture MIPS machine.
Computer-Based Algorithmic Determination of Muscle Movement Onset Using M-Mode Ultrasonography

DTIC Science & Technology

2017-05-01

contraction images were analyzed visually and with three different classes of algorithms: pixel standard deviation (SD), high-pass filter and Teager Kaiser...Linear relationships and agreements between computed and visual muscle onset were calculated. The top algorithms were high-pass filtered with a 30 Hz...suggest that computer automated determination using high-pass filtering is a potential objective alternative to visual determination in human
Algorithms for the optimization of RBE-weighted dose in particle therapy.

PubMed

Horcicka, M; Meyer, C; Buschbacher, A; Durante, M; Krämer, M

2013-01-21

We report on various algorithms used for the nonlinear optimization of RBE-weighted dose in particle therapy. Concerning the dose calculation carbon ions are considered and biological effects are calculated by the Local Effect Model. Taking biological effects fully into account requires iterative methods to solve the optimization problem. We implemented several additional algorithms into GSI's treatment planning system TRiP98, like the BFGS-algorithm and the method of conjugated gradients, in order to investigate their computational performance. We modified textbook iteration procedures to improve the convergence speed. The performance of the algorithms is presented by convergence in terms of iterations and computation time. We found that the Fletcher-Reeves variant of the method of conjugated gradients is the algorithm with the best computational performance. With this algorithm we could speed up computation times by a factor of 4 compared to the method of steepest descent, which was used before. With our new methods it is possible to optimize complex treatment plans in a few minutes leading to good dose distributions. At the end we discuss future goals concerning dose optimization issues in particle therapy which might benefit from fast optimization solvers.
Algorithms for the optimization of RBE-weighted dose in particle therapy

NASA Astrophysics Data System (ADS)

Horcicka, M.; Meyer, C.; Buschbacher, A.; Durante, M.; Krämer, M.

2013-01-01

We report on various algorithms used for the nonlinear optimization of RBE-weighted dose in particle therapy. Concerning the dose calculation carbon ions are considered and biological effects are calculated by the Local Effect Model. Taking biological effects fully into account requires iterative methods to solve the optimization problem. We implemented several additional algorithms into GSI's treatment planning system TRiP98, like the BFGS-algorithm and the method of conjugated gradients, in order to investigate their computational performance. We modified textbook iteration procedures to improve the convergence speed. The performance of the algorithms is presented by convergence in terms of iterations and computation time. We found that the Fletcher-Reeves variant of the method of conjugated gradients is the algorithm with the best computational performance. With this algorithm we could speed up computation times by a factor of 4 compared to the method of steepest descent, which was used before. With our new methods it is possible to optimize complex treatment plans in a few minutes leading to good dose distributions. At the end we discuss future goals concerning dose optimization issues in particle therapy which might benefit from fast optimization solvers.
An efficient and accurate 3D displacements tracking strategy for digital volume correlation

NASA Astrophysics Data System (ADS)

Pan, Bing; Wang, Bo; Wu, Dafang; Lubineau, Gilles

2014-07-01

Owing to its inherent computational complexity, practical implementation of digital volume correlation (DVC) for internal displacement and strain mapping faces important challenges in improving its computational efficiency. In this work, an efficient and accurate 3D displacement tracking strategy is proposed for fast DVC calculation. The efficiency advantage is achieved by using three improvements. First, to eliminate the need of updating Hessian matrix in each iteration, an efficient 3D inverse compositional Gauss-Newton (3D IC-GN) algorithm is introduced to replace existing forward additive algorithms for accurate sub-voxel displacement registration. Second, to ensure the 3D IC-GN algorithm that converges accurately and rapidly and avoid time-consuming integer-voxel displacement searching, a generalized reliability-guided displacement tracking strategy is designed to transfer accurate and complete initial guess of deformation for each calculation point from its computed neighbors. Third, to avoid the repeated computation of sub-voxel intensity interpolation coefficients, an interpolation coefficient lookup table is established for tricubic interpolation. The computational complexity of the proposed fast DVC and the existing typical DVC algorithms are first analyzed quantitatively according to necessary arithmetic operations. Then, numerical tests are performed to verify the performance of the fast DVC algorithm in terms of measurement accuracy and computational efficiency. The experimental results indicate that, compared with the existing DVC algorithm, the presented fast DVC algorithm produces similar precision and slightly higher accuracy at a substantially reduced computational cost.
A New Algorithm to Optimize Maximal Information Coefficient

PubMed Central

Luo, Feng; Yuan, Zheming

2016-01-01

The maximal information coefficient (MIC) captures dependences between paired variables, including both functional and non-functional relationships. In this paper, we develop a new method, ChiMIC, to calculate the MIC values. The ChiMIC algorithm uses the chi-square test to terminate grid optimization and then removes the restriction of maximal grid size limitation of original ApproxMaxMI algorithm. Computational experiments show that ChiMIC algorithm can maintain same MIC values for noiseless functional relationships, but gives much smaller MIC values for independent variables. For noise functional relationship, the ChiMIC algorithm can reach the optimal partition much faster. Furthermore, the MCN values based on MIC calculated by ChiMIC can capture the complexity of functional relationships in a better way, and the statistical powers of MIC calculated by ChiMIC are higher than those calculated by ApproxMaxMI. Moreover, the computational costs of ChiMIC are much less than those of ApproxMaxMI. We apply the MIC values tofeature selection and obtain better classification accuracy using features selected by the MIC values from ChiMIC. PMID:27333001
A numerical algorithm for the explicit calculation of SU(N) and SL(N,C) Clebsch-Gordan coefficients

DOE Office of Scientific and Technical Information (OSTI.GOV)

Alex, Arne; Delft, Jan von; Kalus, Matthias

2011-02-15

We present an algorithm for the explicit numerical calculation of SU(N) and SL(N,C) Clebsch-Gordan coefficients, based on the Gelfand-Tsetlin pattern calculus. Our algorithm is well suited for numerical implementation; we include a computer code in an appendix. Our exposition presumes only familiarity with the representation theory of SU(2).
A Kirchhoff approach to seismic modeling and prestack depth migration

NASA Astrophysics Data System (ADS)

Liu, Zhen-Yue

1993-05-01

The Kirchhoff integral provides a robust method for implementing seismic modeling and prestack depth migration, which can handle lateral velocity variation and turning waves. With a little extra computation cost, the Kirchoff-type migration can obtain multiple outputs that have the same phase but different amplitudes, compared with that of other migration methods. The ratio of these amplitudes is helpful in computing some quantities such as reflection angle. I develop a seismic modeling and prestack depth migration method based on the Kirchhoff integral, that handles both laterally variant velocity and a dip beyond 90 degrees. The method uses a finite-difference algorithm to calculate travel times and WKBJ amplitudes for the Kirchhoff integral. Compared to ray-tracing algorithms, the finite-difference algorithm gives an efficient implementation and single-valued quantities (first arrivals) on output. In my finite difference algorithm, the upwind scheme is used to calculate travel times, and the Crank-Nicolson scheme is used to calculate amplitudes. Moreover, interpolation is applied to save computation cost. The modeling and migration algorithms require a smooth velocity function. I develop a velocity-smoothing technique based on damped least-squares to aid in obtaining a successful migration.
Probabilistic numerics and uncertainty in computations

PubMed Central

Hennig, Philipp; Osborne, Michael A.; Girolami, Mark

2015-01-01

We deliver a call to arms for probabilistic numerical methods: algorithms for numerical tasks, including linear algebra, integration, optimization and solving differential equations, that return uncertainties in their calculations. Such uncertainties, arising from the loss of precision induced by numerical calculation with limited time or hardware, are important for much contemporary science and industry. Within applications such as climate science and astrophysics, the need to make decisions on the basis of computations with large and complex data have led to a renewed focus on the management of numerical uncertainty. We describe how several seminal classic numerical methods can be interpreted naturally as probabilistic inference. We then show that the probabilistic view suggests new algorithms that can flexibly be adapted to suit application specifics, while delivering improved empirical performance. We provide concrete illustrations of the benefits of probabilistic numeric algorithms on real scientific problems from astrometry and astronomical imaging, while highlighting open problems with these new algorithms. Finally, we describe how probabilistic numerical methods provide a coherent framework for identifying the uncertainty in calculations performed with a combination of numerical algorithms (e.g. both numerical optimizers and differential equation solvers), potentially allowing the diagnosis (and control) of error sources in computations. PMID:26346321
Probabilistic numerics and uncertainty in computations.

PubMed

Hennig, Philipp; Osborne, Michael A; Girolami, Mark

2015-07-08

We deliver a call to arms for probabilistic numerical methods : algorithms for numerical tasks, including linear algebra, integration, optimization and solving differential equations, that return uncertainties in their calculations. Such uncertainties, arising from the loss of precision induced by numerical calculation with limited time or hardware, are important for much contemporary science and industry. Within applications such as climate science and astrophysics, the need to make decisions on the basis of computations with large and complex data have led to a renewed focus on the management of numerical uncertainty. We describe how several seminal classic numerical methods can be interpreted naturally as probabilistic inference. We then show that the probabilistic view suggests new algorithms that can flexibly be adapted to suit application specifics, while delivering improved empirical performance. We provide concrete illustrations of the benefits of probabilistic numeric algorithms on real scientific problems from astrometry and astronomical imaging, while highlighting open problems with these new algorithms. Finally, we describe how probabilistic numerical methods provide a coherent framework for identifying the uncertainty in calculations performed with a combination of numerical algorithms (e.g. both numerical optimizers and differential equation solvers), potentially allowing the diagnosis (and control) of error sources in computations.
Dosimetric comparison of lung stereotactic body radiotherapy treatment plans using averaged computed tomography and end-exhalation computed tomography images: Evaluation of the effect of different dose-calculation algorithms and prescription methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mitsuyoshi, Takamasa; Nakamura, Mitsuhiro, E-mail: m_nkmr@kuhp.kyoto-u.ac.jp; Matsuo, Yukinori

The purpose of this article is to quantitatively evaluate differences in dose distributions calculated using various computed tomography (CT) datasets, dose-calculation algorithms, and prescription methods in stereotactic body radiotherapy (SBRT) for patients with early-stage lung cancer. Data on 29 patients with early-stage lung cancer treated with SBRT were retrospectively analyzed. Averaged CT (Ave-CT) and expiratory CT (Ex-CT) images were reconstructed for each patient using 4-dimensional CT data. Dose distributions were initially calculated using the Ave-CT images and recalculated (in the same monitor units [MUs]) by employing Ex-CT images with the same beam arrangements. The dose-volume parameters, including D{sub 95}, D{submore » 90}, D{sub 50}, and D{sub 2} of the planning target volume (PTV), were compared between the 2 image sets. To explore the influence of dose-calculation algorithms and prescription methods on the differences in dose distributions evident between Ave-CT and Ex-CT images, we calculated dose distributions using the following 3 different algorithms: x-ray Voxel Monte Carlo (XVMC), Acuros XB (AXB), and the anisotropic analytical algorithm (AAA). We also used 2 different dose-prescription methods; the isocenter prescription and the PTV periphery prescription methods. All differences in PTV dose-volume parameters calculated using Ave-CT and Ex-CT data were within 3 percentage points (%pts) employing the isocenter prescription method, and within 1.5%pts using the PTV periphery prescription method, irrespective of which of the 3 algorithms (XVMC, AXB, and AAA) was employed. The frequencies of dose-volume parameters differing by >1%pt when the XVMC and AXB were used were greater than those associated with the use of the AAA, regardless of the dose-prescription method employed. All differences in PTV dose-volume parameters calculated using Ave-CT and Ex-CT data on patients who underwent lung SBRT were within 3%pts, regardless of the dose-calculation algorithm or the dose-prescription method employed.« less
Cloud computing task scheduling strategy based on improved differential evolution algorithm

NASA Astrophysics Data System (ADS)

Ge, Junwei; He, Qian; Fang, Yiqiu

2017-04-01

In order to optimize the cloud computing task scheduling scheme, an improved differential evolution algorithm for cloud computing task scheduling is proposed. Firstly, the cloud computing task scheduling model, according to the model of the fitness function, and then used improved optimization calculation of the fitness function of the evolutionary algorithm, according to the evolution of generation of dynamic selection strategy through dynamic mutation strategy to ensure the global and local search ability. The performance test experiment was carried out in the CloudSim simulation platform, the experimental results show that the improved differential evolution algorithm can reduce the cloud computing task execution time and user cost saving, good implementation of the optimal scheduling of cloud computing tasks.
Obtaining lower bounds from the progressive hedging algorithm for stochastic mixed-integer programs

DOE PAGES

Gade, Dinakar; Hackebeil, Gabriel; Ryan, Sarah M.; ...

2016-04-02

We present a method for computing lower bounds in the progressive hedging algorithm (PHA) for two-stage and multi-stage stochastic mixed-integer programs. Computing lower bounds in the PHA allows one to assess the quality of the solutions generated by the algorithm contemporaneously. The lower bounds can be computed in any iteration of the algorithm by using dual prices that are calculated during execution of the standard PHA. In conclusion, we report computational results on stochastic unit commitment and stochastic server location problem instances, and explore the relationship between key PHA parameters and the quality of the resulting lower bounds.
Computing Gröbner Bases within Linear Algebra

NASA Astrophysics Data System (ADS)

Suzuki, Akira

In this paper, we present an alternative algorithm to compute Gröbner bases, which is based on computations on sparse linear algebra. Both of S-polynomial computations and monomial reductions are computed in linear algebra simultaneously in this algorithm. So it can be implemented to any computational system which can handle linear algebra. For a given ideal in a polynomial ring, it calculates a Gröbner basis along with the corresponding term order appropriately.
Simulated quantum computation of molecular energies.

PubMed

Aspuru-Guzik, Alán; Dutoi, Anthony D; Love, Peter J; Head-Gordon, Martin

2005-09-09

The calculation time for the energy of atoms and molecules scales exponentially with system size on a classical computer but polynomially using quantum algorithms. We demonstrate that such algorithms can be applied to problems of chemical interest using modest numbers of quantum bits. Calculations of the water and lithium hydride molecular ground-state energies have been carried out on a quantum computer simulator using a recursive phase-estimation algorithm. The recursive algorithm reduces the number of quantum bits required for the readout register from about 20 to 4. Mappings of the molecular wave function to the quantum bits are described. An adiabatic method for the preparation of a good approximate ground-state wave function is described and demonstrated for a stretched hydrogen molecule. The number of quantum bits required scales linearly with the number of basis functions, and the number of gates required grows polynomially with the number of quantum bits.

Analysis of basic clustering algorithms for numerical estimation of statistical averages in biomolecules.

PubMed

Anandakrishnan, Ramu; Onufriev, Alexey

2008-03-01

In statistical mechanics, the equilibrium properties of a physical system of particles can be calculated as the statistical average over accessible microstates of the system. In general, these calculations are computationally intractable since they involve summations over an exponentially large number of microstates. Clustering algorithms are one of the methods used to numerically approximate these sums. The most basic clustering algorithms first sub-divide the system into a set of smaller subsets (clusters). Then, interactions between particles within each cluster are treated exactly, while all interactions between different clusters are ignored. These smaller clusters have far fewer microstates, making the summation over these microstates, tractable. These algorithms have been previously used for biomolecular computations, but remain relatively unexplored in this context. Presented here, is a theoretical analysis of the error and computational complexity for the two most basic clustering algorithms that were previously applied in the context of biomolecular electrostatics. We derive a tight, computationally inexpensive, error bound for the equilibrium state of a particle computed via these clustering algorithms. For some practical applications, it is the root mean square error, which can be significantly lower than the error bound, that may be more important. We how that there is a strong empirical relationship between error bound and root mean square error, suggesting that the error bound could be used as a computationally inexpensive metric for predicting the accuracy of clustering algorithms for practical applications. An example of error analysis for such an application-computation of average charge of ionizable amino-acids in proteins-is given, demonstrating that the clustering algorithm can be accurate enough for practical purposes.
Economizing Education: Assessment Algorithms and Calculative Agencies

ERIC Educational Resources Information Center

O'Keeffe, Cormac

2017-01-01

International Large Scale Assessments have been producing data about educational attainment for over 60 years. More recently however, these assessments as tests have become digitally and computationally complex and increasingly rely on the calculative work performed by algorithms. In this article I first consider the coordination of relations…
Rapid Calculation of Spacecraft Trajectories Using Efficient Taylor Series Integration

NASA Technical Reports Server (NTRS)

Scott, James R.; Martini, Michael C.

2011-01-01

A variable-order, variable-step Taylor series integration algorithm was implemented in NASA Glenn's SNAP (Spacecraft N-body Analysis Program) code. SNAP is a high-fidelity trajectory propagation program that can propagate the trajectory of a spacecraft about virtually any body in the solar system. The Taylor series algorithm's very high order accuracy and excellent stability properties lead to large reductions in computer time relative to the code's existing 8th order Runge-Kutta scheme. Head-to-head comparison on near-Earth, lunar, Mars, and Europa missions showed that Taylor series integration is 15.8 times faster than Runge- Kutta on average, and is more accurate. These speedups were obtained for calculations involving central body, other body, thrust, and drag forces. Similar speedups have been obtained for calculations that include J2 spherical harmonic for central body gravitation. The algorithm includes a step size selection method that directly calculates the step size and never requires a repeat step. High-order Taylor series integration algorithms have been shown to provide major reductions in computer time over conventional integration methods in numerous scientific applications. The objective here was to directly implement Taylor series integration in an existing trajectory analysis code and demonstrate that large reductions in computer time (order of magnitude) could be achieved while simultaneously maintaining high accuracy. This software greatly accelerates the calculation of spacecraft trajectories. At each time level, the spacecraft position, velocity, and mass are expanded in a high-order Taylor series whose coefficients are obtained through efficient differentiation arithmetic. This makes it possible to take very large time steps at minimal cost, resulting in large savings in computer time. The Taylor series algorithm is implemented primarily through three subroutines: (1) a driver routine that automatically introduces auxiliary variables and sets up initial conditions and integrates; (2) a routine that calculates system reduced derivatives using recurrence relations for quotients and products; and (3) a routine that determines the step size and sums the series. The order of accuracy used in a trajectory calculation is arbitrary and can be set by the user. The algorithm directly calculates the motion of other planetary bodies and does not require ephemeris files (except to start the calculation). The code also runs with Taylor series and Runge-Kutta used interchangeably for different phases of a mission.
The Even-Rho and Even-Epsilon Algorithms for Accelerating Convergence of a Numerical Sequence

DTIC Science & Technology

1981-12-01

equal, leading to zero or very small divisors. Computer programs implementing these algorithms are given along with sample output. An appreciable amount...calculation of the array of Shank’s transforms or, -A equivalently, of the related Padd Table. The :other, the even-rho algorithm, is closely related...leading to zero or very small divisors. Computer pro- grams implementing these algorithms are given along with sample output. An appreciable amount or
An exact computational method for performance analysis of sequential test algorithms for detecting network intrusions

NASA Astrophysics Data System (ADS)

Chen, Xinjia; Lacy, Fred; Carriere, Patrick

2015-05-01

Sequential test algorithms are playing increasingly important roles for quick detecting network intrusions such as portscanners. In view of the fact that such algorithms are usually analyzed based on intuitive approximation or asymptotic analysis, we develop an exact computational method for the performance analysis of such algorithms. Our method can be used to calculate the probability of false alarm and average detection time up to arbitrarily pre-specified accuracy.
Calculating Shocks In Flows At Chemical Equilibrium

NASA Technical Reports Server (NTRS)

Eberhardt, Scott; Palmer, Grant

1988-01-01

Boundary conditions prove critical. Conference paper describes algorithm for calculation of shocks in hypersonic flows of gases at chemical equilibrium. Although algorithm represents intermediate stage in development of reliable, accurate computer code for two-dimensional flow, research leading up to it contributes to understanding of what is needed to complete task.
Calculation of the Poisson cumulative distribution function

NASA Technical Reports Server (NTRS)

Bowerman, Paul N.; Nolty, Robert G.; Scheuer, Ernest M.

1990-01-01

A method for calculating the Poisson cdf (cumulative distribution function) is presented. The method avoids computer underflow and overflow during the process. The computer program uses this technique to calculate the Poisson cdf for arbitrary inputs. An algorithm that determines the Poisson parameter required to yield a specified value of the cdf is presented.
Communication: Calculation of interatomic forces and optimization of molecular geometry with auxiliary-field quantum Monte Carlo

NASA Astrophysics Data System (ADS)

Motta, Mario; Zhang, Shiwei

2018-05-01

We propose an algorithm for accurate, systematic, and scalable computation of interatomic forces within the auxiliary-field quantum Monte Carlo (AFQMC) method. The algorithm relies on the Hellmann-Feynman theorem and incorporates Pulay corrections in the presence of atomic orbital basis sets. We benchmark the method for small molecules by comparing the computed forces with the derivatives of the AFQMC potential energy surface and by direct comparison with other quantum chemistry methods. We then perform geometry optimizations using the steepest descent algorithm in larger molecules. With realistic basis sets, we obtain equilibrium geometries in agreement, within statistical error bars, with experimental values. The increase in computational cost for computing forces in this approach is only a small prefactor over that of calculating the total energy. This paves the way for a general and efficient approach for geometry optimization and molecular dynamics within AFQMC.
Experimental determination of Ramsey numbers.

PubMed

Bian, Zhengbing; Chudak, Fabian; Macready, William G; Clark, Lane; Gaitan, Frank

2013-09-27

Ramsey theory is a highly active research area in mathematics that studies the emergence of order in large disordered structures. Ramsey numbers mark the threshold at which order first appears and are extremely difficult to calculate due to their explosive rate of growth. Recently, an algorithm that can be implemented using adiabatic quantum evolution has been proposed that calculates the two-color Ramsey numbers R(m,n). Here we present results of an experimental implementation of this algorithm and show that it correctly determines the Ramsey numbers R(3,3) and R(m,2) for 4≤m≤8. The R(8,2) computation used 84 qubits of which 28 were computational qubits. This computation is the largest experimental implementation of a scientifically meaningful adiabatic evolution algorithm that has been done to date.
Experimental Determination of Ramsey Numbers

NASA Astrophysics Data System (ADS)

Bian, Zhengbing; Chudak, Fabian; Macready, William G.; Clark, Lane; Gaitan, Frank

2013-09-01

Ramsey theory is a highly active research area in mathematics that studies the emergence of order in large disordered structures. Ramsey numbers mark the threshold at which order first appears and are extremely difficult to calculate due to their explosive rate of growth. Recently, an algorithm that can be implemented using adiabatic quantum evolution has been proposed that calculates the two-color Ramsey numbers R(m,n). Here we present results of an experimental implementation of this algorithm and show that it correctly determines the Ramsey numbers R(3,3) and R(m,2) for 4≤m≤8. The R(8,2) computation used 84 qubits of which 28 were computational qubits. This computation is the largest experimental implementation of a scientifically meaningful adiabatic evolution algorithm that has been done to date.
Texture functions in image analysis: A computationally efficient solution

NASA Technical Reports Server (NTRS)

Cox, S. C.; Rose, J. F.

1983-01-01

A computationally efficient means for calculating texture measurements from digital images by use of the co-occurrence technique is presented. The calculation of the statistical descriptors of image texture and a solution that circumvents the need for calculating and storing a co-occurrence matrix are discussed. The results show that existing efficient algorithms for calculating sums, sums of squares, and cross products can be used to compute complex co-occurrence relationships directly from the digital image input.
Zero-block mode decision algorithm for H.264/AVC.

PubMed

Lee, Yu-Ming; Lin, Yinyi

2009-03-01

In the previous paper , we proposed a zero-block intermode decision algorithm for H.264 video coding based upon the number of zero-blocks of 4 x 4 DCT coefficients between the current macroblock and the co-located macroblock. The proposed algorithm can achieve significant improvement in computation, but the computation performance is limited for high bit-rate coding. To improve computation efficiency, in this paper, we suggest an enhanced zero-block decision algorithm, which uses an early zero-block detection method to compute the number of zero-blocks instead of direct DCT and quantization (DCT/Q) calculation and incorporates two adequate decision methods into semi-stationary and nonstationary regions of a video sequence. In addition, the zero-block decision algorithm is also applied to the intramode prediction in the P frame. The enhanced zero-block decision algorithm brings out a reduction of average 27% of total encoding time compared to the zero-block decision algorithm.
Rapid solution of large-scale systems of equations

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.

1994-01-01

The analysis and design of complex aerospace structures requires the rapid solution of large systems of linear and nonlinear equations, eigenvalue extraction for buckling, vibration and flutter modes, structural optimization and design sensitivity calculation. Computers with multiple processors and vector capabilities can offer substantial computational advantages over traditional scalar computer for these analyses. These computers fall into two categories: shared memory computers and distributed memory computers. This presentation covers general-purpose, highly efficient algorithms for generation/assembly or element matrices, solution of systems of linear and nonlinear equations, eigenvalue and design sensitivity analysis and optimization. All algorithms are coded in FORTRAN for shared memory computers and many are adapted to distributed memory computers. The capability and numerical performance of these algorithms will be addressed.
A new algorithm to handle finite nuclear mass effects in electronic calculations: the ISOTOPE program.

PubMed

Gonçalves, Cristina P; Mohallem, José R

2004-11-15

We report the development of a simple algorithm to modify quantum chemistry codes based on the LCAO procedure, to account for the isotope problem in electronic structure calculations. No extra computations are required compared to standard Born-Oppenheimer calculations. An upgrade of the Gamess package called ISOTOPE is presented, and its applicability is demonstrated in some examples.
Performance comparison of attitude determination, attitude estimation, and nonlinear observers algorithms

NASA Astrophysics Data System (ADS)

MOHAMMED, M. A. SI; BOUSSADIA, H.; BELLAR, A.; ADNANE, A.

2017-01-01

This paper presents a brief synthesis and useful performance analysis of different attitude filtering algorithms (attitude determination algorithms, attitude estimation algorithms, and nonlinear observers) applied to Low Earth Orbit Satellite in terms of accuracy, convergence time, amount of memory, and computation time. This latter is calculated in two ways, using a personal computer and also using On-board computer 750 (OBC 750) that is being used in many SSTL Earth observation missions. The use of this comparative study could be an aided design tool to the designer to choose from an attitude determination or attitude estimation or attitude observer algorithms. The simulation results clearly indicate that the nonlinear Observer is the more logical choice.
Automatic mesh adaptivity for hybrid Monte Carlo/deterministic neutronics modeling of difficult shielding problems

DOE PAGES

Ibrahim, Ahmad M.; Wilson, Paul P.H.; Sawan, Mohamed E.; ...

2015-06-30

The CADIS and FW-CADIS hybrid Monte Carlo/deterministic techniques dramatically increase the efficiency of neutronics modeling, but their use in the accurate design analysis of very large and geometrically complex nuclear systems has been limited by the large number of processors and memory requirements for their preliminary deterministic calculations and final Monte Carlo calculation. Three mesh adaptivity algorithms were developed to reduce the memory requirements of CADIS and FW-CADIS without sacrificing their efficiency improvement. First, a macromaterial approach enhances the fidelity of the deterministic models without changing the mesh. Second, a deterministic mesh refinement algorithm generates meshes that capture as muchmore » geometric detail as possible without exceeding a specified maximum number of mesh elements. Finally, a weight window coarsening algorithm decouples the weight window mesh and energy bins from the mesh and energy group structure of the deterministic calculations in order to remove the memory constraint of the weight window map from the deterministic mesh resolution. The three algorithms were used to enhance an FW-CADIS calculation of the prompt dose rate throughout the ITER experimental facility. Using these algorithms resulted in a 23.3% increase in the number of mesh tally elements in which the dose rates were calculated in a 10-day Monte Carlo calculation and, additionally, increased the efficiency of the Monte Carlo simulation by a factor of at least 3.4. The three algorithms enabled this difficult calculation to be accurately solved using an FW-CADIS simulation on a regular computer cluster, eliminating the need for a world-class super computer.« less
A Fast Full Tensor Gravity computation algorithm for High Resolution 3D Geologic Interpretations

NASA Astrophysics Data System (ADS)

Jayaram, V.; Crain, K.; Keller, G. R.

2011-12-01

We present an algorithm to rapidly calculate the vertical gravity and full tensor gravity (FTG) values due to a 3-D geologic model. This algorithm can be implemented on single, multi-core CPU and graphical processing units (GPU) architectures. Our technique is based on the line element approximation with a constant density within each grid cell. This type of parameterization is well suited for high-resolution elevation datasets with grid size typically in the range of 1m to 30m. The large high-resolution data grids in our studies employ a pre-filtered mipmap pyramid type representation for the grid data known as the Geometry clipmap. The clipmap was first introduced by Microsoft Research in 2004 to do fly-through terrain visualization. This method caches nested rectangular extents of down-sampled data layers in the pyramid to create view-dependent calculation scheme. Together with the simple grid structure, this allows the gravity to be computed conveniently on-the-fly, or stored in a highly compressed format. Neither of these capabilities has previously been available. Our approach can perform rapid calculations on large topographies including crustal-scale models derived from complex geologic interpretations. For example, we used a 1KM Sphere model consisting of 105000 cells at 10m resolution with 100000 gravity stations. The line element approach took less than 90 seconds to compute the FTG and vertical gravity on an Intel Core i7 CPU at 3.07 GHz utilizing just its single core. Also, unlike traditional gravity computational algorithms, the line-element approach can calculate gravity effects at locations interior or exterior to the model. The only condition that must be met is the observation point cannot be located directly above the line element. Therefore, we perform a location test and then apply appropriate formulation to those data points. We will present and compare the computational performance of the traditional prism method versus the line element approach on different CPU-GPU system configurations. The algorithm calculates the expected gravity at station locations where the observed gravity and FTG data were acquired. This algorithm can be used for all fast forward model calculations of 3D geologic interpretations for data from airborne, space and submarine gravity, and FTG instrumentation.
Computing the Power-Density Spectrum for an Engineering Model

NASA Technical Reports Server (NTRS)

Dunn, H. J.

1982-01-01

Computer program for calculating of power-density spectrum (PDS) from data base generated by Advanced Continuous Simulation Language (ACSL) uses algorithm that employs fast Fourier transform (FFT) to calculate PDS of variable. Accomplished by first estimating autocovariance function of variable and then taking FFT of smoothed autocovariance function to obtain PDS. Fast-Fourier-transform technique conserves computer resources.
Blocked inverted indices for exact clustering of large chemical spaces.

PubMed

Thiel, Philipp; Sach-Peltason, Lisa; Ottmann, Christian; Kohlbacher, Oliver

2014-09-22

The calculation of pairwise compound similarities based on fingerprints is one of the fundamental tasks in chemoinformatics. Methods for efficient calculation of compound similarities are of the utmost importance for various applications like similarity searching or library clustering. With the increasing size of public compound databases, exact clustering of these databases is desirable, but often computationally prohibitively expensive. We present an optimized inverted index algorithm for the calculation of all pairwise similarities on 2D fingerprints of a given data set. In contrast to other algorithms, it neither requires GPU computing nor yields a stochastic approximation of the clustering. The algorithm has been designed to work well with multicore architectures and shows excellent parallel speedup. As an application example of this algorithm, we implemented a deterministic clustering application, which has been designed to decompose virtual libraries comprising tens of millions of compounds in a short time on current hardware. Our results show that our implementation achieves more than 400 million Tanimoto similarity calculations per second on a common desktop CPU. Deterministic clustering of the available chemical space thus can be done on modern multicore machines within a few days.
Euler/Navier-Stokes calculations of transonic flow past fixed- and rotary-wing aircraft configurations

NASA Technical Reports Server (NTRS)

Deese, J. E.; Agarwal, R. K.

1989-01-01

Computational fluid dynamics has an increasingly important role in the design and analysis of aircraft as computer hardware becomes faster and algorithms become more efficient. Progress is being made in two directions: more complex and realistic configurations are being treated and algorithms based on higher approximations to the complete Navier-Stokes equations are being developed. The literature indicates that linear panel methods can model detailed, realistic aircraft geometries in flow regimes where this approximation is valid. As algorithms including higher approximations to the Navier-Stokes equations are developed, computer resource requirements increase rapidly. Generation of suitable grids become more difficult and the number of grid points required to resolve flow features of interest increases. Recently, the development of large vector computers has enabled researchers to attempt more complex geometries with Euler and Navier-Stokes algorithms. The results of calculations for transonic flow about a typical transport and fighter wing-body configuration using thin layer Navier-Stokes equations are described along with flow about helicopter rotor blades using both Euler/Navier-Stokes equations.

Parallelization strategies for continuum-generalized method of moments on the multi-thread systems

NASA Astrophysics Data System (ADS)

Bustamam, A.; Handhika, T.; Ernastuti, Kerami, D.

2017-07-01

Continuum-Generalized Method of Moments (C-GMM) covers the Generalized Method of Moments (GMM) shortfall which is not as efficient as Maximum Likelihood estimator by using the continuum set of moment conditions in a GMM framework. However, this computation would take a very long time since optimizing regularization parameter. Unfortunately, these calculations are processed sequentially whereas in fact all modern computers are now supported by hierarchical memory systems and hyperthreading technology, which allowing for parallel computing. This paper aims to speed up the calculation process of C-GMM by designing a parallel algorithm for C-GMM on the multi-thread systems. First, parallel regions are detected for the original C-GMM algorithm. There are two parallel regions in the original C-GMM algorithm, that are contributed significantly to the reduction of computational time: the outer-loop and the inner-loop. Furthermore, this parallel algorithm will be implemented with standard shared-memory application programming interface, i.e. Open Multi-Processing (OpenMP). The experiment shows that the outer-loop parallelization is the best strategy for any number of observations.
Calculation method of water injection forward modeling and inversion process in oilfield water injection network

NASA Astrophysics Data System (ADS)

Liu, Long; Liu, Wei

2018-04-01

A forward modeling and inversion algorithm is adopted in order to determine the water injection plan in the oilfield water injection network. The main idea of the algorithm is shown as follows: firstly, the oilfield water injection network is inversely calculated. The pumping station demand flow is calculated. Then, forward modeling calculation is carried out for judging whether all water injection wells meet the requirements of injection allocation or not. If all water injection wells meet the requirements of injection allocation, calculation is stopped, otherwise the demand injection allocation flow rate of certain step size is reduced aiming at water injection wells which do not meet requirements, and next iterative operation is started. It is not necessary to list the algorithm into water injection network system algorithm, which can be realized easily. Iterative method is used, which is suitable for computer programming. Experimental result shows that the algorithm is fast and accurate.
Acceleration of High Angular Momentum Electron Repulsion Integrals and Integral Derivatives on Graphics Processing Units.

PubMed

Miao, Yipu; Merz, Kenneth M

2015-04-14

We present an efficient implementation of ab initio self-consistent field (SCF) energy and gradient calculations that run on Compute Unified Device Architecture (CUDA) enabled graphical processing units (GPUs) using recurrence relations. We first discuss the machine-generated code that calculates the electron-repulsion integrals (ERIs) for different ERI types. Next we describe the porting of the SCF gradient calculation to GPUs, which results in an acceleration of the computation of the first-order derivative of the ERIs. However, only s, p, and d ERIs and s and p derivatives could be executed simultaneously on GPUs using the current version of CUDA and generation of NVidia GPUs using a previously described algorithm [Miao and Merz J. Chem. Theory Comput. 2013, 9, 965-976.]. Hence, we developed an algorithm to compute f type ERIs and d type ERI derivatives on GPUs. Our benchmarks shows the performance GPU enable ERI and ERI derivative computation yielded speedups of 10-18 times relative to traditional CPU execution. An accuracy analysis using double-precision calculations demonstrates that the overall accuracy is satisfactory for most applications.
An efficient parallel algorithm for the calculation of canonical MP2 energies.

PubMed

Baker, Jon; Pulay, Peter

2002-09-01

We present the parallel version of a previous serial algorithm for the efficient calculation of canonical MP2 energies (Pulay, P.; Saebo, S.; Wolinski, K. Chem Phys Lett 2001, 344, 543). It is based on the Saebo-Almlöf direct-integral transformation, coupled with an efficient prescreening of the AO integrals. The parallel algorithm avoids synchronization delays by spawning a second set of slaves during the bin-sort prior to the second half-transformation. Results are presented for systems with up to 2000 basis functions. MP2 energies for molecules with 400-500 basis functions can be routinely calculated to microhartree accuracy on a small number of processors (6-8) in a matter of minutes with modern PC-based parallel computers. Copyright 2002 Wiley Periodicals, Inc. J Comput Chem 23: 1150-1156, 2002
Fast algorithm for chirp transforms with zooming-in ability and its applications.

PubMed

Deng, X; Bihari, B; Gan, J; Zhao, F; Chen, R T

2000-04-01

A general fast numerical algorithm for chirp transforms is developed by using two fast Fourier transforms and employing an analytical kernel. This new algorithm unifies the calculations of arbitrary real-order fractional Fourier transforms and Fresnel diffraction. Its computational complexity is better than a fast convolution method using Fourier transforms. Furthermore, one can freely choose the sampling resolutions in both x and u space and zoom in on any portion of the data of interest. Computational results are compared with analytical ones. The errors are essentially limited by the accuracy of the fast Fourier transforms and are higher than the order 10(-12) for most cases. As an example of its application to scalar diffraction, this algorithm can be used to calculate near-field patterns directly behind the aperture, 0 < or = z < d2/lambda. It compensates another algorithm for Fresnel diffraction that is limited to z > d2/lambdaN [J. Opt. Soc. Am. A 15, 2111 (1998)]. Experimental results from waveguide-output microcoupler diffraction are in good agreement with the calculations.
An 'adding' algorithm for the Markov chain formalism for radiation transfer

NASA Technical Reports Server (NTRS)

Esposito, L. W.

1979-01-01

An adding algorithm is presented, that extends the Markov chain method and considers a preceding calculation as a single state of a new Markov chain. This method takes advantage of the description of the radiation transport as a stochastic process. Successive application of this procedure makes calculation possible for any optical depth without increasing the size of the linear system used. It is determined that the time required for the algorithm is comparable to that for a doubling calculation for homogeneous atmospheres. For an inhomogeneous atmosphere the new method is considerably faster than the standard adding routine. It is concluded that the algorithm is efficient, accurate, and suitable for smaller computers in calculating the diffuse intensity scattered by an inhomogeneous planetary atmosphere.
Quantum Monte Carlo with very large multideterminant wavefunctions.

PubMed

Scemama, Anthony; Applencourt, Thomas; Giner, Emmanuel; Caffarel, Michel

2016-07-01

An algorithm to compute efficiently the first two derivatives of (very) large multideterminant wavefunctions for quantum Monte Carlo calculations is presented. The calculation of determinants and their derivatives is performed using the Sherman-Morrison formula for updating the inverse Slater matrix. An improved implementation based on the reduction of the number of column substitutions and on a very efficient implementation of the calculation of the scalar products involved is presented. It is emphasized that multideterminant expansions contain in general a large number of identical spin-specific determinants: for typical configuration interaction-type wavefunctions the number of unique spin-specific determinants Ndetσ ( σ=↑,↓) with a non-negligible weight in the expansion is of order O(Ndet). We show that a careful implementation of the calculation of the Ndet -dependent contributions can make this step negligible enough so that in practice the algorithm scales as the total number of unique spin-specific determinants,  Ndet↑+Ndet↓, over a wide range of total number of determinants (here, Ndet up to about one million), thus greatly reducing the total computational cost. Finally, a new truncation scheme for the multideterminant expansion is proposed so that larger expansions can be considered without increasing the computational time. The algorithm is illustrated with all-electron fixed-node diffusion Monte Carlo calculations of the total energy of the chlorine atom. Calculations using a trial wavefunction including about 750,000 determinants with a computational increase of ∼400 compared to a single-determinant calculation are shown to be feasible. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
Fast algorithm for bilinear transforms in optics

NASA Astrophysics Data System (ADS)

Ostrovsky, Andrey S.; Martinez-Niconoff, Gabriel C.; Ramos Romero, Obdulio; Cortes, Liliana

2000-10-01

The fast algorithm for calculating the bilinear transform in the optical system is proposed. This algorithm is based on the coherent-mode representation of the cross-spectral density function of the illumination. The algorithm is computationally efficient when the illumination is partially coherent. Numerical examples are studied and compared with the theoretical results.
Scalable Domain Decomposed Monte Carlo Particle Transport

NASA Astrophysics Data System (ADS)

O'Brien, Matthew Joseph

In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation. The main algorithms we consider are: • Domain decomposition of constructive solid geometry: enables extremely large calculations in which the background geometry is too large to fit in the memory of a single computational node. • Load Balancing: keeps the workload per processor as even as possible so the calculation runs efficiently. • Global Particle Find: if particles are on the wrong processor, globally resolve their locations to the correct processor based on particle coordinate and background domain. • Visualizing constructive solid geometry, sourcing particles, deciding that particle streaming communication is completed and spatial redecomposition. These algorithms are some of the most important parallel algorithms required for domain decomposed Monte Carlo particle transport. We demonstrate that our previous algorithms were not scalable, prove that our new algorithms are scalable, and run some of the algorithms up to 2 million MPI processes on the Sequoia supercomputer.
Efficient Fourier-based algorithms for time-periodic unsteady problems

NASA Astrophysics Data System (ADS)

Gopinath, Arathi Kamath

2007-12-01

This dissertation work proposes two algorithms for the simulation of time-periodic unsteady problems via the solution of Unsteady Reynolds-Averaged Navier-Stokes (URANS) equations. These algorithms use a Fourier representation in time and hence solve for the periodic state directly without resolving transients (which consume most of the resources in a time-accurate scheme). In contrast to conventional Fourier-based techniques which solve the governing equations in frequency space, the new algorithms perform all the calculations in the time domain, and hence require minimal modifications to an existing solver. The complete space-time solution is obtained by iterating in a fifth pseudo-time dimension. Various time-periodic problems such as helicopter rotors, wind turbines, turbomachinery and flapping-wings can be simulated using the Time Spectral method. The algorithm is first validated using pitching airfoil/wing test cases. The method is further extended to turbomachinery problems, and computational results verified by comparison with a time-accurate calculation. The technique can be very memory intensive for large problems, since the solution is computed (and hence stored) simultaneously at all time levels. Often, the blade counts of a turbomachine are rescaled such that a periodic fraction of the annulus can be solved. This approximation enables the solution to be obtained at a fraction of the cost of a full-scale time-accurate solution. For a viscous computation over a three-dimensional single-stage rescaled compressor, an order of magnitude savings is achieved. The second algorithm, the reduced-order Harmonic Balance method is applicable only to turbomachinery flows, and offers even larger computational savings than the Time Spectral method. It simulates the true geometry of the turbomachine using only one blade passage per blade row as the computational domain. In each blade row of the turbomachine, only the dominant frequencies are resolved, namely, combinations of neighbor's blade passing. An appropriate set of frequencies can be chosen by the analyst/designer based on a trade-off between accuracy and computational resources available. A cost comparison with a time-accurate computation for an Euler calculation on a two-dimensional multi-stage compressor obtained an order of magnitude savings, and a RANS calculation on a three-dimensional single-stage compressor achieved two orders of magnitude savings, with comparable accuracy.
Three-dimensional photoacoustic tomography based on graphics-processing-unit-accelerated finite element method.

PubMed

Peng, Kuan; He, Ling; Zhu, Ziqiang; Tang, Jingtian; Xiao, Jiaying

2013-12-01

Compared with commonly used analytical reconstruction methods, the frequency-domain finite element method (FEM) based approach has proven to be an accurate and flexible algorithm for photoacoustic tomography. However, the FEM-based algorithm is computationally demanding, especially for three-dimensional cases. To enhance the algorithm's efficiency, in this work a parallel computational strategy is implemented in the framework of the FEM-based reconstruction algorithm using a graphic-processing-unit parallel frame named the "compute unified device architecture." A series of simulation experiments is carried out to test the accuracy and accelerating effect of the improved method. The results obtained indicate that the parallel calculation does not change the accuracy of the reconstruction algorithm, while its computational cost is significantly reduced by a factor of 38.9 with a GTX 580 graphics card using the improved method.
Enhanced intelligent water drops algorithm for multi-depot vehicle routing problem

PubMed Central

Akutsah, Francis; Olusanya, Micheal O.; Adewumi, Aderemi O.

2018-01-01

The intelligent water drop algorithm is a swarm-based metaheuristic algorithm, inspired by the characteristics of water drops in the river and the environmental changes resulting from the action of the flowing river. Since its appearance as an alternative stochastic optimization method, the algorithm has found applications in solving a wide range of combinatorial and functional optimization problems. This paper presents an improved intelligent water drop algorithm for solving multi-depot vehicle routing problems. A simulated annealing algorithm was introduced into the proposed algorithm as a local search metaheuristic to prevent the intelligent water drop algorithm from getting trapped into local minima and also improve its solution quality. In addition, some of the potential problematic issues associated with using simulated annealing that include high computational runtime and exponential calculation of the probability of acceptance criteria, are investigated. The exponential calculation of the probability of acceptance criteria for the simulated annealing based techniques is computationally expensive. Therefore, in order to maximize the performance of the intelligent water drop algorithm using simulated annealing, a better way of calculating the probability of acceptance criteria is considered. The performance of the proposed hybrid algorithm is evaluated by using 33 standard test problems, with the results obtained compared with the solutions offered by four well-known techniques from the subject literature. Experimental results and statistical tests show that the new method possesses outstanding performance in terms of solution quality and runtime consumed. In addition, the proposed algorithm is suitable for solving large-scale problems. PMID:29554662
Enhanced intelligent water drops algorithm for multi-depot vehicle routing problem.

PubMed

Ezugwu, Absalom E; Akutsah, Francis; Olusanya, Micheal O; Adewumi, Aderemi O

2018-01-01

The intelligent water drop algorithm is a swarm-based metaheuristic algorithm, inspired by the characteristics of water drops in the river and the environmental changes resulting from the action of the flowing river. Since its appearance as an alternative stochastic optimization method, the algorithm has found applications in solving a wide range of combinatorial and functional optimization problems. This paper presents an improved intelligent water drop algorithm for solving multi-depot vehicle routing problems. A simulated annealing algorithm was introduced into the proposed algorithm as a local search metaheuristic to prevent the intelligent water drop algorithm from getting trapped into local minima and also improve its solution quality. In addition, some of the potential problematic issues associated with using simulated annealing that include high computational runtime and exponential calculation of the probability of acceptance criteria, are investigated. The exponential calculation of the probability of acceptance criteria for the simulated annealing based techniques is computationally expensive. Therefore, in order to maximize the performance of the intelligent water drop algorithm using simulated annealing, a better way of calculating the probability of acceptance criteria is considered. The performance of the proposed hybrid algorithm is evaluated by using 33 standard test problems, with the results obtained compared with the solutions offered by four well-known techniques from the subject literature. Experimental results and statistical tests show that the new method possesses outstanding performance in terms of solution quality and runtime consumed. In addition, the proposed algorithm is suitable for solving large-scale problems.
Using the Metropolis Algorithm to Calculate Thermodynamic Quantities: An Undergraduate Computational Experiment

ERIC Educational Resources Information Center

Beddard, Godfrey S.

2011-01-01

Thermodynamic quantities such as the average energy, heat capacity, and entropy are calculated using a Monte Carlo method based on the Metropolis algorithm. This method is illustrated with reference to the harmonic oscillator but is particularly useful when the partition function cannot be evaluated; an example using a one-dimensional spin system…
Satellite Doppler data processing using a microcomputer

NASA Technical Reports Server (NTRS)

Schmid, P. E.; Lynn, J. J.

1977-01-01

A microcomputer which was developed to compute ground radio beacon position locations using satellite measurements of Doppler frequency shift is described. Both the computational algorithms and the microcomputer hardware incorporating these algorithms were discussed. Results are presented where the microcomputer in conjunction with the NIMBUS-6 random access measurement system provides real time calculation of beacon latitude and longitude.
Integral Images: Efficient Algorithms for Their Computation and Storage in Resource-Constrained Embedded Vision Systems

PubMed Central

Ehsan, Shoaib; Clark, Adrian F.; ur Rehman, Naveed; McDonald-Maier, Klaus D.

2015-01-01

The integral image, an intermediate image representation, has found extensive use in multi-scale local feature detection algorithms, such as Speeded-Up Robust Features (SURF), allowing fast computation of rectangular features at constant speed, independent of filter size. For resource-constrained real-time embedded vision systems, computation and storage of integral image presents several design challenges due to strict timing and hardware limitations. Although calculation of the integral image only consists of simple addition operations, the total number of operations is large owing to the generally large size of image data. Recursive equations allow substantial decrease in the number of operations but require calculation in a serial fashion. This paper presents two new hardware algorithms that are based on the decomposition of these recursive equations, allowing calculation of up to four integral image values in a row-parallel way without significantly increasing the number of operations. An efficient design strategy is also proposed for a parallel integral image computation unit to reduce the size of the required internal memory (nearly 35% for common HD video). Addressing the storage problem of integral image in embedded vision systems, the paper presents two algorithms which allow substantial decrease (at least 44.44%) in the memory requirements. Finally, the paper provides a case study that highlights the utility of the proposed architectures in embedded vision systems. PMID:26184211
Integral Images: Efficient Algorithms for Their Computation and Storage in Resource-Constrained Embedded Vision Systems.

PubMed

Ehsan, Shoaib; Clark, Adrian F; Naveed ur Rehman; McDonald-Maier, Klaus D

2015-07-10

The integral image, an intermediate image representation, has found extensive use in multi-scale local feature detection algorithms, such as Speeded-Up Robust Features (SURF), allowing fast computation of rectangular features at constant speed, independent of filter size. For resource-constrained real-time embedded vision systems, computation and storage of integral image presents several design challenges due to strict timing and hardware limitations. Although calculation of the integral image only consists of simple addition operations, the total number of operations is large owing to the generally large size of image data. Recursive equations allow substantial decrease in the number of operations but require calculation in a serial fashion. This paper presents two new hardware algorithms that are based on the decomposition of these recursive equations, allowing calculation of up to four integral image values in a row-parallel way without significantly increasing the number of operations. An efficient design strategy is also proposed for a parallel integral image computation unit to reduce the size of the required internal memory (nearly 35% for common HD video). Addressing the storage problem of integral image in embedded vision systems, the paper presents two algorithms which allow substantial decrease (at least 44.44%) in the memory requirements. Finally, the paper provides a case study that highlights the utility of the proposed architectures in embedded vision systems.
Quantum computing applied to calculations of molecular energies: CH2 benchmark.

PubMed

Veis, Libor; Pittner, Jiří

2010-11-21

Quantum computers are appealing for their ability to solve some tasks much faster than their classical counterparts. It was shown in [Aspuru-Guzik et al., Science 309, 1704 (2005)] that they, if available, would be able to perform the full configuration interaction (FCI) energy calculations with a polynomial scaling. This is in contrast to conventional computers where FCI scales exponentially. We have developed a code for simulation of quantum computers and implemented our version of the quantum FCI algorithm. We provide a detailed description of this algorithm and the results of the assessment of its performance on the four lowest lying electronic states of CH(2) molecule. This molecule was chosen as a benchmark, since its two lowest lying (1)A(1) states exhibit a multireference character at the equilibrium geometry. It has been shown that with a suitably chosen initial state of the quantum register, one is able to achieve the probability amplification regime of the iterative phase estimation algorithm even in this case.
The application of dynamic programming in production planning

NASA Astrophysics Data System (ADS)

Wu, Run

2017-05-01

Nowadays, with the popularity of the computers, various industries and fields are widely applying computer information technology, which brings about huge demand for a variety of application software. In order to develop software meeting various needs with most economical cost and best quality, programmers must design efficient algorithms. A superior algorithm can not only soul up one thing, but also maximize the benefits and generate the smallest overhead. As one of the common algorithms, dynamic programming algorithms are used to solving problems with some sort of optimal properties. When solving problems with a large amount of sub-problems that needs repetitive calculations, the ordinary sub-recursive method requires to consume exponential time, and dynamic programming algorithm can reduce the time complexity of the algorithm to the polynomial level, according to which we can conclude that dynamic programming algorithm is a very efficient compared to other algorithms reducing the computational complexity and enriching the computational results. In this paper, we expound the concept, basic elements, properties, core, solving steps and difficulties of the dynamic programming algorithm besides, establish the dynamic programming model of the production planning problem.
Lattice QCD Calculations in Nuclear Physics towards the Exascale

NASA Astrophysics Data System (ADS)

Joo, Balint

2017-01-01

The combination of algorithmic advances and new highly parallel computing architectures are enabling lattice QCD calculations to tackle ever more complex problems in nuclear physics. In this talk I will review some computational challenges that are encountered in large scale cold nuclear physics campaigns such as those in hadron spectroscopy calculations. I will discuss progress in addressing these with algorithmic improvements such as multi-grid solvers and software for recent hardware architectures such as GPUs and Intel Xeon Phi, Knights Landing. Finally, I will highlight some current topics for research and development as we head towards the Exascale era This material is funded by the U.S. Department of Energy, Office Of Science, Offices of Nuclear Physics, High Energy Physics and Advanced Scientific Computing Research, as well as the Office of Nuclear Physics under contract DE-AC05-06OR23177.

Rock climbing: A local-global algorithm to compute minimum energy and minimum free energy pathways.

PubMed

Templeton, Clark; Chen, Szu-Hua; Fathizadeh, Arman; Elber, Ron

2017-10-21

The calculation of minimum energy or minimum free energy paths is an important step in the quantitative and qualitative studies of chemical and physical processes. The computations of these coordinates present a significant challenge and have attracted considerable theoretical and computational interest. Here we present a new local-global approach to study reaction coordinates, based on a gradual optimization of an action. Like other global algorithms, it provides a path between known reactants and products, but it uses a local algorithm to extend the current path in small steps. The local-global approach does not require an initial guess to the path, a major challenge for global pathway finders. Finally, it provides an exact answer (the steepest descent path) at the end of the calculations. Numerical examples are provided for the Mueller potential and for a conformational transition in a solvated ring system.
Imaging quality analysis of computer-generated holograms using the point-based method and slice-based method

NASA Astrophysics Data System (ADS)

Zhang, Zhen; Chen, Siqing; Zheng, Huadong; Sun, Tao; Yu, Yingjie; Gao, Hongyue; Asundi, Anand K.

2017-06-01

Computer holography has made a notably progress in recent years. The point-based method and slice-based method are chief calculation algorithms for generating holograms in holographic display. Although both two methods are validated numerically and optically, the differences of the imaging quality of these methods have not been specifically analyzed. In this paper, we analyze the imaging quality of computer-generated phase holograms generated by point-based Fresnel zone plates (PB-FZP), point-based Fresnel diffraction algorithm (PB-FDA) and slice-based Fresnel diffraction algorithm (SB-FDA). The calculation formula and hologram generation with three methods are demonstrated. In order to suppress the speckle noise, sequential phase-only holograms are generated in our work. The results of reconstructed images numerically and experimentally are also exhibited. By comparing the imaging quality, the merits and drawbacks with three methods are analyzed. Conclusions are given by us finally.
Linear Scaling Density Functional Calculations with Gaussian Orbitals

NASA Technical Reports Server (NTRS)

Scuseria, Gustavo E.

1999-01-01

Recent advances in linear scaling algorithms that circumvent the computational bottlenecks of large-scale electronic structure simulations make it possible to carry out density functional calculations with Gaussian orbitals on molecules containing more than 1000 atoms and 15000 basis functions using current workstations and personal computers. This paper discusses the recent theoretical developments that have led to these advances and demonstrates in a series of benchmark calculations the present capabilities of state-of-the-art computational quantum chemistry programs for the prediction of molecular structure and properties.
Thrust stand evaluation of engine performance improvement algorithms in an F-15 airplane

NASA Technical Reports Server (NTRS)

Conners, Timothy R.

1992-01-01

Results are presented from the evaluation of the performance seeking control (PSC) optimization algorithm developed by Smith et al. (1990) for F-15 aircraft, which optimizes the quasi-steady-state performance of an F100 derivative turbofan engine for several modes of operation. The PSC algorithm uses onboard software engine model that calculates thrust, stall margin, and other unmeasured variables for use in the optimization. Comparisons are presented between the load cell measurements, PSC onboard model thrust calculations, and posttest state variable model computations. Actual performance improvements using the PSC algorithm are presented for its various modes. The results of using PSC algorithm are compared with similar test case results using the HIDEC algorithm.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kieselmann, J; Bartzsch, S; Oelfke, U

Purpose: Microbeam Radiation Therapy is a preclinical method in radiation oncology that modulates radiation fields on a micrometre scale. Dose calculation is challenging due to arising dose gradients and therapeutically important dose ranges. Monte Carlo (MC) simulations, often used as gold standard, are computationally expensive and hence too slow for the optimisation of treatment parameters in future clinical applications. On the other hand, conventional kernel based dose calculation leads to inaccurate results close to material interfaces. The purpose of this work is to overcome these inaccuracies while keeping computation times low. Methods: A point kernel superposition algorithm is modified tomore » account for tissue inhomogeneities. Instead of conventional ray tracing approaches, methods from differential geometry are applied and the space around the primary photon interaction is locally warped. The performance of this approach is compared to MC simulations and a simple convolution algorithm (CA) for two different phantoms and photon spectra. Results: While peak doses of all dose calculation methods agreed within less than 4% deviations, the proposed approach surpassed a simple convolution algorithm in accuracy by a factor of up to 3 in the scatter dose. In a treatment geometry similar to possible future clinical situations differences between Monte Carlo and the differential geometry algorithm were less than 3%. At the same time the calculation time did not exceed 15 minutes. Conclusion: With the developed method it was possible to improve the dose calculation based on the CA method with respect to accuracy especially at sharp tissue boundaries. While the calculation is more extensive than for the CA method and depends on field size, the typical calculation time for a 20×20 mm{sup 2} field on a 3.4 GHz and 8 GByte RAM processor remained below 15 minutes. Parallelisation and optimisation of the algorithm could lead to further significant calculation time reductions.« less
GPU-accelerated computing for Lagrangian coherent structures of multi-body gravitational regimes

NASA Astrophysics Data System (ADS)

Lin, Mingpei; Xu, Ming; Fu, Xiaoyu

2017-04-01

Based on a well-established theoretical foundation, Lagrangian Coherent Structures (LCSs) have elicited widespread research on the intrinsic structures of dynamical systems in many fields, including the field of astrodynamics. Although the application of LCSs in dynamical problems seems straightforward theoretically, its associated computational cost is prohibitive. We propose a block decomposition algorithm developed on Compute Unified Device Architecture (CUDA) platform for the computation of the LCSs of multi-body gravitational regimes. In order to take advantage of GPU's outstanding computing properties, such as Shared Memory, Constant Memory, and Zero-Copy, the algorithm utilizes a block decomposition strategy to facilitate computation of finite-time Lyapunov exponent (FTLE) fields of arbitrary size and timespan. Simulation results demonstrate that this GPU-based algorithm can satisfy double-precision accuracy requirements and greatly decrease the time needed to calculate final results, increasing speed by approximately 13 times. Additionally, this algorithm can be generalized to various large-scale computing problems, such as particle filters, constellation design, and Monte-Carlo simulation.
Design of nucleic acid sequences for DNA computing based on a thermodynamic approach

PubMed Central

Tanaka, Fumiaki; Kameda, Atsushi; Yamamoto, Masahito; Ohuchi, Azuma

2005-01-01

We have developed an algorithm for designing multiple sequences of nucleic acids that have a uniform melting temperature between the sequence and its complement and that do not hybridize non-specifically with each other based on the minimum free energy (ΔGmin). Sequences that satisfy these constraints can be utilized in computations, various engineering applications such as microarrays, and nano-fabrications. Our algorithm is a random generate-and-test algorithm: it generates a candidate sequence randomly and tests whether the sequence satisfies the constraints. The novelty of our algorithm is that the filtering method uses a greedy search to calculate ΔGmin. This effectively excludes inappropriate sequences before ΔGmin is calculated, thereby reducing computation time drastically when compared with an algorithm without the filtering. Experimental results in silico showed the superiority of the greedy search over the traditional approach based on the hamming distance. In addition, experimental results in vitro demonstrated that the experimental free energy (ΔGexp) of 126 sequences correlated well with ΔGmin (|R| = 0.90) than with the hamming distance (|R| = 0.80). These results validate the rationality of a thermodynamic approach. We implemented our algorithm in a graphic user interface-based program written in Java. PMID:15701762
Paradeisos: A perfect hashing algorithm for many-body eigenvalue problems

NASA Astrophysics Data System (ADS)

Jia, C. J.; Wang, Y.; Mendl, C. B.; Moritz, B.; Devereaux, T. P.

2018-03-01

We describe an essentially perfect hashing algorithm for calculating the position of an element in an ordered list, appropriate for the construction and manipulation of many-body Hamiltonian, sparse matrices. Each element of the list corresponds to an integer value whose binary representation reflects the occupation of single-particle basis states for each element in the many-body Hilbert space. The algorithm replaces conventional methods, such as binary search, for locating the elements of the ordered list, eliminating the need to store the integer representation for each element, without increasing the computational complexity. Combined with the "checkerboard" decomposition of the Hamiltonian matrix for distribution over parallel computing environments, this leads to a substantial savings in aggregate memory. While the algorithm can be applied broadly to many-body, correlated problems, we demonstrate its utility in reducing total memory consumption for a series of fermionic single-band Hubbard model calculations on small clusters with progressively larger Hilbert space dimension.
Improved Spectral Calculations for Discrete Schrődinger Operators

NASA Astrophysics Data System (ADS)

Puelz, Charles

This work details an O(n2) algorithm for computing spectra of discrete Schrődinger operators with periodic potentials. Spectra of these objects enhance our understanding of fundamental aperiodic physical systems and contain rich theoretical structure of interest to the mathematical community. Previous work on the Harper model led to an O(n2) algorithm relying on properties not satisfied by other aperiodic operators. Physicists working with the Fibonacci Hamiltonian, a popular quasicrystal model, have instead used a problematic dynamical map approach or a sluggish O(n3) procedure for their calculations. The algorithm presented in this work, a blend of well-established eigenvalue/vector algorithms, provides researchers with a more robust computational tool of general utility. Application to the Fibonacci Hamiltonian in the sparsely studied intermediate coupling regime reveals structure in canonical coverings of the spectrum that will prove useful in motivating conjectures regarding band combinatorics and fractal dimensions.
Quantum Chemistry on Quantum Computers: A Polynomial-Time Quantum Algorithm for Constructing the Wave Functions of Open-Shell Molecules.

PubMed

Sugisaki, Kenji; Yamamoto, Satoru; Nakazawa, Shigeaki; Toyota, Kazuo; Sato, Kazunobu; Shiomi, Daisuke; Takui, Takeji

2016-08-18

Quantum computers are capable to efficiently perform full configuration interaction (FCI) calculations of atoms and molecules by using the quantum phase estimation (QPE) algorithm. Because the success probability of the QPE depends on the overlap between approximate and exact wave functions, efficient methods to prepare accurate initial guess wave functions enough to have sufficiently large overlap with the exact ones are highly desired. Here, we propose a quantum algorithm to construct the wave function consisting of one configuration state function, which is suitable for the initial guess wave function in QPE-based FCI calculations of open-shell molecules, based on the addition theorem of angular momentum. The proposed quantum algorithm enables us to prepare the wave function consisting of an exponential number of Slater determinants only by a polynomial number of quantum operations.
Deadbeat Predictive Controllers

NASA Technical Reports Server (NTRS)

Juang, Jer-Nan; Phan, Minh

1997-01-01

Several new computational algorithms are presented to compute the deadbeat predictive control law. The first algorithm makes use of a multi-step-ahead output prediction to compute the control law without explicitly calculating the controllability matrix. The system identification must be performed first and then the predictive control law is designed. The second algorithm uses the input and output data directly to compute the feedback law. It combines the system identification and the predictive control law into one formulation. The third algorithm uses an observable-canonical form realization to design the predictive controller. The relationship between all three algorithms is established through the use of the state-space representation. All algorithms are applicable to multi-input, multi-output systems with disturbance inputs. In addition to the feedback terms, feed forward terms may also be added for disturbance inputs if they are measurable. Although the feedforward terms do not influence the stability of the closed-loop feedback law, they enhance the performance of the controlled system.
Computationally Efficient Multiconfigurational Reactive Molecular Dynamics

PubMed Central

Yamashita, Takefumi; Peng, Yuxing; Knight, Chris; Voth, Gregory A.

2012-01-01

It is a computationally demanding task to explicitly simulate the electronic degrees of freedom in a system to observe the chemical transformations of interest, while at the same time sampling the time and length scales required to converge statistical properties and thus reduce artifacts due to initial conditions, finite-size effects, and limited sampling. One solution that significantly reduces the computational expense consists of molecular models in which effective interactions between particles govern the dynamics of the system. If the interaction potentials in these models are developed to reproduce calculated properties from electronic structure calculations and/or ab initio molecular dynamics simulations, then one can calculate accurate properties at a fraction of the computational cost. Multiconfigurational algorithms model the system as a linear combination of several chemical bonding topologies to simulate chemical reactions, also sometimes referred to as “multistate”. These algorithms typically utilize energy and force calculations already found in popular molecular dynamics software packages, thus facilitating their implementation without significant changes to the structure of the code. However, the evaluation of energies and forces for several bonding topologies per simulation step can lead to poor computational efficiency if redundancy is not efficiently removed, particularly with respect to the calculation of long-ranged Coulombic interactions. This paper presents accurate approximations (effective long-range interaction and resulting hybrid methods) and multiple-program parallelization strategies for the efficient calculation of electrostatic interactions in reactive molecular simulations. PMID:25100924
Heuristic algorithms for the minmax regret flow-shop problem with interval processing times.

PubMed

Ćwik, Michał; Józefczyk, Jerzy

2018-01-01

An uncertain version of the permutation flow-shop with unlimited buffers and the makespan as a criterion is considered. The investigated parametric uncertainty is represented by given interval-valued processing times. The maximum regret is used for the evaluation of uncertainty. Consequently, the minmax regret discrete optimization problem is solved. Due to its high complexity, two relaxations are applied to simplify the optimization procedure. First of all, a greedy procedure is used for calculating the criterion's value, as such calculation is NP-hard problem itself. Moreover, the lower bound is used instead of solving the internal deterministic flow-shop. The constructive heuristic algorithm is applied for the relaxed optimization problem. The algorithm is compared with previously elaborated other heuristic algorithms basing on the evolutionary and the middle interval approaches. The conducted computational experiments showed the advantage of the constructive heuristic algorithm with regards to both the criterion and the time of computations. The Wilcoxon paired-rank statistical test confirmed this conclusion.
Electromagnetic scattering calculations on the Intel Touchstone Delta

NASA Technical Reports Server (NTRS)

Cwik, Tom; Patterson, Jean; Scott, David

1992-01-01

During the first year's operation of the Intel Touchstone Delta system, software which solves the electric field integral equations for fields scattered from arbitrarily shaped objects has been transferred to the Delta. To fully realize the Delta's resources, an out-of-core dense matrix solution algorithm that utilizes some or all of the 90 Gbyte of concurrent file system (CFS) has been used. The largest calculation completed to date computes the fields scattered from a perfectly conducting sphere modeled by 48,672 unknown functions, resulting in a complex valued dense matrix needing 37.9 Gbyte of storage. The out-of-core LU matrix factorization algorithm was executed in 8.25 h at a rate of 10.35 Gflops. Total time to complete the calculation was 19.7 h-the additional time was used to compute the 48,672 x 48,672 matrix entries, solve the system for a given excitation, and compute observable quantities. The calculation was performed in 64-b precision.
The use of a computerized algorithm to determine single cardiac cell volumes.

PubMed

Marino, T A; Cook, L; Cook, P N; Dwyer, S J

1981-04-01

Single cardiac muscles cell volume data have been difficult to obtain, especially because the shape of a cell is quite complex. With the aid of a surface reconstruction method, a cell volume estimation algorithm has been developed that can be used on serial of cells. The cell surface is reconstructed by means of triangular tiles so that the cell is represented as a polyhedron. When this algorithm was tested on computer generated surfaces of a known volume, the difference was less than 1.6%. Serial sections of two phantoms of a known volume were also reconstructed and a comparison of the mathematically derived volumes and the computed volume estimations gave a per cent difference of between 2.8% and 4.1%. Finally cell volumes derived using conventional methods and volumes calculated using the algorithm were compared. The mean atrial muscle cell volume derived using conventional methods was 7752.7 +/- 644.7 micrometers3, while the mean computerized algorithm estimated atrial muscle cell volume was 7110.6 +/- 625.5 micrometers3. For AV bundle cells the mean cell volume obtained by conventional methods was 484.4 +/- 88.8 micrometers3 and the volume derived from the computer algorithm was 506.0 +/- 78.5 micrometers3. The differences between the volumes calculated using conventional methods and the algorithm were not significantly different.
Automatically-computed prehospital severity scores are equivalent to scores based on medic documentation.

PubMed

Reisner, Andrew T; Chen, Liangyou; McKenna, Thomas M; Reifman, Jaques

2008-10-01

Prehospital severity scores can be used in routine prehospital care, mass casualty care, and military triage. If computers could reliably calculate clinical scores, new clinical and research methodologies would be possible. One obstacle is that vital signs measured automatically can be unreliable. We hypothesized that Signal Quality Indices (SQI's), computer algorithms that differentiate between reliable and unreliable monitored physiologic data, could improve the predictive power of computer-calculated scores. In a retrospective analysis of trauma casualties transported by air ambulance, we computed the Triage Revised Trauma Score (RTS) from archived travel monitor data. We compared the areas-under-the-curve (AUC's) of receiver operating characteristic curves for prediction of mortality and red blood cell transfusion for 187 subjects with comparable quantities of good-quality and poor-quality data. Vital signs deemed reliable by SQI's led to significantly more discriminatory severity scores than vital signs deemed unreliable. We also compared automatically-computed RTS (using the SQI's) versus RTS computed from vital signs documented by medics. For the subjects in whom the SQI algorithms identified 15 consecutive seconds of reliable vital signs data (n = 350), the automatically-computed scores' AUC's were the same as the medic-based scores' AUC's. Using the Prehospital Index in place of RTS led to very similar results, corroborating our findings. SQI algorithms improve automatically-computed severity scores, and automatically-computed scores using SQI's are equivalent to medic-based scores.
Divide-and-conquer density functional theory on hierarchical real-space grids: Parallel implementation and applications

NASA Astrophysics Data System (ADS)

Shimojo, Fuyuki; Kalia, Rajiv K.; Nakano, Aiichiro; Vashishta, Priya

2008-02-01

A linear-scaling algorithm based on a divide-and-conquer (DC) scheme has been designed to perform large-scale molecular-dynamics (MD) simulations, in which interatomic forces are computed quantum mechanically in the framework of the density functional theory (DFT). Electronic wave functions are represented on a real-space grid, which is augmented with a coarse multigrid to accelerate the convergence of iterative solutions and with adaptive fine grids around atoms to accurately calculate ionic pseudopotentials. Spatial decomposition is employed to implement the hierarchical-grid DC-DFT algorithm on massively parallel computers. The largest benchmark tests include 11.8×106 -atom ( 1.04×1012 electronic degrees of freedom) calculation on 131 072 IBM BlueGene/L processors. The DC-DFT algorithm has well-defined parameters to control the data locality, with which the solutions converge rapidly. Also, the total energy is well conserved during the MD simulation. We perform first-principles MD simulations based on the DC-DFT algorithm, in which large system sizes bring in excellent agreement with x-ray scattering measurements for the pair-distribution function of liquid Rb and allow the description of low-frequency vibrational modes of graphene. The band gap of a CdSe nanorod calculated by the DC-DFT algorithm agrees well with the available conventional DFT results. With the DC-DFT algorithm, the band gap is calculated for larger system sizes until the result reaches the asymptotic value.
A comparison between anisotropic analytical and multigrid superposition dose calculation algorithms in radiotherapy treatment planning

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Vincent W.C., E-mail: htvinwu@polyu.edu.hk; Tse, Teddy K.H.; Ho, Cola L.M.

2013-07-01

Monte Carlo (MC) simulation is currently the most accurate dose calculation algorithm in radiotherapy planning but requires relatively long processing time. Faster model-based algorithms such as the anisotropic analytical algorithm (AAA) by the Eclipse treatment planning system and multigrid superposition (MGS) by the XiO treatment planning system are 2 commonly used algorithms. This study compared AAA and MGS against MC, as the gold standard, on brain, nasopharynx, lung, and prostate cancer patients. Computed tomography of 6 patients of each cancer type was used. The same hypothetical treatment plan using the same machine and treatment prescription was computed for each casemore » by each planning system using their respective dose calculation algorithm. The doses at reference points including (1) soft tissues only, (2) bones only, (3) air cavities only, (4) soft tissue-bone boundary (Soft/Bone), (5) soft tissue-air boundary (Soft/Air), and (6) bone-air boundary (Bone/Air), were measured and compared using the mean absolute percentage error (MAPE), which was a function of the percentage dose deviations from MC. Besides, the computation time of each treatment plan was recorded and compared. The MAPEs of MGS were significantly lower than AAA in all types of cancers (p<0.001). With regards to body density combinations, the MAPE of AAA ranged from 1.8% (soft tissue) to 4.9% (Bone/Air), whereas that of MGS from 1.6% (air cavities) to 2.9% (Soft/Bone). The MAPEs of MGS (2.6%±2.1) were significantly lower than that of AAA (3.7%±2.5) in all tissue density combinations (p<0.001). The mean computation time of AAA for all treatment plans was significantly lower than that of the MGS (p<0.001). Both AAA and MGS algorithms demonstrated dose deviations of less than 4.0% in most clinical cases and their performance was better in homogeneous tissues than at tissue boundaries. In general, MGS demonstrated relatively smaller dose deviations than AAA but required longer computation time.« less
Algorithm 971: An Implementation of a Randomized Algorithm for Principal Component Analysis

PubMed Central

LI, HUAMIN; LINDERMAN, GEORGE C.; SZLAM, ARTHUR; STANTON, KELLY P.; KLUGER, YUVAL; TYGERT, MARK

2017-01-01

Recent years have witnessed intense development of randomized methods for low-rank approximation. These methods target principal component analysis and the calculation of truncated singular value decompositions. The present article presents an essentially black-box, foolproof implementation for Mathworks’ MATLAB, a popular software platform for numerical computation. As illustrated via several tests, the randomized algorithms for low-rank approximation outperform or at least match the classical deterministic techniques (such as Lanczos iterations run to convergence) in basically all respects: accuracy, computational efficiency (both speed and memory usage), ease-of-use, parallelizability, and reliability. However, the classical procedures remain the methods of choice for estimating spectral norms and are far superior for calculating the least singular values and corresponding singular vectors (or singular subspaces). PMID:28983138
A conservative finite difference algorithm for the unsteady transonic potential equation in generalized coordinates

NASA Technical Reports Server (NTRS)

Bridgeman, J. O.; Steger, J. L.; Caradonna, F. X.

1982-01-01

An implicit, approximate-factorization, finite-difference algorithm has been developed for the computation of unsteady, inviscid transonic flows in two and three dimensions. The computer program solves the full-potential equation in generalized coordinates in conservation-law form in order to properly capture shock-wave position and speed. A body-fitted coordinate system is employed for the simple and accurate treatment of boundary conditions on the body surface. The time-accurate algorithm is modified to a conventional ADI relaxation scheme for steady-state computations. Results from two- and three-dimensional steady and two-dimensional unsteady calculations are compared with existing methods.

A parallel algorithm for Hamiltonian matrix construction in electron-molecule collision calculations: MPI-SCATCI

NASA Astrophysics Data System (ADS)

Al-Refaie, Ahmed F.; Tennyson, Jonathan

2017-12-01

Construction and diagonalization of the Hamiltonian matrix is the rate-limiting step in most low-energy electron - molecule collision calculations. Tennyson (1996) implemented a novel algorithm for Hamiltonian construction which took advantage of the structure of the wavefunction in such calculations. This algorithm is re-engineered to make use of modern computer architectures and the use of appropriate diagonalizers is considered. Test calculations demonstrate that significant speed-ups can be gained using multiple CPUs. This opens the way to calculations which consider higher collision energies, larger molecules and / or more target states. The methodology, which is implemented as part of the UK molecular R-matrix codes (UKRMol and UKRMol+) can also be used for studies of bound molecular Rydberg states, photoionization and positron-molecule collisions.
Research in Computational Astrobiology

NASA Technical Reports Server (NTRS)

Chaban, Galina; Colombano, Silvano; Scargle, Jeff; New, Michael H.; Pohorille, Andrew; Wilson, Michael A.

2003-01-01

We report on several projects in the field of computational astrobiology, which is devoted to advancing our understanding of the origin, evolution and distribution of life in the Universe using theoretical and computational tools. Research projects included modifying existing computer simulation codes to use efficient, multiple time step algorithms, statistical methods for analysis of astrophysical data via optimal partitioning methods, electronic structure calculations on water-nuclei acid complexes, incorporation of structural information into genomic sequence analysis methods and calculations of shock-induced formation of polycylic aromatic hydrocarbon compounds.
Improvement and speed optimization of numerical tsunami modelling program using OpenMP technology

NASA Astrophysics Data System (ADS)

Chernov, A.; Zaytsev, A.; Yalciner, A.; Kurkin, A.

2009-04-01

Currently, the basic problem of tsunami modeling is low speed of calculations which is unacceptable for services of the operative notification. Existing algorithms of numerical modeling of hydrodynamic processes of tsunami waves are developed without taking the opportunities of modern computer facilities. There is an opportunity to have considerable acceleration of process of calculations by using parallel algorithms. We discuss here new approach to parallelization tsunami modeling code using OpenMP Technology (for multiprocessing systems with the general memory). Nowadays, multiprocessing systems are easily accessible for everyone. The cost of the use of such systems becomes much lower comparing to the costs of clusters. This opportunity also benefits all programmers to apply multithreading algorithms on desktop computers of researchers. Other important advantage of the given approach is the mechanism of the general memory - there is no necessity to send data on slow networks (for example Ethernet). All memory is the common for all computing processes; it causes almost linear scalability of the program and processes. In the new version of NAMI DANCE using OpenMP technology and multi-threading algorithm provide 80% gain in speed in comparison with the one-thread version for dual-processor unit. The speed increased and 320% gain was attained for four core processor unit of PCs. Thus, it was possible to reduce considerably time of performance of calculations on the scientific workstations (desktops) without complete change of the program and user interfaces. The further modernization of algorithms of preparation of initial data and processing of results using OpenMP looks reasonable. The final version of NAMI DANCE with the increased computational speed can be used not only for research purposes but also in real time Tsunami Warning Systems.
Parametric inference for biological sequence analysis.

PubMed

Pachter, Lior; Sturmfels, Bernd

2004-11-16

One of the major successes in computational biology has been the unification, by using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied to these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems that are associated with different statistical models. This article introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.
Ramsey numbers and adiabatic quantum computing.

PubMed

Gaitan, Frank; Clark, Lane

2012-01-06

The graph-theoretic Ramsey numbers are notoriously difficult to calculate. In fact, for the two-color Ramsey numbers R(m,n) with m, n≥3, only nine are currently known. We present a quantum algorithm for the computation of the Ramsey numbers R(m,n). We show how the computation of R(m,n) can be mapped to a combinatorial optimization problem whose solution can be found using adiabatic quantum evolution. We numerically simulate this adiabatic quantum algorithm and show that it correctly determines the Ramsey numbers R(3,3) and R(2,s) for 5≤s≤7. We then discuss the algorithm's experimental implementation, and close by showing that Ramsey number computation belongs to the quantum complexity class quantum Merlin Arthur.
Interactive visualization of Earth and Space Science computations

NASA Technical Reports Server (NTRS)

Hibbard, William L.; Paul, Brian E.; Santek, David A.; Dyer, Charles R.; Battaiola, Andre L.; Voidrot-Martinez, Marie-Francoise

1994-01-01

Computers have become essential tools for scientists simulating and observing nature. Simulations are formulated as mathematical models but are implemented as computer algorithms to simulate complex events. Observations are also analyzed and understood in terms of mathematical models, but the number of these observations usually dictates that we automate analyses with computer algorithms. In spite of their essential role, computers are also barriers to scientific understanding. Unlike hand calculations, automated computations are invisible and, because of the enormous numbers of individual operations in automated computations, the relation between an algorithm's input and output is often not intuitive. This problem is illustrated by the behavior of meteorologists responsible for forecasting weather. Even in this age of computers, many meteorologists manually plot weather observations on maps, then draw isolines of temperature, pressure, and other fields by hand (special pads of maps are printed for just this purpose). Similarly, radiologists use computers to collect medical data but are notoriously reluctant to apply image-processing algorithms to that data. To these scientists with life-and-death responsibilities, computer algorithms are black boxes that increase rather than reduce risk. The barrier between scientists and their computations can be bridged by techniques that make the internal workings of algorithms visible and that allow scientists to experiment with their computations. Here we describe two interactive systems developed at the University of Wisconsin-Madison Space Science and Engineering Center (SSEC) that provide these capabilities to Earth and space scientists.
Special-purpose computer for holography HORN-4 with recurrence algorithm

NASA Astrophysics Data System (ADS)

Shimobaba, Tomoyoshi; Hishinuma, Sinsuke; Ito, Tomoyoshi

2002-10-01

We designed and built a special-purpose computer for holography, HORN-4 (HOlographic ReconstructioN) using PLD (Programmable Logic Device) technology. HORN computers have a pipeline architecture. We use HORN-4 as an attached processor to enhance the performance of a general-purpose computer when it is used to generate holograms using a "recurrence formulas" algorithm developed by our previous paper. In the HORN-4 system, we designed the pipeline by adopting our "recurrence formulas" algorithm which can calculate the phase on a hologram. As the result, we could integrate the pipeline composed of 21 units into one PLD chip. The units in the pipeline consists of one BPU (Basic Phase Unit) unit and twenty CU (Cascade Unit) units. These CU units can compute twenty light intensities on a hologram plane at one time. By mounting two of the PLD chips on a PCI (Peripheral Component Interconnect) universal board, HORN-4 can calculate holograms at high speed of about 42 Gflops equivalent. The cost of HORN-4 board is about 1700 US dollar. We could obtain 800×600 grids hologram from a 3D-image composed of 415 points in about 0.45 sec with the HORN-4 system.
A fast, parallel algorithm for distant-dependent calculation of crystal properties

NASA Astrophysics Data System (ADS)

Stein, Matthew

2017-12-01

A fast, parallel algorithm for distant-dependent calculation and simulation of crystal properties is presented along with speedup results and methods of application. An illustrative example is used to compute the Lennard-Jones lattice constants up to 32 significant figures for 4 ≤ p ≤ 30 in the simple cubic, face-centered cubic, body-centered cubic, hexagonal-close-pack, and diamond lattices. In most cases, the known precision of these constants is more than doubled, and in some cases, corrected from previously published figures. The tools and strategies to make this computation possible are detailed along with application to other potentials, including those that model defects.
A Fast Algorithm for Lattice Hyperonic Potentials

NASA Astrophysics Data System (ADS)

Nemura, Hidekatsu; Aoki, Sinya; Doi, Takumi; Gongyo, Shinya; Hatsuda, Tetsuo; Ikeda, Yoichi; Inoue, Takashi; Iritani, Takumi; Ishii, Noriyoshi; Miyamoto, Takaya; Murano, Keiko; Sasaki, Kenji

We describe an efficient algorithm to compute a large number of baryon-baryon interactions from NN to ΞΞ by means of HAL QCD method, which lays the groundwork for the nearly physical point lattice QCD calculation with volume (96a)4 ≈ (8.2 fm)4. Preliminary results of ΛN potential calculated with quark masses corresponding to (mπ, mK) ≈ (146,525) MeV are presented.
Parallel computation safety analysis irradiation targets fission product molybdenum in neutronic aspect using the successive over-relaxation algorithm

NASA Astrophysics Data System (ADS)

Susmikanti, Mike; Dewayatna, Winter; Sulistyo, Yos

2014-09-01

One of the research activities in support of commercial radioisotope production program is a safety research on target FPM (Fission Product Molybdenum) irradiation. FPM targets form a tube made of stainless steel which contains nuclear-grade high-enrichment uranium. The FPM irradiation tube is intended to obtain fission products. Fission materials such as Mo99 used widely the form of kits in the medical world. The neutronics problem is solved using first-order perturbation theory derived from the diffusion equation for four groups. In contrast, Mo isotopes have longer half-lives, about 3 days (66 hours), so the delivery of radioisotopes to consumer centers and storage is possible though still limited. The production of this isotope potentially gives significant economic value. The criticality and flux in multigroup diffusion model was calculated for various irradiation positions and uranium contents. This model involves complex computation, with large and sparse matrix system. Several parallel algorithms have been developed for the sparse and large matrix solution. In this paper, a successive over-relaxation (SOR) algorithm was implemented for the calculation of reactivity coefficients which can be done in parallel. Previous works performed reactivity calculations serially with Gauss-Seidel iteratives. The parallel method can be used to solve multigroup diffusion equation system and calculate the criticality and reactivity coefficients. In this research a computer code was developed to exploit parallel processing to perform reactivity calculations which were to be used in safety analysis. The parallel processing in the multicore computer system allows the calculation to be performed more quickly. This code was applied for the safety limits calculation of irradiated FPM targets containing highly enriched uranium. The results of calculations neutron show that for uranium contents of 1.7676 g and 6.1866 g (× 106 cm-1) in a tube, their delta reactivities are the still within safety limits; however, for 7.9542 g and 8.838 g (× 106 cm-1) the limits were exceeded.
Artificial Bee Colony Optimization of Capping Potentials for Hybrid Quantum Mechanical/Molecular Mechanical Calculations.

PubMed

Schiffmann, Christoph; Sebastiani, Daniel

2011-05-10

We present an algorithmic extension of a numerical optimization scheme for analytic capping potentials for use in mixed quantum-classical (quantum mechanical/molecular mechanical, QM/MM) ab initio calculations. Our goal is to minimize bond-cleavage-induced perturbations in the electronic structure, measured by means of a suitable penalty functional. The optimization algorithm-a variant of the artificial bee colony (ABC) algorithm, which relies on swarm intelligence-couples deterministic (downhill gradient) and stochastic elements to avoid local minimum trapping. The ABC algorithm outperforms the conventional downhill gradient approach, if the penalty hypersurface exhibits wiggles that prevent a straight minimization pathway. We characterize the optimized capping potentials by computing NMR chemical shifts. This approach will increase the accuracy of QM/MM calculations of complex biomolecules.
Development of hardware accelerator for molecular dynamics simulations: a computation board that calculates nonbonded interactions in cooperation with fast multipole method.

PubMed

Amisaki, Takashi; Toyoda, Shinjiro; Miyagawa, Hiroh; Kitamura, Kunihiro

2003-04-15

Evaluation of long-range Coulombic interactions still represents a bottleneck in the molecular dynamics (MD) simulations of biological macromolecules. Despite the advent of sophisticated fast algorithms, such as the fast multipole method (FMM), accurate simulations still demand a great amount of computation time due to the accuracy/speed trade-off inherently involved in these algorithms. Unless higher order multipole expansions, which are extremely expensive to evaluate, are employed, a large amount of the execution time is still spent in directly calculating particle-particle interactions within the nearby region of each particle. To reduce this execution time for pair interactions, we developed a computation unit (board), called MD-Engine II, that calculates nonbonded pairwise interactions using a specially designed hardware. Four custom arithmetic-processors and a processor for memory manipulation ("particle processor") are mounted on the computation board. The arithmetic processors are responsible for calculation of the pair interactions. The particle processor plays a central role in realizing efficient cooperation with the FMM. The results of a series of 50-ps MD simulations of a protein-water system (50,764 atoms) indicated that a more stringent setting of accuracy in FMM computation, compared with those previously reported, was required for accurate simulations over long time periods. Such a level of accuracy was efficiently achieved using the cooperative calculations of the FMM and MD-Engine II. On an Alpha 21264 PC, the FMM computation at a moderate but tolerable level of accuracy was accelerated by a factor of 16.0 using three boards. At a high level of accuracy, the cooperative calculation achieved a 22.7-fold acceleration over the corresponding conventional FMM calculation. In the cooperative calculations of the FMM and MD-Engine II, it was possible to achieve more accurate computation at a comparable execution time by incorporating larger nearby regions. Copyright 2003 Wiley Periodicals, Inc. J Comput Chem 24: 582-592, 2003
Generalization techniques to reduce the number of volume elements for terrain effect calculations in fully analytical gravitational modelling

NASA Astrophysics Data System (ADS)

Benedek, Judit; Papp, Gábor; Kalmár, János

2018-04-01

Beyond rectangular prism polyhedron, as a discrete volume element, can also be used to model the density distribution inside 3D geological structures. The calculation of the closed formulae given for the gravitational potential and its higher-order derivatives, however, needs twice more runtime than that of the rectangular prism computations. Although the more detailed the better principle is generally accepted it is basically true only for errorless data. As soon as errors are present any forward gravitational calculation from the model is only a possible realization of the true force field on the significance level determined by the errors. So if one really considers the reliability of input data used in the calculations then sometimes the "less" can be equivalent to the "more" in statistical sense. As a consequence the processing time of the related complex formulae can be significantly reduced by the optimization of the number of volume elements based on the accuracy estimates of the input data. New algorithms are proposed to minimize the number of model elements defined both in local and in global coordinate systems. Common gravity field modelling programs generate optimized models for every computation points ( dynamic approach), whereas the static approach provides only one optimized model for all. Based on the static approach two different algorithms were developed. The grid-based algorithm starts with the maximum resolution polyhedral model defined by 3-3 points of each grid cell and generates a new polyhedral surface defined by points selected from the grid. The other algorithm is more general; it works also for irregularly distributed data (scattered points) connected by triangulation. Beyond the description of the optimization schemes some applications of these algorithms in regional and local gravity field modelling are presented too. The efficiency of the static approaches may provide even more than 90% reduction in computation time in favourable situation without the loss of reliability of the calculated gravity field parameters.
Design of an optimum computer vision-based automatic abalone (Haliotis discus hannai) grading algorithm.

PubMed

Lee, Donggil; Lee, Kyounghoon; Kim, Seonghun; Yang, Yongsu

2015-04-01

An automatic abalone grading algorithm that estimates abalone weights on the basis of computer vision using 2D images is developed and tested. The algorithm overcomes the problems experienced by conventional abalone grading methods that utilize manual sorting and mechanical automatic grading. To design an optimal algorithm, a regression formula and R(2) value were investigated by performing a regression analysis for each of total length, body width, thickness, view area, and actual volume against abalone weights. The R(2) value between the actual volume and abalone weight was 0.999, showing a relatively high correlation. As a result, to easily estimate the actual volumes of abalones based on computer vision, the volumes were calculated under the assumption that abalone shapes are half-oblate ellipsoids, and a regression formula was derived to estimate the volumes of abalones through linear regression analysis between the calculated and actual volumes. The final automatic abalone grading algorithm is designed using the abalone volume estimation regression formula derived from test results, and the actual volumes and abalone weights regression formula. In the range of abalones weighting from 16.51 to 128.01 g, the results of evaluation of the performance of algorithm via cross-validation indicate root mean square and worst-case prediction errors of are 2.8 and ±8 g, respectively. © 2015 Institute of Food Technologists®
Dynamic programming algorithms for biological sequence comparison.

PubMed

Pearson, W R; Miller, W

1992-01-01

Efficient dynamic programming algorithms are available for a broad class of protein and DNA sequence comparison problems. These algorithms require computer time proportional to the product of the lengths of the two sequences being compared [O(N2)] but require memory space proportional only to the sum of these lengths [O(N)]. Although the requirement for O(N2) time limits use of the algorithms to the largest computers when searching protein and DNA sequence databases, many other applications of these algorithms, such as calculation of distances for evolutionary trees and comparison of a new sequence to a library of sequence profiles, are well within the capabilities of desktop computers. In particular, the results of library searches with rapid searching programs, such as FASTA or BLAST, should be confirmed by performing a rigorous optimal alignment. Whereas rapid methods do not overlook significant sequence similarities, FASTA limits the number of gaps that can be inserted into an alignment, so that a rigorous alignment may extend the alignment substantially in some cases. BLAST does not allow gaps in the local regions that it reports; a calculation that allows gaps is very likely to extend the alignment substantially. Although a Monte Carlo evaluation of the statistical significance of a similarity score with a rigorous algorithm is much slower than the heuristic approach used by the RDF2 program, the dynamic programming approach should take less than 1 hr on a 386-based PC or desktop Unix workstation. For descriptive purposes, we have limited our discussion to methods for calculating similarity scores and distances that use gap penalties of the form g = rk. Nevertheless, programs for the more general case (g = q+rk) are readily available. Versions of these programs that run either on Unix workstations, IBM-PC class computers, or the Macintosh can be obtained from either of the authors.
Simulation of Nuclear Reactor Kinetics by the Monte Carlo Method

NASA Astrophysics Data System (ADS)

Gomin, E. A.; Davidenko, V. D.; Zinchenko, A. S.; Kharchenko, I. K.

2017-12-01

The KIR computer code intended for calculations of nuclear reactor kinetics using the Monte Carlo method is described. The algorithm implemented in the code is described in detail. Some results of test calculations are given.
Analysis of a new phase and height algorithm in phase measurement profilometry

NASA Astrophysics Data System (ADS)

Bian, Xintian; Zuo, Fen; Cheng, Ju

2018-04-01

Traditional phase measurement profilometry adopts divergent illumination to obtain the height distribution of a measured object accurately. However, the mapping relation between reference plane coordinates and phase distribution must be calculated before measurement. Data are then stored in a computer in the form of a data sheet for standby applications. This study improved the distribution of projected fringes and deducted the phase-height mapping algorithm when the two pupils of the projection and imaging systems are of unequal heights and when the projection and imaging axes are on different planes. With the algorithm, calculating the mapping relation between reference plane coordinates and phase distribution prior to measurement is unnecessary. Thus, the measurement process is simplified, and the construction of an experimental system is made easy. Computer simulation and experimental results confirm the effectiveness of the method.
Volumetric visualization algorithm development for an FPGA-based custom computing machine

NASA Astrophysics Data System (ADS)

Sallinen, Sami J.; Alakuijala, Jyrki; Helminen, Hannu; Laitinen, Joakim

1998-05-01

Rendering volumetric medical images is a burdensome computational task for contemporary computers due to the large size of the data sets. Custom designed reconfigurable hardware could considerably speed up volume visualization if an algorithm suitable for the platform is used. We present an algorithm and speedup techniques for visualizing volumetric medical CT and MR images with a custom-computing machine based on a Field Programmable Gate Array (FPGA). We also present simulated performance results of the proposed algorithm calculated with a software implementation running on a desktop PC. Our algorithm is capable of generating perspective projection renderings of single and multiple isosurfaces with transparency, simulated X-ray images, and Maximum Intensity Projections (MIP). Although more speedup techniques exist for parallel projection than for perspective projection, we have constrained ourselves to perspective viewing, because of its importance in the field of radiotherapy. The algorithm we have developed is based on ray casting, and the rendering is sped up by three different methods: shading speedup by gradient precalculation, a new generalized version of Ray-Acceleration by Distance Coding (RADC), and background ray elimination by speculative ray selection.
QRS detection based ECG quality assessment.

PubMed

Hayn, Dieter; Jammerbund, Bernhard; Schreier, Günter

2012-09-01

Although immediate feedback concerning ECG signal quality during recording is useful, up to now not much literature describing quality measures is available. We have implemented and evaluated four ECG quality measures. Empty lead criterion (A), spike detection criterion (B) and lead crossing point criterion (C) were calculated from basic signal properties. Measure D quantified the robustness of QRS detection when applied to the signal. An advanced Matlab-based algorithm combining all four measures and a simplified algorithm for Android platforms, excluding measure D, were developed. Both algorithms were evaluated by taking part in the Computing in Cardiology Challenge 2011. Each measure's accuracy and computing time was evaluated separately. During the challenge, the advanced algorithm correctly classified 93.3% of the ECGs in the training-set and 91.6 % in the test-set. Scores for the simplified algorithm were 0.834 in event 2 and 0.873 in event 3. Computing time for measure D was almost five times higher than for other measures. Required accuracy levels depend on the application and are related to computing time. While our simplified algorithm may be accurate for real-time feedback during ECG self-recordings, QRS detection based measures can further increase the performance if sufficient computing power is available.
Application of sensitivity-analysis techniques to the calculation of topological quantities

NASA Astrophysics Data System (ADS)

Gilchrist, Stuart

2017-08-01

Magnetic reconnection in the corona occurs preferentially at sites where the magnetic connectivity is either discontinuous or has a large spatial gradient. Hence there is a general interest in computing quantities (like the squashing factor) that characterize the gradient in the field-line mapping function. Here we present an algorithm for calculating certain (quasi)topological quantities using mathematical techniques from the field of ``sensitivity-analysis''. The method is based on the calculation of a three dimensional field-line mapping Jacobian from which all the present topological quantities of interest can be derived. We will present the algorithm and the details of a publicly available set of libraries that implement the algorithm.

Description of the computations and pilot procedures for planning fuel-conservative descents with a small programmable calculator

NASA Technical Reports Server (NTRS)

Vicroy, D. D.; Knox, C. E.

1983-01-01

A simplified flight management descent algorithm was developed and programmed on a small programmable calculator. It was designed to aid the pilot in planning and executing a fuel conservative descent to arrive at a metering fix at a time designated by the air traffic control system. The algorithm may also be used for planning fuel conservative descents when time is not a consideration. The descent path was calculated for a constant Mach/airspeed schedule from linear approximations of airplane performance with considerations given for gross weight, wind, and nonstandard temperature effects. The flight management descent algorithm and the vertical performance modeling required for the DC-10 airplane is described.
A multiresolution approach to iterative reconstruction algorithms in X-ray computed tomography.

PubMed

De Witte, Yoni; Vlassenbroeck, Jelle; Van Hoorebeke, Luc

2010-09-01

In computed tomography, the application of iterative reconstruction methods in practical situations is impeded by their high computational demands. Especially in high resolution X-ray computed tomography, where reconstruction volumes contain a high number of volume elements (several giga voxels), this computational burden prevents their actual breakthrough. Besides the large amount of calculations, iterative algorithms require the entire volume to be kept in memory during reconstruction, which quickly becomes cumbersome for large data sets. To overcome this obstacle, we present a novel multiresolution reconstruction, which greatly reduces the required amount of memory without significantly affecting the reconstructed image quality. It is shown that, combined with an efficient implementation on a graphical processing unit, the multiresolution approach enables the application of iterative algorithms in the reconstruction of large volumes at an acceptable speed using only limited resources.
High-performance computing on GPUs for resistivity logging of oil and gas wells

NASA Astrophysics Data System (ADS)

Glinskikh, V.; Dudaev, A.; Nechaev, O.; Surodina, I.

2017-10-01

We developed and implemented into software an algorithm for high-performance simulation of electrical logs from oil and gas wells using high-performance heterogeneous computing. The numerical solution of the 2D forward problem is based on the finite-element method and the Cholesky decomposition for solving a system of linear algebraic equations (SLAE). Software implementations of the algorithm used the NVIDIA CUDA technology and computing libraries are made, allowing us to perform decomposition of SLAE and find its solution on central processor unit (CPU) and graphics processor unit (GPU). The calculation time is analyzed depending on the matrix size and number of its non-zero elements. We estimated the computing speed on CPU and GPU, including high-performance heterogeneous CPU-GPU computing. Using the developed algorithm, we simulated resistivity data in realistic models.
Accelerating EPI distortion correction by utilizing a modern GPU-based parallel computation.

PubMed

Yang, Yao-Hao; Huang, Teng-Yi; Wang, Fu-Nien; Chuang, Tzu-Chao; Chen, Nan-Kuei

2013-04-01

The combination of phase demodulation and field mapping is a practical method to correct echo planar imaging (EPI) geometric distortion. However, since phase dispersion accumulates in each phase-encoding step, the calculation complexity of phase modulation is Ny-fold higher than conventional image reconstructions. Thus, correcting EPI images via phase demodulation is generally a time-consuming task. Parallel computing by employing general-purpose calculations on graphics processing units (GPU) can accelerate scientific computing if the algorithm is parallelized. This study proposes a method that incorporates the GPU-based technique into phase demodulation calculations to reduce computation time. The proposed parallel algorithm was applied to a PROPELLER-EPI diffusion tensor data set. The GPU-based phase demodulation method reduced the EPI distortion correctly, and accelerated the computation. The total reconstruction time of the 16-slice PROPELLER-EPI diffusion tensor images with matrix size of 128 × 128 was reduced from 1,754 seconds to 101 seconds by utilizing the parallelized 4-GPU program. GPU computing is a promising method to accelerate EPI geometric correction. The resulting reduction in computation time of phase demodulation should accelerate postprocessing for studies performed with EPI, and should effectuate the PROPELLER-EPI technique for clinical practice. Copyright © 2011 by the American Society of Neuroimaging.
Fast computation algorithms for speckle pattern simulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nascov, Victor; Samoilă, Cornel; Ursuţiu, Doru

2013-11-13

We present our development of a series of efficient computation algorithms, generally usable to calculate light diffraction and particularly for speckle pattern simulation. We use mainly the scalar diffraction theory in the form of Rayleigh-Sommerfeld diffraction formula and its Fresnel approximation. Our algorithms are based on a special form of the convolution theorem and the Fast Fourier Transform. They are able to evaluate the diffraction formula much faster than by direct computation and we have circumvented the restrictions regarding the relative sizes of the input and output domains, met on commonly used procedures. Moreover, the input and output planes canmore » be tilted each to other and the output domain can be off-axis shifted.« less
Continuous development of schemes for parallel computing of the electrostatics in biological systems: implementation in DelPhi.

PubMed

Li, Chuan; Petukh, Marharyta; Li, Lin; Alexov, Emil

2013-08-15

Due to the enormous importance of electrostatics in molecular biology, calculating the electrostatic potential and corresponding energies has become a standard computational approach for the study of biomolecules and nano-objects immersed in water and salt phase or other media. However, the electrostatics of large macromolecules and macromolecular complexes, including nano-objects, may not be obtainable via explicit methods and even the standard continuum electrostatics methods may not be applicable due to high computational time and memory requirements. Here, we report further development of the parallelization scheme reported in our previous work (Li, et al., J. Comput. Chem. 2012, 33, 1960) to include parallelization of the molecular surface and energy calculations components of the algorithm. The parallelization scheme utilizes different approaches such as space domain parallelization, algorithmic parallelization, multithreading, and task scheduling, depending on the quantity being calculated. This allows for efficient use of the computing resources of the corresponding computer cluster. The parallelization scheme is implemented in the popular software DelPhi and results in speedup of several folds. As a demonstration of the efficiency and capability of this methodology, the electrostatic potential, and electric field distributions are calculated for the bovine mitochondrial supercomplex illustrating their complex topology, which cannot be obtained by modeling the supercomplex components alone. Copyright © 2013 Wiley Periodicals, Inc.
A Sparse Self-Consistent Field Algorithm and Its Parallel Implementation: Application to Density-Functional-Based Tight Binding.

PubMed

Scemama, Anthony; Renon, Nicolas; Rapacioli, Mathias

2014-06-10

We present an algorithm and its parallel implementation for solving a self-consistent problem as encountered in Hartree-Fock or density functional theory. The algorithm takes advantage of the sparsity of matrices through the use of local molecular orbitals. The implementation allows one to exploit efficiently modern symmetric multiprocessing (SMP) computer architectures. As a first application, the algorithm is used within the density-functional-based tight binding method, for which most of the computational time is spent in the linear algebra routines (diagonalization of the Fock/Kohn-Sham matrix). We show that with this algorithm (i) single point calculations on very large systems (millions of atoms) can be performed on large SMP machines, (ii) calculations involving intermediate size systems (1000-100 000 atoms) are also strongly accelerated and can run efficiently on standard servers, and (iii) the error on the total energy due to the use of a cutoff in the molecular orbital coefficients can be controlled such that it remains smaller than the SCF convergence criterion.
An Automated Pipeline for Engineering Many-Enzyme Pathways: Computational Sequence Design, Pathway Expression-Flux Mapping, and Scalable Pathway Optimization.

PubMed

Halper, Sean M; Cetnar, Daniel P; Salis, Howard M

2018-01-01

Engineering many-enzyme metabolic pathways suffers from the design curse of dimensionality. There are an astronomical number of synonymous DNA sequence choices, though relatively few will express an evolutionary robust, maximally productive pathway without metabolic bottlenecks. To solve this challenge, we have developed an integrated, automated computational-experimental pipeline that identifies a pathway's optimal DNA sequence without high-throughput screening or many cycles of design-build-test. The first step applies our Operon Calculator algorithm to design a host-specific evolutionary robust bacterial operon sequence with maximally tunable enzyme expression levels. The second step applies our RBS Library Calculator algorithm to systematically vary enzyme expression levels with the smallest-sized library. After characterizing a small number of constructed pathway variants, measurements are supplied to our Pathway Map Calculator algorithm, which then parameterizes a kinetic metabolic model that ultimately predicts the pathway's optimal enzyme expression levels and DNA sequences. Altogether, our algorithms provide the ability to efficiently map the pathway's sequence-expression-activity space and predict DNA sequences with desired metabolic fluxes. Here, we provide a step-by-step guide to applying the Pathway Optimization Pipeline on a desired multi-enzyme pathway in a bacterial host.
Computational methods for reactive transport modeling: A Gibbs energy minimization approach for multiphase equilibrium calculations

NASA Astrophysics Data System (ADS)

Leal, Allan M. M.; Kulik, Dmitrii A.; Kosakowski, Georg

2016-02-01

We present a numerical method for multiphase chemical equilibrium calculations based on a Gibbs energy minimization approach. The method can accurately and efficiently determine the stable phase assemblage at equilibrium independently of the type of phases and species that constitute the chemical system. We have successfully applied our chemical equilibrium algorithm in reactive transport simulations to demonstrate its effective use in computationally intensive applications. We used FEniCS to solve the governing partial differential equations of mass transport in porous media using finite element methods in unstructured meshes. Our equilibrium calculations were benchmarked with GEMS3K, the numerical kernel of the geochemical package GEMS. This allowed us to compare our results with a well-established Gibbs energy minimization algorithm, as well as their performance on every mesh node, at every time step of the transport simulation. The benchmark shows that our novel chemical equilibrium algorithm is accurate, robust, and efficient for reactive transport applications, and it is an improvement over the Gibbs energy minimization algorithm used in GEMS3K. The proposed chemical equilibrium method has been implemented in Reaktoro, a unified framework for modeling chemically reactive systems, which is now used as an alternative numerical kernel of GEMS.
Efficient computation of optimal oligo-RNA binding.

PubMed

Hodas, Nathan O; Aalberts, Daniel P

2004-01-01

We present an algorithm that calculates the optimal binding conformation and free energy of two RNA molecules, one or both oligomeric. This algorithm has applications to modeling DNA microarrays, RNA splice-site recognitions and other antisense problems. Although other recent algorithms perform the same calculation in time proportional to the sum of the lengths cubed, O((N1 + N2)3), our oligomer binding algorithm, called bindigo, scales as the product of the sequence lengths, O(N1*N2). The algorithm performs well in practice with the aid of a heuristic for large asymmetric loops. To demonstrate its speed and utility, we use bindigo to investigate the binding proclivities of U1 snRNA to mRNA donor splice sites.
Paradeisos: A perfect hashing algorithm for many-body eigenvalue problems

DOE PAGES

Jia, C. J.; Wang, Y.; Mendl, C. B.; ...

2017-12-02

Here, we describe an essentially perfect hashing algorithm for calculating the position of an element in an ordered list, appropriate for the construction and manipulation of many-body Hamiltonian, sparse matrices. Each element of the list corresponds to an integer value whose binary representation reflects the occupation of single-particle basis states for each element in the many-body Hilbert space. The algorithm replaces conventional methods, such as binary search, for locating the elements of the ordered list, eliminating the need to store the integer representation for each element, without increasing the computational complexity. Combined with the “checkerboard” decomposition of the Hamiltonian matrixmore » for distribution over parallel computing environments, this leads to a substantial savings in aggregate memory. While the algorithm can be applied broadly to many-body, correlated problems, we demonstrate its utility in reducing total memory consumption for a series of fermionic single-band Hubbard model calculations on small clusters with progressively larger Hilbert space dimension.« less
Paradeisos: A perfect hashing algorithm for many-body eigenvalue problems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jia, C. J.; Wang, Y.; Mendl, C. B.

Here, we describe an essentially perfect hashing algorithm for calculating the position of an element in an ordered list, appropriate for the construction and manipulation of many-body Hamiltonian, sparse matrices. Each element of the list corresponds to an integer value whose binary representation reflects the occupation of single-particle basis states for each element in the many-body Hilbert space. The algorithm replaces conventional methods, such as binary search, for locating the elements of the ordered list, eliminating the need to store the integer representation for each element, without increasing the computational complexity. Combined with the “checkerboard” decomposition of the Hamiltonian matrixmore » for distribution over parallel computing environments, this leads to a substantial savings in aggregate memory. While the algorithm can be applied broadly to many-body, correlated problems, we demonstrate its utility in reducing total memory consumption for a series of fermionic single-band Hubbard model calculations on small clusters with progressively larger Hilbert space dimension.« less
The effectiveness of a new algorithm on a three-dimensional finite element model construction of bone trabeculae in implant biomechanics.

PubMed

Sato, Y; Teixeira, E R; Tsuga, K; Shindoi, N

1999-08-01

More validity of finite element analysis (FEA) in implant biomechanics requires element downsizing. However, excess downsizing needs computer memory and calculation time. To evaluate the effectiveness of a new algorithm established for more valid FEA model construction without downsizing, three-dimensional FEA bone trabeculae models with different element sizes (300, 150 and 75 micron) were constructed. Four algorithms of stepwise (1 to 4 ranks) assignment of Young's modulus accorded with bone volume in the individual cubic element was used and then stress distribution against vertical loading was analysed. The model with 300 micron element size, with 4 ranks of Young's moduli accorded with bone volume in each element presented similar stress distribution to the model with the 75 micron element size. These results show that the new algorithm was effective, and the use of the 300 micron element for bone trabeculae representation was proposed, without critical changes in stress values and for possible savings on computer memory and calculation time in the laboratory.
A Visual Analytic for High-Dimensional Data Exploitation: The Heterogeneous Data-Reduction Proximity Tool

DTIC Science & Technology

2013-07-01

structure of the data and Gower’s similarity coefficient as the algorithm for calculating the proximity matrices. The following section provides a...representative set of terrorist event data. Attribute Day Location Time Prim /Attack Sec/Attack Weight 1 1 1 1 1 Scale Nominal Nominal Interval Nominal...calculate the similarity it uses Gower’s similarity and multidimensional scaling algorithms contained in an R statistical computing environment
A priori mesh grading for the numerical calculation of the head-related transfer functions

PubMed Central

Ziegelwanger, Harald; Kreuzer, Wolfgang; Majdak, Piotr

2017-01-01

Head-related transfer functions (HRTFs) describe the directional filtering of the incoming sound caused by the morphology of a listener’s head and pinnae. When an accurate model of a listener’s morphology exists, HRTFs can be calculated numerically with the boundary element method (BEM). However, the general recommendation to model the head and pinnae with at least six elements per wavelength renders the BEM as a time-consuming procedure when calculating HRTFs for the full audible frequency range. In this study, a mesh preprocessing algorithm is proposed, viz., a priori mesh grading, which reduces the computational costs in the HRTF calculation process significantly. The mesh grading algorithm deliberately violates the recommendation of at least six elements per wavelength in certain regions of the head and pinnae and varies the size of elements gradually according to an a priori defined grading function. The evaluation of the algorithm involved HRTFs calculated for various geometric objects including meshes of three human listeners and various grading functions. The numerical accuracy and the predicted sound-localization performance of calculated HRTFs were analyzed. A-priori mesh grading appeared to be suitable for the numerical calculation of HRTFs in the full audible frequency range and outperformed uniform meshes in terms of numerical errors, perception based predictions of sound-localization performance, and computational costs. PMID:28239186
On the relation between correlation dimension, approximate entropy and sample entropy parameters, and a fast algorithm for their calculation

NASA Astrophysics Data System (ADS)

Zurek, Sebastian; Guzik, Przemyslaw; Pawlak, Sebastian; Kosmider, Marcin; Piskorski, Jaroslaw

2012-12-01

We explore the relation between correlation dimension, approximate entropy and sample entropy parameters, which are commonly used in nonlinear systems analysis. Using theoretical considerations we identify the points which are shared by all these complexity algorithms and show explicitly that the above parameters are intimately connected and mutually interdependent. A new geometrical interpretation of sample entropy and correlation dimension is provided and the consequences for the interpretation of sample entropy, its relative consistency and some of the algorithms for parameter selection for this quantity are discussed. To get an exact algorithmic relation between the three parameters we construct a very fast algorithm for simultaneous calculations of the above, which uses the full time series as the source of templates, rather than the usual 10%. This algorithm can be used in medical applications of complexity theory, as it can calculate all three parameters for a realistic recording of 104 points within minutes with the use of an average notebook computer.
Computer Solution of the Schrodinger Equation--Two Useful Programs.

ERIC Educational Resources Information Center

Evans, D. E.

1980-01-01

Describes a general purpose algorithm which enables one to calculate the allowed energy eigenvalues for an arbitrary potential. Results of a calculation where a centrifugal potential is added to the hydrogenic Coulomb potential are discussed. (Author/HM)
Approximate factorization for incompressible flow. Ph.D. Thesis; [Navier-Stokes equation

NASA Technical Reports Server (NTRS)

Bernard, R. S.

1981-01-01

For computational solution of the incompressible Navier-Stokes equations, the approximate factorization (AF) algorithm is used to solve the vectorized momentum equation in delta form based on the pressure calculated in the previous time step. The newly calculated velocities are substituted into the pressure equation (obtained from a linear combination of the continuity and momentum equation), which is then solved by means of line SOR. Computational results are presented for the NACA 66 sub 3 018 airfoil at Reynolds numbers of 1000 and 40,000 and attack angles of 0 and 6 degrees. Comparison with wind tunnel data for Re = 40,000 indicates good qualitative agreement between measured and calculated pressure distributions. Quantitative agreement is only fair, however, with the calculations somewhat displaced from the measurements. Furthermore, the computed velocity profiles are unrealistically thick around the airfoil, due to the excessive amount of artificial viscosity needed for stability. Based on the performance of the algorithm with regard to stability, it is concluded that AF/SOR is suitable for calculations at Reynolds numbers less than 10,000. Speedwise, the method is faster than point SOR by at least a factor of two.
An all digital phase locked loop for FM demodulation.

NASA Technical Reports Server (NTRS)

Greco, J.; Garodnick, J.; Schilling, D. L.

1972-01-01

A phase-locked loop designed with all-digital circuitry which avoids certain problems, and a digital voltage controlled oscillator algorithm are described. The system operates synchronously and performs all required digital calculations within one sampling period, thereby performing as a real-time special-purpose computer. The SNR ratio is computed for frequency offsets and sinusoidal modulation, and experimental results verify the theoretical calculations.
Methods and Algorithms for Computer-aided Engineering of Die Tooling of Compressor Blades from Titanium Alloy

NASA Astrophysics Data System (ADS)

Khaimovich, A. I.; Khaimovich, I. N.

2018-01-01

The articles provides the calculation algorithms for blank design and die forming fitting to produce the compressor blades for aircraft engines. The design system proposed in the article allows generating drafts of trimming and reducing dies automatically, leading to significant reduction of work preparation time. The detailed analysis of the blade structural elements features was carried out, the taken limitations and technological solutions allowed to form generalized algorithms of forming parting stamp face over the entire circuit of the engraving for different configurations of die forgings. The author worked out the algorithms and programs to calculate three dimensional point locations describing the configuration of die cavity.

New algorithm for tensor contractions on multi-core CPUs, GPUs, and accelerators enables CCSD and EOM-CCSD calculations with over 1000 basis functions on a single compute node.

PubMed

Kaliman, Ilya A; Krylov, Anna I

2017-04-30

A new hardware-agnostic contraction algorithm for tensors of arbitrary symmetry and sparsity is presented. The algorithm is implemented as a stand-alone open-source code libxm. This code is also integrated with general tensor library libtensor and with the Q-Chem quantum-chemistry package. An overview of the algorithm, its implementation, and benchmarks are presented. Similarly to other tensor software, the algorithm exploits efficient matrix multiplication libraries and assumes that tensors are stored in a block-tensor form. The distinguishing features of the algorithm are: (i) efficient repackaging of the individual blocks into large matrices and back, which affords efficient graphics processing unit (GPU)-enabled calculations without modifications of higher-level codes; (ii) fully asynchronous data transfer between disk storage and fast memory. The algorithm enables canonical all-electron coupled-cluster and equation-of-motion coupled-cluster calculations with single and double substitutions (CCSD and EOM-CCSD) with over 1000 basis functions on a single quad-GPU machine. We show that the algorithm exhibits predicted theoretical scaling for canonical CCSD calculations, O(N 6 ), irrespective of the data size on disk. © 2017 Wiley Periodicals, Inc. © 2017 Wiley Periodicals, Inc.
Using quantum chemistry muscle to flex massive systems: How to respond to something perturbing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bertoni, Colleen

Computational chemistry uses the theoretical advances of quantum mechanics and the algorithmic and hardware advances of computer science to give insight into chemical problems. It is currently possible to do highly accurate quantum chemistry calculations, but the most accurate methods are very computationally expensive. Thus it is only feasible to do highly accurate calculations on small molecules, since typically more computationally efficient methods are also less accurate. The overall goal of my dissertation work has been to try to decrease the computational expense of calculations without decreasing the accuracy. In particular, my dissertation work focuses on fragmentation methods, intermolecular interactionsmore » methods, analytic gradients, and taking advantage of new hardware.« less
Exploring Neutrino Oscillation Parameter Space with a Monte Carlo Algorithm

NASA Astrophysics Data System (ADS)

Espejel, Hugo; Ernst, David; Cogswell, Bernadette; Latimer, David

2015-04-01

The χ2 (or likelihood) function for a global analysis of neutrino oscillation data is first calculated as a function of the neutrino mixing parameters. A computational challenge is to obtain the minima or the allowed regions for the mixing parameters. The conventional approach is to calculate the χ2 (or likelihood) function on a grid for a large number of points, and then marginalize over the likelihood function. As the number of parameters increases with the number of neutrinos, making the calculation numerically efficient becomes necessary. We implement a new Monte Carlo algorithm (D. Foreman-Mackey, D. W. Hogg, D. Lang and J. Goodman, Publications of the Astronomical Society of the Pacific, 125 306 (2013)) to determine its computational efficiency at finding the minima and allowed regions. We examine a realistic example to compare the historical and the new methods.
A flexible and accurate digital volume correlation method applicable to high-resolution volumetric images

NASA Astrophysics Data System (ADS)

Pan, Bing; Wang, Bo

2017-10-01

Digital volume correlation (DVC) is a powerful technique for quantifying interior deformation within solid opaque materials and biological tissues. In the last two decades, great efforts have been made to improve the accuracy and efficiency of the DVC algorithm. However, there is still a lack of a flexible, robust and accurate version that can be efficiently implemented in personal computers with limited RAM. This paper proposes an advanced DVC method that can realize accurate full-field internal deformation measurement applicable to high-resolution volume images with up to billions of voxels. Specifically, a novel layer-wise reliability-guided displacement tracking strategy combined with dynamic data management is presented to guide the DVC computation from slice to slice. The displacements at specified calculation points in each layer are computed using the advanced 3D inverse-compositional Gauss-Newton algorithm with the complete initial guess of the deformation vector accurately predicted from the computed calculation points. Since only limited slices of interest in the reference and deformed volume images rather than the whole volume images are required, the DVC calculation can thus be efficiently implemented on personal computers. The flexibility, accuracy and efficiency of the presented DVC approach are demonstrated by analyzing computer-simulated and experimentally obtained high-resolution volume images.
Real-time optical flow estimation on a GPU for a skied-steered mobile robot

NASA Astrophysics Data System (ADS)

Kniaz, V. V.

2016-04-01

Accurate egomotion estimation is required for mobile robot navigation. Often the egomotion is estimated using optical flow algorithms. For an accurate estimation of optical flow most of modern algorithms require high memory resources and processor speed. However simple single-board computers that control the motion of the robot usually do not provide such resources. On the other hand, most of modern single-board computers are equipped with an embedded GPU that could be used in parallel with a CPU to improve the performance of the optical flow estimation algorithm. This paper presents a new Z-flow algorithm for efficient computation of an optical flow using an embedded GPU. The algorithm is based on the phase correlation optical flow estimation and provide a real-time performance on a low cost embedded GPU. The layered optical flow model is used. Layer segmentation is performed using graph-cut algorithm with a time derivative based energy function. Such approach makes the algorithm both fast and robust in low light and low texture conditions. The algorithm implementation for a Raspberry Pi Model B computer is discussed. For evaluation of the algorithm the computer was mounted on a Hercules mobile skied-steered robot equipped with a monocular camera. The evaluation was performed using a hardware-in-the-loop simulation and experiments with Hercules mobile robot. Also the algorithm was evaluated using KITTY Optical Flow 2015 dataset. The resulting endpoint error of the optical flow calculated with the developed algorithm was low enough for navigation of the robot along the desired trajectory.
On computing Laplace's coefficients and their derivatives.

NASA Astrophysics Data System (ADS)

Gerasimov, I. A.; Vinnikov, E. L.

The algorithm of computing Laplace's coefficients and their derivatives is proposed with application of recurrent relations. The A.G.M.-method is used for the calculation of values L0(0), L0(1). The FORTRAN-program corresponding to the algorithm is given. The precision control was provided with numerical integrating by Simpsons method. The behavior of Laplace's coefficients and their third derivatives whith varying indices K, n for fixed values of the α-parameter is presented graphically.
On randomized algorithms for numerical solution of applied Fredholm integral equations of the second kind

NASA Astrophysics Data System (ADS)

Voytishek, Anton V.; Shipilov, Nikolay M.

2017-11-01

In this paper, the systematization of numerical (implemented on a computer) randomized functional algorithms for approximation of a solution of Fredholm integral equation of the second kind is carried out. Wherein, three types of such algorithms are distinguished: the projection, the mesh and the projection-mesh methods. The possibilities for usage of these algorithms for solution of practically important problems is investigated in detail. The disadvantages of the mesh algorithms, related to the necessity of calculation values of the kernels of integral equations in fixed points, are identified. On practice, these kernels have integrated singularities, and calculation of their values is impossible. Thus, for applied problems, related to solving Fredholm integral equation of the second kind, it is expedient to use not mesh, but the projection and the projection-mesh randomized algorithms.
GRAPE-5: A Special-Purpose Computer for N-Body Simulations

NASA Astrophysics Data System (ADS)

Kawai, Atsushi; Fukushige, Toshiyuki; Makino, Junichiro; Taiji, Makoto

2000-08-01

We have developed a special-purpose computer for gravitational many-body simulations, GRAPE-5. GRAPE-5 accelerates the force calculation which dominates the calculation cost of the simulation. All other calculations, such as the time integration of orbits, are performed on a general-purpose computer (host computer) connected to GRAPE-5. A GRAPE-5 board consists of eight custom pipeline chips (G5 chip) and its peak performance is 38.4 Gflops. GRAPE-5 is the successor of GRAPE-3. The differences between GRAPE-5 and GRAPE-3 are: (1) The newly developed G5 chip contains two pipelines operating at 80 MHz, while the GRAPE chip, which was used for GRAPE-3, had one at 20 MHz. The calculation speed of GRAPE-5 is 8-times faster than that of GRAPE-3. (2) The GRAPE-5 board adopted a PCI bus as the interface to the host computer instead of VME of GRAPE-3, resulting in a communication speed one order of magnitude faster. (3) In addition to the pure 1/r potential, the G5 chip can calculate forces with arbitrary cutoff functions, so that it can be applied to the Ewald or P3M methods. (4) The pairwise force calculated on GRAPE-5 is about 10-times more accurate than that on GRAPE-3. On one GRAPE-5 board, one timestep with a direct summation algorithm takes 14 (N/128 k)2 seconds. With the Barnes-Hut tree algorithm (theta = 0.75), one timestep can be done in 15 (N/106) seconds.
An efficient parallel algorithm for matrix-vector multiplication

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hendrickson, B.; Leland, R.; Plimpton, S.

The multiplication of a vector by a matrix is the kernel computation of many algorithms in scientific computation. A fast parallel algorithm for this calculation is therefore necessary if one is to make full use of the new generation of parallel supercomputers. This paper presents a high performance, parallel matrix-vector multiplication algorithm that is particularly well suited to hypercube multiprocessors. For an n x n matrix on p processors, the communication cost of this algorithm is O(n/[radical]p + log(p)), independent of the matrix sparsity pattern. The performance of the algorithm is demonstrated by employing it as the kernel in themore » well-known NAS conjugate gradient benchmark, where a run time of 6.09 seconds was observed. This is the best published performance on this benchmark achieved to date using a massively parallel supercomputer.« less
Description of the computations and pilot procedures for planning fuel-conservative descents with a small programmable calculator

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vicroy, D.D.; Knox, C.E.

A simplified flight management descent algorithm was developed and programmed on a small programmable calculator. It was designed to aid the pilot in planning and executing a fuel conservative descent to arrive at a metering fix at a time designated by the air traffic control system. The algorithm may also be used for planning fuel conservative descents when time is not a consideration. The descent path was calculated for a constant Mach/airspeed schedule from linear approximations of airplane performance with considerations given for gross weight, wind, and nonstandard temperature effects. The flight management descent algorithm and the vertical performance modelingmore » required for the DC-10 airplane is described.« less
Real-time image dehazing using local adaptive neighborhoods and dark-channel-prior

NASA Astrophysics Data System (ADS)

Valderrama, Jesus A.; Díaz-Ramírez, Víctor H.; Kober, Vitaly; Hernandez, Enrique

2015-09-01

A real-time algorithm for single image dehazing is presented. The algorithm is based on calculation of local neighborhoods of a hazed image inside a moving window. The local neighborhoods are constructed by computing rank-order statistics. Next the dark-channel-prior approach is applied to the local neighborhoods to estimate the transmission function of the scene. By using the suggested approach there is no need for applying a refining algorithm to the estimated transmission such as the soft matting algorithm. To achieve high-rate signal processing the proposed algorithm is implemented exploiting massive parallelism on a graphics processing unit (GPU). Computer simulation results are carried out to test the performance of the proposed algorithm in terms of dehazing efficiency and speed of processing. These tests are performed using several synthetic and real images. The obtained results are analyzed and compared with those obtained with existing dehazing algorithms.
Clinical implementation and evaluation of the Acuros dose calculation algorithm.

PubMed

Yan, Chenyu; Combine, Anthony G; Bednarz, Greg; Lalonde, Ronald J; Hu, Bin; Dickens, Kathy; Wynn, Raymond; Pavord, Daniel C; Saiful Huq, M

2017-09-01

The main aim of this study is to validate the Acuros XB dose calculation algorithm for a Varian Clinac iX linac in our clinics, and subsequently compare it with the wildely used AAA algorithm. The source models for both Acuros XB and AAA were configured by importing the same measured beam data into Eclipse treatment planning system. Both algorithms were validated by comparing calculated dose with measured dose on a homogeneous water phantom for field sizes ranging from 6 cm × 6 cm to 40 cm × 40 cm. Central axis and off-axis points with different depths were chosen for the comparison. In addition, the accuracy of Acuros was evaluated for wedge fields with wedge angles from 15 to 60°. Similarly, variable field sizes for an inhomogeneous phantom were chosen to validate the Acuros algorithm. In addition, doses calculated by Acuros and AAA at the center of lung equivalent tissue from three different VMAT plans were compared to the ion chamber measured doses in QUASAR phantom, and the calculated dose distributions by the two algorithms and their differences on patients were compared. Computation time on VMAT plans was also evaluated for Acuros and AAA. Differences between dose-to-water (calculated by AAA and Acuros XB) and dose-to-medium (calculated by Acuros XB) on patient plans were compared and evaluated. For open 6 MV photon beams on the homogeneous water phantom, both Acuros XB and AAA calculations were within 1% of measurements. For 23 MV photon beams, the calculated doses were within 1.5% of measured doses for Acuros XB and 2% for AAA. Testing on the inhomogeneous phantom demonstrated that AAA overestimated doses by up to 8.96% at a point close to lung/solid water interface, while Acuros XB reduced that to 1.64%. The test on QUASAR phantom showed that Acuros achieved better agreement in lung equivalent tissue while AAA underestimated dose for all VMAT plans by up to 2.7%. Acuros XB computation time was about three times faster than AAA for VMAT plans, and computation time for other plans will be discussed at the end. Maximum difference between dose calculated by AAA and dose-to-medium by Acuros XB (Acuros_D m,m ) was 4.3% on patient plans at the isocenter, and maximum difference between D 100 calculated by AAA and by Acuros_D m,m was 11.3%. When calculating the maximum dose to spinal cord on patient plans, differences between dose calculated by AAA and Acuros_D m,m were more than 3%. Compared with AAA, Acuros XB improves accuracy in the presence of inhomogeneity, and also significantly reduces computation time for VMAT plans. Dose differences between AAA and Acuros_D w,m were generally less than the dose differences between AAA and Acuros_D m,m . Clinical practitioners should consider making Acuros XB available in clinics, however, further investigation and clarification is needed about which dose reporting mode (dose-to-water or dose-to-medium) should be used in clinics. © 2017 The Authors. Journal of Applied Clinical Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.
Variable-Metric Algorithm For Constrained Optimization

NASA Technical Reports Server (NTRS)

Frick, James D.

1989-01-01

Variable Metric Algorithm for Constrained Optimization (VMACO) is nonlinear computer program developed to calculate least value of function of n variables subject to general constraints, both equality and inequality. First set of constraints equality and remaining constraints inequalities. Program utilizes iterative method in seeking optimal solution. Written in ANSI Standard FORTRAN 77.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Pastore, Giovanni; Rabiti, Cristian; Pizzocri, Davide

PolyPole is a numerical algorithm for the calculation of intra-granular fission gas release. In particular, the algorithm solves the gas diffusion problem in a fuel grain in time-varying conditions. The program has been extensively tested. PolyPole combines a high accuracy with a high computational efficiency and is ideally suited for application in fuel performance codes.
Spectrum Orbit Utilization Program documentation: SOUP5 version 3.8 user's manual, volume 1, chapters 1 through 5

NASA Technical Reports Server (NTRS)

Davidson, J.; Ottey, H. R.; Sawitz, P.; Zusman, F. S.

1985-01-01

The underlying engineering and mathematical models as well as the computational methods used by the Spectrum Orbit Utilization Program 5 (SOUP5) analysis programs are described. Included are the algorithms used to calculate the technical parameters, and references to the technical literature. The organization, capabilities, processing sequences, and processing and data options of the SOUP5 system are described. The details of the geometric calculations are given. Also discussed are the various antenna gain algorithms; rain attenuation and depolarization calculations; calculations of transmitter power and received power flux density; channelization options, interference categories, and protection ratio calculation; generation of aggregrate interference and margins; equivalent gain calculations; and how to enter a protection ratio template.
A modified 3D algorithm for road traffic noise attenuation calculations in large urban areas.

PubMed

Wang, Haibo; Cai, Ming; Yao, Yifan

2017-07-01

The primary objective of this study is the development and application of a 3D road traffic noise attenuation calculation algorithm. First, the traditional empirical method does not address problems caused by non-direct occlusion by buildings and the different building heights. In contrast, this study considers the volume ratio of the buildings and the area ratio of the projection of buildings adjacent to the road. The influence of the ground affection is analyzed. The insertion loss due to barriers (infinite length and finite barriers) is also synthesized in the algorithm. Second, the impact of different road segmentation is analyzed. Through the case of Pearl River New Town, it is recommended that 5° is the most appropriate scanning angle as the computational time is acceptable and the average error is approximately 3.1 dB. In addition, the algorithm requires only 1/17 of the time that the beam tracking method requires at the cost of more imprecise calculation results. Finally, the noise calculation for a large urban area with a high density of buildings shows the feasibility of the 3D noise attenuation calculation algorithm. The algorithm is expected to be applied in projects requiring large area noise simulations. Copyright © 2017 Elsevier Ltd. All rights reserved.
Optimization Algorithm for Kalman Filter Exploiting the Numerical Characteristics of SINS/GPS Integrated Navigation Systems.

PubMed

Hu, Shaoxing; Xu, Shike; Wang, Duhu; Zhang, Aiwu

2015-11-11

Aiming at addressing the problem of high computational cost of the traditional Kalman filter in SINS/GPS, a practical optimization algorithm with offline-derivation and parallel processing methods based on the numerical characteristics of the system is presented in this paper. The algorithm exploits the sparseness and/or symmetry of matrices to simplify the computational procedure. Thus plenty of invalid operations can be avoided by offline derivation using a block matrix technique. For enhanced efficiency, a new parallel computational mechanism is established by subdividing and restructuring calculation processes after analyzing the extracted "useful" data. As a result, the algorithm saves about 90% of the CPU processing time and 66% of the memory usage needed in a classical Kalman filter. Meanwhile, the method as a numerical approach needs no precise-loss transformation/approximation of system modules and the accuracy suffers little in comparison with the filter before computational optimization. Furthermore, since no complicated matrix theories are needed, the algorithm can be easily transplanted into other modified filters as a secondary optimization method to achieve further efficiency.
Rapid Calculation of Max-Min Fair Rates for Multi-Commodity Flows in Fat-Tree Networks

DOE PAGES

Mollah, Md Atiqul; Yuan, Xin; Pakin, Scott; ...

2017-08-29

Max-min fairness is often used in the performance modeling of interconnection networks. Existing methods to compute max-min fair rates for multi-commodity flows have high complexity and are computationally infeasible for large networks. In this paper, we show that by considering topological features, this problem can be solved efficiently for the fat-tree topology that is widely used in data centers and high performance compute clusters. Several efficient new algorithms are developed for this problem, including a parallel algorithm that can take advantage of multi-core and shared-memory architectures. Using these algorithms, we demonstrate that it is possible to find the max-min fairmore » rate allocation for multi-commodity flows in fat-tree networks that support tens of thousands of nodes. We evaluate the run-time performance of the proposed algorithms and show improvement in orders of magnitude over the previously best known method. Finally, we further demonstrate a new application of max-min fair rate allocation that is only computationally feasible using our new algorithms.« less
Rapid Calculation of Max-Min Fair Rates for Multi-Commodity Flows in Fat-Tree Networks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mollah, Md Atiqul; Yuan, Xin; Pakin, Scott

Max-min fairness is often used in the performance modeling of interconnection networks. Existing methods to compute max-min fair rates for multi-commodity flows have high complexity and are computationally infeasible for large networks. In this paper, we show that by considering topological features, this problem can be solved efficiently for the fat-tree topology that is widely used in data centers and high performance compute clusters. Several efficient new algorithms are developed for this problem, including a parallel algorithm that can take advantage of multi-core and shared-memory architectures. Using these algorithms, we demonstrate that it is possible to find the max-min fairmore » rate allocation for multi-commodity flows in fat-tree networks that support tens of thousands of nodes. We evaluate the run-time performance of the proposed algorithms and show improvement in orders of magnitude over the previously best known method. Finally, we further demonstrate a new application of max-min fair rate allocation that is only computationally feasible using our new algorithms.« less
Fast Ss-Ilm a Computationally Efficient Algorithm to Discover Socially Important Locations

NASA Astrophysics Data System (ADS)

Dokuz, A. S.; Celik, M.

2017-11-01

Socially important locations are places which are frequently visited by social media users in their social media lifetime. Discovering socially important locations provide several valuable information about user behaviours on social media networking sites. However, discovering socially important locations are challenging due to data volume and dimensions, spatial and temporal calculations, location sparseness in social media datasets, and inefficiency of current algorithms. In the literature, several studies are conducted to discover important locations, however, the proposed approaches do not work in computationally efficient manner. In this study, we propose Fast SS-ILM algorithm by modifying the algorithm of SS-ILM to mine socially important locations efficiently. Experimental results show that proposed Fast SS-ILM algorithm decreases execution time of socially important locations discovery process up to 20 %.

On improving the algorithm efficiency in the particle-particle force calculations

NASA Astrophysics Data System (ADS)

Kozynchenko, Alexander I.; Kozynchenko, Sergey A.

2016-09-01

The problem of calculating inter-particle forces in the particle-particle (PP) simulation models takes an important place in scientific computing. Such simulation models are used in diverse scientific applications arising in astrophysics, plasma physics, particle accelerators, etc., where the long-range forces are considered. The inverse-square laws such as Coulomb's law of electrostatic forces and Newton's law of universal gravitation are the examples of laws pertaining to the long-range forces. The standard naïve PP method outlined, for example, by Hockney and Eastwood [1] is straightforward, processing all pairs of particles in a double nested loop. The PP algorithm provides the best accuracy of all possible methods, but its computational complexity is O (Np2), where Np is a total number of particles involved. Too low efficiency of the PP algorithm seems to be the challenging issue in some cases where the high accuracy is required. An example can be taken from the charged particle beam dynamics where, under computing the own space charge of the beam, so-called macro-particles are used (see e.g., Humphries Jr. [2], Kozynchenko and Svistunov [3]).
An Implicit Upwind Algorithm for Computing Turbulent Flows on Unstructured Grids

NASA Technical Reports Server (NTRS)

Anerson, W. Kyle; Bonhaus, Daryl L.

1994-01-01

An implicit, Navier-Stokes solution algorithm is presented for the computation of turbulent flow on unstructured grids. The inviscid fluxes are computed using an upwind algorithm and the solution is advanced in time using a backward-Euler time-stepping scheme. At each time step, the linear system of equations is approximately solved with a point-implicit relaxation scheme. This methodology provides a viable and robust algorithm for computing turbulent flows on unstructured meshes. Results are shown for subsonic flow over a NACA 0012 airfoil and for transonic flow over a RAE 2822 airfoil exhibiting a strong upper-surface shock. In addition, results are shown for 3 element and 4 element airfoil configurations. For the calculations, two one equation turbulence models are utilized. For the NACA 0012 airfoil, a pressure distribution and force data are compared with other computational results as well as with experiment. Comparisons of computed pressure distributions and velocity profiles with experimental data are shown for the RAE airfoil and for the 3 element configuration. For the 4 element case, comparisons of surface pressure distributions with experiment are made. In general, the agreement between the computations and the experiment is good.
A Flexible Computational Framework Using R and Map-Reduce for Permutation Tests of Massive Genetic Analysis of Complex Traits.

PubMed

Mahjani, Behrang; Toor, Salman; Nettelblad, Carl; Holmgren, Sverker

2017-01-01

In quantitative trait locus (QTL) mapping significance of putative QTL is often determined using permutation testing. The computational needs to calculate the significance level are immense, 10 4 up to 10 8 or even more permutations can be needed. We have previously introduced the PruneDIRECT algorithm for multiple QTL scan with epistatic interactions. This algorithm has specific strengths for permutation testing. Here, we present a flexible, parallel computing framework for identifying multiple interacting QTL using the PruneDIRECT algorithm which uses the map-reduce model as implemented in Hadoop. The framework is implemented in R, a widely used software tool among geneticists. This enables users to rearrange algorithmic steps to adapt genetic models, search algorithms, and parallelization steps to their needs in a flexible way. Our work underlines the maturity of accessing distributed parallel computing for computationally demanding bioinformatics applications through building workflows within existing scientific environments. We investigate the PruneDIRECT algorithm, comparing its performance to exhaustive search and DIRECT algorithm using our framework on a public cloud resource. We find that PruneDIRECT is vastly superior for permutation testing, and perform 2 ×10 5 permutations for a 2D QTL problem in 15 hours, using 100 cloud processes. We show that our framework scales out almost linearly for a 3D QTL search.
An Efficient Supervised Training Algorithm for Multilayer Spiking Neural Networks

PubMed Central

Xie, Xiurui; Qu, Hong; Liu, Guisong; Zhang, Malu; Kurths, Jürgen

2016-01-01

The spiking neural networks (SNNs) are the third generation of neural networks and perform remarkably well in cognitive tasks such as pattern recognition. The spike emitting and information processing mechanisms found in biological cognitive systems motivate the application of the hierarchical structure and temporal encoding mechanism in spiking neural networks, which have exhibited strong computational capability. However, the hierarchical structure and temporal encoding approach require neurons to process information serially in space and time respectively, which reduce the training efficiency significantly. For training the hierarchical SNNs, most existing methods are based on the traditional back-propagation algorithm, inheriting its drawbacks of the gradient diffusion and the sensitivity on parameters. To keep the powerful computation capability of the hierarchical structure and temporal encoding mechanism, but to overcome the low efficiency of the existing algorithms, a new training algorithm, the Normalized Spiking Error Back Propagation (NSEBP) is proposed in this paper. In the feedforward calculation, the output spike times are calculated by solving the quadratic function in the spike response model instead of detecting postsynaptic voltage states at all time points in traditional algorithms. Besides, in the feedback weight modification, the computational error is propagated to previous layers by the presynaptic spike jitter instead of the gradient decent rule, which realizes the layer-wised training. Furthermore, our algorithm investigates the mathematical relation between the weight variation and voltage error change, which makes the normalization in the weight modification applicable. Adopting these strategies, our algorithm outperforms the traditional SNN multi-layer algorithms in terms of learning efficiency and parameter sensitivity, that are also demonstrated by the comprehensive experimental results in this paper. PMID:27044001
Rapid calculation of radiative heating rates and photodissociation rates in inhomogeneous multiple scattering atmospheres

NASA Technical Reports Server (NTRS)

Toon, Owen B.; Mckay, C. P.; Ackerman, T. P.; Santhanam, K.

1989-01-01

The solution of the generalized two-stream approximation for radiative transfer in homogeneous multiple scattering atmospheres is extended to vertically inhomogeneous atmospheres in a manner which is numerically stable and computationally efficient. It is shown that solar energy deposition rates, photolysis rates, and infrared cooling rates all may be calculated with the simple modifications of a single algorithm. The accuracy of the algorithm is generally better than 10 percent, so that other uncertainties, such as in absorption coefficients, may often dominate the error in calculation of the quantities of interest to atmospheric studies.
Harmonic analysis of spacecraft power systems using a personal computer

NASA Technical Reports Server (NTRS)

Williamson, Frank; Sheble, Gerald B.

1989-01-01

The effects that nonlinear devices such as ac/dc converters, HVDC transmission links, and motor drives have on spacecraft power systems are discussed. The nonsinusoidal currents, along with the corresponding voltages, are calculated by a harmonic power flow which decouples and solves for each harmonic component individually using an iterative Newton-Raphson algorithm. The sparsity of the harmonic equations and the overall Jacobian matrix is used to an advantage in terms of saving computer memory space and in terms of reducing computation time. The algorithm could also be modified to analyze each harmonic separately instead of all at the same time.
Iterative methods for plasma sheath calculations: Application to spherical probe

NASA Technical Reports Server (NTRS)

Parker, L. W.; Sullivan, E. C.

1973-01-01

The computer cost of a Poisson-Vlasov iteration procedure for the numerical solution of a steady-state collisionless plasma-sheath problem depends on: (1) the nature of the chosen iterative algorithm, (2) the position of the outer boundary of the grid, and (3) the nature of the boundary condition applied to simulate a condition at infinity (as in three-dimensional probe or satellite-wake problems). Two iterative algorithms, in conjunction with three types of boundary conditions, are analyzed theoretically and applied to the computation of current-voltage characteristics of a spherical electrostatic probe. The first algorithm was commonly used by physicists, and its computer costs depend primarily on the boundary conditions and are only slightly affected by the mesh interval. The second algorithm is not commonly used, and its costs depend primarily on the mesh interval and slightly on the boundary conditions.
Network reliability maximization for stochastic-flow network subject to correlated failures using genetic algorithm and tabu\\xA0search

NASA Astrophysics Data System (ADS)

Yeh, Cheng-Ta; Lin, Yi-Kuei; Yang, Jo-Yun

2018-07-01

Network reliability is an important performance index for many real-life systems, such as electric power systems, computer systems and transportation systems. These systems can be modelled as stochastic-flow networks (SFNs) composed of arcs and nodes. Most system supervisors respect the network reliability maximization by finding the optimal multi-state resource assignment, which is one resource to each arc. However, a disaster may cause correlated failures for the assigned resources, affecting the network reliability. This article focuses on determining the optimal resource assignment with maximal network reliability for SFNs. To solve the problem, this study proposes a hybrid algorithm integrating the genetic algorithm and tabu search to determine the optimal assignment, called the hybrid GA-TS algorithm (HGTA), and integrates minimal paths, recursive sum of disjoint products and the correlated binomial distribution to calculate network reliability. Several practical numerical experiments are adopted to demonstrate that HGTA has better computational quality than several popular soft computing algorithms.
Algorithms for Computing the Magnetic Field, Vector Potential, and Field Derivatives for Circular Current Loops in Cylindrical Coordinates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Walstrom, Peter Lowell

A numerical algorithm for computing the field components B r and B z and their r and z derivatives with open boundaries in cylindrical coordinates for circular current loops is described. An algorithm for computing the vector potential is also described. For the convenience of the reader, derivations of the final expressions from their defining integrals are given in detail, since their derivations (especially for the field derivatives) are not all easily found in textbooks. Numerical calculations are based on evaluation of complete elliptic integrals using the Bulirsch algorithm cel. Since cel can evaluate complete elliptic integrals of a fairlymore » general type, in some cases the elliptic integrals can be evaluated without first reducing them to forms containing standard Legendre forms. The algorithms avoid the numerical difficulties that many of the textbook solutions have for points near the axis because of explicit factors of 1=r or 1=r 2 in the some of the expressions.« less
Some queuing network models of computer systems

NASA Technical Reports Server (NTRS)

Herndon, E. S.

1980-01-01

Queuing network models of a computer system operating with a single workload type are presented. Program algorithms are adapted for use on the Texas Instruments SR-52 programmable calculator. By slightly altering the algorithm to process the G and H matrices row by row instead of column by column, six devices and an unlimited job/terminal population could be handled on the SR-52. Techniques are also introduced for handling a simple load dependent server and for studying interactive systems with fixed multiprogramming limits.
PHoToNs–A parallel heterogeneous and threads oriented code for cosmological N-body simulation

NASA Astrophysics Data System (ADS)

Wang, Qiao; Cao, Zong-Yan; Gao, Liang; Chi, Xue-Bin; Meng, Chen; Wang, Jie; Wang, Long

2018-06-01

We introduce a new code for cosmological simulations, PHoToNs, which incorporates features for performing massive cosmological simulations on heterogeneous high performance computer (HPC) systems and threads oriented programming. PHoToNs adopts a hybrid scheme to compute gravitational force, with the conventional Particle-Mesh (PM) algorithm to compute the long-range force, the Tree algorithm to compute the short range force and the direct summation Particle-Particle (PP) algorithm to compute gravity from very close particles. A self-similar space filling a Peano-Hilbert curve is used to decompose the computing domain. Threads programming is advantageously used to more flexibly manage the domain communication, PM calculation and synchronization, as well as Dual Tree Traversal on the CPU+MIC platform. PHoToNs scales well and efficiency of the PP kernel achieves 68.6% of peak performance on MIC and 74.4% on CPU platforms. We also test the accuracy of the code against the much used Gadget-2 in the community and found excellent agreement.
An Algorithm to Compress Line-transition Data for Radiative-transfer Calculations

NASA Astrophysics Data System (ADS)

Cubillos, Patricio E.

2017-11-01

Molecular line-transition lists are an essential ingredient for radiative-transfer calculations. With recent databases now surpassing the billion-line mark, handling them has become computationally prohibitive, due to both the required processing power and memory. Here I present a temperature-dependent algorithm to separate strong from weak line transitions, reformatting the large majority of the weaker lines into a cross-section data file, and retaining the detailed line-by-line information of the fewer strong lines. For any given molecule over the 0.3-30 μm range, this algorithm reduces the number of lines to a few million, enabling faster radiative-transfer computations without a significant loss of information. The final compression rate depends on how densely populated the spectrum is. I validate this algorithm by comparing Exomol’s HCN extinction-coefficient spectra between the complete (65 million line transitions) and compressed (7.7 million) line lists. Over the 0.6-33 μm range, the average difference between extinction-coefficient values is less than 1%. A Python/C implementation of this algorithm is open-source and available at https://github.com/pcubillos/repack. So far, this code handles the Exomol and HITRAN line-transition format.
Computationally efficient method for Fourier transform of highly chirped pulses for laser and parametric amplifier modeling.

PubMed

Andrianov, Alexey; Szabo, Aron; Sergeev, Alexander; Kim, Arkady; Chvykov, Vladimir; Kalashnikov, Mikhail

2016-11-14

We developed an improved approach to calculate the Fourier transform of signals with arbitrary large quadratic phase which can be efficiently implemented in numerical simulations utilizing Fast Fourier transform. The proposed algorithm significantly reduces the computational cost of Fourier transform of a highly chirped and stretched pulse by splitting it into two separate transforms of almost transform limited pulses, thereby reducing the required grid size roughly by a factor of the pulse stretching. The application of our improved Fourier transform algorithm in the split-step method for numerical modeling of CPA and OPCPA shows excellent agreement with standard algorithms.
Distributed Environment Control Using Wireless Sensor/Actuator Networks for Lighting Applications

PubMed Central

Nakamura, Masayuki; Sakurai, Atsushi; Nakamura, Jiro

2009-01-01

We propose a decentralized algorithm to calculate the control signals for lights in wireless sensor/actuator networks. This algorithm uses an appropriate step size in the iterative process used for quickly computing the control signals. We demonstrate the accuracy and efficiency of this approach compared with the penalty method by using Mote-based mesh sensor networks. The estimation error of the new approach is one-eighth as large as that of the penalty method with one-fifth of its computation time. In addition, we describe our sensor/actuator node for distributed lighting control based on the decentralized algorithm and demonstrate its practical efficacy. PMID:22291525
GPU-based ultra-fast dose calculation using a finite size pencil beam model.

PubMed

Gu, Xuejun; Choi, Dongju; Men, Chunhua; Pan, Hubert; Majumdar, Amitava; Jiang, Steve B

2009-10-21

Online adaptive radiation therapy (ART) is an attractive concept that promises the ability to deliver an optimal treatment in response to the inter-fraction variability in patient anatomy. However, it has yet to be realized due to technical limitations. Fast dose deposit coefficient calculation is a critical component of the online planning process that is required for plan optimization of intensity-modulated radiation therapy (IMRT). Computer graphics processing units (GPUs) are well suited to provide the requisite fast performance for the data-parallel nature of dose calculation. In this work, we develop a dose calculation engine based on a finite-size pencil beam (FSPB) algorithm and a GPU parallel computing framework. The developed framework can accommodate any FSPB model. We test our implementation in the case of a water phantom and the case of a prostate cancer patient with varying beamlet and voxel sizes. All testing scenarios achieved speedup ranging from 200 to 400 times when using a NVIDIA Tesla C1060 card in comparison with a 2.27 GHz Intel Xeon CPU. The computational time for calculating dose deposition coefficients for a nine-field prostate IMRT plan with this new framework is less than 1 s. This indicates that the GPU-based FSPB algorithm is well suited for online re-planning for adaptive radiotherapy.
Simplified Calculation Of Solar Fluxes In Solar Receivers

NASA Technical Reports Server (NTRS)

Bhandari, Pradeep

1990-01-01

Simplified Calculation of Solar Flux Distribution on Side Wall of Cylindrical Cavity Solar Receivers computer program employs simple solar-flux-calculation algorithm for cylindrical-cavity-type solar receiver. Results compare favorably with those of more complicated programs. Applications include study of solar energy and transfer of heat, and space power/solar-dynamics engineering. Written in FORTRAN 77.
Adaptive density trajectory cluster based on time and space distance

NASA Astrophysics Data System (ADS)

Liu, Fagui; Zhang, Zhijie

2017-10-01

There are some hotspot problems remaining in trajectory cluster for discovering mobile behavior regularity, such as the computation of distance between sub trajectories, the setting of parameter values in cluster algorithm and the uncertainty/boundary problem of data set. As a result, based on the time and space, this paper tries to define the calculation method of distance between sub trajectories. The significance of distance calculation for sub trajectories is to clearly reveal the differences in moving trajectories and to promote the accuracy of cluster algorithm. Besides, a novel adaptive density trajectory cluster algorithm is proposed, in which cluster radius is computed through using the density of data distribution. In addition, cluster centers and number are selected by a certain strategy automatically, and uncertainty/boundary problem of data set is solved by designed weighted rough c-means. Experimental results demonstrate that the proposed algorithm can perform the fuzzy trajectory cluster effectively on the basis of the time and space distance, and obtain the optimal cluster centers and rich cluster results information adaptably for excavating the features of mobile behavior in mobile and sociology network.
A method of optimized neural network by L-M algorithm to transformer winding hot spot temperature forecasting

NASA Astrophysics Data System (ADS)

Wei, B. G.; Wu, X. Y.; Yao, Z. F.; Huang, H.

2017-11-01

Transformers are essential devices of the power system. The accurate computation of the highest temperature (HST) of a transformer’s windings is very significant, as for the HST is a fundamental parameter in controlling the load operation mode and influencing the life time of the insulation. Based on the analysis of the heat transfer processes and the thermal characteristics inside transformers, there is taken into consideration the influence of factors like the sunshine, external wind speed etc. on the oil-immersed transformers. Experimental data and the neural network are used for modeling and protesting of the HST, and furthermore, investigations are conducted on the optimization of the structure and algorithms of neutral network are conducted. Comparison is made between the measured values and calculated values by using the recommended algorithm of IEC60076 and by using the neural network algorithm proposed by the authors; comparison that shows that the value computed with the neural network algorithm approximates better the measured value than the value computed with the algorithm proposed by IEC60076.
Fast Combinatorial Algorithm for the Solution of Linearly Constrained Least Squares Problems

DOEpatents

Van Benthem, Mark H.; Keenan, Michael R.

2008-11-11

A fast combinatorial algorithm can significantly reduce the computational burden when solving general equality and inequality constrained least squares problems with large numbers of observation vectors. The combinatorial algorithm provides a mathematically rigorous solution and operates at great speed by reorganizing the calculations to take advantage of the combinatorial nature of the problems to be solved. The combinatorial algorithm exploits the structure that exists in large-scale problems in order to minimize the number of arithmetic operations required to obtain a solution.
Aquarius Salinity Retrieval Algorithm: Final Pre-Launch Version

NASA Technical Reports Server (NTRS)

Wentz, Frank J.; Le Vine, David M.

2011-01-01

This document provides the theoretical basis for the Aquarius salinity retrieval algorithm. The inputs to the algorithm are the Aquarius antenna temperature (T(sub A)) measurements along with a number of NCEP operational products and pre-computed tables of space radiation coming from the galaxy and sun. The output is sea-surface salinity and many intermediate variables required for the salinity calculation. This revision of the Algorithm Theoretical Basis Document (ATBD) is intended to be the final pre-launch version.

Using a Nondirect Product Basis to Compute J > 0 Rovibrational States of H3+

NASA Astrophysics Data System (ADS)

Jaquet, Ralph; Carrington, Tucker

2013-10-01

We have used a Lanczos algorithm with a nondirect product basis to compute energy levels of H3+ with J values as large as 46. Energy levels computed on the potential surface of M. Pavanello, et al. (J. Chem. Phys. 2012, 136, 184303) agree well with previous calculations for low J values.
Applications of modern statistical methods to analysis of data in physical science

NASA Astrophysics Data System (ADS)

Wicker, James Eric

Modern methods of statistical and computational analysis offer solutions to dilemmas confronting researchers in physical science. Although the ideas behind modern statistical and computational analysis methods were originally introduced in the 1970's, most scientists still rely on methods written during the early era of computing. These researchers, who analyze increasingly voluminous and multivariate data sets, need modern analysis methods to extract the best results from their studies. The first section of this work showcases applications of modern linear regression. Since the 1960's, many researchers in spectroscopy have used classical stepwise regression techniques to derive molecular constants. However, problems with thresholds of entry and exit for model variables plagues this analysis method. Other criticisms of this kind of stepwise procedure include its inefficient searching method, the order in which variables enter or leave the model and problems with overfitting data. We implement an information scoring technique that overcomes the assumptions inherent in the stepwise regression process to calculate molecular model parameters. We believe that this kind of information based model evaluation can be applied to more general analysis situations in physical science. The second section proposes new methods of multivariate cluster analysis. The K-means algorithm and the EM algorithm, introduced in the 1960's and 1970's respectively, formed the basis of multivariate cluster analysis methodology for many years. However, several shortcomings of these methods include strong dependence on initial seed values and inaccurate results when the data seriously depart from hypersphericity. We propose new cluster analysis methods based on genetic algorithms that overcomes the strong dependence on initial seed values. In addition, we propose a generalization of the Genetic K-means algorithm which can accurately identify clusters with complex hyperellipsoidal covariance structures. We then use this new algorithm in a genetic algorithm based Expectation-Maximization process that can accurately calculate parameters describing complex clusters in a mixture model routine. Using the accuracy of this GEM algorithm, we assign information scores to cluster calculations in order to best identify the number of mixture components in a multivariate data set. We will showcase how these algorithms can be used to process multivariate data from astronomical observations.
Algorithmic trends in computational fluid dynamics; The Institute for Computer Applications in Science and Engineering (ICASE)/LaRC Workshop, NASA Langley Research Center, Hampton, VA, US, Sep. 15-17, 1991

NASA Technical Reports Server (NTRS)

Hussaini, M. Y. (Editor); Kumar, A. (Editor); Salas, M. D. (Editor)

1993-01-01

The purpose here is to assess the state of the art in the areas of numerical analysis that are particularly relevant to computational fluid dynamics (CFD), to identify promising new developments in various areas of numerical analysis that will impact CFD, and to establish a long-term perspective focusing on opportunities and needs. Overviews are given of discretization schemes, computational fluid dynamics, algorithmic trends in CFD for aerospace flow field calculations, simulation of compressible viscous flow, and massively parallel computation. Also discussed are accerelation methods, spectral and high-order methods, multi-resolution and subcell resolution schemes, and inherently multidimensional schemes.
Use of CYBER 203 and CYBER 205 computers for three-dimensional transonic flow calculations

NASA Technical Reports Server (NTRS)

Melson, N. D.; Keller, J. D.

1983-01-01

Experiences are discussed for modifying two three-dimensional transonic flow computer programs (FLO 22 and FLO 27) for use on the CDC CYBER 203 computer system. Both programs were originally written for use on serial machines. Several methods were attempted to optimize the execution of the two programs on the vector machine: leaving the program in a scalar form (i.e., serial computation) with compiler software used to optimize and vectorize the program, vectorizing parts of the existing algorithm in the program, and incorporating a vectorizable algorithm (ZEBRA I or ZEBRA II) in the program. Comparison runs of the programs were made on CDC CYBER 175. CYBER 203, and two pipe CDC CYBER 205 computer systems.
SU-F-T-148: Are the Approximations in Analytic Semi-Empirical Dose Calculation Algorithms for Intensity Modulated Proton Therapy for Complex Heterogeneities of Head and Neck Clinically Significant?

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yepes, P; UT MD Anderson Cancer Center, Houston, TX; Titt, U

2016-06-15

Purpose: Evaluate the differences in dose distributions between the proton analytic semi-empirical dose calculation algorithm used in the clinic and Monte Carlo calculations for a sample of 50 head-and-neck (H&N) patients and estimate the potential clinical significance of the differences. Methods: A cohort of 50 H&N patients, treated at the University of Texas Cancer Center with Intensity Modulated Proton Therapy (IMPT), were selected for evaluation of clinical significance of approximations in computed dose distributions. H&N site was selected because of the highly inhomogeneous nature of the anatomy. The Fast Dose Calculator (FDC), a fast track-repeating accelerated Monte Carlo algorithm formore » proton therapy, was utilized for the calculation of dose distributions delivered during treatment plans. Because of its short processing time, FDC allows for the processing of large cohorts of patients. FDC has been validated versus GEANT4, a full Monte Carlo system and measurements in water and for inhomogeneous phantoms. A gamma-index analysis, DVHs, EUDs, and TCP and NTCPs computed using published models were utilized to evaluate the differences between the Treatment Plan System (TPS) and FDC. Results: The Monte Carlo results systematically predict lower dose delivered in the target. The observed differences can be as large as 8 Gy, and should have a clinical impact. Gamma analysis also showed significant differences between both approaches, especially for the target volumes. Conclusion: Monte Carlo calculations with fast algorithms is practical and should be considered for the clinic, at least as a treatment plan verification tool.« less
Decision support methods for the detection of adverse events in post-marketing data.

PubMed

Hauben, M; Bate, A

2009-04-01

Spontaneous reporting is a crucial component of post-marketing drug safety surveillance despite its significant limitations. The size and complexity of some spontaneous reporting system databases represent a challenge for drug safety professionals who traditionally have relied heavily on the scientific and clinical acumen of the prepared mind. Computer algorithms that calculate statistical measures of reporting frequency for huge numbers of drug-event combinations are increasingly used to support pharamcovigilance analysts screening large spontaneous reporting system databases. After an overview of pharmacovigilance and spontaneous reporting systems, we discuss the theory and application of contemporary computer algorithms in regular use, those under development, and the practical considerations involved in the implementation of computer algorithms within a comprehensive and holistic drug safety signal detection program.
Data inversion algorithm development for the hologen occultation experiment

NASA Technical Reports Server (NTRS)

Gordley, Larry L.; Mlynczak, Martin G.

1986-01-01

The successful retrieval of atmospheric parameters from radiometric measurement requires not only the ability to do ideal radiometric calculations, but also a detailed understanding of instrument characteristics. Therefore a considerable amount of time was spent in instrument characterization in the form of test data analysis and mathematical formulation. Analyses of solar-to-reference interference (electrical cross-talk), detector nonuniformity, instrument balance error, electronic filter time-constants and noise character were conducted. A second area of effort was the development of techniques for the ideal radiometric calculations required for the Halogen Occultation Experiment (HALOE) data reduction. The computer code for these calculations must be extremely complex and fast. A scheme for meeting these requirements was defined and the algorithms needed form implementation are currently under development. A third area of work included consulting on the implementation of the Emissivity Growth Approximation (EGA) method of absorption calculation into a HALOE broadband radiometer channel retrieval algorithm.
Automated, computer-guided PASI measurements by digital image analysis versus conventional physicians' PASI calculations: study protocol for a comparative, single-centre, observational study.

PubMed

Fink, Christine; Uhlmann, Lorenz; Klose, Christina; Haenssle, Holger A

2018-05-17

Reliable and accurate assessment of severity in psoriasis is very important in order to meet indication criteria for initiation of systemic treatment or to evaluate treatment efficacy. The most acknowledged tool for measuring the extent of psoriatic skin changes is the Psoriasis Area and Severity Index (PASI). However, the calculation of PASI can be tedious and subjective and high intraobserver and interobserver variability is an important concern. Therefore, there is a great need for a standardised and objective method that guarantees a reproducible PASI calculation. Within this study we will investigate the precision and reproducibility of automated, computer-guided PASI measurements in comparison to trained physicians to address these limitations. Non-interventional analyses of PASI calculations by either physicians in a prospective versus retrospective setting or an automated computer-guided algorithm in 120 patients with plaque psoriasis. All retrospective PASI calculations by physicians or by the computer algorithm are based on total body digital images. The primary objective of this study is comparison of automated computer-guided PASI measurements by means of digital image analysis versus conventional, prospective or retrospective physicians' PASI assessments. Secondary endpoints include (1) the assessment of physicians' interobserver variance in PASI calculations, (2) the assessment of physicians' intraobserver variance in PASI assessments of the same patients' images after a time interval of at least 4 weeks, (3) the assessment of the deviation between physicians' prospective versus retrospective PASI calculations, and (4) the reproducibility of automated computer-guided PASI measurements by assessment of two sets of total body digital images of the same patients taken at one time point. Ethical approval was provided by the Ethics Committee of the Medical Faculty of the University of Heidelberg (ethics approval number S-379/2016). DRKS00011818; Results. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
A low-cost test-bed for real-time landmark tracking

NASA Astrophysics Data System (ADS)

Csaszar, Ambrus; Hanan, Jay C.; Moreels, Pierre; Assad, Christopher

2007-04-01

A low-cost vehicle test-bed system was developed to iteratively test, refine and demonstrate navigation algorithms before attempting to transfer the algorithms to more advanced rover prototypes. The platform used here was a modified radio controlled (RC) car. A microcontroller board and onboard laptop computer allow for either autonomous or remote operation via a computer workstation. The sensors onboard the vehicle represent the types currently used on NASA-JPL rover prototypes. For dead-reckoning navigation, optical wheel encoders, a single axis gyroscope, and 2-axis accelerometer were used. An ultrasound ranger is available to calculate distance as a substitute for the stereo vision systems presently used on rovers. The prototype also carries a small laptop computer with a USB camera and wireless transmitter to send real time video to an off-board computer. A real-time user interface was implemented that combines an automatic image feature selector, tracking parameter controls, streaming video viewer, and user generated or autonomous driving commands. Using the test-bed, real-time landmark tracking was demonstrated by autonomously driving the vehicle through the JPL Mars yard. The algorithms tracked rocks as waypoints. This generated coordinates calculating relative motion and visually servoing to science targets. A limitation for the current system is serial computing-each additional landmark is tracked in order-but since each landmark is tracked independently, if transferred to appropriate parallel hardware, adding targets would not significantly diminish system speed.
Cloud computing for comparative genomics

PubMed Central

2010-01-01

Background Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Computing Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. Results We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. Conclusions The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems. PMID:20482786
Cloud computing for comparative genomics.

PubMed

Wall, Dennis P; Kudtarkar, Parul; Fusaro, Vincent A; Pivovarov, Rimma; Patil, Prasad; Tonellato, Peter J

2010-05-18

Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Computing Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems.
Real-time implementation of optimized maximum noise fraction transform for feature extraction of hyperspectral images

NASA Astrophysics Data System (ADS)

Wu, Yuanfeng; Gao, Lianru; Zhang, Bing; Zhao, Haina; Li, Jun

2014-01-01

We present a parallel implementation of the optimized maximum noise fraction (G-OMNF) transform algorithm for feature extraction of hyperspectral images on commodity graphics processing units (GPUs). The proposed approach explored the algorithm data-level concurrency and optimized the computing flow. We first defined a three-dimensional grid, in which each thread calculates a sub-block data to easily facilitate the spatial and spectral neighborhood data searches in noise estimation, which is one of the most important steps involved in OMNF. Then, we optimized the processing flow and computed the noise covariance matrix before computing the image covariance matrix to reduce the original hyperspectral image data transmission. These optimization strategies can greatly improve the computing efficiency and can be applied to other feature extraction algorithms. The proposed parallel feature extraction algorithm was implemented on an Nvidia Tesla GPU using the compute unified device architecture and basic linear algebra subroutines library. Through the experiments on several real hyperspectral images, our GPU parallel implementation provides a significant speedup of the algorithm compared with the CPU implementation, especially for highly data parallelizable and arithmetically intensive algorithm parts, such as noise estimation. In order to further evaluate the effectiveness of G-OMNF, we used two different applications: spectral unmixing and classification for evaluation. Considering the sensor scanning rate and the data acquisition time, the proposed parallel implementation met the on-board real-time feature extraction.
Computational Approaches to Simulation and Optimization of Global Aircraft Trajectories

NASA Technical Reports Server (NTRS)

Ng, Hok Kwan; Sridhar, Banavar

2016-01-01

This study examines three possible approaches to improving the speed in generating wind-optimal routes for air traffic at the national or global level. They are: (a) using the resources of a supercomputer, (b) running the computations on multiple commercially available computers and (c) implementing those same algorithms into NASAs Future ATM Concepts Evaluation Tool (FACET) and compares those to a standard implementation run on a single CPU. Wind-optimal aircraft trajectories are computed using global air traffic schedules. The run time and wait time on the supercomputer for trajectory optimization using various numbers of CPUs ranging from 80 to 10,240 units are compared with the total computational time for running the same computation on a single desktop computer and on multiple commercially available computers for potential computational enhancement through parallel processing on the computer clusters. This study also re-implements the trajectory optimization algorithm for further reduction of computational time through algorithm modifications and integrates that with FACET to facilitate the use of the new features which calculate time-optimal routes between worldwide airport pairs in a wind field for use with existing FACET applications. The implementations of trajectory optimization algorithms use MATLAB, Python, and Java programming languages. The performance evaluations are done by comparing their computational efficiencies and based on the potential application of optimized trajectories. The paper shows that in the absence of special privileges on a supercomputer, a cluster of commercially available computers provides a feasible approach for national and global air traffic system studies.
Performance of a plasma fluid code on the Intel parallel computers

NASA Technical Reports Server (NTRS)

Lynch, V. E.; Carreras, B. A.; Drake, J. B.; Leboeuf, J. N.; Liewer, P.

1992-01-01

One approach to improving the real-time efficiency of plasma turbulence calculations is to use a parallel algorithm. A parallel algorithm for plasma turbulence calculations was tested on the Intel iPSC/860 hypercube and the Touchtone Delta machine. Using the 128 processors of the Intel iPSC/860 hypercube, a factor of 5 improvement over a single-processor CRAY-2 is obtained. For the Touchtone Delta machine, the corresponding improvement factor is 16. For plasma edge turbulence calculations, an extrapolation of the present results to the Intel (sigma) machine gives an improvement factor close to 64 over the single-processor CRAY-2.
Simple computer method provides contours for radiological images

NASA Technical Reports Server (NTRS)

Newell, J. D.; Keller, R. A.; Baily, N. A.

1975-01-01

Computer is provided with information concerning boundaries in total image. Gradient of each point in digitized image is calculated with aid of threshold technique; then there is invoked set of algorithms designed to reduce number of gradient elements and to retain only major ones for definition of contour.
Design and implementation of a vision-based hovering and feature tracking algorithm for a quadrotor

NASA Astrophysics Data System (ADS)

Lee, Y. H.; Chahl, J. S.

2016-10-01

This paper demonstrates an approach to the vision-based control of the unmanned quadrotors for hover and object tracking. The algorithms used the Speed Up Robust Features (SURF) algorithm to detect objects. The pose of the object in the image was then calculated in order to pass the pose information to the flight controller. Finally, the flight controller steered the quadrotor to approach the object based on the calculated pose data. The above processes was run using standard onboard resources found in the 3DR Solo quadrotor in an embedded computing environment. The obtained results showed that the algorithm behaved well during its missions, tracking and hovering, although there were significant latencies due to low CPU performance of the onboard image processing system.
Fast generation of computer-generated holograms using wavelet shrinkage.

PubMed

Shimobaba, Tomoyoshi; Ito, Tomoyoshi

2017-01-09

Computer-generated holograms (CGHs) are generated by superimposing complex amplitudes emitted from a number of object points. However, this superposition process remains very time-consuming even when using the latest computers. We propose a fast calculation algorithm for CGHs that uses a wavelet shrinkage method, eliminating small wavelet coefficient values to express approximated complex amplitudes using only a few representative wavelet coefficients.
About improving efficiency of the P3 M algorithms when computing the inter-particle forces in beam dynamics

NASA Astrophysics Data System (ADS)

Kozynchenko, Alexander I.; Kozynchenko, Sergey A.

2017-03-01

In the paper, a problem of improving efficiency of the particle-particle- particle-mesh (P3M) algorithm in computing the inter-particle electrostatic forces is considered. The particle-mesh (PM) part of the algorithm is modified in such a way that the space field equation is solved by the direct method of summation of potentials over the ensemble of particles lying not too close to a reference particle. For this purpose, a specific matrix "pattern" is introduced to describe the spatial field distribution of a single point charge, so the "pattern" contains pre-calculated potential values. This approach allows to reduce a set of arithmetic operations performed at the innermost of nested loops down to an addition and assignment operators and, therefore, to decrease the running time substantially. The simulation model developed in C++ substantiates this view, showing the descent accuracy acceptable in particle beam calculations together with the improved speed performance.
Normalization Of Thermal-Radiation Form-Factor Matrix

NASA Technical Reports Server (NTRS)

Tsuyuki, Glenn T.

1994-01-01

Report describes algorithm that adjusts form-factor matrix in TRASYS computer program, which calculates intraspacecraft radiative interchange among various surfaces and environmental heat loading from sources such as sun.
A Computer Program for the Calculation of Three-Dimensional Transonic Nacelle/Inlet Flowfields

NASA Technical Reports Server (NTRS)

Vadyak, J.; Atta, E. H.

1983-01-01

A highly efficient computer analysis was developed for predicting transonic nacelle/inlet flowfields. This algorithm can compute the three dimensional transonic flowfield about axisymmetric (or asymmetric) nacelle/inlet configurations at zero or nonzero incidence. The flowfield is determined by solving the full-potential equation in conservative form on a body-fitted curvilinear computational mesh. The difference equations are solved using the AF2 approximate factorization scheme. This report presents a discussion of the computational methods used to both generate the body-fitted curvilinear mesh and to obtain the inviscid flow solution. Computed results and correlations with existing methods and experiment are presented. Also presented are discussions on the organization of the grid generation (NGRIDA) computer program and the flow solution (NACELLE) computer program, descriptions of the respective subroutines, definitions of the required input parameters for both algorithms, a brief discussion on interpretation of the output, and sample cases to illustrate application of the analysis.

Development of a real-time aeroperformance analysis technique for the X-29A advanced technology demonstrator

NASA Technical Reports Server (NTRS)

Ray, R. J.; Hicks, J. W.; Alexander, R. I.

1988-01-01

The X-29A advanced technology demonstrator has shown the practicality and advantages of the capability to compute and display, in real time, aeroperformance flight results. This capability includes the calculation of the in-flight measured drag polar, lift curve, and aircraft specific excess power. From these elements many other types of aeroperformance measurements can be computed and analyzed. The technique can be used to give an immediate postmaneuver assessment of data quality and maneuver technique, thus increasing the productivity of a flight program. A key element of this new method was the concurrent development of a real-time in-flight net thrust algorithm, based on the simplified gross thrust method. This net thrust algorithm allows for the direct calculation of total aircraft drag.
Calculating the Mean Amplitude of Glycemic Excursions from Continuous Glucose Data Using an Open-Code Programmable Algorithm Based on the Integer Nonlinear Method.

PubMed

Yu, Xuefei; Lin, Liangzhuo; Shen, Jie; Chen, Zhi; Jian, Jun; Li, Bin; Xin, Sherman Xuegang

2018-01-01

The mean amplitude of glycemic excursions (MAGE) is an essential index for glycemic variability assessment, which is treated as a key reference for blood glucose controlling at clinic. However, the traditional "ruler and pencil" manual method for the calculation of MAGE is time-consuming and prone to error due to the huge data size, making the development of robust computer-aided program an urgent requirement. Although several software products are available instead of manual calculation, poor agreement among them is reported. Therefore, more studies are required in this field. In this paper, we developed a mathematical algorithm based on integer nonlinear programming. Following the proposed mathematical method, an open-code computer program named MAGECAA v1.0 was developed and validated. The results of the statistical analysis indicated that the developed program was robust compared to the manual method. The agreement among the developed program and currently available popular software is satisfied, indicating that the worry about the disagreement among different software products is not necessary. The open-code programmable algorithm is an extra resource for those peers who are interested in the related study on methodology in the future.
Design for a Crane Metallic Structure Based on Imperialist Competitive Algorithm and Inverse Reliability Strategy

NASA Astrophysics Data System (ADS)

Fan, Xiao-Ning; Zhi, Bo

2017-07-01

Uncertainties in parameters such as materials, loading, and geometry are inevitable in designing metallic structures for cranes. When considering these uncertainty factors, reliability-based design optimization (RBDO) offers a more reasonable design approach. However, existing RBDO methods for crane metallic structures are prone to low convergence speed and high computational cost. A unilevel RBDO method, combining a discrete imperialist competitive algorithm with an inverse reliability strategy based on the performance measure approach, is developed. Application of the imperialist competitive algorithm at the optimization level significantly improves the convergence speed of this RBDO method. At the reliability analysis level, the inverse reliability strategy is used to determine the feasibility of each probabilistic constraint at each design point by calculating its α-percentile performance, thereby avoiding convergence failure, calculation error, and disproportionate computational effort encountered using conventional moment and simulation methods. Application of the RBDO method to an actual crane structure shows that the developed RBDO realizes a design with the best tradeoff between economy and safety together with about one-third of the convergence speed and the computational cost of the existing method. This paper provides a scientific and effective design approach for the design of metallic structures of cranes.
Unmitigated numerical solution to the diffraction term in the parabolic nonlinear ultrasound wave equation.

PubMed

Hasani, Mojtaba H; Gharibzadeh, Shahriar; Farjami, Yaghoub; Tavakkoli, Jahan

2013-09-01

Various numerical algorithms have been developed to solve the Khokhlov-Kuznetsov-Zabolotskaya (KZK) parabolic nonlinear wave equation. In this work, a generalized time-domain numerical algorithm is proposed to solve the diffraction term of the KZK equation. This algorithm solves the transverse Laplacian operator of the KZK equation in three-dimensional (3D) Cartesian coordinates using a finite-difference method based on the five-point implicit backward finite difference and the five-point Crank-Nicolson finite difference discretization techniques. This leads to a more uniform discretization of the Laplacian operator which in turn results in fewer calculation gridding nodes without compromising accuracy in the diffraction term. In addition, a new empirical algorithm based on the LU decomposition technique is proposed to solve the system of linear equations obtained from this discretization. The proposed empirical algorithm improves the calculation speed and memory usage, while the order of computational complexity remains linear in calculation of the diffraction term in the KZK equation. For evaluating the accuracy of the proposed algorithm, two previously published algorithms are used as comparison references: the conventional 2D Texas code and its generalization for 3D geometries. The results show that the accuracy/efficiency performance of the proposed algorithm is comparable with the established time-domain methods.
Scalable load balancing for massively parallel distributed Monte Carlo particle transport

DOE Office of Scientific and Technical Information (OSTI.GOV)

O'Brien, M. J.; Brantley, P. S.; Joy, K. I.

2013-07-01

In order to run computer simulations efficiently on massively parallel computers with hundreds of thousands or millions of processors, care must be taken that the calculation is load balanced across the processors. Examining the workload of every processor leads to an unscalable algorithm, with run time at least as large as O(N), where N is the number of processors. We present a scalable load balancing algorithm, with run time 0(log(N)), that involves iterated processor-pair-wise balancing steps, ultimately leading to a globally balanced workload. We demonstrate scalability of the algorithm up to 2 million processors on the Sequoia supercomputer at Lawrencemore » Livermore National Laboratory. (authors)« less
Real-Gas Flow Properties for NASA Langley Research Center Aerothermodynamic Facilities Complex Wind Tunnels

NASA Technical Reports Server (NTRS)

Hollis, Brian R.

1996-01-01

A computational algorithm has been developed which can be employed to determine the flow properties of an arbitrary real (virial) gas in a wind tunnel. A multiple-coefficient virial gas equation of state and the assumption of isentropic flow are used to model the gas and to compute flow properties throughout the wind tunnel. This algorithm has been used to calculate flow properties for the wind tunnels of the Aerothermodynamics Facilities Complex at the NASA Langley Research Center, in which air, CF4. He, and N2 are employed as test gases. The algorithm is detailed in this paper and sample results are presented for each of the Aerothermodynamic Facilities Complex wind tunnels.
DE and NLP Based QPLS Algorithm

NASA Astrophysics Data System (ADS)

Yu, Xiaodong; Huang, Dexian; Wang, Xiong; Liu, Bo

As a novel evolutionary computing technique, Differential Evolution (DE) has been considered to be an effective optimization method for complex optimization problems, and achieved many successful applications in engineering. In this paper, a new algorithm of Quadratic Partial Least Squares (QPLS) based on Nonlinear Programming (NLP) is presented. And DE is used to solve the NLP so as to calculate the optimal input weights and the parameters of inner relationship. The simulation results based on the soft measurement of diesel oil solidifying point on a real crude distillation unit demonstrate that the superiority of the proposed algorithm to linear PLS and QPLS which is based on Sequential Quadratic Programming (SQP) in terms of fitting accuracy and computational costs.
Identification of Steady and Non-Steady Gait of Humanexoskeleton Walking System

NASA Astrophysics Data System (ADS)

Żur, K. K.

2013-08-01

In this paper a method of analysis of exoskeleton multistep locomotion was presented by using a computer with the preinstalled DChC program. The paper also presents a way to analytically calculate the ",motion indicator", as well as the algorithm calculating its two derivatives. The algorithm developed by the author processes data collected from the investigation and then a program presents the obtained final results. Research into steady and non-steady multistep locomotion can be used to design two-legged robots of DAR type and exoskeleton control system
Simple geometric algorithms to aid in clearance management for robotic mechanisms

NASA Technical Reports Server (NTRS)

Copeland, E. L.; Ray, L. D.; Peticolas, J. D.

1981-01-01

Global geometric shapes such as lines, planes, circles, spheres, cylinders, and the associated computational algorithms which provide relatively inexpensive estimates of minimum spatial clearance for safe operations were selected. The Space Shuttle, remote manipulator system, and the Power Extension Package are used as an example. Robotic mechanisms operate in quarters limited by external structures and the problem of clearance is often of considerable interest. Safe clearance management is simple and suited to real time calculation, whereas contact prediction requires more precision, sophistication, and computational overhead.
Node fingerprinting: an efficient heuristic for aligning biological networks.

PubMed

Radu, Alex; Charleston, Michael

2014-10-01

With the continuing increase in availability of biological data and improvements to biological models, biological network analysis has become a promising area of research. An emerging technique for the analysis of biological networks is through network alignment. Network alignment has been used to calculate genetic distance, similarities between regulatory structures, and the effect of external forces on gene expression, and to depict conditional activity of expression modules in cancer. Network alignment is algorithmically complex, and therefore we must rely on heuristics, ideally as efficient and accurate as possible. The majority of current techniques for network alignment rely on precomputed information, such as with protein sequence alignment, or on tunable network alignment parameters, which may introduce an increased computational overhead. Our presented algorithm, which we call Node Fingerprinting (NF), is appropriate for performing global pairwise network alignment without precomputation or tuning, can be fully parallelized, and is able to quickly compute an accurate alignment between two biological networks. It has performed as well as or better than existing algorithms on biological and simulated data, and with fewer computational resources. The algorithmic validation performed demonstrates the low computational resource requirements of NF.
Enhanced calculation of eigen-stress field and elastic energy in atomistic interdiffusion of alloys

NASA Astrophysics Data System (ADS)

Cecilia, José M.; Hernández-Díaz, A. M.; Castrillo, Pedro; Jiménez-Alonso, J. F.

2017-02-01

The structural evolution of alloys is affected by the elastic energy associated to eigen-stress fields. However, efficient calculations of the elastic energy in evolving geometries are actually a great challenge in promising atomistic simulation techniques such as Kinetic Monte Carlo (KMC) methods. In this paper, we report two complementary algorithms to calculate the eigen-stress field by linear superposition (a.k.a. LSA, Lineal Superposition Algorithm) and the elastic energy modification in atomistic interdiffusion of alloys (the Atom Exchange Elastic Energy Evaluation (AE4) Algorithm). LSA is shown to be appropriated for fast incremental stress calculation in highly nanostructured materials, whereas AE4 provides the required input for KMC and, additionally, it can be used to evaluate the accuracy of the eigen-stress field calculated by LSA. Consequently, they are suitable to be used on-the-fly with KMC. Both algorithms are massively parallel by their definition and thus well-suited for their parallelization on modern Graphics Processing Units (GPUs). Our computational studies confirm that we can obtain significant improvements compared to conventional Finite Element Methods, and the utilization of GPUs opens up new possibilities for the development of these methods in atomistic simulation of materials.
FAST-PT: a novel algorithm to calculate convolution integrals in cosmological perturbation theory

DOE Office of Scientific and Technical Information (OSTI.GOV)

McEwen, Joseph E.; Fang, Xiao; Hirata, Christopher M.

2016-09-01

We present a novel algorithm, FAST-PT, for performing convolution or mode-coupling integrals that appear in nonlinear cosmological perturbation theory. The algorithm uses several properties of gravitational structure formation—the locality of the dark matter equations and the scale invariance of the problem—as well as Fast Fourier Transforms to describe the input power spectrum as a superposition of power laws. This yields extremely fast performance, enabling mode-coupling integral computations fast enough to embed in Monte Carlo Markov Chain parameter estimation. We describe the algorithm and demonstrate its application to calculating nonlinear corrections to the matter power spectrum, including one-loop standard perturbation theorymore » and the renormalization group approach. We also describe our public code (in Python) to implement this algorithm. The code, along with a user manual and example implementations, is available at https://github.com/JoeMcEwen/FAST-PT.« less
Evaluation of a New Backtrack Free Path Planning Algorithm for Manipulators

NASA Astrophysics Data System (ADS)

Islam, Md. Nazrul; Tamura, Shinsuke; Murata, Tomonari; Yanase, Tatsuro

This paper evaluates a newly proposed backtrack free path planning algorithm (BFA) for manipulators. BFA is an exact algorithm, i.e. it is resolution complete. Different from existing resolution complete algorithms, its computation time and memory space are proportional to the number of arms. Therefore paths can be calculated within practical and predetermined time even for manipulators with many arms, and it becomes possible to plan complicated motions of multi-arm manipulators in fully automated environments. The performance of BFA is evaluated for 2-dimensional environments while changing the number of arms and obstacle placements. Its performance under locus and attitude constraints is also evaluated. Evaluation results show that the computation volume of the algorithm is almost the same as the theoretical one, i.e. it increases linearly with the number of arms even in complicated environments. Moreover BFA achieves the constant performance independent of environments.
Deep first formal concept search.

PubMed

Zhang, Tao; Li, Hui; Hong, Wenxue; Yuan, Xiamei; Wei, Xinyu

2014-01-01

The calculation of formal concepts is a very important part in the theory of formal concept analysis (FCA); however, within the framework of FCA, computing all formal concepts is the main challenge because of its exponential complexity and difficulty in visualizing the calculating process. With the basic idea of Depth First Search, this paper presents a visualization algorithm by the attribute topology of formal context. Limited by the constraints and calculation rules, all concepts are achieved by the visualization global formal concepts searching, based on the topology degenerated with the fixed start and end points, without repetition and omission. This method makes the calculation of formal concepts precise and easy to operate and reflects the integrity of the algorithm, which enables it to be suitable for visualization analysis.
Localization Algorithm Based on a Spring Model (LASM) for Large Scale Wireless Sensor Networks.

PubMed

Chen, Wanming; Mei, Tao; Meng, Max Q-H; Liang, Huawei; Liu, Yumei; Li, Yangming; Li, Shuai

2008-03-15

A navigation method for a lunar rover based on large scale wireless sensornetworks is proposed. To obtain high navigation accuracy and large exploration area, highnode localization accuracy and large network scale are required. However, thecomputational and communication complexity and time consumption are greatly increasedwith the increase of the network scales. A localization algorithm based on a spring model(LASM) method is proposed to reduce the computational complexity, while maintainingthe localization accuracy in large scale sensor networks. The algorithm simulates thedynamics of physical spring system to estimate the positions of nodes. The sensor nodesare set as particles with masses and connected with neighbor nodes by virtual springs. Thevirtual springs will force the particles move to the original positions, the node positionscorrespondingly, from the randomly set positions. Therefore, a blind node position can bedetermined from the LASM algorithm by calculating the related forces with the neighbornodes. The computational and communication complexity are O(1) for each node, since thenumber of the neighbor nodes does not increase proportionally with the network scale size.Three patches are proposed to avoid local optimization, kick out bad nodes and deal withnode variation. Simulation results show that the computational and communicationcomplexity are almost constant despite of the increase of the network scale size. The time consumption has also been proven to remain almost constant since the calculation steps arealmost unrelated with the network scale size.
Reducing Earth Topography Resolution for SMAP Mission Ground Tracks Using K-Means Clustering

NASA Technical Reports Server (NTRS)

Rizvi, Farheen

2013-01-01

The K-means clustering algorithm is used to reduce Earth topography resolution for the SMAP mission ground tracks. As SMAP propagates in orbit, knowledge of the radar antenna footprints on Earth is required for the antenna misalignment calibration. Each antenna footprint contains a latitude and longitude location pair on the Earth surface. There are 400 pairs in one data set for the calibration model. It is computationally expensive to calculate corresponding Earth elevation for these data pairs. Thus, the antenna footprint resolution is reduced. Similar topographical data pairs are grouped together with the K-means clustering algorithm. The resolution is reduced to the mean of each topographical cluster called the cluster centroid. The corresponding Earth elevation for each cluster centroid is assigned to the entire group. Results show that 400 data points are reduced to 60 while still maintaining algorithm performance and computational efficiency. In this work, sensitivity analysis is also performed to show a trade-off between algorithm performance versus computational efficiency as the number of cluster centroids and algorithm iterations are increased.
Development and Validation of a Novel Fusion Algorithm for Continuous, Accurate and Automated R-wave Detection and Calculation of Signal-Derived Metrics

DTIC Science & Technology

2013-01-01

Predicting the onset of atrial fibrillation : the Computers in Cardiology Challenge 2001. Comput Cardiol 2001;28:113-6. [22] Moody GB, Mark RG, Goldberger AL...Computers in Cardiology Challenge 2006: QT interval measurement. Comput Cardiol 2006;33:313-6. [18] Moody GB. Spontaneous termination of atrial ... fibrillation : a challenge from PhysioNet and Computers in Cardiology 2004. Comput Cardiol 2004;31:101-4. [19] Moody GB, Jager F. Distinguishing ischemic from non
Precise locating approach of the beacon based on gray gradient segmentation interpolation in satellite optical communications.

PubMed

Wang, Qiang; Liu, Yuefei; Chen, Yiqiang; Ma, Jing; Tan, Liying; Yu, Siyuan

2017-03-01

Accurate location computation for a beacon is an important factor of the reliability of satellite optical communications. However, location precision is generally limited by the resolution of CCD. How to improve the location precision of a beacon is an important and urgent issue. In this paper, we present two precise centroid computation methods for locating a beacon in satellite optical communications. First, in terms of its characteristics, the beacon is divided into several parts according to the gray gradients. Afterward, different numbers of interpolation points and different interpolation methods are applied in the interpolation area; we calculate the centroid position after interpolation and choose the best strategy according to the algorithm. The method is called a "gradient segmentation interpolation approach," or simply, a GSI (gradient segmentation interpolation) algorithm. To take full advantage of the pixels of the beacon's central portion, we also present an improved segmentation square weighting (SSW) algorithm, whose effectiveness is verified by the simulation experiment. Finally, an experiment is established to verify GSI and SSW algorithms. The results indicate that GSI and SSW algorithms can improve locating accuracy over that calculated by a traditional gray centroid method. These approaches help to greatly improve the location precision for a beacon in satellite optical communications.
Algorithm development for Maxwell's equations for computational electromagnetism

NASA Technical Reports Server (NTRS)

Goorjian, Peter M.

1990-01-01

A new algorithm has been developed for solving Maxwell's equations for the electromagnetic field. It solves the equations in the time domain with central, finite differences. The time advancement is performed implicitly, using an alternating direction implicit procedure. The space discretization is performed with finite volumes, using curvilinear coordinates with electromagnetic components along those directions. Sample calculations are presented of scattering from a metal pin, a square and a circle to demonstrate the capabilities of the new algorithm.
Overview of fast algorithm in 3D dynamic holographic display

NASA Astrophysics Data System (ADS)

Liu, Juan; Jia, Jia; Pan, Yijie; Wang, Yongtian

2013-08-01

3D dynamic holographic display is one of the most attractive techniques for achieving real 3D vision with full depth cue without any extra devices. However, huge 3D information and data should be preceded and be computed in real time for generating the hologram in 3D dynamic holographic display, and it is a challenge even for the most advanced computer. Many fast algorithms are proposed for speeding the calculation and reducing the memory usage, such as:look-up table (LUT), compressed look-up table (C-LUT), split look-up table (S-LUT), and novel look-up table (N-LUT) based on the point-based method, and full analytical polygon-based methods, one-step polygon-based method based on the polygon-based method. In this presentation, we overview various fast algorithms based on the point-based method and the polygon-based method, and focus on the fast algorithm with low memory usage, the C-LUT, and one-step polygon-based method by the 2D Fourier analysis of the 3D affine transformation. The numerical simulations and the optical experiments are presented, and several other algorithms are compared. The results show that the C-LUT algorithm and the one-step polygon-based method are efficient methods for saving calculation time. It is believed that those methods could be used in the real-time 3D holographic display in future.

Real-time aerodynamic heating and surface temperature calculations for hypersonic flight simulation

NASA Technical Reports Server (NTRS)

Quinn, Robert D.; Gong, Leslie

1990-01-01

A real-time heating algorithm was derived and installed on the Ames Research Center Dryden Flight Research Facility real-time flight simulator. This program can calculate two- and three-dimensional stagnation point surface heating rates and surface temperatures. The two-dimensional calculations can be made with or without leading-edge sweep. In addition, upper and lower surface heating rates and surface temperatures for flat plates, wedges, and cones can be calculated. Laminar or turbulent heating can be calculated, with boundary-layer transition made a function of free-stream Reynolds number and free-stream Mach number. Real-time heating rates and surface temperatures calculated for a generic hypersonic vehicle are presented and compared with more exact values computed by a batch aeroheating program. As these comparisons show, the heating algorithm used on the flight simulator calculates surface heating rates and temperatures well within the accuracy required to evaluate flight profiles for acceptable heating trajectories.
Delineation of First-Order Elastic Property Closures for Hexagonal Metals Using Fast Fourier Transforms

PubMed Central

Landry, Nicholas W.; Knezevic, Marko

2015-01-01

Property closures are envelopes representing the complete set of theoretically feasible macroscopic property combinations for a given material system. In this paper, we present a computational procedure based on fast Fourier transforms (FFTs) for delineation of elastic property closures for hexagonal close packed (HCP) metals. The procedure consists of building a database of non-zero Fourier transforms for each component of the elastic stiffness tensor, calculating the Fourier transforms of orientation distribution functions (ODFs), and calculating the ODF-to-elastic property bounds in the Fourier space. In earlier studies, HCP closures were computed using the generalized spherical harmonics (GSH) representation and an assumption of orthotropic sample symmetry; here, the FFT approach allowed us to successfully calculate the closures for a range of HCP metals without invoking any sample symmetry assumption. The methodology presented here facilitates for the first time computation of property closures involving normal-shear coupling stiffness coefficients. We found that the representation of these property linkages using FFTs need more terms compared to GSH representations. However, the use of FFT representations reduces the computational time involved in producing the property closures due to the use of fast FFT algorithms. Moreover, FFT algorithms are readily available as opposed to GSH codes. PMID:28793566
Toward ab initio molecular dynamics modeling for sum-frequency generation spectra; an efficient algorithm based on surface-specific velocity-velocity correlation function.

PubMed

Ohto, Tatsuhiko; Usui, Kota; Hasegawa, Taisuke; Bonn, Mischa; Nagata, Yuki

2015-09-28

Interfacial water structures have been studied intensively by probing the O-H stretch mode of water molecules using sum-frequency generation (SFG) spectroscopy. This surface-specific technique is finding increasingly widespread use, and accordingly, computational approaches to calculate SFG spectra using molecular dynamics (MD) trajectories of interfacial water molecules have been developed and employed to correlate specific spectral signatures with distinct interfacial water structures. Such simulations typically require relatively long (several nanoseconds) MD trajectories to allow reliable calculation of the SFG response functions through the dipole moment-polarizability time correlation function. These long trajectories limit the use of computationally expensive MD techniques such as ab initio MD and centroid MD simulations. Here, we present an efficient algorithm determining the SFG response from the surface-specific velocity-velocity correlation function (ssVVCF). This ssVVCF formalism allows us to calculate SFG spectra using a MD trajectory of only ∼100 ps, resulting in the substantial reduction of the computational costs, by almost an order of magnitude. We demonstrate that the O-H stretch SFG spectra at the water-air interface calculated by using the ssVVCF formalism well reproduce those calculated by using the dipole moment-polarizability time correlation function. Furthermore, we applied this ssVVCF technique for computing the SFG spectra from the ab initio MD trajectories with various density functionals. We report that the SFG responses computed from both ab initio MD simulations and MD simulations with an ab initio based force field model do not show a positive feature in its imaginary component at 3100 cm(-1).
Autumn Algorithm-Computation of Hybridization Networks for Realistic Phylogenetic Trees.

PubMed

Huson, Daniel H; Linz, Simone

2018-01-01

A minimum hybridization network is a rooted phylogenetic network that displays two given rooted phylogenetic trees using a minimum number of reticulations. Previous mathematical work on their calculation has usually assumed the input trees to be bifurcating, correctly rooted, or that they both contain the same taxa. These assumptions do not hold in biological studies and "realistic" trees have multifurcations, are difficult to root, and rarely contain the same taxa. We present a new algorithm for computing minimum hybridization networks for a given pair of "realistic" rooted phylogenetic trees. We also describe how the algorithm might be used to improve the rooting of the input trees. We introduce the concept of "autumn trees", a nice framework for the formulation of algorithms based on the mathematics of "maximum acyclic agreement forests". While the main computational problem is hard, the run-time depends mainly on how different the given input trees are. In biological studies, where the trees are reasonably similar, our parallel implementation performs well in practice. The algorithm is available in our open source program Dendroscope 3, providing a platform for biologists to explore rooted phylogenetic networks. We demonstrate the utility of the algorithm using several previously studied data sets.
Hamiltonian lattice field theory: Computer calculations using variational methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zako, Robert L.

1991-12-03

I develop a variational method for systematic numerical computation of physical quantities -- bound state energies and scattering amplitudes -- in quantum field theory. An infinite-volume, continuum theory is approximated by a theory on a finite spatial lattice, which is amenable to numerical computation. I present an algorithm for computing approximate energy eigenvalues and eigenstates in the lattice theory and for bounding the resulting errors. I also show how to select basis states and choose variational parameters in order to minimize errors. The algorithm is based on the Rayleigh-Ritz principle and Kato`s generalizations of Temple`s formula. The algorithm could bemore » adapted to systems such as atoms and molecules. I show how to compute Green`s functions from energy eigenvalues and eigenstates in the lattice theory, and relate these to physical (renormalized) coupling constants, bound state energies and Green`s functions. Thus one can compute approximate physical quantities in a lattice theory that approximates a quantum field theory with specified physical coupling constants. I discuss the errors in both approximations. In principle, the errors can be made arbitrarily small by increasing the size of the lattice, decreasing the lattice spacing and computing sufficiently long. Unfortunately, I do not understand the infinite-volume and continuum limits well enough to quantify errors due to the lattice approximation. Thus the method is currently incomplete. I apply the method to real scalar field theories using a Fock basis of free particle states. All needed quantities can be calculated efficiently with this basis. The generalization to more complicated theories is straightforward. I describe a computer implementation of the method and present numerical results for simple quantum mechanical systems.« less
Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

NASA Astrophysics Data System (ADS)

Shao, Meiyue; Aktulga, H. Metin; Yang, Chao; Ng, Esmond G.; Maris, Pieter; Vary, James P.

2018-01-01

We describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of special structure of the nuclear configuration interaction problem which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. We also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.
Post-processing interstitialcy diffusion from molecular dynamics simulations

NASA Astrophysics Data System (ADS)

Bhardwaj, U.; Bukkuru, S.; Warrier, M.

2016-01-01

An algorithm to rigorously trace the interstitialcy diffusion trajectory in crystals is developed. The algorithm incorporates unsupervised learning and graph optimization which obviate the need to input extra domain specific information depending on crystal or temperature of the simulation. The algorithm is implemented in a flexible framework as a post-processor to molecular dynamics (MD) simulations. We describe in detail the reduction of interstitialcy diffusion into known computational problems of unsupervised clustering and graph optimization. We also discuss the steps, computational efficiency and key components of the algorithm. Using the algorithm, thermal interstitialcy diffusion from low to near-melting point temperatures is studied. We encapsulate the algorithms in a modular framework with functionality to calculate diffusion coefficients, migration energies and other trajectory properties. The study validates the algorithm by establishing the conformity of output parameters with experimental values and provides detailed insights for the interstitialcy diffusion mechanism. The algorithm along with the help of supporting visualizations and analysis gives convincing details and a new approach to quantifying diffusion jumps, jump-lengths, time between jumps and to identify interstitials from lattice atoms.
Post-processing interstitialcy diffusion from molecular dynamics simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhardwaj, U., E-mail: haptork@gmail.com; Bukkuru, S.; Warrier, M.

2016-01-15

An algorithm to rigorously trace the interstitialcy diffusion trajectory in crystals is developed. The algorithm incorporates unsupervised learning and graph optimization which obviate the need to input extra domain specific information depending on crystal or temperature of the simulation. The algorithm is implemented in a flexible framework as a post-processor to molecular dynamics (MD) simulations. We describe in detail the reduction of interstitialcy diffusion into known computational problems of unsupervised clustering and graph optimization. We also discuss the steps, computational efficiency and key components of the algorithm. Using the algorithm, thermal interstitialcy diffusion from low to near-melting point temperatures ismore » studied. We encapsulate the algorithms in a modular framework with functionality to calculate diffusion coefficients, migration energies and other trajectory properties. The study validates the algorithm by establishing the conformity of output parameters with experimental values and provides detailed insights for the interstitialcy diffusion mechanism. The algorithm along with the help of supporting visualizations and analysis gives convincing details and a new approach to quantifying diffusion jumps, jump-lengths, time between jumps and to identify interstitials from lattice atoms. -- Graphical abstract:.« less
Patient-specific CT dosimetry calculation: a feasibility study.

PubMed

Fearon, Thomas; Xie, Huchen; Cheng, Jason Y; Ning, Holly; Zhuge, Ying; Miller, Robert W

2011-11-15

Current estimation of radiation dose from computed tomography (CT) scans on patients has relied on the measurement of Computed Tomography Dose Index (CTDI) in standard cylindrical phantoms, and calculations based on mathematical representations of "standard man". Radiation dose to both adult and pediatric patients from a CT scan has been a concern, as noted in recent reports. The purpose of this study was to investigate the feasibility of adapting a radiation treatment planning system (RTPS) to provide patient-specific CT dosimetry. A radiation treatment planning system was modified to calculate patient-specific CT dose distributions, which can be represented by dose at specific points within an organ of interest, as well as organ dose-volumes (after image segmentation) for a GE Light Speed Ultra Plus CT scanner. The RTPS calculation algorithm is based on a semi-empirical, measured correction-based algorithm, which has been well established in the radiotherapy community. Digital representations of the physical phantoms (virtual phantom) were acquired with the GE CT scanner in axial mode. Thermoluminescent dosimeter (TLDs) measurements in pediatric anthropomorphic phantoms were utilized to validate the dose at specific points within organs of interest relative to RTPS calculations and Monte Carlo simulations of the same virtual phantoms (digital representation). Congruence of the calculated and measured point doses for the same physical anthropomorphic phantom geometry was used to verify the feasibility of the method. The RTPS algorithm can be extended to calculate the organ dose by calculating a dose distribution point-by-point for a designated volume. Electron Gamma Shower (EGSnrc) codes for radiation transport calculations developed by National Research Council of Canada (NRCC) were utilized to perform the Monte Carlo (MC) simulation. In general, the RTPS and MC dose calculations are within 10% of the TLD measurements for the infant and child chest scans. With respect to the dose comparisons for the head, the RTPS dose calculations are slightly higher (10%-20%) than the TLD measurements, while the MC results were within 10% of the TLD measurements. The advantage of the algebraic dose calculation engine of the RTPS is a substantially reduced computation time (minutes vs. days) relative to Monte Carlo calculations, as well as providing patient-specific dose estimation. It also provides the basis for a more elaborate reporting of dosimetric results, such as patient specific organ dose volumes after image segmentation.
Fast template matching with polynomials.

PubMed

Omachi, Shinichiro; Omachi, Masako

2007-08-01

Template matching is widely used for many applications in image and signal processing. This paper proposes a novel template matching algorithm, called algebraic template matching. Given a template and an input image, algebraic template matching efficiently calculates similarities between the template and the partial images of the input image, for various widths and heights. The partial image most similar to the template image is detected from the input image for any location, width, and height. In the proposed algorithm, a polynomial that approximates the template image is used to match the input image instead of the template image. The proposed algorithm is effective especially when the width and height of the template image differ from the partial image to be matched. An algorithm using the Legendre polynomial is proposed for efficient approximation of the template image. This algorithm not only reduces computational costs, but also improves the quality of the approximated image. It is shown theoretically and experimentally that the computational cost of the proposed algorithm is much smaller than the existing methods.
Software algorithm and hardware design for real-time implementation of new spectral estimator

PubMed Central

2014-01-01

Background Real-time spectral analyzers can be difficult to implement for PC computer-based systems because of the potential for high computational cost, and algorithm complexity. In this work a new spectral estimator (NSE) is developed for real-time analysis, and compared with the discrete Fourier transform (DFT). Method Clinical data in the form of 216 fractionated atrial electrogram sequences were used as inputs. The sample rate for acquisition was 977 Hz, or approximately 1 millisecond between digital samples. Real-time NSE power spectra were generated for 16,384 consecutive data points. The same data sequences were used for spectral calculation using a radix-2 implementation of the DFT. The NSE algorithm was also developed for implementation as a real-time spectral analyzer electronic circuit board. Results The average interval for a single real-time spectral calculation in software was 3.29 μs for NSE versus 504.5 μs for DFT. Thus for real-time spectral analysis, the NSE algorithm is approximately 150× faster than the DFT. Over a 1 millisecond sampling period, the NSE algorithm had the capability to spectrally analyze a maximum of 303 data channels, while the DFT algorithm could only analyze a single channel. Moreover, for the 8 second sequences, the NSE spectral resolution in the 3-12 Hz range was 0.037 Hz while the DFT spectral resolution was only 0.122 Hz. The NSE was also found to be implementable as a standalone spectral analyzer board using approximately 26 integrated circuits at a cost of approximately $500. The software files used for analysis are included as a supplement, please see the Additional files 1 and 2. Conclusions The NSE real-time algorithm has low computational cost and complexity, and is implementable in both software and hardware for 1 millisecond updates of multichannel spectra. The algorithm may be helpful to guide radiofrequency catheter ablation in real time. PMID:24886214
Toward real-time diffuse optical tomography: accelerating light propagation modeling employing parallel computing on GPU and CPU

NASA Astrophysics Data System (ADS)

Doulgerakis, Matthaios; Eggebrecht, Adam; Wojtkiewicz, Stanislaw; Culver, Joseph; Dehghani, Hamid

2017-12-01

Parameter recovery in diffuse optical tomography is a computationally expensive algorithm, especially when used for large and complex volumes, as in the case of human brain functional imaging. The modeling of light propagation, also known as the forward problem, is the computational bottleneck of the recovery algorithm, whereby the lack of a real-time solution is impeding practical and clinical applications. The objective of this work is the acceleration of the forward model, within a diffusion approximation-based finite-element modeling framework, employing parallelization to expedite the calculation of light propagation in realistic adult head models. The proposed methodology is applicable for modeling both continuous wave and frequency-domain systems with the results demonstrating a 10-fold speed increase when GPU architectures are available, while maintaining high accuracy. It is shown that, for a very high-resolution finite-element model of the adult human head with ˜600,000 nodes, consisting of heterogeneous layers, light propagation can be calculated at ˜0.25 s/excitation source.
Implementation of a fully-balanced periodic tridiagonal solver on a parallel distributed memory architecture

NASA Technical Reports Server (NTRS)

Eidson, T. M.; Erlebacher, G.

1994-01-01

While parallel computers offer significant computational performance, it is generally necessary to evaluate several programming strategies. Two programming strategies for a fairly common problem - a periodic tridiagonal solver - are developed and evaluated. Simple model calculations as well as timing results are presented to evaluate the various strategies. The particular tridiagonal solver evaluated is used in many computational fluid dynamic simulation codes. The feature that makes this algorithm unique is that these simulation codes usually require simultaneous solutions for multiple right-hand-sides (RHS) of the system of equations. Each RHS solutions is independent and thus can be computed in parallel. Thus a Gaussian elimination type algorithm can be used in a parallel computation and the more complicated approaches such as cyclic reduction are not required. The two strategies are a transpose strategy and a distributed solver strategy. For the transpose strategy, the data is moved so that a subset of all the RHS problems is solved on each of the several processors. This usually requires significant data movement between processor memories across a network. The second strategy attempts to have the algorithm allow the data across processor boundaries in a chained manner. This usually requires significantly less data movement. An approach to accomplish this second strategy in a near-perfect load-balanced manner is developed. In addition, an algorithm will be shown to directly transform a sequential Gaussian elimination type algorithm into the parallel chained, load-balanced algorithm.
Algorithm to calculate proportional area transformation factors for digital geographic databases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Edwards, R.

1983-01-01

A computer technique is described for determining proportionate-area factors used to transform thematic data between large geographic areal databases. The number of calculations in the algorithm increases linearly with the number of segments in the polygonal definitions of the databases, and increases with the square root of the total number of chains. Experience is presented in calculating transformation factors for two national databases, the USGS Water Cataloging Unit outlines and DOT county boundaries which consist of 2100 and 3100 polygons respectively. The technique facilitates using thematic data defined on various natural bases (watersheds, landcover units, etc.) in analyses involving economicmore » and other administrative bases (states, counties, etc.), and vice versa.« less
Stream-based Hebbian eigenfilter for real-time neuronal spike discrimination

PubMed Central

2012-01-01

Background Principal component analysis (PCA) has been widely employed for automatic neuronal spike sorting. Calculating principal components (PCs) is computationally expensive, and requires complex numerical operations and large memory resources. Substantial hardware resources are therefore needed for hardware implementations of PCA. General Hebbian algorithm (GHA) has been proposed for calculating PCs of neuronal spikes in our previous work, which eliminates the needs of computationally expensive covariance analysis and eigenvalue decomposition in conventional PCA algorithms. However, large memory resources are still inherently required for storing a large volume of aligned spikes for training PCs. The large size memory will consume large hardware resources and contribute significant power dissipation, which make GHA difficult to be implemented in portable or implantable multi-channel recording micro-systems. Method In this paper, we present a new algorithm for PCA-based spike sorting based on GHA, namely stream-based Hebbian eigenfilter, which eliminates the inherent memory requirements of GHA while keeping the accuracy of spike sorting by utilizing the pseudo-stationarity of neuronal spikes. Because of the reduction of large hardware storage requirements, the proposed algorithm can lead to ultra-low hardware resources and power consumption of hardware implementations, which is critical for the future multi-channel micro-systems. Both clinical and synthetic neural recording data sets were employed for evaluating the accuracy of the stream-based Hebbian eigenfilter. The performance of spike sorting using stream-based eigenfilter and the computational complexity of the eigenfilter were rigorously evaluated and compared with conventional PCA algorithms. Field programmable logic arrays (FPGAs) were employed to implement the proposed algorithm, evaluate the hardware implementations and demonstrate the reduction in both power consumption and hardware memories achieved by the streaming computing Results and discussion Results demonstrate that the stream-based eigenfilter can achieve the same accuracy and is 10 times more computationally efficient when compared with conventional PCA algorithms. Hardware evaluations show that 90.3% logic resources, 95.1% power consumption and 86.8% computing latency can be reduced by the stream-based eigenfilter when compared with PCA hardware. By utilizing the streaming method, 92% memory resources and 67% power consumption can be saved when compared with the direct implementation of GHA. Conclusion Stream-based Hebbian eigenfilter presents a novel approach to enable real-time spike sorting with reduced computational complexity and hardware costs. This new design can be further utilized for multi-channel neuro-physiological experiments or chronic implants. PMID:22490725
Determination of the optimal mesh parameters for Iguassu centrifuge flow and separation calculations

NASA Astrophysics Data System (ADS)

Romanihin, S. M.; Tronin, I. V.

2016-09-01

We present the method and the results of the determination for optimal computational mesh parameters for axisymmetric modeling of flow and separation in the Iguasu gas centrifuge. The aim of this work was to determine the mesh parameters which provide relatively low computational cost whithout loss of accuracy. We use direct search optimization algorithm to calculate optimal mesh parameters. Obtained parameters were tested by the calculation of the optimal working regime of the Iguasu GC. Separative power calculated using the optimal mesh parameters differs less than 0.5% from the result obtained on the detailed mesh. Presented method can be used to determine optimal mesh parameters of the Iguasu GC with different rotor speeds.
MorphoHawk: Geometric-based Software for Manufacturing and More

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keith Arterburn

2001-04-01

Hollywood movies portray facial recognition as a perfected technology, but reality is that sophisticated computers and algorithmic calculations are far from perfect. In fact, the most sophisticated and successful computer for recognizing faces and other imagery already is the human brain with more than 10 billion nerve cells. Beginning at birth, humans process data and connect optical and sensory experiences that create unparalleled accumulation of data for people to associate images with life experiences, emotions and knowledge. Computers are powerful, rapid and tireless, but still cannot compare to the highly sophisticated relational calculations and associations that the human computer canmore » produce in connecting ‘what we see with what we know.’« less
Ground temperature measurement by PRT-5 for maps experiment

NASA Technical Reports Server (NTRS)

Gupta, S. K.; Tiwari, S. N.

1978-01-01

A simple algorithm and computer program were developed for determining the actual surface temperature from the effective brightness temperature as measured remotely by a radiation thermometer called PRT-5. This procedure allows the computation of atmospheric correction to the effective brightness temperature without performing detailed radiative transfer calculations. Model radiative transfer calculations were performed to compute atmospheric corrections for several values of the surface and atmospheric parameters individually and in combination. Polynomial regressions were performed between the magnitudes or deviations of these parameters and the corresponding computed corrections to establish simple analytical relations between them. Analytical relations were also developed to represent combined correction for simultaneous variation of parameters in terms of their individual corrections.
Parallel computing of a digital hologram and particle searching for microdigital-holographic particle-tracking velocimetry

DOE Office of Scientific and Technical Information (OSTI.GOV)

Satake, Shin-ichi; Kanamori, Hiroyuki; Kunugi, Tomoaki

2007-02-01

We have developed a parallel algorithm for microdigital-holographic particle-tracking velocimetry. The algorithm is used in (1) numerical reconstruction of a particle image computer using a digital hologram, and (2) searching for particles. The numerical reconstruction from the digital hologram makes use of the Fresnel diffraction equation and the FFT (fast Fourier transform),whereas the particle search algorithm looks for local maximum graduation in a reconstruction field represented by a 3D matrix. To achieve high performance computing for both calculations (reconstruction and particle search), two memory partitions are allocated to the 3D matrix. In this matrix, the reconstruction part consists of horizontallymore » placed 2D memory partitions on the x-y plane for the FFT, whereas, the particle search part consists of vertically placed 2D memory partitions set along the z axes.Consequently, the scalability can be obtained for the proportion of processor elements,where the benchmarks are carried out for parallel computation by a SGI Altix machine.« less
Binding Direction-Based Two-Dimensional Flattened Contact Area Computing Algorithm for Protein-Protein Interactions.

PubMed

Kang, Beom Sik; Pugalendhi, GaneshKumar; Kim, Ku-Jin

2017-10-13

Interactions between protein molecules are essential for the assembly, function, and regulation of proteins. The contact region between two protein molecules in a protein complex is usually complementary in shape for both molecules and the area of the contact region can be used to estimate the binding strength between two molecules. Although the area is a value calculated from the three-dimensional surface, it cannot represent the three-dimensional shape of the surface. Therefore, we propose an original concept of two-dimensional contact area which provides further information such as the ruggedness of the contact region. We present a novel algorithm for calculating the binding direction between two molecules in a protein complex, and then suggest a method to compute the two-dimensional flattened area of the contact region between two molecules based on the binding direction.

High performance transcription factor-DNA docking with GPU computing

PubMed Central

2012-01-01

Background Protein-DNA docking is a very challenging problem in structural bioinformatics and has important implications in a number of applications, such as structure-based prediction of transcription factor binding sites and rational drug design. Protein-DNA docking is very computational demanding due to the high cost of energy calculation and the statistical nature of conformational sampling algorithms. More importantly, experiments show that the docking quality depends on the coverage of the conformational sampling space. It is therefore desirable to accelerate the computation of the docking algorithm, not only to reduce computing time, but also to improve docking quality. Methods In an attempt to accelerate the sampling process and to improve the docking performance, we developed a graphics processing unit (GPU)-based protein-DNA docking algorithm. The algorithm employs a potential-based energy function to describe the binding affinity of a protein-DNA pair, and integrates Monte-Carlo simulation and a simulated annealing method to search through the conformational space. Algorithmic techniques were developed to improve the computation efficiency and scalability on GPU-based high performance computing systems. Results The effectiveness of our approach is tested on a non-redundant set of 75 TF-DNA complexes and a newly developed TF-DNA docking benchmark. We demonstrated that the GPU-based docking algorithm can significantly accelerate the simulation process and thereby improving the chance of finding near-native TF-DNA complex structures. This study also suggests that further improvement in protein-DNA docking research would require efforts from two integral aspects: improvement in computation efficiency and energy function design. Conclusions We present a high performance computing approach for improving the prediction accuracy of protein-DNA docking. The GPU-based docking algorithm accelerates the search of the conformational space and thus increases the chance of finding more near-native structures. To the best of our knowledge, this is the first ad hoc effort of applying GPU or GPU clusters to the protein-DNA docking problem. PMID:22759575
Implicit, nonswitching, vector-oriented algorithm for steady transonic flow

NASA Technical Reports Server (NTRS)

Lottati, I.

1983-01-01

A rapid computation of a sequence of transonic flow solutions has to be performed in many areas of aerodynamic technology. The employment of low-cost vector array processors makes the conduction of such calculations economically feasible. However, for a full utilization of the new hardware, the developed algorithms must take advantage of the special characteristics of the vector array processor. The present investigation has the objective to develop an efficient algorithm for solving transonic flow problems governed by mixed partial differential equations on an array processor.
Cubic scaling algorithms for RPA correlation using interpolative separable density fitting

NASA Astrophysics Data System (ADS)

Lu, Jianfeng; Thicke, Kyle

2017-12-01

We present a new cubic scaling algorithm for the calculation of the RPA correlation energy. Our scheme splits up the dependence between the occupied and virtual orbitals in χ0 by use of Cauchy's integral formula. This introduces an additional integral to be carried out, for which we provide a geometrically convergent quadrature rule. Our scheme also uses the newly developed Interpolative Separable Density Fitting algorithm to further reduce the computational cost in a way analogous to that of the Resolution of Identity method.
PSC algorithm description

NASA Technical Reports Server (NTRS)

Nobbs, Steven G.

1995-01-01

An overview of the performance seeking control (PSC) algorithm and details of the important components of the algorithm are given. The onboard propulsion system models, the linear programming optimization, and engine control interface are described. The PSC algorithm receives input from various computers on the aircraft including the digital flight computer, digital engine control, and electronic inlet control. The PSC algorithm contains compact models of the propulsion system including the inlet, engine, and nozzle. The models compute propulsion system parameters, such as inlet drag and fan stall margin, which are not directly measurable in flight. The compact models also compute sensitivities of the propulsion system parameters to change in control variables. The engine model consists of a linear steady state variable model (SSVM) and a nonlinear model. The SSVM is updated with efficiency factors calculated in the engine model update logic, or Kalman filter. The efficiency factors are used to adjust the SSVM to match the actual engine. The propulsion system models are mathematically integrated to form an overall propulsion system model. The propulsion system model is then optimized using a linear programming optimization scheme. The goal of the optimization is determined from the selected PSC mode of operation. The resulting trims are used to compute a new operating point about which the optimization process is repeated. This process is continued until an overall (global) optimum is reached before applying the trims to the controllers.
Time-independent lattice Boltzmann method calculation of hydrodynamic interactions between two particles

NASA Astrophysics Data System (ADS)

Ding, E. J.

2015-06-01

The time-independent lattice Boltzmann algorithm (TILBA) is developed to calculate the hydrodynamic interactions between two particles in a Stokes flow. The TILBA is distinguished from the traditional lattice Boltzmann method in that a background matrix (BGM) is generated prior to the calculation. The BGM, once prepared, can be reused for calculations for different scenarios, and the computational cost for each such calculation will be significantly reduced. The advantage of the TILBA is that it is easy to code and can be applied to any particle shape without complicated implementation, and the computational cost is independent of the shape of the particle. The TILBA is validated and shown to be accurate by comparing calculation results obtained from the TILBA to analytical or numerical solutions for certain problems.
RGCA: A Reliable GPU Cluster Architecture for Large-Scale Internet of Things Computing Based on Effective Performance-Energy Optimization

PubMed Central

Chen, Qingkui; Zhao, Deyu; Wang, Jingjuan

2017-01-01

This paper aims to develop a low-cost, high-performance and high-reliability computing system to process large-scale data using common data mining algorithms in the Internet of Things (IoT) computing environment. Considering the characteristics of IoT data processing, similar to mainstream high performance computing, we use a GPU (Graphics Processing Unit) cluster to achieve better IoT services. Firstly, we present an energy consumption calculation method (ECCM) based on WSNs. Then, using the CUDA (Compute Unified Device Architecture) Programming model, we propose a Two-level Parallel Optimization Model (TLPOM) which exploits reasonable resource planning and common compiler optimization techniques to obtain the best blocks and threads configuration considering the resource constraints of each node. The key to this part is dynamic coupling Thread-Level Parallelism (TLP) and Instruction-Level Parallelism (ILP) to improve the performance of the algorithms without additional energy consumption. Finally, combining the ECCM and the TLPOM, we use the Reliable GPU Cluster Architecture (RGCA) to obtain a high-reliability computing system considering the nodes’ diversity, algorithm characteristics, etc. The results show that the performance of the algorithms significantly increased by 34.1%, 33.96% and 24.07% for Fermi, Kepler and Maxwell on average with TLPOM and the RGCA ensures that our IoT computing system provides low-cost and high-reliability services. PMID:28777325
RGCA: A Reliable GPU Cluster Architecture for Large-Scale Internet of Things Computing Based on Effective Performance-Energy Optimization.

PubMed

Fang, Yuling; Chen, Qingkui; Xiong, Neal N; Zhao, Deyu; Wang, Jingjuan

2017-08-04

This paper aims to develop a low-cost, high-performance and high-reliability computing system to process large-scale data using common data mining algorithms in the Internet of Things (IoT) computing environment. Considering the characteristics of IoT data processing, similar to mainstream high performance computing, we use a GPU (Graphics Processing Unit) cluster to achieve better IoT services. Firstly, we present an energy consumption calculation method (ECCM) based on WSNs. Then, using the CUDA (Compute Unified Device Architecture) Programming model, we propose a Two-level Parallel Optimization Model (TLPOM) which exploits reasonable resource planning and common compiler optimization techniques to obtain the best blocks and threads configuration considering the resource constraints of each node. The key to this part is dynamic coupling Thread-Level Parallelism (TLP) and Instruction-Level Parallelism (ILP) to improve the performance of the algorithms without additional energy consumption. Finally, combining the ECCM and the TLPOM, we use the Reliable GPU Cluster Architecture (RGCA) to obtain a high-reliability computing system considering the nodes' diversity, algorithm characteristics, etc. The results show that the performance of the algorithms significantly increased by 34.1%, 33.96% and 24.07% for Fermi, Kepler and Maxwell on average with TLPOM and the RGCA ensures that our IoT computing system provides low-cost and high-reliability services.
[Evaluation of three methods for constructing craniofacial mid-sagittal plane based on the cone beam computed tomography].

PubMed

Wang, S W; Li, M; Yang, H F; Zhao, Y J; Wang, Y; Liu, Y

2016-04-18

To compare the accuracyof interactive closet point (ICP) algorithm, Procrustes analysis (PA) algorithm,and a landmark-independent method to construct the mid-sagittal plane (MSP) of the cone beam computed tomography.To provide theoretical basis for establishing coordinate systemof CBCT images and symmetric analysis. Ten patients were selected and scanned by CBCT before orthodontic treatment.The scan data was imported into Mimics 10.0 to reconstructthree dimensional skulls.And the MSP of each skull was generated by ICP algorithm, PA algorithm and landmark-independent method. MSP extracted by ICP algorithm or PA algorithm involvedthree steps. First, the 3D skull processing was performed by reverse engineering software geomagic studio 2012 to obtain the mirror skull. Then, the original and its mirror skull was registered separately by ICP algorithm in geomagic studio 2012 and PA algorithm in NX Imageware 11.0. Finally, the registered data were united into new data to calculate the MSP of the originaldata in geomagic studio 2012. The mid-sagittal plane was determined by SELLA (S), nasion (N), basion (Ba) as traditional landmark-dependent methodconducted in software InVivoDental 5.0. The distance from 9 pairs of symmetric anatomical marked points to three sagittal plane were measured and calculated to compare the differences of the absolute value. The one-way ANOVA test was used to analyze the variable differences among the 3 MSPs. The pairwise comparison was performed with LSD method. MSPs calculated by the three methods were available for clinic analysis, which could be concluded from the front view.However, there was significant differences among the distances from the 9 pairs of symmetric anatomical marked points to the MSPs (F=10.932,P=0.001).LSD test showed there was no significant difference between the ICP algorithm and landmark-independent method (P=0.11), while there was significant difference between the PA algorithm and landmark-independent methods (P=0.01) . Mid-sagittal plane of 3D skulls could be generated base on ICP algorithm or PA algorithm. There was no significant difference between the ICP algorithm and landmark-independent method. For the subjects with no evident asymmetry, ICP algorithm is feasible in clinical analysis.
Accelerating atomistic calculations of quantum energy eigenstates on graphic cards

NASA Astrophysics Data System (ADS)

Rodrigues, Walter; Pecchia, A.; Lopez, M.; Auf der Maur, M.; Di Carlo, A.

2014-10-01

Electronic properties of nanoscale materials require the calculation of eigenvalues and eigenvectors of large matrices. This bottleneck can be overcome by parallel computing techniques or the introduction of faster algorithms. In this paper we report a custom implementation of the Lanczos algorithm with simple restart, optimized for graphical processing units (GPUs). The whole algorithm has been developed using CUDA and runs entirely on the GPU, with a specialized implementation that spares memory and reduces at most machine-to-device data transfers. Furthermore parallel distribution over several GPUs has been attained using the standard message passing interface (MPI). Benchmark calculations performed on a GaN/AlGaN wurtzite quantum dot with up to 600,000 atoms are presented. The empirical tight-binding (ETB) model with an sp3d5s∗+spin-orbit parametrization has been used to build the system Hamiltonian (H).
Monte Carlo tests of the ELIPGRID-PC algorithm

DOE Office of Scientific and Technical Information (OSTI.GOV)

Davidson, J.R.

1995-04-01

The standard tool for calculating the probability of detecting pockets of contamination called hot spots has been the ELIPGRID computer code of Singer and Wickman. The ELIPGRID-PC program has recently made this algorithm available for an IBM{reg_sign} PC. However, no known independent validation of the ELIPGRID algorithm exists. This document describes a Monte Carlo simulation-based validation of a modified version of the ELIPGRID-PC code. The modified ELIPGRID-PC code is shown to match Monte Carlo-calculated hot-spot detection probabilities to within {plus_minus}0.5% for 319 out of 320 test cases. The one exception, a very thin elliptical hot spot located within a rectangularmore » sampling grid, differed from the Monte Carlo-calculated probability by about 1%. These results provide confidence in the ability of the modified ELIPGRID-PC code to accurately predict hot-spot detection probabilities within an acceptable range of error.« less
Dimensioning appropriate technical and economic parameters of elements in urban distribution power nets based on discrete fast marching method

NASA Astrophysics Data System (ADS)

Afanasyev, A. P.; Bazhenov, R. I.; Luchaninov, D. V.

2018-05-01

The main purpose of the research is to develop techniques for defining the best technical and economic trajectories of cables in urban power systems. The proposed algorithms of calculation of the routes for laying cables take into consideration topological, technical and economic features of the cabling. The discrete option of an algorithm Fast marching method is applied as a calculating tool. It has certain advantages compared to other approaches. In particular, this algorithm is cost-effective to compute, therefore, it is not iterative. Trajectories of received laying cables are considered as optimal ones from the point of view of technical and economic criteria. They correspond to the present rules of modern urban development.
Phase Transitions in Combinatorial Optimization Problems: Basics, Algorithms and Statistical Mechanics

NASA Astrophysics Data System (ADS)

Hartmann, Alexander K.; Weigt, Martin

2005-10-01

A concise, comprehensive introduction to the topic of statistical physics of combinatorial optimization, bringing together theoretical concepts and algorithms from computer science with analytical methods from physics. The result bridges the gap between statistical physics and combinatorial optimization, investigating problems taken from theoretical computing, such as the vertex-cover problem, with the concepts and methods of theoretical physics. The authors cover rapid developments and analytical methods that are both extremely complex and spread by word-of-mouth, providing all the necessary basics in required detail. Throughout, the algorithms are shown with examples and calculations, while the proofs are given in a way suitable for graduate students, post-docs, and researchers. Ideal for newcomers to this young, multidisciplinary field.
Generalized and efficient algorithm for computing multipole energies and gradients based on Cartesian tensors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lin, Dejun, E-mail: dejun.lin@gmail.com

2015-09-21

Accurate representation of intermolecular forces has been the central task of classical atomic simulations, known as molecular mechanics. Recent advancements in molecular mechanics models have put forward the explicit representation of permanent and/or induced electric multipole (EMP) moments. The formulas developed so far to calculate EMP interactions tend to have complicated expressions, especially in Cartesian coordinates, which can only be applied to a specific kernel potential function. For example, one needs to develop a new formula each time a new kernel function is encountered. The complication of these formalisms arises from an intriguing and yet obscured mathematical relation between themore » kernel functions and the gradient operators. Here, I uncover this relation via rigorous derivation and find that the formula to calculate EMP interactions is basically invariant to the potential kernel functions as long as they are of the form f(r), i.e., any Green’s function that depends on inter-particle distance. I provide an algorithm for efficient evaluation of EMP interaction energies, forces, and torques for any kernel f(r) up to any arbitrary rank of EMP moments in Cartesian coordinates. The working equations of this algorithm are essentially the same for any kernel f(r). Recently, a few recursive algorithms were proposed to calculate EMP interactions. Depending on the kernel functions, the algorithm here is about 4–16 times faster than these algorithms in terms of the required number of floating point operations and is much more memory efficient. I show that it is even faster than a theoretically ideal recursion scheme, i.e., one that requires 1 floating point multiplication and 1 addition per recursion step. This algorithm has a compact vector-based expression that is optimal for computer programming. The Cartesian nature of this algorithm makes it fit easily into modern molecular simulation packages as compared with spherical coordinate-based algorithms. A software library based on this algorithm has been implemented in C++11 and has been released.« less
Real-time in-flight thrust calculation on a digital electronic engine control-equipped F100 engine in an F-15 airplane

NASA Technical Reports Server (NTRS)

Ray, R. J.; Myers, L. P.

1984-01-01

Computer algorithms which calculate in-flight engine and aircraft performance real-time are discussed. The first step was completed with the implementation of a real-time thrust calculation program on a digital electronic engine control (DEEC) equiped F100 engine in an F-15 aircraft. The in-flight thrust modifications that allow calculations to be performed in real-time, to compare results to predictions, are presented.
Finite-difference simulation of transonic separated flow using a full potential boundary layer interaction approach

NASA Technical Reports Server (NTRS)

Van Dalsem, W. R.; Steger, J. L.

1983-01-01

A new, fast, direct-inverse, finite-difference boundary-layer code has been developed and coupled with a full-potential transonic airfoil analysis code via new inviscid-viscous interaction algorithms. The resulting code has been used to calculate transonic separated flows. The results are in good agreement with Navier-Stokes calculations and experimental data. Solutions are obtained in considerably less computer time than Navier-Stokes solutions of equal resolution. Because efficient inviscid and viscous algorithms are used, it is expected this code will also compare favorably with other codes of its type as they become available.
Precise calculation of the local pressure tensor in Cartesian and spherical coordinates in LAMMPS

NASA Astrophysics Data System (ADS)

Nakamura, Takenobu; Kawamoto, Shuhei; Shinoda, Wataru

2015-05-01

An accurate and efficient algorithm for calculating the 3D pressure field has been developed and implemented in the open-source molecular dynamics package, LAMMPS. Additionally, an algorithm to compute the pressure profile along the radial direction in spherical coordinates has also been implemented. The latter is particularly useful for systems showing a spherical symmetry such as micelles and vesicles. These methods yield precise pressure fields based on the Irving-Kirkwood contour integration and are particularly useful for biomolecular force fields. The present methods are applied to several systems including a buckled membrane and a vesicle.
Localized overlap algorithm for unexpanded dispersion energies

NASA Astrophysics Data System (ADS)

Rob, Fazle; Misquitta, Alston J.; Podeszwa, Rafał; Szalewicz, Krzysztof

2014-03-01

First-principles-based, linearly scaling algorithm has been developed for calculations of dispersion energies from frequency-dependent density susceptibility (FDDS) functions with account of charge-overlap effects. The transition densities in FDDSs are fitted by a set of auxiliary atom-centered functions. The terms in the dispersion energy expression involving products of such functions are computed using either the unexpanded (exact) formula or from inexpensive asymptotic expansions, depending on the location of these functions relative to the dimer configuration. This approach leads to significant savings of computational resources. In particular, for a dimer consisting of two elongated monomers with 81 atoms each in a head-to-head configuration, the most favorable case for our algorithm, a 43-fold speedup has been achieved while the approximate dispersion energy differs by less than 1% from that computed using the standard unexpanded approach. In contrast, the dispersion energy computed from the distributed asymptotic expansion differs by dozens of percent in the van der Waals minimum region. A further increase of the size of each monomer would result in only small increased costs since all the additional terms would be computed from the asymptotic expansion.
The computation of all plane/surface intersections for CAD/CAM applications

NASA Technical Reports Server (NTRS)

Hoitsma, D. H., Jr.; Roche, M.

1984-01-01

The problem of the computation and display of all intersections of a given plane with a rational bicubic surface patch for use on an interactive CAD/CAM system is examined. The general problem of calculating all intersections of a plane and a surface consisting of rational bicubic patches is reduced to the case of a single generic patch by applying a rejection algorithm which excludes all patches that do not intersect the plane. For each pertinent patch the algorithm presented computed the intersection curves by locating an initial point on each curve, and computes successive points on the curve using a tolerance step equation. A single cubic equation solver is used to compute the initial curve points lying on the boundary of a surface patch, and the method of resultants as applied to curve theory is used to determine critical points which, in turn, are used to locate initial points that lie on intersection curves which are in the interior of the patch. Examples are given to illustrate the ability of this algorithm to produce all intersection curves.
A novel Gravity-FREAK feature extraction and Gravity-KLT tracking registration algorithm based on iPhone MEMS mobile sensor in mobile environment

PubMed Central

Lin, Fan; Xiao, Bin

2017-01-01

Based on the traditional Fast Retina Keypoint (FREAK) feature description algorithm, this paper proposed a Gravity-FREAK feature description algorithm based on Micro-electromechanical Systems (MEMS) sensor to overcome the limited computing performance and memory resources of mobile devices and further improve the reality interaction experience of clients through digital information added to the real world by augmented reality technology. The algorithm takes the gravity projection vector corresponding to the feature point as its feature orientation, which saved the time of calculating the neighborhood gray gradient of each feature point, reduced the cost of calculation and improved the accuracy of feature extraction. In the case of registration method of matching and tracking natural features, the adaptive and generic corner detection based on the Gravity-FREAK matching purification algorithm was used to eliminate abnormal matches, and Gravity Kaneda-Lucas Tracking (KLT) algorithm based on MEMS sensor can be used for the tracking registration of the targets and robustness improvement of tracking registration algorithm under mobile environment. PMID:29088228
A novel Gravity-FREAK feature extraction and Gravity-KLT tracking registration algorithm based on iPhone MEMS mobile sensor in mobile environment.

PubMed

Hong, Zhiling; Lin, Fan; Xiao, Bin

2017-01-01

Based on the traditional Fast Retina Keypoint (FREAK) feature description algorithm, this paper proposed a Gravity-FREAK feature description algorithm based on Micro-electromechanical Systems (MEMS) sensor to overcome the limited computing performance and memory resources of mobile devices and further improve the reality interaction experience of clients through digital information added to the real world by augmented reality technology. The algorithm takes the gravity projection vector corresponding to the feature point as its feature orientation, which saved the time of calculating the neighborhood gray gradient of each feature point, reduced the cost of calculation and improved the accuracy of feature extraction. In the case of registration method of matching and tracking natural features, the adaptive and generic corner detection based on the Gravity-FREAK matching purification algorithm was used to eliminate abnormal matches, and Gravity Kaneda-Lucas Tracking (KLT) algorithm based on MEMS sensor can be used for the tracking registration of the targets and robustness improvement of tracking registration algorithm under mobile environment.

Multiscale computations with a wavelet-adaptive algorithm

NASA Astrophysics Data System (ADS)

Rastigejev, Yevgenii Anatolyevich

A wavelet-based adaptive multiresolution algorithm for the numerical solution of multiscale problems governed by partial differential equations is introduced. The main features of the method include fast algorithms for the calculation of wavelet coefficients and approximation of derivatives on nonuniform stencils. The connection between the wavelet order and the size of the stencil is established. The algorithm is based on the mathematically well established wavelet theory. This allows us to provide error estimates of the solution which are used in conjunction with an appropriate threshold criteria to adapt the collocation grid. The efficient data structures for grid representation as well as related computational algorithms to support grid rearrangement procedure are developed. The algorithm is applied to the simulation of phenomena described by Navier-Stokes equations. First, we undertake the study of the ignition and subsequent viscous detonation of a H2 : O2 : Ar mixture in a one-dimensional shock tube. Subsequently, we apply the algorithm to solve the two- and three-dimensional benchmark problem of incompressible flow in a lid-driven cavity at large Reynolds numbers. For these cases we show that solutions of comparable accuracy as the benchmarks are obtained with more than an order of magnitude reduction in degrees of freedom. The simulations show the striking ability of the algorithm to adapt to a solution having different scales at different spatial locations so as to produce accurate results at a relatively low computational cost.
SU-F-BRD-09: A Random Walk Model Algorithm for Proton Dose Calculation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yao, W; Farr, J

2015-06-15

Purpose: To develop a random walk model algorithm for calculating proton dose with balanced computation burden and accuracy. Methods: Random walk (RW) model is sometimes referred to as a density Monte Carlo (MC) simulation. In MC proton dose calculation, the use of Gaussian angular distribution of protons due to multiple Coulomb scatter (MCS) is convenient, but in RW the use of Gaussian angular distribution requires an extremely large computation and memory. Thus, our RW model adopts spatial distribution from the angular one to accelerate the computation and to decrease the memory usage. From the physics and comparison with the MCmore » simulations, we have determined and analytically expressed those critical variables affecting the dose accuracy in our RW model. Results: Besides those variables such as MCS, stopping power, energy spectrum after energy absorption etc., which have been extensively discussed in literature, the following variables were found to be critical in our RW model: (1) inverse squared law that can significantly reduce the computation burden and memory, (2) non-Gaussian spatial distribution after MCS, and (3) the mean direction of scatters at each voxel. In comparison to MC results, taken as reference, for a water phantom irradiated by mono-energetic proton beams from 75 MeV to 221.28 MeV, the gamma test pass rate was 100% for the 2%/2mm/10% criterion. For a highly heterogeneous phantom consisting of water embedded by a 10 cm cortical bone and a 10 cm lung in the Bragg peak region of the proton beam, the gamma test pass rate was greater than 98% for the 3%/3mm/10% criterion. Conclusion: We have determined key variables in our RW model for proton dose calculation. Compared with commercial pencil beam algorithms, our RW model much improves the dose accuracy in heterogeneous regions, and is about 10 times faster than MC simulations.« less
Relative-Error-Covariance Algorithms

NASA Technical Reports Server (NTRS)

Bierman, Gerald J.; Wolff, Peter J.

1991-01-01

Two algorithms compute error covariance of difference between optimal estimates, based on data acquired during overlapping or disjoint intervals, of state of discrete linear system. Provides quantitative measure of mutual consistency or inconsistency of estimates of states. Relative-error-covariance concept applied, to determine degree of correlation between trajectories calculated from two overlapping sets of measurements and construct real-time test of consistency of state estimates based upon recently acquired data.
A Simulation Algorithm to Approximate the Area of Mapped Forest Inventory Plots

Treesearch

William A. Bechtold; Naser E. Heravi; Matthew E. Kinkenon

2003-01-01

Calculating the area of polygons associated with mapped forest inventory plots can be mathematically cumbersome, especially when computing change between inventories. We developed a simulation technique that utilizes a computer-generated dot grid and geometry to estimate the area of mapped polygons within any size circle. The technique also yields a matrix of change in...
Cosmological N-body Simulation

NASA Astrophysics Data System (ADS)

Lake, George

1994-05-01

.90ex> }}} The ``N'' in N-body calculations has doubled every year for the last two decades. To continue this trend, the UW N-body group is working on algorithms for the fast evaluation of gravitational forces on parallel computers and establishing rigorous standards for the computations. In these algorithms, the computational cost per time step is ~ 10(3) pairwise forces per particle. A new adaptive time integrator enables us to perform high quality integrations that are fully temporally and spatially adaptive. SPH--smoothed particle hydrodynamics will be added to simulate the effects of dissipating gas and magnetic fields. The importance of these calculations is two-fold. First, they determine the nonlinear consequences of theories for the structure of the Universe. Second, they are essential for the interpretation of observations. Every galaxy has six coordinates of velocity and position. Observations determine two sky coordinates and a line of sight velocity that bundles universal expansion (distance) together with a random velocity created by the mass distribution. Simulations are needed to determine the underlying structure and masses. The importance of simulations has moved from ex post facto explanation to an integral part of planning large observational programs. I will show why high quality simulations with ``large N'' are essential to accomplish our scientific goals. This year, our simulations have N >~ 10(7) . This is sufficient to tackle some niche problems, but well short of our 5 year goal--simulating The Sloan Digital Sky Survey using a few Billion particles (a Teraflop-year simulation). Extrapolating past trends, we would have to ``wait'' 7 years for this hundred-fold improvement. Like past gains, significant changes in the computational methods are required for these advances. I will describe new algorithms, algorithmic hacks and a dedicated computer to perform Billion particle simulations. Finally, I will describe research that can be enabled by Petaflop computers. This research is supported by the NASA HPCC/ESS program.
On a fast calculation of structure factors at a subatomic resolution.

PubMed

Afonine, P V; Urzhumtsev, A

2004-01-01

In the last decade, the progress of protein crystallography allowed several protein structures to be solved at a resolution higher than 0.9 A. Such studies provide researchers with important new information reflecting very fine structural details. The signal from these details is very weak with respect to that corresponding to the whole structure. Its analysis requires high-quality data, which previously were available only for crystals of small molecules, and a high accuracy of calculations. The calculation of structure factors using direct formulae, traditional for 'small-molecule' crystallography, allows a relatively simple accuracy control. For macromolecular crystals, diffraction data sets at a subatomic resolution contain hundreds of thousands of reflections, and the number of parameters used to describe the corresponding models may reach the same order. Therefore, the direct way of calculating structure factors becomes very time expensive when applied to large molecules. These problems of high accuracy and computational efficiency require a re-examination of computer tools and algorithms. The calculation of model structure factors through an intermediate generation of an electron density [Sayre (1951). Acta Cryst. 4, 362-367; Ten Eyck (1977). Acta Cryst. A33, 486-492] may be much more computationally efficient, but contains some parameters (grid step, 'effective' atom radii etc.) whose influence on the accuracy of the calculation is not straightforward. At the same time, the choice of parameters within safety margins that largely ensure a sufficient accuracy may result in a significant loss of the CPU time, making it close to the time for the direct-formulae calculations. The impact of the different parameters on the computer efficiency of structure-factor calculation is studied. It is shown that an appropriate choice of these parameters allows the structure factors to be obtained with a high accuracy and in a significantly shorter time than that required when using the direct formulae. Practical algorithms for the optimal choice of the parameters are suggested.
Adaptation of a Hyperspectral Atmospheric Correction Algorithm for Multi-spectral Ocean Color Data in Coastal Waters. Chapter 3

NASA Technical Reports Server (NTRS)

Gao, Bo-Cai; Montes, Marcos J.; Davis, Curtiss O.

2003-01-01

This SIMBIOS contract supports several activities over its three-year time-span. These include certain computational aspects of atmospheric correction, including the modification of our hyperspectral atmospheric correction algorithm Tafkaa for various multi-spectral instruments, such as SeaWiFS, MODIS, and GLI. Additionally, since absorbing aerosols are becoming common in many coastal areas, we are making the model calculations to incorporate various absorbing aerosol models into tables used by our Tafkaa atmospheric correction algorithm. Finally, we have developed the algorithms to use MODIS data to characterize thin cirrus effects on aerosol retrieval.
Dose calculation algorithm of fast fine-heterogeneity correction for heavy charged particle radiotherapy.

PubMed

Kanematsu, Nobuyuki

2011-04-01

This work addresses computing techniques for dose calculations in treatment planning with proton and ion beams, based on an efficient kernel-convolution method referred to as grid-dose spreading (GDS) and accurate heterogeneity-correction method referred to as Gaussian beam splitting. The original GDS algorithm suffered from distortion of dose distribution for beams tilted with respect to the dose-grid axes. Use of intermediate grids normal to the beam field has solved the beam-tilting distortion. Interplay of arrangement between beams and grids was found as another intrinsic source of artifact. Inclusion of rectangular-kernel convolution in beam transport, to share the beam contribution among the nearest grids in a regulatory manner, has solved the interplay problem. This algorithmic framework was applied to a tilted proton pencil beam and a broad carbon-ion beam. In these cases, while the elementary pencil beams individually split into several tens, the calculation time increased only by several times with the GDS algorithm. The GDS and beam-splitting methods will complementarily enable accurate and efficient dose calculations for radiotherapy with protons and ions. Copyright © 2010 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
Implementation and performance of FDPS: a framework for developing parallel particle simulation codes

NASA Astrophysics Data System (ADS)

Iwasawa, Masaki; Tanikawa, Ataru; Hosono, Natsuki; Nitadori, Keigo; Muranushi, Takayuki; Makino, Junichiro

2016-08-01

We present the basic idea, implementation, measured performance, and performance model of FDPS (Framework for Developing Particle Simulators). FDPS is an application-development framework which helps researchers to develop simulation programs using particle methods for large-scale distributed-memory parallel supercomputers. A particle-based simulation program for distributed-memory parallel computers needs to perform domain decomposition, exchange of particles which are not in the domain of each computing node, and gathering of the particle information in other nodes which are necessary for interaction calculation. Also, even if distributed-memory parallel computers are not used, in order to reduce the amount of computation, algorithms such as the Barnes-Hut tree algorithm or the Fast Multipole Method should be used in the case of long-range interactions. For short-range interactions, some methods to limit the calculation to neighbor particles are required. FDPS provides all of these functions which are necessary for efficient parallel execution of particle-based simulations as "templates," which are independent of the actual data structure of particles and the functional form of the particle-particle interaction. By using FDPS, researchers can write their programs with the amount of work necessary to write a simple, sequential and unoptimized program of O(N2) calculation cost, and yet the program, once compiled with FDPS, will run efficiently on large-scale parallel supercomputers. A simple gravitational N-body program can be written in around 120 lines. We report the actual performance of these programs and the performance model. The weak scaling performance is very good, and almost linear speed-up was obtained for up to the full system of the K computer. The minimum calculation time per timestep is in the range of 30 ms (N = 107) to 300 ms (N = 109). These are currently limited by the time for the calculation of the domain decomposition and communication necessary for the interaction calculation. We discuss how we can overcome these bottlenecks.
Efficient tree tensor network states (TTNS) for quantum chemistry: Generalizations of the density matrix renormalization group algorithm

NASA Astrophysics Data System (ADS)

Nakatani, Naoki; Chan, Garnet Kin-Lic

2013-04-01

We investigate tree tensor network states for quantum chemistry. Tree tensor network states represent one of the simplest generalizations of matrix product states and the density matrix renormalization group. While matrix product states encode a one-dimensional entanglement structure, tree tensor network states encode a tree entanglement structure, allowing for a more flexible description of general molecules. We describe an optimal tree tensor network state algorithm for quantum chemistry. We introduce the concept of half-renormalization which greatly improves the efficiency of the calculations. Using our efficient formulation we demonstrate the strengths and weaknesses of tree tensor network states versus matrix product states. We carry out benchmark calculations both on tree systems (hydrogen trees and π-conjugated dendrimers) as well as non-tree molecules (hydrogen chains, nitrogen dimer, and chromium dimer). In general, tree tensor network states require much fewer renormalized states to achieve the same accuracy as matrix product states. In non-tree molecules, whether this translates into a computational savings is system dependent, due to the higher prefactor and computational scaling associated with tree algorithms. In tree like molecules, tree network states are easily superior to matrix product states. As an illustration, our largest dendrimer calculation with tree tensor network states correlates 110 electrons in 110 active orbitals.
SU-E-T-538: Evaluation of IMRT Dose Calculation Based on Pencil-Beam and AAA Algorithms.

PubMed

Yuan, Y; Duan, J; Popple, R; Brezovich, I

2012-06-01

To evaluate the accuracy of dose calculation for intensity modulated radiation therapy (IMRT) based on Pencil Beam (PB) and Analytical Anisotropic Algorithm (AAA) computation algorithms. IMRT plans of twelve patients with different treatment sites, including head/neck, lung and pelvis, were investigated. For each patient, dose calculation with PB and AAA algorithms using dose grid sizes of 0.5 mm, 0.25 mm, and 0.125 mm, were compared with composite-beam ion chamber and film measurements in patient specific QA. Discrepancies between the calculation and the measurement were evaluated by percentage error for ion chamber dose and γ〉l failure rate in gamma analysis (3%/3mm) for film dosimetry. For 9 patients, ion chamber dose calculated with AAA-algorithms is closer to ion chamber measurement than that calculated with PB algorithm with grid size of 2.5 mm, though all calculated ion chamber doses are within 3% of the measurements. For head/neck patients and other patients with large treatment volumes, γ〉l failure rate is significantly reduced (within 5%) with AAA-based treatment planning compared to generally more than 10% with PB-based treatment planning (grid size=2.5 mm). For lung and brain cancer patients with medium and small treatment volumes, γ〉l failure rates are typically within 5% for both AAA and PB-based treatment planning (grid size=2.5 mm). For both PB and AAA-based treatment planning, improvements of dose calculation accuracy with finer dose grids were observed in film dosimetry of 11 patients and in ion chamber measurements for 3 patients. AAA-based treatment planning provides more accurate dose calculation for head/neck patients and other patients with large treatment volumes. Compared with film dosimetry, a γ〉l failure rate within 5% can be achieved for AAA-based treatment planning. © 2012 American Association of Physicists in Medicine.
Recommendations for dose calculations of lung cancer treatment plans treated with stereotactic ablative body radiotherapy (SABR)

NASA Astrophysics Data System (ADS)

Devpura, S.; Siddiqui, M. S.; Chen, D.; Liu, D.; Li, H.; Kumar, S.; Gordon, J.; Ajlouni, M.; Movsas, B.; Chetty, I. J.

2014-03-01

The purpose of this study was to systematically evaluate dose distributions computed with 5 different dose algorithms for patients with lung cancers treated using stereotactic ablative body radiotherapy (SABR). Treatment plans for 133 lung cancer patients, initially computed with a 1D-pencil beam (equivalent-path-length, EPL-1D) algorithm, were recalculated with 4 other algorithms commissioned for treatment planning, including 3-D pencil-beam (EPL-3D), anisotropic analytical algorithm (AAA), collapsed cone convolution superposition (CCC), and Monte Carlo (MC). The plan prescription dose was 48 Gy in 4 fractions normalized to the 95% isodose line. Tumors were classified according to location: peripheral tumors surrounded by lung (lung-island, N=39), peripheral tumors attached to the rib-cage or chest wall (lung-wall, N=44), and centrally-located tumors (lung-central, N=50). Relative to the EPL-1D algorithm, PTV D95 and mean dose values computed with the other 4 algorithms were lowest for "lung-island" tumors with smallest field sizes (3-5 cm). On the other hand, the smallest differences were noted for lung-central tumors treated with largest field widths (7-10 cm). Amongst all locations, dose distribution differences were most strongly correlated with tumor size for lung-island tumors. For most cases, convolution/superposition and MC algorithms were in good agreement. Mean lung dose (MLD) values computed with the EPL-1D algorithm were highly correlated with that of the other algorithms (correlation coefficient =0.99). The MLD values were found to be ~10% lower for small lung-island tumors with the model-based (conv/superposition and MC) vs. the correction-based (pencil-beam) algorithms with the model-based algorithms predicting greater low dose spread within the lungs. This study suggests that pencil beam algorithms should be avoided for lung SABR planning. For the most challenging cases, small tumors surrounded entirely by lung tissue (lung-island type), a Monte-Carlo-based algorithm may be warranted.
Open-cycle OTEC system performance analysis. [Claude cycle

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lewandowski, A.A.; Olson, D.A.; Johnson, D.H.

1980-10-01

An algorithm developed to calculate the performance of Claude-Cycle ocean thermal energy conversion (OTEC) systems is described. The algorithm treats each component of the system separately and then interfaces them to form a complete system, allowing a component to be changed without changing the rest of the algorithm. Two components that are subject to change are the evaporator and condenser. For this study we developed mathematical models of a channel-flow evaporator and both a horizontal jet and spray director contact condenser. The algorithm was then programmed to run on SERI's CDC 7600 computer and used to calculate the effect onmore » performance of deaerating the warm and cold water streams before entering the evaporator and condenser, respectively. This study indicates that there is no advantage to removing air from these streams compared with removing the air from the condenser.« less
Harmony search optimization for HDR prostate brachytherapy

NASA Astrophysics Data System (ADS)

Panchal, Aditya

In high dose-rate (HDR) prostate brachytherapy, multiple catheters are inserted interstitially into the target volume. The process of treating the prostate involves calculating and determining the best dose distribution to the target and organs-at-risk by means of optimizing the time that the radioactive source dwells at specified positions within the catheters. It is the goal of this work to investigate the use of a new optimization algorithm, known as Harmony Search, in order to optimize dwell times for HDR prostate brachytherapy. The new algorithm was tested on 9 different patients and also compared with the genetic algorithm. Simulations were performed to determine the optimal value of the Harmony Search parameters. Finally, multithreading of the simulation was examined to determine potential benefits. First, a simulation environment was created using the Python programming language and the wxPython graphical interface toolkit, which was necessary to run repeated optimizations. DICOM RT data from Varian BrachyVision was parsed and used to obtain patient anatomy and HDR catheter information. Once the structures were indexed, the volume of each structure was determined and compared to the original volume calculated in BrachyVision for validation. Dose was calculated using the AAPM TG-43 point source model of the GammaMed 192Ir HDR source and was validated against Varian BrachyVision. A DVH-based objective function was created and used for the optimization simulation. Harmony Search and the genetic algorithm were implemented as optimization algorithms for the simulation and were compared against each other. The optimal values for Harmony Search parameters (Harmony Memory Size [HMS], Harmony Memory Considering Rate [HMCR], and Pitch Adjusting Rate [PAR]) were also determined. Lastly, the simulation was modified to use multiple threads of execution in order to achieve faster computational times. Experimental results show that the volume calculation that was implemented in this thesis was within 2% of the values computed by Varian BrachyVision for the prostate, within 3% for the rectum and bladder and 6% for the urethra. The calculation of dose compared to BrachyVision was determined to be different by only 0.38%. Isodose curves were also generated and were found to be similar to BrachyVision. The comparison between Harmony Search and genetic algorithm showed that Harmony Search was over 4 times faster when compared over multiple data sets. The optimal Harmony Memory Size was found to be 5 or lower; the Harmony Memory Considering Rate was determined to be 0.95, and the Pitch Adjusting Rate was found to be 0.9. Ultimately, the effect of multithreading showed that as intensive computations such as optimization and dose calculation are involved, the threads of execution scale with the number of processors, achieving a speed increase proportional to the number of processor cores. In conclusion, this work showed that Harmony Search is a viable alternative to existing algorithms for use in HDR prostate brachytherapy optimization. Coupled with the optimal parameters for the algorithm and a multithreaded simulation, this combination has the capability to significantly decrease the time spent on minimizing optimization problems in the clinic that are time intensive, such as brachytherapy, IMRT and beam angle optimization.
Monte Carlo evaluation of Acuros XB dose calculation Algorithm for intensity modulated radiation therapy of nasopharyngeal carcinoma

NASA Astrophysics Data System (ADS)

Yeh, Peter C. Y.; Lee, C. C.; Chao, T. C.; Tung, C. J.

2017-11-01

Intensity-modulated radiation therapy is an effective treatment modality for the nasopharyngeal carcinoma. One important aspect of this cancer treatment is the need to have an accurate dose algorithm dealing with the complex air/bone/tissue interface in the head-neck region to achieve the cure without radiation-induced toxicities. The Acuros XB algorithm explicitly solves the linear Boltzmann transport equation in voxelized volumes to account for the tissue heterogeneities such as lungs, bone, air, and soft tissues in the treatment field receiving radiotherapy. With the single beam setup in phantoms, this algorithm has already been demonstrated to achieve the comparable accuracy with Monte Carlo simulations. In the present study, five nasopharyngeal carcinoma patients treated with the intensity-modulated radiation therapy were examined for their dose distributions calculated using the Acuros XB in the planning target volume and the organ-at-risk. Corresponding results of Monte Carlo simulations were computed from the electronic portal image data and the BEAMnrc/DOSXYZnrc code. Analysis of dose distributions in terms of the clinical indices indicated that the Acuros XB was in comparable accuracy with Monte Carlo simulations and better than the anisotropic analytical algorithm for dose calculations in real patients.
Dose specification for radiation therapy: dose to water or dose to medium?

NASA Astrophysics Data System (ADS)

Ma, C.-M.; Li, Jinsheng

2011-05-01

The Monte Carlo method enables accurate dose calculation for radiation therapy treatment planning and has been implemented in some commercial treatment planning systems. Unlike conventional dose calculation algorithms that provide patient dose information in terms of dose to water with variable electron density, the Monte Carlo method calculates the energy deposition in different media and expresses dose to a medium. This paper discusses the differences in dose calculated using water with different electron densities and that calculated for different biological media and the clinical issues on dose specification including dose prescription and plan evaluation using dose to water and dose to medium. We will demonstrate that conventional photon dose calculation algorithms compute doses similar to those simulated by Monte Carlo using water with different electron densities, which are close (<4% differences) to doses to media but significantly different (up to 11%) from doses to water converted from doses to media following American Association of Physicists in Medicine (AAPM) Task Group 105 recommendations. Our results suggest that for consistency with previous radiation therapy experience Monte Carlo photon algorithms report dose to medium for radiotherapy dose prescription, treatment plan evaluation and treatment outcome analysis.
An evolutionary computation based algorithm for calculating solar differential rotation by automatic tracking of coronal bright points

NASA Astrophysics Data System (ADS)

Shahamatnia, Ehsan; Dorotovič, Ivan; Fonseca, Jose M.; Ribeiro, Rita A.

2016-03-01

Developing specialized software tools is essential to support studies of solar activity evolution. With new space missions such as Solar Dynamics Observatory (SDO), solar images are being produced in unprecedented volumes. To capitalize on that huge data availability, the scientific community needs a new generation of software tools for automatic and efficient data processing. In this paper a prototype of a modular framework for solar feature detection, characterization, and tracking is presented. To develop an efficient system capable of automatic solar feature tracking and measuring, a hybrid approach combining specialized image processing, evolutionary optimization, and soft computing algorithms is being followed. The specialized hybrid algorithm for tracking solar features allows automatic feature tracking while gathering characterization details about the tracked features. The hybrid algorithm takes advantages of the snake model, a specialized image processing algorithm widely used in applications such as boundary delineation, image segmentation, and object tracking. Further, it exploits the flexibility and efficiency of Particle Swarm Optimization (PSO), a stochastic population based optimization algorithm. PSO has been used successfully in a wide range of applications including combinatorial optimization, control, clustering, robotics, scheduling, and image processing and video analysis applications. The proposed tool, denoted PSO-Snake model, was already successfully tested in other works for tracking sunspots and coronal bright points. In this work, we discuss the application of the PSO-Snake algorithm for calculating the sidereal rotational angular velocity of the solar corona. To validate the results we compare them with published manual results performed by an expert.
An efficient parallel algorithm: Poststack and prestack Kirchhoff 3D depth migration using flexi-depth iterations

NASA Astrophysics Data System (ADS)

Rastogi, Richa; Srivastava, Abhishek; Khonde, Kiran; Sirasala, Kirannmayi M.; Londhe, Ashutosh; Chavhan, Hitesh

2015-07-01

This paper presents an efficient parallel 3D Kirchhoff depth migration algorithm suitable for current class of multicore architecture. The fundamental Kirchhoff depth migration algorithm exhibits inherent parallelism however, when it comes to 3D data migration, as the data size increases the resource requirement of the algorithm also increases. This challenges its practical implementation even on current generation high performance computing systems. Therefore a smart parallelization approach is essential to handle 3D data for migration. The most compute intensive part of Kirchhoff depth migration algorithm is the calculation of traveltime tables due to its resource requirements such as memory/storage and I/O. In the current research work, we target this area and develop a competent parallel algorithm for post and prestack 3D Kirchhoff depth migration, using hybrid MPI+OpenMP programming techniques. We introduce a concept of flexi-depth iterations while depth migrating data in parallel imaging space, using optimized traveltime table computations. This concept provides flexibility to the algorithm by migrating data in a number of depth iterations, which depends upon the available node memory and the size of data to be migrated during runtime. Furthermore, it minimizes the requirements of storage, I/O and inter-node communication, thus making it advantageous over the conventional parallelization approaches. The developed parallel algorithm is demonstrated and analysed on Yuva II, a PARAM series of supercomputers. Optimization, performance and scalability experiment results along with the migration outcome show the effectiveness of the parallel algorithm.
An efficient dynamic load balancing algorithm

NASA Astrophysics Data System (ADS)

Lagaros, Nikos D.

2014-01-01

In engineering problems, randomness and uncertainties are inherent. Robust design procedures, formulated in the framework of multi-objective optimization, have been proposed in order to take into account sources of randomness and uncertainty. These design procedures require orders of magnitude more computational effort than conventional analysis or optimum design processes since a very large number of finite element analyses is required to be dealt. It is therefore an imperative need to exploit the capabilities of computing resources in order to deal with this kind of problems. In particular, parallel computing can be implemented at the level of metaheuristic optimization, by exploiting the physical parallelization feature of the nondominated sorting evolution strategies method, as well as at the level of repeated structural analyses required for assessing the behavioural constraints and for calculating the objective functions. In this study an efficient dynamic load balancing algorithm for optimum exploitation of available computing resources is proposed and, without loss of generality, is applied for computing the desired Pareto front. In such problems the computation of the complete Pareto front with feasible designs only, constitutes a very challenging task. The proposed algorithm achieves linear speedup factors and almost 100% speedup factor values with reference to the sequential procedure.
Application of a fast skyline computation algorithm for serendipitous searching problems

NASA Astrophysics Data System (ADS)

Koizumi, Kenichi; Hiraki, Kei; Inaba, Mary

2018-02-01

Skyline computation is a method of extracting interesting entries from a large population with multiple attributes. These entries, called skyline or Pareto optimal entries, are known to have extreme characteristics that cannot be found by outlier detection methods. Skyline computation is an important task for characterizing large amounts of data and selecting interesting entries with extreme features. When the population changes dynamically, the task of calculating a sequence of skyline sets is called continuous skyline computation. This task is known to be difficult to perform for the following reasons: (1) information of non-skyline entries must be stored since they may join the skyline in the future; (2) the appearance or disappearance of even a single entry can change the skyline drastically; (3) it is difficult to adopt a geometric acceleration algorithm for skyline computation tasks with high-dimensional datasets. Our new algorithm called jointed rooted-tree (JR-tree) manages entries using a rooted tree structure. JR-tree delays extend the tree to deep levels to accelerate tree construction and traversal. In this study, we presented the difficulties in extracting entries tagged with a rare label in high-dimensional space and the potential of fast skyline computation in low-latency cell identification technology.

Independent calculation of monitor units for VMAT and SPORT

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Xin; Bush, Karl; Ding, Aiping

Purpose: Dose and monitor units (MUs) represent two important facets of a radiation therapy treatment. In current practice, verification of a treatment plan is commonly done in dose domain, in which a phantom measurement or forward dose calculation is performed to examine the dosimetric accuracy and the MU settings of a given treatment plan. While it is desirable to verify directly the MU settings, a computational framework for obtaining the MU values from a known dose distribution has yet to be developed. This work presents a strategy to calculate independently the MUs from a given dose distribution of volumetric modulatedmore » arc therapy (VMAT) and station parameter optimized radiation therapy (SPORT). Methods: The dose at a point can be expressed as a sum of contributions from all the station points (or control points). This relationship forms the basis of the proposed MU verification technique. To proceed, the authors first obtain the matrix elements which characterize the dosimetric contribution of the involved station points by computing the doses at a series of voxels, typically on the prescription surface of the VMAT/SPORT treatment plan, with unit MU setting for all the station points. An in-house Monte Carlo (MC) software is used for the dose matrix calculation. The MUs of the station points are then derived by minimizing the least-squares difference between doses computed by the treatment planning system (TPS) and that of the MC for the selected set of voxels on the prescription surface. The technique is applied to 16 clinical cases with a variety of energies, disease sites, and TPS dose calculation algorithms. Results: For all plans except the lung cases with large tissue density inhomogeneity, the independently computed MUs agree with that of TPS to within 2.7% for all the station points. In the dose domain, no significant difference between the MC and Eclipse Anisotropic Analytical Algorithm (AAA) dose distribution is found in terms of isodose contours, dose profiles, gamma index, and dose volume histogram (DVH) for these cases. For the lung cases, the MC-calculated MUs differ significantly from that of the treatment plan computed using AAA. However, the discrepancies are reduced to within 3% when the TPS dose calculation algorithm is switched to a transport equation-based technique (Acuros™). Comparison in the dose domain between the MC and Eclipse AAA/Acuros calculation yields conclusion consistent with the MU calculation. Conclusions: A computational framework relating the MU and dose domains has been established. The framework does not only enable them to verify the MU values of the involved station points of a VMAT plan directly in the MU domain but also provide a much needed mechanism to adaptively modify the MU values of the station points in accordance to a specific change in the dose domain.« less
Thermodynamic properties of solvated peptides from selective integrated tempering sampling with a new weighting factor estimation algorithm

NASA Astrophysics Data System (ADS)

Shen, Lin; Xie, Liangxu; Yang, Mingjun

2017-04-01

Conformational sampling under rugged energy landscape is always a challenge in computer simulations. The recently developed integrated tempering sampling, together with its selective variant (SITS), emerges to be a powerful tool in exploring the free energy landscape or functional motions of various systems. The estimation of weighting factors constitutes a critical step in these methods and requires accurate calculation of partition function ratio between different thermodynamic states. In this work, we propose a new adaptive update algorithm to compute the weighting factors based on the weighted histogram analysis method (WHAM). The adaptive-WHAM algorithm with SITS is then applied to study the thermodynamic properties of several representative peptide systems solvated in an explicit water box. The performance of the new algorithm is validated in simulations of these solvated peptide systems. We anticipate more applications of this coupled optimisation and production algorithm to other complicated systems such as the biochemical reactions in solution.
Identifying online user reputation of user-object bipartite networks

NASA Astrophysics Data System (ADS)

Liu, Xiao-Lu; Liu, Jian-Guo; Yang, Kai; Guo, Qiang; Han, Jing-Ti

2017-02-01

Identifying online user reputation based on the rating information of the user-object bipartite networks is important for understanding online user collective behaviors. Based on the Bayesian analysis, we present a parameter-free algorithm for ranking online user reputation, where the user reputation is calculated based on the probability that their ratings are consistent with the main part of all user opinions. The experimental results show that the AUC values of the presented algorithm could reach 0.8929 and 0.8483 for the MovieLens and Netflix data sets, respectively, which is better than the results generated by the CR and IARR methods. Furthermore, the experimental results for different user groups indicate that the presented algorithm outperforms the iterative ranking methods in both ranking accuracy and computation complexity. Moreover, the results for the synthetic networks show that the computation complexity of the presented algorithm is a linear function of the network size, which suggests that the presented algorithm is very effective and efficient for the large scale dynamic online systems.
Efficient GW calculations using eigenvalue-eigenvector decomposition of the dielectric matrix

NASA Astrophysics Data System (ADS)

Nguyen, Huy-Viet; Pham, T. Anh; Rocca, Dario; Galli, Giulia

2011-03-01

During the past 25 years, the GW method has been successfully used to compute electronic quasi-particle excitation spectra of a variety of materials. It is however a computationally intensive technique, as it involves summations over occupied and empty electronic states, to evaluate both the Green function (G) and the dielectric matrix (DM) entering the expression of the screened Coulomb interaction (W). Recent developments have shown that eigenpotentials of DMs can be efficiently calculated without any explicit evaluation of empty states. In this work, we will present a computationally efficient approach to the calculations of GW spectra by combining a representation of DMs in terms of its eigenpotentials and a recently developed iterative algorithm. As a demonstration of the efficiency of the method, we will present calculations of the vertical ionization potentials of several systems. Work was funnded by SciDAC-e DE-FC02-06ER25777.
Massively parallel sparse matrix function calculations with NTPoly

NASA Astrophysics Data System (ADS)

Dawson, William; Nakajima, Takahito

2018-04-01

We present NTPoly, a massively parallel library for computing the functions of sparse, symmetric matrices. The theory of matrix functions is a well developed framework with a wide range of applications including differential equations, graph theory, and electronic structure calculations. One particularly important application area is diagonalization free methods in quantum chemistry. When the input and output of the matrix function are sparse, methods based on polynomial expansions can be used to compute matrix functions in linear time. We present a library based on these methods that can compute a variety of matrix functions. Distributed memory parallelization is based on a communication avoiding sparse matrix multiplication algorithm. OpenMP task parallellization is utilized to implement hybrid parallelization. We describe NTPoly's interface and show how it can be integrated with programs written in many different programming languages. We demonstrate the merits of NTPoly by performing large scale calculations on the K computer.
Research in Computational Aeroscience Applications Implemented on Advanced Parallel Computing Systems

NASA Technical Reports Server (NTRS)

Wigton, Larry

1996-01-01

Improving the numerical linear algebra routines for use in new Navier-Stokes codes, specifically Tim Barth's unstructured grid code, with spin-offs to TRANAIR is reported. A fast distance calculation routine for Navier-Stokes codes using the new one-equation turbulence models is written. The primary focus of this work was devoted to improving matrix-iterative methods. New algorithms have been developed which activate the full potential of classical Cray-class computers as well as distributed-memory parallel computers.
FPGA-based real-time phase measuring profilometry algorithm design and implementation

NASA Astrophysics Data System (ADS)

Zhan, Guomin; Tang, Hongwei; Zhong, Kai; Li, Zhongwei; Shi, Yusheng

2016-11-01

Phase measuring profilometry (PMP) has been widely used in many fields, like Computer Aided Verification (CAV), Flexible Manufacturing System (FMS) et al. High frame-rate (HFR) real-time vision-based feedback control will be a common demands in near future. However, the instruction time delay in the computer caused by numerous repetitive operations greatly limit the efficiency of data processing. FPGA has the advantages of pipeline architecture and parallel execution, and it fit for handling PMP algorithm. In this paper, we design a fully pipelined hardware architecture for PMP. The functions of hardware architecture includes rectification, phase calculation, phase shifting, and stereo matching. The experiment verified the performance of this method, and the factors that may influence the computation accuracy was analyzed.
Projection matrix acquisition for cone-beam computed tomography iterative reconstruction

NASA Astrophysics Data System (ADS)

Yang, Fuqiang; Zhang, Dinghua; Huang, Kuidong; Shi, Wenlong; Zhang, Caixin; Gao, Zongzhao

2017-02-01

Projection matrix is an essential and time-consuming part in computed tomography (CT) iterative reconstruction. In this article a novel calculation algorithm of three-dimensional (3D) projection matrix is proposed to quickly acquire the matrix for cone-beam CT (CBCT). The CT data needed to be reconstructed is considered as consisting of the three orthogonal sets of equally spaced and parallel planes, rather than the individual voxels. After getting the intersections the rays with the surfaces of the voxels, the coordinate points and vertex is compared to obtain the index value that the ray traversed. Without considering ray-slope to voxel, it just need comparing the position of two points. Finally, the computer simulation is used to verify the effectiveness of the algorithm.
Shared Memory Parallelization of an Implicit ADI-type CFD Code

NASA Technical Reports Server (NTRS)

Hauser, Th.; Huang, P. G.

1999-01-01

A parallelization study designed for ADI-type algorithms is presented using the OpenMP specification for shared-memory multiprocessor programming. Details of optimizations specifically addressed to cache-based computer architectures are described and performance measurements for the single and multiprocessor implementation are summarized. The paper demonstrates that optimization of memory access on a cache-based computer architecture controls the performance of the computational algorithm. A hybrid MPI/OpenMP approach is proposed for clusters of shared memory machines to further enhance the parallel performance. The method is applied to develop a new LES/DNS code, named LESTool. A preliminary DNS calculation of a fully developed channel flow at a Reynolds number of 180, Re(sub tau) = 180, has shown good agreement with existing data.
Computational model for fuel component supply into a combustion chamber of LRE

NASA Astrophysics Data System (ADS)

Teterev, A. V.; Mandrik, P. A.; Rudak, L. V.; Misyuchenko, N. I.

2017-12-01

A 2D-3D computational model for calculating a flow inside jet injectors that feed fuel components to a combustion chamber of a liquid rocket engine is described. The model is based on the gasdynamic calculation of compressible medium. Model software provides calculation of both one- and two-component injectors. Flow simulation in two-component injectors is realized using the scheme of separate supply of “gas-gas” or “gas-liquid” fuel components. An algorithm for converting a continuous liquid medium into a “cloud” of drops is described. Application areas of the developed model and the results of 2D simulation of injectors to obtain correction factors in the calculation formulas for fuel supply are discussed.
Sensitivity derivatives for advanced CFD algorithm and viscous modelling parameters via automatic differentiation

NASA Technical Reports Server (NTRS)

Green, Lawrence L.; Newman, Perry A.; Haigler, Kara J.

1993-01-01

The computational technique of automatic differentiation (AD) is applied to a three-dimensional thin-layer Navier-Stokes multigrid flow solver to assess the feasibility and computational impact of obtaining exact sensitivity derivatives typical of those needed for sensitivity analyses. Calculations are performed for an ONERA M6 wing in transonic flow with both the Baldwin-Lomax and Johnson-King turbulence models. The wing lift, drag, and pitching moment coefficients are differentiated with respect to two different groups of input parameters. The first group consists of the second- and fourth-order damping coefficients of the computational algorithm, whereas the second group consists of two parameters in the viscous turbulent flow physics modelling. Results obtained via AD are compared, for both accuracy and computational efficiency with the results obtained with divided differences (DD). The AD results are accurate, extremely simple to obtain, and show significant computational advantage over those obtained by DD for some cases.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Loef, P.A.; Smed, T.; Andersson, G.

The minimum singular value of the power flow Jacobian matrix has been used as a static voltage stability index, indicating the distance between the studied operating point and the steady state voltage stability limit. In this paper a fast method to calculate the minimum singular value and the corresponding (left and right) singular vectors is presented. The main advantages of the developed algorithm are the small amount of computation time needed, and that it only requires information available from an ordinary program for power flow calculations. Furthermore, the proposed method fully utilizes the sparsity of the power flow Jacobian matrixmore » and hence the memory requirements for the computation are low. These advantages are preserved when applied to various submatrices of the Jacobian matrix, which can be useful in constructing special voltage stability indices. The developed algorithm was applied to small test systems as well as to a large (real size) system with over 1000 nodes, with satisfactory results.« less
Analysis of stationary availability factor of two-level backbone computer networks with arbitrary topology

NASA Astrophysics Data System (ADS)

Rahman, P. A.

2018-05-01

This scientific paper deals with the two-level backbone computer networks with arbitrary topology. A specialized method, offered by the author for calculation of the stationary availability factor of the two-level backbone computer networks, based on the Markov reliability models for the set of the independent repairable elements with the given failure and repair rates and the methods of the discrete mathematics, is also discussed. A specialized algorithm, offered by the author for analysis of the network connectivity, taking into account different kinds of the network equipment failures, is also observed. Finally, this paper presents an example of calculation of the stationary availability factor for the backbone computer network with the given topology.
Investigation and Implementation of Matrix Permanent Algorithms for Identity Resolution

DTIC Science & Technology

2014-12-01

calculation of the permanent of a matrix whose dimension is a function of target count [21]. However, the optimal approach for computing the permanent is...presently unclear. The primary objective of this project was to determine the optimal computing strategy(-ies) for the matrix permanent in tactical and...solving various combinatorial problems (see [16] for details and appli- cations to a wide variety of problems) and thus can be applied to compute a
Layer-oriented multigrid wavefront reconstruction algorithms for multi-conjugate adaptive optics

NASA Astrophysics Data System (ADS)

Gilles, Luc; Ellerbroek, Brent L.; Vogel, Curtis R.

2003-02-01

Multi-conjugate adaptive optics (MCAO) systems with 104-105 degrees of freedom have been proposed for future giant telescopes. Using standard matrix methods to compute, optimize, and implement wavefront control algorithms for these systems is impractical, since the number of calculations required to compute and apply the reconstruction matrix scales respectively with the cube and the square of the number of AO degrees of freedom. In this paper, we develop an iterative sparse matrix implementation of minimum variance wavefront reconstruction for telescope diameters up to 32m with more than 104 actuators. The basic approach is the preconditioned conjugate gradient method, using a multigrid preconditioner incorporating a layer-oriented (block) symmetric Gauss-Seidel iterative smoothing operator. We present open-loop numerical simulation results to illustrate algorithm convergence.
Optical systolic solutions of linear algebraic equations

NASA Technical Reports Server (NTRS)

Neuman, C. P.; Casasent, D.

1984-01-01

The philosophy and data encoding possible in systolic array optical processor (SAOP) were reviewed. The multitude of linear algebraic operations achievable on this architecture is examined. These operations include such linear algebraic algorithms as: matrix-decomposition, direct and indirect solutions, implicit and explicit methods for partial differential equations, eigenvalue and eigenvector calculations, and singular value decomposition. This architecture can be utilized to realize general techniques for solving matrix linear and nonlinear algebraic equations, least mean square error solutions, FIR filters, and nested-loop algorithms for control engineering applications. The data flow and pipelining of operations, design of parallel algorithms and flexible architectures, application of these architectures to computationally intensive physical problems, error source modeling of optical processors, and matching of the computational needs of practical engineering problems to the capabilities of optical processors are emphasized.
GPUs benchmarking in subpixel image registration algorithm

NASA Astrophysics Data System (ADS)

Sanz-Sabater, Martin; Picazo-Bueno, Jose Angel; Micó, Vicente; Ferrerira, Carlos; Granero, Luis; Garcia, Javier

2015-05-01

Image registration techniques are used among different scientific fields, like medical imaging or optical metrology. The straightest way to calculate shifting between two images is using the cross correlation, taking the highest value of this correlation image. Shifting resolution is given in whole pixels which cannot be enough for certain applications. Better results can be achieved interpolating both images, as much as the desired resolution we want to get, and applying the same technique described before, but the memory needed by the system is significantly higher. To avoid memory consuming we are implementing a subpixel shifting method based on FFT. With the original images, subpixel shifting can be achieved multiplying its discrete Fourier transform by a linear phase with different slopes. This method is high time consuming method because checking a concrete shifting means new calculations. The algorithm, highly parallelizable, is very suitable for high performance computing systems. GPU (Graphics Processing Unit) accelerated computing became very popular more than ten years ago because they have hundreds of computational cores in a reasonable cheap card. In our case, we are going to register the shifting between two images, doing the first approach by FFT based correlation, and later doing the subpixel approach using the technique described before. We consider it as `brute force' method. So we will present a benchmark of the algorithm consisting on a first approach (pixel resolution) and then do subpixel resolution approaching, decreasing the shifting step in every loop achieving a high resolution in few steps. This program will be executed in three different computers. At the end, we will present the results of the computation, with different kind of CPUs and GPUs, checking the accuracy of the method, and the time consumed in each computer, discussing the advantages, disadvantages of the use of GPUs.
A hierarchical network-based algorithm for multi-scale watershed delineation

NASA Astrophysics Data System (ADS)

Castronova, Anthony M.; Goodall, Jonathan L.

2014-11-01

Watershed delineation is a process for defining a land area that contributes surface water flow to a single outlet point. It is a commonly used in water resources analysis to define the domain in which hydrologic process calculations are applied. There has been a growing effort over the past decade to improve surface elevation measurements in the U.S., which has had a significant impact on the accuracy of hydrologic calculations. Traditional watershed processing on these elevation rasters, however, becomes more burdensome as data resolution increases. As a result, processing of these datasets can be troublesome on standard desktop computers. This challenge has resulted in numerous works that aim to provide high performance computing solutions to large data, high resolution data, or both. This work proposes an efficient watershed delineation algorithm for use in desktop computing environments that leverages existing data, U.S. Geological Survey (USGS) National Hydrography Dataset Plus (NHD+), and open source software tools to construct watershed boundaries. This approach makes use of U.S. national-level hydrography data that has been precomputed using raster processing algorithms coupled with quality control routines. Our approach uses carefully arranged data and mathematical graph theory to traverse river networks and identify catchment boundaries. We demonstrate this new watershed delineation technique, compare its accuracy with traditional algorithms that derive watershed solely from digital elevation models, and then extend our approach to address subwatershed delineation. Our findings suggest that the open-source hierarchical network-based delineation procedure presented in the work is a promising approach to watershed delineation that can be used summarize publicly available datasets for hydrologic model input pre-processing. Through our analysis, we explore the benefits of reusing the NHD+ datasets for watershed delineation, and find that the our technique offers greater flexibility and extendability than traditional raster algorithms.
Computing Cooling Flows in Turbines

NASA Technical Reports Server (NTRS)

Gauntner, J.

1986-01-01

Algorithm developed for calculating both quantity of compressor bleed flow required to cool turbine and resulting decrease in efficiency due to cooling air injected into gas stream. Program intended for use with axial-flow, air-breathing, jet-propulsion engines with variety of airfoil-cooling configurations. Algorithm results compared extremely well with figures given by major engine manufacturers for given bulk-metal temperatures and cooling configurations. Program written in FORTRAN IV for batch execution.
KAM Tori Construction Algorithms

NASA Astrophysics Data System (ADS)

Wiesel, W.

In this paper we evaluate and compare two algorithms for the calculation of KAM tori in Hamiltonian systems. The direct fitting of a torus Fourier series to a numerically integrated trajectory is the first method, while an accelerated finite Fourier transform is the second method. The finite Fourier transform, with Hanning window functions, is by far superior in both computational loading and numerical accuracy. Some thoughts on applications of KAM tori are offered.

Computing the Free Energy Barriers for Less by Sampling with a Coarse Reference Potential while Retaining Accuracy of the Target Fine Model.

PubMed

Plotnikov, Nikolay V

2014-08-12

Proposed in this contribution is a protocol for calculating fine-physics (e.g., ab initio QM/MM) free-energy surfaces at a high level of accuracy locally (e.g., only at reactants and at the transition state for computing the activation barrier) from targeted fine-physics sampling and extensive exploratory coarse-physics sampling. The full free-energy surface is still computed but at a lower level of accuracy from coarse-physics sampling. The method is analytically derived in terms of the umbrella sampling and the free-energy perturbation methods which are combined with the thermodynamic cycle and the targeted sampling strategy of the paradynamics approach. The algorithm starts by computing low-accuracy fine-physics free-energy surfaces from the coarse-physics sampling in order to identify the reaction path and to select regions for targeted sampling. Thus, the algorithm does not rely on the coarse-physics minimum free-energy reaction path. Next, segments of high-accuracy free-energy surface are computed locally at selected regions from the targeted fine-physics sampling and are positioned relative to the coarse-physics free-energy shifts. The positioning is done by averaging the free-energy perturbations computed with multistep linear response approximation method. This method is analytically shown to provide results of the thermodynamic integration and the free-energy interpolation methods, while being extremely simple in implementation. Incorporating the metadynamics sampling to the algorithm is also briefly outlined. The application is demonstrated by calculating the B3LYP//6-31G*/MM free-energy barrier for an enzymatic reaction using a semiempirical PM6/MM reference potential. These modifications allow computing the activation free energies at a significantly reduced computational cost but at the same level of accuracy compared to computing full potential of mean force.
Computing the Free Energy Barriers for Less by Sampling with a Coarse Reference Potential while Retaining Accuracy of the Target Fine Model

PubMed Central

2015-01-01

Proposed in this contribution is a protocol for calculating fine-physics (e.g., ab initio QM/MM) free-energy surfaces at a high level of accuracy locally (e.g., only at reactants and at the transition state for computing the activation barrier) from targeted fine-physics sampling and extensive exploratory coarse-physics sampling. The full free-energy surface is still computed but at a lower level of accuracy from coarse-physics sampling. The method is analytically derived in terms of the umbrella sampling and the free-energy perturbation methods which are combined with the thermodynamic cycle and the targeted sampling strategy of the paradynamics approach. The algorithm starts by computing low-accuracy fine-physics free-energy surfaces from the coarse-physics sampling in order to identify the reaction path and to select regions for targeted sampling. Thus, the algorithm does not rely on the coarse-physics minimum free-energy reaction path. Next, segments of high-accuracy free-energy surface are computed locally at selected regions from the targeted fine-physics sampling and are positioned relative to the coarse-physics free-energy shifts. The positioning is done by averaging the free-energy perturbations computed with multistep linear response approximation method. This method is analytically shown to provide results of the thermodynamic integration and the free-energy interpolation methods, while being extremely simple in implementation. Incorporating the metadynamics sampling to the algorithm is also briefly outlined. The application is demonstrated by calculating the B3LYP//6-31G*/MM free-energy barrier for an enzymatic reaction using a semiempirical PM6/MM reference potential. These modifications allow computing the activation free energies at a significantly reduced computational cost but at the same level of accuracy compared to computing full potential of mean force. PMID:25136268
Smart algorithms and adaptive methods in computational fluid dynamics

NASA Astrophysics Data System (ADS)

Tinsley Oden, J.

1989-05-01

A review is presented of the use of smart algorithms which employ adaptive methods in processing large amounts of data in computational fluid dynamics (CFD). Smart algorithms use a rationally based set of criteria for automatic decision making in an attempt to produce optimal simulations of complex fluid dynamics problems. The information needed to make these decisions is not known beforehand and evolves in structure and form during the numerical solution of flow problems. Once the code makes a decision based on the available data, the structure of the data may change, and criteria may be reapplied in order to direct the analysis toward an acceptable end. Intelligent decisions are made by processing vast amounts of data that evolve unpredictably during the calculation. The basic components of adaptive methods and their application to complex problems of fluid dynamics are reviewed. The basic components of adaptive methods are: (1) data structures, that is what approaches are available for modifying data structures of an approximation so as to reduce errors; (2) error estimation, that is what techniques exist for estimating error evolution in a CFD calculation; and (3) solvers, what algorithms are available which can function in changing meshes. Numerical examples which demonstrate the viability of these approaches are presented.
Improved treatment of exact exchange in Quantum ESPRESSO

DOE PAGES

Barnes, Taylor A.; Kurth, Thorsten; Carrier, Pierre; ...

2017-01-18

Here, we present an algorithm and implementation for the parallel computation of exact exchange in Quantum ESPRESSO (QE) that exhibits greatly improved strong scaling. QE is an open-source software package for electronic structure calculations using plane wave density functional theory, and supports the use of local, semi-local, and hybrid DFT functionals. Wider application of hybrid functionals is desirable for the improved simulation of electronic band energy alignments and thermodynamic properties, but the computational complexity of evaluating the exact exchange potential limits the practical application of hybrid functionals to large systems and requires efficient implementations. We demonstrate that existing implementations ofmore » hybrid DFT that utilize a single data structure for both the local and exact exchange regions of the code are significantly limited in the degree of parallelization achievable. We present a band-pair parallelization approach, in which the calculation of exact exchange is parallelized and evaluated independently from the parallelization of the remainder of the calculation, with the wavefunction data being efficiently transformed on-the-fly into a form that is optimal for each part of the calculation. For a 64 water molecule supercell, our new algorithm reduces the overall time to solution by nearly an order of magnitude.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)

Barnes, Taylor A.; Kurth, Thorsten; Carrier, Pierre

Here, we present an algorithm and implementation for the parallel computation of exact exchange in Quantum ESPRESSO (QE) that exhibits greatly improved strong scaling. QE is an open-source software package for electronic structure calculations using plane wave density functional theory, and supports the use of local, semi-local, and hybrid DFT functionals. Wider application of hybrid functionals is desirable for the improved simulation of electronic band energy alignments and thermodynamic properties, but the computational complexity of evaluating the exact exchange potential limits the practical application of hybrid functionals to large systems and requires efficient implementations. We demonstrate that existing implementations ofmore » hybrid DFT that utilize a single data structure for both the local and exact exchange regions of the code are significantly limited in the degree of parallelization achievable. We present a band-pair parallelization approach, in which the calculation of exact exchange is parallelized and evaluated independently from the parallelization of the remainder of the calculation, with the wavefunction data being efficiently transformed on-the-fly into a form that is optimal for each part of the calculation. For a 64 water molecule supercell, our new algorithm reduces the overall time to solution by nearly an order of magnitude.« less
Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

DOE PAGES

Shao, Meiyue; Aktulga, H. Metin; Yang, Chao; ...

2017-09-14

In this paper, we describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of special structure of the nuclear configuration interaction problem which we discuss in detail. Themore » use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. Finally, we also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.« less
Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shao, Meiyue; Aktulga, H. Metin; Yang, Chao

In this paper, we describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of special structure of the nuclear configuration interaction problem which we discuss in detail. Themore » use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. Finally, we also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.« less
Connes distance function on fuzzy sphere and the connection between geometry and statistics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Devi, Yendrembam Chaoba, E-mail: chaoba@bose.res.in; Chakraborty, Biswajit, E-mail: biswajit@bose.res.in; Prajapat, Shivraj, E-mail: shraprajapat@gmail.com

An algorithm to compute Connes spectral distance, adaptable to the Hilbert-Schmidt operatorial formulation of non-commutative quantum mechanics, was developed earlier by introducing the appropriate spectral triple and used to compute infinitesimal distances in the Moyal plane, revealing a deep connection between geometry and statistics. In this paper, using the same algorithm, the Connes spectral distance has been calculated in the Hilbert-Schmidt operatorial formulation for the fuzzy sphere whose spatial coordinates satisfy the su(2) algebra. This has been computed for both the discrete and the Perelemov’s SU(2) coherent state. Here also, we get a connection between geometry and statistics which ismore » shown by computing the infinitesimal distance between mixed states on the quantum Hilbert space of a particular fuzzy sphere, indexed by n ∈ ℤ/2.« less
Calculation of absolute protein-ligand binding free energy using distributed replica sampling.

PubMed

Rodinger, Tomas; Howell, P Lynne; Pomès, Régis

2008-10-21

Distributed replica sampling [T. Rodinger et al., J. Chem. Theory Comput. 2, 725 (2006)] is a simple and general scheme for Boltzmann sampling of conformational space by computer simulation in which multiple replicas of the system undergo a random walk in reaction coordinate or temperature space. Individual replicas are linked through a generalized Hamiltonian containing an extra potential energy term or bias which depends on the distribution of all replicas, thus enforcing the desired sampling distribution along the coordinate or parameter of interest regardless of free energy barriers. In contrast to replica exchange methods, efficient implementation of the algorithm does not require synchronicity of the individual simulations. The algorithm is inherently suited for large-scale simulations using shared or heterogeneous computing platforms such as a distributed network. In this work, we build on our original algorithm by introducing Boltzmann-weighted jumping, which allows moves of a larger magnitude and thus enhances sampling efficiency along the reaction coordinate. The approach is demonstrated using a realistic and biologically relevant application; we calculate the standard binding free energy of benzene to the L99A mutant of T4 lysozyme. Distributed replica sampling is used in conjunction with thermodynamic integration to compute the potential of mean force for extracting the ligand from protein and solvent along a nonphysical spatial coordinate. Dynamic treatment of the reaction coordinate leads to faster statistical convergence of the potential of mean force than a conventional static coordinate, which suffers from slow transitions on a rugged potential energy surface.
Calculation of absolute protein-ligand binding free energy using distributed replica sampling

NASA Astrophysics Data System (ADS)

Rodinger, Tomas; Howell, P. Lynne; Pomès, Régis

2008-10-01

Distributed replica sampling [T. Rodinger et al., J. Chem. Theory Comput. 2, 725 (2006)] is a simple and general scheme for Boltzmann sampling of conformational space by computer simulation in which multiple replicas of the system undergo a random walk in reaction coordinate or temperature space. Individual replicas are linked through a generalized Hamiltonian containing an extra potential energy term or bias which depends on the distribution of all replicas, thus enforcing the desired sampling distribution along the coordinate or parameter of interest regardless of free energy barriers. In contrast to replica exchange methods, efficient implementation of the algorithm does not require synchronicity of the individual simulations. The algorithm is inherently suited for large-scale simulations using shared or heterogeneous computing platforms such as a distributed network. In this work, we build on our original algorithm by introducing Boltzmann-weighted jumping, which allows moves of a larger magnitude and thus enhances sampling efficiency along the reaction coordinate. The approach is demonstrated using a realistic and biologically relevant application; we calculate the standard binding free energy of benzene to the L99A mutant of T4 lysozyme. Distributed replica sampling is used in conjunction with thermodynamic integration to compute the potential of mean force for extracting the ligand from protein and solvent along a nonphysical spatial coordinate. Dynamic treatment of the reaction coordinate leads to faster statistical convergence of the potential of mean force than a conventional static coordinate, which suffers from slow transitions on a rugged potential energy surface.
Introduction to Mathematica® for Physicists

NASA Astrophysics Data System (ADS)

Grozin, Andrey

We were taught at calculus classes that integration is an art, not a science (in contrast to differentiation—even a monkey can be trained to take derivatives). And we were taught wrong. The Risch algorithm (which is known for decades) allows one to find, in a finite number of steps, if a given indefinite integral can be taken in elementary functions, and if so, to calculate it. This algorithm has been constructed in works by an American mathematician Risch near 1970; many cases were not analyzed completely in these works and were later considered by other mathematicians. The algorithm is very complicated, and no computer algebra system implements it fully. Its implementation in Mathematica is rather complete, even with extensions to some classes of special functions, but details are not publicly known. Strictly speaking, it is not quite an algorithm, because it contains algorithmically unsolvable subproblems, such as finding out if a given combination of elementary functions vanishes. But in practice computer algebra systems are quite good in solving such problems. Here we shall consider, at a very elementary level, the main ideas of the Risch algorithm; see [16] for more details.
F-8C adaptive control law refinement and software development

NASA Technical Reports Server (NTRS)

Hartmann, G. L.; Stein, G.

1981-01-01

An explicit adaptive control algorithm based on maximum likelihood estimation of parameters was designed. To avoid iterative calculations, the algorithm uses parallel channels of Kalman filters operating at fixed locations in parameter space. This algorithm was implemented in NASA/DFRC's Remotely Augmented Vehicle (RAV) facility. Real-time sensor outputs (rate gyro, accelerometer, surface position) are telemetered to a ground computer which sends new gain values to an on-board system. Ground test data and flight records were used to establish design values of noise statistics and to verify the ground-based adaptive software.
K-Nearest Neighbor Algorithm Optimization in Text Categorization

NASA Astrophysics Data System (ADS)

Chen, Shufeng

2018-01-01

K-Nearest Neighbor (KNN) classification algorithm is one of the simplest methods of data mining. It has been widely used in classification, regression and pattern recognition. The traditional KNN method has some shortcomings such as large amount of sample computation and strong dependence on the sample library capacity. In this paper, a method of representative sample optimization based on CURE algorithm is proposed. On the basis of this, presenting a quick algorithm QKNN (Quick k-nearest neighbor) to find the nearest k neighbor samples, which greatly reduces the similarity calculation. The experimental results show that this algorithm can effectively reduce the number of samples and speed up the search for the k nearest neighbor samples to improve the performance of the algorithm.
A point kernel algorithm for microbeam radiation therapy

NASA Astrophysics Data System (ADS)

Debus, Charlotte; Oelfke, Uwe; Bartzsch, Stefan

2017-11-01

Microbeam radiation therapy (MRT) is a treatment approach in radiation therapy where the treatment field is spatially fractionated into arrays of a few tens of micrometre wide planar beams of unusually high peak doses separated by low dose regions of several hundred micrometre width. In preclinical studies, this treatment approach has proven to spare normal tissue more effectively than conventional radiation therapy, while being equally efficient in tumour control. So far dose calculations in MRT, a prerequisite for future clinical applications are based on Monte Carlo simulations. However, they are computationally expensive, since scoring volumes have to be small. In this article a kernel based dose calculation algorithm is presented that splits the calculation into photon and electron mediated energy transport, and performs the calculation of peak and valley doses in typical MRT treatment fields within a few minutes. Kernels are analytically calculated depending on the energy spectrum and material composition. In various homogeneous materials peak, valley doses and microbeam profiles are calculated and compared to Monte Carlo simulations. For a microbeam exposure of an anthropomorphic head phantom calculated dose values are compared to measurements and Monte Carlo calculations. Except for regions close to material interfaces calculated peak dose values match Monte Carlo results within 4% and valley dose values within 8% deviation. No significant differences are observed between profiles calculated by the kernel algorithm and Monte Carlo simulations. Measurements in the head phantom agree within 4% in the peak and within 10% in the valley region. The presented algorithm is attached to the treatment planning platform VIRTUOS. It was and is used for dose calculations in preclinical and pet-clinical trials at the biomedical beamline ID17 of the European synchrotron radiation facility in Grenoble, France.
Algorithms for Computing the Magnetic Field, Vector Potential, and Field Derivatives for a Thin Solenoid with Uniform Current Density

DOE Office of Scientific and Technical Information (OSTI.GOV)

Walstrom, Peter Lowell

A numerical algorithm for computing the field components B r and B z and their r and z derivatives with open boundaries in cylindrical coordinates for radially thin solenoids with uniform current density is described in this note. An algorithm for computing the vector potential A θ is also described. For the convenience of the reader, derivations of the final expressions from their defining integrals are given in detail, since their derivations are not all easily found in textbooks. Numerical calculations are based on evaluation of complete elliptic integrals using the Bulirsch algorithm cel. The (apparently) new feature of themore » algorithms described in this note applies to cases where the field point is outside of the bore of the solenoid and the field-point radius approaches the solenoid radius. Since the elliptic integrals of the third kind normally used in computing B z and A θ become infinite in this region of parameter space, fields for points with the axial coordinate z outside of the ends of the solenoid and near the solenoid radius are treated by use of elliptic integrals of the third kind of modified argument, derived by use of an addition theorem. Also, the algorithms also avoid the numerical difficulties the textbook solutions have for points near the axis arising from explicit factors of 1/r or 1/r 2 in the some of the expressions.« less
GPU-accelerated algorithms for many-particle continuous-time quantum walks

NASA Astrophysics Data System (ADS)

Piccinini, Enrico; Benedetti, Claudia; Siloi, Ilaria; Paris, Matteo G. A.; Bordone, Paolo

2017-06-01

Many-particle continuous-time quantum walks (CTQWs) represent a resource for several tasks in quantum technology, including quantum search algorithms and universal quantum computation. In order to design and implement CTQWs in a realistic scenario, one needs effective simulation tools for Hamiltonians that take into account static noise and fluctuations in the lattice, i.e. Hamiltonians containing stochastic terms. To this aim, we suggest a parallel algorithm based on the Taylor series expansion of the evolution operator, and compare its performances with those of algorithms based on the exact diagonalization of the Hamiltonian or a 4th order Runge-Kutta integration. We prove that both Taylor-series expansion and Runge-Kutta algorithms are reliable and have a low computational cost, the Taylor-series expansion showing the additional advantage of a memory allocation not depending on the precision of calculation. Both algorithms are also highly parallelizable within the SIMT paradigm, and are thus suitable for GPGPU computing. In turn, we have benchmarked 4 NVIDIA GPUs and 3 quad-core Intel CPUs for a 2-particle system over lattices of increasing dimension, showing that the speedup provided by GPU computing, with respect to the OPENMP parallelization, lies in the range between 8x and (more than) 20x, depending on the frequency of post-processing. GPU-accelerated codes thus allow one to overcome concerns about the execution time, and make it possible simulations with many interacting particles on large lattices, with the only limit of the memory available on the device.
Calculating Higher-Order Moments of Phylogenetic Stochastic Mapping Summaries in Linear Time.

PubMed

Dhar, Amrit; Minin, Vladimir N

2017-05-01

Stochastic mapping is a simulation-based method for probabilistically mapping substitution histories onto phylogenies according to continuous-time Markov models of evolution. This technique can be used to infer properties of the evolutionary process on the phylogeny and, unlike parsimony-based mapping, conditions on the observed data to randomly draw substitution mappings that do not necessarily require the minimum number of events on a tree. Most stochastic mapping applications simulate substitution mappings only to estimate the mean and/or variance of two commonly used mapping summaries: the number of particular types of substitutions (labeled substitution counts) and the time spent in a particular group of states (labeled dwelling times) on the tree. Fast, simulation-free algorithms for calculating the mean of stochastic mapping summaries exist. Importantly, these algorithms scale linearly in the number of tips/leaves of the phylogenetic tree. However, to our knowledge, no such algorithm exists for calculating higher-order moments of stochastic mapping summaries. We present one such simulation-free dynamic programming algorithm that calculates prior and posterior mapping variances and scales linearly in the number of phylogeny tips. Our procedure suggests a general framework that can be used to efficiently compute higher-order moments of stochastic mapping summaries without simulations. We demonstrate the usefulness of our algorithm by extending previously developed statistical tests for rate variation across sites and for detecting evolutionarily conserved regions in genomic sequences.
Calculating Higher-Order Moments of Phylogenetic Stochastic Mapping Summaries in Linear Time

PubMed Central

Dhar, Amrit

2017-01-01

Abstract Stochastic mapping is a simulation-based method for probabilistically mapping substitution histories onto phylogenies according to continuous-time Markov models of evolution. This technique can be used to infer properties of the evolutionary process on the phylogeny and, unlike parsimony-based mapping, conditions on the observed data to randomly draw substitution mappings that do not necessarily require the minimum number of events on a tree. Most stochastic mapping applications simulate substitution mappings only to estimate the mean and/or variance of two commonly used mapping summaries: the number of particular types of substitutions (labeled substitution counts) and the time spent in a particular group of states (labeled dwelling times) on the tree. Fast, simulation-free algorithms for calculating the mean of stochastic mapping summaries exist. Importantly, these algorithms scale linearly in the number of tips/leaves of the phylogenetic tree. However, to our knowledge, no such algorithm exists for calculating higher-order moments of stochastic mapping summaries. We present one such simulation-free dynamic programming algorithm that calculates prior and posterior mapping variances and scales linearly in the number of phylogeny tips. Our procedure suggests a general framework that can be used to efficiently compute higher-order moments of stochastic mapping summaries without simulations. We demonstrate the usefulness of our algorithm by extending previously developed statistical tests for rate variation across sites and for detecting evolutionarily conserved regions in genomic sequences. PMID:28177780
Patient‐specific CT dosimetry calculation: a feasibility study

PubMed Central

Xie, Huchen; Cheng, Jason Y.; Ning, Holly; Zhuge, Ying; Miller, Robert W.

2011-01-01

Current estimation of radiation dose from computed tomography (CT) scans on patients has relied on the measurement of Computed Tomography Dose Index (CTDI) in standard cylindrical phantoms, and calculations based on mathematical representations of “standard man”. Radiation dose to both adult and pediatric patients from a CT scan has been a concern, as noted in recent reports. The purpose of this study was to investigate the feasibility of adapting a radiation treatment planning system (RTPS) to provide patient‐specific CT dosimetry. A radiation treatment planning system was modified to calculate patient‐specific CT dose distributions, which can be represented by dose at specific points within an organ of interest, as well as organ dose‐volumes (after image segmentation) for a GE Light Speed Ultra Plus CT scanner. The RTPS calculation algorithm is based on a semi‐empirical, measured correction‐based algorithm, which has been well established in the radiotherapy community. Digital representations of the physical phantoms (virtual phantom) were acquired with the GE CT scanner in axial mode. Thermoluminescent dosimeter (TLDs) measurements in pediatric anthropomorphic phantoms were utilized to validate the dose at specific points within organs of interest relative to RTPS calculations and Monte Carlo simulations of the same virtual phantoms (digital representation). Congruence of the calculated and measured point doses for the same physical anthropomorphic phantom geometry was used to verify the feasibility of the method. The RTPS algorithm can be extended to calculate the organ dose by calculating a dose distribution point‐by‐point for a designated volume. Electron Gamma Shower (EGSnrc) codes for radiation transport calculations developed by National Research Council of Canada (NRCC) were utilized to perform the Monte Carlo (MC) simulation. In general, the RTPS and MC dose calculations are within 10% of the TLD measurements for the infant and child chest scans. With respect to the dose comparisons for the head, the RTPS dose calculations are slightly higher (10%–20%) than the TLD measurements, while the MC results were within 10% of the TLD measurements. The advantage of the algebraic dose calculation engine of the RTPS is a substantially reduced computation time (minutes vs. days) relative to Monte Carlo calculations, as well as providing patient‐specific dose estimation. It also provides the basis for a more elaborate reporting of dosimetric results, such as patient specific organ dose volumes after image segmentation. PACS numbers: 87.55.D‐, 87.57.Q‐, 87.53.Bn, 87.55.K‐ PMID:22089016
Area collapse algorithm computing new curve of 2D geometric objects

NASA Astrophysics Data System (ADS)

Buczek, Michał Mateusz

2017-06-01

The processing of cartographic data demands human involvement. Up-to-date algorithms try to automate a part of this process. The goal is to obtain a digital model, or additional information about shape and topology of input geometric objects. A topological skeleton is one of the most important tools in the branch of science called shape analysis. It represents topological and geometrical characteristics of input data. Its plot depends on using algorithms such as medial axis, skeletonization, erosion, thinning, area collapse and many others. Area collapse, also known as dimension change, replaces input data with lower-dimensional geometric objects like, for example, a polygon with a polygonal chain, a line segment with a point. The goal of this paper is to introduce a new algorithm for the automatic calculation of polygonal chains representing a 2D polygon. The output is entirely contained within the area of the input polygon, and it has a linear plot without branches. The computational process is automatic and repeatable. The requirements of input data are discussed. The author analyzes results based on the method of computing ends of output polygonal chains. Additional methods to improve results are explored. The algorithm was tested on real-world cartographic data received from BDOT/GESUT databases, and on point clouds from laser scanning. An implementation for computing hatching of embankment is described.

A novel measure and significance testing in data analysis of cell image segmentation.

PubMed

Wu, Jin Chu; Halter, Michael; Kacker, Raghu N; Elliott, John T; Plant, Anne L

2017-03-14

Cell image segmentation (CIS) is an essential part of quantitative imaging of biological cells. Designing a performance measure and conducting significance testing are critical for evaluating and comparing the CIS algorithms for image-based cell assays in cytometry. Many measures and methods have been proposed and implemented to evaluate segmentation methods. However, computing the standard errors (SE) of the measures and their correlation coefficient is not described, and thus the statistical significance of performance differences between CIS algorithms cannot be assessed. We propose the total error rate (TER), a novel performance measure for segmenting all cells in the supervised evaluation. The TER statistically aggregates all misclassification error rates (MER) by taking cell sizes as weights. The MERs are for segmenting each single cell in the population. The TER is fully supported by the pairwise comparisons of MERs using 106 manually segmented ground-truth cells with different sizes and seven CIS algorithms taken from ImageJ. Further, the SE and 95% confidence interval (CI) of TER are computed based on the SE of MER that is calculated using the bootstrap method. An algorithm for computing the correlation coefficient of TERs between two CIS algorithms is also provided. Hence, the 95% CI error bars can be used to classify CIS algorithms. The SEs of TERs and their correlation coefficient can be employed to conduct the hypothesis testing, while the CIs overlap, to determine the statistical significance of the performance differences between CIS algorithms. A novel measure TER of CIS is proposed. The TER's SEs and correlation coefficient are computed. Thereafter, CIS algorithms can be evaluated and compared statistically by conducting the significance testing.
Transient Solid Dynamics Simulations on the Sandia/Intel Teraflop Computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Attaway, S.; Brown, K.; Gardner, D.

1997-12-31

Transient solid dynamics simulations are among the most widely used engineering calculations. Industrial applications include vehicle crashworthiness studies, metal forging, and powder compaction prior to sintering. These calculations are also critical to defense applications including safety studies and weapons simulations. The practical importance of these calculations and their computational intensiveness make them natural candidates for parallelization. This has proved to be difficult, and existing implementations fail to scale to more than a few dozen processors. In this paper we describe our parallelization of PRONTO, Sandia`s transient solid dynamics code, via a novel algorithmic approach that utilizes multiple decompositions for differentmore » key segments of the computations, including the material contact calculation. This latter calculation is notoriously difficult to perform well in parallel, because it involves dynamically changing geometry, global searches for elements in contact, and unstructured communications among the compute nodes. Our approach scales to at least 3600 compute nodes of the Sandia/Intel Teraflop computer (the largest set of nodes to which we have had access to date) on problems involving millions of finite elements. On this machine we can simulate models using more than ten- million elements in a few tenths of a second per timestep, and solve problems more than 3000 times faster than a single processor Cray Jedi.« less
A user's manual for DELSOL3: A computer code for calculating the optical performance and optimal system design for solar thermal central receiver plants

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kistler, B.L.

DELSOL3 is a revised and updated version of the DELSOL2 computer program (SAND81-8237) for calculating collector field performance and layout and optimal system design for solar thermal central receiver plants. The code consists of a detailed model of the optical performance, a simpler model of the non-optical performance, an algorithm for field layout, and a searching algorithm to find the best system design based on energy cost. The latter two features are coupled to a cost model of central receiver components and an economic model for calculating energy costs. The code can handle flat, focused and/or canted heliostats, and externalmore » cylindrical, multi-aperture cavity, and flat plate receivers. The program optimizes the tower height, receiver size, field layout, heliostat spacings, and tower position at user specified power levels subject to flux limits on the receiver and land constraints for field layout. DELSOL3 maintains the advantages of speed and accuracy which are characteristics of DELSOL2.« less
Toward real-time diffuse optical tomography: accelerating light propagation modeling employing parallel computing on GPU and CPU.

PubMed

Doulgerakis, Matthaios; Eggebrecht, Adam; Wojtkiewicz, Stanislaw; Culver, Joseph; Dehghani, Hamid

2017-12-01

Parameter recovery in diffuse optical tomography is a computationally expensive algorithm, especially when used for large and complex volumes, as in the case of human brain functional imaging. The modeling of light propagation, also known as the forward problem, is the computational bottleneck of the recovery algorithm, whereby the lack of a real-time solution is impeding practical and clinical applications. The objective of this work is the acceleration of the forward model, within a diffusion approximation-based finite-element modeling framework, employing parallelization to expedite the calculation of light propagation in realistic adult head models. The proposed methodology is applicable for modeling both continuous wave and frequency-domain systems with the results demonstrating a 10-fold speed increase when GPU architectures are available, while maintaining high accuracy. It is shown that, for a very high-resolution finite-element model of the adult human head with ∼600,000 nodes, consisting of heterogeneous layers, light propagation can be calculated at ∼0.25 s/excitation source. (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).
A Highly Parallelized Special-Purpose Computer for Many-Body Simulations with an Arbitrary Central Force: MD-GRAPE

NASA Astrophysics Data System (ADS)

Fukushige, Toshiyuki; Taiji, Makoto; Makino, Junichiro; Ebisuzaki, Toshikazu; Sugimoto, Daiichiro

1996-09-01

We have developed a parallel, pipelined special-purpose computer for N-body simulations, MD-GRAPE (for "GRAvity PipE"). In gravitational N- body simulations, almost all computing time is spent on the calculation of interactions between particles. GRAPE is specialized hardware to calculate these interactions. It is used with a general-purpose front-end computer that performs all calculations other than the force calculation. MD-GRAPE is the first parallel GRAPE that can calculate an arbitrary central force. A force different from a pure 1/r potential is necessary for N-body simulations with periodic boundary conditions using the Ewald or particle-particle/particle-mesh (P^3^M) method. MD-GRAPE accelerates the calculation of particle-particle force for these algorithms. An MD- GRAPE board has four MD chips and its peak performance is 4.2 GFLOPS. On an MD-GRAPE board, a cosmological N-body simulation takes 6O0(N/10^6^)^3/2^ s per step for the Ewald method, where N is the number of particles, and would take 24O(N/10^6^) s per step for the P^3^M method, in a uniform distribution of particles.
Adaptive beamlet-based finite-size pencil beam dose calculation for independent verification of IMRT and VMAT.

PubMed

Park, Justin C; Li, Jonathan G; Arhjoul, Lahcen; Yan, Guanghua; Lu, Bo; Fan, Qiyong; Liu, Chihray

2015-04-01

The use of sophisticated dose calculation procedure in modern radiation therapy treatment planning is inevitable in order to account for complex treatment fields created by multileaf collimators (MLCs). As a consequence, independent volumetric dose verification is time consuming, which affects the efficiency of clinical workflow. In this study, the authors present an efficient adaptive beamlet-based finite-size pencil beam (AB-FSPB) dose calculation algorithm that minimizes the computational procedure while preserving the accuracy. The computational time of finite-size pencil beam (FSPB) algorithm is proportional to the number of infinitesimal and identical beamlets that constitute an arbitrary field shape. In AB-FSPB, dose distribution from each beamlet is mathematically modeled such that the sizes of beamlets to represent an arbitrary field shape no longer need to be infinitesimal nor identical. As a result, it is possible to represent an arbitrary field shape with combinations of different sized and minimal number of beamlets. In addition, the authors included the model parameters to consider MLC for its rounded edge and transmission. Root mean square error (RMSE) between treatment planning system and conventional FSPB on a 10 × 10 cm(2) square field using 10 × 10, 2.5 × 2.5, and 0.5 × 0.5 cm(2) beamlet sizes were 4.90%, 3.19%, and 2.87%, respectively, compared with RMSE of 1.10%, 1.11%, and 1.14% for AB-FSPB. This finding holds true for a larger square field size of 25 × 25 cm(2), where RMSE for 25 × 25, 2.5 × 2.5, and 0.5 × 0.5 cm(2) beamlet sizes were 5.41%, 4.76%, and 3.54% in FSPB, respectively, compared with RMSE of 0.86%, 0.83%, and 0.88% for AB-FSPB. It was found that AB-FSPB could successfully account for the MLC transmissions without major discrepancy. The algorithm was also graphical processing unit (GPU) compatible to maximize its computational speed. For an intensity modulated radiation therapy (∼12 segments) and a volumetric modulated arc therapy fields (∼90 control points) with a 3D grid size of 2.0 × 2.0 × 2.0 mm(3), dose was computed within 3-5 and 10-15 s timeframe, respectively. The authors have developed an efficient adaptive beamlet-based pencil beam dose calculation algorithm. The fast computation nature along with GPU compatibility has shown better performance than conventional FSPB. This enables the implementation of AB-FSPB in the clinical environment for independent volumetric dose verification.
Report of the AAPM Task Group No. 105: Issues associated with clinical implementation of Monte Carlo-based photon and electron external beam treatment planning.

PubMed

Chetty, Indrin J; Curran, Bruce; Cygler, Joanna E; DeMarco, John J; Ezzell, Gary; Faddegon, Bruce A; Kawrakow, Iwan; Keall, Paul J; Liu, Helen; Ma, C M Charlie; Rogers, D W O; Seuntjens, Jan; Sheikh-Bagheri, Daryoush; Siebers, Jeffrey V

2007-12-01

The Monte Carlo (MC) method has been shown through many research studies to calculate accurate dose distributions for clinical radiotherapy, particularly in heterogeneous patient tissues where the effects of electron transport cannot be accurately handled with conventional, deterministic dose algorithms. Despite its proven accuracy and the potential for improved dose distributions to influence treatment outcomes, the long calculation times previously associated with MC simulation rendered this method impractical for routine clinical treatment planning. However, the development of faster codes optimized for radiotherapy calculations and improvements in computer processor technology have substantially reduced calculation times to, in some instances, within minutes on a single processor. These advances have motivated several major treatment planning system vendors to embark upon the path of MC techniques. Several commercial vendors have already released or are currently in the process of releasing MC algorithms for photon and/or electron beam treatment planning. Consequently, the accessibility and use of MC treatment planning algorithms may well become widespread in the radiotherapy community. With MC simulation, dose is computed stochastically using first principles; this method is therefore quite different from conventional dose algorithms. Issues such as statistical uncertainties, the use of variance reduction techniques, the ability to account for geometric details in the accelerator treatment head simulation, and other features, are all unique components of a MC treatment planning algorithm. Successful implementation by the clinical physicist of such a system will require an understanding of the basic principles of MC techniques. The purpose of this report, while providing education and review on the use of MC simulation in radiotherapy planning, is to set out, for both users and developers, the salient issues associated with clinical implementation and experimental verification of MC dose algorithms. As the MC method is an emerging technology, this report is not meant to be prescriptive. Rather, it is intended as a preliminary report to review the tenets of the MC method and to provide the framework upon which to build a comprehensive program for commissioning and routine quality assurance of MC-based treatment planning systems.
Evolving Non-Dominated Parameter Sets for Computational Models from Multiple Experiments

NASA Astrophysics Data System (ADS)

Lane, Peter C. R.; Gobet, Fernand

2013-03-01

Creating robust, reproducible and optimal computational models is a key challenge for theorists in many sciences. Psychology and cognitive science face particular challenges as large amounts of data are collected and many models are not amenable to analytical techniques for calculating parameter sets. Particular problems are to locate the full range of acceptable model parameters for a given dataset, and to confirm the consistency of model parameters across different datasets. Resolving these problems will provide a better understanding of the behaviour of computational models, and so support the development of general and robust models. In this article, we address these problems using evolutionary algorithms to develop parameters for computational models against multiple sets of experimental data; in particular, we propose the `speciated non-dominated sorting genetic algorithm' for evolving models in several theories. We discuss the problem of developing a model of categorisation using twenty-nine sets of data and models drawn from four different theories. We find that the evolutionary algorithms generate high quality models, adapted to provide a good fit to all available data.
Energy Conservation Using Dynamic Voltage Frequency Scaling for Computational Cloud

PubMed Central

Florence, A. Paulin; Shanthi, V.; Simon, C. B. Sunil

2016-01-01

Cloud computing is a new technology which supports resource sharing on a “Pay as you go” basis around the world. It provides various services such as SaaS, IaaS, and PaaS. Computation is a part of IaaS and the entire computational requests are to be served efficiently with optimal power utilization in the cloud. Recently, various algorithms are developed to reduce power consumption and even Dynamic Voltage and Frequency Scaling (DVFS) scheme is also used in this perspective. In this paper we have devised methodology which analyzes the behavior of the given cloud request and identifies the associated type of algorithm. Once the type of algorithm is identified, using their asymptotic notations, its time complexity is calculated. Using best fit strategy the appropriate host is identified and the incoming job is allocated to the victimized host. Using the measured time complexity the required clock frequency of the host is measured. According to that CPU frequency is scaled up or down using DVFS scheme, enabling energy to be saved up to 55% of total Watts consumption. PMID:27239551
Energy Conservation Using Dynamic Voltage Frequency Scaling for Computational Cloud.

PubMed

Florence, A Paulin; Shanthi, V; Simon, C B Sunil

2016-01-01

Cloud computing is a new technology which supports resource sharing on a "Pay as you go" basis around the world. It provides various services such as SaaS, IaaS, and PaaS. Computation is a part of IaaS and the entire computational requests are to be served efficiently with optimal power utilization in the cloud. Recently, various algorithms are developed to reduce power consumption and even Dynamic Voltage and Frequency Scaling (DVFS) scheme is also used in this perspective. In this paper we have devised methodology which analyzes the behavior of the given cloud request and identifies the associated type of algorithm. Once the type of algorithm is identified, using their asymptotic notations, its time complexity is calculated. Using best fit strategy the appropriate host is identified and the incoming job is allocated to the victimized host. Using the measured time complexity the required clock frequency of the host is measured. According to that CPU frequency is scaled up or down using DVFS scheme, enabling energy to be saved up to 55% of total Watts consumption.
A hybrid Gerchberg-Saxton-like algorithm for DOE and CGH calculation

NASA Astrophysics Data System (ADS)

Wang, Haichao; Yue, Weirui; Song, Qiang; Liu, Jingdan; Situ, Guohai

2017-02-01

The Gerchberg-Saxton (GS) algorithm is widely used in various disciplines of modern sciences and technologies where phase retrieval is required. However, this legendary algorithm most likely stagnates after a few iterations. Many efforts have been taken to improve this situation. Here we propose to introduce the strategy of gradient descent and weighting technique to the GS algorithm, and demonstrate it using two examples: design of a diffractive optical element (DOE) to achieve off-axis illumination in lithographic tools, and design of a computer generated hologram (CGH) for holographic display. Both numerical simulation and optical experiments are carried out for demonstration.
Application of Approximate Pattern Matching in Two Dimensional Spaces to Grid Layout for Biochemical Network Maps

PubMed Central

Inoue, Kentaro; Shimozono, Shinichi; Yoshida, Hideaki; Kurata, Hiroyuki

2012-01-01

Background For visualizing large-scale biochemical network maps, it is important to calculate the coordinates of molecular nodes quickly and to enhance the understanding or traceability of them. The grid layout is effective in drawing compact, orderly, balanced network maps with node label spaces, but existing grid layout algorithms often require a high computational cost because they have to consider complicated positional constraints through the entire optimization process. Results We propose a hybrid grid layout algorithm that consists of a non-grid, fast layout (preprocessor) algorithm and an approximate pattern matching algorithm that distributes the resultant preprocessed nodes on square grid points. To demonstrate the feasibility of the hybrid layout algorithm, it is characterized in terms of the calculation time, numbers of edge-edge and node-edge crossings, relative edge lengths, and F-measures. The proposed algorithm achieves outstanding performances compared with other existing grid layouts. Conclusions Use of an approximate pattern matching algorithm quickly redistributes the laid-out nodes by fast, non-grid algorithms on the square grid points, while preserving the topological relationships among the nodes. The proposed algorithm is a novel use of the pattern matching, thereby providing a breakthrough for grid layout. This application program can be freely downloaded from http://www.cadlive.jp/hybridlayout/hybridlayout.html. PMID:22679486
Application of approximate pattern matching in two dimensional spaces to grid layout for biochemical network maps.

PubMed

Inoue, Kentaro; Shimozono, Shinichi; Yoshida, Hideaki; Kurata, Hiroyuki

2012-01-01

For visualizing large-scale biochemical network maps, it is important to calculate the coordinates of molecular nodes quickly and to enhance the understanding or traceability of them. The grid layout is effective in drawing compact, orderly, balanced network maps with node label spaces, but existing grid layout algorithms often require a high computational cost because they have to consider complicated positional constraints through the entire optimization process. We propose a hybrid grid layout algorithm that consists of a non-grid, fast layout (preprocessor) algorithm and an approximate pattern matching algorithm that distributes the resultant preprocessed nodes on square grid points. To demonstrate the feasibility of the hybrid layout algorithm, it is characterized in terms of the calculation time, numbers of edge-edge and node-edge crossings, relative edge lengths, and F-measures. The proposed algorithm achieves outstanding performances compared with other existing grid layouts. Use of an approximate pattern matching algorithm quickly redistributes the laid-out nodes by fast, non-grid algorithms on the square grid points, while preserving the topological relationships among the nodes. The proposed algorithm is a novel use of the pattern matching, thereby providing a breakthrough for grid layout. This application program can be freely downloaded from http://www.cadlive.jp/hybridlayout/hybridlayout.html.
Evaluation of automatic cloud removal method for high elevation areas in Landsat 8 OLI images to improve environmental indexes computation

NASA Astrophysics Data System (ADS)

Alvarez, César I.; Teodoro, Ana; Tierra, Alfonso

2017-10-01

Thin clouds in the optical remote sensing data are frequent and in most of the cases don't allow to have a pure surface data in order to calculate some indexes as Normalized Difference Vegetation Index (NDVI). This paper aims to evaluate the Automatic Cloud Removal Method (ACRM) algorithm over a high elevation city like Quito (Ecuador), with an altitude of 2800 meters above sea level, where the clouds are presented all the year. The ACRM is an algorithm that considers a linear regression between each Landsat 8 OLI band and the Cirrus band using the slope obtained with the linear regression established. This algorithm was employed without any reference image or mask to try to remove the clouds. The results of the application of the ACRM algorithm over Quito didn't show a good performance. Therefore, was considered improving this algorithm using a different slope value data (ACMR Improved). After, the NDVI computation was compared with a reference NDVI MODIS data (MOD13Q1). The ACMR Improved algorithm had a successful result when compared with the original ACRM algorithm. In the future, this Improved ACRM algorithm needs to be tested in different regions of the world with different conditions to evaluate if the algorithm works successfully for all conditions.
A new root-based direction-finding algorithm

NASA Astrophysics Data System (ADS)

Wasylkiwskyj, Wasyl; Kopriva, Ivica; DoroslovačKi, Miloš; Zaghloul, Amir I.

2007-04-01

Polynomial rooting direction-finding (DF) algorithms are a computationally efficient alternative to search-based DF algorithms and are particularly suitable for uniform linear arrays of physically identical elements provided that mutual interaction among the array elements can be either neglected or compensated for. A popular algorithm in such situations is Root Multiple Signal Classification (Root MUSIC (RM)), wherein the estimation of the directions of arrivals (DOA) requires the computation of the roots of a (2N - 2) -order polynomial, where N represents number of array elements. The DOA are estimated from the L pairs of roots closest to the unit circle, where L represents number of sources. In this paper we derive a modified root polynomial (MRP) algorithm requiring the calculation of only L roots in order to estimate the L DOA. We evaluate the performance of the MRP algorithm numerically and show that it is as accurate as the RM algorithm but with a significantly simpler algebraic structure. In order to demonstrate that the theoretically predicted performance can be achieved in an experimental setting, a decoupled array is emulated in hardware using phase shifters. The results are in excellent agreement with theory.
Kalman/Map filtering-aided fast normalized cross correlation-based Wi-Fi fingerprinting location sensing.

PubMed

Sun, Yongliang; Xu, Yubin; Li, Cheng; Ma, Lin

2013-11-13

A Kalman/map filtering (KMF)-aided fast normalized cross correlation (FNCC)-based Wi-Fi fingerprinting location sensing system is proposed in this paper. Compared with conventional neighbor selection algorithms that calculate localization results with received signal strength (RSS) mean samples, the proposed FNCC algorithm makes use of all the on-line RSS samples and reference point RSS variations to achieve higher fingerprinting accuracy. The FNCC computes efficiently while maintaining the same accuracy as the basic normalized cross correlation. Additionally, a KMF is also proposed to process fingerprinting localization results. It employs a new map matching algorithm to nonlinearize the linear location prediction process of Kalman filtering (KF) that takes advantage of spatial proximities of consecutive localization results. With a calibration model integrated into an indoor map, the map matching algorithm corrects unreasonable prediction locations of the KF according to the building interior structure. Thus, more accurate prediction locations are obtained. Using these locations, the KMF considerably improves fingerprinting algorithm performance. Experimental results demonstrate that the FNCC algorithm with reduced computational complexity outperforms other neighbor selection algorithms and the KMF effectively improves location sensing accuracy by using indoor map information and spatial proximities of consecutive localization results.
Kalman/Map Filtering-Aided Fast Normalized Cross Correlation-Based Wi-Fi Fingerprinting Location Sensing

PubMed Central

Sun, Yongliang; Xu, Yubin; Li, Cheng; Ma, Lin

2013-01-01

A Kalman/map filtering (KMF)-aided fast normalized cross correlation (FNCC)-based Wi-Fi fingerprinting location sensing system is proposed in this paper. Compared with conventional neighbor selection algorithms that calculate localization results with received signal strength (RSS) mean samples, the proposed FNCC algorithm makes use of all the on-line RSS samples and reference point RSS variations to achieve higher fingerprinting accuracy. The FNCC computes efficiently while maintaining the same accuracy as the basic normalized cross correlation. Additionally, a KMF is also proposed to process fingerprinting localization results. It employs a new map matching algorithm to nonlinearize the linear location prediction process of Kalman filtering (KF) that takes advantage of spatial proximities of consecutive localization results. With a calibration model integrated into an indoor map, the map matching algorithm corrects unreasonable prediction locations of the KF according to the building interior structure. Thus, more accurate prediction locations are obtained. Using these locations, the KMF considerably improves fingerprinting algorithm performance. Experimental results demonstrate that the FNCC algorithm with reduced computational complexity outperforms other neighbor selection algorithms and the KMF effectively improves location sensing accuracy by using indoor map information and spatial proximities of consecutive localization results. PMID:24233027
Generalized concurrence in boson sampling.

PubMed

Chin, Seungbeom; Huh, Joonsuk

2018-04-17

A fundamental question in linear optical quantum computing is to understand the origin of the quantum supremacy in the physical system. It is found that the multimode linear optical transition amplitudes are calculated through the permanents of transition operator matrices, which is a hard problem for classical simulations (boson sampling problem). We can understand this problem by considering a quantum measure that directly determines the runtime for computing the transition amplitudes. In this paper, we suggest a quantum measure named "Fock state concurrence sum" C S , which is the summation over all the members of "the generalized Fock state concurrence" (a measure analogous to the generalized concurrences of entanglement and coherence). By introducing generalized algorithms for computing the transition amplitudes of the Fock state boson sampling with an arbitrary number of photons per mode, we show that the minimal classical runtime for all the known algorithms directly depends on C S . Therefore, we can state that the Fock state concurrence sum C S behaves as a collective measure that controls the computational complexity of Fock state BS. We expect that our observation on the role of the Fock state concurrence in the generalized algorithm for permanents would provide a unified viewpoint to interpret the quantum computing power of linear optics.
A-VCI: A flexible method to efficiently compute vibrational spectra

NASA Astrophysics Data System (ADS)

Odunlami, Marc; Le Bris, Vincent; Bégué, Didier; Baraille, Isabelle; Coulaud, Olivier

2017-06-01

The adaptive vibrational configuration interaction algorithm has been introduced as a new method to efficiently reduce the dimension of the set of basis functions used in a vibrational configuration interaction process. It is based on the construction of nested bases for the discretization of the Hamiltonian operator according to a theoretical criterion that ensures the convergence of the method. In the present work, the Hamiltonian is written as a sum of products of operators. The purpose of this paper is to study the properties and outline the performance details of the main steps of the algorithm. New parameters have been incorporated to increase flexibility, and their influence has been thoroughly investigated. The robustness and reliability of the method are demonstrated for the computation of the vibrational spectrum up to 3000 cm-1 of a widely studied 6-atom molecule (acetonitrile). Our results are compared to the most accurate up to date computation; we also give a new reference calculation for future work on this system. The algorithm has also been applied to a more challenging 7-atom molecule (ethylene oxide). The computed spectrum up to 3200 cm-1 is the most accurate computation that exists today on such systems.
A-VCI: A flexible method to efficiently compute vibrational spectra.

PubMed

Odunlami, Marc; Le Bris, Vincent; Bégué, Didier; Baraille, Isabelle; Coulaud, Olivier

2017-06-07

The adaptive vibrational configuration interaction algorithm has been introduced as a new method to efficiently reduce the dimension of the set of basis functions used in a vibrational configuration interaction process. It is based on the construction of nested bases for the discretization of the Hamiltonian operator according to a theoretical criterion that ensures the convergence of the method. In the present work, the Hamiltonian is written as a sum of products of operators. The purpose of this paper is to study the properties and outline the performance details of the main steps of the algorithm. New parameters have been incorporated to increase flexibility, and their influence has been thoroughly investigated. The robustness and reliability of the method are demonstrated for the computation of the vibrational spectrum up to 3000 cm -1 of a widely studied 6-atom molecule (acetonitrile). Our results are compared to the most accurate up to date computation; we also give a new reference calculation for future work on this system. The algorithm has also been applied to a more challenging 7-atom molecule (ethylene oxide). The computed spectrum up to 3200 cm -1 is the most accurate computation that exists today on such systems.

A correction scheme for a simplified analytical random walk model algorithm of proton dose calculation in distal Bragg peak regions

NASA Astrophysics Data System (ADS)

Yao, Weiguang; Merchant, Thomas E.; Farr, Jonathan B.

2016-10-01

The lateral homogeneity assumption is used in most analytical algorithms for proton dose, such as the pencil-beam algorithms and our simplified analytical random walk model. To improve the dose calculation in the distal fall-off region in heterogeneous media, we analyzed primary proton fluence near heterogeneous media and propose to calculate the lateral fluence with voxel-specific Gaussian distributions. The lateral fluence from a beamlet is no longer expressed by a single Gaussian for all the lateral voxels, but by a specific Gaussian for each lateral voxel. The voxel-specific Gaussian for the beamlet of interest is calculated by re-initializing the fluence deviation on an effective surface where the proton energies of the beamlet of interest and the beamlet passing the voxel are the same. The dose improvement from the correction scheme was demonstrated by the dose distributions in two sets of heterogeneous phantoms consisting of cortical bone, lung, and water and by evaluating distributions in example patients with a head-and-neck tumor and metal spinal implants. The dose distributions from Monte Carlo simulations were used as the reference. The correction scheme effectively improved the dose calculation accuracy in the distal fall-off region and increased the gamma test pass rate. The extra computation for the correction was about 20% of that for the original algorithm but is dependent upon patient geometry.
A robust method of computing finite difference coefficients based on Vandermonde matrix

NASA Astrophysics Data System (ADS)

Zhang, Yijie; Gao, Jinghuai; Peng, Jigen; Han, Weimin

2018-05-01

When the finite difference (FD) method is employed to simulate the wave propagation, high-order FD method is preferred in order to achieve better accuracy. However, if the order of FD scheme is high enough, the coefficient matrix of the formula for calculating finite difference coefficients is close to be singular. In this case, when the FD coefficients are computed by matrix inverse operator of MATLAB, inaccuracy can be produced. In order to overcome this problem, we have suggested an algorithm based on Vandermonde matrix in this paper. After specified mathematical transformation, the coefficient matrix is transformed into a Vandermonde matrix. Then the FD coefficients of high-order FD method can be computed by the algorithm of Vandermonde matrix, which prevents the inverse of the singular matrix. The dispersion analysis and numerical results of a homogeneous elastic model and a geophysical model of oil and gas reservoir demonstrate that the algorithm based on Vandermonde matrix has better accuracy compared with matrix inverse operator of MATLAB.
Advantages of formulating an evolution equation directly for elastic distortional deformation in finite deformation plasticity

NASA Astrophysics Data System (ADS)

Rubin, M. B.; Cardiff, P.

2017-11-01

Simo (Comput Methods Appl Mech Eng 66:199-219, 1988) proposed an evolution equation for elastic deformation together with a constitutive equation for inelastic deformation rate in plasticity. The numerical algorithm (Simo in Comput Methods Appl Mech Eng 68:1-31, 1988) for determining elastic distortional deformation was simple. However, the proposed inelastic deformation rate caused plastic compaction. The corrected formulation (Simo in Comput Methods Appl Mech Eng 99:61-112, 1992) preserves isochoric plasticity but the numerical integration algorithm is complicated and needs special methods for calculation of the exponential map of a tensor. Alternatively, an evolution equation for elastic distortional deformation can be proposed directly with a simplified constitutive equation for inelastic distortional deformation rate. This has the advantage that the physics of inelastic distortional deformation is separated from that of dilatation. The example of finite deformation J2 plasticity with linear isotropic hardening is used to demonstrate the simplicity of the numerical algorithm.
Simulation of subwavelength metallic gratings using a new implementation of the recursive convolution finite-difference time-domain algorithm.

PubMed

Banerjee, Saswatee; Hoshino, Tetsuya; Cole, James B

2008-08-01

We introduce a new implementation of the finite-difference time-domain (FDTD) algorithm with recursive convolution (RC) for first-order Drude metals. We implemented RC for both Maxwell's equations for light polarized in the plane of incidence (TM mode) and the wave equation for light polarized normal to the plane of incidence (TE mode). We computed the Drude parameters at each wavelength using the measured value of the dielectric constant as a function of the spatial and temporal discretization to ensure both the accuracy of the material model and algorithm stability. For the TE mode, where Maxwell's equations reduce to the wave equation (even in a region of nonuniform permittivity) we introduced a wave equation formulation of RC-FDTD. This greatly reduces the computational cost. We used our methods to compute the diffraction characteristics of metallic gratings in the visible wavelength band and compared our results with frequency-domain calculations.
A complex guided spectral transform Lanczos method for studying quantum resonance states

DOE PAGES

Yu, Hua-Gen

2014-12-28

A complex guided spectral transform Lanczos (cGSTL) algorithm is proposed to compute both bound and resonance states including energies, widths and wavefunctions. The algorithm comprises of two layers of complex-symmetric Lanczos iterations. A short inner layer iteration produces a set of complex formally orthogonal Lanczos (cFOL) polynomials. They are used to span the guided spectral transform function determined by a retarded Green operator. An outer layer iteration is then carried out with the transform function to compute the eigen-pairs of the system. The guided spectral transform function is designed to have the same wavefunctions as the eigenstates of the originalmore » Hamiltonian in the spectral range of interest. Therefore the energies and/or widths of bound or resonance states can be easily computed with their wavefunctions or by using a root-searching method from the guided spectral transform surface. The new cGSTL algorithm is applied to bound and resonance states of HO₂, and compared to previous calculations.« less
The Calculation of VOCs Diffusion Coefficient for Building Materials

NASA Astrophysics Data System (ADS)

Zhang, Xin; Deng, Quancai; Chen, Haijiang; Wu, Xiaoyun

2018-05-01

Volatile Organic Compounds (VOCS), as one of the major sources of air contaminations, has an important bearing on one’s general health. The adsorption capacity and velocity of the material for VOCs can be described separately using. In this paper, the detailed process and method of VOCs diffusion and partition coefficients by genetic algorithm is introduced, the algorithm is realized easily by computer program and the result by the method is precise and practical.
A computer program for the calculation of the flow field including boundary layer effects for mixed-compression inlets at angle of attack

NASA Technical Reports Server (NTRS)

Vadyak, J.; Hoffman, J. D.

1982-01-01

A computer program was developed which is capable of calculating the flow field in the supersonic portion of a mixed compression aircraft inlet operating at angle of attack. The supersonic core flow is computed using a second-order three dimensional method-of-characteristics algorithm. The bow shock and the internal shock train are treated discretely using a three dimensional shock fitting procedure. The boundary layer flows are computed using a second-order implicit finite difference method. The shock wave-boundary layer interaction is computed using an integral formulation. The general structure of the computer program is discussed, and a brief description of each subroutine is given. All program input parameters are defined, and a brief discussion on interpretation of the output is provided. A number of sample cases, complete with data listings, are provided.
Mathematical model of a rotational bioreactor for the dynamic cultivation of scaffold-adhered human mesenchymal stem cells for bone regeneration

NASA Astrophysics Data System (ADS)

Ganimedov, V. L.; Papaeva, E. O.; Maslov, N. A.; Larionov, P. M.

2017-09-01

Development of cell-mediated scaffold technologies for the treatment of critical bone defects is very important for the purpose of reparative bone regeneration. Today the properties of the bioreactor for cell-seeded scaffold cultivation are the subject of intensive research. We used the mathematical modeling of rotational reactor and construct computational algorithm with the help of ANSYS software package to develop this new procedure. The solution obtained with the help of the constructed computational algorithm is in good agreement with the analytical solution of Couette for the task of two coaxial cylinders. The series of flow computations for different rotation frequencies (1, 0.75, 0.5, 0.33, 1.125 Hz) was performed for the laminar flow regime approximation with the help of computational algorithm. It was found that Taylor vortices appear in the annular gap between the cylinders in a simulated bioreactor. It was obtained that shear stress in the range of interest (0.002-0.1 Pa) arise on outer surface of inner cylinder when it rotates with the frequency not exceeding 0.8 Hz. So the constructed mathematical model and the created computational algorithm for calculating the flow parameters allow predicting the shear stress and pressure values depending on the rotation frequency and geometric parameters, as well as optimizing the operating mode of the bioreactor.
Improved teaching-learning-based and JAYA optimization algorithms for solving flexible flow shop scheduling problems

NASA Astrophysics Data System (ADS)

Buddala, Raviteja; Mahapatra, Siba Sankar

2017-11-01

Flexible flow shop (or a hybrid flow shop) scheduling problem is an extension of classical flow shop scheduling problem. In a simple flow shop configuration, a job having `g' operations is performed on `g' operation centres (stages) with each stage having only one machine. If any stage contains more than one machine for providing alternate processing facility, then the problem becomes a flexible flow shop problem (FFSP). FFSP which contains all the complexities involved in a simple flow shop and parallel machine scheduling problems is a well-known NP-hard (Non-deterministic polynomial time) problem. Owing to high computational complexity involved in solving these problems, it is not always possible to obtain an optimal solution in a reasonable computation time. To obtain near-optimal solutions in a reasonable computation time, a large variety of meta-heuristics have been proposed in the past. However, tuning algorithm-specific parameters for solving FFSP is rather tricky and time consuming. To address this limitation, teaching-learning-based optimization (TLBO) and JAYA algorithm are chosen for the study because these are not only recent meta-heuristics but they do not require tuning of algorithm-specific parameters. Although these algorithms seem to be elegant, they lose solution diversity after few iterations and get trapped at the local optima. To alleviate such drawback, a new local search procedure is proposed in this paper to improve the solution quality. Further, mutation strategy (inspired from genetic algorithm) is incorporated in the basic algorithm to maintain solution diversity in the population. Computational experiments have been conducted on standard benchmark problems to calculate makespan and computational time. It is found that the rate of convergence of TLBO is superior to JAYA. From the results, it is found that TLBO and JAYA outperform many algorithms reported in the literature and can be treated as efficient methods for solving the FFSP.
A time-efficient algorithm for implementing the Catmull-Clark subdivision method

NASA Astrophysics Data System (ADS)

Ioannou, G.; Savva, A.; Stylianou, V.

2015-10-01

Splines are the most popular methods in Figure Modeling and CAGD (Computer Aided Geometric Design) in generating smooth surfaces from a number of control points. The control points define the shape of a figure and splines calculate the required number of points which when displayed on a computer screen the result is a smooth surface. However, spline methods are based on a rectangular topological structure of points, i.e., a two-dimensional table of vertices, and thus cannot generate complex figures, such as the human and animal bodies that their complex structure does not allow them to be defined by a regular rectangular grid. On the other hand surface subdivision methods, which are derived by splines, generate surfaces which are defined by an arbitrary topology of control points. This is the reason that during the last fifteen years subdivision methods have taken the lead over regular spline methods in all areas of modeling in both industry and research. The cost of executing computer software developed to read control points and calculate the surface is run-time, due to the fact that the surface-structure required for handling arbitrary topological grids is very complicate. There are many software programs that have been developed related to the implementation of subdivision surfaces however, not many algorithms are documented in the literature, to support developers for writing efficient code. This paper aims to assist programmers by presenting a time-efficient algorithm for implementing subdivision splines. The Catmull-Clark which is the most popular of the subdivision methods has been employed to illustrate the algorithm.
Determining Hypocentral Parameters for Local Earthquakes in 1-D Using a Genetic Algorithm and Two-point ray tracing

NASA Astrophysics Data System (ADS)

Kim, W.; Hahm, I.; Ahn, S. J.; Lim, D. H.

2005-12-01

This paper introduces a powerful method for determining hypocentral parameters for local earthquakes in 1-D using a genetic algorithm (GA) and two-point ray tracing. Using existing algorithms to determine hypocentral parameters is difficult, because these parameters can vary based on initial velocity models. We developed a new method to solve this problem by applying a GA to an existing algorithm, HYPO-71 (Lee and Larh, 1975). The original HYPO-71 algorithm was modified by applying two-point ray tracing and a weighting factor with respect to the takeoff angle at the source to reduce errors from the ray path and hypocenter depth. Artificial data, without error, were generated by computer using two-point ray tracing in a true model, in which velocity structure and hypocentral parameters were known. The accuracy of the calculated results was easily determined by comparing calculated and actual values. We examined the accuracy of this method for several cases by changing the true and modeled layer numbers and thicknesses. The computational results show that this method determines nearly exact hypocentral parameters without depending on initial velocity models. Furthermore, accurate and nearly unique hypocentral parameters were obtained, although the number of modeled layers and thicknesses differed from those in the true model. Therefore, this method can be a useful tool for determining hypocentral parameters in regions where reliable local velocity values are unknown. This method also provides the basic a priori information for 3-D studies. KEY -WORDS: hypocentral parameters, genetic algorithm (GA), two-point ray tracing
Community detection using preference networks

NASA Astrophysics Data System (ADS)

Tasgin, Mursel; Bingol, Haluk O.

2018-04-01

Community detection is the task of identifying clusters or groups of nodes in a network where nodes within the same group are more connected with each other than with nodes in different groups. It has practical uses in identifying similar functions or roles of nodes in many biological, social and computer networks. With the availability of very large networks in recent years, performance and scalability of community detection algorithms become crucial, i.e. if time complexity of an algorithm is high, it cannot run on large networks. In this paper, we propose a new community detection algorithm, which has a local approach and is able to run on large networks. It has a simple and effective method; given a network, algorithm constructs a preference network of nodes where each node has a single outgoing edge showing its preferred node to be in the same community with. In such a preference network, each connected component is a community. Selection of the preferred node is performed using similarity based metrics of nodes. We use two alternatives for this purpose which can be calculated in 1-neighborhood of nodes, i.e. number of common neighbors of selector node and its neighbors and, the spread capability of neighbors around the selector node which is calculated by the gossip algorithm of Lind et.al. Our algorithm is tested on both computer generated LFR networks and real-life networks with ground-truth community structure. It can identify communities accurately in a fast way. It is local, scalable and suitable for distributed execution on large networks.
Tensor-decomposed vibrational coupled-cluster theory: Enabling large-scale, highly accurate vibrational-structure calculations

NASA Astrophysics Data System (ADS)

Madsen, Niels Kristian; Godtliebsen, Ian H.; Losilla, Sergio A.; Christiansen, Ove

2018-01-01

A new implementation of vibrational coupled-cluster (VCC) theory is presented, where all amplitude tensors are represented in the canonical polyadic (CP) format. The CP-VCC algorithm solves the non-linear VCC equations without ever constructing the amplitudes or error vectors in full dimension but still formally includes the full parameter space of the VCC[n] model in question resulting in the same vibrational energies as the conventional method. In a previous publication, we have described the non-linear-equation solver for CP-VCC calculations. In this work, we discuss the general algorithm for evaluating VCC error vectors in CP format including the rank-reduction methods used during the summation of the many terms in the VCC amplitude equations. Benchmark calculations for studying the computational scaling and memory usage of the CP-VCC algorithm are performed on a set of molecules including thiadiazole and an array of polycyclic aromatic hydrocarbons. The results show that the reduced scaling and memory requirements of the CP-VCC algorithm allows for performing high-order VCC calculations on systems with up to 66 vibrational modes (anthracene), which indeed are not possible using the conventional VCC method. This paves the way for obtaining highly accurate vibrational spectra and properties of larger molecules.
Fast Fourier Transform Spectral Analysis Program

NASA Technical Reports Server (NTRS)

Daniel, J. A., Jr.; Graves, M. L.; Hovey, N. M.

1969-01-01

Fast Fourier Transform Spectral Analysis Program is used in frequency spectrum analysis of postflight, space vehicle telemetered trajectory data. This computer program with a digital algorithm can calculate power spectrum rms amplitudes and cross spectrum of sampled parameters at even time increments.
Dosimetric verification of radiotherapy treatment planning systems in Serbia: national audit

PubMed Central

2012-01-01

Background Independent external audits play an important role in quality assurance programme in radiation oncology. The audit supported by the IAEA in Serbia was designed to review the whole chain of activities in 3D conformal radiotherapy (3D-CRT) workflow, from patient data acquisition to treatment planning and dose delivery. The audit was based on the IAEA recommendations and focused on dosimetry part of the treatment planning and delivery processes. Methods The audit was conducted in three radiotherapy departments of Serbia. An anthropomorphic phantom was scanned with a computed tomography unit (CT) and treatment plans for eight different test cases involving various beam configurations suggested by the IAEA were prepared on local treatment planning systems (TPSs). The phantom was irradiated following the treatment plans for these test cases and doses in specific points were measured with an ionization chamber. The differences between the measured and calculated doses were reported. Results The measurements were conducted for different photon beam energies and TPS calculation algorithms. The deviation between the measured and calculated values for all test cases made with advanced algorithms were within the agreement criteria, while the larger deviations were observed for simpler algorithms. The number of measurements with results outside the agreement criteria increased with the increase of the beam energy and decreased with TPS calculation algorithm sophistication. Also, a few errors in the basic dosimetry data in TPS were detected and corrected. Conclusions The audit helped the users to better understand the operational features and limitations of their TPSs and resulted in increased confidence in dose calculation accuracy using TPSs. The audit results indicated the shortcomings of simpler algorithms for the test cases performed and, therefore the transition to more advanced algorithms is highly desirable. PMID:22971539
Dosimetric verification of radiotherapy treatment planning systems in Serbia: national audit.

PubMed

Rutonjski, Laza; Petrović, Borislava; Baucal, Milutin; Teodorović, Milan; Cudić, Ozren; Gershkevitsh, Eduard; Izewska, Joanna

2012-09-12

Independent external audits play an important role in quality assurance programme in radiation oncology. The audit supported by the IAEA in Serbia was designed to review the whole chain of activities in 3D conformal radiotherapy (3D-CRT) workflow, from patient data acquisition to treatment planning and dose delivery. The audit was based on the IAEA recommendations and focused on dosimetry part of the treatment planning and delivery processes. The audit was conducted in three radiotherapy departments of Serbia. An anthropomorphic phantom was scanned with a computed tomography unit (CT) and treatment plans for eight different test cases involving various beam configurations suggested by the IAEA were prepared on local treatment planning systems (TPSs). The phantom was irradiated following the treatment plans for these test cases and doses in specific points were measured with an ionization chamber. The differences between the measured and calculated doses were reported. The measurements were conducted for different photon beam energies and TPS calculation algorithms. The deviation between the measured and calculated values for all test cases made with advanced algorithms were within the agreement criteria, while the larger deviations were observed for simpler algorithms. The number of measurements with results outside the agreement criteria increased with the increase of the beam energy and decreased with TPS calculation algorithm sophistication. Also, a few errors in the basic dosimetry data in TPS were detected and corrected. The audit helped the users to better understand the operational features and limitations of their TPSs and resulted in increased confidence in dose calculation accuracy using TPSs. The audit results indicated the shortcomings of simpler algorithms for the test cases performed and, therefore the transition to more advanced algorithms is highly desirable.
Gaussian-Beam Laser-Resonator Program

NASA Technical Reports Server (NTRS)

Cross, Patricia L.; Bair, Clayton H.; Barnes, Norman

1989-01-01

Gaussian Beam Laser Resonator Program models laser resonators by use of Gaussian-beam-propagation techniques. Used to determine radii of beams as functions of position in laser resonators. Algorithm used in program has three major components. First, ray-transfer matrix for laser resonator must be calculated. Next, initial parameters of beam calculated. Finally, propagation of beam through optical elements computed. Written in Microsoft FORTRAN (Version 4.01).
Multi-Objective UAV Mission Planning Using Evolutionary Computation

DTIC Science & Technology

2008-03-01

on a Solution Space. . . . . . . . . . . . . . . . . . . . 41 4.3. Crowding distance calculation. Dark points are non-dominated solutions. [14...SPEA2 was devel- oped by Zitzler [64] as an improvement to the original SPEA algorithm [65]. SPEA2 Figure 4.3: Crowding distance calculation. Dark ...thesis, Los Angeles, CA, USA, 2003. Adviser-Maja J. Mataric . 114 21. Homberger, Joerg and Hermann Gehring. “Two Evolutionary Metaheuristics for the
Projected role of advanced computational aerodynamic methods at the Lockheed-Georgia company

NASA Technical Reports Server (NTRS)

Lores, M. E.

1978-01-01

Experience with advanced computational methods being used at the Lockheed-Georgia Company to aid in the evaluation and design of new and modified aircraft indicates that large and specialized computers will be needed to make advanced three-dimensional viscous aerodynamic computations practical. The Numerical Aerodynamic Simulation Facility should be used to provide a tool for designing better aerospace vehicles while at the same time reducing development costs by performing computations using Navier-Stokes equations solution algorithms and permitting less sophisticated but nevertheless complex calculations to be made efficiently. Configuration definition procedures and data output formats can probably best be defined in cooperation with industry, therefore, the computer should handle many remote terminals efficiently. The capability of transferring data to and from other computers needs to be provided. Because of the significant amount of input and output associated with 3-D viscous flow calculations and because of the exceedingly fast computation speed envisioned for the computer, special attention should be paid to providing rapid, diversified, and efficient input and output.
A Microsoft Excel® 2010 Based Tool for Calculating Interobserver Agreement

PubMed Central

Azulay, Richard L

2011-01-01

This technical report provides detailed information on the rationale for using a common computer spreadsheet program (Microsoft Excel®) to calculate various forms of interobserver agreement for both continuous and discontinuous data sets. In addition, we provide a brief tutorial on how to use an Excel spreadsheet to automatically compute traditional total count, partial agreement-within-intervals, exact agreement, trial-by-trial, interval-by-interval, scored-interval, unscored-interval, total duration, and mean duration-per-interval interobserver agreement algorithms. We conclude with a discussion of how practitioners may integrate this tool into their clinical work. PMID:22649578

A microsoft excel(®) 2010 based tool for calculating interobserver agreement.

PubMed

Reed, Derek D; Azulay, Richard L

2011-01-01

This technical report provides detailed information on the rationale for using a common computer spreadsheet program (Microsoft Excel(®)) to calculate various forms of interobserver agreement for both continuous and discontinuous data sets. In addition, we provide a brief tutorial on how to use an Excel spreadsheet to automatically compute traditional total count, partial agreement-within-intervals, exact agreement, trial-by-trial, interval-by-interval, scored-interval, unscored-interval, total duration, and mean duration-per-interval interobserver agreement algorithms. We conclude with a discussion of how practitioners may integrate this tool into their clinical work.
Measurement of the center edge angle and determination of the Severin classification using digital radiography, computer-assisted measurement tools, and a Severin algorithm: intraobserver and interobserver reliability revisited.

PubMed

Carroll, Kristen L; Murray, Kathleen A; MacLeod, Lynne M; Hennessey, Theresa A; Woiczik, Marcella R; Roach, James W

2011-06-01

Numerous studies underscore the poor intraobserver and interobserver reliability of both the center edge angle (CEA) and the Severin classification using plain film measurements. In this study, experienced observers applied a computer-assisted measurement program to determine the CEA in digital pelvic radiographs of adults who had been previously treated for dysplasia of the hip (DDH). Using a teaching aid/algorithm of the Severin classification, the observers then assigned a Severin rating to these hips. Intraobserver and interobserver errors were then calculated on both the CEA measurements and the Severin classifications. Four pediatric orthopaedic surgeons and 1 pediatric radiologist calculated the CEAs using the OrthoView TM planning system and then determined the Severin classification on 41 blinded digital pelvic radiographs. The radiographs were evaluated by each examiner twice, with evaluations separated by 2 months. All examiners reviewed a Severin classification algorithm before making their Severin assignments. The intraobserver and interobserver reliability for both the CEA and the Severin classification were calculated using the interclass correlation coefficients and Cohen and Fleiss κ scores, respectively. The intraobserver and interobserver reliability for CEA measurement was moderate to almost perfect. When we separated the Severin classification into 3 clinically relevant groups of good (Severin I and II), dysplastic (Severin III), and poor (Severin IV and above), our interobserver reliability neared almost perfect. The Severin classification is an extremely useful and oft-used radiographic measure for the success of DDH treatment. Our research found digital radiography, computer-aided measurement tools, the use of a Severin algorithm, and separating the Severin classification into 3 clinically relevant groups significantly increased the intraobserver and interobserver reliability of both the CEA and Severin classification. This finding will assist future studies using the CEA and Severin classification in the radiographic assessment of DDH treatment outcomes.
Automatic estimation of detector radial position for contoured SPECT acquisition using CT images on a SPECT/CT system.

PubMed

Liu, Ruijie Rachel; Erwin, William D

2006-08-01

An algorithm was developed to estimate noncircular orbit (NCO) single-photon emission computed tomography (SPECT) detector radius on a SPECT/CT imaging system using the CT images, for incorporation into collimator resolution modeling for iterative SPECT reconstruction. Simulated male abdominal (arms up), male head and neck (arms down) and female chest (arms down) anthropomorphic phantom, and ten patient, medium-energy SPECT/CT scans were acquired on a hybrid imaging system. The algorithm simulated inward SPECT detector radial motion and object contour detection at each projection angle, employing the calculated average CT image and a fixed Hounsfield unit (HU) threshold. Calculated radii were compared to the observed true radii, and optimal CT threshold values, corresponding to patient bed and clothing surfaces, were found to be between -970 and -950 HU. The algorithm was constrained by the 45 cm CT field-of-view (FOV), which limited the detected radii to < or = 22.5 cm and led to occasional radius underestimation in the case of object truncation by CT. Two methods incorporating the algorithm were implemented: physical model (PM) and best fit (BF). The PM method computed an offset that produced maximum overlap of calculated and true radii for the phantom scans, and applied that offset as a calculated-to-true radius transformation. For the BF method, the calculated-to-true radius transformation was based upon a linear regression between calculated and true radii. For the PM method, a fixed offset of +2.75 cm provided maximum calculated-to-true radius overlap for the phantom study, which accounted for the camera system's object contour detect sensor surface-to-detector face distance. For the BF method, a linear regression of true versus calculated radius from a reference patient scan was used as a calculated-to-true radius transform. Both methods were applied to ten patient scans. For -970 and -950 HU thresholds, the combined overall average root-mean-square (rms) error in radial position for eight patient scans without truncation were 3.37 cm (12.9%) for PM and 1.99 cm (8.6%) for BF, indicating BF is superior to PM in the absence of truncation. For two patient scans with truncation, the rms error was 3.24 cm (12.2%) for PM and 4.10 cm (18.2%) for BF. The slightly better performance of PM in the case of truncation is anomalous, due to FOV edge truncation artifacts in the CT reconstruction, and thus is suspect. The calculated NCO contour for a patient SPECT/CT scan was used with an iterative reconstruction algorithm that incorporated compensation for system resolution. The resulting image was qualitatively superior to the image obtained by reconstructing the data using the fixed radius stored by the scanner. The result was also superior to the image reconstructed using the iterative algorithm provided with the system, which does not incorporate resolution modeling. These results suggest that, under conditions of no or only mild lateral truncation of the CT scan, the algorithm is capable of providing radius estimates suitable for iterative SPECT reconstruction collimator geometric resolution modeling.
The energy-dependent electron loss model: backscattering and application to heterogeneous slab media.

PubMed

Lee, Tae Kyu; Sandison, George A

2003-01-21

Electron backscattering has been incorporated into the energy-dependent electron loss (EL) model and the resulting algorithm is applied to predict dose deposition in slab heterogeneous media. This algorithm utilizes a reflection coefficient from the interface that is computed on the basis of Goudsmit-Saunderson theory and an average energy for the backscattered electrons based on Everhart's theory. Predictions of dose deposition in slab heterogeneous media are compared to the Monte Carlo based dose planning method (DPM) and a numerical discrete ordinates method (DOM). The slab media studied comprised water/Pb, water/Al, water/bone, water/bone/water, and water/lung/water, and incident electron beam energies of 10 MeV and 18 MeV. The predicted dose enhancement due to backscattering is accurate to within 3% of dose maximum even for lead as the backscattering medium. Dose discrepancies at large depths beyond the interface were as high as 5% of dose maximum and we speculate that this error may be attributed to the EL model assuming a Gaussian energy distribution for the electrons at depth. The computational cost is low compared to Monte Carlo simulations making the EL model attractive as a fast dose engine for dose optimization algorithms. The predictive power of the algorithm demonstrates that the small angle scattering restriction on the EL model can be overcome while retaining dose calculation accuracy and requiring only one free variable, chi, in the algorithm to be determined in advance of calculation.
The energy-dependent electron loss model: backscattering and application to heterogeneous slab media

NASA Astrophysics Data System (ADS)

Lee, Tae Kyu; Sandison, George A.

2003-01-01

Electron backscattering has been incorporated into the energy-dependent electron loss (EL) model and the resulting algorithm is applied to predict dose deposition in slab heterogeneous media. This algorithm utilizes a reflection coefficient from the interface that is computed on the basis of Goudsmit-Saunderson theory and an average energy for the backscattered electrons based on Everhart's theory. Predictions of dose deposition in slab heterogeneous media are compared to the Monte Carlo based dose planning method (DPM) and a numerical discrete ordinates method (DOM). The slab media studied comprised water/Pb, water/Al, water/bone, water/bone/water, and water/lung/water, and incident electron beam energies of 10 MeV and 18 MeV. The predicted dose enhancement due to backscattering is accurate to within 3% of dose maximum even for lead as the backscattering medium. Dose discrepancies at large depths beyond the interface were as high as 5% of dose maximum and we speculate that this error may be attributed to the EL model assuming a Gaussian energy distribution for the electrons at depth. The computational cost is low compared to Monte Carlo simulations making the EL model attractive as a fast dose engine for dose optimization algorithms. The predictive power of the algorithm demonstrates that the small angle scattering restriction on the EL model can be overcome while retaining dose calculation accuracy and requiring only one free variable, χ, in the algorithm to be determined in advance of calculation.
Computation of the acoustic radiation force using the finite-difference time-domain method.

PubMed

Cai, Feiyan; Meng, Long; Jiang, Chunxiang; Pan, Yu; Zheng, Hairong

2010-10-01

The computational details related to calculating the acoustic radiation force on an object using a 2-D grid finite-difference time-domain method (FDTD) are presented. The method is based on propagating the stress and velocity fields through the grid and determining the energy flow with and without the object. The axial and radial acoustic radiation forces predicted by FDTD method are in excellent agreement with the results obtained by analytical evaluation of the scattering method. In particular, the results indicate that it is possible to trap the steel cylinder in the radial direction by optimizing the width of Gaussian source and the operation frequency. As the sizes of the relating objects are smaller than or comparable to wavelength, the algorithm presented here can be easily extended to 3-D and include torque computation algorithms, thus providing a highly flexible and universally usable computation engine.
Preliminary study of the use of the STAR-100 computer for transonic flow calculations

NASA Technical Reports Server (NTRS)

Keller, J. D.; Jameson, A.

1977-01-01

An explicit method for solving the transonic small-disturbance potential equation is presented. This algorithm, which is suitable for the new vector-processor computers such as the CDC STAR-100, is compared to successive line over-relaxation (SLOR) on a simple test problem. The convergence rate of the explicit scheme is slower than that of SLOR, however, the efficiency of the explicit scheme on the STAR-100 computer is sufficient to overcome the slower convergence rate and allow an overall speedup compared to SLOR on the CYBER 175 computer.
Computing NLTE Opacities -- Node Level Parallel Calculation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Holladay, Daniel

Presentation. The goal: to produce a robust library capable of computing reasonably accurate opacities inline with the assumption of LTE relaxed (non-LTE). Near term: demonstrate acceleration of non-LTE opacity computation. Far term (if funded): connect to application codes with in-line capability and compute opacities. Study science problems. Use efficient algorithms that expose many levels of parallelism and utilize good memory access patterns for use on advanced architectures. Portability to multiple types of hardware including multicore processors, manycore processors such as KNL, GPUs, etc. Easily coupled to radiation hydrodynamics and thermal radiative transfer codes.
Additional development of the XTRAN3S computer program

NASA Technical Reports Server (NTRS)

Borland, C. J.

1989-01-01

Additional developments and enhancements to the XTRAN3S computer program, a code for calculation of steady and unsteady aerodynamics, and associated aeroelastic solutions, for 3-D wings in the transonic flow regime are described. Algorithm improvements for the XTRAN3S program were provided including an implicit finite difference scheme to enhance the allowable time step and vectorization for improved computational efficiency. The code was modified to treat configurations with a fuselage, multiple stores/nacelles/pylons, and winglets. Computer program changes (updates) for error corrections and updates for version control are provided.
A simplified approach to characterizing a kilovoltage source spectrum for accurate dose computation.

PubMed

Poirier, Yannick; Kouznetsov, Alexei; Tambasco, Mauro

2012-06-01

To investigate and validate the clinical feasibility of using half-value layer (HVL) and peak tube potential (kVp) for characterizing a kilovoltage (kV) source spectrum for the purpose of computing kV x-ray dose accrued from imaging procedures. To use this approach to characterize a Varian® On-Board Imager® (OBI) source and perform experimental validation of a novel in-house hybrid dose computation algorithm for kV x-rays. We characterized the spectrum of an imaging kV x-ray source using the HVL and the kVp as the sole beam quality identifiers using third-party freeware Spektr to generate the spectra. We studied the sensitivity of our dose computation algorithm to uncertainties in the beam's HVL and kVp by systematically varying these spectral parameters. To validate our approach experimentally, we characterized the spectrum of a Varian® OBI system by measuring the HVL using a Farmer-type Capintec ion chamber (0.06 cc) in air and compared dose calculations using our computationally validated in-house kV dose calculation code to measured percent depth-dose and transverse dose profiles for 80, 100, and 125 kVp open beams in a homogeneous phantom and a heterogeneous phantom comprising tissue, lung, and bone equivalent materials. The sensitivity analysis of the beam quality parameters (i.e., HVL, kVp, and field size) on dose computation accuracy shows that typical measurement uncertainties in the HVL and kVp (±0.2 mm Al and ±2 kVp, respectively) source characterization parameters lead to dose computation errors of less than 2%. Furthermore, for an open beam with no added filtration, HVL variations affect dose computation accuracy by less than 1% for a 125 kVp beam when field size is varied from 5 × 5 cm(2) to 40 × 40 cm(2). The central axis depth dose calculations and experimental measurements for the 80, 100, and 125 kVp energies agreed within 2% for the homogeneous and heterogeneous block phantoms, and agreement for the transverse dose profiles was within 6%. The HVL and kVp are sufficient for characterizing a kV x-ray source spectrum for accurate dose computation. As these parameters can be easily and accurately measured, they provide for a clinically feasible approach to characterizing a kV energy spectrum to be used for patient specific x-ray dose computations. Furthermore, these results provide experimental validation of our novel hybrid dose computation algorithm. © 2012 American Association of Physicists in Medicine.
Accelerating molecular property calculations with nonorthonormal Krylov space methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Furche, Filipp; Krull, Brandon T.; Nguyen, Brian D.

Here, we formulate Krylov space methods for large eigenvalue problems and linear equation systems that take advantage of decreasing residual norms to reduce the cost of matrix-vector multiplication. The residuals are used as subspace basis without prior orthonormalization, which leads to generalized eigenvalue problems or linear equation systems on the Krylov space. These nonorthonormal Krylov space (nKs) algorithms are favorable for large matrices with irregular sparsity patterns whose elements are computed on the fly, because fewer operations are necessary as the residual norm decreases as compared to the conventional method, while errors in the desired eigenpairs and solution vectors remainmore » small. We consider real symmetric and symplectic eigenvalue problems as well as linear equation systems and Sylvester equations as they appear in configuration interaction and response theory. The nKs method can be implemented in existing electronic structure codes with minor modifications and yields speed-ups of 1.2-1.8 in typical time-dependent Hartree-Fock and density functional applications without accuracy loss. The algorithm can compute entire linear subspaces simultaneously which benefits electronic spectra and force constant calculations requiring many eigenpairs or solution vectors. The nKs approach is related to difference density methods in electronic ground state calculations, and particularly efficient for integral direct computations of exchange-type contractions. By combination with resolution-of-the-identity methods for Coulomb contractions, three- to fivefold speed-ups of hybrid time-dependent density functional excited state and response calculations are achieved.« less
Accelerating molecular property calculations with nonorthonormal Krylov space methods

DOE PAGES

Furche, Filipp; Krull, Brandon T.; Nguyen, Brian D.; ...

2016-05-03

Here, we formulate Krylov space methods for large eigenvalue problems and linear equation systems that take advantage of decreasing residual norms to reduce the cost of matrix-vector multiplication. The residuals are used as subspace basis without prior orthonormalization, which leads to generalized eigenvalue problems or linear equation systems on the Krylov space. These nonorthonormal Krylov space (nKs) algorithms are favorable for large matrices with irregular sparsity patterns whose elements are computed on the fly, because fewer operations are necessary as the residual norm decreases as compared to the conventional method, while errors in the desired eigenpairs and solution vectors remainmore » small. We consider real symmetric and symplectic eigenvalue problems as well as linear equation systems and Sylvester equations as they appear in configuration interaction and response theory. The nKs method can be implemented in existing electronic structure codes with minor modifications and yields speed-ups of 1.2-1.8 in typical time-dependent Hartree-Fock and density functional applications without accuracy loss. The algorithm can compute entire linear subspaces simultaneously which benefits electronic spectra and force constant calculations requiring many eigenpairs or solution vectors. The nKs approach is related to difference density methods in electronic ground state calculations, and particularly efficient for integral direct computations of exchange-type contractions. By combination with resolution-of-the-identity methods for Coulomb contractions, three- to fivefold speed-ups of hybrid time-dependent density functional excited state and response calculations are achieved.« less
Sample entropy applied to the analysis of synthetic time series and tachograms

NASA Astrophysics Data System (ADS)

Muñoz-Diosdado, A.; Gálvez-Coyt, G. G.; Solís-Montufar, E.

2017-01-01

Entropy is a method of non-linear analysis that allows an estimate of the irregularity of a system, however, there are different types of computational entropy that were considered and tested in order to obtain one that would give an index of signals complexity taking into account the data number of the analysed time series, the computational resources demanded by the method, and the accuracy of the calculation. An algorithm for the generation of fractal time-series with a certain value of β was used for the characterization of the different entropy algorithms. We obtained a significant variation for most of the algorithms in terms of the series size, which could result counterproductive for the study of real signals of different lengths. The chosen method was sample entropy, which shows great independence of the series size. With this method, time series of heart interbeat intervals or tachograms of healthy subjects and patients with congestive heart failure were analysed. The calculation of sample entropy was carried out for 24-hour tachograms and time subseries of 6-hours for sleepiness and wakefulness. The comparison between the two populations shows a significant difference that is accentuated when the patient is sleeping.
Fast 3D dosimetric verifications based on an electronic portal imaging device using a GPU calculation engine.

PubMed

Zhu, Jinhan; Chen, Lixin; Chen, Along; Luo, Guangwen; Deng, Xiaowu; Liu, Xiaowei

2015-04-11

To use a graphic processing unit (GPU) calculation engine to implement a fast 3D pre-treatment dosimetric verification procedure based on an electronic portal imaging device (EPID). The GPU algorithm includes the deconvolution and convolution method for the fluence-map calculations, the collapsed-cone convolution/superposition (CCCS) algorithm for the 3D dose calculations and the 3D gamma evaluation calculations. The results of the GPU-based CCCS algorithm were compared to those of Monte Carlo simulations. The planned and EPID-based reconstructed dose distributions in overridden-to-water phantoms and the original patients were compared for 6 MV and 10 MV photon beams in intensity-modulated radiation therapy (IMRT) treatment plans based on dose differences and gamma analysis. The total single-field dose computation time was less than 8 s, and the gamma evaluation for a 0.1-cm grid resolution was completed in approximately 1 s. The results of the GPU-based CCCS algorithm exhibited good agreement with those of the Monte Carlo simulations. The gamma analysis indicated good agreement between the planned and reconstructed dose distributions for the treatment plans. For the target volume, the differences in the mean dose were less than 1.8%, and the differences in the maximum dose were less than 2.5%. For the critical organs, minor differences were observed between the reconstructed and planned doses. The GPU calculation engine was used to boost the speed of 3D dose and gamma evaluation calculations, thus offering the possibility of true real-time 3D dosimetric verification.
On-the-fly Numerical Surface Integration for Finite-Difference Poisson-Boltzmann Methods.

PubMed

Cai, Qin; Ye, Xiang; Wang, Jun; Luo, Ray

2011-11-01

Most implicit solvation models require the definition of a molecular surface as the interface that separates the solute in atomic detail from the solvent approximated as a continuous medium. Commonly used surface definitions include the solvent accessible surface (SAS), the solvent excluded surface (SES), and the van der Waals surface. In this study, we present an efficient numerical algorithm to compute the SES and SAS areas to facilitate the applications of finite-difference Poisson-Boltzmann methods in biomolecular simulations. Different from previous numerical approaches, our algorithm is physics-inspired and intimately coupled to the finite-difference Poisson-Boltzmann methods to fully take advantage of its existing data structures. Our analysis shows that the algorithm can achieve very good agreement with the analytical method in the calculation of the SES and SAS areas. Specifically, in our comprehensive test of 1,555 molecules, the average unsigned relative error is 0.27% in the SES area calculations and 1.05% in the SAS area calculations at the grid spacing of 1/2Å. In addition, a systematic correction analysis can be used to improve the accuracy for the coarse-grid SES area calculations, with the average unsigned relative error in the SES areas reduced to 0.13%. These validation studies indicate that the proposed algorithm can be applied to biomolecules over a broad range of sizes and structures. Finally, the numerical algorithm can also be adapted to evaluate the surface integral of either a vector field or a scalar field defined on the molecular surface for additional solvation energetics and force calculations.
SMV⊥: Simplex of maximal volume based upon the Gram-Schmidt process

NASA Astrophysics Data System (ADS)

Salazar-Vazquez, Jairo; Mendez-Vazquez, Andres

2015-10-01

In recent years, different algorithms for Hyperspectral Image (HI) analysis have been introduced. The high spectral resolution of these images allows to develop different algorithms for target detection, material mapping, and material identification for applications in Agriculture, Security and Defense, Industry, etc. Therefore, from the computer science's point of view, there is fertile field of research for improving and developing algorithms in HI analysis. In some applications, the spectral pixels of a HI can be classified using laboratory spectral signatures. Nevertheless, for many others, there is no enough available prior information or spectral signatures, making any analysis a difficult task. One of the most popular algorithms for the HI analysis is the N-FINDR because it is easy to understand and provides a way to unmix the original HI in the respective material compositions. The N-FINDR is computationally expensive and its performance depends on a random initialization process. This paper proposes a novel idea to reduce the complexity of the N-FINDR by implementing a bottom-up approach based in an observation from linear algebra and the use of the Gram-Schmidt process. Therefore, the Simplex of Maximal Volume Perpendicular (SMV⊥) algorithm is proposed for fast endmember extraction in hyperspectral imagery. This novel algorithm has complexity O(n) with respect to the number of pixels. In addition, the evidence shows that SMV⊥ calculates a bigger volume, and has lower computational time complexity than other poular algorithms on synthetic and real scenarios.
A mathematical model for the stressed state analysis of cylindrical laminated-composite pressure vessels

NASA Astrophysics Data System (ADS)

Bak, Roman; Matyja, Tomasz

An algorithm and a computer program have been developed for calculating the strength of pressure vessels made of laminated composites. Numerical results for pressure vessels of Kevlar 49 laminates are compared with experimental data in the literature.
SU-F-T-600: Influence of Acuros XB and AAA Dose Calculation Algorithms On Plan Quality Metrics and Normal Lung Doses in Lung SBRT

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yaparpalvi, R; Mynampati, D; Kuo, H

Purpose: To study the influence of superposition-beam model (AAA) and determinant-photon transport-solver (Acuros XB) dose calculation algorithms on the treatment plan quality metrics and on normal lung dose in Lung SBRT. Methods: Treatment plans of 10 Lung SBRT patients were randomly selected. Patients were prescribed to a total dose of 50-54Gy in 3–5 fractions (10?5 or 18?3). Doses were optimized accomplished with 6-MV using 2-arcs (VMAT). Doses were calculated using AAA algorithm with heterogeneity correction. For each plan, plan quality metrics in the categories- coverage, homogeneity, conformity and gradient were quantified. Repeat dosimetry for these AAA treatment plans was performedmore » using AXB algorithm with heterogeneity correction for same beam and MU parameters. Plan quality metrics were again evaluated and compared with AAA plan metrics. For normal lung dose, V{sub 20} and V{sub 5} to (Total lung- GTV) were evaluated. Results: The results are summarized in Supplemental Table 1. PTV volume was mean 11.4 (±3.3) cm{sup 3}. Comparing RTOG 0813 protocol criteria for conformality, AXB plans yielded on average, similar PITV ratio (individual PITV ratio differences varied from −9 to +15%), reduced target coverage (−1.6%) and increased R50% (+2.6%). Comparing normal lung doses, the lung V{sub 20} (+3.1%) and V{sub 5} (+1.5%) were slightly higher for AXB plans compared to AAA plans. High-dose spillage ((V105%PD - PTV)/ PTV) was slightly lower for AXB plans but the % low dose spillage (D2cm) was similar between the two calculation algorithms. Conclusion: AAA algorithm overestimates lung target dose. Routinely adapting to AXB for dose calculations in Lung SBRT planning may improve dose calculation accuracy, as AXB based calculations have been shown to be closer to Monte Carlo based dose predictions in accuracy and with relatively faster computational time. For clinical practice, revisiting dose-fractionation in Lung SBRT to correct for dose overestimates attributable to algorithm may very well be warranted.« less
Golden Ratio Genetic Algorithm Based Approach for Modelling and Analysis of the Capacity Expansion of Urban Road Traffic Network

PubMed Central

Zhang, Lun; Zhang, Meng; Yang, Wenchen; Dong, Decun

2015-01-01

This paper presents the modelling and analysis of the capacity expansion of urban road traffic network (ICURTN). Thebilevel programming model is first employed to model the ICURTN, in which the utility of the entire network is maximized with the optimal utility of travelers' route choice. Then, an improved hybrid genetic algorithm integrated with golden ratio (HGAGR) is developed to enhance the local search of simple genetic algorithms, and the proposed capacity expansion model is solved by the combination of the HGAGR and the Frank-Wolfe algorithm. Taking the traditional one-way network and bidirectional network as the study case, three numerical calculations are conducted to validate the presented model and algorithm, and the primary influencing factors on extended capacity model are analyzed. The calculation results indicate that capacity expansion of road network is an effective measure to enlarge the capacity of urban road network, especially on the condition of limited construction budget; the average computation time of the HGAGR is 122 seconds, which meets the real-time demand in the evaluation of the road network capacity. PMID:25802512
Holographic near-eye display system based on double-convergence light Gerchberg-Saxton algorithm.

PubMed

Sun, Peng; Chang, Shengqian; Liu, Siqi; Tao, Xiao; Wang, Chang; Zheng, Zhenrong

2018-04-16

In this paper, a method is proposed to implement noises reduced three-dimensional (3D) holographic near-eye display by phase-only computer-generated hologram (CGH). The CGH is calculated from a double-convergence light Gerchberg-Saxton (GS) algorithm, in which the phases of two virtual convergence lights are introduced into GS algorithm simultaneously. The first phase of convergence light is a replacement of random phase as the iterative initial value and the second phase of convergence light will modulate the phase distribution calculated by GS algorithm. Both simulations and experiments are carried out to verify the feasibility of the proposed method. The results indicate that this method can effectively reduce the noises in the reconstruction. Field of view (FOV) of the reconstructed image reaches 40 degrees and experimental light path in the 4-f system is shortened. As for 3D experiments, the results demonstrate that the proposed algorithm can present 3D images with 180cm zooming range and continuous depth cues. This method may provide a promising solution in future 3D augmented reality (AR) realization.

Analytical Algorithms to Quantify the Uncertainty in Remaining Useful Life Prediction

NASA Technical Reports Server (NTRS)

Sankararaman, Shankar; Saxena, Abhinav; Daigle, Matthew; Goebel, Kai

2013-01-01

This paper investigates the use of analytical algorithms to quantify the uncertainty in the remaining useful life (RUL) estimate of components used in aerospace applications. The prediction of RUL is affected by several sources of uncertainty and it is important to systematically quantify their combined effect by computing the uncertainty in the RUL prediction in order to aid risk assessment, risk mitigation, and decisionmaking. While sampling-based algorithms have been conventionally used for quantifying the uncertainty in RUL, analytical algorithms are computationally cheaper and sometimes, are better suited for online decision-making. While exact analytical algorithms are available only for certain special cases (for e.g., linear models with Gaussian variables), effective approximations can be made using the the first-order second moment method (FOSM), the first-order reliability method (FORM), and the inverse first-order reliability method (Inverse FORM). These methods can be used not only to calculate the entire probability distribution of RUL but also to obtain probability bounds on RUL. This paper explains these three methods in detail and illustrates them using the state-space model of a lithium-ion battery.
Load balancing prediction method of cloud storage based on analytic hierarchy process and hybrid hierarchical genetic algorithm.

PubMed

Zhou, Xiuze; Lin, Fan; Yang, Lvqing; Nie, Jing; Tan, Qian; Zeng, Wenhua; Zhang, Nian

2016-01-01

With the continuous expansion of the cloud computing platform scale and rapid growth of users and applications, how to efficiently use system resources to improve the overall performance of cloud computing has become a crucial issue. To address this issue, this paper proposes a method that uses an analytic hierarchy process group decision (AHPGD) to evaluate the load state of server nodes. Training was carried out by using a hybrid hierarchical genetic algorithm (HHGA) for optimizing a radial basis function neural network (RBFNN). The AHPGD makes the aggregative indicator of virtual machines in cloud, and become input parameters of predicted RBFNN. Also, this paper proposes a new dynamic load balancing scheduling algorithm combined with a weighted round-robin algorithm, which uses the predictive periodical load value of nodes based on AHPPGD and RBFNN optimized by HHGA, then calculates the corresponding weight values of nodes and makes constant updates. Meanwhile, it keeps the advantages and avoids the shortcomings of static weighted round-robin algorithm.
Computing interior eigenvalues of nonsymmetric matrices: application to three-dimensional metamaterial composites.

PubMed

Terao, Takamichi

2010-08-01

We propose a numerical method to calculate interior eigenvalues and corresponding eigenvectors for nonsymmetric matrices. Based on the subspace projection technique onto expanded Ritz subspace, it becomes possible to obtain eigenvalues and eigenvectors with sufficiently high precision. This method overcomes the difficulties of the traditional nonsymmetric Lanczos algorithm, and improves the accuracy of the obtained interior eigenvalues and eigenvectors. Using this algorithm, we investigate three-dimensional metamaterial composites consisting of positive and negative refractive index materials, and it is demonstrated that the finite-difference frequency-domain algorithm is applicable to analyze these metamaterial composites.
Algorithm For Hypersonic Flow In Chemical Equilibrium

NASA Technical Reports Server (NTRS)

Palmer, Grant

1989-01-01

Implicit, finite-difference, shock-capturing algorithm calculates inviscid, hypersonic flows in chemical equilibrium. Implicit formulation chosen because overcomes limitation on mathematical stability encountered in explicit formulations. For dynamical portion of problem, Euler equations written in conservation-law form in Cartesian coordinate system for two-dimensional or axisymmetric flow. For chemical portion of problem, equilibrium state of gas at each point in computational grid determined by minimizing local Gibbs free energy, subject to local conservation of molecules, atoms, ions, and total enthalpy. Major advantage: resulting algorithm naturally stable and captures strong shocks without help of artificial-dissipation terms to damp out spurious numerical oscillations.
Visualizing staggered fields and analyzing electromagnetic data with PerceptEM

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shasharina, Svetlana

This project resulted in VSimSP: a software for simulating large photonic devices of high-performance computers. It includes: GUI for Photonics Simulations; High-Performance Meshing Algorithm; 2d Order Multimaterials Algorithm; Mode Solver for Waveguides; 2d Order Material Dispersion Algorithm; S Parameters Calculation; High-Performance Workflow at NERSC ; and Large Photonic Devices Simulation Setups We believe we became the only company in the world which can simulate large photonics devices in 3D on modern supercomputers without the need to split them into subparts or do low-fidelity modeling. We started commercial engagement with a manufacturing company.
An open source software for fast grid-based data-mining in spatial epidemiology (FGBASE).

PubMed

Baker, David M; Valleron, Alain-Jacques

2014-10-30

Examining whether disease cases are clustered in space is an important part of epidemiological research. Another important part of spatial epidemiology is testing whether patients suffering from a disease are more, or less, exposed to environmental factors of interest than adequately defined controls. Both approaches involve determining the number of cases and controls (or population at risk) in specific zones. For cluster searches, this often must be done for millions of different zones. Doing this by calculating distances can lead to very lengthy computations. In this work we discuss the computational advantages of geographical grid-based methods, and introduce an open source software (FGBASE) which we have created for this purpose. Geographical grids based on the Lambert Azimuthal Equal Area projection are well suited for spatial epidemiology because they preserve area: each cell of the grid has the same area. We describe how data is projected onto such a grid, as well as grid-based algorithms for spatial epidemiological data-mining. The software program (FGBASE), that we have developed, implements these grid-based methods. The grid based algorithms perform extremely fast. This is particularly the case for cluster searches. When applied to a cohort of French Type 1 Diabetes (T1D) patients, as an example, the grid based algorithms detected potential clusters in a few seconds on a modern laptop. This compares very favorably to an equivalent cluster search using distance calculations instead of a grid, which took over 4 hours on the same computer. In the case study we discovered 4 potential clusters of T1D cases near the cities of Le Havre, Dunkerque, Toulouse and Nantes. One example of environmental analysis with our software was to study whether a significant association could be found between distance to vineyards with heavy pesticide. None was found. In both examples, the software facilitates the rapid testing of hypotheses. Grid-based algorithms for mining spatial epidemiological data provide advantages in terms of computational complexity thus improving the speed of computations. We believe that these methods and this software tool (FGBASE) will lower the computational barriers to entry for those performing epidemiological research.
Efficient Online Optimized Quantum Control for Adiabatic Quantum Computation

NASA Astrophysics Data System (ADS)

Quiroz, Gregory

Adiabatic quantum computation (AQC) relies on controlled adiabatic evolution to implement a quantum algorithm. While control evolution can take many forms, properly designed time-optimal control has been shown to be particularly advantageous for AQC. Grover's search algorithm is one such example where analytically-derived time-optimal control leads to improved scaling of the minimum energy gap between the ground state and first excited state and thus, the well-known quadratic quantum speedup. Analytical extensions beyond Grover's search algorithm present a daunting task that requires potentially intractable calculations of energy gaps and a significant degree of model certainty. Here, an in situ quantum control protocol is developed for AQC. The approach is shown to yield controls that approach the analytically-derived time-optimal controls for Grover's search algorithm. In addition, the protocol's convergence rate as a function of iteration number is shown to be essentially independent of system size. Thus, the approach is potentially scalable to many-qubit systems.
Integration of Libration Point Orbit Dynamics into a Universal 3-D Autonomous Formation Flying Algorithm

NASA Technical Reports Server (NTRS)

Folta, David; Bauer, Frank H. (Technical Monitor)

2001-01-01

The autonomous formation flying control algorithm developed by the Goddard Space Flight Center (GSFC) for the New Millennium Program (NMP) Earth Observing-1 (EO-1) mission is investigated for applicability to libration point orbit formations. In the EO-1 formation-flying algorithm, control is accomplished via linearization about a reference transfer orbit with a state transition matrix (STM) computed from state inputs. The effect of libration point orbit dynamics on this algorithm architecture is explored via computation of STMs using the flight proven code, a monodromy matrix developed from a N-body model of a libration orbit, and a standard STM developed from the gravitational and coriolis effects as measured at the libration point. A comparison of formation flying Delta-Vs calculated from these methods is made to a standard linear quadratic regulator (LQR) method. The universal 3-D approach is optimal in the sense that it can be accommodated as an open-loop or closed-loop control using only state information.
The application of the large particles method of numerical modeling of the process of carbonic nanostructures synthesis in plasma

NASA Astrophysics Data System (ADS)

Abramov, G. V.; Gavrilov, A. N.

2018-03-01

The article deals with the numerical solution of the mathematical model of the particles motion and interaction in multicomponent plasma by the example of electric arc synthesis of carbon nanostructures. The high order of the particles and the number of their interactions requires a significant input of machine resources and time for calculations. Application of the large particles method makes it possible to reduce the amount of computation and the requirements for hardware resources without affecting the accuracy of numerical calculations. The use of technology of GPGPU parallel computing using the Nvidia CUDA technology allows organizing all General purpose computation on the basis of the graphical processor graphics card. The comparative analysis of different approaches to parallelization of computations to speed up calculations with the choice of the algorithm in which to calculate the accuracy of the solution shared memory is used. Numerical study of the influence of particles density in the macro particle on the motion parameters and the total number of particle collisions in the plasma for different modes of synthesis has been carried out. The rational range of the coherence coefficient of particle in the macro particle is computed.
Accelerating and focusing protein-protein docking correlations using multi-dimensional rotational FFT generating functions.

PubMed

Ritchie, David W; Kozakov, Dima; Vajda, Sandor

2008-09-01

Predicting how proteins interact at the molecular level is a computationally intensive task. Many protein docking algorithms begin by using fast Fourier transform (FFT) correlation techniques to find putative rigid body docking orientations. Most such approaches use 3D Cartesian grids and are therefore limited to computing three dimensional (3D) translational correlations. However, translational FFTs can speed up the calculation in only three of the six rigid body degrees of freedom, and they cannot easily incorporate prior knowledge about a complex to focus and hence further accelerate the calculation. Furthemore, several groups have developed multi-term interaction potentials and others use multi-copy approaches to simulate protein flexibility, which both add to the computational cost of FFT-based docking algorithms. Hence there is a need to develop more powerful and more versatile FFT docking techniques. This article presents a closed-form 6D spherical polar Fourier correlation expression from which arbitrary multi-dimensional multi-property multi-resolution FFT correlations may be generated. The approach is demonstrated by calculating 1D, 3D and 5D rotational correlations of 3D shape and electrostatic expansions up to polynomial order L=30 on a 2 GB personal computer. As expected, 3D correlations are found to be considerably faster than 1D correlations but, surprisingly, 5D correlations are often slower than 3D correlations. Nonetheless, we show that 5D correlations will be advantageous when calculating multi-term knowledge-based interaction potentials. When docking the 84 complexes of the Protein Docking Benchmark, blind 3D shape plus electrostatic correlations take around 30 minutes on a contemporary personal computer and find acceptable solutions within the top 20 in 16 cases. Applying a simple angular constraint to focus the calculation around the receptor binding site produces acceptable solutions within the top 20 in 28 cases. Further constraining the search to the ligand binding site gives up to 48 solutions within the top 20, with calculation times of just a few minutes per complex. Hence the approach described provides a practical and fast tool for rigid body protein-protein docking, especially when prior knowledge about one or both binding sites is available.
Creation and Validation of an Automated Algorithm to Determine Postoperative Ventilator Requirements After Cardiac Surgery.

PubMed

Gabel, Eilon; Hofer, Ira S; Satou, Nancy; Grogan, Tristan; Shemin, Richard; Mahajan, Aman; Cannesson, Maxime

2017-05-01

In medical practice today, clinical data registries have become a powerful tool for measuring and driving quality improvement, especially among multicenter projects. Registries face the known problem of trying to create dependable and clear metrics from electronic medical records data, which are typically scattered and often based on unreliable data sources. The Society for Thoracic Surgery (STS) is one such example, and it supports manually collected data by trained clinical staff in an effort to obtain the highest-fidelity data possible. As a possible alternative, our team designed an algorithm to test the feasibility of producing computer-derived data for the case of postoperative mechanical ventilation hours. In this article, we study and compare the accuracy of algorithm-derived mechanical ventilation data with manual data extraction. We created a novel algorithm that is able to calculate mechanical ventilation duration for any postoperative patient using raw data from our EPIC electronic medical record. Utilizing nursing documentation of airway devices, documentation of lines, drains, and airways, and respiratory therapist ventilator settings, the algorithm produced results that were then validated against the STS registry. This enabled us to compare our algorithm results with data collected by human chart review. Any discrepancies were then resolved with manual calculation by a research team member. The STS registry contained a total of 439 University of California Los Angeles cardiac cases from April 1, 2013, to March 31, 2014. After excluding 201 patients for not remaining intubated, tracheostomy use, or for having 2 surgeries on the same day, 238 cases met inclusion criteria. Comparing the postoperative ventilation durations between the 2 data sources resulted in 158 (66%) ventilation durations agreeing within 1 hour, indicating a probable correct value for both sources. Among the discrepant cases, the algorithm yielded results that were exclusively correct in 75 (93.8%) cases, whereas the STS results were exclusively correct once (1.3%). The remaining 4 cases had inconclusive results after manual review because of a prolonged documentation gap between mechanical and spontaneous ventilation. In these cases, STS and algorithm results were different from one another but were both within the transition timespan. This yields an overall accuracy of 99.6% (95% confidence interval, 98.7%-100%) for the algorithm when compared with 68.5% (95% confidence interval, 62.6%-74.4%) for the STS data (P < .001). There is a significant appeal to having a computer algorithm capable of calculating metrics such as total ventilator times, especially because it is labor intensive and prone to human error. By incorporating 3 different sources into our algorithm and by using preprogrammed clinical judgment to overcome common errors with data entry, our results proved to be more comprehensive and more accurate, and they required a fraction of the computation time compared with manual review.
MODA: a new algorithm to compute optical depths in multidimensional hydrodynamic simulations

NASA Astrophysics Data System (ADS)

Perego, Albino; Gafton, Emanuel; Cabezón, Rubén; Rosswog, Stephan; Liebendörfer, Matthias

2014-08-01

Aims: We introduce the multidimensional optical depth algorithm (MODA) for the calculation of optical depths in approximate multidimensional radiative transport schemes, equally applicable to neutrinos and photons. Motivated by (but not limited to) neutrino transport in three-dimensional simulations of core-collapse supernovae and neutron star mergers, our method makes no assumptions about the geometry of the matter distribution, apart from expecting optically transparent boundaries. Methods: Based on local information about opacities, the algorithm figures out an escape route that tends to minimize the optical depth without assuming any predefined paths for radiation. Its adaptivity makes it suitable for a variety of astrophysical settings with complicated geometry (e.g., core-collapse supernovae, compact binary mergers, tidal disruptions, star formation, etc.). We implement the MODA algorithm into both a Eulerian hydrodynamics code with a fixed, uniform grid and into an SPH code where we use a tree structure that is otherwise used for searching neighbors and calculating gravity. Results: In a series of numerical experiments, we compare the MODA results with analytically known solutions. We also use snapshots from actual 3D simulations and compare the results of MODA with those obtained with other methods, such as the global and local ray-by-ray method. It turns out that MODA achieves excellent accuracy at a moderate computational cost. In appendix we also discuss implementation details and parallelization strategies.
Technical Note: spektr 3.0-A computational tool for x-ray spectrum modeling and analysis.

PubMed

Punnoose, J; Xu, J; Sisniega, A; Zbijewski, W; Siewerdsen, J H

2016-08-01

A computational toolkit (spektr 3.0) has been developed to calculate x-ray spectra based on the tungsten anode spectral model using interpolating cubic splines (TASMICS) algorithm, updating previous work based on the tungsten anode spectral model using interpolating polynomials (TASMIP) spectral model. The toolkit includes a matlab (The Mathworks, Natick, MA) function library and improved user interface (UI) along with an optimization algorithm to match calculated beam quality with measurements. The spektr code generates x-ray spectra (photons/mm(2)/mAs at 100 cm from the source) using TASMICS as default (with TASMIP as an option) in 1 keV energy bins over beam energies 20-150 kV, extensible to 640 kV using the TASMICS spectra. An optimization tool was implemented to compute the added filtration (Al and W) that provides a best match between calculated and measured x-ray tube output (mGy/mAs or mR/mAs) for individual x-ray tubes that may differ from that assumed in TASMICS or TASMIP and to account for factors such as anode angle. The median percent difference in photon counts for a TASMICS and TASMIP spectrum was 4.15% for tube potentials in the range 30-140 kV with the largest percentage difference arising in the low and high energy bins due to measurement errors in the empirically based TASMIP model and inaccurate polynomial fitting. The optimization tool reported a close agreement between measured and calculated spectra with a Pearson coefficient of 0.98. The computational toolkit, spektr, has been updated to version 3.0, validated against measurements and existing models, and made available as open source code. Video tutorials for the spektr function library, UI, and optimization tool are available.
Mathematical detection of aortic valve opening (B point) in impedance cardiography: A comparison of three popular algorithms.

PubMed

Árbol, Javier Rodríguez; Perakakis, Pandelis; Garrido, Alba; Mata, José Luis; Fernández-Santaella, M Carmen; Vila, Jaime

2017-03-01

The preejection period (PEP) is an index of left ventricle contractility widely used in psychophysiological research. Its computation requires detecting the moment when the aortic valve opens, which coincides with the B point in the first derivative of impedance cardiogram (ICG). Although this operation has been traditionally made via visual inspection, several algorithms based on derivative calculations have been developed to enable an automatic performance of the task. However, despite their popularity, data about their empirical validation are not always available. The present study analyzes the performance in the estimation of the aortic valve opening of three popular algorithms, by comparing their performance with the visual detection of the B point made by two independent scorers. Algorithm 1 is based on the first derivative of the ICG, Algorithm 2 on the second derivative, and Algorithm 3 on the third derivative. Algorithm 3 showed the highest accuracy rate (78.77%), followed by Algorithm 1 (24.57%) and Algorithm 2 (13.82%). In the automatic computation of PEP, Algorithm 2 resulted in significantly more missed cycles (48.57%) than Algorithm 1 (6.3%) and Algorithm 3 (3.5%). Algorithm 2 also estimated a significantly lower average PEP (70 ms), compared with the values obtained by Algorithm 1 (119 ms) and Algorithm 3 (113 ms). Our findings indicate that the algorithm based on the third derivative of the ICG performs significantly better. Nevertheless, a visual inspection of the signal proves indispensable, and this article provides a novel visual guide to facilitate the manual detection of the B point. © 2016 Society for Psychophysiological Research.
An improved K-means clustering algorithm in agricultural image segmentation

NASA Astrophysics Data System (ADS)

Cheng, Huifeng; Peng, Hui; Liu, Shanmei

Image segmentation is the first important step to image analysis and image processing. In this paper, according to color crops image characteristics, we firstly transform the color space of image from RGB to HIS, and then select proper initial clustering center and cluster number in application of mean-variance approach and rough set theory followed by clustering calculation in such a way as to automatically segment color component rapidly and extract target objects from background accurately, which provides a reliable basis for identification, analysis, follow-up calculation and process of crops images. Experimental results demonstrate that improved k-means clustering algorithm is able to reduce the computation amounts and enhance precision and accuracy of clustering.
An algorithm of Saxena-Easo on fuzzy time series forecasting

NASA Astrophysics Data System (ADS)

Ramadhani, L. C.; Anggraeni, D.; Kamsyakawuni, A.; Hadi, A. F.

2018-04-01

This paper presents a forecast model of Saxena-Easo fuzzy time series prediction to study the prediction of Indonesia inflation rate in 1970-2016. We use MATLAB software to compute this method. The algorithm of Saxena-Easo fuzzy time series doesn’t need stationarity like conventional forecasting method, capable of dealing with the value of time series which are linguistic and has the advantage of reducing the calculation, time and simplifying the calculation process. Generally it’s focus on percentage change as the universe discourse, interval partition and defuzzification. The result indicate that between the actual data and the forecast data are close enough with Root Mean Square Error (RMSE) = 1.5289.
Random Numbers and Monte Carlo Methods

NASA Astrophysics Data System (ADS)

Scherer, Philipp O. J.

Many-body problems often involve the calculation of integrals of very high dimension which cannot be treated by standard methods. For the calculation of thermodynamic averages Monte Carlo methods are very useful which sample the integration volume at randomly chosen points. After summarizing some basic statistics, we discuss algorithms for the generation of pseudo-random numbers with given probability distribution which are essential for all Monte Carlo methods. We show how the efficiency of Monte Carlo integration can be improved by sampling preferentially the important configurations. Finally the famous Metropolis algorithm is applied to classical many-particle systems. Computer experiments visualize the central limit theorem and apply the Metropolis method to the traveling salesman problem.
Hybrid reduced order modeling for assembly calculations

DOE PAGES

Bang, Youngsuk; Abdel-Khalik, Hany S.; Jessee, Matthew A.; ...

2015-08-14

While the accuracy of assembly calculations has greatly improved due to the increase in computer power enabling more refined description of the phase space and use of more sophisticated numerical algorithms, the computational cost continues to increase which limits the full utilization of their effectiveness for routine engineering analysis. Reduced order modeling is a mathematical vehicle that scales down the dimensionality of large-scale numerical problems to enable their repeated executions on small computing environment, often available to end users. This is done by capturing the most dominant underlying relationships between the model's inputs and outputs. Previous works demonstrated the usemore » of the reduced order modeling for a single physics code, such as a radiation transport calculation. This paper extends those works to coupled code systems as currently employed in assembly calculations. Finally, numerical tests are conducted using realistic SCALE assembly models with resonance self-shielding, neutron transport, and nuclides transmutation/depletion models representing the components of the coupled code system.« less
Extending the length and time scales of Gram-Schmidt Lyapunov vector computations

NASA Astrophysics Data System (ADS)

Costa, Anthony B.; Green, Jason R.

2013-08-01

Lyapunov vectors have found growing interest recently due to their ability to characterize systems out of thermodynamic equilibrium. The computation of orthogonal Gram-Schmidt vectors requires multiplication and QR decomposition of large matrices, which grow as N2 (with the particle count). This expense has limited such calculations to relatively small systems and short time scales. Here, we detail two implementations of an algorithm for computing Gram-Schmidt vectors. The first is a distributed-memory message-passing method using Scalapack. The second uses the newly-released MAGMA library for GPUs. We compare the performance of both codes for Lennard-Jones fluids from N=100 to 1300 between Intel Nahalem/Infiniband DDR and NVIDIA C2050 architectures. To our best knowledge, these are the largest systems for which the Gram-Schmidt Lyapunov vectors have been computed, and the first time their calculation has been GPU-accelerated. We conclude that Lyapunov vector calculations can be significantly extended in length and time by leveraging the power of GPU-accelerated linear algebra.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, C.; Yu, G.; Wang, K.

The physical designs of the new concept reactors which have complex structure, various materials and neutronic energy spectrum, have greatly improved the requirements to the calculation methods and the corresponding computing hardware. Along with the widely used parallel algorithm, heterogeneous platforms architecture has been introduced into numerical computations in reactor physics. Because of the natural parallel characteristics, the CPU-FPGA architecture is often used to accelerate numerical computation. This paper studies the application and features of this kind of heterogeneous platforms used in numerical calculation of reactor physics through practical examples. After the designed neutron diffusion module based on CPU-FPGA architecturemore » achieves a 11.2 speed up factor, it is proved to be feasible to apply this kind of heterogeneous platform into reactor physics. (authors)« less

Partitioning and packing mathematical simulation models for calculation on parallel computers

NASA Technical Reports Server (NTRS)

Arpasi, D. J.; Milner, E. J.

1986-01-01

The development of multiprocessor simulations from a serial set of ordinary differential equations describing a physical system is described. Degrees of parallelism (i.e., coupling between the equations) and their impact on parallel processing are discussed. The problem of identifying computational parallelism within sets of closely coupled equations that require the exchange of current values of variables is described. A technique is presented for identifying this parallelism and for partitioning the equations for parallel solution on a multiprocessor. An algorithm which packs the equations into a minimum number of processors is also described. The results of the packing algorithm when applied to a turbojet engine model are presented in terms of processor utilization.
Calibration of the computer model describing flows in the water supply system; example of the application of a genetic algorithm

NASA Astrophysics Data System (ADS)

Orłowska-Szostak, Maria; Orłowski, Ryszard

2017-11-01

The paper discusses some relevant aspects of the calibration of a computer model describing flows in the water supply system. The authors described an exemplary water supply system and used it as a practical illustration of calibration. A range of measures was discussed and applied, which improve the convergence and effective use of calculations in the calibration process and also the effect of such calibration which is the validity of the results obtained. Drawing up results of performed measurements, i.e. estimating pipe roughnesses, the authors performed using the genetic algorithm implementation of which is a software developed by Resan Labs company from Brazil.
Flight data processing with the F-8 adaptive algorithm

NASA Technical Reports Server (NTRS)

Hartmann, G.; Stein, G.; Petersen, K.

1977-01-01

An explicit adaptive control algorithm based on maximum likelihood estimation of parameters has been designed for NASA's DFBW F-8 aircraft. To avoid iterative calculations, the algorithm uses parallel channels of Kalman filters operating at fixed locations in parameter space. This algorithm has been implemented in NASA/DFRC's Remotely Augmented Vehicle (RAV) facility. Real-time sensor outputs (rate gyro, accelerometer and surface position) are telemetered to a ground computer which sends new gain values to an on-board system. Ground test data and flight records were used to establish design values of noise statistics and to verify the ground-based adaptive software. The software and its performance evaluation based on flight data are described
Computation of viscous flows over airfoils, including separation, with a coupling approach

NASA Technical Reports Server (NTRS)

Leballeur, J. C.

1983-01-01

Viscous incompressible flows over single or multiple airfoils, with or without separation, were computed using an inviscid flow calculation, with modified boundary conditions, and by a method providing calculation and coupling for boundary layers and wakes, within conditions of strong viscous interaction. The inviscid flow is calculated with a method of singularities, the numerics of which were improved by using both source and vortex distributions over profiles, associated with regularity conditions for the fictitious flows inside of the airfoils. The viscous calculation estimates the difference between viscous flow and inviscid interacting flow, with a direct or inverse integral method, laminar or turbulent, with or without reverse flow. The numerical method for coupling determines iteratively the boundary conditions for the inviscid flow. For attached viscous layers regions, an underrelaxation is locally calculated to insure stability. For separated or separating regions, a special semi-inverse algorithm is used. Comparisons with experiments are presented.
Techniques for the computation in demographic projections of health manpower.

PubMed

Horbach, L

1979-01-01

Some basic principles and algorithms are presented which can be used for projective calculations of medical staff on the basis of demographic data. The effects of modifications of the input data such as by health policy measures concerning training capacity, can be demonstrated by repeated calculations with assumptions. Such models give a variety of results and may highlight the probable future balance between health manpower supply and requirements.
Accelerating MP2C dispersion corrections for dimers and molecular crystals

NASA Astrophysics Data System (ADS)

Huang, Yuanhang; Shao, Yihan; Beran, Gregory J. O.

2013-06-01

The MP2C dispersion correction of Pitonak and Hesselmann [J. Chem. Theory Comput. 6, 168 (2010)], 10.1021/ct9005882 substantially improves the performance of second-order Møller-Plesset perturbation theory for non-covalent interactions, albeit with non-trivial computational cost. Here, the MP2C correction is computed in a monomer-centered basis instead of a dimer-centered one. When applied to a single dimer MP2 calculation, this change accelerates the MP2C dispersion correction several-fold while introducing only trivial new errors. More significantly, in the context of fragment-based molecular crystal studies, combination of the new monomer basis algorithm and the periodic symmetry of the crystal reduces the cost of computing the dispersion correction by two orders of magnitude. This speed-up reduces the MP2C dispersion correction calculation from a significant computational expense to a negligible one in crystals like aspirin or oxalyl dihydrazide, without compromising accuracy.
Computing Gröbner and Involutive Bases for Linear Systems of Difference Equations

NASA Astrophysics Data System (ADS)

Yanovich, Denis

2018-02-01

The computation of involutive bases and Gröbner bases for linear systems of difference equations is solved and its importance for physical and mathematical problems is discussed. The algorithm and issues concerning its implementation in C are presented and calculation times are compared with the competing programs. The paper ends with consideration on the parallel version of this implementation and its scalability.
Aircraft Trajectories Computation-Prediction-Control. Volume 1 (La Trajectoire de l’Avion Calcul-Prediction-Controle)

DTIC Science & Technology

1990-03-01

knowledge covering problems of this type is called calculus of variations or optimal control theory (Refs. 1-8). As stated before, appli - cations occur...to the optimality conditions and the feasibility equations of Problem (GP), respectively. Clearly, after the transformation (26) is applied , the...trajectories, the primal sequential gradient-restoration algorithm (PSGRA) is applied to compute optimal trajectories for aeroassisted orbital transfer
A computer-based servo system for controlling isotonic contractions of muscle.

PubMed

Smith, J P; Barsotti, R J

1993-11-01

We have developed a computer-based servo system for controlling isotonic releases in muscle. This system is a composite of commercially available devices: an IBM personal computer, an analog-to-digital (A/D) board, an Akers AE801 force transducer, and a Cambridge Technology motor. The servo loop controlling the force clamp is generated by computer via the A/D board, using a program written in QuickBASIC 4.5. Results are shown that illustrate the ability of the system to clamp the force generated by either skinned cardiac trabeculae or single rabbit psoas fibers down to the resolution of the force transducer within 4 ms. This rate is independent of the level of activation of the tissue and the size of the load imposed during the release. The key to the effectiveness of the system consists of two algorithms that are described in detail. The first is used to calculate the error signal to hold force to the desired level. The second algorithm is used to calculate the appropriate gain of the servo for a particular fiber and the size of the desired load to be imposed. The results show that the described computer-based method for controlling isotonic releases in muscle represents a good compromise between simplicity and performance and is an alternative to the custom-built digital/analog servo devices currently being used in studies of muscle mechanics.
Improved algorithm for calculating the Chandrasekhar function

NASA Astrophysics Data System (ADS)

Jablonski, A.

2013-02-01

Theoretical models of electron transport in condensed matter require an effective source of the Chandrasekhar H(x,omega) function. A code providing the H(x,omega) function has to be both accurate and very fast. The current revision of the code published earlier [A. Jablonski, Comput. Phys. Commun. 183 (2012) 1773] decreased the running time, averaged over different pairs of arguments x and omega, by a factor of more than 20. The decrease of the running time in the range of small values of the argument x, less than 0.05, is even more pronounced, reaching a factor of 30. The accuracy of the current code is not affected, and is typically better than 12 decimal places. New version program summaryProgram title: CHANDRAS_v2 Catalogue identifier: AEMC_v2_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEMC_v2_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC license, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 976 No. of bytes in distributed program, including test data, etc.: 11416 Distribution format: tar.gz Programming language: Fortran 90 Computer: Any computer with a Fortran 90 compiler Operating system: Windows 7, Windows XP, Unix/Linux RAM: 0.7 MB Classification: 2.4, 7.2 Catalogue identifier of previous version: AEMC_v1_0 Journal reference of previous version: Comput. Phys. Commun. 183 (2012) 1773 Does the new version supersede the old program: Yes Nature of problem: An attempt has been made to develop a subroutine that calculates the Chandrasekhar function with high accuracy, of at least 10 decimal places. Simultaneously, this subroutine should be very fast. Both requirements stem from the theory of electron transport in condensed matter. Solution method: Two algorithms were developed, each based on a different integral representation of the Chandrasekhar function. The final algorithm is edited by mixing these two algorithms by selecting ranges of the argument omega in which the performance is the fastest. Reasons for the new version: Some of the theoretical models describing electron transport in condensed matter need a source of the Chandrasekhar H function values with an accuracy of at least 10 decimal places. Additionally, calculations of this function should be as fast as possible since frequent calls to a subroutine providing this function are made (e.g., numerical evaluation of a double integral with a complicated integrand containing the H function). Both conditions were satisfied in the algorithm previously published [1]. However, it has been found that a proper selection of the quadrature in an integral representation of the Chandrasekhar function may considerably decrease the running time. By suitable selection of the number of abscissas in Gauss-Legendre quadrature, the execution time was decreased by a factor of more than 20. Simultaneously, the accuracy of results has not been affected. Summary of revisions: (1) As in previous work [1], two integral representations of the Chandrasekhar function, H(x,omega), were considered: the expression published by Dudarev and Whelan [2] and the expression published by Davidović et al. [3]. The algorithms implementing these representations were designated A and B, respectively. All integrals in these implementations were previously calculated using Romberg quadrature. It has been found, however, that the use of Gauss-Legendre quadrature considerably improved the performance of both algorithms. Two conditions have to be satisfied. (i) The number of abscissas, N, has to be rather large, and (ii) the abscissas and corresponding weights should be determined with accuracy as high as possible. The abscissas and weights are available for N=16, 20, 24, 32, 40, 48, 64, 80, and 96 with accuracy of 20 decimal places [4], and all these values were introduced into a new procedure GAUSS replacing procedure ROMBERG. Due to the fact that the implemented tables are rather extensive, they were recalculated using the Rybicki algorithm (Ref. [5], pp. 183-184) and rechecked. No errors or misprints were found. (2) In the integral representation of the H function derived by Davidović et al. [3], the positive root ν0 of the so-called dispersion function needs to be calculated with accuracy of at least 10 decimal places (see. Ref [6], pp. 61-64 and Ref. [1], Eqs. (5) and (29)). For small values of the argument omega and values of omega close to unity, the nonlinear equation in one unknown, ν0, can be solved analytically. New simple analytical expressions were derived here that can be efficiently used in calculations of the root. (3) The above modifications of the code considerably decreased the time of calculation of both algorithms A and B. The results are summarized in Fig. 1. The time of calculations is in fact the CPU time in microseconds for a computer equipped with an Inter Xeon processor (3.46 GHz) using Lahey-Fujitsu Fortran v. 7.2. Time of calculations of the H(x,omega) function averaged over different pairs of arguments x and omega. (a) 400 pairs uniformly distributed in the ranges 0<=x<=0.05 and 0<=omega<=1; (b) 400 pairs uniformly distributed in the ranges 0.05<=x<=1 and 0<=omega<=1. The shortest execution time averaged over values of the argument x exceeding 0.05 has been observed for algorithm B and Gauss-Legendre quadrature with the number of abscissas equal to 64 (23.2 μs). As compared with Romberg quadrature, the execution time was shortened by a factor of 22.5. For small x values, below 0.05, both algorithms A and B are considerably faster if Gauss-Legendre quadrature is used. For N=64, the average time of execution of algorithm B is decreased with respect to Romberg quadrature by a factor close to 30. However, in that range of argument x, algorithm A exhibits much faster performance. Furthermore, the average execution time of algorithm A, equal to about 100 μs, is practically independent of the number of abscissas N. (4) For Romberg quadrature, to optimize the performance, the mixed algorithm C was proposed in which algorithm A is used for argument x smaller than or equal to x0=0.4, while algorithm B is used for x larger than 0.4 [1]. For Gauss-Legendre quadrature, the limit x0 was found to depend on the number of abscissas N. For each value of N considered, the time of calculations of the H function was determined for pairs of arguments uniformly distributed in the ranges 0<=x<=0.05 and 0<=omega<=1, and for pairs of arguments uniformly distributed in the ranges 0.05<=x<=1 and 0<=omega<=1. As shown in Fig. 2 for N=64, algorithm A is faster than algorithm B for x smaller than or equal to 0.0225. Comparison of the running times of algorithms A and B. Open circles: algorithm B is faster than the algorithm A; full circles: algorithm A is faster than algorithm B. Thus, the value of x0=0.0225 is proposed for the mixed algorithm C when Gauss-Legendere quadrature with N=64 is used. Similar computer experiments performed for other values of N are summarized below. L N0 1 16 0.25 2 20 0.15 3 24 0.10 4 32 0.050 5 40 0.030 6 48 0.045 7 64 0.0225-Recommended 8 80 0.0125 9 96 0.020 The flag L is one of the input parameters for the subroutine GAUSS. In the programs implementing algorithms A, B, and C (CHANDRA, CHANDRB, and CHANDRC), Gauss-Legendre quadrature with N=64 is currently set. As follows from Fig. 1, algorithm B (and consequently algorithm C) is the fastest in that case. It is still possible to change the number of abscissas; the flag L then has to be modified in lines 165, 169, 185, 189, and 304 of program CHANDRAS_v2, and the value of x0 in line 111 has to be adjusted according to the table above. (5) The above modifications of the code did not affect the accuracy of the calculated Chandrasekhar function, as compared to the original code [1]. For the pairs of arguments shown in Fig. 2, the accuracy of the H function, calculated from algorithms A and B, reached at least 12 decimal digits; however, in the majority of cases, the accuracy is equal to 13 decimal digits. Restrictions: Two input parameters for the Chandrasekhar function, x and omega, are restricted to the ranges 0<=x<=1 and 0<=omega<=1, which is sufficient in numerous applications. Running time: between 15 and 100 μs for one pair of arguments of the Chandrasekhar function.
Quantum Molecular Dynamics Simulations of Nanotube Tip Assisted Reactions

NASA Technical Reports Server (NTRS)

Menon, Madhu

1998-01-01

In this report we detail the development and application of an efficient quantum molecular dynamics computational algorithm and its application to the nanotube-tip assisted reactions on silicon and diamond surfaces. The calculations shed interesting insights into the microscopic picture of tip surface interactions.
SIMPLE METHOD FOR THE REPRESENTATION, QUANTIFICATION, AND COMPARISON OF THE VOLUMES AND SHAPES OF CHEMICAL COMPOUNDS

EPA Science Inventory

A conceptually and computationally simple method for the definition, display, quantification, and comparison of the shapes of three-dimensional mathematical molecular models is presented. Molecular or solvent-accessible volume and surface area can also be calculated. Algorithms, ...
Automatic energy calibration algorithm for an RBS setup

DOE Office of Scientific and Technical Information (OSTI.GOV)

Silva, Tiago F.; Moro, Marcos V.; Added, Nemitala

2013-05-06

This work describes a computer algorithm for automatic extraction of the energy calibration parameters from a Rutherford Back-Scattering Spectroscopy (RBS) spectrum. Parameters like the electronic gain, electronic offset and detection resolution (FWHM) of a RBS setup are usually determined using a standard sample. In our case, the standard sample comprises of a multi-elemental thin film made of a mixture of Ti-Al-Ta that is analyzed at the beginning of each run at defined beam energy. A computer program has been developed to extract automatically the calibration parameters from the spectrum of the standard sample. The code evaluates the first derivative ofmore » the energy spectrum, locates the trailing edges of the Al, Ti and Ta peaks and fits a first order polynomial for the energy-channel relation. The detection resolution is determined fitting the convolution of a pre-calculated theoretical spectrum. To test the code, data of two years have been analyzed and the results compared with the manual calculations done previously, obtaining good agreement.« less
A general-purpose framework to simulate musculoskeletal system of human body: using a motion tracking approach.

PubMed

Ehsani, Hossein; Rostami, Mostafa; Gudarzi, Mohammad

2016-02-01

Computation of muscle force patterns that produce specified movements of muscle-actuated dynamic models is an important and challenging problem. This problem is an undetermined one, and then a proper optimization is required to calculate muscle forces. The purpose of this paper is to develop a general model for calculating all muscle activation and force patterns in an arbitrary human body movement. For this aim, the equations of a multibody system forward dynamics, which is considered for skeletal system of the human body model, is derived using Lagrange-Euler formulation. Next, muscle contraction dynamics is added to this model and forward dynamics of an arbitrary musculoskeletal system is obtained. For optimization purpose, the obtained model is used in computed muscle control algorithm, and a closed-loop system for tracking desired motions is derived. Finally, a popular sport exercise, biceps curl, is simulated by using this algorithm and the validity of the obtained results is evaluated via EMG signals.
An effective algorithm for calculating the Chandrasekhar function

NASA Astrophysics Data System (ADS)

Jablonski, A.

2012-08-01

Numerical values of the Chandrasekhar function are needed with high accuracy in evaluations of theoretical models describing electron transport in condensed matter. An algorithm for such calculations should be possibly fast and also accurate, e.g. an accuracy of 10 decimal digits is needed for some applications. Two of the integral representations of the Chandrasekhar function are prospective for constructing such an algorithm, but suitable transformations are needed to obtain a rapidly converging quadrature. A mixed algorithm is proposed in which the Chandrasekhar function is calculated from two algorithms, depending on the value of one of the arguments. Catalogue identifier: AEMC_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEMC_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 567 No. of bytes in distributed program, including test data, etc.: 4444 Distribution format: tar.gz Programming language: Fortran 90 Computer: Any computer with a FORTRAN 90 compiler Operating system: Linux, Windows 7, Windows XP RAM: 0.6 Mb Classification: 2.4, 7.2 Nature of problem: An attempt has been made to develop a subroutine that calculates the Chandrasekhar function with high accuracy, of at least 10 decimal places. Simultaneously, this subroutine should be very fast. Both requirements stem from the theory of electron transport in condensed matter. Solution method: Two algorithms were developed, each based on a different integral representation of the Chandrasekhar function. The final algorithm is edited by mixing these two algorithms and by selecting ranges of the argument ω in which performance is the fastest. Restrictions: Two input parameters for the Chandrasekhar function, x and ω (notation used in the code), are restricted to the range: 0⩽x⩽1 and 0⩽ω⩽1, which is sufficient in numerous applications. Unusual features: The program uses the Romberg quadrature for integration. This quadrature is applicable to integrands that satisfy several requirements (the integrand does not vary rapidly and does not change sign in the integration interval; furthermore, the integrand is finite at the endpoints). Consequently, the analyzed integrands were transformed so that these requirements were satisfied. In effect, one can conveniently control the accuracy of integration. Although the desired fractional accuracy was set at 10-10, the obtained accuracy of the Chandrasekhar function was much higher, typically 13 decimal places. Running time: Between 0.7 and 5 milliseconds for one pair of arguments of the Chandrasekhar function.
Computer architectures for computational physics work done by Computational Research and Technology Branch and Advanced Computational Concepts Group

NASA Technical Reports Server (NTRS)

1985-01-01

Slides are reproduced that describe the importance of having high performance number crunching and graphics capability. They also indicate the types of research and development underway at Ames Research Center to ensure that, in the near term, Ames is a smart buyer and user, and in the long-term that Ames knows the best possible solutions for number crunching and graphics needs. The drivers for this research are real computational physics applications of interest to Ames and NASA. They are concerned with how to map the applications, and how to maximize the physics learned from the results of the calculations. The computer graphics activities are aimed at getting maximum information from the three-dimensional calculations by using the real time manipulation of three-dimensional data on the Silicon Graphics workstation. Work is underway on new algorithms that will permit the display of experimental results that are sparse and random, the same way that the dense and regular computed results are displayed.
Computer Drawing Method for Operating Characteristic Curve of PV Power Plant Array Unit

NASA Astrophysics Data System (ADS)

Tan, Jianbin

2018-02-01

According to the engineering design of large-scale grid-connected photovoltaic power stations and the research and development of many simulation and analysis systems, it is necessary to draw a good computer graphics of the operating characteristic curves of photovoltaic array elements and to propose a good segmentation non-linear interpolation algorithm. In the calculation method, Component performance parameters as the main design basis, the computer can get 5 PV module performances. At the same time, combined with the PV array series and parallel connection, the computer drawing of the performance curve of the PV array unit can be realized. At the same time, the specific data onto the module of PV development software can be calculated, and the good operation of PV array unit can be improved on practical application.
Parallel discontinuous Galerkin FEM for computing hyperbolic conservation law on unstructured grids

NASA Astrophysics Data System (ADS)

Ma, Xinrong; Duan, Zhijian

2018-04-01

High-order resolution Discontinuous Galerkin finite element methods (DGFEM) has been known as a good method for solving Euler equations and Navier-Stokes equations on unstructured grid, but it costs too much computational resources. An efficient parallel algorithm was presented for solving the compressible Euler equations. Moreover, the multigrid strategy based on three-stage three-order TVD Runge-Kutta scheme was used in order to improve the computational efficiency of DGFEM and accelerate the convergence of the solution of unsteady compressible Euler equations. In order to make each processor maintain load balancing, the domain decomposition method was employed. Numerical experiment performed for the inviscid transonic flow fluid problems around NACA0012 airfoil and M6 wing. The results indicated that our parallel algorithm can improve acceleration and efficiency significantly, which is suitable for calculating the complex flow fluid.
Bioinformatics algorithm based on a parallel implementation of a machine learning approach using transducers

NASA Astrophysics Data System (ADS)

Roche-Lima, Abiel; Thulasiram, Ruppa K.

2012-02-01

Finite automata, in which each transition is augmented with an output label in addition to the familiar input label, are considered finite-state transducers. Transducers have been used to analyze some fundamental issues in bioinformatics. Weighted finite-state transducers have been proposed to pairwise alignments of DNA and protein sequences; as well as to develop kernels for computational biology. Machine learning algorithms for conditional transducers have been implemented and used for DNA sequence analysis. Transducer learning algorithms are based on conditional probability computation. It is calculated by using techniques, such as pair-database creation, normalization (with Maximum-Likelihood normalization) and parameters optimization (with Expectation-Maximization - EM). These techniques are intrinsically costly for computation, even worse when are applied to bioinformatics, because the databases sizes are large. In this work, we describe a parallel implementation of an algorithm to learn conditional transducers using these techniques. The algorithm is oriented to bioinformatics applications, such as alignments, phylogenetic trees, and other genome evolution studies. Indeed, several experiences were developed using the parallel and sequential algorithm on Westgrid (specifically, on the Breeze cluster). As results, we obtain that our parallel algorithm is scalable, because execution times are reduced considerably when the data size parameter is increased. Another experience is developed by changing precision parameter. In this case, we obtain smaller execution times using the parallel algorithm. Finally, number of threads used to execute the parallel algorithm on the Breezy cluster is changed. In this last experience, we obtain as result that speedup is considerably increased when more threads are used; however there is a convergence for number of threads equal to or greater than 16.
Design and implementation of the modified signed digit multiplication routine on a ternary optical computer.

PubMed

Xu, Qun; Wang, Xianchao; Xu, Chao

2017-06-01

Multiplication with traditional electronic computers is faced with a low calculating accuracy and a long computation time delay. To overcome these problems, the modified signed digit (MSD) multiplication routine is established based on the MSD system and the carry-free adder. Also, its parallel algorithm and optimization techniques are studied in detail. With the help of a ternary optical computer's characteristics, the structured data processor is designed especially for the multiplication routine. Several ternary optical operators are constructed to perform M transformations and summations in parallel, which has accelerated the iterative process of multiplication. In particular, the routine allocates data bits of the ternary optical processor based on digits of multiplication input, so the accuracy of the calculation results can always satisfy the users. Finally, the routine is verified by simulation experiments, and the results are in full compliance with the expectations. Compared with an electronic computer, the MSD multiplication routine is not only good at dealing with large-value data and high-precision arithmetic, but also maintains lower power consumption and fewer calculating delays.

A cubic scaling algorithm for excited states calculations in particle-particle random phase approximation

NASA Astrophysics Data System (ADS)

Lu, Jianfeng; Yang, Haizhao

2017-07-01

The particle-particle random phase approximation (pp-RPA) has been shown to be capable of describing double, Rydberg, and charge transfer excitations, for which the conventional time-dependent density functional theory (TDDFT) might not be suitable. It is thus desirable to reduce the computational cost of pp-RPA so that it can be efficiently applied to larger molecules and even solids. This paper introduces an O (N3) algorithm, where N is the number of orbitals, based on an interpolative separable density fitting technique and the Jacobi-Davidson eigensolver to calculate a few low-lying excitations in the pp-RPA framework. The size of the pp-RPA matrix can also be reduced by keeping only a small portion of orbitals with orbital energy close to the Fermi energy. This reduced system leads to a smaller prefactor of the cubic scaling algorithm, while keeping the accuracy for the low-lying excitation energies.
Dosimetric Evaluation of Metal Artefact Reduction using Metal Artefact Reduction (MAR) Algorithm and Dual-energy Computed Tomography (CT) Method

NASA Astrophysics Data System (ADS)

Laguda, Edcer Jerecho

Purpose: Computed Tomography (CT) is one of the standard diagnostic imaging modalities for the evaluation of a patient's medical condition. In comparison to other imaging modalities such as Magnetic Resonance Imaging (MRI), CT is a fast acquisition imaging device with higher spatial resolution and higher contrast-to-noise ratio (CNR) for bony structures. CT images are presented through a gray scale of independent values in Hounsfield units (HU). High HU-valued materials represent higher density. High density materials, such as metal, tend to erroneously increase the HU values around it due to reconstruction software limitations. This problem of increased HU values due to metal presence is referred to as metal artefacts. Hip prostheses, dental fillings, aneurysm clips, and spinal clips are a few examples of metal objects that are of clinical relevance. These implants create artefacts such as beam hardening and photon starvation that distort CT images and degrade image quality. This is of great significance because the distortions may cause improper evaluation of images and inaccurate dose calculation in the treatment planning system. Different algorithms are being developed to reduce these artefacts for better image quality for both diagnostic and therapeutic purposes. However, very limited information is available about the effect of artefact correction on dose calculation accuracy. This research study evaluates the dosimetric effect of metal artefact reduction algorithms on severe artefacts on CT images. This study uses Gemstone Spectral Imaging (GSI)-based MAR algorithm, projection-based Metal Artefact Reduction (MAR) algorithm, and the Dual-Energy method. Materials and Methods: The Gemstone Spectral Imaging (GSI)-based and SMART Metal Artefact Reduction (MAR) algorithms are metal artefact reduction protocols embedded in two different CT scanner models by General Electric (GE), and the Dual-Energy Imaging Method was developed at Duke University. All three approaches were applied in this research for dosimetric evaluation on CT images with severe metal artefacts. The first part of the research used a water phantom with four iodine syringes. Two sets of plans, multi-arc plans and single-arc plans, using the Volumetric Modulated Arc therapy (VMAT) technique were designed to avoid or minimize influences from high-density objects. The second part of the research used projection-based MAR Algorithm and the Dual-Energy Method. Calculated Doses (Mean, Minimum, and Maximum Doses) to the planning treatment volume (PTV) were compared and homogeneity index (HI) calculated. Results: (1) Without the GSI-based MAR application, a percent error between mean dose and the absolute dose ranging from 3.4-5.7% per fraction was observed. In contrast, the error was decreased to a range of 0.09-2.3% per fraction with the GSI-based MAR algorithm. There was a percent difference ranging from 1.7-4.2% per fraction between with and without using the GSI-based MAR algorithm. (2) A range of 0.1-3.2% difference was observed for the maximum dose values, 1.5-10.4% for minimum dose difference, and 1.4-1.7% difference on the mean doses. Homogeneity indexes (HI) ranging from 0.068-0.065 for dual-energy method and 0.063-0.141 with projection-based MAR algorithm were also calculated. Conclusion: (1) Percent error without using the GSI-based MAR algorithm may deviate as high as 5.7%. This error invalidates the goal of Radiation Therapy to provide a more precise treatment. Thus, GSI-based MAR algorithm was desirable due to its better dose calculation accuracy. (2) Based on direct numerical observation, there was no apparent deviation between the mean doses of different techniques but deviation was evident on the maximum and minimum doses. The HI for the dual-energy method almost achieved the desirable null values. In conclusion, the Dual-Energy method gave better dose calculation accuracy to the planning treatment volume (PTV) for images with metal artefacts than with or without GE MAR Algorithm.
Fast adaptive flat-histogram ensemble to enhance the sampling in large systems

NASA Astrophysics Data System (ADS)

Xu, Shun; Zhou, Xin; Jiang, Yi; Wang, YanTing

2015-09-01

An efficient novel algorithm was developed to estimate the Density of States (DOS) for large systems by calculating the ensemble means of an extensive physical variable, such as the potential energy, U, in generalized canonical ensembles to interpolate the interior reverse temperature curve , where S( U) is the logarithm of the DOS. This curve is computed with different accuracies in different energy regions to capture the dependence of the reverse temperature on U without setting prior grid in the U space. By combining with a U-compression transformation, we decrease the computational complexity from O( N 3/2) in the normal Wang Landau type method to O( N 1/2) in the current algorithm, as the degrees of freedom of system N. The efficiency of the algorithm is demonstrated by applying to Lennard Jones fluids with various N, along with its ability to find different macroscopic states, including metastable states.
Gradient-based Optimization for Poroelastic and Viscoelastic MR Elastography

PubMed Central

Tan, Likun; McGarry, Matthew D.J.; Van Houten, Elijah E.W.; Ji, Ming; Solamen, Ligin; Weaver, John B.

2017-01-01

We describe an efficient gradient computation for solving inverse problems arising in magnetic resonance elastography (MRE). The algorithm can be considered as a generalized ‘adjoint method’ based on a Lagrangian formulation. One requirement for the classic adjoint method is assurance of the self-adjoint property of the stiffness matrix in the elasticity problem. In this paper, we show this property is no longer a necessary condition in our algorithm, but the computational performance can be as efficient as the classic method, which involves only two forward solutions and is independent of the number of parameters to be estimated. The algorithm is developed and implemented in material property reconstructions using poroelastic and viscoelastic modeling. Various gradient- and Hessian-based optimization techniques have been tested on simulation, phantom and in vivo brain data. The numerical results show the feasibility and the efficiency of the proposed scheme for gradient calculation. PMID:27608454
Adaptive DIT-Based Fringe Tracking and Prediction at IOTA

NASA Technical Reports Server (NTRS)

Wilson, Edward; Pedretti, Ettore; Bregman, Jesse; Mah, Robert W.; Traub, Wesley A.

2004-01-01

An automatic fringe tracking system has been developed and implemented at the Infrared Optical Telescope Array (IOTA). In testing during May 2002, the system successfully minimized the optical path differences (OPDs) for all three baselines at IOTA. Based on sliding window discrete Fourier transform (DFT) calculations that were optimized for computational efficiency and robustness to atmospheric disturbances, the algorithm has also been tested extensively on off-line data. Implemented in ANSI C on the 266 MHZ PowerPC processor running the VxWorks real-time operating system, the algorithm runs in approximately 2.0 milliseconds per scan (including all three interferograms), using the science camera and piezo scanners to measure and correct the OPDs. Preliminary analysis on an extension of this algorithm indicates a potential for predictive tracking, although at present, real-time implementation of this extension would require significantly more computational capacity.
CUDA-based high-performance computing of the S-BPF algorithm with no-waiting pipelining

NASA Astrophysics Data System (ADS)

Deng, Lin; Yan, Bin; Chang, Qingmei; Han, Yu; Zhang, Xiang; Xi, Xiaoqi; Li, Lei

2015-10-01

The backprojection-filtration (BPF) algorithm has become a good solution for local reconstruction in cone-beam computed tomography (CBCT). However, the reconstruction speed of BPF is a severe limitation for clinical applications. The selective-backprojection filtration (S-BPF) algorithm is developed to improve the parallel performance of BPF by selective backprojection. Furthermore, the general-purpose graphics processing unit (GP-GPU) is a popular tool for accelerating the reconstruction. Much work has been performed aiming for the optimization of the cone-beam back-projection. As the cone-beam back-projection process becomes faster, the data transportation holds a much bigger time proportion in the reconstruction than before. This paper focuses on minimizing the total time in the reconstruction with the S-BPF algorithm by hiding the data transportation among hard disk, CPU and GPU. And based on the analysis of the S-BPF algorithm, some strategies are implemented: (1) the asynchronous calls are used to overlap the implemention of CPU and GPU, (2) an innovative strategy is applied to obtain the DBP image to hide the transport time effectively, (3) two streams for data transportation and calculation are synchronized by the cudaEvent in the inverse of finite Hilbert transform on GPU. Our main contribution is a smart reconstruction of the S-BPF algorithm with GPU's continuous calculation and no data transportation time cost. a 5123 volume is reconstructed in less than 0.7 second on a single Tesla-based K20 GPU from 182 views projection with 5122 pixel per projection. The time cost of our implementation is about a half of that without the overlap behavior.
Efficient Geometric Probabilities of Multi-transiting Systems, Circumbinary Planets, and Exoplanet Mutual Events

NASA Astrophysics Data System (ADS)

Brakensiek, Joshua; Ragozzine, D.

2012-10-01

The transit method for discovering extra-solar planets relies on detecting regular diminutions of light from stars due to the shadows of planets passing in between the star and the observer. NASA's Kepler Mission has successfully discovered thousands of exoplanet candidates using this technique, including hundreds of stars with multiple transiting planets. In order to estimate the frequency of these valuable systems, our research concerns the efficient calculation of geometric probabilities for detecting multiple transiting extrasolar planets around the same parent star. In order to improve on previous studies that used numerical methods (e.g., Ragozzine & Holman 2010, Tremaine & Dong 2011), we have constructed an efficient, analytical algorithm which, given a collection of conjectured exoplanets orbiting a star, computes the probability that any particular group of exoplanets are transiting. The algorithm applies theorems of elementary differential geometry to compute the areas bounded by circular curves on the surface of a sphere (see Ragozzine & Holman 2010). The implemented algorithm is more accurate and orders of magnitude faster than previous algorithms, based on comparison with Monte Carlo simulations. Expanding this work, we have also developed semi-analytical methods for determining the frequency of exoplanet mutual events, i.e., the geometric probability two planets will transit each other (Planet-Planet Occultation) and the probability that this transit occurs simultaneously as they transit their star (Overlapping Double Transits; see Ragozzine & Holman 2010). The latter algorithm can also be applied to calculating the probability of observing transiting circumbinary planets (Doyle et al. 2011, Welsh et al. 2012). All of these algorithms have been coded in C and will be made publicly available. We will present and advertise these codes and illustrate their value for studying exoplanetary systems.
Efficient algorithms for fast integration on large data sets from multiple sources.

PubMed

Mi, Tian; Rajasekaran, Sanguthevar; Aseltine, Robert

2012-06-28

Recent large scale deployments of health information technology have created opportunities for the integration of patient medical records with disparate public health, human service, and educational databases to provide comprehensive information related to health and development. Data integration techniques, which identify records belonging to the same individual that reside in multiple data sets, are essential to these efforts. Several algorithms have been proposed in the literatures that are adept in integrating records from two different datasets. Our algorithms are aimed at integrating multiple (in particular more than two) datasets efficiently. Hierarchical clustering based solutions are used to integrate multiple (in particular more than two) datasets. Edit distance is used as the basic distance calculation, while distance calculation of common input errors is also studied. Several techniques have been applied to improve the algorithms in terms of both time and space: 1) Partial Construction of the Dendrogram (PCD) that ignores the level above the threshold; 2) Ignoring the Dendrogram Structure (IDS); 3) Faster Computation of the Edit Distance (FCED) that predicts the distance with the threshold by upper bounds on edit distance; and 4) A pre-processing blocking phase that limits dynamic computation within each block. We have experimentally validated our algorithms on large simulated as well as real data. Accuracy and completeness are defined stringently to show the performance of our algorithms. In addition, we employ a four-category analysis. Comparison with FEBRL shows the robustness of our approach. In the experiments we conducted, the accuracy we observed exceeded 90% for the simulated data in most cases. 97.7% and 98.1% accuracy were achieved for the constant and proportional threshold, respectively, in a real dataset of 1,083,878 records.
Porting ONETEP to graphical processing unit-based coprocessors. 1. FFT box operations.

PubMed

Wilkinson, Karl; Skylaris, Chris-Kriton

2013-10-30

We present the first graphical processing unit (GPU) coprocessor-enabled version of the Order-N Electronic Total Energy Package (ONETEP) code for linear-scaling first principles quantum mechanical calculations on materials. This work focuses on porting to the GPU the parts of the code that involve atom-localized fast Fourier transform (FFT) operations. These are among the most computationally intensive parts of the code and are used in core algorithms such as the calculation of the charge density, the local potential integrals, the kinetic energy integrals, and the nonorthogonal generalized Wannier function gradient. We have found that direct porting of the isolated FFT operations did not provide any benefit. Instead, it was necessary to tailor the port to each of the aforementioned algorithms to optimize data transfer to and from the GPU. A detailed discussion of the methods used and tests of the resulting performance are presented, which show that individual steps in the relevant algorithms are accelerated by a significant amount. However, the transfer of data between the GPU and host machine is a significant bottleneck in the reported version of the code. In addition, an initial investigation into a dynamic precision scheme for the ONETEP energy calculation has been performed to take advantage of the enhanced single precision capabilities of GPUs. The methods used here result in no disruption to the existing code base. Furthermore, as the developments reported here concern the core algorithms, they will benefit the full range of ONETEP functionality. Our use of a directive-based programming model ensures portability to other forms of coprocessors and will allow this work to form the basis of future developments to the code designed to support emerging high-performance computing platforms. Copyright © 2013 Wiley Periodicals, Inc.
Parallel algorithms for quantum chemistry. I. Integral transformations on a hypercube multiprocessor

DOE Office of Scientific and Technical Information (OSTI.GOV)

Whiteside, R.A.; Binkley, J.S.; Colvin, M.E.

1987-02-15

For many years it has been recognized that fundamental physical constraints such as the speed of light will limit the ultimate speed of single processor computers to less than about three billion floating point operations per second (3 GFLOPS). This limitation is becoming increasingly restrictive as commercially available machines are now within an order of magnitude of this asymptotic limit. A natural way to avoid this limit is to harness together many processors to work on a single computational problem. In principle, these parallel processing computers have speeds limited only by the number of processors one chooses to acquire. Themore » usefulness of potentially unlimited processing speed to a computationally intensive field such as quantum chemistry is obvious. If these methods are to be applied to significantly larger chemical systems, parallel schemes will have to be employed. For this reason we have developed distributed-memory algorithms for a number of standard quantum chemical methods. We are currently implementing these on a 32 processor Intel hypercube. In this paper we present our algorithm and benchmark results for one of the bottleneck steps in quantum chemical calculations: the four index integral transformation.« less
3-D Magnetotelluric Forward Modeling And Inversion Incorporating Topography By Using Vector Finite-Element Method Combined With Divergence Corrections Based On The Magnetic Field (VFEH++)

NASA Astrophysics Data System (ADS)

Shi, X.; Utada, H.; Jiaying, W.

2009-12-01

The vector finite-element method combined with divergence corrections based on the magnetic field H, referred to as VFEH++ method, is developed to simulate the magnetotelluric (MT) responses of 3-D conductivity models. The advantages of the new VFEH++ method are the use of edge-elements to eliminate the vector parasites and the divergence corrections to explicitly guarantee the divergence-free conditions in the whole modeling domain. 3-D MT topographic responses are modeling using the new VFEH++ method, and are compared with those calculated by other numerical methods. The results show that MT responses can be modeled highly accurate using the VFEH+ +method. The VFEH++ algorithm is also employed for the 3-D MT data inversion incorporating topography. The 3-D MT inverse problem is formulated as a minimization problem of the regularized misfit function. In order to avoid the huge memory requirement and very long time for computing the Jacobian sensitivity matrix for Gauss-Newton method, we employ the conjugate gradient (CG) approach to solve the inversion equation. In each iteration of CG algorithm, the cost computation is the product of the Jacobian sensitivity matrix with a model vector x or its transpose with a data vector y, which can be transformed into two pseudo-forwarding modeling. This avoids the full explicitly Jacobian matrix calculation and storage which leads to considerable savings in the memory required by the inversion program in PC computer. The performance of CG algorithm will be illustrated by several typical 3-D models with horizontal earth surface and topographic surfaces. The results show that the VFEH++ and CG algorithms can be effectively employed to 3-D MT field data inversion.
Implementation of High Time Delay Accuracy of Ultrasonic Phased Array Based on Interpolation CIC Filter

PubMed Central

Liu, Peilu; Li, Xinghua; Li, Haopeng; Su, Zhikun; Zhang, Hongxu

2017-01-01

In order to improve the accuracy of ultrasonic phased array focusing time delay, analyzing the original interpolation Cascade-Integrator-Comb (CIC) filter, an 8× interpolation CIC filter parallel algorithm was proposed, so that interpolation and multichannel decomposition can simultaneously process. Moreover, we summarized the general formula of arbitrary multiple interpolation CIC filter parallel algorithm and established an ultrasonic phased array focusing time delay system based on 8× interpolation CIC filter parallel algorithm. Improving the algorithmic structure, 12.5% of addition and 29.2% of multiplication was reduced, meanwhile the speed of computation is still very fast. Considering the existing problems of the CIC filter, we compensated the CIC filter; the compensated CIC filter’s pass band is flatter, the transition band becomes steep, and the stop band attenuation increases. Finally, we verified the feasibility of this algorithm on Field Programming Gate Array (FPGA). In the case of system clock is 125 MHz, after 8× interpolation filtering and decomposition, time delay accuracy of the defect echo becomes 1 ns. Simulation and experimental results both show that the algorithm we proposed has strong feasibility. Because of the fast calculation, small computational amount and high resolution, this algorithm is especially suitable for applications with high time delay accuracy and fast detection. PMID:29023385
Implementation of High Time Delay Accuracy of Ultrasonic Phased Array Based on Interpolation CIC Filter.

PubMed

Liu, Peilu; Li, Xinghua; Li, Haopeng; Su, Zhikun; Zhang, Hongxu

2017-10-12

In order to improve the accuracy of ultrasonic phased array focusing time delay, analyzing the original interpolation Cascade-Integrator-Comb (CIC) filter, an 8× interpolation CIC filter parallel algorithm was proposed, so that interpolation and multichannel decomposition can simultaneously process. Moreover, we summarized the general formula of arbitrary multiple interpolation CIC filter parallel algorithm and established an ultrasonic phased array focusing time delay system based on 8× interpolation CIC filter parallel algorithm. Improving the algorithmic structure, 12.5% of addition and 29.2% of multiplication was reduced, meanwhile the speed of computation is still very fast. Considering the existing problems of the CIC filter, we compensated the CIC filter; the compensated CIC filter's pass band is flatter, the transition band becomes steep, and the stop band attenuation increases. Finally, we verified the feasibility of this algorithm on Field Programming Gate Array (FPGA). In the case of system clock is 125 MHz, after 8× interpolation filtering and decomposition, time delay accuracy of the defect echo becomes 1 ns. Simulation and experimental results both show that the algorithm we proposed has strong feasibility. Because of the fast calculation, small computational amount and high resolution, this algorithm is especially suitable for applications with high time delay accuracy and fast detection.
Microscopic image analysis for reticulocyte based on watershed algorithm

NASA Astrophysics Data System (ADS)

Wang, J. Q.; Liu, G. F.; Liu, J. G.; Wang, G.

2007-12-01

We present a watershed-based algorithm in the analysis of light microscopic image for reticulocyte (RET), which will be used in an automated recognition system for RET in peripheral blood. The original images, obtained by micrography, are segmented by modified watershed algorithm and are recognized in term of gray entropy and area of connective area. In the process of watershed algorithm, judgment conditions are controlled according to character of the image, besides, the segmentation is performed by morphological subtraction. The algorithm was simulated with MATLAB software. It is similar for automated and manual scoring and there is good correlation(r=0.956) between the methods, which is resulted from 50 pieces of RET images. The result indicates that the algorithm for peripheral blood RETs is comparable to conventional manual scoring, and it is superior in objectivity. This algorithm avoids time-consuming calculation such as ultra-erosion and region-growth, which will speed up the computation consequentially.
Absorbing Boundary Conditions For Optical Pulses In Dispersive, Nonlinear Materials

NASA Technical Reports Server (NTRS)

Goorjian, Peter M.; Kwak, Dochan (Technical Monitor)

1995-01-01

This paper will present results in computational nonlinear optics. An algorithm will be described that provides absorbing boundary conditions for optical pulses in dispersive, nonlinear materials. A new numerical absorber at the boundaries has been developed that is responsive to the spectral content of the pulse. Also, results will be shown of calculations of 2-D electromagnetic nonlinear waves computed by directly integrating in time the nonlinear vector Maxwell's equations. The results will include simulations of "light bullet" like pulses. Here diffraction and dispersion will be counteracted by nonlinear effects. Comparisons will be shown of calculations that use the standard boundary conditions and the new ones.
Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo

DOE PAGES

McDaniel, Tyler; D’Azevedo, Ed F.; Li, Ying Wai; ...

2017-11-07

Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is therefore formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with applicationmore » of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. Here this procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi- core CPUs and GPUs.« less
Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo

DOE Office of Scientific and Technical Information (OSTI.GOV)

McDaniel, Tyler; D’Azevedo, Ed F.; Li, Ying Wai

Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is therefore formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with applicationmore » of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. Here this procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi- core CPUs and GPUs.« less
Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo.

PubMed

McDaniel, T; D'Azevedo, E F; Li, Y W; Wong, K; Kent, P R C

2017-11-07

Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is, therefore, formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with an application of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. This procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo, where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi-core central processing units and graphical processing units.
Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo

NASA Astrophysics Data System (ADS)

McDaniel, T.; D'Azevedo, E. F.; Li, Y. W.; Wong, K.; Kent, P. R. C.

2017-11-01

Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is, therefore, formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with an application of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. This procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo, where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi-core central processing units and graphical processing units.
High-efficiency wavefunction updates for large scale Quantum Monte Carlo

NASA Astrophysics Data System (ADS)

Kent, Paul; McDaniel, Tyler; Li, Ying Wai; D'Azevedo, Ed

Within ab intio Quantum Monte Carlo (QMC) simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunctions. The evaluation of each Monte Carlo move requires finding the determinant of a dense matrix, which is traditionally iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. For calculations with thousands of electrons, this operation dominates the execution profile. We propose a novel rank- k delayed update scheme. This strategy enables probability evaluation for multiple successive Monte Carlo moves, with application of accepted moves to the matrices delayed until after a predetermined number of moves, k. Accepted events grouped in this manner are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency. This procedure does not change the underlying Monte Carlo sampling or the sampling efficiency. For large systems and algorithms such as diffusion Monte Carlo where the acceptance ratio is high, order of magnitude speedups can be obtained on both multi-core CPU and on GPUs, making this algorithm highly advantageous for current petascale and future exascale computations.

Study of Variation in Dose Calculation Accuracy Between kV Cone-Beam Computed Tomography and kV fan-Beam Computed Tomography

PubMed Central

Kaliyaperumal, Venkatesan; Raphael, C. Jomon; Varghese, K. Mathew; Gopu, Paul; Sivakumar, S.; Boban, Minu; Raj, N. Arunai Nambi; Senthilnathan, K.; Babu, P. Ramesh

2017-01-01

Cone-beam computed tomography (CBCT) images are presently used for geometric verification for daily patient positioning. In this work, we have compared the images of CBCT with the images of conventional fan beam CT (FBCT) in terms of image quality and Hounsfield units (HUs). We also compared the dose calculated using CBCT with that of FBCT. Homogenous RW3 plates and Catphan phantom were scanned by FBCT and CBCT. In RW3 and Catphan phantom, percentage depth dose (PDD), profiles, isodose distributions (for intensity modulated radiotherapy plans), and calculated dose volume histograms were compared. The HU difference was within ± 20 HU (central region) and ± 30 HU (peripheral region) for homogeneous RW3 plates. In the Catphan phantom, the difference in HU was ± 20 HU in the central area and peripheral areas. The HU differences were within ± 30 HU for all HU ranges starting from −1000 to 990 in phantom and patient images. In treatment plans done with simple symmetric and asymmetric fields, dose difference (DD) between CBCT plan and FBCT plan was within 1.2% for both phantoms. In intensity modulated radiotherapy (IMRT) treatment plans, for different target volumes, the difference was <2%. This feasibility study investigated HU variation and dose calculation accuracy between FBCT and CBCT based planning and has validated inverse planning algorithms with CBCT. In our study, we observed a larger deviation of HU values in the peripheral region compared to the central region. This is due to the ring artifact and scatter contribution which may prevent the use of CBCT as the primary imaging modality for radiotherapy treatment planning. The reconstruction algorithm needs to be modified further for improving the image quality and accuracy in HU values. However, our study with TG-119 and intensity modulated radiotherapy test targets shows that CBCT can be used for adaptive replanning as the recalculation of dose with the anisotropic analytical algorithm is in full accord with conventional planning CT except in the build-up regions. Patient images with CBCT have to be carefully analyzed for any artifacts before using them for such dose calculations. PMID:28974864
Fast and accurate computation of system matrix for area integral model-based algebraic reconstruction technique

NASA Astrophysics Data System (ADS)

Zhang, Shunli; Zhang, Dinghua; Gong, Hao; Ghasemalizadeh, Omid; Wang, Ge; Cao, Guohua

2014-11-01

Iterative algorithms, such as the algebraic reconstruction technique (ART), are popular for image reconstruction. For iterative reconstruction, the area integral model (AIM) is more accurate for better reconstruction quality than the line integral model (LIM). However, the computation of the system matrix for AIM is more complex and time-consuming than that for LIM. Here, we propose a fast and accurate method to compute the system matrix for AIM. First, we calculate the intersection of each boundary line of a narrow fan-beam with pixels in a recursive and efficient manner. Then, by grouping the beam-pixel intersection area into six types according to the slopes of the two boundary lines, we analytically compute the intersection area of the narrow fan-beam with the pixels in a simple algebraic fashion. Overall, experimental results show that our method is about three times faster than the Siddon algorithm and about two times faster than the distance-driven model (DDM) in computation of the system matrix. The reconstruction speed of our AIM-based ART is also faster than the LIM-based ART that uses the Siddon algorithm and DDM-based ART, for one iteration. The fast reconstruction speed of our method was accomplished without compromising the image quality.
An efficient pseudomedian filter for tiling microrrays.

PubMed

Royce, Thomas E; Carriero, Nicholas J; Gerstein, Mark B

2007-06-07

Tiling microarrays are becoming an essential technology in the functional genomics toolbox. They have been applied to the tasks of novel transcript identification, elucidation of transcription factor binding sites, detection of methylated DNA and several other applications in several model organisms. These experiments are being conducted at increasingly finer resolutions as the microarray technology enjoys increasingly greater feature densities. The increased densities naturally lead to increased data analysis requirements. Specifically, the most widely employed algorithm for tiling array analysis involves smoothing observed signals by computing pseudomedians within sliding windows, a O(n2logn) calculation in each window. This poor time complexity is an issue for tiling array analysis and could prove to be a real bottleneck as tiling microarray experiments become grander in scope and finer in resolution. We therefore implemented Monahan's HLQEST algorithm that reduces the runtime complexity for computing the pseudomedian of n numbers to O(nlogn) from O(n2logn). For a representative tiling microarray dataset, this modification reduced the smoothing procedure's runtime by nearly 90%. We then leveraged the fact that elements within sliding windows remain largely unchanged in overlapping windows (as one slides across genomic space) to further reduce computation by an additional 43%. This was achieved by the application of skip lists to maintaining a sorted list of values from window to window. This sorted list could be maintained with simple O(log n) inserts and deletes. We illustrate the favorable scaling properties of our algorithms with both time complexity analysis and benchmarking on synthetic datasets. Tiling microarray analyses that rely upon a sliding window pseudomedian calculation can require many hours of computation. We have eased this requirement significantly by implementing efficient algorithms that scale well with genomic feature density. This result not only speeds the current standard analyses, but also makes possible ones where many iterations of the filter may be required, such as might be required in a bootstrap or parameter estimation setting. Source code and executables are available at http://tiling.gersteinlab.org/pseudomedian/.
An efficient pseudomedian filter for tiling microrrays

PubMed Central

Royce, Thomas E; Carriero, Nicholas J; Gerstein, Mark B

2007-01-01

Background Tiling microarrays are becoming an essential technology in the functional genomics toolbox. They have been applied to the tasks of novel transcript identification, elucidation of transcription factor binding sites, detection of methylated DNA and several other applications in several model organisms. These experiments are being conducted at increasingly finer resolutions as the microarray technology enjoys increasingly greater feature densities. The increased densities naturally lead to increased data analysis requirements. Specifically, the most widely employed algorithm for tiling array analysis involves smoothing observed signals by computing pseudomedians within sliding windows, a O(n2logn) calculation in each window. This poor time complexity is an issue for tiling array analysis and could prove to be a real bottleneck as tiling microarray experiments become grander in scope and finer in resolution. Results We therefore implemented Monahan's HLQEST algorithm that reduces the runtime complexity for computing the pseudomedian of n numbers to O(nlogn) from O(n2logn). For a representative tiling microarray dataset, this modification reduced the smoothing procedure's runtime by nearly 90%. We then leveraged the fact that elements within sliding windows remain largely unchanged in overlapping windows (as one slides across genomic space) to further reduce computation by an additional 43%. This was achieved by the application of skip lists to maintaining a sorted list of values from window to window. This sorted list could be maintained with simple O(log n) inserts and deletes. We illustrate the favorable scaling properties of our algorithms with both time complexity analysis and benchmarking on synthetic datasets. Conclusion Tiling microarray analyses that rely upon a sliding window pseudomedian calculation can require many hours of computation. We have eased this requirement significantly by implementing efficient algorithms that scale well with genomic feature density. This result not only speeds the current standard analyses, but also makes possible ones where many iterations of the filter may be required, such as might be required in a bootstrap or parameter estimation setting. Source code and executables are available at . PMID:17555595
Multi scales based sparse matrix spectral clustering image segmentation

NASA Astrophysics Data System (ADS)

Liu, Zhongmin; Chen, Zhicai; Li, Zhanming; Hu, Wenjin

2018-04-01

In image segmentation, spectral clustering algorithms have to adopt the appropriate scaling parameter to calculate the similarity matrix between the pixels, which may have a great impact on the clustering result. Moreover, when the number of data instance is large, computational complexity and memory use of the algorithm will greatly increase. To solve these two problems, we proposed a new spectral clustering image segmentation algorithm based on multi scales and sparse matrix. We devised a new feature extraction method at first, then extracted the features of image on different scales, at last, using the feature information to construct sparse similarity matrix which can improve the operation efficiency. Compared with traditional spectral clustering algorithm, image segmentation experimental results show our algorithm have better degree of accuracy and robustness.
Genetic Algorithm for Initial Orbit Determination with Too Short Arc (Continued)

NASA Astrophysics Data System (ADS)

Li, Xin-ran; Wang, Xin

2017-04-01

When the genetic algorithm is used to solve the problem of too short-arc (TSA) orbit determination, due to the difference of computing process between the genetic algorithm and the classical method, the original method for outlier deletion is no longer applicable. In the genetic algorithm, the robust estimation is realized by introducing different loss functions for the fitness function, then the outlier problem of the TSA orbit determination is solved. Compared with the classical method, the genetic algorithm is greatly simplified by introducing in different loss functions. Through the comparison on the calculations of multiple loss functions, it is found that the least median square (LMS) estimation and least trimmed square (LTS) estimation can greatly improve the robustness of the TSA orbit determination, and have a high breakdown point.
Algorithm for astronomical, point source, signal to noise ratio calculations

NASA Technical Reports Server (NTRS)

Jayroe, R. R.; Schroeder, D. J.

1984-01-01

An algorithm was developed to simulate the expected signal to noise ratios as a function of observation time in the charge coupled device detector plane of an optical telescope located outside the Earth's atmosphere for a signal star, and an optional secondary star, embedded in a uniform cosmic background. By choosing the appropriate input values, the expected point source signal to noise ratio can be computed for the Hubble Space Telescope using the Wide Field/Planetary Camera science instrument.
Minimizing inner product data dependencies in conjugate gradient iteration

NASA Technical Reports Server (NTRS)

Vanrosendale, J.

1983-01-01

The amount of concurrency available in conjugate gradient iteration is limited by the summations required in the inner product computations. The inner product of two vectors of length N requires time c log(N), if N or more processors are available. This paper describes an algebraic restructuring of the conjugate gradient algorithm which minimizes data dependencies due to inner product calculations. After an initial start up, the new algorithm can perform a conjugate gradient iteration in time c*log(log(N)).
Computer-Aided Structural Engineering (CASE) Project: Investigation and Design of U-Frame Structures Using Program CUFRBC. Volume C. User’s Guide for Channels

DTIC Science & Technology

1990-05-01

1988) or ACI 318-83 (1983). Actual calculations for section strength are made using subroutines taken from the CASE program CSTR (Hamby and Price...validity of the design of their par- ticular structure. Thus, it is essential that the user of the program under- stand the design algorithm included...modes. However, several restrictions were placed on the design mode to avoid unnecessary com- plications of the design algorithm for cases rarely
Unstructured mesh algorithms for aerodynamic calculations

NASA Technical Reports Server (NTRS)

Mavriplis, D. J.

1992-01-01

The use of unstructured mesh techniques for solving complex aerodynamic flows is discussed. The principle advantages of unstructured mesh strategies, as they relate to complex geometries, adaptive meshing capabilities, and parallel processing are emphasized. The various aspects required for the efficient and accurate solution of aerodynamic flows are addressed. These include mesh generation, mesh adaptivity, solution algorithms, convergence acceleration, and turbulence modeling. Computations of viscous turbulent two-dimensional flows and inviscid three-dimensional flows about complex configurations are demonstrated. Remaining obstacles and directions for future research are also outlined.
Implementation of a Pseudo-Bending Seismic Travel-Time Calculator in a Distributed Parallel Computing Environment

DTIC Science & Technology

2008-09-01

algorithms that have been proposed to accomplish it fall into three broad categories. Eikonal solvers (e.g., Vidale, 1988, 1990; Podvin and Lecomte, 1991...difference eikonal solvers, the FMM algorithm works by following a wavefront as it moves across a volume of grid points, updating the travel times in...the grid according to the eikonal differential equation, using a second-order finite-difference scheme. We chose to use FMM for our comparison because
Stable computations with flat radial basis functions using vector-valued rational approximations

NASA Astrophysics Data System (ADS)

Wright, Grady B.; Fornberg, Bengt

2017-02-01

One commonly finds in applications of smooth radial basis functions (RBFs) that scaling the kernels so they are 'flat' leads to smaller discretization errors. However, the direct numerical approach for computing with flat RBFs (RBF-Direct) is severely ill-conditioned. We present an algorithm for bypassing this ill-conditioning that is based on a new method for rational approximation (RA) of vector-valued analytic functions with the property that all components of the vector share the same singularities. This new algorithm (RBF-RA) is more accurate, robust, and easier to implement than the Contour-Padé method, which is similarly based on vector-valued rational approximation. In contrast to the stable RBF-QR and RBF-GA algorithms, which are based on finding a better conditioned base in the same RBF-space, the new algorithm can be used with any type of smooth radial kernel, and it is also applicable to a wider range of tasks (including calculating Hermite type implicit RBF-FD stencils). We present a series of numerical experiments demonstrating the effectiveness of this new method for computing RBF interpolants in the flat regime. We also demonstrate the flexibility of the method by using it to compute implicit RBF-FD formulas in the flat regime and then using these for solving Poisson's equation in a 3-D spherical shell.
A Topological Model for Parallel Algorithm Design

DTIC Science & Technology

1991-09-01

by Charles Babbage in 1842-(250): When a long series of identical computations is to be performed, ... the machine can ... give several results at...Papers. Addison-Wesley, Reading, MA, 1964. 250. P. Morrison and E. Morrison. Charles Babbagc and His Calculating Engincs. Dover, New York, 1961. 251
FIVE YEARS OF SYNTHESIS OF SOLAR SPECTRAL IRRADIANCE FROM SDID/SISA AND SDO /AIA IMAGES

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fontenla, J. M.; Codrescu, M.; Fedrizzi, M.

In this paper we describe the synthetic solar spectral irradiance (SSI) calculated from 2010 to 2015 using data from the Atmospheric Imaging Assembly (AIA) instrument, on board the Solar Dynamics Observatory spacecraft. We used the algorithms for solar disk image decomposition (SDID) and the spectral irradiance synthesis algorithm (SISA) that we had developed over several years. The SDID algorithm decomposes the images of the solar disk into areas occupied by nine types of chromospheric and 5 types of coronal physical structures. With this decomposition and a set of pre-computed angle-dependent spectra for each of the features, the SISA algorithm ismore » used to calculate the SSI. We discuss the application of the basic SDID/SISA algorithm to a subset of the AIA images and the observed variation occurring in the 2010–2015 period of the relative areas of the solar disk covered by the various solar surface features. Our results consist of the SSI and total solar irradiance variations over the 2010–2015 period. The SSI results include soft X-ray, ultraviolet, visible, infrared, and far-infrared observations and can be used for studies of the solar radiative forcing of the Earth’s atmosphere. These SSI estimates were used to drive a thermosphere–ionosphere physical simulation model. Predictions of neutral mass density at low Earth orbit altitudes in the thermosphere and peak plasma densities at mid-latitudes are in reasonable agreement with the observations. The correlation between the simulation results and the observations was consistently better when fluxes computed by SDID/SISA procedures were used.« less
Development and evaluation of a LOR-based image reconstruction with 3D system response modeling for a PET insert with dual-layer offset crystal design.

PubMed

Zhang, Xuezhu; Stortz, Greg; Sossi, Vesna; Thompson, Christopher J; Retière, Fabrice; Kozlowski, Piotr; Thiessen, Jonathan D; Goertzen, Andrew L

2013-12-07

In this study we present a method of 3D system response calculation for analytical computer simulation and statistical image reconstruction for a magnetic resonance imaging (MRI) compatible positron emission tomography (PET) insert system that uses a dual-layer offset (DLO) crystal design. The general analytical system response functions (SRFs) for detector geometric and inter-crystal penetration of coincident crystal pairs are derived first. We implemented a 3D ray-tracing algorithm with 4π sampling for calculating the SRFs of coincident pairs of individual DLO crystals. The determination of which detector blocks are intersected by a gamma ray is made by calculating the intersection of the ray with virtual cylinders with radii just inside the inner surface and just outside the outer-edge of each crystal layer of the detector ring. For efficient ray-tracing computation, the detector block and ray to be traced are then rotated so that the crystals are aligned along the X-axis, facilitating calculation of ray/crystal boundary intersection points. This algorithm can be applied to any system geometry using either single-layer (SL) or multi-layer array design with or without offset crystals. For effective data organization, a direct lines of response (LOR)-based indexed histogram-mode method is also presented in this work. SRF calculation is performed on-the-fly in both forward and back projection procedures during each iteration of image reconstruction, with acceleration through use of eight-fold geometric symmetry and multi-threaded parallel computation. To validate the proposed methods, we performed a series of analytical and Monte Carlo computer simulations for different system geometry and detector designs. The full-width-at-half-maximum of the numerical SRFs in both radial and tangential directions are calculated and compared for various system designs. By inspecting the sinograms obtained for different detector geometries, it can be seen that the DLO crystal design can provide better sampling density than SL or dual-layer no-offset system designs with the same total crystal length. The results of the image reconstruction with SRFs modeling for phantom studies exhibit promising image recovery capability for crystal widths of 1.27-1.43 mm and top/bottom layer lengths of 4/6 mm. In conclusion, we have developed efficient algorithms for system response modeling of our proposed PET insert with DLO crystal arrays. This provides an effective method for both 3D computer simulation and quantitative image reconstruction, and will aid in the optimization of our PET insert system with various crystal designs.
Computing Bounds on Resource Levels for Flexible Plans

NASA Technical Reports Server (NTRS)

Muscvettola, Nicola; Rijsman, David

2009-01-01

A new algorithm efficiently computes the tightest exact bound on the levels of resources induced by a flexible activity plan (see figure). Tightness of bounds is extremely important for computations involved in planning because tight bounds can save potentially exponential amounts of search (through early backtracking and detection of solutions), relative to looser bounds. The bound computed by the new algorithm, denoted the resource-level envelope, constitutes the measure of maximum and minimum consumption of resources at any time for all fixed-time schedules in the flexible plan. At each time, the envelope guarantees that there are two fixed-time instantiations one that produces the minimum level and one that produces the maximum level. Therefore, the resource-level envelope is the tightest possible resource-level bound for a flexible plan because any tighter bound would exclude the contribution of at least one fixed-time schedule. If the resource- level envelope can be computed efficiently, one could substitute looser bounds that are currently used in the inner cores of constraint-posting scheduling algorithms, with the potential for great improvements in performance. What is needed to reduce the cost of computation is an algorithm, the measure of complexity of which is no greater than a low-degree polynomial in N (where N is the number of activities). The new algorithm satisfies this need. In this algorithm, the computation of resource-level envelopes is based on a novel combination of (1) the theory of shortest paths in the temporal-constraint network for the flexible plan and (2) the theory of maximum flows for a flow network derived from the temporal and resource constraints. The measure of asymptotic complexity of the algorithm is O(N O(maxflow(N)), where O(x) denotes an amount of computing time or a number of arithmetic operations proportional to a number of the order of x and O(maxflow(N)) is the measure of complexity (and thus of cost) of a maximumflow algorithm applied to an auxiliary flow network of 2N nodes. The algorithm is believed to be efficient in practice; experimental analysis shows the practical cost of maxflow to be as low as O(N1.5). The algorithm could be enhanced following at least two approaches. In the first approach, incremental subalgorithms for the computation of the envelope could be developed. By use of temporal scanning of the events in the temporal network, it may be possible to significantly reduce the size of the networks on which it is necessary to run the maximum-flow subalgorithm, thereby significantly reducing the time required for envelope calculation. In the second approach, the practical effectiveness of resource envelopes in the inner loops of search algorithms could be tested for multi-capacity resource scheduling. This testing would include inner-loop backtracking and termination tests and variable and value-ordering heuristics that exploit the properties of resource envelopes more directly.
Design Criteria for Low Profile Flange Calculations

NASA Technical Reports Server (NTRS)

Leimbach, K. R.

1973-01-01

An analytical method and a design procedure to develop flanged separable pipe connectors are discussed. A previously established algorithm is the basis for calculating low profile flanges. The characteristics and advantages of the low profile flange are analyzed. The use of aluminum, titanium, and plastics for flange materials is described. Mathematical models are developed to show the mechanical properties of various flange configurations. A computer program for determining the structural stability of the flanges is described.
A geometrical interpretation of the 2n-th central difference

NASA Technical Reports Server (NTRS)

Tapia, R. A.

1972-01-01

Many algorithms used for data smoothing, data classification and error detection require the calculation of the distance from a point to the polynomial interpolating its 2n neighbors (n on each side). This computation, if performed naively, would require the solution of a system of equations and could create numerical problems. This note shows that if the data is equally spaced, then this calculation can be performed using a simple recursion formula.
Physics Computing '92: Proceedings of the 4th International Conference

NASA Astrophysics Data System (ADS)

de Groot, Robert A.; Nadrchal, Jaroslav

1993-04-01

The Table of Contents for the book is as follows: * Preface * INVITED PAPERS * Ab Initio Theoretical Approaches to the Structural, Electronic and Vibrational Properties of Small Clusters and Fullerenes: The State of the Art * Neural Multigrid Methods for Gauge Theories and Other Disordered Systems * Multicanonical Monte Carlo Simulations * On the Use of the Symbolic Language Maple in Physics and Chemistry: Several Examples * Nonequilibrium Phase Transitions in Catalysis and Population Models * Computer Algebra, Symmetry Analysis and Integrability of Nonlinear Evolution Equations * The Path-Integral Quantum Simulation of Hydrogen in Metals * Digital Optical Computing: A New Approach of Systolic Arrays Based on Coherence Modulation of Light and Integrated Optics Technology * Molecular Dynamics Simulations of Granular Materials * Numerical Implementation of a K.A.M. Algorithm * Quasi-Monte Carlo, Quasi-Random Numbers and Quasi-Error Estimates * What Can We Learn from QMC Simulations * Physics of Fluctuating Membranes * Plato, Apollonius, and Klein: Playing with Spheres * Steady States in Nonequilibrium Lattice Systems * CONVODE: A REDUCE Package for Differential Equations * Chaos in Coupled Rotators * Symplectic Numerical Methods for Hamiltonian Problems * Computer Simulations of Surfactant Self Assembly * High-dimensional and Very Large Cellular Automata for Immunological Shape Space * A Review of the Lattice Boltzmann Method * Electronic Structure of Solids in the Self-interaction Corrected Local-spin-density Approximation * Dedicated Computers for Lattice Gauge Theory Simulations * Physics Education: A Survey of Problems and Possible Solutions * Parallel Computing and Electronic-Structure Theory * High Precision Simulation Techniques for Lattice Field Theory * CONTRIBUTED PAPERS * Case Study of Microscale Hydrodynamics Using Molecular Dynamics and Lattice Gas Methods * Computer Modelling of the Structural and Electronic Properties of the Supported Metal Catalysis * Ordered Particle Simulations for Serial and MIMD Parallel Computers * "NOLP" -- Program Package for Laser Plasma Nonlinear Optics * Algorithms to Solve Nonlinear Least Square Problems * Distribution of Hydrogen Atoms in Pd-H Computed by Molecular Dynamics * A Ray Tracing of Optical System for Protein Crystallography Beamline at Storage Ring-SIBERIA-2 * Vibrational Properties of a Pseudobinary Linear Chain with Correlated Substitutional Disorder * Application of the Software Package Mathematica in Generalized Master Equation Method * Linelist: An Interactive Program for Analysing Beam-foil Spectra * GROMACS: A Parallel Computer for Molecular Dynamics Simulations * GROMACS Method of Virial Calculation Using a Single Sum * The Interactive Program for the Solution of the Laplace Equation with the Elimination of Singularities for Boundary Functions * Random-Number Generators: Testing Procedures and Comparison of RNG Algorithms * Micro-TOPIC: A Tokamak Plasma Impurities Code * Rotational Molecular Scattering Calculations * Orthonormal Polynomial Method for Calibrating of Cryogenic Temperature Sensors * Frame-based System Representing Basis of Physics * The Role of Massively Data-parallel Computers in Large Scale Molecular Dynamics Simulations * Short-range Molecular Dynamics on a Network of Processors and Workstations * An Algorithm for Higher-order Perturbation Theory in Radiative Transfer Computations * Hydrostochastics: The Master Equation Formulation of Fluid Dynamics * HPP Lattice Gas on Transputers and Networked Workstations * Study on the Hysteresis Cycle Simulation Using Modeling with Different Functions on Intervals * Refined Pruning Techniques for Feed-forward Neural Networks * Random Walk Simulation of the Motion of Transient Charges in Photoconductors * The Optical Hysteresis in Hydrogenated Amorphous Silicon * Diffusion Monte Carlo Analysis of Modern Interatomic Potentials for He * A Parallel Strategy for Molecular Dynamics Simulations of Polar Liquids on Transputer Arrays * Distribution of Ions Reflected on Rough Surfaces * The Study of Step Density Distribution During Molecular Beam Epitaxy Growth: Monte Carlo Computer Simulation * Towards a Formal Approach to the Construction of Large-scale Scientific Applications Software * Correlated Random Walk and Discrete Modelling of Propagation through Inhomogeneous Media * Teaching Plasma Physics Simulation * A Theoretical Determination of the Au-Ni Phase Diagram * Boson and Fermion Kinetics in One-dimensional Lattices * Computational Physics Course on the Technical University * Symbolic Computations in Simulation Code Development and Femtosecond-pulse Laser-plasma Interaction Studies * Computer Algebra and Integrated Computing Systems in Education of Physical Sciences * Coordinated System of Programs for Undergraduate Physics Instruction * Program Package MIRIAM and Atomic Physics of Extreme Systems * High Energy Physics Simulation on the T_Node * The Chapman-Kolmogorov Equation as Representation of Huygens' Principle and the Monolithic Self-consistent Numerical Modelling of Lasers * Authoring System for Simulation Developments * Molecular Dynamics Study of Ion Charge Effects in the Structure of Ionic Crystals * A Computational Physics Introductory Course * Computer Calculation of Substrate Temperature Field in MBE System * Multimagnetical Simulation of the Ising Model in Two and Three Dimensions * Failure of the CTRW Treatment of the Quasicoherent Excitation Transfer * Implementation of a Parallel Conjugate Gradient Method for Simulation of Elastic Light Scattering * Algorithms for Study of Thin Film Growth * Algorithms and Programs for Physics Teaching in Romanian Technical Universities * Multicanonical Simulation of 1st order Transitions: Interface Tension of the 2D 7-State Potts Model * Two Numerical Methods for the Calculation of Periodic Orbits in Hamiltonian Systems * Chaotic Behavior in a Probabilistic Cellular Automata? * Wave Optics Computing by a Networked-based Vector Wave Automaton * Tensor Manipulation Package in REDUCE * Propagation of Electromagnetic Pulses in Stratified Media * The Simple Molecular Dynamics Model for the Study of Thermalization of the Hot Nucleon Gas * Electron Spin Polarization in PdCo Alloys Calculated by KKR-CPA-LSD Method * Simulation Studies of Microscopic Droplet Spreading * A Vectorizable Algorithm for the Multicolor Successive Overrelaxation Method * Tetragonality of the CuAu I Lattice and Its Relation to Electronic Specific Heat and Spin Susceptibility * Computer Simulation of the Formation of Metallic Aggregates Produced by Chemical Reactions in Aqueous Solution * Scaling in Growth Models with Diffusion: A Monte Carlo Study * The Nucleus as the Mesoscopic System * Neural Network Computation as Dynamic System Simulation * First-principles Theory of Surface Segregation in Binary Alloys * Data Smooth Approximation Algorithm for Estimating the Temperature Dependence of the Ice Nucleation Rate * Genetic Algorithms in Optical Design * Application of 2D-FFT in the Study of Molecular Exchange Processes by NMR * Advanced Mobility Model for Electron Transport in P-Si Inversion Layers * Computer Simulation for Film Surfaces and its Fractal Dimension * Parallel Computation Techniques and the Structure of Catalyst Surfaces * Educational SW to Teach Digital Electronics and the Corresponding Text Book * Primitive Trinomials (Mod 2) Whose Degree is a Mersenne Exponent * Stochastic Modelisation and Parallel Computing * Remarks on the Hybrid Monte Carlo Algorithm for the ∫4 Model * An Experimental Computer Assisted Workbench for Physics Teaching * A Fully Implicit Code to Model Tokamak Plasma Edge Transport * EXPFIT: An Interactive Program for Automatic Beam-foil Decay Curve Analysis * Mapping Technique for Solving General, 1-D Hamiltonian Systems * Freeway Traffic, Cellular Automata, and Some (Self-Organizing) Criticality * Photonuclear Yield Analysis by Dynamic Programming * Incremental Representation of the Simply Connected Planar Curves * Self-convergence in Monte Carlo Methods * Adaptive Mesh Technique for Shock Wave Propagation * Simulation of Supersonic Coronal Streams and Their Interaction with the Solar Wind * The Nature of Chaos in Two Systems of Ordinary Nonlinear Differential Equations * Considerations of a Window-shopper * Interpretation of Data Obtained by RTP 4-Channel Pulsed Radar Reflectometer Using a Multi Layer Perceptron * Statistics of Lattice Bosons for Finite Systems * Fractal Based Image Compression with Affine Transformations * Algorithmic Studies on Simulation Codes for Heavy-ion Reactions * An Energy-Wise Computer Simulation of DNA-Ion-Water Interactions Explains the Abnormal Structure of Poly[d(A)]:Poly[d(T)] * Computer Simulation Study of Kosterlitz-Thouless-Like Transitions * Problem-oriented Software Package GUN-EBT for Computer Simulation of Beam Formation and Transport in Technological Electron-Optical Systems * Parallelization of a Boundary Value Solver and its Application in Nonlinear Dynamics * The Symbolic Classification of Real Four-dimensional Lie Algebras * Short, Singular Pulses Generation by a Dye Laser at Two Wavelengths Simultaneously * Quantum Monte Carlo Simulations of the Apex-Oxygen-Model * Approximation Procedures for the Axial Symmetric Static Einstein-Maxwell-Higgs Theory * Crystallization on a Sphere: Parallel Simulation on a Transputer Network * FAMULUS: A Software Product (also) for Physics Education * MathCAD vs. FAMULUS -- A Brief Comparison * First-principles Dynamics Used to Study Dissociative Chemisorption * A Computer Controlled System for Crystal Growth from Melt * A Time Resolved Spectroscopic Method for Short Pulsed Particle Emission * Green's Function Computation in Radiative Transfer Theory * Random Search Optimization Technique for One-criteria and Multi-criteria Problems * Hartley Transform Applications to Thermal Drift Elimination in Scanning Tunneling Microscopy * Algorithms of Measuring, Processing and Interpretation of Experimental Data Obtained with Scanning Tunneling Microscope * Time-dependent Atom-surface Interactions * Local and Global Minima on Molecular Potential Energy Surfaces: An Example of N3 Radical * Computation of Bifurcation Surfaces * Symbolic Computations in Quantum Mechanics: Energies in Next-to-solvable Systems * A Tool for RTP Reactor and Lamp Field Design * Modelling of Particle Spectra for the Analysis of Solid State Surface * List of Participants
Numerical Analysis and Improved Algorithms for Lyapunov-Exponent Calculation of Discrete-Time Chaotic Systems

NASA Astrophysics Data System (ADS)

He, Jianbin; Yu, Simin; Cai, Jianping

2016-12-01

Lyapunov exponent is an important index for describing chaotic systems behavior, and the largest Lyapunov exponent can be used to determine whether a system is chaotic or not. For discrete-time dynamical systems, the Lyapunov exponents are calculated by an eigenvalue method. In theory, according to eigenvalue method, the more accurate calculations of Lyapunov exponent can be obtained with the increment of iterations, and the limits also exist. However, due to the finite precision of computer and other reasons, the results will be numeric overflow, unrecognized, or inaccurate, which can be stated as follows: (1) The iterations cannot be too large, otherwise, the simulation result will appear as an error message of NaN or Inf; (2) If the error message of NaN or Inf does not appear, then with the increment of iterations, all Lyapunov exponents will get close to the largest Lyapunov exponent, which leads to inaccurate calculation results; (3) From the viewpoint of numerical calculation, obviously, if the iterations are too small, then the results are also inaccurate. Based on the analysis of Lyapunov-exponent calculation in discrete-time systems, this paper investigates two improved algorithms via QR orthogonal decomposition and SVD orthogonal decomposition approaches so as to solve the above-mentioned problems. Finally, some examples are given to illustrate the feasibility and effectiveness of the improved algorithms.

Dynamic splitting of Gaussian pencil beams in heterogeneity-correction algorithms for radiotherapy with heavy charged particles.

PubMed

Kanematsu, Nobuyuki; Komori, Masataka; Yonai, Shunsuke; Ishizaki, Azusa

2009-04-07

The pencil-beam algorithm is valid only when elementary Gaussian beams are small enough compared to the lateral heterogeneity of a medium, which is not always true in actual radiotherapy with protons and ions. This work addresses a solution for the problem. We found approximate self-similarity of Gaussian distributions, with which Gaussian beams can split into narrower and deflecting daughter beams when their sizes have overreached lateral heterogeneity in the beam-transport calculation. The effectiveness was assessed in a carbon-ion beam experiment in the presence of steep range compensation, where the splitting calculation reproduced a detour effect amounting to about 10% in dose or as large as the lateral particle disequilibrium effect. The efficiency was analyzed in calculations for carbon-ion and proton radiations with a heterogeneous phantom model, where the beam splitting increased computing times by factors of 4.7 and 3.2. The present method generally improves the accuracy of the pencil-beam algorithm without severe inefficiency. It will therefore be useful for treatment planning and potentially other demanding applications.
A Localization Method for Underwater Wireless Sensor Networks Based on Mobility Prediction and Particle Swarm Optimization Algorithms

PubMed Central

Zhang, Ying; Liang, Jixing; Jiang, Shengming; Chen, Wei

2016-01-01

Due to their special environment, Underwater Wireless Sensor Networks (UWSNs) are usually deployed over a large sea area and the nodes are usually floating. This results in a lower beacon node distribution density, a longer time for localization, and more energy consumption. Currently most of the localization algorithms in this field do not pay enough consideration on the mobility of the nodes. In this paper, by analyzing the mobility patterns of water near the seashore, a localization method for UWSNs based on a Mobility Prediction and a Particle Swarm Optimization algorithm (MP-PSO) is proposed. In this method, the range-based PSO algorithm is used to locate the beacon nodes, and their velocities can be calculated. The velocity of an unknown node is calculated by using the spatial correlation of underwater object’s mobility, and then their locations can be predicted. The range-based PSO algorithm may cause considerable energy consumption and its computation complexity is a little bit high, nevertheless the number of beacon nodes is relatively smaller, so the calculation for the large number of unknown nodes is succinct, and this method can obviously decrease the energy consumption and time cost of localizing these mobile nodes. The simulation results indicate that this method has higher localization accuracy and better localization coverage rate compared with some other widely used localization methods in this field. PMID:26861348
2.5D complex resistivity modeling and inversion using unstructured grids

NASA Astrophysics Data System (ADS)

Xu, Kaijun; Sun, Jie

2016-04-01

The characteristic of complex resistivity on rock and ore has been recognized by people for a long time. Generally we have used the Cole-Cole Model(CCM) to describe complex resistivity. It has been proved that the electrical anomaly of geologic body can be quantitative estimated by CCM parameters such as direct resistivity(ρ0), chargeability(m), time constant(τ) and frequency dependence(c). Thus it is very important to obtain the complex parameters of geologic body. It is difficult to approximate complex structures and terrain using traditional rectangular grid. In order to enhance the numerical accuracy and rationality of modeling and inversion, we use an adaptive finite-element algorithm for forward modeling of the frequency-domain 2.5D complex resistivity and implement the conjugate gradient algorithm in the inversion of 2.5D complex resistivity. An adaptive finite element method is applied for solving the 2.5D complex resistivity forward modeling of horizontal electric dipole source. First of all, the CCM is introduced into the Maxwell's equations to calculate the complex resistivity electromagnetic fields. Next, the pseudo delta function is used to distribute electric dipole source. Then the electromagnetic fields can be expressed in terms of the primary fields caused by layered structure and the secondary fields caused by inhomogeneities anomalous conductivity. At last, we calculated the electromagnetic fields response of complex geoelectric structures such as anticline, syncline, fault. The modeling results show that adaptive finite-element methods can automatically improve mesh generation and simulate complex geoelectric models using unstructured grids. The 2.5D complex resistivity invertion is implemented based the conjugate gradient algorithm.The conjugate gradient algorithm doesn't need to compute the sensitivity matrix but directly computes the sensitivity matrix or its transpose multiplying vector. In addition, the inversion target zones are segmented with fine grids and the background zones are segmented with big grid, the method can reduce the grid amounts of inversion, it is very helpful to improve the computational efficiency. The inversion results verify the validity and stability of conjugate gradient inversion algorithm. The results of theoretical calculation indicate that the modeling and inversion of 2.5D complex resistivity using unstructured grids are feasible. Using unstructured grids can improve the accuracy of modeling, but the large number of grids inversion is extremely time-consuming, so the parallel computation for the inversion is necessary. Acknowledgments: We thank to the support of the National Natural Science Foundation of China(41304094).
Real-time fluorescence target/background (T/B) ratio calculation in multimodal endoscopy for detecting GI tract cancer

NASA Astrophysics Data System (ADS)

Jiang, Yang; Gong, Yuanzheng; Wang, Thomas D.; Seibel, Eric J.

2017-02-01

Multimodal endoscopy, with fluorescence-labeled probes binding to overexpressed molecular targets, is a promising technology to visualize early-stage cancer. T/B ratio is the quantitative analysis used to correlate fluorescence regions to cancer. Currently, T/B ratio calculation is post-processing and does not provide real-time feedback to the endoscopist. To achieve real-time computer assisted diagnosis (CAD), we establish image processing protocols for calculating T/B ratio and locating high-risk fluorescence regions for guiding biopsy and therapy in Barrett's esophagus (BE) patients. Methods: Chan-Vese algorithm, an active contour model, is used to segment high-risk regions in fluorescence videos. A semi-implicit gradient descent method was applied to minimize the energy function of this algorithm and evolve the segmentation. The surrounding background was then identified using morphology operation. The average T/B ratio was computed and regions of interest were highlighted based on user-selected thresholding. Evaluation was conducted on 50 fluorescence videos acquired from clinical video recordings using a custom multimodal endoscope. Results: With a processing speed of 2 fps on a laptop computer, we obtained accurate segmentation of high-risk regions examined by experts. For each case, the clinical user could optimize target boundary by changing the penalty on area inside the contour. Conclusion: Automatic and real-time procedure of calculating T/B ratio and identifying high-risk regions of early esophageal cancer was developed. Future work will increase processing speed to <5 fps, refine the clinical interface, and apply to additional GI cancers and fluorescence peptides.
Introducing difference recurrence relations for faster semi-global alignment of long sequences.

PubMed

Suzuki, Hajime; Kasahara, Masahiro

2018-02-19

The read length of single-molecule DNA sequencers is reaching 1 Mb. Popular alignment software tools widely used for analyzing such long reads often take advantage of single-instruction multiple-data (SIMD) operations to accelerate calculation of dynamic programming (DP) matrices in the Smith-Waterman-Gotoh (SWG) algorithm with a fixed alignment start position at the origin. Nonetheless, 16-bit or 32-bit integers are necessary for storing the values in a DP matrix when sequences to be aligned are long; this situation hampers the use of the full SIMD width of modern processors. We proposed a faster semi-global alignment algorithm, "difference recurrence relations," that runs more rapidly than the state-of-the-art algorithm by a factor of 2.1. Instead of calculating and storing all the values in a DP matrix directly, our algorithm computes and stores mainly the differences between the values of adjacent cells in the matrix. Although the SWG algorithm and our algorithm can output exactly the same result, our algorithm mainly involves 8-bit integer operations, enabling us to exploit the full width of SIMD operations (e.g., 32) on modern processors. We also developed a library, libgaba, so that developers can easily integrate our algorithm into alignment programs. Our novel algorithm and optimized library implementation will facilitate accelerating nucleotide long-read analysis algorithms that use pairwise alignment stages. The library is implemented in the C programming language and available at https://github.com/ocxtal/libgaba .
Experiences with explicit finite-difference schemes for complex fluid dynamics problems on STAR-100 and CYBER-203 computers

NASA Technical Reports Server (NTRS)

Kumar, A.; Rudy, D. H.; Drummond, J. P.; Harris, J. E.

1982-01-01

Several two- and three-dimensional external and internal flow problems solved on the STAR-100 and CYBER-203 vector processing computers are described. The flow field was described by the full Navier-Stokes equations which were then solved by explicit finite-difference algorithms. Problem results and computer system requirements are presented. Program organization and data base structure for three-dimensional computer codes which will eliminate or improve on page faulting, are discussed. Storage requirements for three-dimensional codes are reduced by calculating transformation metric data in each step. As a result, in-core grid points were increased in number by 50% to 150,000, with a 10% execution time increase. An assessment of current and future machine requirements shows that even on the CYBER-205 computer only a few problems can be solved realistically. Estimates reveal that the present situation is more storage limited than compute rate limited, but advancements in both storage and speed are essential to realistically calculate three-dimensional flow.
Agreement between gamma passing rates using computed tomography in radiotherapy and secondary cancer risk prediction from more advanced dose calculated models

PubMed Central

Balosso, Jacques

2017-01-01

Background During the past decades, in radiotherapy, the dose distributions were calculated using density correction methods with pencil beam as type ‘a’ algorithm. The objectives of this study are to assess and evaluate the impact of dose distribution shift on the predicted secondary cancer risk (SCR), using modern advanced dose calculation algorithms, point kernel, as type ‘b’, which consider change in lateral electrons transport. Methods Clinical examples of pediatric cranio-spinal irradiation patients were evaluated. For each case, two radiotherapy treatment plans with were generated using the same prescribed dose to the target resulting in different number of monitor units (MUs) per field. The dose distributions were calculated, respectively, using both algorithms types. A gamma index (γ) analysis was used to compare dose distribution in the lung. The organ equivalent dose (OED) has been calculated with three different models, the linear, the linear-exponential and the plateau dose response curves. The excess absolute risk ratio (EAR) was also evaluated as (EAR = OED type ‘b’ / OED type ‘a’). Results The γ analysis results indicated an acceptable dose distribution agreement of 95% with 3%/3 mm. Although, the γ-maps displayed dose displacement >1 mm around the healthy lungs. Compared to type ‘a’, the OED values from type ‘b’ dose distributions’ were about 8% to 16% higher, leading to an EAR ratio >1, ranged from 1.08 to 1.13 depending on SCR models. Conclusions The shift of dose calculation in radiotherapy, according to the algorithm, can significantly influence the SCR prediction and the plan optimization, since OEDs are calculated from DVH for a specific treatment. The agreement between dose distribution and SCR prediction depends on dose response models and epidemiological data. In addition, the γ passing rates of 3%/3 mm does not translate the difference, up to 15%, in the predictions of SCR resulting from alternative algorithms. Considering that modern algorithms are more accurate, showing more precisely the dose distributions, but that the prediction of absolute SCR is still very imprecise, only the EAR ratio could be used to rank radiotherapy plans. PMID:28811995
Adaptable Iterative and Recursive Kalman Filter Schemes

NASA Technical Reports Server (NTRS)

Zanetti, Renato

2014-01-01

Nonlinear filters are often very computationally expensive and usually not suitable for real-time applications. Real-time navigation algorithms are typically based on linear estimators, such as the extended Kalman filter (EKF) and, to a much lesser extent, the unscented Kalman filter. The Iterated Kalman filter (IKF) and the Recursive Update Filter (RUF) are two algorithms that reduce the consequences of the linearization assumption of the EKF by performing N updates for each new measurement, where N is the number of recursions, a tuning parameter. This paper introduces an adaptable RUF algorithm to calculate N on the go, a similar technique can be used for the IKF as well.
Lattice QCD Application Development within the US DOE Exascale Computing Project

NASA Astrophysics Data System (ADS)

Brower, Richard; Christ, Norman; DeTar, Carleton; Edwards, Robert; Mackenzie, Paul

2018-03-01

In October, 2016, the US Department of Energy launched the Exascale Computing Project, which aims to deploy exascale computing resources for science and engineering in the early 2020's. The project brings together application teams, software developers, and hardware vendors in order to realize this goal. Lattice QCD is one of the applications. Members of the US lattice gauge theory community with significant collaborators abroad are developing algorithms and software for exascale lattice QCD calculations. We give a short description of the project, our activities, and our plans.
Lattice QCD Application Development within the US DOE Exascale Computing Project

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brower, Richard; Christ, Norman; DeTar, Carleton

In October, 2016, the US Department of Energy launched the Exascale Computing Project, which aims to deploy exascale computing resources for science and engineering in the early 2020's. The project brings together application teams, software developers, and hardware vendors in order to realize this goal. Lattice QCD is one of the applications. Members of the US lattice gauge theory community with significant collaborators abroad are developing algorithms and software for exascale lattice QCD calculations. We give a short description of the project, our activities, and our plans.
Discrete Fourier Transform Analysis in a Complex Vector Space

NASA Technical Reports Server (NTRS)

Dean, Bruce H.

2009-01-01

Alternative computational strategies for the Discrete Fourier Transform (DFT) have been developed using analysis of geometric manifolds. This approach provides a general framework for performing DFT calculations, and suggests a more efficient implementation of the DFT for applications using iterative transform methods, particularly phase retrieval. The DFT can thus be implemented using fewer operations when compared to the usual DFT counterpart. The software decreases the run time of the DFT in certain applications such as phase retrieval that iteratively call the DFT function. The algorithm exploits a special computational approach based on analysis of the DFT as a transformation in a complex vector space. As such, this approach has the potential to realize a DFT computation that approaches N operations versus Nlog(N) operations for the equivalent Fast Fourier Transform (FFT) calculation.
Automated neurovascular tracing and analysis of the knife-edge scanning microscope Rat Nissl data set using a computing cluster.

PubMed

Sungjun Lim; Nowak, Michael R; Yoonsuck Choe

2016-08-01

We present a novel, parallelizable algorithm capable of automatically reconstructing and calculating anatomical statistics of cerebral vascular networks embedded in large volumes of Rat Nissl-stained data. In this paper, we report the results of our method using Rattus somatosensory cortical data acquired using Knife-Edge Scanning Microscopy. Our algorithm performs the reconstruction task with averaged precision, recall, and F2-score of 0.978, 0.892, and 0.902 respectively. Calculated anatomical statistics show some conformance to values previously reported. The results that can be obtained from our method are expected to help explicate the relationship between the structural organization of the microcirculation and normal (and abnormal) cerebral functioning.
Efficient calculation of higher-order optical waveguide dispersion.

PubMed

Mores, J A; Malheiros-Silveira, G N; Fragnito, H L; Hernández-Figueroa, H E

2010-09-13

An efficient numerical strategy to compute the higher-order dispersion parameters of optical waveguides is presented. For the first time to our knowledge, a systematic study of the errors involved in the higher-order dispersions' numerical calculation process is made, showing that the present strategy can accurately model those parameters. Such strategy combines a full-vectorial finite element modal solver and a proper finite difference differentiation algorithm. Its performance has been carefully assessed through the analysis of several key geometries. In addition, the optimization of those higher-order dispersion parameters can also be carried out by coupling to the present scheme a genetic algorithm, as shown here through the design of a photonic crystal fiber suitable for parametric amplification applications.
A Scheme for Text Analysis Using Fortran.

ERIC Educational Resources Information Center

Koether, Mary E.; Coke, Esther U.

Using string-manipulation algorithms, FORTRAN computer programs were designed for analysis of written material. The programs measure length of a text and its complexity in terms of the average length of words and sentences, map the occurrences of keywords or phrases, calculate word frequency distribution and certain indicators of style. Trials of…
Understanding Computation of Impulse Response in Microwave Software Tools

ERIC Educational Resources Information Center

Potrebic, Milka M.; Tosic, Dejan V.; Pejovic, Predrag V.

2010-01-01

In modern microwave engineering curricula, the introduction of the many new topics in microwave industrial development, or of software tools for design and simulation, sometimes results in students having an inadequate understanding of the fundamental theory. The terminology for and the explanation of algorithms for calculating impulse response in…
Electromagnetic Characterization of Materials Using a Dual Chambered High Temperature Waveguide

DTIC Science & Technology

to just one day through simultaneous measurement of the sample and the empty second chamber. A vector network analyzer (VNA) will be used to run X-band...calculated from the Nicolson-Ross-Weir inversion algorithm for computing permittivity and permeability using VNA measured S-parameters at increasing temperatures.
Pilot study of an automated method to determine Melasma Area and Severity Index.

PubMed

Tay, E Y; Gan, E Y; Tan, V W D; Lin, Z; Liang, Y; Lin, F; Wee, S; Thng, T G

2015-06-01

Objective outcome measures for melasma severity are essential for the evaluation of severity as well as results of treatment. The modified Melasma Area and Severity Index (mMASI) score is a validated tool for assessing melasma severity but is often subject to inter-observer variability. To develop and validate a novel image analysis software designed to automatically calculate the area and degree of hyperpigmentation in melasma from computer image analysis of whole-face digital photographs, thereby deriving an automated mMASI score (aMASI). The algorithm was developed in collaboration between dermatologists and image analysis experts. Firstly, using an adaptive threshold method, the algorithm identifies, segments and calculates the areas involved. It then calculates the darkness. Finally, the derived area and darkness are then used to calculate mMASI. The scores derived from the algorithm are validated prospectively. Twenty-nine patients with melasma using depigmenting agents were recruited for validation. Three dermatologists scored mMASI at baseline and post-treatment using standardized photographs. These scores were compared with aMASI scores derived from computer analysis. aMASI scores correlated well with clinical mMASI pre-treatment (r = 0·735, P < 0·001) and post-treatment (r = 0·608, P < 0·001). aMASI was reliable in detecting changes with treatment. These changes in aMASI scores correlated well with changes in clinician-assessed mMASI (r = 0·622, P < 0·001). This study proposes a novel approach in melasma scoring using digital image analysis. It holds promise as a tool that would enable clinicians worldwide to standardize melasma severity scoring and outcome measures in an easy and reproducible manner, enabling different treatment options to be compared accurately. © 2015 British Association of Dermatologists.
Progress and supercomputing in computational fluid dynamics; Proceedings of U.S.-Israel Workshop, Jerusalem, Israel, December 1984

NASA Technical Reports Server (NTRS)

Murman, E. M. (Editor); Abarbanel, S. S. (Editor)

1985-01-01

Current developments and future trends in the application of supercomputers to computational fluid dynamics are discussed in reviews and reports. Topics examined include algorithm development for personal-size supercomputers, a multiblock three-dimensional Euler code for out-of-core and multiprocessor calculations, simulation of compressible inviscid and viscous flow, high-resolution solutions of the Euler equations for vortex flows, algorithms for the Navier-Stokes equations, and viscous-flow simulation by FEM and related techniques. Consideration is given to marching iterative methods for the parabolized and thin-layer Navier-Stokes equations, multigrid solutions to quasi-elliptic schemes, secondary instability of free shear flows, simulation of turbulent flow, and problems connected with weather prediction.
Interactive software tool to comprehend the calculation of optimal sequence alignments with dynamic programming.

PubMed

Ibarra, Ignacio L; Melo, Francisco

2010-07-01

Dynamic programming (DP) is a general optimization strategy that is successfully used across various disciplines of science. In bioinformatics, it is widely applied in calculating the optimal alignment between pairs of protein or DNA sequences. These alignments form the basis of new, verifiable biological hypothesis. Despite its importance, there are no interactive tools available for training and education on understanding the DP algorithm. Here, we introduce an interactive computer application with a graphical interface, for the purpose of educating students about DP. The program displays the DP scoring matrix and the resulting optimal alignment(s), while allowing the user to modify key parameters such as the values in the similarity matrix, the sequence alignment algorithm version and the gap opening/extension penalties. We hope that this software will be useful to teachers and students of bioinformatics courses, as well as researchers who implement the DP algorithm for diverse applications. The software is freely available at: http:/melolab.org/sat. The software is written in the Java computer language, thus it runs on all major platforms and operating systems including Windows, Mac OS X and LINUX. All inquiries or comments about this software should be directed to Francisco Melo at fmelo@bio.puc.cl.
Fast local reconstruction by selective backprojection for low dose in dental computed tomography

NASA Astrophysics Data System (ADS)

Yan, Bin; Deng, Lin; Han, Yu; Zhang, Feng; Wang, Xian-Chao; Li, Lei

2014-10-01

The high radiation dose in computed tomography (CT) scans increases the lifetime risk of cancer, which becomes a major clinical concern. The backprojection-filtration (BPF) algorithm could reduce the radiation dose by reconstructing the images from truncated data in a short scan. In a dental CT, it could reduce the radiation dose for the teeth by using the projection acquired in a short scan, and could avoid irradiation to the other part by using truncated projection. However, the limit of integration for backprojection varies per PI-line, resulting in low calculation efficiency and poor parallel performance. Recently, a tent BPF has been proposed to improve the calculation efficiency by rearranging the projection. However, the memory-consuming data rebinning process is included. Accordingly, the selective BPF (S-BPF) algorithm is proposed in this paper. In this algorithm, the derivative of the projection is backprojected to the points whose x coordinate is less than that of the source focal spot to obtain the differentiated backprojection. The finite Hilbert inverse is then applied to each PI-line segment. S-BPF avoids the influence of the variable limit of integration by selective backprojection without additional time cost or memory cost. The simulation experiment and the real experiment demonstrated the higher reconstruction efficiency of S-BPF.

A design tool for direct and non-stochastic calculations of near-field radiative transfer in complex structures: The NF-RT-FDTD algorithm

NASA Astrophysics Data System (ADS)

Didari, Azadeh; Pinar Mengüç, M.

2017-08-01

Advances in nanotechnology and nanophotonics are inextricably linked with the need for reliable computational algorithms to be adapted as design tools for the development of new concepts in energy harvesting, radiative cooling, nanolithography and nano-scale manufacturing, among others. In this paper, we provide an outline for such a computational tool, named NF-RT-FDTD, to determine the near-field radiative transfer between structured surfaces using Finite Difference Time Domain method. NF-RT-FDTD is a direct and non-stochastic algorithm, which accounts for the statistical nature of the thermal radiation and is easily applicable to any arbitrary geometry at thermal equilibrium. We present a review of the fundamental relations for far- and near-field radiative transfer between different geometries with nano-scale surface and volumetric features and gaps, and then we discuss the details of the NF-RT-FDTD formulation, its application to sample geometries and outline its future expansion to more complex geometries. In addition, we briefly discuss some of the recent numerical works for direct and indirect calculations of near-field thermal radiation transfer, including Scattering Matrix method, Finite Difference Time Domain method (FDTD), Wiener Chaos Expansion, Fluctuating Surface Current (FSC), Fluctuating Volume Current (FVC) and Thermal Discrete Dipole Approximations (TDDA).
The chemical information ontology: provenance and disambiguation for chemical data on the biological semantic web.

PubMed

Hastings, Janna; Chepelev, Leonid; Willighagen, Egon; Adams, Nico; Steinbeck, Christoph; Dumontier, Michel

2011-01-01

Cheminformatics is the application of informatics techniques to solve chemical problems in silico. There are many areas in biology where cheminformatics plays an important role in computational research, including metabolism, proteomics, and systems biology. One critical aspect in the application of cheminformatics in these fields is the accurate exchange of data, which is increasingly accomplished through the use of ontologies. Ontologies are formal representations of objects and their properties using a logic-based ontology language. Many such ontologies are currently being developed to represent objects across all the domains of science. Ontologies enable the definition, classification, and support for querying objects in a particular domain, enabling intelligent computer applications to be built which support the work of scientists both within the domain of interest and across interrelated neighbouring domains. Modern chemical research relies on computational techniques to filter and organise data to maximise research productivity. The objects which are manipulated in these algorithms and procedures, as well as the algorithms and procedures themselves, enjoy a kind of virtual life within computers. We will call these information entities. Here, we describe our work in developing an ontology of chemical information entities, with a primary focus on data-driven research and the integration of calculated properties (descriptors) of chemical entities within a semantic web context. Our ontology distinguishes algorithmic, or procedural information from declarative, or factual information, and renders of particular importance the annotation of provenance to calculated data. The Chemical Information Ontology is being developed as an open collaborative project. More details, together with a downloadable OWL file, are available at http://code.google.com/p/semanticchemistry/ (license: CC-BY-SA).
A fast radiative transfer model for visible through shortwave infrared spectral reflectances in clear and cloudy atmospheres

NASA Astrophysics Data System (ADS)

Wang, Chenxi; Yang, Ping; Nasiri, Shaima L.; Platnick, Steven; Baum, Bryan A.; Heidinger, Andrew K.; Liu, Xu

2013-02-01

A computationally efficient radiative transfer model (RTM) for calculating visible (VIS) through shortwave infrared (SWIR) reflectances is developed for use in satellite and airborne cloud property retrievals. The full radiative transfer equation (RTE) for combinations of cloud, aerosol, and molecular layers is solved approximately by using six independent RTEs that assume the plane-parallel approximation along with a single-scattering approximation for Rayleigh scattering. Each of the six RTEs can be solved analytically if the bidirectional reflectance/transmittance distribution functions (BRDF/BTDF) of the cloud/aerosol layers are known. The adding/doubling (AD) algorithm is employed to account for overlapped cloud/aerosol layers and non-Lambertian surfaces. Two approaches are used to mitigate the significant computational burden of the AD algorithm. First, the BRDF and BTDF of single cloud/aerosol layers are pre-computed using the discrete ordinates radiative transfer program (DISORT) implemented with 128 streams, and second, the required integral in the AD algorithm is numerically implemented on a twisted icosahedral mesh. A concise surface BRDF simulator associated with the MODIS land surface product (MCD43) is merged into a fast RTM to accurately account for non-isotropic surface reflectance. The resulting fast RTM is evaluated with respect to its computational accuracy and efficiency. The simulation bias between DISORT and the fast RTM is large (e.g., relative error >5%) only when both the solar zenith angle (SZA) and the viewing zenith angle (VZA) are large (i.e., SZA>45° and VZA>70°). For general situations, i.e., cloud/aerosol layers above a non-Lambertian surface, the fast RTM calculation rate is faster than that of the 128-stream DISORT by approximately two orders of magnitude.
The Chemical Information Ontology: Provenance and Disambiguation for Chemical Data on the Biological Semantic Web

PubMed Central

Hastings, Janna; Chepelev, Leonid; Willighagen, Egon; Adams, Nico; Steinbeck, Christoph; Dumontier, Michel

2011-01-01

Cheminformatics is the application of informatics techniques to solve chemical problems in silico. There are many areas in biology where cheminformatics plays an important role in computational research, including metabolism, proteomics, and systems biology. One critical aspect in the application of cheminformatics in these fields is the accurate exchange of data, which is increasingly accomplished through the use of ontologies. Ontologies are formal representations of objects and their properties using a logic-based ontology language. Many such ontologies are currently being developed to represent objects across all the domains of science. Ontologies enable the definition, classification, and support for querying objects in a particular domain, enabling intelligent computer applications to be built which support the work of scientists both within the domain of interest and across interrelated neighbouring domains. Modern chemical research relies on computational techniques to filter and organise data to maximise research productivity. The objects which are manipulated in these algorithms and procedures, as well as the algorithms and procedures themselves, enjoy a kind of virtual life within computers. We will call these information entities. Here, we describe our work in developing an ontology of chemical information entities, with a primary focus on data-driven research and the integration of calculated properties (descriptors) of chemical entities within a semantic web context. Our ontology distinguishes algorithmic, or procedural information from declarative, or factual information, and renders of particular importance the annotation of provenance to calculated data. The Chemical Information Ontology is being developed as an open collaborative project. More details, together with a downloadable OWL file, are available at http://code.google.com/p/semanticchemistry/ (license: CC-BY-SA). PMID:21991315
Computional algorithm for lifetime exposure to antimicrobials in pigs using register data-The LEA algorithm.

PubMed

Birkegård, Anna Camilla; Andersen, Vibe Dalhoff; Halasa, Tariq; Jensen, Vibeke Frøkjær; Toft, Nils; Vigre, Håkan

2017-10-01

Accurate and detailed data on antimicrobial exposure in pig production are essential when studying the association between antimicrobial exposure and antimicrobial resistance. Due to difficulties in obtaining primary data on antimicrobial exposure in a large number of farms, there is a need for a robust and valid method to estimate the exposure using register data. An approach that estimates the antimicrobial exposure in every rearing period during the lifetime of a pig using register data was developed into a computational algorithm. In this approach data from national registers on antimicrobial purchases, movements of pigs and farm demographics registered at farm level are used. The algorithm traces batches of pigs retrospectively from slaughter to the farm(s) that housed the pigs during their finisher, weaner, and piglet period. Subsequently, the algorithm estimates the antimicrobial exposure as the number of Animal Defined Daily Doses for treatment of one kg pig in each of the rearing periods. Thus, the antimicrobial purchase data at farm level are translated into antimicrobial exposure estimates at batch level. A batch of pigs is defined here as pigs sent to slaughter at the same day from the same farm. In this study we present, validate, and optimise a computational algorithm that calculate the lifetime exposure of antimicrobials for slaughter pigs. The algorithm was evaluated by comparing the computed estimates to data on antimicrobial usage from farm records in 15 farm units. We found a good positive correlation between the two estimates. The algorithm was run for Danish slaughter pigs sent to slaughter in January to March 2015 from farms with more than 200 finishers to estimate the proportion of farms that it was applicable for. In the final process, the algorithm was successfully run for batches of pigs originating from 3026 farms with finisher units (77% of the initial population). This number can be increased if more accurate register data can be obtained. The algorithm provides a systematic and repeatable approach to estimating the antimicrobial exposure throughout the rearing period, independent of rearing site for finisher batches, as a lifetime exposure measurement. Copyright © 2017 Elsevier B.V. All rights reserved.
Linear static structural and vibration analysis on high-performance computers

NASA Technical Reports Server (NTRS)

Baddourah, M. A.; Storaasli, O. O.; Bostic, S. W.

1993-01-01

Parallel computers offer the oppurtunity to significantly reduce the computation time necessary to analyze large-scale aerospace structures. This paper presents algorithms developed for and implemented on massively-parallel computers hereafter referred to as Scalable High-Performance Computers (SHPC), for the most computationally intensive tasks involved in structural analysis, namely, generation and assembly of system matrices, solution of systems of equations and calculation of the eigenvalues and eigenvectors. Results on SHPC are presented for large-scale structural problems (i.e. models for High-Speed Civil Transport). The goal of this research is to develop a new, efficient technique which extends structural analysis to SHPC and makes large-scale structural analyses tractable.
Covering Resilience: A Recent Development for Binomial Checkpointing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Walther, Andrea; Narayanan, Sri Hari Krishna

In terms of computing time, adjoint methods offer a very attractive alternative to compute gradient information, required, e.g., for optimization purposes. However, together with this very favorable temporal complexity result comes a memory requirement that is in essence proportional with the operation count of the underlying function, e.g., if algorithmic differentiation is used to provide the adjoints. For this reason, checkpointing approaches in many variants have become popular. This paper analyzes an extension of the so-called binomial approach to cover also possible failures of the computing systems. Such a measure of precaution is of special interest for massive parallel simulationsmore » and adjoint calculations where the mean time between failure of the large scale computing system is smaller than the time needed to complete the calculation of the adjoint information. We describe the extensions of standard checkpointing approaches required for such resilience, provide a corresponding implementation and discuss first numerical results.« less
Calculation of power spectrums from digital time series with missing data points

NASA Technical Reports Server (NTRS)

Murray, C. W., Jr.

1980-01-01

Two algorithms are developed for calculating power spectrums from the autocorrelation function when there are missing data points in the time series. Both methods use an average sampling interval to compute lagged products. One method, the correlation function power spectrum, takes the discrete Fourier transform of the lagged products directly to obtain the spectrum, while the other, the modified Blackman-Tukey power spectrum, takes the Fourier transform of the mean lagged products. Both techniques require fewer calculations than other procedures since only 50% to 80% of the maximum lags need be calculated. The algorithms are compared with the Fourier transform power spectrum and two least squares procedures (all for an arbitrary data spacing). Examples are given showing recovery of frequency components from simulated periodic data where portions of the time series are missing and random noise has been added to both the time points and to values of the function. In addition the methods are compared using real data. All procedures performed equally well in detecting periodicities in the data.
Modern Approaches to the Computation of the Probability of Target Detection in Cluttered Environments

NASA Astrophysics Data System (ADS)

Meitzler, Thomas J.

The field of computer vision interacts with fields such as psychology, vision research, machine vision, psychophysics, mathematics, physics, and computer science. The focus of this thesis is new algorithms and methods for the computation of the probability of detection (Pd) of a target in a cluttered scene. The scene can be either a natural visual scene such as one sees with the naked eye (visual), or, a scene displayed on a monitor with the help of infrared sensors. The relative clutter and the temperature difference between the target and background (DeltaT) are defined and then used to calculate a relative signal -to-clutter ratio (SCR) from which the Pd is calculated for a target in a cluttered scene. It is shown how this definition can include many previous definitions of clutter and (DeltaT). Next, fuzzy and neural -fuzzy techniques are used to calculate the Pd and it is shown how these methods can give results that have a good correlation with experiment. The experimental design for actually measuring the Pd of a target by observers is described. Finally, wavelets are applied to the calculation of clutter and it is shown how this new definition of clutter based on wavelets can be used to compute the Pd of a target.
Nonlinear Rayleigh wave inversion based on the shuffled frog-leaping algorithm

NASA Astrophysics Data System (ADS)

Sun, Cheng-Yu; Wang, Yan-Yan; Wu, Dun-Shi; Qin, Xiao-Jun

2017-12-01

At present, near-surface shear wave velocities are mainly calculated through Rayleigh wave dispersion-curve inversions in engineering surface investigations, but the required calculations pose a highly nonlinear global optimization problem. In order to alleviate the risk of falling into a local optimal solution, this paper introduces a new global optimization method, the shuffle frog-leaping algorithm (SFLA), into the Rayleigh wave dispersion-curve inversion process. SFLA is a swarm-intelligence-based algorithm that simulates a group of frogs searching for food. It uses a few parameters, achieves rapid convergence, and is capability of effective global searching. In order to test the reliability and calculation performance of SFLA, noise-free and noisy synthetic datasets were inverted. We conducted a comparative analysis with other established algorithms using the noise-free dataset, and then tested the ability of SFLA to cope with data noise. Finally, we inverted a real-world example to examine the applicability of SFLA. Results from both synthetic and field data demonstrated the effectiveness of SFLA in the interpretation of Rayleigh wave dispersion curves. We found that SFLA is superior to the established methods in terms of both reliability and computational efficiency, so it offers great potential to improve our ability to solve geophysical inversion problems.
A variational eigenvalue solver on a photonic quantum processor

PubMed Central

Peruzzo, Alberto; McClean, Jarrod; Shadbolt, Peter; Yung, Man-Hong; Zhou, Xiao-Qi; Love, Peter J.; Aspuru-Guzik, Alán; O’Brien, Jeremy L.

2014-01-01

Quantum computers promise to efficiently solve important problems that are intractable on a conventional computer. For quantum systems, where the physical dimension grows exponentially, finding the eigenvalues of certain operators is one such intractable problem and remains a fundamental challenge. The quantum phase estimation algorithm efficiently finds the eigenvalue of a given eigenvector but requires fully coherent evolution. Here we present an alternative approach that greatly reduces the requirements for coherent evolution and combine this method with a new approach to state preparation based on ansätze and classical optimization. We implement the algorithm by combining a highly reconfigurable photonic quantum processor with a conventional computer. We experimentally demonstrate the feasibility of this approach with an example from quantum chemistry—calculating the ground-state molecular energy for He–H+. The proposed approach drastically reduces the coherence time requirements, enhancing the potential of quantum resources available today and in the near future. PMID:25055053
The study on servo-control system in the large aperture telescope

NASA Astrophysics Data System (ADS)

Hu, Wei; Zhenchao, Zhang; Daxing, Wang

2008-08-01

Large astronomical telescope or extremely enormous astronomical telescope servo tracking technique will be one of crucial technology that must be solved in researching and manufacturing. To control technique feature of large astronomical telescope or extremely enormous astronomical telescope, this paper design a sort of large astronomical telescope servo tracking control system. This system composes a principal and subordinate distributed control system, host computer sends steering instruction and receive slave computer functional mode, slave computer accomplish control algorithm and execute real-time control. Large astronomical telescope servo control use direct drive machine, and adopt DSP technology to complete direct torque control algorithm, Such design can not only increase control system performance, but also greatly reduced volume and costs of control system, which has a significant occurrence. The system design scheme can be proved reasonably by calculating and simulating. This system can be applied to large astronomical telescope.
Global Artificial Boundary Conditions for Computation of External Flow Problems with Propulsive Jets

NASA Technical Reports Server (NTRS)

Tsynkov, Semyon; Abarbanel, Saul; Nordstrom, Jan; Ryabenkii, Viktor; Vatsa, Veer

1998-01-01

We propose new global artificial boundary conditions (ABC's) for computation of flows with propulsive jets. The algorithm is based on application of the difference potentials method (DPM). Previously, similar boundary conditions have been implemented for calculation of external compressible viscous flows around finite bodies. The proposed modification substantially extends the applicability range of the DPM-based algorithm. In the paper, we present the general formulation of the problem, describe our numerical methodology, and discuss the corresponding computational results. The particular configuration that we analyze is a slender three-dimensional body with boat-tail geometry and supersonic jet exhaust in a subsonic external flow under zero angle of attack. Similarly to the results obtained earlier for the flows around airfoils and wings, current results for the jet flow case corroborate the superiority of the DPM-based ABC's over standard local methodologies from the standpoints of accuracy, overall numerical performance, and robustness.
Optimization of the Monte Carlo code for modeling of photon migration in tissue.

PubMed

Zołek, Norbert S; Liebert, Adam; Maniewski, Roman

2006-10-01

The Monte Carlo method is frequently used to simulate light transport in turbid media because of its simplicity and flexibility, allowing to analyze complicated geometrical structures. Monte Carlo simulations are, however, time consuming because of the necessity to track the paths of individual photons. The time consuming computation is mainly associated with the calculation of the logarithmic and trigonometric functions as well as the generation of pseudo-random numbers. In this paper, the Monte Carlo algorithm was developed and optimized, by approximation of the logarithmic and trigonometric functions. The approximations were based on polynomial and rational functions, and the errors of these approximations are less than 1% of the values of the original functions. The proposed algorithm was verified by simulations of the time-resolved reflectance at several source-detector separations. The results of the calculation using the approximated algorithm were compared with those of the Monte Carlo simulations obtained with an exact computation of the logarithm and trigonometric functions as well as with the solution of the diffusion equation. The errors of the moments of the simulated distributions of times of flight of photons (total number of photons, mean time of flight and variance) are less than 2% for a range of optical properties, typical of living tissues. The proposed approximated algorithm allows to speed up the Monte Carlo simulations by a factor of 4. The developed code can be used on parallel machines, allowing for further acceleration.
Towards topological quantum computer

NASA Astrophysics Data System (ADS)

Melnikov, D.; Mironov, A.; Mironov, S.; Morozov, A.; Morozov, An.

2018-01-01

Quantum R-matrices, the entangling deformations of non-entangling (classical) permutations, provide a distinguished basis in the space of unitary evolutions and, consequently, a natural choice for a minimal set of basic operations (universal gates) for quantum computation. Yet they play a special role in group theory, integrable systems and modern theory of non-perturbative calculations in quantum field and string theory. Despite recent developments in those fields the idea of topological quantum computing and use of R-matrices, in particular, practically reduce to reinterpretation of standard sets of quantum gates, and subsequently algorithms, in terms of available topological ones. In this paper we summarize a modern view on quantum R-matrix calculus and propose to look at the R-matrices acting in the space of irreducible representations, which are unitary for the real-valued couplings in Chern-Simons theory, as the fundamental set of universal gates for topological quantum computer. Such an approach calls for a more thorough investigation of the relation between topological invariants of knots and quantum algorithms.
On the Rapid Computation of Various Polylogarithmic Constants

NASA Technical Reports Server (NTRS)

Bailey, David H.; Borwein, Peter; Plouffe, Simon

1996-01-01

We give algorithms for the computation of the d-th digit of certain transcendental numbers in various bases. These algorithms can be easily implemented (multiple precision arithmetic is not needed), require virtually no memory, and feature run times that scale nearly linearly with the order of the digit desired. They make it feasible to compute, for example, the billionth binary digit of log(2) or pi on a modest workstation in a few hours run time. We demonstrate this technique by computing the ten billionth hexadecimal digit of pi, the billionth hexadecimal digits of pi-squared, log(2) and log-squared(2), and the ten billionth decimal digit of log(9/10). These calculations rest on the observation that very special types of identities exist for certain numbers like pi, pi-squared, log(2) and log-squared(2). These are essentially polylogarithmic ladders in an integer base. A number of these identities that we derive in this work appear to be new, for example a critical identity for pi.
Intelligent QoS routing algorithm based on improved AODV protocol for Ad Hoc networks

NASA Astrophysics Data System (ADS)

Huibin, Liu; Jun, Zhang

2016-04-01

Mobile Ad Hoc Networks were playing an increasingly important part in disaster reliefs, military battlefields and scientific explorations. However, networks routing difficulties are more and more outstanding due to inherent structures. This paper proposed an improved cuckoo searching-based Ad hoc On-Demand Distance Vector Routing protocol (CSAODV). It elaborately designs the calculation methods of optimal routing algorithm used by protocol and transmission mechanism of communication-package. In calculation of optimal routing algorithm by CS Algorithm, by increasing QoS constraint, the found optimal routing algorithm can conform to the requirements of specified bandwidth and time delay, and a certain balance can be obtained among computation spending, bandwidth and time delay. Take advantage of NS2 simulation software to take performance test on protocol in three circumstances and validate the feasibility and validity of CSAODV protocol. In results, CSAODV routing protocol is more adapt to the change of network topological structure than AODV protocol, which improves package delivery fraction of protocol effectively, reduce the transmission time delay of network, reduce the extra burden to network brought by controlling information, and improve the routing efficiency of network.
TH-E-BRE-07: Development of Dose Calculation Error Predictors for a Widely Implemented Clinical Algorithm

DOE Office of Scientific and Technical Information (OSTI.GOV)

Egan, A; Laub, W

2014-06-15

Purpose: Several shortcomings of the current implementation of the analytic anisotropic algorithm (AAA) may lead to dose calculation errors in highly modulated treatments delivered to highly heterogeneous geometries. Here we introduce a set of dosimetric error predictors that can be applied to a clinical treatment plan and patient geometry in order to identify high risk plans. Once a problematic plan is identified, the treatment can be recalculated with more accurate algorithm in order to better assess its viability. Methods: Here we focus on three distinct sources dosimetric error in the AAA algorithm. First, due to a combination of discrepancies inmore » smallfield beam modeling as well as volume averaging effects, dose calculated through small MLC apertures can be underestimated, while that behind small MLC blocks can overestimated. Second, due the rectilinear scaling of the Monte Carlo generated pencil beam kernel, energy is not properly transported through heterogeneities near, but not impeding, the central axis of the beamlet. And third, AAA overestimates dose in regions very low density (< 0.2 g/cm{sup 3}). We have developed an algorithm to detect the location and magnitude of each scenario within the patient geometry, namely the field-size index (FSI), the heterogeneous scatter index (HSI), and the lowdensity index (LDI) respectively. Results: Error indices successfully identify deviations between AAA and Monte Carlo dose distributions in simple phantom geometries. Algorithms are currently implemented in the MATLAB computing environment and are able to run on a typical RapidArc head and neck geometry in less than an hour. Conclusion: Because these error indices successfully identify each type of error in contrived cases, with sufficient benchmarking, this method can be developed into a clinical tool that may be able to help estimate AAA dose calculation errors and when it might be advisable to use Monte Carlo calculations.« less
Dosimetric comparison of peripheral NSCLC SBRT using Acuros XB and AAA calculation algorithms.

PubMed

Ong, Chloe C H; Ang, Khong Wei; Soh, Roger C X; Tin, Kah Ming; Yap, Jerome H H; Lee, James C L; Bragg, Christopher M

2017-01-01

There is a concern for dose calculation in highly heterogenous environments such as the thorax region. This study compares the quality of treatment plans of peripheral non-small cell lung cancer (NSCLC) stereotactic body radiation therapy (SBRT) using 2 calculation algorithms, namely, Eclipse Anisotropic Analytical Algorithm (AAA) and Acuros External Beam (AXB), for 3-dimensional conformal radiation therapy (3DCRT) and volumetric-modulated arc therapy (VMAT). Four-dimensional computed tomography (4DCT) data from 20 anonymized patients were studied using Varian Eclipse planning system, AXB, and AAA version 10.0.28. A 3DCRT plan and a VMAT plan were generated using AAA and AXB with constant plan parameters for each patient. The prescription and dose constraints were benchmarked against Radiation Therapy Oncology Group (RTOG) 0915 protocol. Planning parameters of the plan were compared statistically using Mann-Whitney U tests. Results showed that 3DCRT and VMAT plans have a lower target coverage up to 8% when calculated using AXB as compared with AAA. The conformity index (CI) for AXB plans was 4.7% lower than AAA plans, but was closer to unity, which indicated better target conformity. AXB produced plans with global maximum doses which were, on average, 2% hotter than AAA plans. Both 3DCRT and VMAT plans were able to achieve D95%. VMAT plans were shown to be more conformal (CI = 1.01) and were at least 3.2% and 1.5% lower in terms of PTV maximum and mean dose, respectively. There was no statistically significant difference for doses received by organs at risk (OARs) regardless of calculation algorithms and treatment techniques. In general, the difference in tissue modeling for AXB and AAA algorithm is responsible for the dose distribution between the AXB and the AAA algorithms. The AXB VMAT plans could be used to benefit patients receiving peripheral NSCLC SBRT. Copyright © 2017 American Association of Medical Dosimetrists. Published by Elsevier Inc. All rights reserved.
FPGA design for constrained energy minimization

NASA Astrophysics Data System (ADS)

Wang, Jianwei; Chang, Chein-I.; Cao, Mang

2004-02-01

The Constrained Energy Minimization (CEM) has been widely used for hyperspectral detection and classification. The feasibility of implementing the CEM as a real-time processing algorithm in systolic arrays has been also demonstrated. The main challenge of realizing the CEM in hardware architecture in the computation of the inverse of the data correlation matrix performed in the CEM, which requires a complete set of data samples. In order to cope with this problem, the data correlation matrix must be calculated in a causal manner which only needs data samples up to the sample at the time it is processed. This paper presents a Field Programmable Gate Arrays (FPGA) design of such a causal CEM. The main feature of the proposed FPGA design is to use the Coordinate Rotation DIgital Computer (CORDIC) algorithm that can convert a Givens rotation of a vector to a set of shift-add operations. As a result, the CORDIC algorithm can be easily implemented in hardware architecture, therefore in FPGA. Since the computation of the inverse of the data correlction involves a series of Givens rotations, the utility of the CORDIC algorithm allows the causal CEM to perform real-time processing in FPGA. In this paper, an FPGA implementation of the causal CEM will be studied and its detailed architecture will be also described.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.