efficient computational methods: Topics by Science.gov

Sample records for efficient computational methods

Computational efficiency for the surface renewal method

NASA Astrophysics Data System (ADS)

Kelley, Jason; Higgins, Chad

2018-04-01

Measuring surface fluxes using the surface renewal (SR) method requires programmatic algorithms for tabulation, algebraic calculation, and data quality control. A number of different methods have been published describing automated calibration of SR parameters. Because the SR method utilizes high-frequency (10 Hz+) measurements, some steps in the flux calculation are computationally expensive, especially when automating SR to perform many iterations of these calculations. Several new algorithms were written that perform the required calculations more efficiently and rapidly, and were tested for sensitivity to length of flux averaging period, ability to measure over a large range of lag timescales, and overall computational efficiency. These algorithms utilize signal processing techniques and algebraic simplifications, demonstrating that simple modifications can dramatically improve computational efficiency. The results here complement efforts by other authors to standardize a robust and accurate computational SR method. Faster computation grants flexibility in implementing the SR method, opening new avenues for SR to be used in research, for applied monitoring, and in novel field deployments.
An efficient method for computation of the manipulator inertia matrix

NASA Technical Reports Server (NTRS)

Fijany, Amir; Bejczy, Antal K.

1989-01-01

An efficient method of computation of the manipulator inertia matrix is presented. Using spatial notations, the method leads to the definition of the composite rigid-body spatial inertia, which is a spatial representation of the notion of augmented body. The previously proposed methods, the physical interpretations leading to their derivation, and their redundancies are analyzed. The proposed method achieves a greater efficiency by eliminating the redundancy in the intrinsic equations as well as by a better choice of coordinate frame for their projection. In this case, removing the redundancy leads to greater efficiency of the computation in both serial and parallel senses.
An approximate solution to improve computational efficiency of impedance-type payload load prediction

NASA Technical Reports Server (NTRS)

White, C. W.

1981-01-01

The computational efficiency of the impedance type loads prediction method was studied. Three goals were addressed: devise a method to make the impedance method operate more efficiently in the computer; assess the accuracy and convenience of the method for determining the effect of design changes; and investigate the use of the method to identify design changes for reduction of payload loads. The method is suitable for calculation of dynamic response in either the frequency or time domain. It is concluded that: the choice of an orthogonal coordinate system will allow the impedance method to operate more efficiently in the computer; the approximate mode impedance technique is adequate for determining the effect of design changes, and is applicable for both statically determinate and statically indeterminate payload attachments; and beneficial design changes to reduce payload loads can be identified by the combined application of impedance techniques and energy distribution review techniques.
Methods for Computationally Efficient Structured CFD Simulations of Complex Turbomachinery Flows

NASA Technical Reports Server (NTRS)

Herrick, Gregory P.; Chen, Jen-Ping

2012-01-01

This research presents more efficient computational methods by which to perform multi-block structured Computational Fluid Dynamics (CFD) simulations of turbomachinery, thus facilitating higher-fidelity solutions of complicated geometries and their associated flows. This computational framework offers flexibility in allocating resources to balance process count and wall-clock computation time, while facilitating research interests of simulating axial compressor stall inception with more complete gridding of the flow passages and rotor tip clearance regions than is typically practiced with structured codes. The paradigm presented herein facilitates CFD simulation of previously impractical geometries and flows. These methods are validated and demonstrate improved computational efficiency when applied to complicated geometries and flows.
Efficient calibration for imperfect computer models

DOE PAGES

Tuo, Rui; Wu, C. F. Jeff

2015-12-01

Many computer models contain unknown parameters which need to be estimated using physical observations. Furthermore, the calibration method based on Gaussian process models may lead to unreasonable estimates for imperfect computer models. In this work, we extend their study to calibration problems with stochastic physical data. We propose a novel method, called the L2 calibration, and show its semiparametric efficiency. The conventional method of ordinary least squares is also studied. Theoretical analysis shows that it is consistent but not efficient. Here, numerical examples show that the proposed method outperforms the existing ones.
Developing an Efficient Computational Method that Estimates the Ability of Students in a Web-Based Learning Environment

ERIC Educational Resources Information Center

Lee, Young-Jin

2012-01-01

This paper presents a computational method that can efficiently estimate the ability of students from the log files of a Web-based learning environment capturing their problem solving processes. The computational method developed in this study approximates the posterior distribution of the student's ability obtained from the conventional Bayes…
Efficient computation of photonic crystal waveguide modes with dispersive material.

PubMed

Schmidt, Kersten; Kappeler, Roman

2010-03-29

The optimization of PhC waveguides is a key issue for successfully designing PhC devices. Since this design task is computationally expensive, efficient methods are demanded. The available codes for computing photonic bands are also applied to PhC waveguides. They are reliable but not very efficient, which is even more pronounced for dispersive material. We present a method based on higher order finite elements with curved cells, which allows to solve for the band structure taking directly into account the dispersiveness of the materials. This is accomplished by reformulating the wave equations as a linear eigenproblem in the complex wave-vectors k. For this method, we demonstrate the high efficiency for the computation of guided PhC waveguide modes by a convergence analysis.
Efficient path-based computations on pedigree graphs with compact encodings

PubMed Central

2012-01-01

A pedigree is a diagram of family relationships, and it is often used to determine the mode of inheritance (dominant, recessive, etc.) of genetic diseases. Along with rapidly growing knowledge of genetics and accumulation of genealogy information, pedigree data is becoming increasingly important. In large pedigree graphs, path-based methods for efficiently computing genealogical measurements, such as inbreeding and kinship coefficients of individuals, depend on efficient identification and processing of paths. In this paper, we propose a new compact path encoding scheme on large pedigrees, accompanied by an efficient algorithm for identifying paths. We demonstrate the utilization of our proposed method by applying it to the inbreeding coefficient computation. We present time and space complexity analysis, and also manifest the efficiency of our method for evaluating inbreeding coefficients as compared to previous methods by experimental results using pedigree graphs with real and synthetic data. Both theoretical and experimental results demonstrate that our method is more scalable and efficient than previous methods in terms of time and space requirements. PMID:22536898
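For orientation, the inbreeding coefficient mentioned above can be sketched with the textbook recursive kinship definition: an individual's inbreeding coefficient equals the kinship coefficient of its parents. This is a minimal illustration only, not the compact path encoding proposed in the paper; the toy pedigree and the names `PEDIGREE`, `kinship`, and `inbreeding` are hypothetical.

```python
from functools import lru_cache

# Hypothetical toy pedigree: individual -> (father, mother).
# Founders have unknown parents, written as (None, None).
# Entries are listed so that parents always appear before their children.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("A", "B"),
    "E": ("C", "D"),  # offspring of full siblings
}

# Position in the listing, used to decide which individual is "younger".
ORDER = {name: i for i, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def kinship(x, y):
    """Kinship coefficient of x and y by the standard recursion."""
    if x is None or y is None:
        return 0.0  # unknown parents are assumed unrelated
    if x == y:
        father, mother = PEDIGREE[x]
        return 0.5 * (1.0 + kinship(father, mother))
    # Always expand the later-listed individual, so an ancestor is
    # never expanded through one of its own descendants.
    if ORDER[x] < ORDER[y]:
        x, y = y, x
    father, mother = PEDIGREE[x]
    return 0.5 * (kinship(father, y) + kinship(mother, y))


def inbreeding(x):
    """Inbreeding coefficient: kinship of the two parents of x."""
    father, mother = PEDIGREE[x]
    return kinship(father, mother)
```

Memoizing `kinship` avoids recomputing shared ancestors, but the recursion still revisits many pairs on large pedigrees, which is the cost that path-based encodings aim to reduce.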
Probabilistic methods for rotordynamics analysis

NASA Technical Reports Server (NTRS)

Wu, Y.-T.; Torng, T. Y.; Millwater, H. R.; Fossum, A. F.; Rheinfurth, M. H.

1991-01-01

This paper summarizes the development of the methods and a computer program to compute the probability of instability of dynamic systems that can be represented by a system of second-order ordinary linear differential equations. Two instability criteria based upon the eigenvalues or Routh-Hurwitz test functions are investigated. Computational methods based on a fast probability integration concept and an efficient adaptive importance sampling method are proposed to perform efficient probabilistic analysis. A numerical example is provided to demonstrate the methods.
[Economic efficiency of computer monitoring of health].

PubMed

Il'icheva, N P; Stazhadze, L L

2001-01-01

Presents the method of computer monitoring of health, based on utilization of modern information technologies in public health. The method helps organize preventive activities of an outpatient clinic at a high level and essentially decrease the time and money loss. Efficiency of such preventive measures, increased number of computer and Internet users suggests that such methods are promising and further studies in this field are needed.
The diffusive finite state projection algorithm for efficient simulation of the stochastic reaction-diffusion master equation.

PubMed

Drawert, Brian; Lawson, Michael J; Petzold, Linda; Khammash, Mustafa

2010-02-21

We have developed a computational framework for accurate and efficient simulation of stochastic spatially inhomogeneous biochemical systems. The new computational method employs a fractional step hybrid strategy. A novel formulation of the finite state projection (FSP) method, called the diffusive FSP method, is introduced for the efficient and accurate simulation of diffusive transport. Reactions are handled by the stochastic simulation algorithm.
A LSQR-type method provides a computationally efficient automated optimal choice of regularization parameter in diffuse optical tomography.

PubMed

Prakash, Jaya; Yalavarthy, Phaneendra K

2013-03-01

The aim is to develop a computationally efficient automated method for the optimal choice of regularization parameter in diffuse optical tomography. The least-squares QR (LSQR)-type method that uses Lanczos bidiagonalization is known to be computationally efficient in performing the reconstruction procedure in diffuse optical tomography. The same is effectively deployed via an optimization procedure that uses the simplex method to find the optimal regularization parameter. The proposed LSQR-type method is compared with traditional methods such as L-curve, generalized cross-validation (GCV), and the recently proposed minimal residual method (MRM)-based choice of regularization parameter using numerical and experimental phantom data. The results indicate that the performance of the proposed LSQR-type and MRM-based methods in terms of reconstructed image quality is similar, and superior to that of the L-curve and GCV-based methods. The computational complexity of the proposed method is at least five times lower than that of the MRM-based method, making it an optimal technique. The LSQR-type method was able to overcome the inherent limitation of the computationally expensive MRM-based automated way of finding the optimal regularization parameter in diffuse optical tomographic imaging, making this method more suitable for real-time deployment.
Interaction Entropy: A New Paradigm for Highly Efficient and Reliable Computation of Protein-Ligand Binding Free Energy.

PubMed

Duan, Lili; Liu, Xiao; Zhang, John Z H

2016-05-04

Efficient and reliable calculation of protein-ligand binding free energy is a grand challenge in computational biology and is of critical importance in drug design and many other molecular recognition problems. The main challenge lies in the calculation of entropic contribution to protein-ligand binding or interaction systems. In this report, we present a new interaction entropy method which is theoretically rigorous, computationally efficient, and numerically reliable for calculating entropic contribution to free energy in protein-ligand binding and other interaction processes. Drastically different from the widely employed but extremely expensive normal mode method for calculating entropy change in protein-ligand binding, the new method calculates the entropic component (interaction entropy or -TΔS) of the binding free energy directly from molecular dynamics simulation without any extra computational cost. Extensive study of over a dozen randomly selected protein-ligand binding systems demonstrated that this interaction entropy method is both computationally efficient and numerically reliable and is vastly superior to the standard normal mode approach. This interaction entropy paradigm introduces a novel and intuitive conceptual understanding of the entropic effect in protein-ligand binding and other general interaction systems as well as a practical method for highly efficient calculation of this effect.
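As a rough illustration of how an entropic term can be read directly off a molecular dynamics trajectory, the sketch below assumes the exponential-average form commonly quoted for this method, -TΔS = kT ln⟨exp(βΔE_int)⟩, where ΔE_int is the fluctuation of the protein-ligand interaction energy about its trajectory mean. The formula is not stated in the abstract, and the function name, units, and sample energies are hypothetical.

```python
import math

# Boltzmann constant in kcal/(mol*K); the unit choice is an assumption.
K_B = 0.0019872041


def interaction_entropy(e_int, temperature=300.0):
    """Return -T*dS (kcal/mol) from sampled interaction energies.

    e_int: interaction energies (kcal/mol), one per MD frame.
    Assumes -T*dS = kT * ln< exp(beta * (E - <E>)) >.
    """
    kt = K_B * temperature
    mean = sum(e_int) / len(e_int)
    # Average of exp(beta * fluctuation) over all frames.
    avg = sum(math.exp((e - mean) / kt) for e in e_int) / len(e_int)
    return kt * math.log(avg)
```

Because only energies already produced by the simulation are averaged, no extra calculation such as a normal mode analysis is needed. The exponential average is dominated by the largest positive fluctuations, so convergence with the number of frames should be checked in practice.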
An efficient method to compute spurious end point contributions in PO solutions. [Physical Optics]

NASA Technical Reports Server (NTRS)

Gupta, Inder J.; Burnside, Walter D.; Pistorius, Carl W. I.

1987-01-01

A method is given to compute the spurious endpoint contributions in the physical optics solution for electromagnetic scattering from conducting bodies. The method is applicable to general three-dimensional structures. The only information required to use the method is the radius of curvature of the body at the shadow boundary. Thus, the method is very efficient for numerical computations. As an illustration, the method is applied to several bodies of revolution to compute the endpoint contributions for backscattering in the case of axial incidence. It is shown that in high-frequency situations, the endpoint contributions obtained using the method are equal to the true endpoint contributions.
Multiscale Space-Time Computational Methods for Fluid-Structure Interactions

DTIC Science & Technology

2015-09-13

prescribed fully or partially, is from an actual locust, extracted from high-speed, multi-camera video recordings of the locust in a wind tunnel. We use ... With creative methods for coupling the fluid and structure, we can increase the scope and efficiency of the FSI modeling. Multiscale methods, which now ... play an important role in computational mathematics, can also increase the accuracy and efficiency of the computer modeling techniques. The main
Determination of efficiency of an aged HPGe detector for gaseous sources by self absorption correction and point source methods

NASA Astrophysics Data System (ADS)

Sarangapani, R.; Jose, M. T.; Srinivasan, T. K.; Venkatraman, B.

2017-07-01

Methods for the determination of efficiency of an aged high purity germanium (HPGe) detector for gaseous sources are presented in the paper. X-ray radiography of the detector has been performed to get detector dimensions for computational purposes. The dead layer thickness of the HPGe detector has been ascertained from experiments and Monte Carlo computations. Experimental work with standard point and liquid sources in several cylindrical geometries has been undertaken for obtaining energy-dependent efficiency. Monte Carlo simulations have been performed for computing efficiencies for point, liquid and gaseous sources. Self-absorption correction factors have been obtained using mathematical equations for volume sources and MCNP simulations. Self-absorption correction and point source methods have been used to estimate the efficiency for gaseous sources. The efficiencies determined from the present work have been used to estimate the activity of a cover gas sample of a fast reactor.
Towards a Highly Efficient Meshfree Simulation of Non-Newtonian Free Surface Ice Flow: Application to the Haut Glacier d'Arolla

NASA Astrophysics Data System (ADS)

Shcherbakov, V.; Ahlkrona, J.

2016-12-01

In this work we develop a highly efficient meshfree approach to ice sheet modeling. Traditionally, mesh-based methods such as finite element methods are employed to simulate glacier and ice sheet dynamics. These methods are mature and well developed. However, despite numerous advantages, these methods suffer from drawbacks such as the need to remesh the computational domain every time it changes shape, which significantly complicates the implementation on moving domains, or a costly assembly procedure for nonlinear problems. We introduce a novel meshfree approach that frees us from all these issues. The approach is built upon a radial basis function (RBF) method that, thanks to its meshfree nature, allows for an efficient handling of moving margins and free ice surface. RBF methods are also accurate and easy to implement. Since the formulation is stated in strong form, it allows for a substantial reduction of the computational cost associated with the linear system assembly inside the nonlinear solver. We implement a global RBF method that defines an approximation on the entire computational domain. This method exhibits high accuracy. However, it suffers from the disadvantage that the coefficient matrix is dense, and therefore the computational efficiency decreases. In order to overcome this issue we also implement a localized RBF method that rests upon a partition of unity approach to subdivide the domain into several smaller subdomains. The radial basis function partition of unity method (RBF-PUM) inherits high approximation characteristics from the global RBF method while resulting in a sparse system of equations, which substantially increases the computational efficiency. To demonstrate the usefulness of the RBF methods we model the velocity field of ice flow in the Haut Glacier d'Arolla. We assume that the flow is governed by the nonlinear Blatter-Pattyn equations. We test the methods for different basal conditions and for a free moving surface. Both RBF methods are compared with a classical finite element method in terms of accuracy and efficiency. We find that the RBF methods are more efficient than the finite element method and well suited for ice dynamics modeling, especially the partition of unity approach.
A Computationally Efficient Parallel Levenberg-Marquardt Algorithm for Large-Scale Big-Data Inversion

NASA Astrophysics Data System (ADS)

Lin, Y.; O'Malley, D.; Vesselinov, V. V.

2015-12-01

Inverse modeling seeks model parameters given a set of observed state variables. However, for many practical problems, because the observed data sets are often large and the model parameters numerous, conventional methods for solving the inverse problem can be computationally expensive. We have developed a new, computationally efficient Levenberg-Marquardt method for solving large-scale inverse problems. Levenberg-Marquardt methods require the solution of a dense linear system of equations, which can be prohibitively expensive to compute for large-scale inverse problems. Our novel method projects the original large-scale linear problem down to a Krylov subspace, such that the dimensionality of the measurements can be significantly reduced. Furthermore, instead of solving the linear system for every Levenberg-Marquardt damping parameter, we store the Krylov subspace computed when solving for the first damping parameter and recycle it for all the following damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved by using these computational techniques. We apply this new inverse modeling method to invert for a random transmissivity field. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) at each computational node in the model domain. The inversion is also aided by the use of regularization techniques. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). Julia is an advanced high-level scientific programming language that allows for efficient memory management and utilization of high-performance computational resources. Compared with a Levenberg-Marquardt method using standard linear inversion techniques, our Levenberg-Marquardt method yields a speed-up ratio of 15 in a multi-core computational environment and a speed-up ratio of 45 in a single-core computational environment. Therefore, our new inverse modeling method is a powerful tool for large-scale applications.
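To make the cost described above concrete, here is a minimal sketch of a plain Levenberg-Marquardt loop on a two-parameter toy model, y = a·exp(b·x). It shows only the baseline behavior, in which the damped normal equations are re-solved for every trial damping parameter; it does not implement the Krylov subspace projection or recycling of the paper. The model, the function names, and the damping schedule (multiply or divide by 10) are assumptions chosen for illustration.

```python
import math


def residuals(params, xs, ys):
    """Model minus data for the toy model y = a * exp(b * x)."""
    a, b = params
    return [a * math.exp(b * x) - y for x, y in zip(xs, ys)]


def jacobian(params, xs):
    """Rows of (d r/d a, d r/d b) for each data point."""
    a, b = params
    return [(math.exp(b * x), a * x * math.exp(b * x)) for x in xs]


def cost(r):
    return 0.5 * sum(v * v for v in r)


def levenberg_marquardt(xs, ys, params, lam=1e-2, iters=100):
    a, b = params
    r = residuals((a, b), xs, ys)
    for _ in range(iters):
        jac = jacobian((a, b), xs)
        # Entries of J^T J (2x2, symmetric) and of the gradient J^T r.
        h11 = sum(j0 * j0 for j0, _ in jac)
        h12 = sum(j0 * j1 for j0, j1 in jac)
        h22 = sum(j1 * j1 for _, j1 in jac)
        g1 = sum(j[0] * ri for j, ri in zip(jac, r))
        g2 = sum(j[1] * ri for j, ri in zip(jac, r))
        while True:
            # One linear solve per damping value: (J^T J + lam*I) d = -J^T r.
            d11, d22 = h11 + lam, h22 + lam
            det = d11 * d22 - h12 * h12
            da = -(d22 * g1 - h12 * g2) / det
            db = -(d11 * g2 - h12 * g1) / det
            trial = (a + da, b + db)
            r_new = residuals(trial, xs, ys)
            if cost(r_new) < cost(r):
                # Accept the step and relax the damping.
                (a, b), r = trial, r_new
                lam = max(lam / 10.0, 1e-12)
                break
            lam *= 10.0  # reject the step and damp more strongly
            if lam > 1e12:
                return a, b  # no further decrease found; treat as converged
    return a, b
```

In this toy case the solve is a 2x2 system, so repeating it is cheap. With many parameters each repeat is a dense solve, which is the step the abstract reports avoiding by reusing one Krylov subspace across damping parameters.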
Efficient computation of the Grünwald-Letnikov fractional diffusion derivative using adaptive time step memory

NASA Astrophysics Data System (ADS)

MacDonald, Christopher L.; Bhattacharya, Nirupama; Sprouse, Brian P.; Silva, Gabriel A.

2015-09-01

Computing numerical solutions to fractional differential equations can be computationally intensive due to the effect of non-local derivatives in which all previous time points contribute to the current iteration. In general, numerical approaches that depend on truncating part of the system history, while efficient, can suffer from high degrees of error and inaccuracy. Here we present an adaptive time step memory method for smooth functions applied to the Grünwald-Letnikov fractional diffusion derivative. This method is computationally efficient and results in smaller errors during numerical simulations. Sampled points along the system's history at progressively longer intervals are assumed to reflect the values of neighboring time points. By including progressively fewer points backward in time, a temporally 'weighted' history is computed that includes contributions from the entire past of the system, maintaining accuracy, but with fewer points actually calculated, greatly improving computational efficiency.
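The non-local cost described above is easy to see in a direct implementation of the standard Grünwald-Letnikov sum, D^α f(t) ≈ h^(-α) Σ_k w_k f(t - k h), with weights w_0 = 1 and w_k = w_(k-1)·(1 - (α + 1)/k). The sketch below is a baseline only: it evaluates either the full history or a plain truncation of it (the `max_terms` argument), and does not implement the adaptive time step memory of the paper. The function names and arguments are hypothetical.

```python
import math


def gl_weights(alpha, n):
    """Grünwald-Letnikov weights w_0..w_n via the standard recurrence."""
    w = [1.0]
    for k in range(1, n + 1):
        w.append(w[-1] * (1.0 - (alpha + 1.0) / k))
    return w


def gl_derivative(f, t, alpha, h, max_terms=None):
    """Approximate the order-alpha derivative of f at t, history from 0.

    max_terms, if given, truncates the history ("short memory"),
    trading accuracy for speed.
    """
    n = int(round(t / h))
    if max_terms is not None:
        n = min(n, max_terms)
    w = gl_weights(alpha, n)
    # Every retained past point contributes to the current value.
    total = sum(w[k] * f(t - k * h) for k in range(n + 1))
    return total / h ** alpha
```

For α = 1 the weights collapse to (1, -1, 0, 0, ...), so the sum reduces to the ordinary backward difference. For fractional α the weights decay slowly, so each evaluation costs O(t/h) function values unless the history is truncated or subsampled.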
A synthetic visual plane algorithm for visibility computation in consideration of accuracy and efficiency

NASA Astrophysics Data System (ADS)

Yu, Jieqing; Wu, Lixin; Hu, Qingsong; Yan, Zhigang; Zhang, Shaoliang

2017-12-01

Visibility computation is of great interest to location optimization, environmental planning, ecology, and tourism. Many algorithms have been developed for visibility computation. In this paper, we propose a novel method of visibility computation, called synthetic visual plane (SVP), to achieve better performance with respect to efficiency, accuracy, or both. The method uses a global horizon, which is a synthesis of line-of-sight information of all nearer points, to determine the visibility of a point, which makes it an accurate visibility method. We used discretization of the horizon to gain good performance in efficiency. After discretization, the accuracy and efficiency of SVP depend on the scale of discretization (i.e., zone width). The method is more accurate at smaller zone widths, but this requires a longer operating time. Users must strike a balance between accuracy and efficiency at their discretion. According to our experiments, SVP is less accurate but more efficient than R2 if the zone width is set to one grid. However, SVP becomes more accurate than R2 when the zone width is set to 1/24 grid, while it continues to perform as fast as or faster than R2. Although SVP performs worse than reference plane and depth map with respect to efficiency, it is superior in accuracy to these other two algorithms.
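The horizon idea can be illustrated in one dimension: along a single terrain profile, a point is visible when its line-of-sight slope from the observer is at least the steepest slope of every nearer point. The sketch below is this simplified profile version only, not the SVP algorithm with its discretized global horizon; the function name, the default eye height, and the sample profile are hypothetical.

```python
def visible_along_profile(elev, spacing=1.0, eye_height=1.7):
    """Visibility of each sample on a terrain profile from sample 0.

    elev: terrain elevations at equally spaced samples.
    A sample is visible if its slope from the observer's eye is not
    below the running horizon formed by all nearer samples.
    """
    eye = elev[0] + eye_height
    horizon = float("-inf")  # steepest slope seen so far
    visible = [True]         # the observer's own cell
    for i in range(1, len(elev)):
        slope = (elev[i] - eye) / (i * spacing)
        visible.append(slope >= horizon)
        horizon = max(horizon, slope)
    return visible


# A ridge at sample 2 hides samples 3 and 4; the peak at sample 5 rises above it.
print(visible_along_profile([0, 0, 5, 1, 1, 20], eye_height=0.0))
# prints [True, True, True, False, False, True]
```

Keeping one running horizon makes each profile a single pass. The trade-off the abstract describes arises when horizons from many directions are merged into zones of finite width.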

An Assessment of Artificial Compressibility and Pressure Projection Methods for Incompressible Flow Simulations

NASA Technical Reports Server (NTRS)

Kwak, Dochan; Kiris, C.; Smith, Charles A. (Technical Monitor)

1998-01-01

The performance of two commonly used numerical procedures, one based on the artificial compressibility method and the other on the pressure projection method, is compared. These formulations are selected primarily because they are designed for three-dimensional applications. The computational procedures are compared by obtaining steady state solutions of a wake vortex and unsteady solutions of a curved duct flow. For steady computations, artificial compressibility was very efficient in terms of computing time and robustness. For an unsteady flow which requires a small physical time step, the pressure projection method was found to be computationally more efficient than the artificial compressibility method. This comparison is intended to give some basis for selecting a method or a flow solution code for large three-dimensional applications where computing resources become a critical issue.
Parallel scalability and efficiency of vortex particle method for aeroelasticity analysis of bluff bodies

NASA Astrophysics Data System (ADS)

Tolba, Khaled Ibrahim; Morgenthal, Guido

2018-01-01

This paper presents an analysis of the scalability and efficiency of a simulation framework based on the vortex particle method. The code is applied for the numerical aerodynamic analysis of line-like structures. The numerical code runs on multicore CPU and GPU architectures using OpenCL framework. The focus of this paper is the analysis of the parallel efficiency and scalability of the method being applied to an engineering test case, specifically the aeroelastic response of a long-span bridge girder at the construction stage. The target is to assess the optimal configuration and the required computer architecture, such that it becomes feasible to efficiently utilise the method within the computational resources available for a regular engineering office. The simulations and the scalability analysis are performed on a regular gaming type computer.
An efficient algorithm to compute marginal posterior genotype probabilities for every member of a pedigree with loops

PubMed Central

2009-01-01

Background: Marginal posterior genotype probabilities need to be computed for genetic analyses such as genetic counseling in humans and selective breeding in animal and plant species. Methods: In this paper, we describe a peeling based, deterministic, exact algorithm to compute efficiently genotype probabilities for every member of a pedigree with loops without recourse to junction-tree methods from graph theory. The efficiency in computing the likelihood by peeling comes from storing intermediate results in multidimensional tables called cutsets. Computing marginal genotype probabilities for individual i requires recomputing the likelihood for each of the possible genotypes of individual i. This can be done efficiently by storing intermediate results in two types of cutsets called anterior and posterior cutsets and reusing these intermediate results to compute the likelihood. Examples: A small example is used to illustrate the theoretical concepts discussed in this paper, and marginal genotype probabilities are computed at a monogenic disease locus for every member in a real cattle pedigree. PMID:19958551
Multiresolution molecular mechanics: Implementation and efficiency

NASA Astrophysics Data System (ADS)

Biyikli, Emre; To, Albert C.

2017-01-01

Atomistic/continuum coupling methods combine accurate atomistic methods and efficient continuum methods to simulate the behavior of highly ordered crystalline systems. Coupled methods utilize the advantages of both approaches to simulate systems at a lower computational cost, while retaining the accuracy associated with atomistic methods. Many concurrent atomistic/continuum coupling methods have been proposed in the past; however, their true computational efficiency has not been demonstrated. The present work presents an efficient implementation of a concurrent coupling method called the Multiresolution Molecular Mechanics (MMM) for serial, parallel, and adaptive analysis. First, we present the features of the software implemented along with the associated technologies. The scalability of the software implementation is demonstrated, and the competing effects of multiscale modeling and parallelization are discussed. Then, the algorithms contributing to the efficiency of the software are presented. These include algorithms for eliminating latent ghost atoms from calculations and measurement-based dynamic balancing of parallel workload. The efficiency improvements made by these algorithms are demonstrated by benchmark tests. The efficiency of the software is found to be on par with LAMMPS, a state-of-the-art Molecular Dynamics (MD) simulation code, when performing full atomistic simulations. Speed-up of the MMM method is shown to be directly proportional to the reduction of the number of the atoms visited in force computation. Finally, an adaptive MMM analysis on a nanoindentation problem, containing over a million atoms, is performed, yielding an improvement of 6.3-8.5 times in efficiency, over the full atomistic MD method. For the first time, the efficiency of a concurrent atomistic/continuum coupling method is comprehensively investigated and demonstrated.
Fast sweeping methods for hyperbolic systems of conservation laws at steady state II

NASA Astrophysics Data System (ADS)

Engquist, Björn; Froese, Brittany D.; Tsai, Yen-Hsi Richard

2015-04-01

The idea of using fast sweeping methods for solving stationary systems of conservation laws has previously been proposed for efficiently computing solutions with sharp shocks. We further develop these methods to allow for a more challenging class of problems including problems with sonic points, shocks originating in the interior of the domain, rarefaction waves, and two-dimensional systems. We show that fast sweeping methods can produce higher-order accuracy. Computational results validate the claims of accuracy, sharp shock curves, and optimal computational efficiency.
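For readers unfamiliar with the sweeping idea, the sketch below shows it in its simplest classical setting: the 1D eikonal equation |u'(x)| = 1 with u = 0 at source nodes, solved by Gauss-Seidel updates in alternating sweep directions. This is a swapped-in textbook case chosen for brevity, not the conservation-law solver developed in the paper; the function name and the grid are hypothetical.

```python
def fast_sweep_1d(n, sources, h=1.0):
    """Distance to the nearest source node on a uniform 1D grid.

    Solves |u'| = 1 with u = 0 at the source indices by one
    left-to-right and one right-to-left Gauss-Seidel sweep.
    """
    inf = float("inf")
    u = [0.0 if i in sources else inf for i in range(n)]

    def update(i):
        left = u[i - 1] if i > 0 else inf
        right = u[i + 1] if i < n - 1 else inf
        # Upwind update: take information from the smaller neighbor.
        u[i] = min(u[i], min(left, right) + h)

    for i in range(n):            # sweep in the +x direction
        update(i)
    for i in reversed(range(n)):  # sweep in the -x direction
        update(i)
    return u
```

Each sweep propagates information along one family of characteristics using already-updated neighbors, which is why a fixed, small number of sweeps suffices in this case instead of many Jacobi-style iterations.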
Tensor Factorization for Low-Rank Tensor Completion.

PubMed

Zhou, Pan; Lu, Canyi; Lin, Zhouchen; Zhang, Chao

2018-03-01

Recently, a tensor nuclear norm (TNN) based method was proposed to solve the tensor completion problem, which has achieved state-of-the-art performance on image and video inpainting tasks. However, it requires computing tensor singular value decomposition (t-SVD), which costs much computation and thus cannot efficiently handle tensor data, due to its natural large scale. Motivated by TNN, we propose a novel low-rank tensor factorization method for efficiently solving the 3-way tensor completion problem. Our method preserves the low-rank structure of a tensor by factorizing it into the product of two tensors of smaller sizes. In the optimization process, our method only needs to update two smaller tensors, which can be more efficiently conducted than computing t-SVD. Furthermore, we prove that the proposed alternating minimization algorithm can converge to a Karush-Kuhn-Tucker point. Experimental results on the synthetic data recovery, image and video inpainting tasks clearly demonstrate the superior performance and efficiency of our developed method over state-of-the-arts including the TNN and matricization methods.
A Computationally Efficient Method for Polyphonic Pitch Estimation

NASA Astrophysics Data System (ADS)

Zhou, Ruohua; Reiss, Joshua D.; Mattavelli, Marco; Zoia, Giorgio

2009-12-01

This paper presents a computationally efficient method for polyphonic pitch estimation. The method employs the Fast Resonator Time-Frequency Image (RTFI) as the basic time-frequency analysis tool. The approach is composed of two main stages. First, a preliminary pitch estimation is obtained by means of a simple peak-picking procedure in the pitch energy spectrum. Such spectrum is calculated from the original RTFI energy spectrum according to harmonic grouping principles. Then the incorrect estimations are removed according to spectral irregularity and knowledge of the harmonic structures of the music notes played on commonly used music instruments. The new approach is compared with a variety of other frame-based polyphonic pitch estimation methods, and results demonstrate the high performance and computational efficiency of the approach.
Efficient Algorithms for Estimating the Absorption Spectrum within Linear Response TDDFT

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brabec, Jiri; Lin, Lin; Shao, Meiyue

We present two iterative algorithms for approximating the absorption spectrum of molecules within the linear response time-dependent density functional theory (TDDFT) framework. These methods do not attempt to compute eigenvalues or eigenvectors of the linear response matrix. They are designed to approximate the absorption spectrum as a function directly. They take advantage of the special structure of the linear response matrix. Neither method requires the linear response matrix to be constructed explicitly. They only require a procedure that performs the multiplication of the linear response matrix with a vector. These methods can also be easily modified to efficiently estimate the density of states (DOS) of the linear response matrix without computing the eigenvalues of this matrix. We show by computational experiments that the methods proposed in this paper can be much more efficient than methods that are based on the exact diagonalization of the linear response matrix. We show that they can also be more efficient than real-time TDDFT simulations. We compare the pros and cons of these methods in terms of their accuracy as well as their computational and storage cost.
An efficient method for hybrid density functional calculation with spin-orbit coupling

NASA Astrophysics Data System (ADS)

Wang, Maoyuan; Liu, Gui-Bin; Guo, Hong; Yao, Yugui

2018-03-01

In first-principles calculations, hybrid functionals are often used to improve accuracy over local exchange-correlation functionals. A drawback is that evaluating a hybrid functional requires significantly more computing effort. When spin-orbit coupling (SOC) is taken into account, the non-collinear spin structure increases the computing effort by at least a factor of eight. As a result, hybrid functional calculations with SOC are intractable in most cases. In this paper, we present an approximate solution to this problem by developing an efficient method based on a mixed linear combination of atomic orbitals (LCAO) scheme. We demonstrate the power of this method using several examples and show that the results compare very well with those of direct hybrid functional calculations with SOC, yet the method requires only a computing effort similar to that without SOC. The presented technique provides a good balance between computing efficiency and accuracy, and it can be extended to magnetic materials.
Geospatial Representation, Analysis and Computing Using Bandlimited Functions

DTIC Science & Technology

2010-02-19

navigation of aircraft and missiles require detailed representations of gravity and efficient methods for determining orbits and trajectories. However, many...efficient on today’s computers. Under this grant new, computationally efficient, localized representations of gravity have been developed and tested. As a...step in developing a new approach to estimating gravitational potentials, a multiresolution representation for gravity estimation has been proposed
Efficient Computational Prototyping of Mixed Technology Microfluidic Components and Systems

DTIC Science & Technology

2002-08-01

AFRL-IF-RS-TR-2002-190 Final Technical Report August 2002 EFFICIENT COMPUTATIONAL PROTOTYPING OF MIXED TECHNOLOGY MICROFLUIDIC...SUBTITLE EFFICIENT COMPUTATIONAL PROTOTYPING OF MIXED TECHNOLOGY MICROFLUIDIC COMPONENTS AND SYSTEMS 6. AUTHOR(S) Narayan R. Aluru, Jacob White...Aided Design (CAD) tools for microfluidic components and systems were developed in this effort. Innovative numerical methods and algorithms for mixed
Efficient forced vibration reanalysis method for rotating electric machines

NASA Astrophysics Data System (ADS)

Saito, Akira; Suzuki, Hiromitsu; Kuroishi, Masakatsu; Nakai, Hideo

2015-01-01

Rotating electric machines are subject to forced vibration from magnetic force excitation with a wide-band frequency spectrum that depends on the operating conditions. Therefore, when designing electric machines, it is essential to compute the vibration response of the machines at various operating conditions efficiently and accurately. This paper presents an efficient frequency-domain vibration analysis method for electric machines. The method enables efficient re-analysis of the vibration response of electric machines at various operating conditions without the need to re-compute the harmonic response by finite element analyses. The theoretical background of the proposed method is provided, which is based on the modal reduction of the magnetic force excitation by a set of amplitude-modulated standing waves. The method is applied to the forced response vibration of an interior permanent magnet motor at a fixed operating condition. The results computed by the proposed method agree very well with those computed by conventional harmonic response analysis using FEA. The proposed method is then applied to the spin-up test condition to demonstrate its applicability to various operating conditions. It is observed that the proposed method can successfully be applied to the spin-up test conditions, and that the measured dominant frequency peaks in the frequency response are well captured by the proposed approach.
76 FR 21673 - Alternative Efficiency Determination Methods and Alternate Rating Methods

Federal Register 2010, 2011, 2012, 2013, 2014

2011-04-18

... EERE-2011-BP-TP-00024] RIN 1904-AC46 Alternative Efficiency Determination Methods and Alternate Rating Methods AGENCY: Office of Energy Efficiency and Renewable Energy, Department of Energy. ACTION: Notice of... and data related to the use of computer simulations, mathematical methods, and other alternative...
GPU-accelerated element-free reverse-time migration with Gauss points partition

NASA Astrophysics Data System (ADS)

Zhou, Zhen; Jia, Xiaofeng; Qiang, Xiaodong

2018-06-01

An element-free method (EFM) has been demonstrated successfully in elasticity, heat conduction and fatigue crack growth problems. We present the theory of EFM and its numerical applications in seismic modelling and reverse time migration (RTM). Compared with the finite difference method and the finite element method, the EFM has unique advantages: (1) independence of grids in computation and (2) lower expense and more flexibility (because only the information of the nodes and the boundary of the concerned area is required). However, in EFM, due to improper computation and storage of some large sparse matrices, such as the mass matrix and the stiffness matrix, the method is difficult to apply to seismic modelling and RTM for a large velocity model. To solve the problem of storage and computation efficiency, we propose a concept of Gauss points partition and utilise the graphics processing unit to improve the computational efficiency. We employ the compressed sparse row format to compress the intermediate large sparse matrices and attempt to simplify the operations by solving the linear equations with CULA solver. To improve the computation efficiency further, we introduce the concept of the lumped mass matrix. Numerical experiments indicate that the proposed method is accurate and more efficient than the regular EFM.
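Two of the ideas named in this abstract, compressed sparse row (CSR) storage and the lumped mass matrix, can be illustrated with a small sketch. It is written in plain Python for readability rather than for GPU execution, the helper names (`to_csr`, `csr_matvec`, `lumped_mass`) are hypothetical, and row-sum lumping is assumed here as one common lumping scheme; the abstract does not say which scheme the authors use.

```python
def to_csr(dense):
    """Convert a dense list-of-lists matrix to CSR arrays (data, indices, indptr)."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                data.append(v)      # nonzero value
                indices.append(j)   # its column index
        indptr.append(len(data))    # where the next row starts in data/indices
    return data, indices, indptr

def csr_matvec(data, indices, indptr, x):
    """Multiply a CSR matrix by a vector, touching only stored nonzeros."""
    return [
        sum(data[p] * x[indices[p]] for p in range(indptr[i], indptr[i + 1]))
        for i in range(len(indptr) - 1)
    ]

def lumped_mass(data, indices, indptr):
    """Row-sum lumping (assumed scheme): one diagonal entry per row."""
    return [sum(data[indptr[i]:indptr[i + 1]]) for i in range(len(indptr) - 1)]

# Hypothetical 3 x 3 consistent mass matrix, for illustration only.
M = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
csr = to_csr(M)
diag = lumped_mass(*csr)  # [5.0, 6.0, 5.0]
```

Because a lumped mass matrix is diagonal, applying its inverse reduces to an element-wise division, which is one reason lumping can lower the cost of each time step.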
DOE Office of Scientific and Technical Information (OSTI.GOV)

Biyikli, Emre; To, Albert C., E-mail: albertto@pitt.edu

Atomistic/continuum coupling methods combine accurate atomistic methods and efficient continuum methods to simulate the behavior of highly ordered crystalline systems. Coupled methods utilize the advantages of both approaches to simulate systems at a lower computational cost, while retaining the accuracy associated with atomistic methods. Many concurrent atomistic/continuum coupling methods have been proposed in the past; however, their true computational efficiency has not been demonstrated. The present work presents an efficient implementation of a concurrent coupling method called the Multiresolution Molecular Mechanics (MMM) for serial, parallel, and adaptive analysis. First, we present the features of the software implemented along with the associated technologies. The scalability of the software implementation is demonstrated, and the competing effects of multiscale modeling and parallelization are discussed. Then, the algorithms contributing to the efficiency of the software are presented. These include algorithms for eliminating latent ghost atoms from calculations and measurement-based dynamic balancing of parallel workload. The efficiency improvements made by these algorithms are demonstrated by benchmark tests. The efficiency of the software is found to be on par with LAMMPS, a state-of-the-art Molecular Dynamics (MD) simulation code, when performing full atomistic simulations. Speed-up of the MMM method is shown to be directly proportional to the reduction of the number of the atoms visited in force computation. Finally, an adaptive MMM analysis on a nanoindentation problem, containing over a million atoms, is performed, yielding an improvement of 6.3–8.5 times in efficiency, over the full atomistic MD method. For the first time, the efficiency of a concurrent atomistic/continuum coupling method is comprehensively investigated and demonstrated.
Hamiltonian Monte Carlo acceleration using surrogate functions with random bases.

PubMed

Zhang, Cheng; Shahbaba, Babak; Zhao, Hongkai

2017-11-01

For big data analysis, the high computational cost of Bayesian methods often limits their application in practice. In recent years, there have been many attempts to improve the computational efficiency of Bayesian inference. Here we propose an efficient and scalable computational technique for a state-of-the-art Markov chain Monte Carlo method, namely Hamiltonian Monte Carlo. The key idea is to explore and exploit the structure and regularity in parameter space for the underlying probabilistic model to construct an effective approximation of its geometric properties. To this end, we build a surrogate function to approximate the target distribution using properly chosen random bases and an efficient optimization process. The resulting method provides a flexible, scalable, and efficient sampling algorithm, which converges to the correct target distribution. We show that by choosing the basis functions and optimization process differently, our method can be related to other approaches for the construction of surrogate functions such as generalized additive models or Gaussian process models. Experiments based on simulated and real data show that our approach leads to substantially more efficient sampling algorithms compared to existing state-of-the-art methods.
A computationally efficient parallel Levenberg-Marquardt algorithm for highly parameterized inverse model analyses

NASA Astrophysics Data System (ADS)

Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.

2016-09-01

Inverse modeling seeks model parameters given a set of observations. However, for practical problems because the number of measurements is often large and the model parameters are also numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2-D and a random hydraulic conductivity field in 3-D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multicore computational environment. Therefore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate to large-scale problems.
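The subspace recycling described in this abstract rests on a simple fact: the Krylov subspace of J^T J + λI does not depend on the damping parameter λ, so a basis built once can serve every λ. The sketch below illustrates that idea with a plain Lanczos process in Python. It is not the authors' Julia/MADS code, the function names `lanczos` and `lm_steps` are hypothetical, and the use of Lanczos with full reorthogonalization is an assumption made for clarity.

```python
import numpy as np

def lanczos(matvec, b, k, tol=1e-12):
    """Orthonormal basis V and tridiagonal T = V^T A V for the Krylov subspace K_k(A, b)."""
    V = np.zeros((b.size, k))
    alpha, beta = [], []
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = matvec(V[:, j])
        alpha.append(V[:, j] @ w)
        # Full reorthogonalization keeps the basis numerically orthonormal.
        w = w - V[:, :j + 1] @ (V[:, :j + 1].T @ w)
        nrm = np.linalg.norm(w)
        if j + 1 == k or nrm < tol:
            break
        beta.append(nrm)
        V[:, j + 1] = w / nrm
    m = len(alpha)
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return V[:, :m], T

def lm_steps(J, r, lambdas, k):
    """Approximate LM steps solving (J^T J + lam*I) d = -J^T r for several lam.

    The Krylov basis is built once and reused for every damping parameter,
    since shifting by lam*I leaves the subspace unchanged.
    """
    g = J.T @ r
    V, T = lanczos(lambda v: J.T @ (J @ v), g, k)
    rhs = np.zeros(T.shape[0])
    rhs[0] = -np.linalg.norm(g)  # projection of -g onto the basis
    eye = np.eye(T.shape[0])
    return [V @ np.linalg.solve(T + lam * eye, rhs) for lam in lambdas]

rng = np.random.default_rng(1)
J = rng.standard_normal((20, 6))  # hypothetical Jacobian
r = rng.standard_normal(20)       # hypothetical residual vector
lams = [0.1, 1.0, 10.0]
steps = lm_steps(J, r, lams, k=6)
```

Each additional damping parameter then costs only a small k × k solve instead of a new factorization of the full system; in practice k would be chosen much smaller than the number of parameters.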
Multiscale Reactive Molecular Dynamics

DTIC Science & Technology

2012-08-15

biology cannot be described without considering electronic and nuclear-level dynamics and their coupling to slower, cooperative motions of the system ...coupling to slower, cooperative motions of the system . These inherently multiscale problems require computationally efficient and accurate methods to...condensed phase systems with computational efficiency orders of magnitudes greater than currently possible with ab initio simulation methods, thus
Word aligned bitmap compression method, data structure, and apparatus

DOEpatents

Wu, Kesheng; Shoshani, Arie; Otoo, Ekow

2004-12-14

The Word-Aligned Hybrid (WAH) bitmap compression method and data structure is a relatively efficient method for searching and performing logical, counting, and pattern location operations upon large datasets. The technique is comprised of a data structure and methods that are optimized for computational efficiency by using the WAH compression method, which typically takes advantage of the target computing system's native word length. WAH is particularly apropos to infrequently varying databases, including those found in the on-line analytical processing (OLAP) industry, due to the increased computational efficiency of the WAH compressed bitmap index. Some commercial database products already include some version of a bitmap index, which could possibly be replaced by the WAH bitmap compression techniques for potentially increased operation speed, as well as increased efficiencies in constructing compressed bitmaps. Combined together, this technique may be particularly useful for real-time business intelligence. Additional WAH applications may include scientific modeling, such as climate and combustion simulations, to minimize search time for analysis and subsequent data visualization.
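A simplified sketch of word-aligned run-length encoding in the spirit of WAH may help make the idea concrete. It assumes 32-bit words in which the top bit marks a fill word, the next bit gives the fill value, and the low 30 bits count 31-bit groups; literal words carry 31 raw bits. These layout details and the function names are assumptions for illustration, and the sketch omits parts of the patented method such as the handling of a trailing partial word and logical operations on compressed data.

```python
GROUP = 31  # payload bits carried by one 32-bit word

def wah_encode(bits):
    """Encode a list of 0/1 values into 32-bit words (simplified WAH-style)."""
    padded = list(bits) + [0] * (-len(bits) % GROUP)
    words = []
    for i in range(0, len(padded), GROUP):
        value = int("".join(map(str, padded[i:i + GROUP])), 2)
        if value == 0 or value == (1 << GROUP) - 1:
            # Fill word: top bit set, next bit is the fill value, low 30 bits count groups.
            header = (1 << 31) | ((1 if value else 0) << 30)
            same_fill = words and (words[-1] >> 30) == (header >> 30)
            if same_fill and (words[-1] & 0x3FFFFFFF) < 0x3FFFFFFF:
                words[-1] += 1          # extend the current run by one group
            else:
                words.append(header | 1)
        else:
            words.append(value)         # literal word: top bit is 0
    return words

def wah_decode(words, n):
    """Expand encoded words back to the first n bits."""
    bits = []
    for w in words:
        if w >> 31:
            bits.extend([(w >> 30) & 1] * (GROUP * (w & 0x3FFFFFFF)))
        else:
            bits.extend(int(c) for c in format(w, "031b"))
    return bits[:n]

# Hypothetical bitmap: long runs of equal bits compress into single words.
bitmap = [0] * 93 + [1, 0, 1] + [0] * 28 + [1] * 62
encoded = wah_encode(bitmap)  # 186 bits stored in 3 words
```

Keeping every run aligned to the machine word is what lets operations proceed word by word without bit-level unpacking.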
Projected role of advanced computational aerodynamic methods at the Lockheed-Georgia company

NASA Technical Reports Server (NTRS)

Lores, M. E.

1978-01-01

Experience with advanced computational methods being used at the Lockheed-Georgia Company to aid in the evaluation and design of new and modified aircraft indicates that large and specialized computers will be needed to make advanced three-dimensional viscous aerodynamic computations practical. The Numerical Aerodynamic Simulation Facility should be used to provide a tool for designing better aerospace vehicles while at the same time reducing development costs by performing computations using Navier-Stokes equations solution algorithms and permitting less sophisticated but nevertheless complex calculations to be made efficiently. Configuration definition procedures and data output formats can probably best be defined in cooperation with industry, therefore, the computer should handle many remote terminals efficiently. The capability of transferring data to and from other computers needs to be provided. Because of the significant amount of input and output associated with 3-D viscous flow calculations and because of the exceedingly fast computation speed envisioned for the computer, special attention should be paid to providing rapid, diversified, and efficient input and output.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rebay, S.

This work is devoted to the description of an efficient unstructured mesh generation method entirely based on the Delaunay triangulation. The distinctive characteristic of the proposed method is that point positions and connections are computed simultaneously. This result is achieved by taking advantage of the sequential way in which the Bowyer-Watson algorithm computes the Delaunay triangulation. Two methods are proposed which have great geometrical flexibility, in that they allow us to treat domains of arbitrary shape and topology and to generate arbitrarily nonuniform meshes. The methods are computationally efficient and are applicable both in two and three dimensions. 11 refs., 20 figs., 1 tab.
Storage and computationally efficient permutations of factorized covariance and square-root information arrays

NASA Technical Reports Server (NTRS)

Muellerschoen, R. J.

1988-01-01

A unified method to permute vector stored Upper triangular Diagonal factorized covariance and vector stored upper triangular Square Root Information arrays is presented. The method involves cyclic permutation of the rows and columns of the arrays and retriangularization with fast (slow) Givens rotations (reflections). Minimal computation is performed, and a one dimensional scratch array is required. To make the method efficient for large arrays on a virtual memory machine, computations are arranged so as to avoid expensive paging faults. This method is potentially important for processing large volumes of radio metric data in the Deep Space Network.
Novel scheme to compute chemical potentials of chain molecules on a lattice

NASA Astrophysics Data System (ADS)

Mooij, G. C. A. M.; Frenkel, D.

We present a novel method that allows efficient computation of the total number of allowed conformations of a chain molecule in a dense phase. Using this method, it is possible to estimate the chemical potential of such a chain molecule. We have tested the present method in simulations of a two-dimensional monolayer of chain molecules on a lattice (Whittington-Chapman model) and compared it with existing schemes to compute the chemical potential. We find that the present approach is two to three orders of magnitude faster than the most efficient of the existing methods.
Automated Measurement of Patient-Specific Tibial Slopes from MRI

PubMed Central

Amerinatanzi, Amirhesam; Summers, Rodney K.; Ahmadi, Kaveh; Goel, Vijay K.; Hewett, Timothy E.; Nyman, Edward

2017-01-01

Background: Multi-planar proximal tibial slopes may be associated with increased likelihood of osteoarthritis and anterior cruciate ligament injury, due in part to their role in checking the anterior-posterior stability of the knee. Established methods suffer repeatability limitations and lack computational efficiency for intuitive clinical adoption. The aims of this study were to develop a novel automated approach and to compare the repeatability and computational efficiency of the approach against previously established methods. Methods: Tibial slope geometries were obtained via MRI and measured using an automated Matlab-based approach. Data were compared for repeatability and evaluated for computational efficiency. Results: Mean lateral tibial slope (LTS) for females (7.2°) was greater than for males (1.66°). Mean LTS in the lateral concavity zone was greater for females (7.8° for females, 4.2° for males). Mean medial tibial slope (MTS) for females was greater (9.3° vs. 4.6°). Along the medial concavity zone, female subjects demonstrated greater MTS. Conclusion: The automated method was more repeatable and computationally efficient than previously identified methods and may aid in the clinical assessment of knee injury risk, inform surgical planning, and implant design efforts. PMID:28952547
Evaluation of the discrete vortex wake cross flow model using vector computers. Part 1: Theory and application

NASA Technical Reports Server (NTRS)

1979-01-01

The objective of the current program was to modify a discrete vortex wake method to efficiently compute the aerodynamic forces and moments on high fineness ratio bodies (f approximately 10.0). The approach is to increase computational efficiency by structuring the program to take advantage of new computer vector software and by developing new algorithms when vector software cannot be used efficiently. An efficient program was written and substantial savings achieved. Several test cases were run for fineness ratios up to f = 16.0 and angles of attack up to 50 degrees.
A POSTERIORI ERROR ANALYSIS OF TWO STAGE COMPUTATION METHODS WITH APPLICATION TO EFFICIENT DISCRETIZATION AND THE PARAREAL ALGORITHM.

PubMed

Chaudhry, Jehanzeb Hameed; Estep, Don; Tavener, Simon; Carey, Varis; Sandelin, Jeff

2016-01-01

We consider numerical methods for initial value problems that employ a two stage approach consisting of solution on a relatively coarse discretization followed by solution on a relatively fine discretization. Examples include adaptive error control, parallel-in-time solution schemes, and efficient solution of adjoint problems for computing a posteriori error estimates. We describe a general formulation of two stage computations then perform a general a posteriori error analysis based on computable residuals and solution of an adjoint problem. The analysis accommodates various variations in the two stage computation and in formulation of the adjoint problems. We apply the analysis to compute "dual-weighted" a posteriori error estimates, to develop novel algorithms for efficient solution that take into account cancellation of error, and to the Parareal Algorithm. We test the various results using several numerical examples.
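The Parareal Algorithm mentioned here is a standard example of a two stage computation: a cheap coarse propagator is corrected by an accurate fine propagator on each time slice. The sketch below shows the basic iteration for a scalar ODE, assuming explicit Euler for both propagators; the function names and step counts are hypothetical, and the error analysis of the paper is not reproduced.

```python
def euler_steps(u, t0, t1, n, f):
    """Explicit Euler with n steps from t0 to t1."""
    h = (t1 - t0) / n
    for i in range(n):
        u = u + h * f(t0 + i * h, u)
    return u

def parareal(f, u0, T, n_slices, n_iter, coarse_steps=1, fine_steps=50):
    """Parareal iteration U[n+1] <- G(U_new[n]) + F(U_old[n]) - G(U_old[n])."""
    ts = [T * i / n_slices for i in range(n_slices + 1)]
    G = lambda u, a, b: euler_steps(u, a, b, coarse_steps, f)  # coarse stage
    F = lambda u, a, b: euler_steps(u, a, b, fine_steps, f)    # fine stage
    # Initial guess: one serial sweep with the coarse propagator.
    U = [u0]
    for n in range(n_slices):
        U.append(G(U[n], ts[n], ts[n + 1]))
    for _ in range(n_iter):
        # The fine solves on different slices are independent and could run in parallel.
        F_old = [F(U[n], ts[n], ts[n + 1]) for n in range(n_slices)]
        G_old = [G(U[n], ts[n], ts[n + 1]) for n in range(n_slices)]
        new = [u0]
        for n in range(n_slices):
            new.append(G(new[n], ts[n], ts[n + 1]) + F_old[n] - G_old[n])
        U = new
    return U

decay = lambda t, u: -u  # hypothetical test problem u' = -u
U = parareal(decay, 1.0, 2.0, n_slices=4, n_iter=4)
```

After as many iterations as there are time slices, the result coincides (up to rounding) with the serial fine solution; the practical benefit comes from stopping after far fewer iterations.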
Efficient integration method for fictitious domain approaches

NASA Astrophysics Data System (ADS)

Duczek, Sascha; Gabbert, Ulrich

2015-10-01

In the current article, we present an efficient and accurate numerical method for the integration of the system matrices in fictitious domain approaches such as the finite cell method (FCM). In the framework of the FCM, the physical domain is embedded in a geometrically larger domain of simple shape which is discretized using a regular Cartesian grid of cells. Therefore, a spacetree-based adaptive quadrature technique is normally deployed to resolve the geometry of the structure. Depending on the complexity of the structure under investigation this method accounts for most of the computational effort. To reduce the computational costs for computing the system matrices an efficient quadrature scheme based on the divergence theorem (Gauß-Ostrogradsky theorem) is proposed. Using this theorem the dimension of the integral is reduced by one, i.e. instead of solving the integral for the whole domain only its contour needs to be considered. In the current paper, we present the general principles of the integration method and its implementation. The results to several two-dimensional benchmark problems highlight its properties. The efficiency of the proposed method is compared to conventional spacetree-based integration techniques.
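The dimension reduction by the divergence theorem can be seen in a small two-dimensional example: integrals over a polygon are evaluated from its boundary vertices alone. The sketch below computes the area and the first moment in x of a simple polygon this way. The function name `polygon_moments` is hypothetical, and the example only illustrates the principle, not the quadrature scheme of the paper.

```python
def polygon_moments(vertices):
    """Area and integral of x over a simple polygon, using only its contour.

    vertices: list of (x, y) pairs in counter-clockwise order.
    Both sums are boundary integrals obtained from the divergence theorem.
    """
    area = 0.0
    moment_x = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area += cross / 2.0                   # integral of 1 over the domain
        moment_x += (x0 + x1) * cross / 6.0   # integral of x over the domain
    return area, moment_x

unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
area, moment_x = polygon_moments(unit_square)
```

For the unit square this gives an area of 1 and a first moment of 0.5, with work proportional to the number of boundary edges rather than the number of interior cells.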
Remote sensing image ship target detection method based on visual attention model

NASA Astrophysics Data System (ADS)

Sun, Yuejiao; Lei, Wuhu; Ren, Xiaodong

2017-11-01

Traditional methods of detecting ship targets in remote sensing images mostly use a sliding window to search the whole image exhaustively. However, the target usually occupies only a small fraction of the image, so this approach has high computational complexity for large-format visible image data. The bottom-up selective attention mechanism can selectively allocate computing resources according to visual stimuli, thus improving computational efficiency and reducing the difficulty of analysis. On this basis, a method of ship target detection in remote sensing images based on a visual attention model is proposed in this paper. The experimental results show that the proposed method reduces computational complexity while improving detection accuracy, and improves the detection efficiency of ship targets in remote sensing images.
An efficient computational method for solving nonlinear stochastic Itô integral equations: Application for stochastic problems in physics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Heydari, M.H., E-mail: heydari@stu.yazd.ac.ir; The Laboratory of Quantum Information Processing, Yazd University, Yazd; Hooshmandasl, M.R., E-mail: hooshmandasl@yazd.ac.ir

Because of the nonlinearity, closed-form solutions of many important stochastic functional equations are virtually impossible to obtain. Thus, numerical solutions are a viable alternative. In this paper, a new computational method based on the generalized hat basis functions together with their stochastic operational matrix of Itô-integration is proposed for solving nonlinear stochastic Itô integral equations in large intervals. In the proposed method, a new technique for computing nonlinear terms in such problems is presented. The main advantage of the proposed method is that it transforms problems under consideration into nonlinear systems of algebraic equations which can be simply solved. Error analysis of the proposed method is investigated and also the efficiency of this method is shown on some concrete examples. The obtained results reveal that the proposed method is very accurate and efficient. As two useful applications, the proposed method is applied to obtain approximate solutions of the stochastic population growth models and stochastic pendulum problem.
hp-Adaptive time integration based on the BDF for viscous flows

NASA Astrophysics Data System (ADS)

Hay, A.; Etienne, S.; Pelletier, D.; Garon, A.

2015-06-01

This paper presents a procedure based on the Backward Differentiation Formulas of order 1 to 5 to obtain efficient time integration of the incompressible Navier-Stokes equations. The adaptive algorithm performs both stepsize and order selections to control respectively the solution accuracy and the computational efficiency of the time integration process. The stepsize selection (h-adaptivity) is based on a local error estimate and an error controller to guarantee that the numerical solution accuracy is within a user prescribed tolerance. The order selection (p-adaptivity) relies on the idea that low-accuracy solutions can be computed efficiently by low order time integrators while accurate solutions require high order time integrators to keep computational time low. The selection is based on a stability test that detects growing numerical noise and deems a method of order p stable if there is no method of lower order that delivers the same solution accuracy for a larger stepsize. Hence, it guarantees both that (1) the integration method in use operates inside its stability region and (2) the time integration procedure is computationally efficient. The proposed time integration procedure also features time-step rejection and quarantine mechanisms, a modified Newton method with a predictor, and dense output techniques to compute the solution at off-step points.
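The stepsize selection described here can be illustrated with the elementary textbook controller, which assumes that the local error of an order-p method scales like h^(p+1). The sketch below is that generic controller, not the one used in the paper; the function names, the safety factor, and the growth limits are hypothetical defaults.

```python
def next_step_size(h, err, tol, order,
                   safety=0.9, min_factor=0.2, max_factor=5.0):
    """Propose the next stepsize from a local error estimate.

    Assumes err ~ C * h**(order + 1), so scaling h by
    (tol / err)**(1 / (order + 1)) would bring the error to tol.
    """
    if err == 0.0:
        return h * max_factor
    factor = safety * (tol / err) ** (1.0 / (order + 1))
    # Limit how fast the stepsize may grow or shrink between steps.
    return h * min(max_factor, max(min_factor, factor))

def accept_step(err, tol):
    """A step is rejected and retried when the error estimate exceeds tol."""
    return err <= tol

# Hypothetical numbers: an order-1 step whose error is 16 times the tolerance.
h_new = next_step_size(h=0.1, err=1.6e-3, tol=1.0e-4, order=1)
```

With these numbers the step would be rejected and retried with a stepsize reduced by a factor of about 0.9 × 0.25.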
Acceleration of FDTD mode solver by high-performance computing techniques.

PubMed

Han, Lin; Xi, Yanping; Huang, Wei-Ping

2010-06-21

A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for calculation of both real guided and complex leaky modes of typical optical waveguides against the benchmark finite-difference (FD) eigen mode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver, with more than 30 times improvement in computational efficiency in comparison with the conventional FDTD mode solver running on the CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is of the same order of magnitude as that of the standard finite-difference eigen mode solver, yet it requires much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when the conventional eigen value mode solvers are no longer applicable due to memory limitation.
Improved Quasi-Newton method via PSB update for solving systems of nonlinear equations

NASA Astrophysics Data System (ADS)

Mamat, Mustafa; Dauda, M. K.; Waziri, M. Y.; Ahmad, Fadhilah; Mohamad, Fatma Susilawati

2016-10-01

The Newton method has several shortcomings, including the need to compute the Jacobian matrix, which may be difficult or even impossible, and the need to solve the Newton system at every iteration. A common drawback of some quasi-Newton methods is that they must compute and store an n × n matrix at each iteration, which is computationally costly for large-scale problems. To overcome these drawbacks, an improved method for solving systems of nonlinear equations via the PSB (Powell-Symmetric-Broyden) update is proposed. In the proposed method, the approximate Jacobian inverse Hk of PSB is updated in a way that improves its efficiency and requires little memory storage, which is the main aim of this paper. Preliminary numerical results show that the proposed method is efficient in practice when applied to some benchmark problems.
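For orientation, the textbook PSB update of a Jacobian approximation B can be sketched as follows. This is the standard direct form, which keeps B symmetric and enforces the secant condition B_new s = y; it is not the modified, low-memory update of the inverse approximation Hk that the paper proposes. The function name `psb_update` and the test data are hypothetical.

```python
import numpy as np

def psb_update(B, s, y):
    """Standard Powell-Symmetric-Broyden update of a symmetric matrix B.

    s is the step x_new - x_old and y is the change F(x_new) - F(x_old).
    The result is symmetric and satisfies the secant condition B_new @ s = y.
    """
    r = y - B @ s   # secant residual
    ss = s @ s
    return (B
            + (np.outer(r, s) + np.outer(s, r)) / ss
            - (r @ s) * np.outer(s, s) / ss**2)

rng = np.random.default_rng(2)
B0 = np.eye(4)                  # hypothetical initial approximation
s = rng.standard_normal(4)      # hypothetical step
y = rng.standard_normal(4)      # hypothetical change in the residual
B1 = psb_update(B0, s, y)
```

The direct form stores a full n × n matrix, which is the storage cost that the abstract identifies as a drawback for large-scale problems.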
Unified treatment of microscopic boundary conditions and efficient algorithms for estimating tangent operators of the homogenized behavior in the computational homogenization method

NASA Astrophysics Data System (ADS)

Nguyen, Van-Dung; Wu, Ling; Noels, Ludovic

2017-03-01

This work provides a unified treatment of arbitrary kinds of microscopic boundary conditions usually considered in the multi-scale computational homogenization method for nonlinear multi-physics problems. An efficient procedure is developed to enforce the multi-point linear constraints arising from the microscopic boundary condition either by the direct constraint elimination or by the Lagrange multiplier elimination methods. The macroscopic tangent operators are computed in an efficient way from a multiple right hand sides linear system whose left hand side matrix is the stiffness matrix of the microscopic linearized system at the converged solution. The number of vectors at the right hand side is equal to the number of the macroscopic kinematic variables used to formulate the microscopic boundary condition. As the resolution of the microscopic linearized system often follows a direct factorization procedure, the computation of the macroscopic tangent operators is then performed using this factorized matrix at a reduced computational time.
High accurate interpolation of NURBS tool path for CNC machine tools

NASA Astrophysics Data System (ADS)

Liu, Qiang; Liu, Huan; Yuan, Songmei

2016-09-01

Feedrate fluctuation caused by approximation errors of interpolation methods has a great effect on machining quality in NURBS interpolation, but at present few methods can eliminate it or reduce it to a satisfactory level without sacrificing computing efficiency. To solve this problem, a highly accurate interpolation method for NURBS tool paths is proposed. The proposed method can efficiently reduce the feedrate fluctuation by forming a quartic equation with respect to the curve parameter increment, which can be efficiently solved by analytic methods in real time. Theoretically, the proposed method can totally eliminate the feedrate fluctuation for any 2nd degree NURBS curve and can interpolate 3rd degree NURBS curves with minimal feedrate fluctuation. Moreover, a smooth feedrate planning algorithm is also proposed to generate smooth tool motion while considering multiple constraints and scheduling errors through an efficient planning strategy. Experiments are conducted to verify the feasibility and applicability of the proposed method. This research presents a novel NURBS interpolation method with not only high accuracy but also satisfactory computing efficiency.
A computationally efficient parallel Levenberg-Marquardt algorithm for highly parameterized inverse model analyses

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.

Inverse modeling seeks model parameters given a set of observations. However, for practical problems, because the number of measurements is often large and the model parameters are also numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally-efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace, such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2D and a random hydraulic conductivity field in 3D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multi-core computational environment. Furthermore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate- to large-scale problems.
A computationally efficient parallel Levenberg-Marquardt algorithm for highly parameterized inverse model analyses

DOE PAGES

Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.

2016-09-01

Inverse modeling seeks model parameters given a set of observations. However, for practical problems, because the number of measurements is often large and the model parameters are also numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally-efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace, such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2D and a random hydraulic conductivity field in 3D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multi-core computational environment. Furthermore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate- to large-scale problems.
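
For readers unfamiliar with the damped linear system that Levenberg-Marquardt methods solve at each iteration, a small sketch may help. It shows only the textbook serial algorithm on a hypothetical two-parameter model y = a*exp(b*x); the Krylov-subspace projection and subspace recycling described in the abstract are not reproduced, and the function names, data, and starting point are illustrative assumptions.

```python
import math


def cost(params, xs, ys):
    """Sum of squared residuals for the toy model y = a * exp(b * x)."""
    a, b = params
    return sum((a * math.exp(b * x) - y) ** 2 for x, y in zip(xs, ys))


def lm_step(params, xs, ys, lam):
    """One damped step: solve (J^T J + lam * I) delta = -J^T r for 2 parameters."""
    a, b = params
    a11 = a12 = a22 = g1 = g2 = 0.0
    for x, y in zip(xs, ys):
        e = math.exp(b * x)
        r = a * e - y               # residual
        j0, j1 = e, a * x * e       # d r / d a, d r / d b
        a11 += j0 * j0
        a12 += j0 * j1
        a22 += j1 * j1
        g1 += j0 * r
        g2 += j1 * r
    a11 += lam
    a22 += lam
    det = a11 * a22 - a12 * a12
    # Cramer's rule for the 2x2 damped normal equations.
    d1 = (-g1 * a22 + g2 * a12) / det
    d2 = (-g2 * a11 + g1 * a12) / det
    return (a + d1, b + d2)


def fit(xs, ys, start, lam=1e-2, iterations=50):
    """Accept a step if it lowers the cost (then relax damping), else increase damping."""
    params = start
    best = cost(params, xs, ys)
    for _ in range(iterations):
        candidate = lm_step(params, xs, ys, lam)
        c = cost(candidate, xs, ys)
        if c < best:
            params, best, lam = candidate, c, lam / 10.0
        else:
            lam *= 10.0
    return params


# Synthetic, noise-free observations generated with a = 2, b = -1.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(-x) for x in xs]
fitted = fit(xs, ys, start=(1.0, 0.0))
```

Each trial value of the damping parameter `lam` requires a fresh solve of the linear system. For two parameters that is trivial, but for a highly parameterized model it is the expensive step that the abstract's subspace projection and recycling are designed to reduce.
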
LOCAL ORTHOGONAL CUTTING METHOD FOR COMPUTING MEDIAL CURVES AND ITS BIOMEDICAL APPLICATIONS

PubMed Central

Einstein, Daniel R.; Dyedov, Vladimir

2010-01-01

Medial curves have a wide range of applications in geometric modeling and analysis (such as shape matching) and biomedical engineering (such as morphometry and computer assisted surgery). The computation of medial curves poses significant challenges, both in terms of theoretical analysis and practical efficiency and reliability. In this paper, we propose a definition and analysis of medial curves and also describe an efficient and robust method called local orthogonal cutting (LOC) for computing medial curves. Our approach is based on three key concepts: a local orthogonal decomposition of objects into substructures, a differential geometry concept called the interior center of curvature (ICC), and integrated stability and consistency tests. These concepts lend themselves to robust numerical techniques and result in an algorithm that is efficient and noise resistant. We illustrate the effectiveness and robustness of our approach with some highly complex, large-scale, noisy biomedical geometries derived from medical images, including lung airways and blood vessels. We also present comparisons of our method with some existing methods. PMID:20628546
An efficient computational method for the approximate solution of nonlinear Lane-Emden type equations arising in astrophysics

NASA Astrophysics Data System (ADS)

Singh, Harendra

2018-04-01

The key purpose of this article is to introduce an efficient computational method for the approximate solution of homogeneous as well as non-homogeneous nonlinear Lane-Emden type equations. Using the proposed computational method, the given nonlinear equation is converted into a set of nonlinear algebraic equations whose solution gives the approximate solution to the Lane-Emden type equation. Various nonlinear cases of Lane-Emden type equations, such as the standard Lane-Emden equation, the isothermal gas spheres equation, and the white-dwarf equation, are discussed. Results are compared with some well-known numerical methods, and it is observed that our results are more accurate.
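
As a point of reference for the standard Lane-Emden equation y'' + (2/x) y' + y^n = 0 with y(0) = 1 and y'(0) = 0, the sketch below integrates it with a classical fourth-order Runge-Kutta scheme. This is a generic baseline of the kind such papers compare against, not the method proposed in the article. The function name `lane_emden`, the step count, and the starting offset `x0` are assumptions made for illustration. Integration starts slightly away from the singular point x = 0 using the known series y = 1 - x^2/6 + n*x^4/120.

```python
import math


def lane_emden(n, x_end, steps=1000, x0=0.01):
    """Integrate y'' + (2/x) y' + y**n = 0 from x0 to x_end with RK4; return y(x_end)."""
    # Series expansion near the origin avoids dividing by zero at x = 0.
    y = 1.0 - x0 ** 2 / 6.0 + n * x0 ** 4 / 120.0
    z = -x0 / 3.0 + n * x0 ** 3 / 30.0          # z = y'
    h = (x_end - x0) / steps

    def rhs(x, y, z):
        # First-order system: y' = z, z' = -y**n - 2*z/x.
        return z, -(y ** n) - 2.0 * z / x

    x = x0
    for _ in range(steps):
        k1y, k1z = rhs(x, y, z)
        k2y, k2z = rhs(x + h / 2, y + h / 2 * k1y, z + h / 2 * k1z)
        k3y, k3z = rhs(x + h / 2, y + h / 2 * k2y, z + h / 2 * k2z)
        k4y, k4z = rhs(x + h, y + h * k3y, z + h * k3z)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        z += h / 6 * (k1z + 2 * k2z + 2 * k3z + k4z)
        x += h
    return y


# Closed-form checks: n = 0 gives 1 - x^2/6, n = 1 gives sin(x)/x.
error_n0 = abs(lane_emden(0, 1.0) - (1.0 - 1.0 / 6.0))
error_n1 = abs(lane_emden(1, 1.0) - math.sin(1.0) / 1.0)
```

The cases n = 0 and n = 1 are convenient for testing because closed-form solutions are available, so any approximate method can be checked against them before being applied to cases without an exact solution.
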
Efficient Predictions of Excited State for Nanomaterials Using Aces 3 and 4

DTIC Science & Technology

2017-12-20

by first-principle methods in the software package ACES by using large parallel computers, growing to the exascale. 15. SUBJECT TERMS Computer...modeling, excited states, optical properties, structure, stability, activation barriers first principle methods, parallel computing 16. SECURITY...2 Progress with new density functional methods
Discovering Synergistic Drug Combination from a Computational Perspective.

PubMed

Ding, Pingjian; Luo, Jiawei; Liang, Cheng; Xiao, Qiu; Cao, Buwen; Li, Guanghui

2018-03-30

Synergistic drug combinations play an important role in the treatment of complex diseases. The identification of effective drug combinations is vital to further reduce side effects and improve therapeutic efficiency. In previous years, in vitro methods have been the main route to discovering synergistic drug combinations. However, in vitro methods are limited by their time and resource consumption. Therefore, with the rapid development of computational models and the explosive growth of large and phenotypic data, computational methods for discovering synergistic drug combinations are an efficient and promising tool and contribute to precision medicine. How to construct the computational model is the key question for computational methods, and different computational strategies yield different performance. In this review, recent advancements in computational methods for predicting effective drug combinations are summarized from multiple aspects. First, various datasets utilized to discover synergistic drug combinations are summarized. Second, we discuss feature-based approaches and partition these methods into two classes: feature-based methods in terms of similarity measures, and feature-based methods in terms of machine learning. Third, we discuss network-based approaches for uncovering synergistic drug combinations. Finally, we analyze computational methods for predicting effective drug combinations and consider their prospects.

Fast Reduction Method in Dominance-Based Information Systems

NASA Astrophysics Data System (ADS)

Li, Yan; Zhou, Qinghua; Wen, Yongchuan

2018-01-01

In real-world applications, data often have continuous or preference-ordered values. Rough sets based on dominance relations can effectively deal with these kinds of data. Attribute reduction can be done in the framework of the dominance-relation based approach to better extract decision rules. However, the computational cost of the dominance classes greatly affects the efficiency of attribute reduction and rule extraction. This paper presents an efficient method of computing dominance classes, and further compares it with the traditional method as the numbers of attributes and samples increase. Experiments on UCI data sets show that the proposed algorithm clearly improves on the efficiency of the traditional method, especially for large-scale data.
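
To make the cost of computing dominance classes concrete, here is a naive baseline sketch. It assumes the usual dominance-based rough set convention that an object y belongs to the dominating set of x when y is at least as good as x on every attribute, and that larger values are preferred. The function name and the toy data are hypothetical, and this is the straightforward pairwise approach, not the fast method proposed in the paper.

```python
def dominating_set(objects, x_index):
    """Indices of objects that are >= objects[x_index] on every attribute."""
    x = objects[x_index]
    return {
        i
        for i, y in enumerate(objects)
        if all(y_value >= x_value for y_value, x_value in zip(y, x))
    }


# Toy information system: four objects, two preference-ordered attributes.
objects = [(1, 2), (2, 3), (2, 1), (3, 3)]

# One dominating set per object; every pair of objects is compared.
dominance_classes = [dominating_set(objects, i) for i in range(len(objects))]
```

With n objects and m attributes this pairwise comparison costs on the order of n^2 * m operations, which is the kind of growth that becomes a bottleneck for large-scale data.
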
Leveraging Gibbs Ensemble Molecular Dynamics and Hybrid Monte Carlo/Molecular Dynamics for Efficient Study of Phase Equilibria.

PubMed

Gartner, Thomas E; Epps, Thomas H; Jayaraman, Arthi

2016-11-08

We describe an extension of the Gibbs ensemble molecular dynamics (GEMD) method for studying phase equilibria. Our modifications to GEMD allow for direct control over particle transfer between phases and improve the method's numerical stability. Additionally, we found that the modified GEMD approach had advantages in computational efficiency in comparison to a hybrid Monte Carlo (MC)/MD Gibbs ensemble scheme in the context of the single component Lennard-Jones fluid. We note that this increase in computational efficiency does not compromise the close agreement of phase equilibrium results between the two methods. However, numerical instabilities in the GEMD scheme hamper GEMD's use near the critical point. We propose that the computationally efficient GEMD simulations can be used to map out the majority of the phase window, with hybrid MC/MD used as a follow up for conditions under which GEMD may be unstable (e.g., near-critical behavior). In this manner, we can capitalize on the contrasting strengths of these two methods to enable the efficient study of phase equilibria for systems that present challenges for a purely stochastic GEMC method, such as dense or low temperature systems, and/or those with complex molecular topologies.
Convergence Acceleration of a Navier-Stokes Solver for Efficient Static Aeroelastic Computations

NASA Technical Reports Server (NTRS)

Obayashi, Shigeru; Guruswamy, Guru P.

1995-01-01

New capabilities have been developed for a Navier-Stokes solver to perform steady-state simulations more efficiently. The flow solver for solving the Navier-Stokes equations is based on a combination of the lower-upper factored symmetric Gauss-Seidel implicit method and the modified Harten-Lax-van Leer-Einfeldt upwind scheme. A numerically stable and efficient pseudo-time-marching method is also developed for computing steady flows over flexible wings. Results are demonstrated for transonic flows over rigid and flexible wings.
Parallel implementation of geometrical shock dynamics for two dimensional converging shock waves

NASA Astrophysics Data System (ADS)

Qiu, Shi; Liu, Kuang; Eliasson, Veronica

2016-10-01

Geometrical shock dynamics (GSD) theory is an appealing method to predict the shock motion in the sense that it is more computationally efficient than solving the traditional Euler equations, especially for converging shock waves. However, to solve and optimize large scale configurations, the main bottleneck is the computational cost. Among the existing numerical GSD schemes, there is only one that has been implemented on parallel computers, with the purpose to analyze detonation waves. To extend the computational advantage of the GSD theory to more general applications such as converging shock waves, a numerical implementation using a spatial decomposition method has been coupled with a front tracking approach on parallel computers. In addition, an efficient tridiagonal system solver for massively parallel computers has been applied to resolve the most expensive function in this implementation, resulting in an efficiency of 0.93 while using 32 HPCC cores. Moreover, symmetric boundary conditions have been developed to further reduce the computational cost, achieving a speedup of 19.26 for a 12-sided polygonal converging shock.
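
The tridiagonal systems mentioned in the abstract above are classically solved in serial by the Thomas algorithm, sketched below for orientation. The paper applies a solver designed for massively parallel computers, which this sketch does not attempt to reproduce; the function name and the argument layout (`lower[0]` and `upper[-1]` unused) are assumptions for illustration.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system in O(n) operations.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    lower[0] and upper[-1] are ignored. No pivoting is performed, so the
    matrix should be diagonally dominant or otherwise well behaved.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    # Forward sweep: eliminate the sub-diagonal.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


# Small test system with the (-1, 2, -1) stencil and known solution [1, 2, 3, 4].
lower = [0.0, -1.0, -1.0, -1.0]
diag = [2.0, 2.0, 2.0, 2.0]
upper = [-1.0, -1.0, -1.0, 0.0]
rhs = [0.0, 0.0, 0.0, 5.0]
solution = thomas_solve(lower, diag, upper, rhs)
```

The forward sweep is inherently sequential, which is why parallel implementations need a different formulation. If parallel efficiency is taken in its usual sense of speedup divided by core count, the reported efficiency of 0.93 on 32 cores corresponds to a speedup of roughly 0.93 × 32 ≈ 29.8 for that function.
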
DOE Office of Scientific and Technical Information (OSTI.GOV)

Tuo, Rui; Wu, C. F. Jeff

Many computer models contain unknown parameters which need to be estimated using physical observations. Furthermore, the calibration method based on Gaussian process models may lead to unreasonable estimates for imperfect computer models. In this work, we extend their study to calibration problems with stochastic physical data. We propose a novel method, called the L2 calibration, and show its semiparametric efficiency. The conventional method of the ordinary least squares is also studied. Theoretical analysis shows that it is consistent but not efficient. Here, numerical examples show that the proposed method outperforms the existing ones.
Design synthesis and optimization of permanent magnet synchronous machines based on computationally-efficient finite element analysis

NASA Astrophysics Data System (ADS)

Sizov, Gennadi Y.

In this dissertation, a model-based multi-objective optimal design of permanent magnet ac machines, supplied by sine-wave current regulated drives, is developed and implemented. The design procedure uses an efficient electromagnetic finite element-based solver to accurately model nonlinear material properties and complex geometric shapes associated with magnetic circuit design. Application of an electromagnetic finite element-based solver allows for accurate computation of intricate performance parameters and characteristics. The first contribution of this dissertation is the development of a rapid computational method that allows accurate and efficient exploration of large multi-dimensional design spaces in search of optimum design(s). The computationally efficient finite element-based approach developed in this work provides a framework of tools that allow rapid analysis of synchronous electric machines operating under steady-state conditions. In the developed modeling approach, major steady-state performance parameters such as, winding flux linkages and voltages, average, cogging and ripple torques, stator core flux densities, core losses, efficiencies and saturated machine winding inductances, are calculated with minimum computational effort. In addition, the method includes means for rapid estimation of distributed stator forces and three-dimensional effects of stator and/or rotor skew on the performance of the machine. The second contribution of this dissertation is the development of the design synthesis and optimization method based on a differential evolution algorithm. The approach relies on the developed finite element-based modeling method for electromagnetic analysis and is able to tackle large-scale multi-objective design problems using modest computational resources. Overall, computational time savings of up to two orders of magnitude are achievable, when compared to current and prevalent state-of-the-art methods. 
These computational savings allow one to expand the optimization problem to achieve more complex and comprehensive design objectives. The method is used in the design process of several interior permanent magnet industrial motors. The presented case studies demonstrate that the developed finite element-based approach practically eliminates the need for using less accurate analytical and lumped parameter equivalent circuit models for electric machine design optimization. The design process and experimental validation of the case-study machines are detailed in the dissertation.
TU-AB-303-08: GPU-Based Software Platform for Efficient Image-Guided Adaptive Radiation Therapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Park, S; Robinson, A; McNutt, T

2015-06-15

Purpose: In this study, we develop an integrated software platform for adaptive radiation therapy (ART) that combines fast and accurate image registration, segmentation, and dose computation/accumulation methods. Methods: The proposed system consists of three key components: 1) deformable image registration (DIR), 2) automatic segmentation, and 3) dose computation/accumulation. The computationally intensive modules including DIR and dose computation have been implemented on a graphics processing unit (GPU). All required patient-specific data including the planning CT (pCT) with contours, daily cone-beam CTs (CBCTs), and treatment plan are automatically queried and retrieved from their own databases. To improve the accuracy of DIR between pCT and CBCTs, we use the double force demons DIR algorithm in combination with iterative CBCT intensity correction by local intensity histogram matching. Segmentation of daily CBCT is then obtained by propagating contours from the pCT. Daily dose delivered to the patient is computed on the registered pCT by a GPU-accelerated superposition/convolution algorithm. Finally, computed daily doses are accumulated to show the total delivered dose to date. Results: Since the accuracy of DIR critically affects the quality of the other processes, we first evaluated our DIR method on eight head-and-neck cancer cases and compared its performance with that of conventional methods. Normalized mutual information (NMI) and normalized cross-correlation (NCC) were computed as similarity measures, and our method produced overall NMI of 0.663 and NCC of 0.987, outperforming conventional methods by 3.8% and 1.9%, respectively. Experimental results show that our registration method is more consistent and robust than existing algorithms, and also computationally efficient. Computation time at each fraction took around one minute (30–50 seconds for registration and 15–25 seconds for dose computation). 
Conclusion: We developed an integrated GPU-accelerated software platform that enables accurate and efficient DIR, auto-segmentation, and dose computation, thus supporting an efficient ART workflow. This work was supported by NIH/NCI under grant R42CA137886.
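
One of the similarity measures named above, normalized cross-correlation (NCC), can be illustrated with a short sketch. The abstract does not state the exact definition used, so the common zero-mean form is assumed here; the function name and the tiny flattened "images" are hypothetical, and a practical implementation would run on full image volumes, typically on the GPU.

```python
import math


def normalized_cross_correlation(a, b):
    """Zero-mean NCC of two equally sized intensity sequences, in [-1, 1]."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    numerator = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denominator = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    if denominator == 0.0:
        raise ValueError("NCC is undefined for a constant image")
    return numerator / denominator


# Flattened toy images: the second is a brightness/contrast change of the first,
# the third has inverted contrast.
fixed = [10.0, 20.0, 30.0, 40.0, 55.0]
rescaled = [2.0 * v + 1.0 for v in fixed]
inverted = [-v for v in fixed]

ncc_rescaled = normalized_cross_correlation(fixed, rescaled)
ncc_inverted = normalized_cross_correlation(fixed, inverted)
```

Because the measure is invariant to linear intensity changes, a value close to 1 (such as the 0.987 reported above) indicates that the registered images agree up to brightness and contrast.
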
Higher-order methods for simulations on quantum computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sornborger, A.T.; Stewart, E.D.

1999-09-01

To implement many-qubit gates for use in quantum simulations on quantum computers efficiently, we develop and present methods reexpressing $\exp[-i(H_1+H_2+\cdots)\Delta t]$ as a product of factors $\exp[-iH_1\Delta t]$, $\exp[-iH_2\Delta t]$, $\ldots$, which is accurate to third or fourth order in $\Delta t$. The methods we derive are an extended form of the symplectic method, and can also be used for an integration of classical Hamiltonians on classical computers. We derive both integral and irrational methods, and find the most efficient methods in both cases. © 1999 The American Physical Society
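
The idea of approximating the exponential of a sum by a product of exponentials can be illustrated on a single qubit. The sketch below compares the plain product formula with the symmetric (Strang) product for the hypothetical choice H_1 = sigma_x, H_2 = sigma_z. These are the familiar low-order formulas, not the third- and fourth-order methods derived in the paper, and all function names are illustrative.

```python
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exp_pauli(theta, sigma):
    """exp(-i*theta*sigma) for any matrix whose square is the identity."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * (i == j) - 1j * s * sigma[i][j] for j in range(2)]
            for i in range(2)]


def exact(t):
    """exp(-i*(sigma_x + sigma_z)*t), using (sigma_x + sigma_z)^2 = 2*identity."""
    w = math.sqrt(2.0)
    h_unit = [[(SIGMA_X[i][j] + SIGMA_Z[i][j]) / w for j in range(2)]
              for i in range(2)]
    return exp_pauli(w * t, h_unit)


def first_order(t):
    """Plain product exp(-i*H1*t) * exp(-i*H2*t)."""
    return matmul(exp_pauli(t, SIGMA_X), exp_pauli(t, SIGMA_Z))


def second_order(t):
    """Symmetric product exp(-i*H1*t/2) * exp(-i*H2*t) * exp(-i*H1*t/2)."""
    half = exp_pauli(t / 2, SIGMA_X)
    return matmul(matmul(half, exp_pauli(t, SIGMA_Z)), half)


def error(approximation, t):
    """Largest entrywise deviation from the exact propagator."""
    approx, reference = approximation(t), exact(t)
    return max(abs(approx[i][j] - reference[i][j])
               for i in range(2) for j in range(2))


# Halving the step should cut the error by about 4 (error ~ t^2) and 8 (error ~ t^3).
ratio_first = error(first_order, 0.1) / error(first_order, 0.05)
ratio_second = error(second_order, 0.1) / error(second_order, 0.05)
```

The error ratios show how the local error scales with the step size. Higher-order product formulas such as those in the abstract push this scaling further at the price of more factors per step.
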
Parallel, adaptive finite element methods for conservation laws

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Devine, Karen D.; Flaherty, Joseph E.

1994-01-01

We construct parallel finite element methods for the solution of hyperbolic conservation laws in one and two dimensions. Spatial discretization is performed by a discontinuous Galerkin finite element method using a basis of piecewise Legendre polynomials. Temporal discretization utilizes a Runge-Kutta method. Dissipative fluxes and projection limiting prevent oscillations near solution discontinuities. A posteriori estimates of spatial errors are obtained by a p-refinement technique using superconvergence at Radau points. The resulting method is of high order and may be parallelized efficiently on MIMD computers. We compare results using different limiting schemes and demonstrate parallel efficiency through computations on an NCUBE/2 hypercube. We also present results using adaptive h- and p-refinement to reduce the computational cost of the method.
An efficient formulation of robot arm dynamics for control and computer simulation

NASA Astrophysics Data System (ADS)

Lee, C. S. G.; Nigam, R.

This paper describes an efficient formulation of the dynamic equations of motion of industrial robots based on the Lagrange formulation of d'Alembert's principle. This formulation, as applied to a PUMA robot arm, results in a set of closed form second order differential equations with cross product terms. They are not as efficient in computation as those formulated by the Newton-Euler method, but provide a better analytical model for control analysis and computer simulation. Computational complexities of this dynamic model together with other models are tabulated for discussion.
Efficient computation of kinship and identity coefficients on large pedigrees.

PubMed

Cheng, En; Elliott, Brendan; Ozsoyoglu, Z Meral

2009-06-01

With the rapidly expanding field of medical genetics and genetic counseling, genealogy information is becoming increasingly abundant. An important computation on pedigree data is the calculation of identity coefficients, which provide a complete description of the degree of relatedness of a pair of individuals. The areas of application of identity coefficients are numerous and diverse, from genetic counseling to disease tracking, and thus, the computation of identity coefficients merits special attention. However, the computation of identity coefficients is not done directly, but rather as the final step after computing a set of generalized kinship coefficients. In this paper, we first propose a novel Path-Counting Formula for calculating generalized kinship coefficients, which is motivated by Wright's path-counting method for computing inbreeding coefficient. We then present an efficient and scalable scheme for calculating generalized kinship coefficients on large pedigrees using NodeCodes, a special encoding scheme for expediting the evaluation of queries on pedigree graph structures. Furthermore, we propose an improved scheme using Family NodeCodes for the computation of generalized kinship coefficients, which is motivated by the significant improvement of using Family NodeCodes for inbreeding coefficient over the use of NodeCodes. We also perform experiments for evaluating the efficiency of our method, and compare it with the performance of the traditional recursive algorithm for three individuals. Experimental results demonstrate that the resulting scheme is more scalable and efficient than the traditional recursive methods for computing generalized kinship coefficients.
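
For context, the traditional recursive approach that the abstract uses as its baseline can be sketched for the simplest case, the kinship coefficient of two individuals. The pedigree, the names, and the helper `make_kinship` are hypothetical. The sketch does not cover generalized kinship coefficients, identity coefficients, the Path-Counting Formula, or NodeCodes, and it treats unknown parents as unrelated and non-inbred.

```python
from functools import lru_cache


def make_kinship(pedigree, order):
    """Build a memoized kinship function.

    pedigree maps each individual to (father, mother), with None for unknown.
    order lists individuals so that parents always appear before children.
    """
    rank = {name: i for i, name in enumerate(order)}

    @lru_cache(maxsize=None)
    def kinship(a, b):
        if a is None or b is None:
            return 0.0
        if a == b:
            father, mother = pedigree[a]
            return 0.5 * (1.0 + kinship(father, mother))
        if rank[a] < rank[b]:
            a, b = b, a  # recurse on the later-born one, which cannot be an ancestor of the other
        father, mother = pedigree[a]
        return 0.5 * (kinship(father, b) + kinship(mother, b))

    return kinship


# Two founders, two full siblings, and a child of the sibling pair.
pedigree = {
    "dad": (None, None),
    "mom": (None, None),
    "kid1": ("dad", "mom"),
    "kid2": ("dad", "mom"),
    "grandkid": ("kid1", "kid2"),
}
order = ["dad", "mom", "kid1", "kid2", "grandkid"]
kinship = make_kinship(pedigree, order)

full_sib_kinship = kinship("kid1", "kid2")
parent_child_kinship = kinship("dad", "kid1")
# Inbreeding coefficient of an individual = kinship of its parents.
grandkid_inbreeding = 2.0 * kinship("grandkid", "grandkid") - 1.0
```

Memoization keeps this small example fast, but the number of distinct pairs still grows quadratically with pedigree size, which gives a sense of why the abstract looks for more scalable schemes on large pedigrees.
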
CFD Analysis and Design Optimization Using Parallel Computers

NASA Technical Reports Server (NTRS)

Martinelli, Luigi; Alonso, Juan Jose; Jameson, Antony; Reuther, James

1997-01-01

A versatile and efficient multi-block method is presented for the simulation of both steady and unsteady flow, as well as aerodynamic design optimization of complete aircraft configurations. The compressible Euler and Reynolds Averaged Navier-Stokes (RANS) equations are discretized using a high resolution scheme on body-fitted structured meshes. An efficient multigrid implicit scheme is implemented for time-accurate flow calculations. Optimum aerodynamic shape design is achieved at very low cost using an adjoint formulation. The method is implemented on parallel computing systems using the MPI message passing interface standard to ensure portability. The results demonstrate that, by combining highly efficient algorithms with parallel computing, it is possible to perform detailed steady and unsteady analysis as well as automatic design for complex configurations using the present generation of parallel computers.
An efficient graph theory based method to identify every minimal reaction set in a metabolic network

PubMed Central

2014-01-01

Background: Development of cells with minimal metabolic functionality is gaining importance due to their efficiency in producing chemicals and fuels. Existing computational methods to identify minimal reaction sets in metabolic networks are computationally expensive. Further, they identify only one of the several possible minimal reaction sets. Results: In this paper, we propose an efficient graph theory based recursive optimization approach to identify all minimal reaction sets. Graph theoretical insights offer systematic methods to not only reduce the number of variables in math programming and increase its computational efficiency, but also provide efficient ways to find multiple optimal solutions. The efficacy of the proposed approach is demonstrated using case studies from Escherichia coli and Saccharomyces cerevisiae. In case study 1, the proposed method identified three minimal reaction sets, each containing 38 reactions, in the Escherichia coli central metabolic network with 77 reactions. Analysis of these three minimal reaction sets revealed that one of them is more suitable for developing a minimal-metabolism cell than the other two because of its practically achievable internal flux distribution. In case study 2, the proposed method identified 256 minimal reaction sets from the Saccharomyces cerevisiae genome scale metabolic network with 620 reactions. The proposed method required only 4.5 hours to identify all the 256 minimal reaction sets and has shown a significant reduction (approximately 80%) in the solution time when compared to the existing methods for finding minimal reaction sets. Conclusions: Identification of all minimal reaction sets in metabolic networks is essential since different minimal reaction sets have different properties that affect bioprocess development. The proposed method correctly identified all minimal reaction sets in both case studies. 
The proposed method is computationally efficient compared to other methods for finding minimal reaction sets and is useful to employ with genome-scale metabolic networks. PMID:24594118
Fast hydrological model calibration based on the heterogeneous parallel computing accelerated shuffled complex evolution method

NASA Astrophysics Data System (ADS)

Kan, Guangyuan; He, Xiaoyan; Ding, Liuqian; Li, Jiren; Hong, Yang; Zuo, Depeng; Ren, Minglei; Lei, Tianjie; Liang, Ke

2018-01-01

Hydrological model calibration has been a hot issue for decades. The shuffled complex evolution method developed at the University of Arizona (SCE-UA) has been proved to be an effective and robust optimization approach. However, its computational efficiency deteriorates significantly when the amount of hydrometeorological data increases. In recent years, the rise of heterogeneous parallel computing has brought hope for the acceleration of hydrological model calibration. This study proposed a parallel SCE-UA method and applied it to the calibration of a watershed rainfall-runoff model, the Xinanjiang model. The parallel method was implemented on heterogeneous computing systems using OpenMP and CUDA. Performance testing and sensitivity analysis were carried out to verify its correctness and efficiency. Comparison results indicated that heterogeneous parallel computing-accelerated SCE-UA converged much more quickly than the original serial version and possessed satisfactory accuracy and stability for the task of fast hydrological model calibration.
A Novel Discrete Optimal Transport Method for Bayesian Inverse Problems

NASA Astrophysics Data System (ADS)

Bui-Thanh, T.; Myers, A.; Wang, K.; Thiery, A.

2017-12-01

We present the Augmented Ensemble Transform (AET) method for generating approximate samples from a high-dimensional posterior distribution as a solution to Bayesian inverse problems. Solving large-scale inverse problems is critical for some of the most relevant and impactful scientific endeavors of our time. Therefore, constructing novel methods for solving the Bayesian inverse problem in more computationally efficient ways can have a profound impact on the science community. This research derives the novel AET method for exploring a posterior by solving a sequence of linear programming problems, resulting in a series of transport maps which map prior samples to posterior samples, allowing for the computation of moments of the posterior. We show both theoretical and numerical results, indicating this method can offer superior computational efficiency when compared to other SMC methods. Most of this efficiency is derived from matrix scaling methods to solve the linear programming problem and derivative-free optimization for particle movement. We use this method to determine inter-well connectivity in a reservoir and the associated uncertainty related to certain parameters. The attached file shows the difference between the true parameter and the AET parameter in an example 3D reservoir problem. The error is within the Morozov discrepancy allowance with lower computational cost than other particle methods.
A fast indirect method to compute functions of genomic relationships concerning genotyped and ungenotyped individuals, for diversity management.

PubMed

Colleau, Jean-Jacques; Palhière, Isabelle; Rodríguez-Ramilo, Silvia T; Legarra, Andres

2017-12-01

Pedigree-based management of genetic diversity in populations, e.g., using optimal contributions, involves computation of the [Formula: see text] type yielding elements (relationships) or functions (usually averages) of relationship matrices. For pedigree-based relationships [Formula: see text], a very efficient method exists. When all the individuals of interest are genotyped, genomic management can be addressed using the genomic relationship matrix [Formula: see text]; however, to date, the computational problem of efficiently computing [Formula: see text] has not been well studied. When some individuals of interest are not genotyped, genomic management should consider the relationship matrix [Formula: see text] that combines genotyped and ungenotyped individuals; however, direct computation of [Formula: see text] is computationally very demanding, because construction of a possibly huge matrix is required. Our work presents efficient ways of computing [Formula: see text] and [Formula: see text], with applications on real data from dairy sheep and dairy goat breeding schemes. For genomic relationships, an efficient indirect computation with quadratic instead of cubic cost is [Formula: see text], where Z is a matrix relating animals to genotypes. For the relationship matrix [Formula: see text], we propose an indirect method based on the difference between vectors [Formula: see text], which involves computation of [Formula: see text] and of products such as [Formula: see text] and [Formula: see text], where [Formula: see text] is a working vector derived from [Formula: see text]. The latter computation is the most demanding but can be done using sparse Cholesky decompositions of matrix [Formula: see text], which allows handling very large genomic and pedigree data files. Studies based on simulations reported in the literature show that the trends of average relationships in [Formula: see text] and [Formula: see text] differ as genomic selection proceeds. 
When selection is based on genomic relationships but management is based on pedigree data, the true genetic diversity is overestimated. However, our tests on real data from sheep and goat obtained before genomic selection started do not show this. We present efficient methods to compute elements and statistics of the genomic relationships [Formula: see text] and of matrix [Formula: see text] that combines ungenotyped and genotyped individuals. These methods should be useful to monitor and handle genomic diversity.
ADVANCED COMPUTATIONAL METHODS IN DOSE MODELING: APPLICATION OF COMPUTATIONAL BIOPHYSICAL TRANSPORT, COMPUTATIONAL CHEMISTRY, AND COMPUTATIONAL BIOLOGY

EPA Science Inventory

Computational toxicology (CompTox) leverages the significant gains in computing power and computational techniques (e.g., numerical approaches, structure-activity relationships, bioinformatics) realized over the last few years, thereby reducing costs and increasing efficiency i...
Development of efficient time-evolution method based on three-term recurrence relation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Akama, Tomoko, E-mail: a.tomo---s-b-l-r@suou.waseda.jp; Kobayashi, Osamu; Nanbu, Shinkoh, E-mail: shinkoh.nanbu@sophia.ac.jp

The advantage of the real-time (RT) propagation method is a direct solution of the time-dependent Schrödinger equation which describes frequency properties as well as all dynamics of a molecular system composed of electrons and nuclei in quantum physics and chemistry. Its applications have been limited by computational feasibility, as the evaluation of the time-evolution operator is computationally demanding. In this article, a new efficient time-evolution method based on the three-term recurrence relation (3TRR) was proposed to reduce the time-consuming numerical procedure. The basic formula of this approach was derived by introducing a transformation of the operator using the arcsine function. Since this operator transformation causes transformation of time, we derived the relation between original and transformed time. The formula was adapted to assess the performance of the RT time-dependent Hartree-Fock (RT-TDHF) method and the time-dependent density functional theory. Compared to the commonly used fourth-order Runge-Kutta method, our new approach decreased the computational time of the RT-TDHF calculation by about a factor of four, showing the 3TRR formula to be an efficient time-evolution method for reducing computational cost.
COMPUTATIONAL METHODS FOR SENSITIVITY AND UNCERTAINTY ANALYSIS FOR ENVIRONMENTAL AND BIOLOGICAL MODELS

EPA Science Inventory

This work introduces a computationally efficient alternative method for uncertainty propagation, the Stochastic Response Surface Method (SRSM). The SRSM approximates uncertainties in model outputs through a series expansion in normal random variables (polynomial chaos expansion)...
On the value of incorporating spatial statistics in large-scale geophysical inversions: the SABRe case

NASA Astrophysics Data System (ADS)

Kokkinaki, A.; Sleep, B. E.; Chambers, J. E.; Cirpka, O. A.; Nowak, W.

2010-12-01

Electrical Resistance Tomography (ERT) is a popular method for investigating subsurface heterogeneity. The method relies on measuring electrical potential differences and obtaining, through inverse modeling, the underlying electrical conductivity field, which can be related to hydraulic conductivities. The quality of site characterization strongly depends on the utilized inversion technique. Standard ERT inversion methods, though highly computationally efficient, do not consider spatial correlation of soil properties; as a result, they often underestimate the spatial variability observed in earth materials, thereby producing unrealistic subsurface models. Also, these methods do not quantify the uncertainty of the estimated properties, thus limiting their use in subsequent investigations. Geostatistical inverse methods can be used to overcome both these limitations; however, they are computationally expensive, which has hindered their wide use in practice. In this work, we compare a standard Gauss-Newton smoothness constrained least squares inversion method against the quasi-linear geostatistical approach using the three-dimensional ERT dataset of the SABRe (Source Area Bioremediation) project. The two methods are evaluated for their ability to: a) produce physically realistic electrical conductivity fields that agree with the wide range of data available for the SABRe site while being computationally efficient, and b) provide information on the spatial statistics of other parameters of interest, such as hydraulic conductivity. To explore the trade-off between inversion quality and computational efficiency, we also employ a 2.5-D forward model with corrections for boundary conditions and source singularities. The 2.5-D model accelerates the 3-D geostatistical inversion method. New adjoint equations are developed for the 2.5-D forward model for the efficient calculation of sensitivities. 
Our work shows that spatial statistics can be incorporated in large-scale ERT inversions to improve the inversion results without making them computationally prohibitive.

Parallel computing method for simulating hydrological processes of large rivers under climate change

NASA Astrophysics Data System (ADS)

Wang, H.; Chen, Y.

2016-12-01

Climate change is one of the best-known global environmental problems. It has altered the temporal and spatial distribution of watershed hydrological processes, especially in the world's large rivers. Watershed hydrological process simulation based on a physically based distributed hydrological model can give better results than lumped models. However, such simulation involves a large amount of calculation, especially for large rivers, and thus needs huge computing resources that may not be steadily available to researchers or are available only at high expense; this has seriously restricted research and application. To solve this problem, current parallel methods mostly parallelize the computation in the space and time dimensions. They calculate the natural features in order, based on the distributed hydrological model, by grid (unit, or basin) from upstream to downstream. This article proposes a high-performance computing method for hydrological process simulation with a high speed-up ratio and parallel efficiency. It combines the temporal and spatial runoff characteristics of the distributed hydrological model with methods adopting distributed data storage, a memory database, distributed computing, and parallel computing based on computing power units. The method has strong adaptability and extensibility, which means it can make full use of the computing and storage resources under the condition of limited computing resources, and the computing efficiency improves linearly as computing resources increase. This method can satisfy the parallel computing requirements of hydrological process simulation in small, medium and large rivers.
Multigrid Methods for Aerodynamic Problems in Complex Geometries

NASA Technical Reports Server (NTRS)

Caughey, David A.

1995-01-01

Work has been directed at the development of efficient multigrid methods for the solution of aerodynamic problems involving complex geometries, including the development of computational methods for the solution of both inviscid and viscous transonic flow problems. The emphasis is on problems of complex, three-dimensional geometry. The methods developed are based upon finite-volume approximations to both the Euler and the Reynolds-Averaged Navier-Stokes equations. The methods are developed for use on multi-block grids using diagonalized implicit multigrid methods to achieve computational efficiency. The work is focused upon aerodynamic problems involving complex geometries, including advanced engine inlets.
An Efficient Numerical Method for Computing Synthetic Seismograms for a Layered Half-space with Sources and Receivers at Close or Same Depths

NASA Astrophysics Data System (ADS)

Zhang, H.-m.; Chen, X.-f.; Chang, S.

It is difficult to compute synthetic seismograms for a layered half-space with sources and receivers at close to or the same depths using the generalized R/T coefficient method (Kennett, 1983; Luco and Apsel, 1983; Yao and Harkrider, 1983; Chen, 1993), because the wavenumber integration converges very slowly. A semi-analytic method for accelerating the convergence, in which part of the integration is implemented analytically, was adopted by some authors (Apsel and Luco, 1983; Hisada, 1994, 1995). In this study, based on the principle of the Repeated Averaging Method (Dahlquist and Björck, 1974; Chang, 1988), we propose an alternative, efficient, numerical method, the peak-trough averaging method (PTAM), to overcome the difficulty mentioned above. Compared with the semi-analytic method, PTAM is not only much simpler mathematically and easier to implement in practice, but also more efficient. Using numerical examples, we illustrate the validity, accuracy and efficiency of the new method.
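The repeated-averaging principle cited in this abstract can be illustrated on a toy problem. The sketch below is not the authors' PTAM code: it assumes, purely for illustration, the alternating harmonic series (limit ln 2), whose partial sums oscillate around their limit much as a slowly converging oscillatory integral does. The function names are hypothetical.

```python
import math

def partial_sums(n_terms):
    """Partial sums of 1 - 1/2 + 1/3 - ... (a toy oscillating sequence, limit ln 2)."""
    sums, total = [], 0.0
    for k in range(1, n_terms + 1):
        total += (-1) ** (k + 1) / k
        sums.append(total)
    return sums

def repeated_average(seq):
    """Replace the sequence by the means of adjacent entries until one value is left."""
    seq = list(seq)
    while len(seq) > 1:
        seq = [(a + b) / 2.0 for a, b in zip(seq, seq[1:])]
    return seq[0]

s = partial_sums(12)
raw_error = abs(s[-1] - math.log(2))                      # error of the last partial sum
accelerated_error = abs(repeated_average(s) - math.log(2))  # error after repeated averaging
print(raw_error, accelerated_error)
```

With the same twelve terms, the averaged value lies far closer to the limit than the last raw partial sum, which is the effect an averaging scheme exploits when an oscillating integrand converges slowly.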
A sub-space greedy search method for efficient Bayesian Network inference.

PubMed

Zhang, Qing; Cao, Yong; Li, Yong; Zhu, Yanming; Sun, Samuel S M; Guo, Dianjing

2011-09-01

Bayesian network (BN) has been successfully used to infer the regulatory relationships of genes from microarray dataset. However, one major limitation of BN approach is the computational cost because the calculation time grows more than exponentially with the dimension of the dataset. In this paper, we propose a sub-space greedy search method for efficient Bayesian Network inference. Particularly, this method limits the greedy search space by only selecting gene pairs with higher partial correlation coefficients. Using both synthetic and real data, we demonstrate that the proposed method achieved comparable results with standard greedy search method yet saved ∼50% of the computational time. We believe that sub-space search method can be widely used for efficient BN inference in systems biology. Copyright © 2011 Elsevier Ltd. All rights reserved.
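The pair-selection step described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: partial correlations are taken from the inverse of the sample correlation matrix, the threshold value and the three-variable synthetic data are invented for the example, and all function names are hypothetical.

```python
import math
import random

def pearson_matrix(data):
    """Correlation matrix of the columns of `data` (a list of equal-length rows)."""
    n, p = len(data), len(data[0])
    cols = [[row[j] for row in data] for j in range(p)]
    cent = [[v - sum(c) / n for v in c] for c in cols]
    norms = [math.sqrt(sum(v * v for v in c)) for c in cent]
    return [[sum(a * b for a, b in zip(cent[i], cent[j])) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]

def invert(m):
    """Gauss-Jordan inverse with partial pivoting (adequate for small dense matrices)."""
    p = len(m)
    aug = [list(row) + [float(i == j) for j in range(p)] for i, row in enumerate(m)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(p):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[p:] for row in aug]

def candidate_pairs(data, threshold):
    """Keep variable pairs whose absolute partial correlation reaches the threshold."""
    prec = invert(pearson_matrix(data))
    p = len(prec)
    pairs = []
    for i in range(p):
        for j in range(i + 1, p):
            pc = -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
            if abs(pc) >= threshold:
                pairs.append((i, j))
    return pairs

# Synthetic chain g0 -> g1 -> g2 (assumed data): g0 and g2 are linked only through g1.
rng = random.Random(0)
data = []
for _ in range(2000):
    g0 = rng.gauss(0, 1)
    g1 = g0 + rng.gauss(0, 1)
    g2 = g1 + rng.gauss(0, 1)
    data.append([g0, g1, g2])
pairs = candidate_pairs(data, threshold=0.3)
print(pairs)
```

In this toy chain the indirect pair (0, 2) should fall below the threshold once variable 1 is accounted for, so a greedy structure search restricted to `pairs` would have fewer candidate edges to score.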
A novel patient-specific model to compute coronary fractional flow reserve.

PubMed

Kwon, Soon-Sung; Chung, Eui-Chul; Park, Jin-Seo; Kim, Gook-Tae; Kim, Jun-Woo; Kim, Keun-Hong; Shin, Eun-Seok; Shim, Eun Bo

2014-09-01

The fractional flow reserve (FFR) is a widely used clinical index to evaluate the functional severity of coronary stenosis. A computer simulation method based on patients' computed tomography (CT) data is a plausible non-invasive approach for computing the FFR. This method can provide a detailed solution for the stenosed coronary hemodynamics by coupling computational fluid dynamics (CFD) with the lumped parameter model (LPM) of the cardiovascular system. In this work, we have implemented a simple computational method to compute the FFR. As this method uses only coronary arteries for the CFD model and includes only the LPM of the coronary vascular system, it provides simpler boundary conditions for the coronary geometry and is computationally more efficient than existing approaches. To test the efficacy of this method, we simulated a three-dimensional straight vessel using CFD coupled with the LPM. The computed results were compared with those of the LPM. To validate this method in terms of clinically realistic geometry, a patient-specific model of stenosed coronary arteries was constructed from CT images, and the computed FFR was compared with clinically measured results. We evaluated the effect of a model aorta on the computed FFR and compared this with a model without the aorta. Computationally, the model without the aorta was more efficient than that with the aorta, reducing the CPU time required for computing a cardiac cycle to 43.4%. Copyright © 2014. Published by Elsevier Ltd.
Computational methods for efficient structural reliability and reliability sensitivity analysis

NASA Technical Reports Server (NTRS)

Wu, Y.-T.

1993-01-01

This paper presents recent developments in efficient structural reliability analysis methods. The paper proposes an efficient, adaptive importance sampling (AIS) method that can be used to compute reliability and reliability sensitivities. The AIS approach uses a sampling density that is proportional to the joint PDF of the random variables. Starting from an initial approximate failure domain, sampling proceeds adaptively and incrementally with the goal of reaching a sampling domain that is slightly greater than the failure domain to minimize over-sampling in the safe region. Several reliability sensitivity coefficients are proposed that can be computed directly and easily from the above AIS-based failure points. These probability sensitivities can be used for identifying key random variables and for adjusting design to achieve reliability-based objectives. The proposed AIS methodology is demonstrated using a turbine blade reliability analysis problem.
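As a point of reference for the sampling idea in this abstract, here is a minimal sketch of plain, non-adaptive importance sampling for a failure probability. It is not the AIS method itself: the one-dimensional limit state g(x) = beta - x, the standard normal variable, the shifted normal sampling density, and the function name are all assumptions chosen so that the exact answer is known.

```python
import math
import random

def failure_probability_is(beta, n_samples, seed=0):
    """Estimate P(X > beta) for X ~ N(0, 1) by sampling from N(beta, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(beta, 1.0)           # draw from the shifted sampling density
        if x > beta:                       # failure: g(x) = beta - x < 0
            # weight = target pdf / sampling pdf = exp(-beta*x + beta^2 / 2)
            total += math.exp(-beta * x + 0.5 * beta * beta)
    return total / n_samples

beta = 3.0
estimate = failure_probability_is(beta, 20000)
exact = 0.5 * math.erfc(beta / math.sqrt(2.0))  # closed-form tail probability
print(estimate, exact)
```

Centering the sampling density on the failure boundary means roughly half of the samples land in the failure region, whereas crude Monte Carlo would see a failure only about once per thousand samples at this beta.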
Parallel discontinuous Galerkin FEM for computing hyperbolic conservation law on unstructured grids

NASA Astrophysics Data System (ADS)

Ma, Xinrong; Duan, Zhijian

2018-04-01

The high-order discontinuous Galerkin finite element method (DGFEM) is known as a good method for solving the Euler and Navier-Stokes equations on unstructured grids, but it consumes too many computational resources. An efficient parallel algorithm was presented for solving the compressible Euler equations. Moreover, a multigrid strategy based on a three-stage, third-order TVD Runge-Kutta scheme was used to improve the computational efficiency of DGFEM and accelerate the convergence of the solution of the unsteady compressible Euler equations. To keep the load balanced across processors, the domain decomposition method was employed. Numerical experiments were performed for inviscid transonic flow around the NACA0012 airfoil and the M6 wing. The results indicated that the parallel algorithm improves speedup and efficiency significantly and is suitable for calculating complex flows.
A Simple and Computationally Efficient Sampling Approach to Covariate Adjustment for Multifactor Dimensionality Reduction Analysis of Epistasis

PubMed Central

Gui, Jiang; Andrew, Angeline S.; Andrews, Peter; Nelson, Heather M.; Kelsey, Karl T.; Karagas, Margaret R.; Moore, Jason H.

2010-01-01

Epistasis or gene-gene interaction is a fundamental component of the genetic architecture of complex traits such as disease susceptibility. Multifactor dimensionality reduction (MDR) was developed as a nonparametric and model-free method to detect epistasis when there are no significant marginal genetic effects. However, in many studies of complex disease, other covariates like age of onset and smoking status could have a strong main effect and may potentially interfere with MDR's ability to achieve its goal. In this paper, we present a simple and computationally efficient sampling method to adjust for covariate effects in MDR. We use simulation to show that after adjustment, MDR has sufficient power to detect true gene-gene interactions. We also compare our method with the state-of-art technique in covariate adjustment. The results suggest that our proposed method performs similarly, but is more computationally efficient. We then apply this new method to an analysis of a population-based bladder cancer study in New Hampshire. PMID:20924193
Local Orthogonal Cutting Method for Computing Medial Curves and Its Biomedical Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jiao, Xiangmin; Einstein, Daniel R.; Dyedov, Volodymyr

2010-03-24

Medial curves have a wide range of applications in geometric modeling and analysis (such as shape matching) and biomedical engineering (such as morphometry and computer assisted surgery). The computation of medial curves poses significant challenges, both in terms of theoretical analysis and practical efficiency and reliability. In this paper, we propose a definition and analysis of medial curves and also describe an efficient and robust method for computing medial curves. Our approach is based on three key concepts: a local orthogonal decomposition of objects into substructures, a differential geometry concept called the interior center of curvature (ICC), and integrated stability and consistency tests. These concepts lend themselves to robust numerical techniques including eigenvalue analysis, weighted least squares approximations, and numerical minimization, resulting in an algorithm that is efficient and noise resistant. We illustrate the effectiveness and robustness of our approach with some highly complex, large-scale, noisy biomedical geometries derived from medical images, including lung airways and blood vessels. We also present comparisons of our method with some existing methods.
Green's function methods in heavy ion shielding

NASA Technical Reports Server (NTRS)

Wilson, John W.; Costen, Robert C.; Shinn, Judy L.; Badavi, Francis F.

1993-01-01

An analytic solution to the heavy ion transport in terms of Green's function is used to generate a highly efficient computer code for space applications. The efficiency of the computer code is accomplished by a nonperturbative technique extending Green's function over the solution domain. The computer code can also be applied to accelerator boundary conditions to allow code validation in laboratory experiments.
Inverse targeting —An effective immunization strategy

NASA Astrophysics Data System (ADS)

Schneider, C. M.; Mihaljev, T.; Herrmann, H. J.

2012-05-01

We propose a new method to immunize populations or computer networks against epidemics which is more efficient than any continuous immunization method considered before. The novelty of our method resides in the way of determining the immunization targets. First we identify those individuals or computers that contribute the least to the disease spreading measured through their contribution to the size of the largest connected cluster in the social or a computer network. The immunization process follows the list of identified individuals or computers in inverse order, immunizing first those which are most relevant for the epidemic spreading. We have applied our immunization strategy to several model networks and two real networks, the Internet and the collaboration network of high-energy physicists. We find that our new immunization strategy is in the case of model networks up to 14%, and for real networks up to 33% more efficient than immunizing dynamically the most connected nodes in a network. Our strategy is also numerically efficient and can therefore be applied to large systems.
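One possible reading of the ordering described in this abstract is sketched below. It is an interpretation, not the authors' algorithm: starting from an empty network, it repeatedly adds back the node that yields the smallest largest connected cluster, then reverses that list to obtain the immunization order. The brute-force search, the degree-based tie-break, and the function names are assumptions for a small example.

```python
def largest_cluster(active, adj):
    """Size of the largest connected cluster among the `active` nodes."""
    seen, best = set(), 0
    for start in active:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v in active and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

def inverse_targeting_order(adj):
    """Return nodes in immunization order (most relevant for spreading first)."""
    active, remaining, added = set(), set(adj), []
    while remaining:
        # Add the node that keeps the largest cluster smallest;
        # ties go to the lower-degree node (an assumed tie-break).
        node = min(sorted(remaining),
                   key=lambda n: (largest_cluster(active | {n}, adj), len(adj[n])))
        active.add(node)
        remaining.remove(node)
        added.append(node)
    return added[::-1]  # inverse order: the last node added is immunized first

# Star network: hub 0 connected to leaves 1..4.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
order = inverse_targeting_order(star)
print(order)
```

On the star network the hub is added back last because it is the only node that merges the leaves into one cluster, so it comes first in the immunization order. The brute-force search is quadratic in the number of nodes and is meant only to make the ordering idea concrete.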
Computer-based learning: interleaving whole and sectional representation of neuroanatomy.

PubMed

Pani, John R; Chariker, Julia H; Naaz, Farah

2013-01-01

The large volume of material to be learned in biomedical disciplines requires optimizing the efficiency of instruction. In prior work with computer-based instruction of neuroanatomy, it was relatively efficient for learners to master whole anatomy and then transfer to learning sectional anatomy. It may, however, be more efficient to continuously integrate learning of whole and sectional anatomy. A study of computer-based learning of neuroanatomy was conducted to compare a basic transfer paradigm for learning whole and sectional neuroanatomy with a method in which the two forms of representation were interleaved (alternated). For all experimental groups, interactive computer programs supported an approach to instruction called adaptive exploration. Each learning trial consisted of time-limited exploration of neuroanatomy, self-timed testing, and graphical feedback. The primary result of this study was that interleaved learning of whole and sectional neuroanatomy was more efficient than the basic transfer method, without cost to long-term retention or generalization of knowledge to recognizing new images (Visible Human and MRI). Copyright © 2012 American Association of Anatomists.
Computer-Based Learning: Interleaving Whole and Sectional Representation of Neuroanatomy

PubMed Central

Pani, John R.; Chariker, Julia H.; Naaz, Farah

2015-01-01

The large volume of material to be learned in biomedical disciplines requires optimizing the efficiency of instruction. In prior work with computer-based instruction of neuroanatomy, it was relatively efficient for learners to master whole anatomy and then transfer to learning sectional anatomy. It may, however, be more efficient to continuously integrate learning of whole and sectional anatomy. A study of computer-based learning of neuroanatomy was conducted to compare a basic transfer paradigm for learning whole and sectional neuroanatomy with a method in which the two forms of representation were interleaved (alternated). For all experimental groups, interactive computer programs supported an approach to instruction called adaptive exploration. Each learning trial consisted of time-limited exploration of neuroanatomy, self-timed testing, and graphical feedback. The primary result of this study was that interleaved learning of whole and sectional neuroanatomy was more efficient than the basic transfer method, without cost to long-term retention or generalization of knowledge to recognizing new images (Visible Human and MRI). PMID:22761001
On computational methods for crashworthiness

NASA Technical Reports Server (NTRS)

Belytschko, T.

1992-01-01

The evolution of computational methods for crashworthiness and related fields is described and linked with the decreasing cost of computational resources and with improvements in computation methodologies. The latter includes more effective time integration procedures and more efficient elements. Some recent developments in methodologies and future trends are also summarized. These include multi-time step integration (or subcycling), further improvements in elements, adaptive meshes, and the exploitation of parallel computers.
Efficient computational methods to study new and innovative signal detection techniques in SETI

NASA Technical Reports Server (NTRS)

Deans, Stanley R.

1991-01-01

The purpose of the research reported here is to provide a rapid computational method for computing various statistical parameters associated with overlapped Hann spectra. These results are important for the Targeted Search part of the Search for ExtraTerrestrial Intelligence (SETI) Microwave Observing Project.
10 CFR 431.17 - Determination of efficiency.

Code of Federal Regulations, 2010 CFR

2010-01-01

... method or methods used; the mathematical model, the engineering or statistical analysis, computer... accordance with § 431.16 of this subpart, or by application of an alternative efficiency determination method... must be: (i) Derived from a mathematical model that represents the mechanical and electrical...
The Improvement of Efficiency in the Numerical Computation of Orbit Trajectories

NASA Technical Reports Server (NTRS)

Dyer, J.; Danchick, R.; Pierce, S.; Haney, R.

1972-01-01

An analysis, system design, programming, and evaluation of results are described for numerical computation of orbit trajectories. Evaluation of generalized methods, interaction of different formulations for satellite motion, transformation of equations of motion and integrator loads, and development of efficient integrators are also considered.
Computational methods for aerodynamic design using numerical optimization

NASA Technical Reports Server (NTRS)

Peeters, M. F.

1983-01-01

Five methods to increase the computational efficiency of aerodynamic design using numerical optimization, by reducing the computer time required to perform gradient calculations, are examined. The most promising method consists of drastically reducing the size of the computational domain on which aerodynamic calculations are made during gradient calculations. Since a gradient calculation requires the solution of the flow about an airfoil whose geometry was slightly perturbed from a base airfoil, the flow about the base airfoil is used to determine boundary conditions on the reduced computational domain. This method worked well in subcritical flow.
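The cost structure mentioned in this abstract, one extra flow solution per perturbed design variable, can be seen in a generic forward-difference gradient. The sketch below assumes a cheap toy objective in place of a flow solver; the function name and step size are illustrative choices, not taken from the report.

```python
def forward_difference_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x; costs len(x) + 1 evaluations of f."""
    f0 = f(x)                      # baseline evaluation (the "base airfoil")
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h                 # slightly perturb one design variable
        grad.append((f(xp) - f0) / h)
    return grad

def toy_objective(x):
    """Stand-in for an expensive aerodynamic objective (assumed for illustration)."""
    return x[0] ** 2 + 3.0 * x[1]

grad = forward_difference_gradient(toy_objective, [1.0, 2.0])
print(grad)
```

Because every component of the gradient needs its own evaluation of the objective, making each evaluation cheaper, for example by shrinking the computational domain as the abstract describes, reduces the total cost in direct proportion.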
CCOMP: An efficient algorithm for complex roots computation of determinantal equations

NASA Astrophysics Data System (ADS)

Zouros, Grigorios P.

2018-01-01

In this paper a free Python algorithm, entitled CCOMP (Complex roots COMPutation), is developed for the efficient computation of complex roots of determinantal equations inside a prescribed complex domain. The key to the method presented is the efficient determination of the candidate points inside the domain which, in their close neighborhood, a complex root may lie. Once these points are detected, the algorithm proceeds to a two-dimensional minimization problem with respect to the minimum modulus eigenvalue of the system matrix. In the core of CCOMP exist three sub-algorithms whose tasks are the efficient estimation of the minimum modulus eigenvalues of the system matrix inside the prescribed domain, the efficient computation of candidate points which guarantee the existence of minima, and finally, the computation of minima via bound constrained minimization algorithms. Theoretical results and heuristics support the development and the performance of the algorithm, which is discussed in detail. CCOMP supports general complex matrices, and its efficiency, applicability and validity is demonstrated to a variety of microwave applications.
Computationally efficient control allocation

NASA Technical Reports Server (NTRS)

Durham, Wayne (Inventor)

2001-01-01

A computationally efficient method for calculating near-optimal solutions to the three-objective, linear control allocation problem is disclosed. The control allocation problem is that of distributing the effort of redundant control effectors to achieve some desired set of objectives. The problem is deemed linear if control effectiveness is affine with respect to the individual control effectors. The optimal solution is that which exploits the collective maximum capability of the effectors within their individual physical limits. Computational efficiency is measured by the number of floating-point operations required for solution. The method presented returned optimal solutions in more than 90% of the cases examined; non-optimal solutions returned by the method were typically much less than 1% different from optimal and the errors tended to become smaller than 0.01% as the number of controls was increased. The magnitude of the errors returned by the present method was much smaller than those that resulted from either pseudo inverse or cascaded generalized inverse solutions. The computational complexity of the method presented varied linearly with increasing numbers of controls; the number of required floating point operations increased from 5.5 i, to seven times faster than did the minimum-norm solution (the pseudoinverse), and at about the same rate as did the cascaded generalized inverse solution. The computational requirements of the method presented were much better than that of previously described facet-searching methods which increase in proportion to the square of the number of controls.

A GPU-accelerated implicit meshless method for compressible flows

NASA Astrophysics Data System (ADS)

Zhang, Jia-Le; Ma, Zhi-Hua; Chen, Hong-Quan; Cao, Cheng

2018-05-01

This paper develops a recently proposed GPU based two-dimensional explicit meshless method (Ma et al., 2014) by devising and implementing an efficient parallel LU-SGS implicit algorithm to further improve the computational efficiency. The capability of the original 2D meshless code is extended to deal with 3D complex compressible flow problems. To resolve the inherent data dependency of the standard LU-SGS method, which causes thread-racing conditions destabilizing numerical computation, a generic rainbow coloring method is presented and applied to organize the computational points into different groups by painting neighboring points with different colors. The original LU-SGS method is modified and parallelized accordingly to perform calculations in a color-by-color manner. The CUDA Fortran programming model is employed to develop the key kernel functions to apply boundary conditions, calculate time steps, evaluate residuals as well as advance and update the solution in the temporal space. A series of two- and three-dimensional test cases including compressible flows over single- and multi-element airfoils and a M6 wing are carried out to verify the developed code. The obtained solutions agree well with experimental data and other computational results reported in the literature. Detailed analysis on the performance of the developed code reveals that the developed CPU based implicit meshless method is at least four to eight times faster than its explicit counterpart. The computational efficiency of the implicit method could be further improved by ten to fifteen times on the GPU.
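The coloring step described in this abstract can be illustrated with a simple greedy graph coloring. The abstract does not specify how the rainbow coloring is constructed, so the greedy rule, the dictionary-based neighbor lists, and the function names below are assumptions; the sketch only shows the property that matters, namely that neighboring points never share a color.

```python
def greedy_coloring(neighbors):
    """Assign each point the smallest color not used by its already-colored neighbors."""
    colors = {}
    for p in sorted(neighbors):
        used = {colors[q] for q in neighbors[p] if q in colors}
        c = 0
        while c in used:
            c += 1
        colors[p] = c
    return colors

def color_groups(colors):
    """Group points by color; points within one group can be updated concurrently."""
    groups = {}
    for p, c in colors.items():
        groups.setdefault(c, []).append(p)
    return [groups[c] for c in sorted(groups)]

# Four points in a ring: 0-1-2-3-0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
colors = greedy_coloring(ring)
groups = color_groups(colors)
print(groups)
```

Sweeping the groups one color at a time means no point is updated at the same moment as one of its neighbors, which is how a color-by-color sweep avoids the data dependency of a sequential LU-SGS pass.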
A GPU accelerated and error-controlled solver for the unbounded Poisson equation in three dimensions

NASA Astrophysics Data System (ADS)

Exl, Lukas

2017-12-01

An efficient solver for the three dimensional free-space Poisson equation is presented. The underlying numerical method is based on finite Fourier series approximation. While the error of all involved approximations can be fully controlled, the overall computation error is driven by the convergence of the finite Fourier series of the density. For smooth and fast-decaying densities the proposed method will be spectrally accurate. The method scales with O(N log N) operations, where N is the total number of discretization points in the Cartesian grid. The majority of the computational costs come from fast Fourier transforms (FFT), which makes it ideal for GPU computation. Several numerical computations on CPU and GPU validate the method and show efficiency and convergence behavior. Tests are performed using the Vienna Scientific Cluster 3 (VSC3). A free MATLAB implementation for CPU and GPU is provided to the interested community.
Characterization of Meta-Materials Using Computational Electromagnetic Methods

NASA Technical Reports Server (NTRS)

Deshpande, Manohar; Shin, Joon

2005-01-01

An efficient and powerful computational method is presented to synthesize a meta-material to specified electromagnetic properties. Using the periodicity of meta-materials, the Finite Element Methodology (FEM) is developed to estimate the reflection and transmission through the meta-material structure for normal plane-wave incidence. For efficient computation of the reflection and transmission through a meta-material over a wide frequency band, a Finite Difference Time Domain (FDTD) approach is also developed. Using the Nicholson-Ross method and Genetic Algorithms, a robust procedure is described to extract the electromagnetic properties of a meta-material from knowledge of its reflection and transmission coefficients. A few numerical examples are also presented to validate the present approach.
Recursive Newton-Euler formulation of manipulator dynamics

NASA Technical Reports Server (NTRS)

Nasser, M. G.

1989-01-01

A recursive Newton-Euler procedure is presented for the formulation and solution of manipulator dynamical equations. The procedure includes rotational and translational joints and a topological tree. This model was verified analytically using a planar two-link manipulator. Also, the model was tested numerically against the Walker-Orin model using the Shuttle Remote Manipulator System data. The hinge accelerations obtained from both models were identical. The computational requirements of the model vary linearly with the number of joints. The computational efficiency of this method exceeds that of Walker-Orin methods. This procedure may be viewed as a considerable generalization of Armstrong's method. A six-by-six formulation is adopted which enhances both the computational efficiency and simplicity of the model.
Computer Facilitated Mathematical Methods in Chemical Engineering--Similarity Solution

ERIC Educational Resources Information Center

Subramanian, Venkat R.

2006-01-01

High-performance computers coupled with highly efficient numerical schemes and user-friendly software packages have helped instructors to teach numerical solutions and analysis of various nonlinear models more efficiently in the classroom. One of the main objectives of a model is to provide insight about the system of interest. Analytical…
Efficient Computation of Difference Vibrational Spectra in Isothermal-Isobaric Ensemble.

PubMed

Joutsuka, Tatsuya; Morita, Akihiro

2016-11-03

Difference spectroscopy between two closely related systems is widely used to enhance selectivity to the parts of the observed system that differ, although computing tiny difference spectra by molecular dynamics through subtraction of two spectra would be extraordinarily demanding computationally. Therefore, we have proposed an efficient computational algorithm for difference spectra that does not resort to subtraction. The present paper reports our extension of the theoretical method to the isothermal-isobaric (NPT) ensemble. The present theory expands our range of analysis to include the pressure dependence of the spectra. We verified that the present theory yields accurate difference spectra in the NPT condition as well, with computational efficiency several orders of magnitude better than straightforward subtraction. The method is further applied to vibrational spectra of liquid water at varying pressure and succeeds in reproducing the tiny difference spectra caused by pressure change. The anomalous pressure dependence is elucidated in relation to other properties of liquid water.
Fast approximation for joint optimization of segmentation, shape, and location priors, and its application in gallbladder segmentation.

PubMed

Saito, Atsushi; Nawano, Shigeru; Shimizu, Akinobu

2017-05-01

This paper addresses joint optimization for segmentation and shape priors, including translation, to overcome inter-subject variability in the location of an organ. Because a simple extension of the previous exact optimization method is too computationally complex, we propose a fast approximation for optimization. The effectiveness of the proposed approximation is validated in the context of gallbladder segmentation from a non-contrast computed tomography (CT) volume. After spatial standardization and estimation of the posterior probability of the target organ, simultaneous optimization of the segmentation, shape, and location priors is performed using a branch-and-bound method. Fast approximation is achieved by combining sampling in the eigenshape space to reduce the number of shape priors and an efficient computational technique for evaluating the lower bound. Performance was evaluated using threefold cross-validation of 27 CT volumes. Optimization in terms of translation of the shape prior significantly improved segmentation performance. The proposed method achieved a result of 0.623 on the Jaccard index in gallbladder segmentation, which is comparable to that of state-of-the-art methods. The computational efficiency of the algorithm is confirmed to be good enough to allow execution on a personal computer. Joint optimization of the segmentation, shape, and location priors was proposed, and it proved to be effective in gallbladder segmentation with high computational efficiency.
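The Jaccard index quoted above is the ratio of the overlap between a predicted and a reference segmentation to their union. A minimal Python sketch follows; the function name `jaccard_index` and the toy masks are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def jaccard_index(pred, truth):
    """Jaccard index |A ∩ B| / |A ∪ B| of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        # Convention assumed here: two empty masks agree perfectly.
        return 1.0
    return np.logical_and(pred, truth).sum() / union

# Toy 2x3 masks: 2 voxels overlap, 4 voxels are in the union.
pred = np.array([[1, 1, 0], [0, 1, 0]])
truth = np.array([[1, 0, 0], [0, 1, 1]])
score = jaccard_index(pred, truth)
```

For these toy masks the score is 2/4 = 0.5; a value of 1 would indicate perfect overlap.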
Computationally Efficient Clustering of Audio-Visual Meeting Data

NASA Astrophysics Data System (ADS)

Hung, Hayley; Friedland, Gerald; Yeo, Chuohao

This chapter presents novel computationally efficient algorithms to extract semantically meaningful acoustic and visual events related to each of the participants in a group discussion using the example of business meeting recordings. The recording setup involves relatively few audio-visual sensors, comprising a limited number of cameras and microphones. We first demonstrate computationally efficient algorithms that can identify who spoke and when, a problem in speech processing known as speaker diarization. We also extract visual activity features efficiently from MPEG4 video by taking advantage of the processing that was already done for video compression. Then, we present a method of associating the audio-visual data together so that the content of each participant can be managed individually. The methods presented in this article can be used as a principal component that enables many higher-level semantic analysis tasks needed in search, retrieval, and navigation.
Efficient GW calculations using eigenvalue-eigenvector decomposition of the dielectric matrix

NASA Astrophysics Data System (ADS)

Nguyen, Huy-Viet; Pham, T. Anh; Rocca, Dario; Galli, Giulia

2011-03-01

During the past 25 years, the GW method has been successfully used to compute electronic quasi-particle excitation spectra of a variety of materials. It is, however, a computationally intensive technique, as it involves summations over occupied and empty electronic states to evaluate both the Green function (G) and the dielectric matrix (DM) entering the expression of the screened Coulomb interaction (W). Recent developments have shown that eigenpotentials of DMs can be efficiently calculated without any explicit evaluation of empty states. In this work, we will present a computationally efficient approach to the calculation of GW spectra by combining a representation of DMs in terms of their eigenpotentials and a recently developed iterative algorithm. As a demonstration of the efficiency of the method, we will present calculations of the vertical ionization potentials of several systems. Work was funded by SciDAC-e DE-FC02-06ER25777.
Efficient ICCG on a shared memory multiprocessor

NASA Technical Reports Server (NTRS)

Hammond, Steven W.; Schreiber, Robert

1989-01-01

Different approaches are discussed for exploiting parallelism in the ICCG (Incomplete Cholesky Conjugate Gradient) method for solving large sparse symmetric positive definite systems of equations on a shared memory parallel computer. Techniques for efficiently solving triangular systems and computing sparse matrix-vector products are explored. Three methods for scheduling the tasks in solving triangular systems are implemented on the Sequent Balance 21000. Sample problems that are representative of a large class of problems solved using iterative methods are used. We show that a static analysis to determine data dependences in the triangular solve can greatly improve its parallel efficiency. We also show that ignoring symmetry and storing the whole matrix can reduce solution time substantially.
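One common form of static dependence analysis for a sparse triangular solve is level scheduling: rows are grouped into levels so that every row in a level depends only on rows in earlier levels. The Python sketch below illustrates that idea only; whether it matches the three scheduling methods of the paper is not stated in the abstract, and the name `triangular_solve_levels` and the toy sparsity pattern are assumptions.

```python
from collections import defaultdict

def triangular_solve_levels(deps):
    """Group rows of a lower-triangular solve into levels.

    deps[i] lists the columns j < i where L[i, j] is nonzero, i.e. the
    unknowns that row i must wait for. Rows in the same level do not
    depend on each other and could be solved concurrently.
    """
    level = []
    for cols in deps:
        # A row with no dependencies sits in level 0; otherwise it goes
        # one level after the latest row it depends on.
        level.append(1 + max((level[j] for j in cols), default=-1))
    groups = defaultdict(list)
    for i, lv in enumerate(level):
        groups[lv].append(i)
    return [groups[lv] for lv in sorted(groups)]

# Toy pattern: row 1 needs row 0, row 3 needs rows 1 and 2, row 4 needs row 0.
pattern = [[], [0], [], [1, 2], [0]]
schedule = triangular_solve_levels(pattern)
```

Here the five rows collapse into three levels, `[[0, 2], [1, 4], [3]]`, so at most three sequential steps are needed instead of five.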
A Generalized Method for the Comparable and Rigorous Calculation of the Polytropic Efficiencies of Turbocompressors

NASA Astrophysics Data System (ADS)

Dimitrakopoulos, Panagiotis

2018-03-01

The calculation of polytropic efficiencies is a very important task, especially during the development of new compression units, such as compressor impellers, stages and stage groups. Such calculations are also crucial for determining the performance of a whole compressor. As processors and computational capacities have improved substantially in recent years, the need has emerged for a new, rigorous, robust, accurate and at the same time standardized method for computing polytropic efficiencies, especially one based on the thermodynamics of real gases. The proposed method is based on the rigorous definition of the polytropic efficiency. The input consists of pressure and temperature values at the end points of the compression path (suction and discharge) for a given working fluid. The average relative error for the studied cases was 0.536 %. Thus, this high-accuracy method is proposed for efficiency calculations related to turbocompressors and their compression units, especially when they are operating at high power levels, for example in jet engines and high-power plants.
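For orientation only, the textbook ideal-gas approximation of the polytropic compression efficiency uses the same four inputs (suction and discharge pressure and temperature). The sketch below is that simplified formula, not the real-gas method proposed in the paper; the function name, the constant heat-capacity ratio `kappa`, and the sample state values are assumptions.

```python
import math

def polytropic_efficiency_ideal_gas(p1, t1, p2, t2, kappa=1.4):
    """Ideal-gas polytropic efficiency of a compression from (p1, t1) to (p2, t2).

    eta_p = (kappa - 1) / kappa * ln(p2 / p1) / ln(t2 / t1)
    Pressures share one unit; temperatures are absolute (kelvin).
    kappa is the heat-capacity ratio, assumed constant along the path.
    """
    return (kappa - 1.0) / kappa * math.log(p2 / p1) / math.log(t2 / t1)

# Hypothetical air compressor: 1 bar, 300 K at suction; 4 bar, 480 K at discharge.
eta = polytropic_efficiency_ideal_gas(p1=1.0e5, t1=300.0, p2=4.0e5, t2=480.0)
```

With these sample values the efficiency comes out at about 0.84. For real gases at high pressure the ideal-gas assumption breaks down, which is the case the paper addresses.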
An efficient coordinate transformation technique for unsteady, transonic aerodynamic analysis of low aspect-ratio wings

NASA Technical Reports Server (NTRS)

Guruswamy, G. P.; Goorjian, P. M.

1984-01-01

An efficient coordinate transformation technique is presented for constructing grids for unsteady, transonic aerodynamic computations for delta-type wings. The original shearing transformation yielded computations that were numerically unstable and this paper discusses the sources of those instabilities. The new shearing transformation yields computations that are stable, fast, and accurate. Comparisons of those two methods are shown for the flow over the F5 wing that demonstrate the new stability. Also, comparisons are made with experimental data that demonstrate the accuracy of the new method. The computations were made by using a time-accurate, finite-difference, alternating-direction-implicit (ADI) algorithm for the transonic small-disturbance potential equation.
Parallel computation using boundary elements in solid mechanics

NASA Technical Reports Server (NTRS)

Chien, L. S.; Sun, C. T.

1990-01-01

The inherent parallelism of the boundary element method is shown. The boundary element is formulated by assuming a linear variation of displacements and tractions within a line element. Moreover, the MACSYMA symbolic program is employed to obtain analytical results for the influence coefficients. Three computational components are parallelized in this method to show the speedup and efficiency in computation. The global coefficient matrix is first formed concurrently. Then, a parallel Gaussian elimination scheme is applied to solve the resulting system of equations. Finally, and more importantly, the domain solutions of a given boundary value problem are calculated simultaneously. Linear speedups and high efficiencies are shown for a demonstration problem solved on the Sequent Symmetry S81 parallel computing system.
Seismic data restoration with a fast L1 norm trust region method

NASA Astrophysics Data System (ADS)

Cao, Jingjie; Wang, Yanfei

2014-08-01

Seismic data restoration is a major strategy for providing a reliable wavefield when field data do not satisfy the Shannon sampling theorem. Recovery by sparsity-promoting inversion often yields sparse solutions of the seismic data in a transformed domain; however, most methods for sparsity-promoting inversion are line-search methods, which are efficient but tend to obtain local solutions. A trust region method, which can provide globally convergent solutions, is a good choice for overcoming this shortcoming. A trust region method for sparse inversion has been proposed previously, but its efficiency must be improved to make it suitable for large-scale computation. In this paper, a new L1 norm trust region model is proposed for seismic data restoration, and a robust gradient projection method is used to solve the sub-problem. Numerical results on synthetic and field data demonstrate that the proposed trust region method achieves excellent computation speed and is a viable alternative for large-scale computation.
Exponential Methods for the Time Integration of Schroedinger Equation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cano, B.; Gonzalez-Pachon, A.

2010-09-30

We consider exponential methods of second order in time for integrating the cubic nonlinear Schroedinger equation. We are interested in taking advantage of the special structure of this equation. Therefore, we examine the symmetry, symplecticity and approximation of invariants of the proposed methods. These properties allow integration over long times with reasonable accuracy. Computational efficiency is also our aim. Therefore, we carry out numerical computations to compare the methods considered and conclude that explicit Lawson schemes projected on the norm of the solution are an efficient tool for integrating this equation.
Highly efficient and exact method for parallelization of grid-based algorithms and its implementation in DelPhi

PubMed Central

Li, Chuan; Li, Lin; Zhang, Jie; Alexov, Emil

2012-01-01

The Gauss-Seidel method is a standard iterative numerical method widely used to solve systems of equations and, in general, is more efficient than other iterative methods, such as the Jacobi method. However, the standard implementation of the Gauss-Seidel method restricts its use in parallel computing because it requires updated neighboring values (i.e., those from the current iteration) as soon as they are available. Here we report an efficient and exact (not requiring assumptions) method to parallelize iterations and to reduce the computational time as a linear or nearly linear function of the number of CPUs. In contrast to other existing solutions, our method does not require any assumptions and is equally applicable to linear and nonlinear equations. This approach is implemented in the DelPhi program, a finite difference Poisson-Boltzmann equation solver for modeling electrostatics in molecular biology. This development makes the iterative procedure for obtaining the electrostatic potential distribution several-fold faster in the parallelized DelPhi than in the serial code. Further, we demonstrate the advantages of the new parallelized DelPhi by computing the electrostatic potential and the corresponding energies of large supramolecular structures. PMID:22674480
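The data dependency that makes Gauss-Seidel harder to parallelize than Jacobi can be seen in a few lines of Python. This is a textbook sketch on a small, diagonally dominant test system chosen for illustration; it does not reproduce the parallelization scheme implemented in DelPhi, and the function names are assumptions.

```python
import numpy as np

def jacobi(A, b, iters):
    """Jacobi: every component is updated from the previous iterate only."""
    x = np.zeros(len(b))
    d = np.diag(A)
    off_diag = A - np.diagflat(d)
    for _ in range(iters):
        x = (b - off_diag @ x) / d
    return x

def gauss_seidel(A, b, iters):
    """Gauss-Seidel: component i uses components 0..i-1 of the current sweep."""
    n = len(b)
    x = np.zeros(n)
    for _ in range(iters):
        for i in range(n):
            # x[:i] already holds values from this sweep, x[i+1:] from the last.
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (b[i] - s) / A[i, i]
    return x

A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
b = np.array([2.0, 4.0, 10.0])
exact = np.linalg.solve(A, b)
err_jacobi = np.linalg.norm(jacobi(A, b, 10) - exact)
err_gs = np.linalg.norm(gauss_seidel(A, b, 10) - exact)
```

After the same number of sweeps, `err_gs` is smaller than `err_jacobi` on this system, but the inner loop of `gauss_seidel` must run in order, whereas the Jacobi update is a single vectorized step.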
Testing alternative ground water models using cross-validation and other methods

USGS Publications Warehouse

Foglia, L.; Mehl, S.W.; Hill, M.C.; Perona, P.; Burlando, P.

2007-01-01

Many methods can be used to test alternative ground water models. Of concern in this work are methods able to (1) rank alternative models (also called model discrimination) and (2) identify observations important to parameter estimates and predictions (equivalent to the purpose served by some types of sensitivity analysis). Some of the measures investigated are computationally efficient; others are computationally demanding. The latter are generally needed to account for model nonlinearity. The efficient model discrimination methods investigated include the information criteria: the corrected Akaike information criterion, Bayesian information criterion, and generalized cross-validation. The efficient sensitivity analysis measures used are dimensionless scaled sensitivity (DSS), composite scaled sensitivity, and parameter correlation coefficient (PCC); the other statistics are DFBETAS, Cook's D, and observation-prediction statistic. Acronyms are explained in the introduction. Cross-validation (CV) is a computationally intensive nonlinear method that is used for both model discrimination and sensitivity analysis. The methods are tested using up to five alternative parsimoniously constructed models of the ground water system of the Maggia Valley in southern Switzerland. The alternative models differ in their representation of hydraulic conductivity. A new method for graphically representing CV and sensitivity analysis results for complex models is presented and used to evaluate the utility of the efficient statistics. The results indicate that for model selection, the information criteria produce similar results at much smaller computational cost than CV. For identifying important observations, the only obviously inferior linear measure is DSS; the poor performance was expected because DSS does not include the effects of parameter correlation and PCC reveals large parameter correlations. © 2007 National Ground Water Association.
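The information criteria named above are cheap because each needs only one calibrated fit per model. The Python sketch below uses the common least-squares forms, AICc = n ln(SSE/n) + 2k + 2k(k+1)/(n-k-1) and BIC = n ln(SSE/n) + k ln n; these forms, the parameter-counting convention, and the sample numbers are assumptions for illustration, and the paper may use weighted-residual variants.

```python
import math

def aicc(sse, n, k):
    """Corrected Akaike criterion for a least-squares fit.

    sse: sum of squared residuals, n: number of observations,
    k: number of estimated parameters (counting conventions vary).
    """
    aic = n * math.log(sse / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)

def bic(sse, n, k):
    """Bayesian information criterion for a least-squares fit."""
    return n * math.log(sse / n) + k * math.log(n)

# Hypothetical comparison: a 3-parameter and a 6-parameter model, 50 observations.
simple = {"sse": 12.0, "n": 50, "k": 3}
complex_ = {"sse": 11.5, "n": 50, "k": 6}
prefer_simple_aicc = aicc(**simple) < aicc(**complex_)
prefer_simple_bic = bic(**simple) < bic(**complex_)
```

Lower values are better. In this made-up case the slightly better fit of the larger model does not offset its extra parameters, so both criteria rank the simpler model first.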
An accurate and efficient reliability-based design optimization using the second order reliability method and improved stability transformation method

NASA Astrophysics Data System (ADS)

Meng, Zeng; Yang, Dixiong; Zhou, Huanlin; Yu, Bo

2018-05-01

The first order reliability method has been extensively adopted for reliability-based design optimization (RBDO), but it is inaccurate in calculating the failure probability for highly nonlinear performance functions. Thus, the second order reliability method is required to evaluate the reliability accurately. However, its application to RBDO is quite challenging owing to the high computational cost incurred by the repeated reliability evaluation and Hessian calculation of probabilistic constraints. In this article, a new improved stability transformation method is proposed to search for the most probable point efficiently, and the Hessian matrix is calculated by the symmetric rank-one update. The computational capability of the proposed method is illustrated and compared to the existing RBDO approaches through three mathematical and two engineering examples. The comparison results indicate that the proposed method is very efficient and accurate, providing an alternative tool for RBDO of engineering structures.
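The symmetric rank-one (SR1) update mentioned above builds a Hessian approximation from gradient differences instead of evaluating second derivatives. A minimal Python sketch of the standard SR1 formula follows; the function name, the skip tolerance, and the quadratic test function are assumptions, and the paper's integration with the reliability analysis is not reproduced.

```python
import numpy as np

def sr1_update(B, s, y, tol=1e-8):
    """One SR1 update of the Hessian approximation B.

    s: step between two points, y: change in gradient over that step.
    B_new = B + r r^T / (r^T s), with r = y - B s.
    """
    r = y - B @ s
    denom = r @ s
    # Standard safeguard: skip the update when the denominator is tiny.
    if abs(denom) <= tol * np.linalg.norm(r) * np.linalg.norm(s):
        return B
    return B + np.outer(r, r) / denom

# Quadratic test function with known Hessian H, so y = H s exactly.
H = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.eye(2)
for s in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):
    B = sr1_update(B, s, H @ s)
```

On a quadratic function, two linearly independent steps are enough here for `B` to reproduce `H`, which is why the update can replace repeated finite-difference Hessian evaluations.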
Least squares QR-based decomposition provides an efficient way of computing optimal regularization parameter in photoacoustic tomography.

PubMed

Shaw, Calvin B; Prakash, Jaya; Pramanik, Manojit; Yalavarthy, Phaneendra K

2013-08-01

A computationally efficient approach that computes the optimal regularization parameter for the Tikhonov-minimization scheme is developed for photoacoustic imaging. This approach is based on the least squares-QR decomposition which is a well-known dimensionality reduction technique for a large system of equations. It is shown that the proposed framework is effective in terms of quantitative and qualitative reconstructions of initial pressure distribution enabled via finding an optimal regularization parameter. The computational efficiency and performance of the proposed method are shown using a test case of numerical blood vessel phantom, where the initial pressure is exactly known for quantitative comparison.
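To make the role of the regularization parameter concrete, the Python sketch below solves a small Tikhonov problem for a sweep of parameter values and picks the one closest to a known ground truth, as is possible with a numerical phantom. It uses the dense normal equations rather than the LSQR-based procedure of the paper; the random test system, the noise level, and all names are assumptions.

```python
import numpy as np

def tikhonov_solve(A, b, lam):
    """Minimize ||A x - b||^2 + lam * ||x||^2 via the normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 20))        # stand-in for the imaging system matrix
x_true = rng.standard_normal(20)         # stand-in for the known phantom
b = A @ x_true + 0.05 * rng.standard_normal(40)

# Sweep the regularization parameter and score each solution against the truth.
lams = np.logspace(-6, 2, 9)
errors = [np.linalg.norm(tikhonov_solve(A, b, lam) - x_true) for lam in lams]
best_lam = lams[int(np.argmin(errors))]
```

Each value of `lam` costs a full solve in this naive sweep, which is the expense that a reduced-dimension approach such as the one in the paper aims to avoid.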
Secure Multiparty Quantum Computation for Summation and Multiplication.

PubMed

Shi, Run-hua; Mu, Yi; Zhong, Hong; Cui, Jie; Zhang, Shun

2016-01-21

As a fundamental primitive, Secure Multiparty Summation and Multiplication can be used to build complex secure protocols for other multiparty computations, especially numerical computations. However, there is still a lack of systematic and efficient quantum methods to compute Secure Multiparty Summation and Multiplication. In this paper, we present a novel and efficient quantum approach to securely compute the summation and multiplication of multiparty private inputs, respectively. Compared to classical solutions, our proposed approach can ensure unconditional security and perfect privacy protection based on the physical principles of quantum mechanics.

Secure Multiparty Quantum Computation for Summation and Multiplication

PubMed Central

Shi, Run-hua; Mu, Yi; Zhong, Hong; Cui, Jie; Zhang, Shun

2016-01-01

As a fundamental primitive, Secure Multiparty Summation and Multiplication can be used to build complex secure protocols for other multiparty computations, especially numerical computations. However, there is still a lack of systematic and efficient quantum methods to compute Secure Multiparty Summation and Multiplication. In this paper, we present a novel and efficient quantum approach to securely compute the summation and multiplication of multiparty private inputs, respectively. Compared to classical solutions, our proposed approach can ensure unconditional security and perfect privacy protection based on the physical principles of quantum mechanics. PMID:26792197
High-efficiency photorealistic computer-generated holograms based on the backward ray-tracing technique

NASA Astrophysics Data System (ADS)

Wang, Yuan; Chen, Zhidong; Sang, Xinzhu; Li, Hui; Zhao, Linmin

2018-03-01

Holographic displays can provide the complete optical wave field of a three-dimensional (3D) scene, including depth perception. However, producing traditional computer-generated holograms (CGHs) often takes a long computation time, even without more complex, photorealistic rendering. The backward ray-tracing technique can render photorealistic, high-quality images, and its high degree of parallelism noticeably reduces the computation time. Here, a high-efficiency photorealistic computer-generated hologram method is presented based on the ray-tracing technique. Rays are launched and traced in parallel under different illuminations and circumstances. Experimental results demonstrate the effectiveness of the proposed method. Compared with the traditional point cloud CGH, the computation time is decreased to 24 s to reconstruct a 3D object of 100 × 100 rays with continuous depth change.
A Secure and Verifiable Outsourced Access Control Scheme in Fog-Cloud Computing.

PubMed

Fan, Kai; Wang, Junxiong; Wang, Xin; Li, Hui; Yang, Yintang

2017-07-24

With the rapid development of big data and the Internet of Things (IoT), the number of networked devices and the volume of data are increasing dramatically. Fog computing, which extends cloud computing to the edge of the network, can effectively solve the bottleneck problems of data transmission and data storage. However, security and privacy challenges are also arising in the fog-cloud computing environment. Ciphertext-policy attribute-based encryption (CP-ABE) can be adopted to realize data access control in fog-cloud computing systems. In this paper, we propose a verifiable outsourced multi-authority access control scheme, named VO-MAACS. In our construction, most encryption and decryption computations are outsourced to fog devices, and the computation results can be verified using our verification method. Meanwhile, to address the revocation issue, we design an efficient user and attribute revocation method for the scheme. Finally, analysis and simulation results show that our scheme is both secure and highly efficient.
A flexible and accurate digital volume correlation method applicable to high-resolution volumetric images

NASA Astrophysics Data System (ADS)

Pan, Bing; Wang, Bo

2017-10-01

Digital volume correlation (DVC) is a powerful technique for quantifying interior deformation within solid opaque materials and biological tissues. In the last two decades, great efforts have been made to improve the accuracy and efficiency of the DVC algorithm. However, there is still a lack of a flexible, robust and accurate version that can be efficiently implemented in personal computers with limited RAM. This paper proposes an advanced DVC method that can realize accurate full-field internal deformation measurement applicable to high-resolution volume images with up to billions of voxels. Specifically, a novel layer-wise reliability-guided displacement tracking strategy combined with dynamic data management is presented to guide the DVC computation from slice to slice. The displacements at specified calculation points in each layer are computed using the advanced 3D inverse-compositional Gauss-Newton algorithm with the complete initial guess of the deformation vector accurately predicted from the computed calculation points. Since only limited slices of interest in the reference and deformed volume images rather than the whole volume images are required, the DVC calculation can thus be efficiently implemented on personal computers. The flexibility, accuracy and efficiency of the presented DVC approach are demonstrated by analyzing computer-simulated and experimentally obtained high-resolution volume images.
A multilevel finite element method for Fredholm integral eigenvalue problems

NASA Astrophysics Data System (ADS)

Xie, Hehu; Zhou, Tao

2015-12-01

In this work, we proposed a multigrid finite element (MFE) method for solving the Fredholm integral eigenvalue problems. The main motivation for such studies is to compute the Karhunen-Loève expansions of random fields, which play an important role in the applications of uncertainty quantification. In our MFE framework, solving the eigenvalue problem is converted to doing a series of integral iterations and eigenvalue solving in the coarsest mesh. Then, any existing efficient integration scheme can be used for the associated integration process. The error estimates are provided, and the computational complexity is analyzed. It is noticed that the total computational work of our method is comparable with a single integration step in the finest mesh. Several numerical experiments are presented to validate the efficiency of the proposed numerical method.
Solving the Coupled System Improves Computational Efficiency of the Bidomain Equations

PubMed Central

Southern, James A.; Plank, Gernot; Vigmond, Edward J.; Whiteley, Jonathan P.

2017-01-01

The bidomain equations are frequently used to model the propagation of cardiac action potentials across cardiac tissue. At the whole organ level the size of the computational mesh required makes their solution a significant computational challenge. As the accuracy of the numerical solution cannot be compromised, efficiency of the solution technique is important to ensure that the results of the simulation can be obtained in a reasonable time whilst still encapsulating the complexities of the system. In an attempt to increase efficiency of the solver, the bidomain equations are often decoupled into one parabolic equation that is computationally very cheap to solve and an elliptic equation that is much more expensive to solve. In this study the performance of this uncoupled solution method is compared with an alternative strategy in which the bidomain equations are solved as a coupled system. This seems counter-intuitive as the alternative method requires the solution of a much larger linear system at each time step. However, in tests on two 3-D rabbit ventricle benchmarks it is shown that the coupled method is up to 80% faster than the conventional uncoupled method — and that parallel performance is better for the larger coupled problem. PMID:19457741
Large-scale inverse model analyses employing fast randomized data reduction

NASA Astrophysics Data System (ADS)

Lin, Youzuo; Le, Ellen B.; O'Malley, Daniel; Vesselinov, Velimir V.; Bui-Thanh, Tan

2017-08-01

When the number of observations is large, it is computationally challenging to apply classical inverse modeling techniques. We have developed a new computationally efficient technique for solving inverse problems with a large number of observations (e.g., on the order of 10^7 or greater). Our method, which we call the randomized geostatistical approach (RGA), is built upon the principal component geostatistical approach (PCGA). We employ a data reduction technique combined with the PCGA to improve the computational efficiency and reduce the memory usage. Specifically, we employ a randomized numerical linear algebra technique based on a so-called "sketching" matrix to effectively reduce the dimension of the observations without losing the information content needed for the inverse analysis. In this way, the computational and memory costs for RGA scale with the information content rather than the size of the calibration data. Our algorithm is coded in Julia and implemented in the MADS open-source high-performance computational framework (http://mads.lanl.gov). We apply our new inverse modeling method to invert for a synthetic transmissivity field. Compared to a standard geostatistical approach (GA), our method is more efficient when the number of observations is large. Most importantly, our method is capable of solving larger inverse problems than the standard GA and PCGA approaches. Therefore, our new model inversion method is a powerful tool for solving large-scale inverse problems. The method can be applied in any field and is not limited to hydrogeological applications such as the characterization of aquifer heterogeneity.
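The effect of a sketching matrix can be illustrated on an ordinary least-squares problem: a random matrix S with far fewer rows than there are observations compresses the data before the solve. The Python sketch below uses a dense Gaussian sketch on synthetic data; it is not the RGA algorithm (which is coded in Julia within MADS), and the problem sizes, noise level, and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 5000, 10, 200                  # observations, unknowns, sketch size

# Synthetic linear forward model with noisy observations.
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = A @ x_true + 0.01 * rng.standard_normal(m)

# Gaussian sketching matrix: k random combinations of the m observations.
S = rng.standard_normal((k, m)) / np.sqrt(k)

# Full solve uses all m rows; sketched solve uses only k rows.
x_full, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
rel_diff = np.linalg.norm(x_sketch - x_full) / np.linalg.norm(x_full)
```

The sketched system has 200 rows instead of 5000, yet `rel_diff` stays small here because the 10 unknowns carry far less information than the raw observation count suggests.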
Computation of Quasiperiodic Normally Hyperbolic Invariant Tori: Rigorous Results

NASA Astrophysics Data System (ADS)

Canadell, Marta; Haro, Àlex

2017-12-01

The development of efficient methods for detecting quasiperiodic oscillations and computing the corresponding invariant tori is a subject of great importance in dynamical systems and their applications in science and engineering. In this paper, we prove the convergence of a new Newton-like method for computing quasiperiodic normally hyperbolic invariant tori carrying quasiperiodic motion in smooth families of real-analytic dynamical systems. The main result is stated as an a posteriori KAM-like theorem that allows controlling the inner dynamics on the torus with appropriate detuning parameters, in order to obtain a prescribed quasiperiodic motion. The Newton-like method leads to several fast and efficient computational algorithms, which are discussed and tested in a companion paper (Canadell and Haro in J Nonlinear Sci, 2017. doi: 10.1007/s00332-017-9388-z), in which new mechanisms of breakdown are presented.
Efficient, graph-based white matter connectivity from orientation distribution functions via multi-directional graph propagation

NASA Astrophysics Data System (ADS)

Boucharin, Alexis; Oguz, Ipek; Vachet, Clement; Shi, Yundi; Sanchez, Mar; Styner, Martin

2011-03-01

The use of regional connectivity measurements derived from diffusion imaging datasets has become of considerable interest in the neuroimaging community in order to better understand cortical and subcortical white matter connectivity. Current connectivity assessment methods are based on streamline fiber tractography, usually applied in a Monte-Carlo fashion. In this work we present a novel, graph-based method that performs a fully deterministic, efficient and stable connectivity computation. The method handles crossing fibers and deals well with multiple seed regions. The computation is based on a multi-directional graph propagation method applied to sampled orientation distribution function (ODF), which can be computed directly from the original diffusion imaging data. We show early results of our method on synthetic and real datasets. The results illustrate the potential of our method towards subject-specific connectivity measurements that are performed in an efficient, stable and reproducible manner. Such individual connectivity measurements would be well suited for application in population studies of neuropathology, such as Autism, Huntington's Disease, Multiple Sclerosis or leukodystrophies. The proposed method is generic and could easily be applied to non-diffusion data as long as local directional data can be derived.
Computationally efficient method for localizing the spiral rotor source using synthetic intracardiac electrograms during atrial fibrillation.

PubMed

Shariat, M H; Gazor, S; Redfearn, D

2015-08-01

Atrial fibrillation (AF), the most common sustained cardiac arrhythmia, is an extremely costly public health problem. Catheter-based ablation is a common minimally invasive procedure to treat AF. Contemporary mapping methods are highly dependent on the accuracy of anatomic localization of rotor sources within the atria. In this paper, using simulated atrial intracardiac electrograms (IEGMs) during AF, we propose a computationally efficient method for localizing the tip of the electrical rotor with an Archimedean/arithmetic spiral wavefront. The proposed method deploys the locations of electrodes of a catheter and their IEGMs activation times to estimate the unknown parameters of the spiral wavefront including its tip location. The proposed method is able to localize the spiral as soon as the wave hits three electrodes of the catheter. Our simulation results show that the method can efficiently localize the spiral wavefront that rotates either clockwise or counterclockwise.
A method to efficiently apply a biogeochemical model to a landscape.

Treesearch

Robert E. Kennedy; David P. Turner; Warren B. Cohen; Michael Guzy

2006-01-01

Biogeochemical models offer an important means of understanding carbon dynamics, but the computational complexity of many models means that modeling all grid cells on a large landscape is computationally burdensome. Because most biogeochemical models ignore adjacency effects between cells, however, a more efficient approach is possible. Recognizing that spatial...
Time and Learning Efficiency in Internet-Based Learning: A Systematic Review and Meta-Analysis

ERIC Educational Resources Information Center

Cook, David A.; Levinson, Anthony J.; Garside, Sarah

2010-01-01

Authors have claimed that Internet-based instruction promotes greater learning efficiency than non-computer methods. Objectives: Determine, through a systematic synthesis of evidence in health professions education, how Internet-based instruction compares with non-computer instruction in time spent learning, and what features of Internet-based…
Efficient Stochastic Inversion Using Adjoint Models and Kernel-PCA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Thimmisetty, Charanraj A.; Zhao, Wenju; Chen, Xiao

2017-10-18

Performing stochastic inversion on a computationally expensive forward simulation model with a high-dimensional uncertain parameter space (e.g. a spatial random field) is computationally prohibitive even when gradient information can be computed efficiently. Moreover, the ‘nonlinear’ mapping from parameters to observables generally gives rise to non-Gaussian posteriors even with Gaussian priors, thus hampering the use of efficient inversion algorithms designed for models with Gaussian assumptions. In this paper, we propose a novel Bayesian stochastic inversion methodology, which is characterized by a tight coupling between the gradient-based Langevin Markov Chain Monte Carlo (LMCMC) method and a kernel principal component analysis (KPCA). This approach addresses the ‘curse-of-dimensionality’ via KPCA to identify a low-dimensional feature space within the high-dimensional and nonlinearly correlated parameter space. In addition, non-Gaussian posterior distributions are estimated via an efficient LMCMC method on the projected low-dimensional feature space. We will demonstrate this computational framework by integrating and adapting our recent data-driven statistics-on-manifolds constructions and reduction-through-projection techniques to a linear elasticity model.
An efficient hybrid technique in RCS predictions of complex targets at high frequencies

NASA Astrophysics Data System (ADS)

Algar, María-Jesús; Lozano, Lorena; Moreno, Javier; González, Iván; Cátedra, Felipe

2017-09-01

Most computer codes in Radar Cross Section (RCS) prediction use Physical Optics (PO) and Physical Theory of Diffraction (PTD) combined with Geometrical Optics (GO) and Geometrical Theory of Diffraction (GTD). The latter approaches are computationally cheaper and much more accurate for curved surfaces, but not applicable for the computation of the RCS of all surfaces of a complex object due to the presence of caustic problems in the analysis of concave surfaces or flat surfaces in the far field. The main contribution of this paper is the development of a hybrid method based on a new combination of two asymptotic techniques: GTD and PO, considering the advantages and avoiding the disadvantages of each of them. A very efficient and accurate method to analyze the RCS of complex structures at high frequencies is obtained with the new combination. The proposed new method has been validated by comparing RCS results obtained for some simple cases using the proposed approach with RCS results obtained using the rigorous Method of Moments (MoM). Some complex cases have been examined at high frequencies, contrasting the results with PO. This study shows the accuracy and the efficiency of the hybrid method and its suitability for the computation of the RCS of very large and complex targets at high frequencies.
Marc Henry de Frahan | NREL

Science.gov Websites

Computing Project, Marc develops high-fidelity turbulence models to enhance simulation accuracy and efficient numerical algorithms for future high performance computing hardware architectures. Research Interests High performance computing High order numerical methods for computational fluid dynamics Fluid
A CLASS OF RECONSTRUCTED DISCONTINUOUS GALERKIN METHODS IN COMPUTATIONAL FLUID DYNAMICS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hong Luo; Yidong Xia; Robert Nourgaliev

2011-05-01

A class of reconstructed discontinuous Galerkin (DG) methods is presented to solve compressible flow problems on arbitrary grids. The idea is to combine the efficiency of the reconstruction methods in finite volume methods and the accuracy of the DG methods to obtain a better numerical algorithm in computational fluid dynamics. The beauty of the resulting reconstructed discontinuous Galerkin (RDG) methods is that they provide a unified formulation for both finite volume and DG methods, and contain both classical finite volume and standard DG methods as two special cases of the RDG methods, and thus allow for a direct efficiency comparison. Both Green-Gauss and least-squares reconstruction methods and a least-squares recovery method are presented to obtain a quadratic polynomial representation of the underlying linear discontinuous Galerkin solution on each cell via a so-called in-cell reconstruction process. The devised in-cell reconstruction is aimed to augment the accuracy of the discontinuous Galerkin method by increasing the order of the underlying polynomial solution. These three reconstructed discontinuous Galerkin methods are used to compute a variety of compressible flow problems on arbitrary meshes to assess their accuracy. The numerical experiments demonstrate that all three reconstructed discontinuous Galerkin methods can significantly improve the accuracy of the underlying second-order DG method, although the least-squares reconstructed DG method provides the best performance in terms of accuracy, efficiency, and robustness.
Efficient modeling of interconnects and capacitive discontinuities in high-speed digital circuits. Thesis

NASA Technical Reports Server (NTRS)

Oh, K. S.; Schutt-Aine, J.

1995-01-01

Modeling of interconnects and associated discontinuities has gained considerable interest over the last decade with the recent advances in high-speed digital circuits, although the theoretical bases for analyzing these structures were well established as early as the 1960s. Ongoing research at the present time is focused on devising methods which can be applied to more general geometries than the ones considered in earlier days and, at the same time, improving the computational efficiency and accuracy of these methods. In this thesis, numerically efficient methods to compute the transmission line parameters of a multiconductor system and the equivalent capacitances of various strip discontinuities are presented based on the quasi-static approximation. The presented techniques are applicable to conductors embedded in an arbitrary number of dielectric layers with two possible locations of ground planes at the top and bottom of the dielectric layers. The cross-sections of conductors can be arbitrary as long as they can be described with polygons. An integral equation approach in conjunction with the collocation method is used in the presented methods. A closed-form Green's function is derived based on weighted real images, thus avoiding nested infinite summations in the exact Green's function; therefore, this closed-form Green's function is numerically more efficient than the exact Green's function. All elements associated with the moment matrix are computed using the closed-form formulas. Various numerical examples are considered to verify the presented methods, and a comparison of the computed results with other published results showed good agreement.
Efficient iterative method for solving the Dirac-Kohn-Sham density functional theory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lin, Lin; Shao, Sihong; E, Weinan

2012-11-06

We present for the first time an efficient iterative method to directly solve the four-component Dirac-Kohn-Sham (DKS) density functional theory. Due to the existence of the negative energy continuum in the DKS operator, the existing iterative techniques for solving the Kohn-Sham systems cannot be efficiently applied to solve the DKS systems. The key component of our method is a novel filtering step (F) which acts as a preconditioner in the framework of the locally optimal block preconditioned conjugate gradient (LOBPCG) method. The resulting method, dubbed the LOBPCG-F method, is able to compute the desired eigenvalues and eigenvectors in the positive energy band without computing any state in the negative energy band. The LOBPCG-F method introduces mild extra cost compared to the standard LOBPCG method and can be easily implemented. We demonstrate our method in the pseudopotential framework with a planewave basis set which naturally satisfies the kinetic balance prescription. Numerical results for Pt$$_{2}$$, Au$$_{2}$$, TlF, and Bi$$_{2}$$Se$$_{3}$$ indicate that the LOBPCG-F method is a robust and efficient method for investigating the relativistic effect in systems containing heavy elements.
An efficient implementation of a high-order filter for a cubed-sphere spectral element model

NASA Astrophysics Data System (ADS)

Kang, Hyun-Gyu; Cheong, Hyeong-Bin

2017-03-01

A parallel-scalable, isotropic, scale-selective spatial filter was developed for the cubed-sphere spectral element model on the sphere. The filter equation is a high-order elliptic (Helmholtz) equation based on the spherical Laplacian operator, which is transformed into cubed-sphere local coordinates. The Laplacian operator is discretized on the computational domain, i.e., on each cell, by the spectral element method with Gauss-Lobatto Lagrange interpolating polynomials (GLLIPs) as the orthogonal basis functions. On the global domain, the discrete filter equation yielded a linear system represented by a highly sparse matrix. The density of this matrix increases quadratically (linearly) with the order of GLLIP (order of the filter), and the linear system is solved in only O(Ng) operations, where Ng is the total number of grid points. The solution, obtained by a row reduction method, demonstrated the typical accuracy and convergence rate of the cubed-sphere spectral element method. To achieve computational efficiency on parallel computers, the linear system was treated by an inverse matrix method (a sparse matrix-vector multiplication). The density of the inverse matrix was lowered to only a few times that of the original sparse matrix without degrading the accuracy of the solution. For better computational efficiency, a local-domain high-order filter was introduced: the filter equation is applied to multiple cells, and then only the central cell is used to reconstruct the filtered field. The parallel efficiency of applying the inverse matrix method to the global- and local-domain filter was evaluated by the scalability on a distributed-memory parallel computer. The scale-selective performance of the filter was demonstrated on Earth topography. The usefulness of the filter as a hyper-viscosity for the vorticity equation was also demonstrated.
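The abstract above describes applying the filter as a sparse matrix-vector multiplication. As a minimal sketch of that kind of product, the Python below multiplies a matrix held in compressed sparse row (CSR) form by a dense vector. The CSR layout and the name `csr_matvec` are assumptions made for illustration; the abstract does not state which storage format or code the authors used.

```python
def csr_matvec(data, indices, indptr, x):
    """Multiply a CSR-stored sparse matrix by a dense vector x.

    data    -- nonzero values, stored row by row
    indices -- column index of each value in data
    indptr  -- indptr[r]:indptr[r + 1] is the slice of data belonging to row r
    """
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        # Only the stored nonzeros of this row contribute to the result.
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


# Hypothetical 3x3 example matrix:
# [[2, 0, 1],
#  [0, 3, 0],
#  [4, 0, 5]]
data = [2.0, 1.0, 3.0, 4.0, 5.0]
indices = [0, 2, 1, 0, 2]
indptr = [0, 2, 3, 5]
result = csr_matvec(data, indices, indptr, [1.0, 2.0, 3.0])
```

The work is proportional to the number of stored nonzeros, which is why keeping the density of the matrix low matters for cost.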
Efficient method for computing the electronic transport properties of a multiterminal system

NASA Astrophysics Data System (ADS)

Lima, Leandro R. F.; Dusko, Amintor; Lewenkopf, Caio

2018-04-01

We present a multiprobe recursive Green's function method to compute the transport properties of mesoscopic systems using the Landauer-Büttiker approach. By introducing an adaptive partition scheme, we map the multiprobe problem into the standard two-probe recursive Green's function method. We apply the method to compute the longitudinal and Hall resistances of a disordered graphene sample, a system of current interest. We show that the performance and accuracy of our method compares very well with other state-of-the-art schemes.

Efficient tiled calculation of over-10-gigapixel holograms using ray-wavefront conversion.

PubMed

Igarashi, Shunsuke; Nakamura, Tomoya; Matsushima, Kyoji; Yamaguchi, Masahiro

2018-04-16

In the calculation of large-scale computer-generated holograms, an approach called "tiling," which divides the hologram plane into small rectangles, is often employed due to limitations on computational memory. However, the total amount of computational complexity severely increases with the number of divisions. In this paper, we propose an efficient method for calculating tiled large-scale holograms using ray-wavefront conversion. In experiments, the effectiveness of the proposed method was verified by comparing its calculation cost with that using the previous method. Additionally, a hologram of 128K × 128K pixels was calculated and fabricated by a laser-lithography system, and a high-quality 105 mm × 105 mm 3D image including complicated reflection and translucency was optically reconstructed.
Research in Computational Astrobiology

NASA Technical Reports Server (NTRS)

Chaban, Galina; Colombano, Silvano; Scargle, Jeff; New, Michael H.; Pohorille, Andrew; Wilson, Michael A.

2003-01-01

We report on several projects in the field of computational astrobiology, which is devoted to advancing our understanding of the origin, evolution and distribution of life in the Universe using theoretical and computational tools. Research projects included modifying existing computer simulation codes to use efficient, multiple time step algorithms, statistical methods for analysis of astrophysical data via optimal partitioning methods, electronic structure calculations on water-nucleic acid complexes, incorporation of structural information into genomic sequence analysis methods and calculations of shock-induced formation of polycyclic aromatic hydrocarbon compounds.
Multi-fidelity stochastic collocation method for computation of statistical moments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhu, Xueyu, E-mail: xueyu-zhu@uiowa.edu; Linebarger, Erin M., E-mail: aerinline@sci.utah.edu; Xiu, Dongbin, E-mail: xiu.16@osu.edu

We present an efficient numerical algorithm to approximate the statistical moments of stochastic problems, in the presence of models with different fidelities. The method extends the multi-fidelity approximation method developed in . By combining the efficiency of low-fidelity models and the accuracy of high-fidelity models, our method exhibits fast convergence with a limited number of high-fidelity simulations. We establish an error bound of the method and present several numerical examples to demonstrate the efficiency and applicability of the multi-fidelity algorithm.
Propulsive efficiency of frog swimming with different feet and swimming patterns

PubMed Central

Jizhuang, Fan; Wei, Zhang; Bowen, Yuan; Gangfeng, Liu

2017-01-01

Aquatic and terrestrial animals have different swimming performances and mechanical efficiencies based on their different swimming methods. To explore propulsion in swimming frogs, this study calculated mechanical efficiencies based on data describing aquatic and terrestrial webbed-foot shapes and swimming patterns. First, a simplified frog model and dynamic equation were established, and hydrodynamic forces on the foot were computed according to computational fluid dynamic calculations. Then, a two-link mechanism was used to stand in for the diverse and complicated hind legs found in different frog species, in order to simplify the input work calculation. Joint torques were derived based on the virtual work principle to compute the efficiency of foot propulsion. Finally, two feet and swimming patterns were combined to compute propulsive efficiency. The aquatic frog demonstrated a propulsive efficiency (43.11%) between those of drag-based and lift-based propulsions, while the terrestrial frog efficiency (29.58%) fell within the range of drag-based propulsion. The results indicate that the swimming pattern is the main factor in swimming performance and efficiency. PMID:28302669
Extracting Communities from Complex Networks by the k-Dense Method

NASA Astrophysics Data System (ADS)

Saito, Kazumi; Yamada, Takeshi; Kazama, Kazuhiro

To understand the structural and functional properties of large-scale complex networks, it is crucial to efficiently extract a set of cohesive subnetworks as communities. Several such community extraction methods have been proposed in the literature, including the classical k-core decomposition method and, more recently, the k-clique based community extraction method. The k-core method, although computationally efficient, is often not powerful enough for uncovering a detailed community structure, and it produces only coarse-grained and loosely connected communities. The k-clique method, on the other hand, can extract fine-grained and tightly connected communities but requires a substantial amount of computation for large-scale complex networks. In this paper, we present a new notion of a subnetwork called k-dense, and propose an efficient algorithm for extracting k-dense communities. We applied our method to three different types of networks assembled from real data, namely, from blog trackbacks, word associations and Wikipedia references, and demonstrated that the k-dense method could extract communities almost as efficiently as the k-core method, while the quality of the extracted communities is comparable to that obtained by the k-clique method.
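For readers unfamiliar with the classical k-core decomposition that the abstract uses as its efficiency baseline, the following Python is a minimal sketch of the standard peeling procedure: repeatedly delete nodes with fewer than k remaining neighbors. It does not implement the k-dense method itself, whose details the abstract does not give; the function name `k_core` and the toy graph are illustrative assumptions.

```python
def k_core(adjacency, k):
    """Return the nodes of the k-core of an undirected graph.

    adjacency maps each node to the set of its neighbors.
    """
    degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
    alive = set(adjacency)
    queue = [node for node in alive if degree[node] < k]
    while queue:
        node = queue.pop()
        if node not in alive:
            continue  # already removed via an earlier queue entry
        alive.remove(node)
        for neighbor in adjacency[node]:
            if neighbor in alive:
                degree[neighbor] -= 1
                if degree[neighbor] < k:
                    queue.append(neighbor)
    return alive


# Hypothetical toy graph: a 4-clique (a, b, c, d) with a pendant node e.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"},
    "e": {"d"},
}
core3 = k_core(graph, 3)
```

Each edge is inspected a bounded number of times, so the peeling runs in time roughly linear in the size of the graph, which is consistent with the abstract's description of the k-core method as computationally efficient.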
Research related to improved computer aided design software package. [comparative efficiency of finite, boundary, and hybrid element methods in elastostatics

NASA Technical Reports Server (NTRS)

Walston, W. H., Jr.

1986-01-01

The comparative computational efficiencies of the finite element (FEM), boundary element (BEM), and hybrid boundary element-finite element (HVFEM) analysis techniques are evaluated for representative bounded domain interior and unbounded domain exterior problems in elastostatics. Computational efficiency is carefully defined in this study as the computer time required to attain a specified level of solution accuracy. The study found the FEM superior to the BEM for the interior problem, while the reverse was true for the exterior problem. The hybrid analysis technique was found to be comparable or superior to both the FEM and BEM for both the interior and exterior problems.
Efficient implementation of a 3-dimensional ADI method on the iPSC/860

DOE Office of Scientific and Technical Information (OSTI.GOV)

Van der Wijngaart, R.F.

1993-12-31

A comparison is made between several domain decomposition strategies for the solution of three-dimensional partial differential equations on a MIMD distributed memory parallel computer. The grids used are structured, and the numerical algorithm is ADI. Important implementation issues regarding load balancing, storage requirements, network latency, and overlap of computations and communications are discussed. Results of the solution of the three-dimensional heat equation on the Intel iPSC/860 are presented for the three most viable methods. It is found that the Bruno-Cappello decomposition delivers optimal computational speed through an almost complete elimination of processor idle time, while providing good memory efficiency.
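In standard ADI schemes for the heat equation, each sweep reduces to many independent tridiagonal systems along one grid direction, which is what makes the choice of domain decomposition matter. The Python below is a minimal serial sketch of the Thomas algorithm for one such system. It is an illustration only, not the iPSC/860 implementation discussed in the abstract; the function name and the example coefficients are assumptions.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[i] multiplies x[i - 1] (lower[0] is unused),
    upper[i] multiplies x[i + 1] (upper[-1] is unused).
    Assumes no pivoting is needed, e.g. a diagonally dominant matrix.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination of the sub-diagonal.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Hypothetical implicit diffusion line: 3 on the diagonal, -1 off the diagonal.
lower = [0.0, -1.0, -1.0]
diag = [3.0, 3.0, 3.0]
upper = [-1.0, -1.0, 0.0]
rhs = [1.0, 2.0, 7.0]
solution = solve_tridiagonal(lower, diag, upper, rhs)
```

The forward and backward passes are inherently sequential along a line, so a parallel code has to decide how lines and their data are distributed across processors.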
Visualizing Matrix Multiplication

ERIC Educational Resources Information Center

Daugulis, Peteris; Sondore, Anita

2018-01-01

Efficient visualizations of computational algorithms are important tools for students, educators, and researchers. In this article, we point out an innovative visualization technique for matrix multiplication. This method differs from the standard, formal approach by using block matrices to make computations more visual. We find this method a…
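The abstract mentions using block matrices to organize matrix multiplication. As a minimal sketch of the underlying identity, the Python below partitions two square matrices into 2x2 blocks and checks that multiplying block by block gives the ordinary product. The helper names are illustrative assumptions, and this is not the article's visualization technique, which the truncated abstract does not fully describe.

```python
def matmul(a, b):
    """Ordinary row-by-column product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    """Entrywise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def split(m, s):
    """Partition m into four blocks, cutting rows and columns at index s."""
    top, bottom = m[:s], m[s:]
    return ([r[:s] for r in top], [r[s:] for r in top],
            [r[:s] for r in bottom], [r[s:] for r in bottom])


def block_matmul(a, b, s):
    """Multiply square matrices a and b via a 2x2 block partition (0 < s < n)."""
    a11, a12, a21, a22 = split(a, s)
    b11, b12, b21, b22 = split(b, s)
    c11 = add(matmul(a11, b11), matmul(a12, b21))
    c12 = add(matmul(a11, b12), matmul(a12, b22))
    c21 = add(matmul(a21, b11), matmul(a22, b21))
    c22 = add(matmul(a21, b12), matmul(a22, b22))
    # Stitch the four result blocks back into one matrix.
    return ([r1 + r2 for r1, r2 in zip(c11, c12)] +
            [r1 + r2 for r1, r2 in zip(c21, c22)])


a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
blockwise = block_matmul(a, b, 2)
direct = matmul(a, b)
```

Here `blockwise` and `direct` agree, because the block formula is the same row-by-column rule applied to blocks instead of scalars.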
Computer program modifications of Open-file report 82-1065; a comprehensive system for interpreting seismic-refraction and arrival-time data using interactive computer methods

USGS Publications Warehouse

Ackermann, Hans D.; Pankratz, Leroy W.; Dansereau, Danny A.

1983-01-01

The computer programs published in Open-File Report 82-1065, A comprehensive system for interpreting seismic-refraction arrival-time data using interactive computer methods (Ackermann, Pankratz, and Dansereau, 1982), have been modified to run on a mini-computer. The new version uses approximately 1/10 of the memory of the initial version, is more efficient and gives the same results.
Computing Interactions Of Free-Space Radiation With Matter

NASA Technical Reports Server (NTRS)

Wilson, J. W.; Cucinotta, F. A.; Shinn, J. L.; Townsend, L. W.; Badavi, F. F.; Tripathi, R. K.; Silberberg, R.; Tsao, C. H.; Badwar, G. D.

1995-01-01

High Charge and Energy Transport (HZETRN) computer program computationally efficient, user-friendly package of software addressing problem of transport of, and shielding against, radiation in free space. Designed as "black box" for design engineers not concerned with physics of underlying atomic and nuclear radiation processes in free-space environment, but rather primarily interested in obtaining fast and accurate dosimetric information for design and construction of modules and devices for use in free space. Computational efficiency achieved by unique algorithm based on deterministic approach to solution of Boltzmann equation rather than computationally intensive statistical Monte Carlo method. Written in FORTRAN.
Parallelization of the FLAPW method and comparison with the PPW method

NASA Astrophysics Data System (ADS)

Canning, Andrew; Mannstadt, Wolfgang; Freeman, Arthur

2000-03-01

The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining electronic and magnetic properties of crystals and surfaces. In the past the FLAPW method has been limited to systems of about a hundred atoms due to the lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell running on up to 512 processors on a Cray T3E parallel supercomputer. Some results will also be presented on a comparison of the plane-wave pseudopotential method and the FLAPW method on large systems.
A direct method for unfolding the resolution function from measurements of neutron induced reactions

NASA Astrophysics Data System (ADS)

Žugec, P.; Colonna, N.; Sabate-Gilarte, M.; Vlachoudis, V.; Massimi, C.; Lerendegui-Marco, J.; Stamatopoulos, A.; Bacak, M.; Warren, S. G.; n TOF Collaboration

2017-12-01

The paper explores the numerical stability and the computational efficiency of a direct method for unfolding the resolution function from the measurements of neutron induced reactions. A detailed resolution function formalism is laid out, followed by an overview of challenges present in a practical implementation of the method. A special matrix storage scheme is developed in order both to facilitate the memory management of the resolution function matrix and to increase the computational efficiency of the matrix multiplication and decomposition procedures. Due to its admirable computational properties, a Cholesky decomposition is at the heart of the unfolding procedure. With the smallest but necessary modification of the matrix to be decomposed, the method is successfully applied to a system of size 10^5 × 10^5. However, the amplification of the uncertainties during the direct inversion procedures limits the applicability of the method to high-precision measurements of neutron induced reactions.
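Since the abstract places a Cholesky decomposition at the heart of the procedure, a small worked example may help. The Python below is a minimal dense sketch of factoring a symmetric positive-definite matrix as L·Lᵀ and solving a linear system by forward and backward substitution. It does not reproduce the paper's special storage scheme or its matrix modification; the function names and the 2x2 example are assumptions for illustration.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a symmetric positive definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def cholesky_solve(a, b):
    """Solve a x = b using the Cholesky factor of a."""
    l = cholesky(a)
    n = len(b)
    y = [0.0] * n
    for i in range(n):  # forward substitution: L y = b
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # backward substitution: L^T x = y
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x


# Hypothetical 2x2 system whose exact solution is [1, 2].
matrix = [[4.0, 2.0], [2.0, 3.0]]
unfolded = cholesky_solve(matrix, [8.0, 8.0])
```

The square root fails if the matrix is not positive definite, which is one reason a matrix may need to be modified slightly before it can be decomposed.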
Computationally Efficient 2D DOA Estimation with Uniform Rectangular Array in Low-Grazing Angle.

PubMed

Shi, Junpeng; Hu, Guoping; Zhang, Xiaofei; Sun, Fenggang; Xiao, Yu

2017-02-26

In this paper, we propose a computationally efficient spatial differencing matrix set (SDMS) method for two-dimensional direction of arrival (2D DOA) estimation with uniform rectangular arrays (URAs) in a low-grazing angle (LGA) condition. By rearranging the auto-correlation and cross-correlation matrices in turn among different subarrays, the SDMS method can estimate the two parameters independently with one-dimensional (1D) subspace-based estimation techniques, where we only perform difference for auto-correlation matrices and the cross-correlation matrices are kept completely. Then, the pair-matching of two parameters is achieved by extracting the diagonal elements of URA. Thus, the proposed method can decrease the computational complexity, suppress the effect of additive noise and also have little information loss. Simulation results show that, in LGA, compared to other methods, the proposed methods can achieve performance improvement in the white or colored noise conditions.
Computationally Efficient 2D DOA Estimation with Uniform Rectangular Array in Low-Grazing Angle

PubMed Central

Shi, Junpeng; Hu, Guoping; Zhang, Xiaofei; Sun, Fenggang; Xiao, Yu

2017-01-01

In this paper, we propose a computationally efficient spatial differencing matrix set (SDMS) method for two-dimensional direction of arrival (2D DOA) estimation with uniform rectangular arrays (URAs) in a low-grazing angle (LGA) condition. By rearranging the auto-correlation and cross-correlation matrices in turn among different subarrays, the SDMS method can estimate the two parameters independently with one-dimensional (1D) subspace-based estimation techniques, where we only perform difference for auto-correlation matrices and the cross-correlation matrices are kept completely. Then, the pair-matching of two parameters is achieved by extracting the diagonal elements of URA. Thus, the proposed method can decrease the computational complexity, suppress the effect of additive noise and also have little information loss. Simulation results show that, in LGA, compared to other methods, the proposed methods can achieve performance improvement in the white or colored noise conditions. PMID:28245634
Parallelization of the FLAPW method

NASA Astrophysics Data System (ADS)

Canning, A.; Mannstadt, W.; Freeman, A. J.

2000-08-01

The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining structural, electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method has been limited to systems of less than about a hundred atoms due to the lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work, we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel supercomputer.
Frozen Gaussian approximation based domain decomposition methods for the linear Schrödinger equation beyond the semi-classical regime

NASA Astrophysics Data System (ADS)

Lorin, E.; Yang, X.; Antoine, X.

2016-06-01

The paper is devoted to developing efficient domain decomposition methods for the linear Schrödinger equation beyond the semiclassical regime, in which the rescaled Planck constant is not small enough for asymptotic methods (e.g. geometric optics) to produce good accuracy, yet direct methods (e.g. finite difference) are too computationally expensive. This belongs to the category of computing middle-frequency wave propagation, where neither asymptotic nor direct methods can be directly used with both efficiency and accuracy. Motivated by recent works of the authors on absorbing boundary conditions (Antoine et al. (2014) [13] and Yang and Zhang (2014) [43]), we introduce Semiclassical Schwarz Waveform Relaxation methods (SSWR), which are seamless integrations of semiclassical approximation to Schwarz Waveform Relaxation methods. Two versions are proposed, based respectively on Herman-Kluk propagation and geometric optics, and we prove the convergence and provide numerical evidence of efficiency and accuracy of these methods.
An efficient nonlinear finite-difference approach in the computational modeling of the dynamics of a nonlinear diffusion-reaction equation in microbial ecology.

PubMed

Macías-Díaz, J E; Macías, Siegfried; Medina-Ramírez, I E

2013-12-01

In this manuscript, we present a computational model to approximate the solutions of a partial differential equation which describes the growth dynamics of microbial films. The numerical technique reported in this work is an explicit, nonlinear finite-difference methodology which is computationally implemented using Newton's method. Our scheme is compared numerically against an implicit, linear finite-difference discretization of the same partial differential equation, whose computer coding requires an implementation of the stabilized bi-conjugate gradient method. Our numerical results evince that the nonlinear approach results in a more efficient approximation to the solutions of the biofilm model considered, and demands less computer memory. Moreover, the positivity of initial profiles is preserved in practice by the nonlinear scheme proposed.
On modelling three-dimensional piezoelectric smart structures with boundary spectral element method

NASA Astrophysics Data System (ADS)

Zou, Fangxin; Aliabadi, M. H.

2017-05-01

The computational efficiency of the boundary element method in elastodynamic analysis can be significantly improved by employing high-order spectral elements for boundary discretisation. In this work, for the first time, the so-called boundary spectral element method is utilised to formulate the piezoelectric smart structures that are widely used in structural health monitoring (SHM) applications. The resultant boundary spectral element formulation has been validated by the finite element method (FEM) and physical experiments. The new formulation has demonstrated a lower demand on computational resources and a higher numerical stability than commercial FEM packages. Compared with the conventional boundary element formulation, a significant reduction in computational expenses has been achieved. In summary, the boundary spectral element formulation presented in this paper provides a highly efficient and stable mathematical tool for the development of SHM applications.
Accelerating global optimization of aerodynamic shapes using a new surrogate-assisted parallel genetic algorithm

NASA Astrophysics Data System (ADS)

Ebrahimi, Mehdi; Jahangirian, Alireza

2017-12-01

An efficient strategy is presented for global shape optimization of wing sections with a parallel genetic algorithm. Several computational techniques are applied to increase the convergence rate and the efficiency of the method. A variable fidelity computational evaluation method is applied in which the expensive Navier-Stokes flow solver is complemented by an inexpensive multi-layer perceptron neural network for the objective function evaluations. A population dispersion method that consists of two phases, of exploration and refinement, is developed to improve the convergence rate and the robustness of the genetic algorithm. Owing to the nature of the optimization problem, a parallel framework based on the master/slave approach is used. The outcomes indicate that the method is able to find the global optimum with significantly lower computational time in comparison to the conventional genetic algorithm.
Efficient l1-norm-based low-rank matrix approximations for large-scale problems using alternating rectified gradient method.

PubMed

Kim, Eunwoo; Lee, Minsik; Choi, Chong-Ho; Kwak, Nojun; Oh, Songhwai

2015-02-01

Low-rank matrix approximation plays an important role in the area of computer vision and image processing. Most of the conventional low-rank matrix approximation methods are based on the l2-norm (Frobenius norm) with principal component analysis (PCA) being the most popular among them. However, this can give a poor approximation for data contaminated by outliers (including missing data), because the l2-norm exaggerates the negative effect of outliers. Recently, to overcome this problem, various methods based on the l1-norm, such as robust PCA methods, have been proposed for low-rank matrix approximation. Despite the robustness of the methods, they require heavy computational effort and substantial memory for high-dimensional data, which is impractical for real-world problems. In this paper, we propose two efficient low-rank factorization methods based on the l1-norm that find proper projection and coefficient matrices using the alternating rectified gradient method. The proposed methods are applied to a number of low-rank matrix approximation problems to demonstrate their efficiency and robustness. The experimental results show that our proposals are efficient in both execution time and reconstruction performance unlike other state-of-the-art methods.
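The abstract's point that the l2-norm exaggerates the effect of outliers can be seen in the simplest possible fitting problem: approximating a list of numbers by a single constant. The Python sketch below uses the well-known facts that the mean minimizes the sum of squared residuals and the median minimizes the sum of absolute residuals. It is a scalar illustration only, not the alternating rectified gradient method proposed in the paper, and the sample values are made up.

```python
import statistics


def l2_cost(values, c):
    """Sum of squared residuals when every value is approximated by c."""
    return sum((v - c) ** 2 for v in values)


def l1_cost(values, c):
    """Sum of absolute residuals when every value is approximated by c."""
    return sum(abs(v - c) for v in values)


samples = [1.0, 1.1, 0.9, 1.0, 50.0]  # hypothetical data; the last value is an outlier

l2_fit = statistics.mean(samples)    # minimizer of l2_cost
l1_fit = statistics.median(samples)  # minimizer of l1_cost
```

With these samples the l2 fit is dragged far from the cluster near 1 by the single outlier, while the l1 fit stays at the cluster, which is the behavior that motivates l1-norm-based factorizations.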

Improving the Efficiency and Effectiveness of Grading through the Use of Computer-Assisted Grading Rubrics

ERIC Educational Resources Information Center

Anglin, Linda; Anglin, Kenneth; Schumann, Paul L.; Kaliski, John A.

2008-01-01

This study tests the use of computer-assisted grading rubrics compared to other grading methods with respect to the efficiency and effectiveness of different grading processes for subjective assignments. The test was performed on a large Introduction to Business course. The students in this course were randomly assigned to four treatment groups…
Hadoop-MCC: Efficient Multiple Compound Comparison Algorithm Using Hadoop.

PubMed

Hua, Guan-Jie; Hung, Che-Lun; Tang, Chuan Yi

2018-01-01

In the past decade, drug design technologies have improved enormously. Computer-aided drug design (CADD) has played an important role in analysis and prediction in drug development, making the procedure more economical and efficient. However, computation with big data, such as ZINC, which contains more than 60 million compounds, and GDB-13, with more than 930 million small molecules, is notably time-consuming. Therefore, we propose a novel heterogeneous high-performance computing method, named Hadoop-MCC, that integrates Hadoop and GPU to cope with big chemical structure data efficiently. Hadoop-MCC gains high availability and fault tolerance from Hadoop, as Hadoop is used to scatter input data to GPU devices and gather the results from GPU devices. The Hadoop framework adopts the mapper/reducer computation model. In the proposed method, mappers are responsible for fetching SMILES data segments and performing the LINGO method on GPU; reducers then collect all comparison results produced by the mappers. Due to the high availability of Hadoop, all LINGO computational jobs on mappers can be completed even if some of the mappers encounter problems. A LINGO comparison is performed on each GPU device in parallel. According to the experimental results, the proposed method on multiple GPU devices can achieve better computational performance than CUDA-MCC on a single GPU device. Hadoop-MCC is able to achieve the scalability, high availability, and fault tolerance granted by Hadoop, as well as high performance, by integrating the computational power of both Hadoop and GPU. It has been shown that a heterogeneous architecture such as Hadoop-MCC can effectively achieve better computational performance than a single GPU device. Copyright © Bentham Science Publishers.
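The mapper/reducer split described in this abstract can be sketched in a few lines. The sketch below is not the Hadoop-MCC implementation: it runs on the CPU in plain Python, the names `lingos`, `lingo_similarity`, `mapper`, and `reducer` are hypothetical, and the score is a simplified Tanimoto-style ratio over multisets of length-4 SMILES substrings rather than the exact formula of the published LINGO method.

```python
from collections import Counter


def lingos(smiles, q=4):
    """Count every overlapping length-q substring of a SMILES string."""
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def lingo_similarity(a, b, q=4):
    """Simplified multiset Tanimoto score (an assumption, not the published formula)."""
    ca, cb = lingos(a, q), lingos(b, q)
    shared = sum((ca & cb).values())  # size of the multiset intersection
    total = sum((ca | cb).values())   # size of the multiset union
    return shared / total if total else 0.0


def mapper(query, segment):
    """Compare one query against one segment of the compound data."""
    return [(query, target, lingo_similarity(query, target)) for target in segment]


def reducer(mapped_parts):
    """Collect the results of all mappers and rank them by similarity."""
    results = [row for part in mapped_parts for row in part]
    return sorted(results, key=lambda row: row[2], reverse=True)


# Two hypothetical data segments, each handled by its own mapper.
segments = [["CCCCN", "CCCCO"], ["c1ccccc1", "CCCC"]]
ranked = reducer(mapper("CCCCO", seg) for seg in segments)
```

Because each mapper call depends only on its own segment, a failed segment can be rerun without touching the others, which is the property the abstract attributes to Hadoop.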
Automated Development of Accurate Algorithms and Efficient Codes for Computational Aeroacoustics

NASA Technical Reports Server (NTRS)

Goodrich, John W.; Dyson, Rodger W.

1999-01-01

The simulation of sound generation and propagation in three space dimensions with realistic aircraft components is a very large time dependent computation with fine details. Simulations in open domains with embedded objects require accurate and robust algorithms for propagation, for artificial inflow and outflow boundaries, and for the definition of geometrically complex objects. The development, implementation, and validation of methods for solving these demanding problems are being done to support the NASA pillar goals for reducing aircraft noise levels. Our goal is to provide algorithms which are sufficiently accurate and efficient to produce usable results rapidly enough to allow design engineers to study the effects on sound levels of design changes in propulsion systems, and in the integration of propulsion systems with airframes. There is a lack of design tools for these purposes at this time. Our technical approach to this problem combines the development of new algorithms with the use of Mathematica and Unix utilities to automate the algorithm development, code implementation, and validation. We use explicit methods to ensure effective implementation by domain decomposition for SPMD parallel computing. There are several orders of magnitude difference in the computational efficiencies of the algorithms which we have considered. We currently have new artificial inflow and outflow boundary conditions that are stable, accurate, and unobtrusive, with implementations that match the accuracy and efficiency of the propagation methods. The artificial numerical boundary treatments have been proven to have solutions which converge to the full open domain problems, so that the error from the boundary treatments can be driven as low as is required.
The purpose of this paper is to briefly present a method for developing highly accurate algorithms for computational aeroacoustics, the use of computer automation in this process, and a brief survey of the algorithms that have resulted from this work. A review of computational aeroacoustics has recently been given by Lele.
Unified commutation-pruning technique for efficient computation of composite DFTs

NASA Astrophysics Data System (ADS)

Castro-Palazuelos, David E.; Medina-Melendrez, Modesto Gpe.; Torres-Roman, Deni L.; Shkvarko, Yuriy V.

2015-12-01

An efficient computation of a composite length discrete Fourier transform (DFT), as well as a fast Fourier transform (FFT) of both time and space data sequences in uncertain (non-sparse or sparse) computational scenarios, requires specific processing algorithms. Traditional algorithms typically employ some pruning methods without any commutations, which prevents them from attaining the potential computational efficiency. In this paper, we propose an alternative unified approach with automatic commutations between three computational modalities aimed at efficient computations of the pruned DFTs adapted for variable composite lengths of the non-sparse input-output data. The first modality is an implementation of the direct computation of a composite length DFT, the second one employs the second-order recursive filtering method, and the third one performs the new pruned decomposed transform. The pruned decomposed transform algorithm performs the decimation in time or space (DIT) data acquisition domain and, then, decimation in frequency (DIF). The unified combination of these three algorithms is addressed as the DFTCOMM technique. Based on the treatment of the combinational-type hypotheses testing optimization problem of preferable allocations between all feasible commuting-pruning modalities, we have found the global optimal solution to the pruning problem that always requires fewer or, at most, the same number of arithmetic operations as other feasible modalities. The DFTCOMM method outperforms the existing competing pruning techniques in the sense of attainable savings in the number of required arithmetic operations. It requires fewer or at most the same number of arithmetic operations for its execution than any other of the competing pruning methods reported in the literature. Finally, we provide the comparison of the DFTCOMM with the recently developed sparse fast Fourier transform (SFFT) algorithmic family.
We show that, in the sensing scenarios with sparse/non-sparse data Fourier spectrum, the DFTCOMM technique manifests robustness against such model uncertainties in the sense of insensitivity to sparsity/non-sparsity restrictions and the variability of the operating parameters.
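The abstract does not say how its second modality is implemented. A widely used second-order recursive filter for computing a single DFT bin is the Goertzel algorithm, so the sketch below assumes something in that spirit and sets it beside the direct computation (the first modality). The function names `direct_bin` and `goertzel_bin` are illustrative, and the pruned decomposed transform and the commutation logic of DFTCOMM are not reproduced.

```python
import cmath
import math


def direct_bin(x, k):
    """DFT bin k computed directly from the definition."""
    n = len(x)
    return sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))


def goertzel_bin(x, k):
    """DFT bin k computed with a second-order recursion (Goertzel algorithm)."""
    n = len(x)
    w = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(w)
    s1 = 0.0  # s[n-1]
    s2 = 0.0  # s[n-2]
    for sample in x:
        s0 = sample + coeff * s1 - s2  # one real multiply per input sample
        s2, s1 = s1, s0
    # Closing step: X[k] = exp(jw) * s[N-1] - s[N-2]
    return cmath.exp(1j * w) * s1 - s2


x = [1.0, 2.0, 3.0, 4.0]
bins = [goertzel_bin(x, k) for k in range(len(x))]
```

When only a few output bins are needed, the recursion costs roughly one real multiplication per sample per bin, which is why a pruning scheme might prefer it over a full transform in that regime.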
A semi-Lagrangian transport method for kinetic problems with application to dense-to-dilute polydisperse reacting spray flows

NASA Astrophysics Data System (ADS)

Doisneau, François; Arienti, Marco; Oefelein, Joseph C.

2017-01-01

For sprays, as described by a kinetic disperse phase model strongly coupled to the Navier-Stokes equations, the resolution strategy is constrained by accuracy objectives, robustness needs, and the computing architecture. In order to leverage the good properties of the Eulerian formalism, we introduce a deterministic particle-based numerical method to solve transport in physical space, which is simple to adapt to the many types of closures and moment systems. The method is inspired by the semi-Lagrangian schemes, developed for Gas Dynamics. We show how semi-Lagrangian formulations are relevant for a disperse phase far from equilibrium and where the particle-particle coupling barely influences the transport; i.e., when particle pressure is negligible. The particle behavior is indeed close to free streaming. The new method uses the assumption of parcel transport and avoids computing fluxes and their limiters, which makes it robust. It is a deterministic resolution method so that it does not require efforts on statistical convergence, noise control, or post-processing. All couplings are done among data under the form of Eulerian fields, which allows one to use efficient algorithms and to anticipate the computational load. This makes the method both accurate and efficient in the context of parallel computing. After a complete verification of the new transport method on various academic test cases, we demonstrate the overall strategy's ability to solve a strongly-coupled liquid jet with fine spatial resolution and we apply it to the case of high-fidelity Large Eddy Simulation of a dense spray flow. A fuel spray is simulated after atomization at Diesel engine combustion chamber conditions. The large, parallel, strongly coupled computation proves the efficiency of the method for dense, polydisperse, reacting spray flows.
A semi-Lagrangian transport method for kinetic problems with application to dense-to-dilute polydisperse reacting spray flows

DOE Office of Scientific and Technical Information (OSTI.GOV)

Doisneau, François, E-mail: fdoisne@sandia.gov; Arienti, Marco, E-mail: marient@sandia.gov; Oefelein, Joseph C., E-mail: oefelei@sandia.gov

For sprays, as described by a kinetic disperse phase model strongly coupled to the Navier–Stokes equations, the resolution strategy is constrained by accuracy objectives, robustness needs, and the computing architecture. In order to leverage the good properties of the Eulerian formalism, we introduce a deterministic particle-based numerical method to solve transport in physical space, which is simple to adapt to the many types of closures and moment systems. The method is inspired by the semi-Lagrangian schemes, developed for Gas Dynamics. We show how semi-Lagrangian formulations are relevant for a disperse phase far from equilibrium and where the particle–particle coupling barely influences the transport; i.e., when particle pressure is negligible. The particle behavior is indeed close to free streaming. The new method uses the assumption of parcel transport and avoids computing fluxes and their limiters, which makes it robust. It is a deterministic resolution method so that it does not require efforts on statistical convergence, noise control, or post-processing. All couplings are done among data under the form of Eulerian fields, which allows one to use efficient algorithms and to anticipate the computational load. This makes the method both accurate and efficient in the context of parallel computing. After a complete verification of the new transport method on various academic test cases, we demonstrate the overall strategy's ability to solve a strongly-coupled liquid jet with fine spatial resolution and we apply it to the case of high-fidelity Large Eddy Simulation of a dense spray flow. A fuel spray is simulated after atomization at Diesel engine combustion chamber conditions. The large, parallel, strongly coupled computation proves the efficiency of the method for dense, polydisperse, reacting spray flows.
Efficient biprediction decision scheme for fast high efficiency video coding encoding

NASA Astrophysics Data System (ADS)

Park, Sang-hyo; Lee, Seung-ho; Jang, Euee S.; Jun, Dongsan; Kang, Jung-Won

2016-11-01

An efficient biprediction decision scheme of high efficiency video coding (HEVC) is proposed for fast-encoding applications. For low-delay video applications, bidirectional prediction can be used to increase compression performance efficiently with previous reference frames. However, at the same time, the computational complexity of the HEVC encoder is significantly increased due to the additional biprediction search. Although some research has attempted to reduce this complexity, whether the prediction is strongly related to both motion complexity and prediction modes in a coding unit has not yet been investigated. A method that avoids most compression-inefficient search points is proposed so that the computational complexity of the motion estimation process can be dramatically decreased. To determine if biprediction is critical, the proposed method exploits the stochastic correlation of the context of prediction units (PUs): the direction of a PU and the accuracy of a motion vector. Through experimental results, the proposed method showed that the time complexity of biprediction can be reduced to 30% on average, outperforming existing methods in view of encoding time, number of function calls, and memory access.
VOFTools - A software package of calculation tools for volume of fluid methods using general convex grids

NASA Astrophysics Data System (ADS)

López, J.; Hernández, J.; Gómez, P.; Faura, F.

2018-02-01

The VOFTools library includes efficient analytical and geometrical routines for (1) area/volume computation, (2) truncation operations that typically arise in VOF (volume of fluid) methods, (3) area/volume conservation enforcement (VCE) in PLIC (piecewise linear interface calculation) reconstruction and (4) computation of the distance from a given point to the reconstructed interface. The computation of a polyhedron volume uses an efficient formula based on a quadrilateral decomposition and a 2D projection of each polyhedron face. The analytical VCE method is based on coupling an interpolation procedure to bracket the solution with an improved final calculation step based on the above volume computation formula. Although the library was originally created to help develop highly accurate advection and reconstruction schemes in the context of VOF methods, it may have more general applications. To assess the performance of the supplied routines, different tests, which are provided in FORTRAN and C, were implemented for several 2D and 3D geometries.
Three-Dimensional Navier-Stokes Method with Two-Equation Turbulence Models for Efficient Numerical Simulation of Hypersonic Flows

NASA Technical Reports Server (NTRS)

Bardina, J. E.

1994-01-01

A new computationally efficient 3-D compressible Reynolds-averaged implicit Navier-Stokes method with advanced two-equation turbulence models for high speed flows is presented. All convective terms are modeled using an entropy satisfying higher-order Total Variation Diminishing (TVD) scheme based on implicit upwind flux-difference split approximations and arithmetic averaging procedure of primitive variables. This method combines the best features of data management and computational efficiency of space marching procedures with the generality and stability of time dependent Navier-Stokes procedures to solve flows with mixed supersonic and subsonic zones, including streamwise separated flows. Its robust stability derives from a combination of conservative implicit upwind flux-difference splitting with Roe's property U to provide accurate shock capturing capability that non-conservative schemes do not guarantee, alternating symmetric Gauss-Seidel 'method of planes' relaxation procedure coupled with a three-dimensional two-factor diagonal-dominant approximate factorization scheme, TVD flux limiters of higher-order flux differences satisfying realizability, and well-posed characteristic-based implicit boundary-point approximations consistent with the local characteristics domain of dependence. The efficiency of the method is highly increased with Newton Raphson acceleration which allows convergence in essentially one forward sweep for supersonic flows. The method is verified by comparing with experiment and other Navier-Stokes methods. Here, results of adiabatic and cooled flat plate flows, compression corner flow, and 3-D hypersonic shock-wave/turbulent boundary layer interaction flows are presented. The robust 3-D method achieves a better computational efficiency of at least one order of magnitude over the CNS Navier-Stokes code.
It provides cost-effective aerodynamic predictions in agreement with experiment, and the capability of predicting complex flow structures in complex geometries with good accuracy.
Feature extraction using first and second derivative extrema (FSDE) for real-time and hardware-efficient spike sorting.

PubMed

Paraskevopoulou, Sivylla E; Barsakcioglu, Deren Y; Saberi, Mohammed R; Eftekhar, Amir; Constandinou, Timothy G

2013-04-30

Next generation neural interfaces aspire to achieve real-time multi-channel systems by integrating spike sorting on chip to overcome limitations in communication channel capacity. The feasibility of this approach relies on developing highly efficient algorithms for feature extraction and clustering with the potential of low-power hardware implementation. We are proposing a feature extraction method, not requiring any calibration, based on first and second derivative features of the spike waveform. The accuracy and computational complexity of the proposed method are quantified and compared against commonly used feature extraction methods, through simulation across four datasets (with different single units) at multiple noise levels (ranging from 5 to 20% of the signal amplitude). The average classification error is shown to be below 7% with a computational complexity of 2N-3, where N is the number of sample points of each spike. Overall, this method presents a good trade-off between accuracy and computational complexity and is thus particularly well-suited for hardware-efficient implementation. Copyright © 2013 Elsevier B.V. All rights reserved.
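A feature extractor built on first and second derivatives of a spike can be sketched with two passes of adjacent differences. This is an illustration only: the function name `fsde_features` is hypothetical, and the choice of which extrema to keep (maximum and minimum of each derivative) is an assumption, since the abstract does not list the exact features. The two passes use N-1 and N-2 subtractions, which sum to 2N-3 and so are consistent with the complexity quoted above, although the abstract does not state how that figure is derived.

```python
def fsde_features(spike):
    """Return (FD max, FD min, SD max, SD min) for a spike of at least 3 samples.

    FD and SD are discrete first and second derivatives, taken here as
    differences between adjacent samples (an assumed discretisation).
    """
    # First derivative: N-1 subtractions.
    fd = [spike[i + 1] - spike[i] for i in range(len(spike) - 1)]
    # Second derivative: N-2 subtractions on the first-derivative sequence.
    sd = [fd[i + 1] - fd[i] for i in range(len(fd) - 1)]
    return max(fd), min(fd), max(sd), min(sd)


# Hypothetical 5-sample waveform; real spikes have many more samples.
features = fsde_features([0, 1, 4, 2, 0])
```

The extractor needs only subtractions and comparisons and no training or calibration data, which is the kind of arithmetic that suits a low-power on-chip implementation.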
Efficient convolutional sparse coding

DOEpatents

Wohlberg, Brendt

2017-06-20

Computationally efficient algorithms may be applied for fast dictionary learning solving the convolutional sparse coding problem in the Fourier domain. More specifically, efficient convolutional sparse coding may be derived within an alternating direction method of multipliers (ADMM) framework that utilizes fast Fourier transforms (FFT) to solve the main linear system in the frequency domain. Such algorithms may enable a significant reduction in computational cost over conventional approaches by implementing a linear solver for the most critical and computationally expensive component of the conventional iterative algorithm. The theoretical computational cost of the algorithm may be reduced from O(M^3 N) to O(MN log N), where N is the dimensionality of the data and M is the number of elements in the dictionary. This significant improvement in efficiency may greatly increase the range of problems that can practically be addressed via convolutional sparse representations.
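One way such a cost reduction can arise is sketched below, under stated assumptions. After an FFT, the main ADMM linear system decouples into one small M-by-M system per frequency of the form (a a^H + rho I) x = b, where a is the conjugate of the vector d of dictionary filter values at that frequency. A general solver costs O(M^3) per frequency, whereas the Sherman-Morrison identity solves this rank-one-plus-identity system in O(M). The abstract does not name the solver, so the use of Sherman-Morrison is an assumption, and the names `solve_rank_one` and `residual` are hypothetical.

```python
def solve_rank_one(d, b, rho):
    """Solve (a a^H + rho I) x = b in O(M), where a = conj(d).

    Sherman-Morrison gives x = (b - a * (a^H b) / (rho + a^H a)) / rho.
    """
    a = [z.conjugate() for z in d]
    a_h_b = sum(di * bi for di, bi in zip(d, b))  # a^H b, because conj(a) = d
    a_h_a = sum(abs(di) ** 2 for di in d)         # a^H a, a real number
    scale = a_h_b / (rho + a_h_a)
    return [(bi - ai * scale) / rho for ai, bi in zip(a, b)]


def residual(d, b, rho, x):
    """Largest entry of |(a a^H + rho I) x - b|, used to check the solve."""
    a = [z.conjugate() for z in d]
    a_h_x = sum(di * xi for di, xi in zip(d, x))
    return max(abs(ai * a_h_x + rho * xi - bi) for ai, xi, bi in zip(a, x, b))


# One frequency with M = 3 hypothetical filter values and right-hand side.
d = [1 + 2j, 0.5 - 1j, -3j]
b = [1 + 0j, 2j, -1 + 1j]
x = solve_rank_one(d, b, rho=0.7)
```

Repeating this solve at each of the N frequencies costs O(MN), so under these assumptions the FFTs themselves, at O(MN log N), dominate the total.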
Efficient analytical implementation of the DOT Riemann solver for the de Saint Venant-Exner morphodynamic model

NASA Astrophysics Data System (ADS)

Carraro, F.; Valiani, A.; Caleffi, V.

2018-03-01

Within the framework of the de Saint Venant equations coupled with the Exner equation for morphodynamic evolution, this work presents a new efficient implementation of the Dumbser-Osher-Toro (DOT) scheme for non-conservative problems. The DOT path-conservative scheme is a robust upwind method based on a complete Riemann solver, but it has the drawback of requiring expensive numerical computations. Indeed, to compute the non-linear time evolution in each time step, the DOT scheme requires numerical computation of the flux matrix eigenstructure (the totality of eigenvalues and eigenvectors) several times at each cell edge. In this work, an analytical and compact formulation of the eigenstructure for the de Saint Venant-Exner (dSVE) model is introduced and tested in terms of numerical efficiency and stability. Using the original DOT and PRICE-C (a very efficient FORCE-type method) as reference methods, we present a convergence analysis (error against CPU time) to study the performance of the DOT method with our new analytical implementation of eigenstructure calculations (A-DOT). In particular, the numerical performance of the three methods is tested in three test cases: a movable bed Riemann problem with analytical solution; a problem with smooth analytical solution; a test in which the water flow is characterised by subcritical and supercritical regions. For a given target error, the A-DOT method is always the most efficient choice. Finally, two experimental data sets and different transport formulae are considered to test the A-DOT model in more practical case studies.
Solving groundwater flow problems by conjugate-gradient methods and the strongly implicit procedure

USGS Publications Warehouse

Hill, Mary C.

1990-01-01

The performance of the preconditioned conjugate-gradient method with three preconditioners is compared with the strongly implicit procedure (SIP) using a scalar computer. The preconditioners considered are the incomplete Cholesky (ICCG) and the modified incomplete Cholesky (MICCG), which require the same computer storage as SIP as programmed for a problem with a symmetric matrix, and a polynomial preconditioner (POLCG), which requires less computer storage than SIP. Although POLCG is usually used on vector computers, it is included here because of its small storage requirements. In this paper, published comparisons of the solvers are evaluated, all four solvers are compared for the first time, and new test cases are presented to provide a more complete basis by which the solvers can be judged for typical groundwater flow problems. Based on nine test cases, the following conclusions are reached: (1) SIP is actually as efficient as ICCG for some of the published, linear, two-dimensional test cases that were reportedly solved much more efficiently by ICCG; (2) SIP is more efficient than other published comparisons would indicate when common convergence criteria are used; and (3) for problems that are three-dimensional, nonlinear, or both, and for which common convergence criteria are used, SIP is often more efficient than ICCG, and is sometimes more efficient than MICCG.
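For readers unfamiliar with the solver family being compared, the following is a minimal sketch of the preconditioned conjugate-gradient iteration. It deliberately swaps in a Jacobi (diagonal) preconditioner for brevity; the report itself evaluates incomplete Cholesky, modified incomplete Cholesky, and polynomial preconditioners, none of which are reproduced here. The small tridiagonal matrix is a hypothetical stand-in for a finite-difference flow matrix, and the name `pcg` is illustrative.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def pcg(A, b, tol=1e-10, max_iter=200):
    """Jacobi-preconditioned conjugate gradients for a symmetric positive definite A.

    Returns (solution, iterations used).
    """
    n = len(b)
    m_inv = [1.0 / A[i][i] for i in range(n)]  # inverse of the diagonal of A
    x = [0.0] * n
    r = list(b)                                # residual b - A x for x = 0
    z = [m_inv[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            return x, iteration
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [m_inv[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
solution, iterations = pcg(A, b)
```

A stronger preconditioner changes only how `z` is computed from `r`; the trade-off the report examines is between the cost and storage of that step and the number of iterations it saves.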
Optimized blind gamma-ray pulsar searches at fixed computing budget

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pletsch, Holger J.; Clark, Colin J., E-mail: holger.pletsch@aei.mpg.de

The sensitivity of blind gamma-ray pulsar searches in multiple years' worth of photon data, as from the Fermi LAT, is primarily limited by the finite computational resources available. Addressing this 'needle in a haystack' problem, here we present methods for optimizing blind searches to achieve the highest sensitivity at fixed computing cost. For both coherent and semicoherent methods, we consider their statistical properties and study their search sensitivity under computational constraints. The results validate a multistage strategy, where the first stage scans the entire parameter space using an efficient semicoherent method and promising candidates are then refined through a fully coherent analysis. We also find that for the first stage of a blind search incoherent harmonic summing of powers is not worthwhile at fixed computing cost for typical gamma-ray pulsars. Further enhancing sensitivity, we present efficiency-improved interpolation techniques for the semicoherent search stage. Via realistic simulations we demonstrate that overall these optimizations can significantly lower the minimum detectable pulsed fraction by almost 50% at the same computational expense.
Crowd Sensing-Enabling Security Service Recommendation for Social Fog Computing Systems

PubMed Central

Wu, Jun; Su, Zhou; Li, Jianhua

2017-01-01

Fog computing, shifting intelligence and resources from the remote cloud to edge networks, has the potential of providing low latency for the communication from sensing data sources to users. For the objects from the Internet of Things (IoT) to the cloud, it is a new trend that the objects establish social-like relationships with each other, which efficiently brings the benefits of developed sociality to a complex environment. As fog services become more sophisticated, it will become more convenient for fog users to share their own services, resources, and data via social networks. Meanwhile, the efficient social organization can enable more flexible, secure, and collaborative networking. The aforementioned advantages make the social network a potential architecture for fog computing systems. In this paper, we design an architecture for social fog computing, in which the services of fog are provisioned based on “friend” relationships. To the best of our knowledge, this is the first attempt at an organized fog computing system-based social model. Meanwhile, social networking enhances the complexity and security risks of fog computing services, creating difficulties of security service recommendations in social fog computing. To address this, we propose a novel crowd sensing-enabling security service provisioning method to recommend security services accurately in social fog computing systems. Simulation results show the feasibility and efficiency of the crowd sensing-enabling security service recommendation method for social fog computing systems. PMID:28758943
Crowd Sensing-Enabling Security Service Recommendation for Social Fog Computing Systems.

PubMed

Wu, Jun; Su, Zhou; Wang, Shen; Li, Jianhua

2017-07-30

Fog computing, shifting intelligence and resources from the remote cloud to edge networks, has the potential of providing low latency for the communication from sensing data sources to users. For the objects from the Internet of Things (IoT) to the cloud, it is a new trend that the objects establish social-like relationships with each other, which efficiently brings the benefits of developed sociality to a complex environment. As fog services become more sophisticated, it will become more convenient for fog users to share their own services, resources, and data via social networks. Meanwhile, the efficient social organization can enable more flexible, secure, and collaborative networking. The aforementioned advantages make the social network a potential architecture for fog computing systems. In this paper, we design an architecture for social fog computing, in which the services of fog are provisioned based on "friend" relationships. To the best of our knowledge, this is the first attempt at an organized fog computing system-based social model. Meanwhile, social networking enhances the complexity and security risks of fog computing services, creating difficulties of security service recommendations in social fog computing. To address this, we propose a novel crowd sensing-enabling security service provisioning method to recommend security services accurately in social fog computing systems. Simulation results show the feasibility and efficiency of the crowd sensing-enabling security service recommendation method for social fog computing systems.
A hybrid parallel architecture for electrostatic interactions in the simulation of dissipative particle dynamics

NASA Astrophysics Data System (ADS)

Yang, Sheng-Chun; Lu, Zhong-Yuan; Qian, Hu-Jun; Wang, Yong-Lei; Han, Jie-Ping

2017-11-01

In this work, we upgraded the electrostatic interaction method of CU-ENUF (Yang, et al., 2016), which first applied CUNFFT (nonequispaced Fourier transforms based on CUDA) to the reciprocal-space electrostatic computation and performed the computation of electrostatic interactions entirely on the GPU. The upgraded edition of CU-ENUF runs concurrently in a hybrid parallel way that parallelizes the computation first across multiple computer nodes and then further on the GPU installed in each computer. With this parallel strategy, the size of the simulation system is never restricted to the throughput of a single CPU or GPU. The most critical technical problem is how to parallelize a CUNFFT in the parallel strategy, which is conquered effectively by deep-seated research of basic principles and some algorithm skills. Furthermore, the upgraded method is capable of computing electrostatic interactions for both the atomistic molecular dynamics (MD) and the dissipative particle dynamics (DPD). Finally, the benchmarks conducted for validation and performance indicate that the upgraded method not only gives good precision when suitable parameters are set, but also provides an efficient way to compute electrostatic interactions for huge simulation systems.
Program Files doi: http://dx.doi.org/10.17632/zncf24fhpv.1
Licensing provisions: GNU General Public License 3 (GPL)
Programming language: C, C++, and CUDA C
Supplementary material: The program is designed for effective electrostatic interactions of large-scale simulation systems, and runs on computers equipped with NVIDIA GPUs. It has been tested on (a) a single computer node with Intel(R) Core(TM) i7-3770 @ 3.40 GHz (CPU) and GTX 980 Ti (GPU), and (b) MPI parallel computer nodes with the same configurations.
Nature of problem: For molecular dynamics simulation, the electrostatic interaction is the most time-consuming computation because of its long-range feature and slow convergence in simulation space, and it takes up most of the total simulation time. Although the GPU-based parallel method CU-ENUF (Yang et al., 2016) has achieved a qualitative leap compared with previous methods in electrostatic interactions computation, its computation capability is limited to the throughput capacity of a single GPU for a super-scale simulation system. Therefore, we should look for an effective method to handle the calculation of electrostatic interactions efficiently for a simulation system with super-scale size.
Solution method: We constructed a hybrid parallel architecture, in which CPU and GPU are combined to accelerate the electrostatic computation effectively. Firstly, the simulation system is divided into many subtasks via the domain-decomposition method. Then MPI (Message Passing Interface) is used to implement the CPU-parallel computation with each computer node corresponding to a particular subtask, and furthermore each subtask in one computer node is executed in parallel on the GPU. In this hybrid parallel method, the most critical technical problem is how to parallelize a CUNFFT (nonequispaced fast Fourier transform based on CUDA) in the parallel strategy, which is conquered effectively by deep-seated research of basic principles and some algorithm skills.
Restrictions: The HP-ENUF is mainly oriented to super-scale system simulations, in which the performance superiority is shown adequately. However, for a small simulation system containing fewer than 10^6 particles, the mode of multiple computer nodes has no apparent efficiency advantage over the mode of a single computer node, or is even less efficient, due to the serious network delay among computer nodes.
References: (1) S.-C. Yang, H.-J. Qian, Z.-Y. Lu, Appl. Comput. Harmon. Anal. 2016, http://dx.doi.org/10.1016/j.acha.2016.04.009. (2) S.-C. Yang, Y.-L. Wang, G.-S. Jiao, H.-J. Qian, Z.-Y. Lu, J. Comput. Chem. 37 (2016) 378. (3) S.-C. Yang, Y.-L. Zhu, H.-J. Qian, Z.-Y. Lu, Appl. Chem. Res. Chin. Univ., 2017, http://dx.doi.org/10.1007/s40242-016-6354-5. (4) Y.-L. Zhu, H. Liu, Z.-W. Li, H.-J. Qian, G. Milano, Z.-Y. Lu, J. Comput. Chem. 34 (2013) 2197.
Efficient Transition Probability Computation for Continuous-Time Branching Processes via Compressed Sensing.

PubMed

Xu, Jason; Minin, Vladimir N

2015-07-01

Branching processes are a class of continuous-time Markov chains (CTMCs) with ubiquitous applications. A general difficulty in statistical inference under partially observed CTMC models arises in computing transition probabilities when the discrete state space is large or uncountable. Classical methods such as matrix exponentiation are infeasible for large or countably infinite state spaces, and sampling-based alternatives are computationally intensive, requiring integration over all possible hidden events. Recent work has successfully applied generating function techniques to computing transition probabilities for linear multi-type branching processes. While these techniques often require significantly fewer computations than matrix exponentiation, they also become prohibitive in applications with large populations. We propose a compressed sensing framework that significantly accelerates the generating function method, decreasing computational cost up to a logarithmic factor by only assuming the probability mass of transitions is sparse. We demonstrate accurate and efficient transition probability computations in branching process models for blood cell formation and evolution of self-replicating transposable elements in bacterial genomes.
A Secure and Verifiable Outsourced Access Control Scheme in Fog-Cloud Computing

PubMed Central

Fan, Kai; Wang, Junxiong; Wang, Xin; Li, Hui; Yang, Yintang

2017-01-01

With the rapid development of big data and Internet of things (IOT), the number of networking devices and data volume are increasing dramatically. Fog computing, which extends cloud computing to the edge of the network can effectively solve the bottleneck problems of data transmission and data storage. However, security and privacy challenges are also arising in the fog-cloud computing environment. Ciphertext-policy attribute-based encryption (CP-ABE) can be adopted to realize data access control in fog-cloud computing systems. In this paper, we propose a verifiable outsourced multi-authority access control scheme, named VO-MAACS. In our construction, most encryption and decryption computations are outsourced to fog devices and the computation results can be verified by using our verification method. Meanwhile, to address the revocation issue, we design an efficient user and attribute revocation method for it. Finally, analysis and simulation results show that our scheme is both secure and highly efficient. PMID:28737733
Efficient Transition Probability Computation for Continuous-Time Branching Processes via Compressed Sensing

PubMed Central

Xu, Jason; Minin, Vladimir N.

2016-01-01

Branching processes are a class of continuous-time Markov chains (CTMCs) with ubiquitous applications. A general difficulty in statistical inference under partially observed CTMC models arises in computing transition probabilities when the discrete state space is large or uncountable. Classical methods such as matrix exponentiation are infeasible for large or countably infinite state spaces, and sampling-based alternatives are computationally intensive, requiring integration over all possible hidden events. Recent work has successfully applied generating function techniques to computing transition probabilities for linear multi-type branching processes. While these techniques often require significantly fewer computations than matrix exponentiation, they also become prohibitive in applications with large populations. We propose a compressed sensing framework that significantly accelerates the generating function method, decreasing computational cost up to a logarithmic factor by only assuming the probability mass of transitions is sparse. We demonstrate accurate and efficient transition probability computations in branching process models for blood cell formation and evolution of self-replicating transposable elements in bacterial genomes. PMID:26949377

Vectorization on the star computer of several numerical methods for a fluid flow problem

NASA Technical Reports Server (NTRS)

Lambiotte, J. J., Jr.; Howser, L. M.

1974-01-01

A reexamination of some numerical methods is considered in light of the new class of computers which use vector streaming to achieve high computation rates. A study has been made of the effect on the relative efficiency of several numerical methods applied to a particular fluid flow problem when they are implemented on a vector computer. The method of Brailovskaya, the alternating direction implicit method, a fully implicit method, and a new method called partial implicitization have been applied to the problem of determining the steady state solution of the two-dimensional flow of a viscous incompressible fluid in a square cavity driven by a sliding wall. Results are obtained for three mesh sizes and a comparison is made of the methods for serial computation.
Efficiency of the neighbor-joining method in reconstructing deep and shallow evolutionary relationships in large phylogenies.

PubMed

Kumar, S; Gadagkar, S R

2000-12-01

The neighbor-joining (NJ) method is widely used in reconstructing large phylogenies because of its computational speed and the high accuracy in phylogenetic inference as revealed in computer simulation studies. However, most computer simulation studies have quantified the overall performance of the NJ method in terms of the percentage of branches inferred correctly or the percentage of replications in which the correct tree is recovered. We have examined other aspects of its performance, such as the relative efficiency in correctly reconstructing shallow (close to the external branches of the tree) and deep branches in large phylogenies; the contribution of zero-length branches to topological errors in the inferred trees; and the influence of increasing the tree size (number of sequences), evolutionary rate, and sequence length on the efficiency of the NJ method. Results show that the correct reconstruction of deep branches is no more difficult than that of shallower branches. The presence of zero-length branches in realized trees contributes significantly to the overall error observed in the NJ tree, especially in large phylogenies or slowly evolving genes. Furthermore, the tree size does not influence the efficiency of NJ in reconstructing shallow and deep branches in our simulation study, in which the evolutionary process is assumed to be homogeneous in all lineages.
A Hierarchical Algorithm for Fast Debye Summation with Applications to Small Angle Scattering

PubMed Central

Gumerov, Nail A.; Berlin, Konstantin; Fushman, David; Duraiswami, Ramani

2012-01-01

Debye summation, which involves the summation of sinc functions of distances between all pairs of atoms in three-dimensional space, arises in computations performed in crystallography, small/wide angle X-ray scattering (SAXS/WAXS) and small angle neutron scattering (SANS). Direct evaluation of Debye summation has quadratic complexity, which results in a computational bottleneck when determining crystal properties, or running structure refinement protocols that involve SAXS or SANS, even for moderately sized molecules. We present a fast approximation algorithm that efficiently computes the summation to any prescribed accuracy ε in linear time. The algorithm is similar to the fast multipole method (FMM), and is based on a hierarchical spatial decomposition of the molecule coupled with local harmonic expansions and translation of these expansions. An even more efficient implementation is possible when the scattering profile is all that is required, as in small angle scattering reconstruction (SAS) of macromolecules. We examine the relationship of the proposed algorithm to existing approximate methods for profile computations, and show that these methods may result in inaccurate profile computations, unless an error bound derived in this paper is used. Our theoretical and computational results show orders of magnitude improvement in computation complexity over existing methods, while maintaining prescribed accuracy. PMID:22707386
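To make the quadratic cost concrete, the following Python sketch evaluates the Debye sum directly, I(q) = sum over i and j of f_i f_j sin(q r_ij)/(q r_ij). It is the baseline that the hierarchical algorithm is designed to replace, not the algorithm itself. The function name `debye_intensity` and the use of constant, q-independent form factors are simplifying assumptions made here.

```python
import math

def debye_intensity(coords, form_factors, q):
    """Direct O(N^2) Debye sum: I(q) = sum_i sum_j f_i f_j sinc(q * r_ij)."""
    n = len(coords)
    # Self terms (i == j) contribute f_i^2 because sinc(0) = 1.
    total = sum(f * f for f in form_factors)
    for i in range(n):
        for j in range(i + 1, n):
            x = q * math.dist(coords[i], coords[j])
            sinc = 1.0 if x == 0.0 else math.sin(x) / x
            # Each unordered pair appears twice in the full double sum.
            total += 2.0 * form_factors[i] * form_factors[j] * sinc
    return total
```

The nested loop visits N(N-1)/2 pairs for every value of q, so a profile with many q points on a large molecule quickly becomes the bottleneck described above.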
Accurate optimization of amino acid form factors for computing small-angle X-ray scattering intensity of atomistic protein structures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tong, Dudu; Yang, Sichun; Lu, Lanyuan

2016-06-20

Structure modelling via small-angle X-ray scattering (SAXS) data generally requires intensive computations of scattering intensity from any given biomolecular structure, where the accurate evaluation of SAXS profiles using coarse-grained (CG) methods is vital to improve computational efficiency. To date, most CG SAXS computing methods have been based on a single-bead-per-residue approximation but have neglected structural correlations between amino acids. To improve the accuracy of scattering calculations, accurate CG form factors of amino acids are now derived using a rigorous optimization strategy, termed electron-density matching (EDM), to best fit electron-density distributions of protein structures. This EDM method is compared with and tested against other CG SAXS computing methods, and the resulting CG SAXS profiles from EDM agree better with all-atom theoretical SAXS data. By including the protein hydration shell represented by explicit CG water molecules and the correction of protein excluded volume, the developed CG form factors also reproduce the selected experimental SAXS profiles with very small deviations. Taken together, these EDM-derived CG form factors present an accurate and efficient computational approach for SAXS computing, especially when higher molecular details (represented by the q range of the SAXS data) become necessary for effective structure modelling.
Change Detection of Mobile LIDAR Data Using Cloud Computing

NASA Astrophysics Data System (ADS)

Liu, Kun; Boehm, Jan; Alis, Christian

2016-06-01

Change detection has long been a challenging problem although a lot of research has been conducted in different fields such as remote sensing and photogrammetry, computer vision, and robotics. In this paper, we combine a voxel grid with Apache Spark to propose an efficient method to address the problem in the context of big data. A voxel grid is a regular geometric representation consisting of voxels of the same size, which suits parallel computation well. Apache Spark is a popular distributed parallel computing platform which provides fault tolerance and in-memory caching. These features can significantly enhance the performance of Apache Spark and result in an efficient and robust implementation. In our experiments, both synthetic and real point cloud data are employed to demonstrate the quality of our method.
Solving large-scale PDE-constrained Bayesian inverse problems with Riemann manifold Hamiltonian Monte Carlo

NASA Astrophysics Data System (ADS)

Bui-Thanh, T.; Girolami, M.

2014-11-01

We consider the Riemann manifold Hamiltonian Monte Carlo (RMHMC) method for solving statistical inverse problems governed by partial differential equations (PDEs). The Bayesian framework is employed to cast the inverse problem into the task of statistical inference whose solution is the posterior distribution in infinite dimensional parameter space conditional upon observation data and Gaussian prior measure. We discretize both the likelihood and the prior using the H1-conforming finite element method together with a matrix transfer technique. The power of the RMHMC method is that it exploits the geometric structure induced by the PDE constraints of the underlying inverse problem. Consequently, each RMHMC posterior sample is almost uncorrelated/independent from the others providing statistically efficient Markov chain simulation. However this statistical efficiency comes at a computational cost. This motivates us to consider computationally more efficient strategies for RMHMC. At the heart of our construction is the fact that for Gaussian error structures the Fisher information matrix coincides with the Gauss-Newton Hessian. We exploit this fact in considering a computationally simplified RMHMC method combining state-of-the-art adjoint techniques and the superiority of the RMHMC method. Specifically, we first form the Gauss-Newton Hessian at the maximum a posteriori point and then use it as a fixed constant metric tensor throughout RMHMC simulation. This eliminates the need for the computationally costly differential geometric Christoffel symbols, which in turn greatly reduces computational effort at a corresponding loss of sampling efficiency. We further reduce the cost of forming the Fisher information matrix by using a low rank approximation via a randomized singular value decomposition technique. This is efficient since a small number of Hessian-vector products are required. 
The Hessian-vector product in turn requires only two extra PDE solves using the adjoint technique. Various numerical results up to 1025 parameters are presented to demonstrate the ability of the RMHMC method in exploring the geometric structure of the problem to propose (almost) uncorrelated/independent samples that are far away from each other, and yet the acceptance rate is almost unity. The results also suggest that for the PDE models considered the proposed fixed metric RMHMC can attain almost as high a quality performance as the original RMHMC, i.e. generating (almost) uncorrelated/independent samples, while being two orders of magnitude less computationally expensive.
An efficient and general numerical method to compute steady uniform vortices

NASA Astrophysics Data System (ADS)

Luzzatto-Fegiz, Paolo; Williamson, Charles H. K.

2011-07-01

Steady uniform vortices are widely used to represent high Reynolds number flows, yet their efficient computation still presents some challenges. Existing Newton iteration methods become inefficient as the vortices develop fine-scale features; in addition, these methods cannot, in general, find solutions with specified Casimir invariants. On the other hand, available relaxation approaches are computationally inexpensive, but can fail to converge to a solution. In this paper, we overcome these limitations by introducing a new discretization, based on an inverse-velocity map, which radically increases the efficiency of Newton iteration methods. In addition, we introduce a procedure to prescribe Casimirs and remove the degeneracies in the steady vorticity equation, thus ensuring convergence for general vortex configurations. We illustrate our methodology by considering several unbounded flows involving one or two vortices. Our method enables the computation, for the first time, of steady vortices that do not exhibit any geometric symmetry. In addition, we discover that, as the limiting vortex state for each flow is approached, each family of solutions traces a clockwise spiral in a bifurcation plot consisting of a velocity-impulse diagram. By the recently introduced "IVI diagram" stability approach [Phys. Rev. Lett. 104 (2010) 044504], each turn of this spiral is associated with a loss of stability for the steady flows. Such spiral structure is suggested to be a universal feature of steady, uniform-vorticity flows.
Deterministic absorbed dose estimation in computed tomography using a discrete ordinates method

DOE Office of Scientific and Technical Information (OSTI.GOV)

Norris, Edward T.; Liu, Xin, E-mail: xinliu@mst.edu; Hsieh, Jiang

Purpose: Organ dose estimation for a patient undergoing computed tomography (CT) scanning is very important. Although Monte Carlo methods are considered gold-standard in patient dose estimation, the computation time required is formidable for routine clinical calculations. Here, the authors investigate a deterministic method for estimating an absorbed dose more efficiently. Methods: Compared with current Monte Carlo methods, a more efficient approach to estimating the absorbed dose is to solve the linear Boltzmann equation numerically. In this study, an axial CT scan was modeled with a software package, Denovo, which solved the linear Boltzmann equation using the discrete ordinates method. The CT scanning configuration included 16 x-ray source positions, beam collimators, flat filters, and bowtie filters. The phantom was the standard 32 cm CT dose index (CTDI) phantom. Four different Denovo simulations were performed with different simulation parameters, including the number of quadrature sets and the order of Legendre polynomial expansions. A Monte Carlo simulation was also performed for benchmarking the Denovo simulations. A quantitative comparison was made of the simulation results obtained by the Denovo and the Monte Carlo methods. Results: The difference in the simulation results of the discrete ordinates method and those of the Monte Carlo methods was found to be small, with a root-mean-square difference of around 2.4%. It was found that the discrete ordinates method, with a higher order of Legendre polynomial expansions, underestimated the absorbed dose near the center of the phantom (i.e., low dose region). Simulations of the quadrature set 8 and the first order of the Legendre polynomial expansions proved to be the most efficient computation method in the authors’ study. The single-thread computation time of the deterministic simulation of the quadrature set 8 and the first order of the Legendre polynomial expansions was 21 min on a personal computer.
Conclusions: The simulation results showed that the deterministic method can be effectively used to estimate the absorbed dose in a CTDI phantom. The accuracy of the discrete ordinates method was close to that of a Monte Carlo simulation, and the primary benefit of the discrete ordinates method lies in its rapid computation speed. It is expected that further optimization of this method in routine clinical CT dose estimation will improve its accuracy and speed.
A transient FETI methodology for large-scale parallel implicit computations in structural mechanics

NASA Technical Reports Server (NTRS)

Farhat, Charbel; Crivelli, Luis; Roux, Francois-Xavier

1992-01-01

Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because explicit schemes are also easier to parallelize than implicit ones. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet -- and perhaps will never -- be offset by the speed of parallel hardware. Therefore, it is essential to develop efficient and robust alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating low-frequency dynamics. Here we present a domain decomposition method for implicit schemes that requires significantly less storage than factorization algorithms, that is several times faster than other popular direct and iterative methods, that can be easily implemented on both shared and local memory parallel processors, and that is both computationally and communication-wise efficient. The proposed transient domain decomposition method is an extension of the method of Finite Element Tearing and Interconnecting (FETI) developed by Farhat and Roux for the solution of static problems. Serial and parallel performance results on the CRAY Y-MP/8 and the iPSC-860/128 systems are reported and analyzed for realistic structural dynamics problems. These results establish the superiority of the FETI method over both the serial/parallel conjugate gradient algorithm with diagonal scaling and the serial/parallel direct method, and contrast the computational power of the iPSC-860/128 parallel processor with that of the CRAY Y-MP/8 system.
Center for Efficient Exascale Discretizations Software Suite

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kolev, Tzanio; Dobrev, Veselin; Tomov, Vladimir

The CEED Software suite is a collection of generally applicable software tools focusing on the following computational motifs: PDE discretizations on unstructured meshes, high-order finite element and spectral element methods, and unstructured adaptive mesh refinement. All of this software is being developed as part of CEED, a co-design Center for Efficient Exascale Discretizations, within DOE's Exascale Computing Project (ECP) program.
Aerodynamic design optimization using sensitivity analysis and computational fluid dynamics

NASA Technical Reports Server (NTRS)

Baysal, Oktay; Eleshaky, Mohamed E.

1991-01-01

A new and efficient method is presented for aerodynamic design optimization, which is based on a computational fluid dynamics (CFD)-sensitivity analysis algorithm. The method is applied to design a scramjet-afterbody configuration for an optimized axial thrust. The Euler equations are solved for the inviscid analysis of the flow, which in turn provides the objective function and the constraints. The CFD analysis is then coupled with the optimization procedure that uses a constrained minimization method. The sensitivity coefficients, i.e. gradients of the objective function and the constraints, needed for the optimization are obtained using a quasi-analytical method rather than the traditional brute force method of finite difference approximations. During the one-dimensional search of the optimization procedure, an approximate flow analysis (predicted flow) based on a first-order Taylor series expansion is used to reduce the computational cost. Finally, the sensitivity of the optimum objective function to various design parameters, which are kept constant during the optimization, is computed to predict new optimum solutions. The flow analysis results for the demonstrative example are compared with the experimental data. It is shown that the method is more efficient than the traditional methods.
Two tradeoffs between economy and reliability in loss of load probability constrained unit commitment

NASA Astrophysics Data System (ADS)

Liu, Yuan; Wang, Mingqiang; Ning, Xingyao

2018-02-01

Spinning reserve (SR) should be scheduled considering the balance between economy and reliability. To address the computational intractability caused by the computation of loss of load probability (LOLP), many probabilistic methods use simplified formulations of LOLP to improve the computational efficiency. Two tradeoffs embedded in the SR optimization model are not explicitly analyzed in these methods. In this paper, two tradeoffs between economy and reliability, a primary tradeoff and a secondary tradeoff, in the maximum LOLP constrained unit commitment (UC) model are explored and analyzed in a small system and in the IEEE-RTS system. The analysis of the two tradeoffs can help in establishing new efficient simplified LOLP formulations and new SR optimization models.
AN EFFICIENT HIGHER-ORDER FAST MULTIPOLE BOUNDARY ELEMENT SOLUTION FOR POISSON-BOLTZMANN BASED MOLECULAR ELECTROSTATICS

PubMed Central

Bajaj, Chandrajit; Chen, Shun-Chuan; Rand, Alexander

2011-01-01

In order to compute polarization energy of biomolecules, we describe a boundary element approach to solving the linearized Poisson-Boltzmann equation. Our approach combines several important features including the derivative boundary formulation of the problem and a smooth approximation of the molecular surface based on the algebraic spline molecular surface. State of the art software for numerical linear algebra and the kernel independent fast multipole method is used for both simplicity and efficiency of our implementation. We perform a variety of computational experiments, testing our method on a number of actual proteins involved in molecular docking and demonstrating the effectiveness of our solver for computing molecular polarization energy. PMID:21660123
Accurate, efficient, and (iso)geometrically flexible collocation methods for phase-field models

NASA Astrophysics Data System (ADS)

Gomez, Hector; Reali, Alessandro; Sangalli, Giancarlo

2014-04-01

We propose new collocation methods for phase-field models. Our algorithms are based on isogeometric analysis, a new technology that makes use of functions from computational geometry, such as, for example, Non-Uniform Rational B-Splines (NURBS). NURBS exhibit excellent approximability and controllable global smoothness, and can represent exactly most geometries encapsulated in Computer Aided Design (CAD) models. These attributes permitted us to derive accurate, efficient, and geometrically flexible collocation methods for phase-field models. The performance of our method is demonstrated by several numerical examples of phase separation modeled by the Cahn-Hilliard equation. We feel that our method successfully combines the geometrical flexibility of finite elements with the accuracy and simplicity of pseudo-spectral collocation methods, and is a viable alternative to classical collocation methods.
PRECONDITIONED CONJUGATE-GRADIENT 2 (PCG2), a computer program for solving ground-water flow equations

USGS Publications Warehouse

Hill, Mary C.

1990-01-01

This report documents PCG2, a numerical code to be used with the U.S. Geological Survey modular three-dimensional, finite-difference, ground-water flow model. PCG2 uses the preconditioned conjugate-gradient method to solve the equations produced by the model for hydraulic head. Linear or nonlinear flow conditions may be simulated. PCG2 includes two preconditioning options: modified incomplete Cholesky preconditioning, which is efficient on scalar computers; and polynomial preconditioning, which requires less computer storage and, with modifications that depend on the computer used, is most efficient on vector computers. Convergence of the solver is determined using both head-change and residual criteria. Nonlinear problems are solved using Picard iterations. This documentation provides a description of the preconditioned conjugate gradient method and the two preconditioners, detailed instructions for linking PCG2 to the modular model, sample data inputs, a brief description of PCG2, and a FORTRAN listing.
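The structure of a preconditioned conjugate-gradient iteration can be sketched in a few lines of Python. The example below is not PCG2: it swaps in a simple Jacobi (diagonal) preconditioner in place of the modified incomplete Cholesky and polynomial preconditioners the report describes, uses a dense symmetric positive-definite matrix, and tests convergence on the residual norm only. The function name `pcg_jacobi` and its parameters are assumptions chosen here.

```python
def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A by Jacobi-preconditioned CG."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * n
    r = list(b)                                  # residual b - A x for x = 0
    z = [r[i] / A[i][i] for i in range(n)]       # z = M^{-1} r with M = diag(A)
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:               # residual criterion only
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                  # step length along direction p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [r[i] / A[i][i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x
```

Only the two lines that compute `z` depend on the preconditioner, so a different choice such as an incomplete factorization would replace the division by the diagonal with a pair of triangular solves while leaving the rest of the loop unchanged.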
The multifacet graphically contracted function method. I. Formulation and implementation

NASA Astrophysics Data System (ADS)

Shepard, Ron; Gidofalvi, Gergely; Brozell, Scott R.

2014-08-01

The basic formulation for the multifacet generalization of the graphically contracted function (MFGCF) electronic structure method is presented. The analysis includes the discussion of linear dependency and redundancy of the arc factor parameters, the computation of reduced density matrices, Hamiltonian matrix construction, spin-density matrix construction, the computation of optimization gradients for single-state and state-averaged calculations, graphical wave function analysis, and the efficient computation of configuration state function and Slater determinant expansion coefficients. Timings are given for Hamiltonian matrix element and analytic optimization gradient computations for a range of model problems for full-CI Shavitt graphs, and it is observed that both the energy and the gradient computation scale as O(N^2 n^4) for N electrons and n orbitals. The important arithmetic operations are within dense matrix-matrix product computational kernels, resulting in a computationally efficient procedure. An initial implementation of the method is used to present applications to several challenging chemical systems, including N2 dissociation, cubic H8 dissociation, the symmetric dissociation of H2O, and the insertion of Be into H2. The results are compared to the exact full-CI values and also to those of the previous single-facet GCF expansion form.
The multifacet graphically contracted function method. I. Formulation and implementation.

PubMed

Shepard, Ron; Gidofalvi, Gergely; Brozell, Scott R

2014-08-14

The basic formulation for the multifacet generalization of the graphically contracted function (MFGCF) electronic structure method is presented. The analysis includes the discussion of linear dependency and redundancy of the arc factor parameters, the computation of reduced density matrices, Hamiltonian matrix construction, spin-density matrix construction, the computation of optimization gradients for single-state and state-averaged calculations, graphical wave function analysis, and the efficient computation of configuration state function and Slater determinant expansion coefficients. Timings are given for Hamiltonian matrix element and analytic optimization gradient computations for a range of model problems for full-CI Shavitt graphs, and it is observed that both the energy and the gradient computation scale as O(N^2 n^4) for N electrons and n orbitals. The important arithmetic operations are within dense matrix-matrix product computational kernels, resulting in a computationally efficient procedure. An initial implementation of the method is used to present applications to several challenging chemical systems, including N2 dissociation, cubic H8 dissociation, the symmetric dissociation of H2O, and the insertion of Be into H2. The results are compared to the exact full-CI values and also to those of the previous single-facet GCF expansion form.
Correlated histogram representation of Monte Carlo derived medical accelerator photon-output phase space

DOEpatents

Schach Von Wittenau, Alexis E.

2003-01-01

A method is provided to represent the calculated phase space of photons emanating from medical accelerators used in photon teletherapy. The method reproduces the energy distributions and trajectories of the photons originating in the bremsstrahlung target and of photons scattered by components within the accelerator head. The method reproduces the energy and directional information from sources up to several centimeters in radial extent, so it is expected to generalize well to accelerators made by different manufacturers. The method is computationally both fast and efficient, with an overall sampling efficiency of 80% or higher for most field sizes. The computational cost is independent of the number of beams used in the treatment plan.
Solution of nonlinear time-dependent PDEs through componentwise approximation of matrix functions

NASA Astrophysics Data System (ADS)

Cibotarica, Alexandru; Lambers, James V.; Palchak, Elisabeth M.

2016-09-01

Exponential propagation iterative (EPI) methods provide an efficient approach to the solution of large stiff systems of ODEs, compared to standard integrators. However, the bulk of the computational effort in these methods is due to products of matrix functions and vectors, which can become very costly at high resolution due to an increase in the number of Krylov projection steps needed to maintain accuracy. In this paper, it is proposed to modify EPI methods by using Krylov subspace spectral (KSS) methods, instead of standard Krylov projection methods, to compute products of matrix functions and vectors. Numerical experiments demonstrate that this modification causes the number of Krylov projection steps to become bounded independently of the grid size, thus dramatically improving efficiency and scalability. As a result, for each test problem featured, as the total number of grid points increases, the growth in computation time is just below linear, while other methods achieved this only on selected test problems or not at all.
Overcoming Geometry-Induced Stiffness with IMplicit-Explicit (IMEX) Runge-Kutta Algorithms on Unstructured Grids with Applications to CEM, CFD, and CAA

NASA Technical Reports Server (NTRS)

Kanevsky, Alex

2004-01-01

My goal is to develop and implement efficient, accurate, and robust Implicit-Explicit Runge-Kutta (IMEX RK) methods [9] for overcoming geometry-induced stiffness with applications to computational electromagnetics (CEM), computational fluid dynamics (CFD) and computational aeroacoustics (CAA). IMEX algorithms solve the non-stiff portions of the domain using explicit methods, and isolate and solve the more expensive stiff portions using implicit methods. Current algorithms in CEM can only simulate purely harmonic (up to 10 GHz plane wave) EM scattering by fighter aircraft, which are assumed to be pure metallic shells, and cannot handle the inclusion of coatings, penetration into and radiation out of the aircraft. Efficient IMEX RK methods could potentially increase current CEM capabilities by 1-2 orders of magnitude, allowing scientists and engineers to attack more challenging and realistic problems.
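The implicit-explicit idea can be illustrated on a scalar model problem. The Python sketch below uses first-order IMEX Euler, not the higher-order IMEX Runge-Kutta schemes the abstract refers to, and it splits a single equation by term rather than splitting a mesh by region. The model equation y' = -lam * y + g(y) and the function names are assumptions chosen here for illustration.

```python
def explicit_euler(y0, lam, g, dt, steps):
    """Fully explicit Euler for y' = -lam * y + g(y).

    The linear part is unstable once dt * lam exceeds 2.
    """
    y = y0
    for _ in range(steps):
        y = y + dt * (-lam * y + g(y))
    return y

def imex_euler(y0, lam, g, dt, steps):
    """First-order IMEX Euler: stiff term -lam * y implicit, g(y) explicit."""
    y = y0
    for _ in range(steps):
        # Solve y_new = y + dt * g(y) - dt * lam * y_new for y_new.
        y = (y + dt * g(y)) / (1.0 + dt * lam)
    return y
```

With lam = 1000, g(y) = 1 and dt = 0.01, the explicit scheme multiplies the error by -9 each step and diverges, whereas the IMEX scheme relaxes to the steady state 1/lam at the same step size. That difference is the motivation for treating only the stiff portion implicitly.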

Reproducing Quantum Probability Distributions at the Speed of Classical Dynamics: A New Approach for Developing Force-Field Functors.

PubMed

Sundar, Vikram; Gelbwaser-Klimovsky, David; Aspuru-Guzik, Alán

2018-04-05

Modeling nuclear quantum effects is required for accurate molecular dynamics (MD) simulations of molecules. The community has paid special attention to water and other biomolecules that show hydrogen bonding. Standard methods of modeling nuclear quantum effects like Ring Polymer Molecular Dynamics (RPMD) are computationally costlier than running classical trajectories. A force-field functor (FFF) is an alternative method that computes an effective force field that replicates quantum properties of the original force field. In this work, we propose an efficient method of computing FFF using the Wigner-Kirkwood expansion. As a test case, we calculate a range of thermodynamic properties of Neon, obtaining the same level of accuracy as RPMD, but with the shorter runtime of classical simulations. By modifying existing MD programs, the proposed method could be used in the future to increase the efficiency and accuracy of MD simulations involving water and proteins.
Cut set-based risk and reliability analysis for arbitrarily interconnected networks

DOEpatents

Wyss, Gregory D.

2000-01-01

Method for computing all-terminal reliability for arbitrarily interconnected networks such as the United States public switched telephone network. The method includes an efficient search algorithm to generate minimal cut sets for nonhierarchical networks directly from the network connectivity diagram. Efficiency of the search algorithm stems in part from its basis on only link failures. The method also includes a novel quantification scheme that likewise reduces computational effort associated with assessing network reliability based on traditional risk importance measures. Vast reductions in computational effort are realized since combinatorial expansion and subsequent Boolean reduction steps are eliminated through analysis of network segmentations using a technique of assuming node failures to occur on only one side of a break in the network, and repeating the technique for all minimal cut sets generated with the search algorithm. The method functions equally well for planar and non-planar networks.
Massively parallel multicanonical simulations

NASA Astrophysics Data System (ADS)

Gross, Jonathan; Zierenberg, Johannes; Weigel, Martin; Janke, Wolfhard

2018-03-01

Generalized-ensemble Monte Carlo simulations such as the multicanonical method and similar techniques are among the most efficient approaches for simulations of systems undergoing discontinuous phase transitions or with rugged free-energy landscapes. As Markov chain methods, they are inherently serial computationally. It was demonstrated recently, however, that a combination of independent simulations that communicate weight updates at variable intervals allows for the efficient utilization of parallel computational resources for multicanonical simulations. Implementing this approach for the many-thread architecture provided by current generations of graphics processing units (GPUs), we show how it can be efficiently employed with of the order of 10^4 parallel walkers and beyond, thus constituting a versatile tool for Monte Carlo simulations in the era of massively parallel computing. We provide the fully documented source code for the approach applied to the paradigmatic example of the two-dimensional Ising model as starting point and reference for practitioners in the field.
Efficiency analysis of numerical integrations for finite element substructure in real-time hybrid simulation

NASA Astrophysics Data System (ADS)

Wang, Jinting; Lu, Liqiao; Zhu, Fei

2018-01-01

Finite element (FE) is a powerful tool and has been applied by investigators to real-time hybrid simulations (RTHSs). This study focuses on the computational efficiency, including the computational time and accuracy, of numerical integrations in solving FE numerical substructure in RTHSs. First, sparse matrix storage schemes are adopted to decrease the computational time of FE numerical substructure. In this way, the task execution time (TET) decreases such that the scale of the numerical substructure model increases. Subsequently, several commonly used explicit numerical integration algorithms, including the central difference method (CDM), the Newmark explicit method, the Chang method and the Gui-λ method, are comprehensively compared to evaluate their computational time in solving FE numerical substructure. CDM is better than the other explicit integration algorithms when the damping matrix is diagonal, while the Gui-λ (λ = 4) method is advantageous when the damping matrix is non-diagonal. Finally, the effect of time delay on the computational accuracy of RTHSs is investigated by simulating structure-foundation systems. Simulation results show that the influences of time delay on the displacement response become obvious with the mass ratio increasing, and delay compensation methods may reduce the relative error of the displacement peak value to less than 5% even under the large time-step and large time delay.
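As a rough illustration of the central difference method (CDM) named in this abstract, the sketch below integrates a single-degree-of-freedom system m*u'' + c*u' + k*u = f(t). It is a textbook form of the scheme, not the authors' FE substructure code; the function name and the test parameters are assumptions chosen so that the result can be checked against a known solution.

```python
import math


def central_difference(m, c, k, force, u0, v0, h, steps):
    """Explicit central difference integration of m*u'' + c*u' + k*u = f(t).

    Returns the displacement after `steps` steps of size h.
    """
    # Initial acceleration from the equation of motion at t = 0.
    a0 = (force(0.0) - c * v0 - k * u0) / m
    # Fictitious displacement at t = -h needed to start the recursion.
    u_prev = u0 - h * v0 + 0.5 * h * h * a0
    u = u0
    k_eff = m / h**2 + c / (2.0 * h)
    for n in range(steps):
        rhs = (force(n * h)
               - (k - 2.0 * m / h**2) * u
               - (m / h**2 - c / (2.0 * h)) * u_prev)
        u_prev, u = u, rhs / k_eff
    return u


# Assumed test case: undamped free vibration with period 1 s,
# so the displacement should return to u0 after one period.
omega = 2.0 * math.pi
u_end = central_difference(m=1.0, c=0.0, k=omega**2,
                           force=lambda t: 0.0,
                           u0=1.0, v0=0.0, h=0.001, steps=1000)
print(u_end)
```

In the multi-degree-of-freedom case the scalar `k_eff` becomes a matrix built from the mass and damping matrices. When both are diagonal, the solve reduces to an element-wise division, which is consistent with the abstract's remark that CDM is favourable for a diagonal damping matrix.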
Review of Computational Stirling Analysis Methods

NASA Technical Reports Server (NTRS)

Dyson, Rodger W.; Wilson, Scott D.; Tew, Roy C.

2004-01-01

Nuclear thermal to electric power conversion carries the promise of longer duration missions and higher scientific data transmission rates back to Earth for both Mars rovers and deep space missions. A free-piston Stirling convertor is a candidate technology that is considered an efficient and reliable power conversion device for such purposes. While already very efficient, it is believed that better Stirling engines can be developed if the losses inherent in its current designs could be better understood. However, they are difficult to instrument and so efforts are underway to simulate a complete Stirling engine numerically. This has only recently been attempted and a review of the methods leading up to and including such computational analysis is presented. And finally it is proposed that the quality and depth of Stirling loss understanding may be improved by utilizing the higher fidelity and efficiency of recently developed numerical methods. One such method, the Ultra HI-FI technique, is presented in detail.
A COMPUTATIONALLY EFFICIENT HYBRID APPROACH FOR DYNAMIC GAS/AEROSOL TRANSFER IN AIR QUALITY MODELS. (R826371C005)

EPA Science Inventory

Dynamic mass transfer methods have been developed to better describe the interaction of the aerosol population with semi-volatile species such as nitrate, ammonia, and chloride. Unfortunately, these dynamic methods are computationally expensive. Assumptions are often made to r...
The Location of Sources of Human Computer Processed Cerebral Potentials for the Automated Assessment of Visual Field Impairment

PubMed Central

Leisman, Gerald; Ashkenazi, Maureen

1979-01-01

Objective psychophysical techniques for investigating visual fields are described. The paper concerns methods for the collection and analysis of evoked potentials using a small laboratory computer and provides efficient methods for obtaining information about the conduction pathways of the visual system.
Procedure and computer program to calculate machine contribution to sawmill recovery

Treesearch

Philip H. Steele; Hiram Hallock; Stanford Lunstrum

1981-01-01

The importance of considering individual machine contribution to total mill efficiency is discussed. A method for accurately calculating machine contribution is introduced, and an example is given using this method. A FORTRAN computer program to make the necessary complex calculations automatically is also presented with user instructions.
Comparing Server Energy Use and Efficiency Using Small Sample Sizes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Coles, Henry C.; Qin, Yong; Price, Phillip N.

This report documents a demonstration that compared the energy consumption and efficiency of a limited sample size of server-type IT equipment from different manufacturers by measuring power at the server power supply power cords. The results are specific to the equipment and methods used. However, it is hoped that those responsible for IT equipment selection can use the methods described to choose models that optimize energy use efficiency. The demonstration was conducted in a data center at Lawrence Berkeley National Laboratory in Berkeley, California. It was performed with five servers of similar mechanical and electronic specifications; three from Intel and one each from Dell and Supermicro. Server IT equipment is constructed using commodity components, server manufacturer-designed assemblies, and control systems. Server compute efficiency is constrained by the commodity component specifications and integration requirements. The design freedom, outside of the commodity component constraints, provides room for the manufacturer to offer a product with competitive efficiency that meets market needs at a compelling price. A goal of the demonstration was to compare and quantify the server efficiency for three different brands. The efficiency is defined as the average compute rate (computations per unit of time) divided by the average energy consumption rate. The research team used an industry standard benchmark software package to provide a repeatable software load to obtain the compute rate and provide a variety of power consumption levels. Energy use when the servers were in an idle state (not providing computing work) was also measured. At high server compute loads, all brands, using the same key components (processors and memory), had similar results; therefore, from these results, it could not be concluded that one brand is more efficient than the other brands.
The test results show that the power consumption variability caused by the key components as a group is similar to all other components as a group. However, some differences were observed. The Supermicro server used 27 percent more power at idle compared to the other brands. The Intel server had a power supply control feature called cold redundancy, and the data suggest that cold redundancy can provide energy savings at low power levels. Test and evaluation methods that might be used by others having limited resources for IT equipment evaluation are explained in the report.
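The efficiency definition stated in this report (average compute rate divided by average energy consumption rate) can be written out as a small calculation. The sketch below is only an illustration of that ratio; the function name and the numbers are hypothetical and are not measurements from the report.

```python
def compute_efficiency(ops_completed, duration_s, energy_j):
    """Average compute rate divided by average power draw.

    Returns operations per joule, since (ops/s) / (J/s) = ops/J.
    """
    compute_rate = ops_completed / duration_s   # operations per second
    average_power = energy_j / duration_s       # watts (joules per second)
    return compute_rate / average_power


# Hypothetical benchmark run: 1e6 operations in 100 s using 25 kJ.
efficiency = compute_efficiency(1_000_000, 100.0, 25_000.0)
print(efficiency, "operations per joule")
```

Because the run duration cancels, the metric depends only on the work completed and the energy consumed, provided both are measured over the same interval.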
Radiation shielding evaluation of the BNCT treatment room at THOR: a TORT-coupled MCNP Monte Carlo simulation study.

PubMed

Chen, A Y; Liu, Y-W H; Sheu, R J

2008-01-01

This study investigates the radiation shielding design of the treatment room for boron neutron capture therapy at Tsing Hua Open-pool Reactor using the "TORT-coupled MCNP" method. With this method, the computational efficiency is improved significantly by two to three orders of magnitude compared to the analog Monte Carlo MCNP calculation. This makes the calculation feasible using a single CPU in less than 1 day. Further optimization of the photon weight windows leads to an additional 50-75% improvement in the overall computational efficiency.
Turbofan Duct Propagation Model

NASA Technical Reports Server (NTRS)

Lan, Justin H.; Posey, Joe W. (Technical Monitor)

2001-01-01

The CDUCT code utilizes a parabolic approximation to the convected Helmholtz equation in order to efficiently model acoustic propagation in acoustically treated, complex shaped ducts. The parabolic approximation solves one-way wave propagation with a marching method which neglects backwards reflected waves. The derivation of the parabolic approximation is presented. Several code validation cases are given. An acoustic lining design process for an example aft fan duct is discussed. It is noted that the method can efficiently model realistic three-dimensional effects, acoustic lining, and flow within the computational capabilities of a typical computer workstation.
A deconvolution extraction method for 2D multi-object fibre spectroscopy based on the regularized least-squares QR-factorization algorithm

NASA Astrophysics Data System (ADS)

Yu, Jian; Yin, Qian; Guo, Ping; Luo, A.-li

2014-09-01

This paper presents an efficient method for the extraction of astronomical spectra from two-dimensional (2D) multifibre spectrographs based on the regularized least-squares QR-factorization (LSQR) algorithm. We address two issues: we propose a modified Gaussian point spread function (PSF) for modelling the 2D PSF from multi-emission-line gas-discharge lamp images (arc images), and we develop an efficient deconvolution method to extract spectra in real circumstances. The proposed modified 2D Gaussian PSF model can fit various types of 2D PSFs, including different radial distortion angles and ellipticities. We adopt the regularized LSQR algorithm to solve the sparse linear equations constructed from the sparse convolution matrix, which we designate the deconvolution spectrum extraction method. Furthermore, we implement a parallelized LSQR algorithm based on graphics processing unit programming in the Compute Unified Device Architecture to accelerate the computational processing. Experimental results illustrate that the proposed extraction method can greatly reduce the computational cost and memory use of the deconvolution method and, consequently, increase its efficiency and practicability. In addition, the proposed extraction method has a stronger noise tolerance than other methods, such as the boxcar (aperture) extraction and profile extraction methods. Finally, we present an analysis of the sensitivity of the extraction results to the radius and full width at half-maximum of the 2D PSF.
10 CFR 431.197 - Manufacturer's determination of efficiency for distribution transformers.

Code of Federal Regulations, 2010 CFR

2010-01-01

... methods used; the mathematical model, the engineering or statistical analysis, computer simulation or... (b)(3) of this section, or by application of an alternative efficiency determination method (AEDM... section only if: (i) The AEDM has been derived from a mathematical model that represents the electrical...
Distributed collaborative response surface method for mechanical dynamic assembly reliability design

NASA Astrophysics Data System (ADS)

Bai, Guangchen; Fei, Chengwei

2013-11-01

Because of the randomness of many impact factors influencing the dynamic assembly relationship of complex machinery, the reliability analysis of dynamic assembly relationship needs to be accomplished considering the randomness from a probabilistic perspective. To improve the accuracy and efficiency of dynamic assembly relationship reliability analysis, the mechanical dynamic assembly reliability (MDAR) theory and a distributed collaborative response surface method (DCRSM) are proposed. The mathematical model of DCRSM is established based on the quadratic response surface function, and verified by the assembly relationship reliability analysis of aeroengine high pressure turbine (HPT) blade-tip radial running clearance (BTRRC). Through the comparison of the DCRSM, the traditional response surface method (RSM) and the Monte Carlo method (MCM), the results show that the DCRSM is not only able to accomplish a computational task that is impossible for the other methods when the number of simulations exceeds 100 000, but also that the computational precision of the DCRSM is basically consistent with the MCM and improved by 0.40-4.63% relative to the RSM; furthermore, the computational efficiency of the DCRSM is up to about 188 times that of the MCM and 55 times that of the RSM under 10 000 simulations. The DCRSM is demonstrated to be a feasible and effective approach for markedly improving the computational efficiency and accuracy of MDAR analysis. Thus, the proposed research provides a promising theory and method for MDAR design and optimization, and opens a novel research direction of probabilistic analysis for developing high-performance, high-reliability aeroengines.
Interaction sorting method for molecular dynamics on multi-core SIMD CPU architecture.

PubMed

Matvienko, Sergey; Alemasov, Nikolay; Fomin, Eduard

2015-02-01

Molecular dynamics (MD) is widely used in computational biology for studying binding mechanisms of molecules, molecular transport, conformational transitions, protein folding, etc. The method is computationally expensive; thus, the demand for the development of novel, much more efficient algorithms is still high. Therefore, the new algorithm designed in 2007 and called interaction sorting (IS) clearly attracted interest, as it outperformed the most efficient MD algorithms. In this work, a new IS modification is proposed which allows the algorithm to utilize SIMD processor instructions. This paper shows that the improvement provides an additional gain in performance, 9% to 45% in comparison to the original IS method.
A class of parallel algorithms for computation of the manipulator inertia matrix

NASA Technical Reports Server (NTRS)

Fijany, Amir; Bejczy, Antal K.

1989-01-01

Parallel and parallel/pipeline algorithms for computation of the manipulator inertia matrix are presented. An algorithm based on composite rigid-body spatial inertia method, which provides better features for parallelization, is used for the computation of the inertia matrix. Two parallel algorithms are developed which achieve the time lower bound in computation. Also described is the mapping of these algorithms with topological variation on a two-dimensional processor array, with nearest-neighbor connection, and with cardinality variation on a linear processor array. An efficient parallel/pipeline algorithm for the linear array was also developed, but at significantly higher efficiency.
Increasing the computational efficient of digital cross correlation by a vectorization method

NASA Astrophysics Data System (ADS)

Chang, Ching-Yuan; Ma, Chien-Ching

2017-08-01

This study presents a vectorization method for use in MATLAB programming aimed at increasing the computational efficiency of digital cross correlation in sound and images, resulting in a speedup of 6.387 and 36.044 times compared with performance values obtained from looped expression. This work bridges the gap between matrix operations and loop iteration, preserving flexibility and efficiency in program testing. This paper uses numerical simulation to verify the speedup of the proposed vectorization method as well as experiments to measure the quantitative transient displacement response subjected to dynamic impact loading. The experiment involved the use of a high speed camera as well as a fiber optic system to measure the transient displacement in a cantilever beam under impact from a steel ball. Experimental measurement data obtained from the two methods are in excellent agreement in both the time and frequency domain, with discrepancies of only 0.68%. Numerical and experiment results demonstrate the efficacy of the proposed vectorization method with regard to computational speed in signal processing and high precision in the correlation algorithm. We also present the source code with which to build MATLAB-executable functions on Windows as well as Linux platforms, and provide a series of examples to demonstrate the application of the proposed vectorization method.
Virtualization and cloud computing in dentistry.

PubMed

Chow, Frank; Muftu, Ali; Shorter, Richard

2014-01-01

The use of virtualization and cloud computing has changed the way we use computers. Virtualization is a method of placing software called a hypervisor on the hardware of a computer or a host operating system. It allows a guest operating system to run on top of the physical computer with a virtual machine (i.e., virtual computer). Virtualization allows multiple virtual computers to run on top of one physical computer and to share its hardware resources, such as printers, scanners, and modems. This increases the efficient use of the computer by decreasing costs (e.g., hardware, electricity administration, and management) since only one physical computer is needed and running. This virtualization platform is the basis for cloud computing. It has expanded into areas of server and storage virtualization. One of the commonly used dental storage systems is cloud storage. Patient information is encrypted as required by the Health Insurance Portability and Accountability Act (HIPAA) and stored on off-site private cloud services for a monthly service fee. As computer costs continue to increase, so too will the need for more storage and processing power. Virtual and cloud computing will be a method for dentists to minimize costs and maximize computer efficiency in the near future. This article will provide some useful information on current uses of cloud computing.
Thermal radiation view factor: Methods, accuracy and computer-aided procedures

NASA Technical Reports Server (NTRS)

Kadaba, P. V.

1982-01-01

Computer-aided thermal analysis programs, which predict whether orbiting equipment will stay within a predetermined acceptable temperature range before it is stationed in various attitudes with respect to the Sun and the Earth, were examined. The complexity of the surface geometries suggests the use of numerical schemes for the determination of the view factors. Basic definitions and standard methods which form the basis for various digital computer methods and various numerical methods are presented. The physical model and the mathematical methods on which a number of available programs are built are summarized. The strengths and weaknesses of the methods employed, the accuracy of the calculations and the time required for computations are evaluated. The situations where accuracies are important for energy calculations are identified and methods to save computational time are proposed. A guide to the best use of the available programs at several centers and the future choices for efficient use of digital computers are included in the recommendations.
Fast Bayesian experimental design: Laplace-based importance sampling for the expected information gain

NASA Astrophysics Data System (ADS)

Beck, Joakim; Dia, Ben Mansour; Espath, Luis F. R.; Long, Quan; Tempone, Raúl

2018-06-01

In calculating expected information gain in optimal Bayesian experimental design, the computation of the inner loop in the classical double-loop Monte Carlo requires a large number of samples and suffers from underflow if the number of samples is small. These drawbacks can be avoided by using an importance sampling approach. We present a computationally efficient method for optimal Bayesian experimental design that introduces importance sampling based on the Laplace method to the inner loop. We derive the optimal values for the method parameters in which the average computational cost is minimized according to the desired error tolerance. We use three numerical examples to demonstrate the computational efficiency of our method compared with the classical double-loop Monte Carlo, and a more recent single-loop Monte Carlo method that uses the Laplace method as an approximation of the return value of the inner loop. The first example is a scalar problem that is linear in the uncertain parameter. The second example is a nonlinear scalar problem. The third example deals with the optimal sensor placement for an electrical impedance tomography experiment to recover the fiber orientation in laminate composites.
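For orientation, the classical double-loop Monte Carlo estimator that this abstract takes as its baseline can be sketched on a toy model. The sketch does not implement the paper's Laplace-based importance sampling. It only shows the nested structure and a standard log-sum-exp guard against the underflow the abstract mentions. The model (y = theta + noise with a standard normal prior), the function names, and the sample sizes are all assumptions chosen because the exact expected information gain is known in closed form for this case.

```python
import math
import random


def log_gauss(x, mean, sigma):
    """Log density of a normal distribution N(mean, sigma^2) at x."""
    return (-0.5 * ((x - mean) / sigma) ** 2
            - math.log(sigma * math.sqrt(2.0 * math.pi)))


def log_mean_exp(log_values):
    """log(mean(exp(v))) computed without underflow."""
    peak = max(log_values)
    total = sum(math.exp(v - peak) for v in log_values)
    return peak + math.log(total / len(log_values))


def eig_double_loop(sigma, n_outer, n_inner, rng):
    """Nested Monte Carlo estimate of the expected information gain
    for y = theta + eps, theta ~ N(0, 1), eps ~ N(0, sigma^2)."""
    total = 0.0
    for _ in range(n_outer):
        theta = rng.gauss(0.0, 1.0)
        y = theta + rng.gauss(0.0, sigma)
        log_likelihood = log_gauss(y, theta, sigma)
        # Inner loop: estimate the evidence p(y) with fresh prior samples.
        inner = [log_gauss(y, rng.gauss(0.0, 1.0), sigma)
                 for _ in range(n_inner)]
        total += log_likelihood - log_mean_exp(inner)
    return total / n_outer


SIGMA = 0.5
estimate = eig_double_loop(SIGMA, n_outer=1000, n_inner=200,
                           rng=random.Random(0))
exact = 0.5 * math.log(1.0 + 1.0 / SIGMA**2)  # closed form for this toy model
print(estimate, exact)
```

The cost is n_outer * n_inner likelihood evaluations, which is the expense that motivates replacing the inner prior sampling with a better-targeted proposal.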

A SCILAB Program for Computing General-Relativistic Models of Rotating Neutron Stars by Implementing Hartle's Perturbation Method

NASA Astrophysics Data System (ADS)

Papasotiriou, P. J.; Geroyannis, V. S.

We implement Hartle's perturbation method to the computation of relativistic rigidly rotating neutron star models. The program has been written in SCILAB (© INRIA ENPC), a matrix-oriented high-level programming language. The numerical method is described in great detail and is applied to many models in slow or fast rotation. We show that, although the method is perturbative, it gives accurate results for all practical purposes and it should prove an efficient tool for computing rapidly rotating pulsars.
A systematic and efficient method to compute multi-loop master integrals

NASA Astrophysics Data System (ADS)

Liu, Xiao; Ma, Yan-Qing; Wang, Chen-Yu

2018-04-01

We propose a novel method to compute multi-loop master integrals by constructing and numerically solving a system of ordinary differential equations, with almost trivial boundary conditions. Thus it can be systematically applied to problems with arbitrary kinematic configurations. Numerical tests show that our method can not only achieve results with high precision, but also be much faster than the only existing systematic method, sector decomposition. As a by-product, we find a new strategy to compute scalar one-loop integrals without reducing them to master integrals.
DNS of Flow in a Low-Pressure Turbine Cascade Using a Discontinuous-Galerkin Spectral-Element Method

NASA Technical Reports Server (NTRS)

Garai, Anirban; Diosady, Laslo Tibor; Murman, Scott; Madavan, Nateri

2015-01-01

A new computational capability under development for accurate and efficient high-fidelity direct numerical simulation (DNS) and large eddy simulation (LES) of turbomachinery is described. This capability is based on an entropy-stable Discontinuous-Galerkin spectral-element approach that extends to arbitrarily high orders of spatial and temporal accuracy and is implemented in a computationally efficient manner on a modern high performance computer architecture. A validation study using this method to perform DNS of flow in a low-pressure turbine airfoil cascade is presented. Preliminary results indicate that the method captures the main features of the flow. Discrepancies between the predicted results and the experiments are likely due to the effects of freestream turbulence not being included in the simulation and will be addressed in the final paper.
An efficient numerical technique for calculating thermal spreading resistance

NASA Technical Reports Server (NTRS)

Gale, E. H., Jr.

1977-01-01

An efficient numerical technique for solving the equations resulting from finite difference analyses of fields governed by Poisson's equation is presented. The method is direct (noniterative) and the computer work required varies with the square of the order of the coefficient matrix. The computational work required varies with the cube of this order for standard inversion techniques, e.g., Gaussian elimination, Jordan, Doolittle, etc.
A-VCI: A flexible method to efficiently compute vibrational spectra

NASA Astrophysics Data System (ADS)

Odunlami, Marc; Le Bris, Vincent; Bégué, Didier; Baraille, Isabelle; Coulaud, Olivier

2017-06-01

The adaptive vibrational configuration interaction algorithm has been introduced as a new method to efficiently reduce the dimension of the set of basis functions used in a vibrational configuration interaction process. It is based on the construction of nested bases for the discretization of the Hamiltonian operator according to a theoretical criterion that ensures the convergence of the method. In the present work, the Hamiltonian is written as a sum of products of operators. The purpose of this paper is to study the properties and outline the performance details of the main steps of the algorithm. New parameters have been incorporated to increase flexibility, and their influence has been thoroughly investigated. The robustness and reliability of the method are demonstrated for the computation of the vibrational spectrum up to 3000 cm⁻¹ of a widely studied 6-atom molecule (acetonitrile). Our results are compared to the most accurate up to date computation; we also give a new reference calculation for future work on this system. The algorithm has also been applied to a more challenging 7-atom molecule (ethylene oxide). The computed spectrum up to 3200 cm⁻¹ is the most accurate computation that exists today on such systems.
A-VCI: A flexible method to efficiently compute vibrational spectra.

PubMed

Odunlami, Marc; Le Bris, Vincent; Bégué, Didier; Baraille, Isabelle; Coulaud, Olivier

2017-06-07

The adaptive vibrational configuration interaction algorithm has been introduced as a new method to efficiently reduce the dimension of the set of basis functions used in a vibrational configuration interaction process. It is based on the construction of nested bases for the discretization of the Hamiltonian operator according to a theoretical criterion that ensures the convergence of the method. In the present work, the Hamiltonian is written as a sum of products of operators. The purpose of this paper is to study the properties and outline the performance details of the main steps of the algorithm. New parameters have been incorporated to increase flexibility, and their influence has been thoroughly investigated. The robustness and reliability of the method are demonstrated for the computation of the vibrational spectrum up to 3000 cm⁻¹ of a widely studied 6-atom molecule (acetonitrile). Our results are compared to the most accurate up to date computation; we also give a new reference calculation for future work on this system. The algorithm has also been applied to a more challenging 7-atom molecule (ethylene oxide). The computed spectrum up to 3200 cm⁻¹ is the most accurate computation that exists today on such systems.
A Very High Order, Adaptable MESA Implementation for Aeroacoustic Computations

NASA Technical Reports Server (NTRS)

Dydson, Roger W.; Goodrich, John W.

2000-01-01

Since computational efficiency and wave resolution scale with accuracy, the ideal would be infinitely high accuracy for problems with widely varying wavelength scales. Currently, many of the computational aeroacoustics methods are limited to 4th order accurate Runge-Kutta methods in time which limits their resolution and efficiency. However, a new procedure for implementing the Modified Expansion Solution Approximation (MESA) schemes, based upon Hermitian divided differences, is presented which extends the effective accuracy of the MESA schemes to 57th order in space and time when using 128 bit floating point precision. This new approach has the advantages of reducing round-off error, being easy to program, and being more computationally efficient when compared to previous approaches. Its accuracy is limited only by the floating point hardware. The advantages of this new approach are demonstrated by solving the linearized Euler equations in an open bi-periodic domain. A 500th order MESA scheme can now be created in seconds, making these schemes ideally suited for the next generation of high performance 256-bit (double quadruple) or higher precision computers. This ease of creation makes it possible to adapt the algorithm to the mesh in time instead of its converse: this is ideal for resolving varying wavelength scales which occur in noise generation simulations. And finally, the sources of round-off error which affect the very high order methods are examined and remedies provided that effectively increase the accuracy of the MESA schemes while using current computer technology.
Efficient computation of the joint sample frequency spectra for multiple populations.

PubMed

Kamm, John A; Terhorst, Jonathan; Song, Yun S

2017-01-01

A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity.
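To make the summary statistic concrete, the sketch below tabulates the observed sample frequency spectrum from a small matrix of haplotypes. It covers only the empirical SFS for a single population, not the expected joint SFS that momi computes under a demographic model. The function name, the 0/1 encoding (1 marks the mutant allele), and the toy data are assumptions for illustration.

```python
def sample_frequency_spectrum(haplotypes):
    """Observed SFS from a list of equal-length 0/1 haplotypes.

    Entry k-1 of the result counts the polymorphic sites at which
    exactly k of the n sampled sequences carry the mutant allele,
    for k = 1 .. n-1. Monomorphic sites are skipped.
    """
    n = len(haplotypes)
    sfs = [0] * (n - 1)
    for site in zip(*haplotypes):      # iterate over columns (sites)
        mutant_count = sum(site)
        if 0 < mutant_count < n:
            sfs[mutant_count - 1] += 1
    return sfs


# Hypothetical sample: 4 sequences, 4 sites.
haplotypes = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
sfs = sample_frequency_spectrum(haplotypes)
print(sfs)
```

Here two sites carry the mutant allele in one sequence, one site carries it in three, and the last site is monomorphic, so the spectrum has three entries regardless of how many sites were sequenced. That fixed length is the dimensional reduction the abstract refers to.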
Efficient computation of the joint sample frequency spectra for multiple populations

PubMed Central

Kamm, John A.; Terhorst, Jonathan; Song, Yun S.

2016-01-01

A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity. PMID:28239248
Adaptive θ-methods for pricing American options

NASA Astrophysics Data System (ADS)

Khaliq, Abdul Q. M.; Voss, David A.; Kazmi, Kamran

2008-12-01

We develop adaptive θ-methods for solving the Black-Scholes PDE for American options. By adding a small, continuous term, the Black-Scholes PDE becomes an advection-diffusion-reaction equation on a fixed spatial domain. Standard implementation of θ-methods would require a Newton-type iterative procedure at each time step, thereby increasing the computational complexity of the methods. Our linearly implicit approach avoids such complications. We establish a general framework under which θ-methods satisfy a discrete version of the positivity constraint characteristic of American options, and numerically demonstrate the sensitivity of the constraint. The positivity results are established for the single-asset and independent two-asset models. In addition, we have incorporated and analyzed an adaptive time-step control strategy to increase the computational efficiency. Numerical experiments are presented for one- and two-asset American options, using adaptive exponential splitting for two-asset problems. The approach is compared with an iterative solution of the two-asset problem in terms of computational efficiency.
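For readers unfamiliar with the θ-method family, the sketch below applies it to the scalar linear test equation u' = λu, where the implicit step can be solved in closed form. This is a textbook illustration only, not the authors' adaptive, linearly implicit scheme for the penalized Black-Scholes PDE; the function names and step sizes are assumptions.

```python
import math

def theta_step(u, lam, h, theta):
    """One theta-method step for u' = lam * u.

    theta = 0 is explicit Euler, 0.5 is Crank-Nicolson, 1 is implicit Euler.
    Derived from u_new = u + h * ((1 - theta) * lam * u + theta * lam * u_new).
    """
    return u * (1.0 + (1.0 - theta) * h * lam) / (1.0 - theta * h * lam)

def integrate(u0, lam, h, steps, theta):
    """Apply `steps` fixed-size theta-method steps starting from u0."""
    u = u0
    for _ in range(steps):
        u = theta_step(u, lam, h, theta)
    return u

exact = math.exp(-1.0)  # u(1) for u' = -u, u(0) = 1
err_cn = abs(integrate(1.0, -1.0, 0.1, 10, 0.5) - exact)  # Crank-Nicolson
err_ie = abs(integrate(1.0, -1.0, 0.1, 10, 1.0) - exact)  # implicit Euler
```

With the same step size, the second-order choice θ = 0.5 gives a visibly smaller error than the first-order choice θ = 1, which is the usual reason for preferring it when stability allows.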
Space Radiation Transport Methods Development

NASA Technical Reports Server (NTRS)

Wilson, J. W.; Tripathi, R. K.; Qualls, G. D.; Cucinotta, F. A.; Prael, R. E.; Norbury, J. W.; Heinbockel, J. H.; Tweed, J.

2002-01-01

Improved spacecraft shield design requires early entry of radiation constraints into the design process to maximize performance and minimize costs. As a result, we have been investigating high-speed computational procedures to allow shield analysis from the preliminary design concepts to the final design. In particular, we will discuss the progress towards a full three-dimensional and computationally efficient deterministic code for which the current HZETRN evaluates the lowest order asymptotic term. HZETRN is the first deterministic solution to the Boltzmann equation allowing field mapping within the International Space Station (ISS) in tens of minutes using standard Finite Element Method (FEM) geometry common to engineering design practice enabling development of integrated multidisciplinary design optimization methods. A single ray trace in ISS FEM geometry requires 14 milliseconds and severely limits application of Monte Carlo methods to such engineering models. A potential means of improving the Monte Carlo efficiency in coupling to spacecraft geometry is given in terms of reconfigurable computing and could be utilized in the final design as verification of the deterministic method optimized design.
SGFSC: speeding the gene functional similarity calculation based on hash tables.

PubMed

Tian, Zhen; Wang, Chunyu; Guo, Maozu; Liu, Xiaoyan; Teng, Zhixia

2016-11-04

In recent years, many measures of gene functional similarity have been proposed and widely used in all kinds of essential research. These methods are mainly divided into two categories: pairwise approaches and group-wise approaches. However, a common problem with these methods is their time consumption, especially when measuring the gene functional similarities of a large number of gene pairs. The problem of computational efficiency for pairwise approaches is even more prominent because they are dependent on the combination of semantic similarity. Therefore, the efficient measurement of gene functional similarity remains a challenging problem. To speed current gene functional similarity calculation methods, a novel two-step computing strategy is proposed: (1) establish a hash table for each method to store essential information obtained from the Gene Ontology (GO) graph and (2) measure gene functional similarity based on the corresponding hash table. There is no need to traverse the GO graph repeatedly for each method with the help of the hash table. The analysis of time complexity shows that the computational efficiency of these methods is significantly improved. We also implement a novel Speeding Gene Functional Similarity Calculation tool, namely SGFSC, which is bundled with seven typical measures using our proposed strategy. Further experiments show the great advantage of SGFSC in measuring gene functional similarity on the whole genomic scale. The proposed strategy is successful in speeding current gene functional similarity calculation methods. SGFSC is an efficient tool that is freely available at http://nclab.hit.edu.cn/SGFSC . The source code of SGFSC can be downloaded from http://pan.baidu.com/s/1dFFmvpZ .
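The two-step strategy (build a hash table once, then answer similarity queries from it) can be sketched as follows. The toy graph, the function names, and the Jaccard-style score are all assumptions for illustration; the score is a stand-in, not one of the seven measures bundled with SGFSC.

```python
def build_ancestor_table(parents):
    """Step 1: map every term to the frozenset of itself plus all its ancestors.

    `parents` maps each term to the list of its direct parents in the DAG.
    Each term's ancestor set is computed once and cached in `table`.
    """
    table = {}

    def ancestors(term):
        if term not in table:
            found = {term}
            for p in parents.get(term, ()):
                found |= ancestors(p)
            table[term] = frozenset(found)
        return table[term]

    for term in parents:
        ancestors(term)
    return table

def term_similarity(table, a, b):
    """Step 2: score two terms by table lookups only (no graph traversal)."""
    shared = table[a] & table[b]
    return len(shared) / len(table[a] | table[b])

# Hypothetical GO-like graph: child -> direct parents.
toy_parents = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "C": ["A"],
    "D": ["A", "B"],
}
table = build_ancestor_table(toy_parents)
sim_cd = term_similarity(table, "C", "D")
```

Once the table exists, each query costs a pair of dictionary lookups and a set operation, so scoring many term pairs no longer requires walking the graph again.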
Efficient least angle regression for identification of linear-in-the-parameters models

PubMed Central

Beach, Thomas H.; Rezgui, Yacine

2017-01-01

Least angle regression, as a promising model selection method, differentiates itself from conventional stepwise and stagewise methods, in that it is neither too greedy nor too slow. It is closely related to L1 norm optimization, which has the advantage of low prediction variance through sacrificing part of model bias property in order to enhance model generalization capability. In this paper, we propose an efficient least angle regression algorithm for model selection for a large class of linear-in-the-parameters models with the purpose of accelerating the model selection process. The entire algorithm works completely in a recursive manner, where the correlations between model terms and residuals, the evolving directions and other pertinent variables are derived explicitly and updated successively at every subset selection step. The model coefficients are only computed when the algorithm finishes. The direct involvement of matrix inversions is thereby relieved. A detailed computational complexity analysis indicates that the proposed algorithm possesses significant computational efficiency, compared with the original approach where the well-known efficient Cholesky decomposition is involved in solving least angle regression. Three artificial and real-world examples are employed to demonstrate the effectiveness, efficiency and numerical stability of the proposed algorithm. PMID:28293140
The demodulated band transform

PubMed Central

Kovach, Christopher K.; Gander, Phillip E.

2016-01-01

Background: Windowed Fourier decompositions (WFD) are widely used in measuring stationary and non-stationary spectral phenomena and in describing pairwise relationships among multiple signals. Although a variety of WFDs see frequent application in electrophysiological research, including the short-time Fourier transform, continuous wavelets, band-pass filtering and multitaper-based approaches, each carries certain drawbacks related to computational efficiency and spectral leakage. This work surveys the advantages of a WFD not previously applied in electrophysiological settings. New Methods: A computationally efficient form of complex demodulation, the demodulated band transform (DBT), is described. Results: DBT is shown to provide an efficient approach to spectral estimation with minimal susceptibility to spectral leakage. In addition, it lends itself well to adaptive filtering of non-stationary narrowband noise. Comparison with existing methods: A detailed comparison with alternative WFDs is offered, with an emphasis on the relationship between DBT and Thomson's multitaper. DBT is shown to perform favorably in combining computational efficiency with minimal introduction of spectral leakage. Conclusion: DBT is ideally suited to efficient estimation of both stationary and non-stationary spectral and cross-spectral statistics with minimal susceptibility to spectral leakage. These qualities are broadly desirable in many settings. PMID:26711370
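The abstract describes DBT as an efficient form of complex demodulation. The sketch below shows plain time-domain complex demodulation (frequency shift followed by a low-pass filter) so the underlying operation is visible; it is not the DBT algorithm itself, and the function name, the moving-average filter, and the test signal are assumptions.

```python
import cmath
import math

def complex_demodulate(x, fs, f0, window):
    """Shift the band at f0 down to 0 Hz, then low-pass with a moving average.

    Returns the complex envelope z; 2*|z| estimates amplitude, arg(z) phase.
    """
    shifted = [v * cmath.exp(-2j * math.pi * f0 * n / fs) for n, v in enumerate(x)]
    out = []
    for start in range(len(shifted) - window + 1):
        out.append(sum(shifted[start:start + window]) / window)
    return out

fs, f0, amp = 1000.0, 50.0, 3.0
x = [amp * math.cos(2 * math.pi * f0 * n / fs) for n in range(400)]
# 20 samples span one full 50 Hz cycle, so the 100 Hz image averages out.
envelope = complex_demodulate(x, fs, f0, window=20)
est_amp = 2 * abs(envelope[0])
```

The moving average here costs O(N × window) operations and is chosen for readability only; an efficiency-oriented method would replace it with something cheaper.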
Multi-GPU hybrid programming accelerated three-dimensional phase-field model in binary alloy

NASA Astrophysics Data System (ADS)

Zhu, Changsheng; Liu, Jieqiong; Zhu, Mingfang; Feng, Li

2018-03-01

In dendritic growth simulation, computational efficiency and problem scale strongly influence the simulation efficiency of the three-dimensional phase-field model. Seeking a high-performance calculation method that improves computational efficiency and expands the problem scale is therefore of great significance to research on material microstructure. A high-performance calculation method based on the MPI+CUDA hybrid programming model is introduced. Multiple GPUs are used to implement quantitative numerical simulations of the three-dimensional phase-field model in a binary alloy under coupled multi-physical processes. The acceleration effect of different GPU nodes on different calculation scales is explored. Building on the multi-GPU calculation model, two optimization schemes are proposed: non-blocking communication optimization, and overlap of MPI and GPU computing. The results of the two optimization schemes and the basic multi-GPU model are compared. The calculation results show that the multi-GPU calculation model clearly improves the computational efficiency of the three-dimensional phase-field model, reaching 13 times that of a single GPU, and that the problem scale has been expanded to 8193. The feasibility of the two optimization schemes is shown; the overlap of MPI and GPU computing performs better, reaching 1.7 times the performance of the basic multi-GPU model when 21 GPUs are used.
FastMag: Fast micromagnetic simulator for complex magnetic structures (invited)

NASA Astrophysics Data System (ADS)

Chang, R.; Li, S.; Lubarda, M. V.; Livshitz, B.; Lomakin, V.

2011-04-01

A fast micromagnetic simulator (FastMag) for general problems is presented. FastMag solves the Landau-Lifshitz-Gilbert equation and can handle multiscale problems with a high computational efficiency. The simulator derives its high performance from efficient methods for evaluating the effective field and from implementations on massively parallel graphics processing unit (GPU) architectures. FastMag discretizes the computational domain into tetrahedral elements and therefore is highly flexible for general problems. The magnetostatic field is computed via the superposition principle for both volume and surface parts of the computational domain. This is accomplished by implementing efficient quadrature rules and analytical integration for overlapping elements in which the integral kernel is singular. Thus, discretized superposition integrals are computed using a nonuniform grid interpolation method, which evaluates the field from N sources at N collocated observers in O(N) operations. This approach allows handling objects of arbitrary shape, allows easy calculation of the field outside the magnetized domains, does not require solving a linear system of equations, and requires little memory. FastMag is implemented on GPUs with GPU-central processing unit speed-ups of 2 orders of magnitude. Simulations are shown of a large array of magnetic dots and a recording head fully discretized down to the exchange length, with over a hundred million tetrahedral elements on an inexpensive desktop computer.
Depth compensating calculation method of computer-generated holograms using symmetry and similarity of zone plates

NASA Astrophysics Data System (ADS)

Wei, Hui; Gong, Guanghong; Li, Ni

2017-10-01

Computer-generated hologram (CGH) is a promising 3D display technology while it is challenged by heavy computation load and vast memory requirement. To solve these problems, a depth compensating CGH calculation method based on symmetry and similarity of zone plates is proposed and implemented on graphics processing unit (GPU). An improved LUT method is put forward to compute the distances between object points and hologram pixels in the XY direction. The concept of depth compensating factor is defined and used for calculating the holograms of points with different depth positions instead of layer-based methods. The proposed method is suitable for arbitrary sampling objects with lower memory usage and higher computational efficiency compared to other CGH methods. The effectiveness of the proposed method is validated by numerical and optical experiments.
Meshfree and efficient modeling of swimming cells

NASA Astrophysics Data System (ADS)

Gallagher, Meurig T.; Smith, David J.

2018-05-01

Locomotion in Stokes flow is an intensively studied problem because it describes important biological phenomena such as the motility of many species' sperm, bacteria, algae, and protozoa. Numerical computations can be challenging, particularly in three dimensions, due to the presence of moving boundaries and complex geometries; methods which combine ease of implementation and computational efficiency are therefore needed. A recently proposed method to discretize the regularized Stokeslet boundary integral equation without the need for a connected mesh is applied to the inertialess locomotion problem in Stokes flow. The mathematical formulation and key aspects of the computational implementation in matlab® or GNU Octave are described, followed by numerical experiments with biflagellate algae and multiple uniflagellate sperm swimming between no-slip surfaces, for which both swimming trajectories and flow fields are calculated. These computational experiments required minutes of time on modest hardware; an extensible implementation is provided in a GitHub repository. The nearest-neighbor discretization dramatically improves convergence and robustness, a key challenge in extending the regularized Stokeslet method to complicated three-dimensional biological fluid problems.
Efficient Strategies for Estimating the Spatial Coherence of Backscatter

PubMed Central

Hyun, Dongwoon; Crowley, Anna Lisa C.; Dahl, Jeremy J.

2017-01-01

The spatial coherence of ultrasound backscatter has been proposed to reduce clutter in medical imaging, to measure the anisotropy of the scattering source, and to improve the detection of blood flow. These techniques rely on correlation estimates that are obtained using computationally expensive strategies. In this study, we assess existing spatial coherence estimation methods and propose three computationally efficient modifications: a reduced kernel, a downsampled receive aperture, and the use of an ensemble correlation coefficient. The proposed methods are implemented in simulation and in vivo studies. Reducing the kernel to a single sample improved computational throughput and improved axial resolution. Downsampling the receive aperture was found to have negligible effect on estimator variance, and improved computational throughput by an order of magnitude for a downsample factor of 4. The ensemble correlation estimator demonstrated lower variance than the currently used average correlation. Combining the three methods, the throughput was improved 105-fold in simulation with a downsample factor of 4 and 20-fold in vivo with a downsample factor of 2. PMID:27913342
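The difference between an average and an ensemble correlation estimate can be sketched as below. The definitions used here (normalize each element pair and then average, versus pool numerators and energies over all pairs and normalize once) are one common reading and are an assumption; the exact estimators in the paper may differ, and the toy channel data are invented for illustration.

```python
import math

def _dot(a, b):
    """Inner product of two equal-length sample kernels."""
    return sum(p * q for p, q in zip(a, b))

def _pairs(channels, lag):
    """All receive-element pairs separated by `lag` elements."""
    return [(channels[i], channels[i + lag]) for i in range(len(channels) - lag)]

def average_correlation(channels, lag):
    """Normalize each pair separately, then average the coefficients."""
    pairs = _pairs(channels, lag)
    total = sum(_dot(a, b) / math.sqrt(_dot(a, a) * _dot(b, b)) for a, b in pairs)
    return total / len(pairs)

def ensemble_correlation(channels, lag):
    """Pool cross terms and energies over all pairs, then normalize once."""
    pairs = _pairs(channels, lag)
    num = sum(_dot(a, b) for a, b in pairs)
    den = math.sqrt(sum(_dot(a, a) for a, _ in pairs) * sum(_dot(b, b) for _, b in pairs))
    return num / den

# Hypothetical 3-element aperture, 4-sample kernel per element.
toy = [
    [1.0, -2.0, 0.5, 3.0],
    [0.8, -1.5, 0.9, 2.5],
    [1.2, -2.2, 0.1, 3.3],
]
avg = average_correlation(toy, lag=1)
ens = ensemble_correlation(toy, lag=1)
```

The ensemble form needs a single square root and division per lag instead of one per element pair, which is where a computational saving would come from under these assumed definitions.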
Fast Semantic Segmentation of 3d Point Clouds with Strongly Varying Density

NASA Astrophysics Data System (ADS)

Hackel, Timo; Wegner, Jan D.; Schindler, Konrad

2016-06-01

We describe an effective and efficient method for point-wise semantic classification of 3D point clouds. The method can handle unstructured and inhomogeneous point clouds such as those derived from static terrestrial LiDAR or photogrammetric reconstruction; and it is computationally efficient, making it possible to process point clouds with many millions of points in a matter of minutes. The key issue, both to cope with strong variations in point density and to bring down computation time, turns out to be careful handling of neighborhood relations. By choosing appropriate definitions of a point's (multi-scale) neighborhood, we obtain a feature set that is both expressive and fast to compute. We evaluate our classification method both on benchmark data from a mobile mapping platform and on a variety of large, terrestrial laser scans with greatly varying point density. The proposed feature set outperforms the state of the art with respect to per-point classification accuracy, while at the same time being much faster to compute.

Efficient Conformational Sampling in Explicit Solvent Using a Hybrid Replica Exchange Molecular Dynamics Method

DTIC Science & Technology

2011-12-01

REMD while reproducing the energy landscape of explicit solvent simulations. INTRODUCTION Molecular dynamics (MD) simulations of proteins can pro...Mongan, J.; McCammon, J. A. Accelerated molecular dynamics: a promising and efficient simulation method for biomolecules. J. Chem. Phys. 2004, 120 (24...Chemical Theory and Computation ARTICLE (8) Abraham, M. J.; Gready, J. E. Ensuring mixing efficiency of replica-exchange molecular dynamics simulations. J
Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures

PubMed Central

Sharma, Anuj; Manolakos, Elias S.

2015-01-01

Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub. PMID:26605332
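The quoted figures are consistent with the standard definition of parallel efficiency as speedup divided by the number of cores, which is assumed here since the abstract does not state its definition. A quick check:

```python
def parallel_efficiency(speedup, cores):
    """Parallel efficiency under the usual definition: speedup per core."""
    return speedup / cores

# Figures quoted in the abstract: speedup of 42 on the 48-core SCC.
eff = parallel_efficiency(42, 48)
rounded = round(eff, 1)
```

The unrounded value is 0.875, which matches the reported efficiency of 0.9 to one decimal place.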
Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures.

PubMed

Sharma, Anuj; Manolakos, Elias S

2015-01-01

Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub.
High performance computation of radiative transfer equation using the finite element method

NASA Astrophysics Data System (ADS)

Badri, M. A.; Jolivet, P.; Rousseau, B.; Favennec, Y.

2018-05-01

This article deals with an efficient strategy for numerically simulating radiative transfer phenomena using distributed computing. The finite element method alongside the discrete ordinate method is used for spatio-angular discretization of the monochromatic steady-state radiative transfer equation in an anisotropically scattering media. Two very different methods of parallelization, angular and spatial decomposition methods, are presented. To do so, the finite element method is used in a vectorial way. A detailed comparison of scalability, performance, and efficiency on thousands of processors is established for two- and three-dimensional heterogeneous test cases. Timings show that both algorithms scale well when using proper preconditioners. It is also observed that our angular decomposition scheme outperforms our domain decomposition method. Overall, we perform numerical simulations at scales that were previously unattainable by standard radiative transfer equation solvers.
Solving the hypersingular boundary integral equation in three-dimensional acoustics using a regularization relationship.

PubMed

Yan, Zai You; Hung, Kin Chew; Zheng, Hui

2003-05-01

Regularization of the hypersingular integral in the normal derivative of the conventional Helmholtz integral equation through a double surface integral method or regularization relationship has been studied. By introducing the new concept of discretized operator matrix, evaluation of the double surface integrals is reduced to calculate the product of two discretized operator matrices. Such a treatment greatly improves the computational efficiency. As the number of frequencies to be computed increases, the computational cost of solving the composite Helmholtz integral equation is comparable to that of solving the conventional Helmholtz integral equation. In this paper, the detailed formulation of the proposed regularization method is presented. The computational efficiency and accuracy of the regularization method are demonstrated for a general class of acoustic radiation and scattering problems. The radiation of a pulsating sphere, an oscillating sphere, and a rigid sphere insonified by a plane acoustic wave are solved using the new method with curvilinear quadrilateral isoparametric elements. It is found that the numerical results rapidly converge to the corresponding analytical solutions as finer meshes are applied.
Methods for operating parallel computing systems employing sequenced communications

DOEpatents

Benner, R.E.; Gustafson, J.L.; Montry, G.R.

1999-08-10

A parallel computing system and method are disclosed having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system. 15 figs.
Methods for operating parallel computing systems employing sequenced communications

DOEpatents

Benner, Robert E.; Gustafson, John L.; Montry, Gary R.

1999-01-01

A parallel computing system and method are disclosed having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system.
A computational efficient modelling of laminar separation bubbles

NASA Technical Reports Server (NTRS)

Dini, Paolo; Maughmer, Mark D.

1990-01-01

In predicting the aerodynamic characteristics of airfoils operating at low Reynolds numbers, it is often important to account for the effects of laminar (transitional) separation bubbles. Previous approaches to the modelling of this viscous phenomenon range from fast but sometimes unreliable empirical correlations for the length of the bubble and the associated increase in momentum thickness, to more accurate but significantly slower displacement-thickness iteration methods employing inverse boundary-layer formulations in the separated regions. Since the penalty in computational time associated with the more general methods is unacceptable for airfoil design applications, use of an accurate yet computationally efficient model is highly desirable. To this end, a semi-empirical bubble model was developed and incorporated into the Eppler and Somers airfoil design and analysis program. The generality and the efficiency were achieved by successfully approximating the local viscous/inviscid interaction, the transition location, and the turbulent reattachment process within the framework of an integral boundary-layer method. Comparisons of the predicted aerodynamic characteristics with experimental measurements for several airfoils show excellent and consistent agreement for Reynolds numbers from 2,000,000 down to 100,000.
Magnetic potential, vector and gradient tensor fields of a tesseroid in a geocentric spherical coordinate system

NASA Astrophysics Data System (ADS)

Du, Jinsong; Chen, Chao; Lesur, Vincent; Lane, Richard; Wang, Huilin

2015-06-01

We examined the mathematical and computational aspects of the magnetic potential, vector and gradient tensor fields of a tesseroid in a geocentric spherical coordinate system (SCS). This work is relevant for 3-D modelling that is performed with lithospheric vertical scales and global, continent or large regional horizontal scales. The curvature of the Earth is significant at these scales and hence, a SCS is more appropriate than the usual Cartesian coordinate system (CCS). The 3-D arrays of spherical prisms (SP; `tesseroids') can be used to model the response of volumes with variable magnetic properties. Analytical solutions do not exist for these model elements and numerical or mixed numerical and analytical solutions must be employed. We compared various methods for calculating the response in terms of accuracy and computational efficiency. The methods were (1) the spherical coordinate magnetic dipole method (MD), (2) variants of the 3-D Gauss-Legendre quadrature integration method (3-D GLQI) with (i) different numbers of nodes in each of the three directions, and (ii) models where we subdivided each SP into a number of smaller tesseroid volume elements, (3) a procedure that we term revised Gauss-Legendre quadrature integration (3-D RGLQI) where the magnetization direction which is constant in a SCS is assumed to be constant in a CCS and equal to the direction at the geometric centre of each tesseroid, (4) the Taylor's series expansion method (TSE) and (5) the rectangular prism method (RP). In any realistic application, both the accuracy and the computational efficiency factors must be considered to determine the optimum approach to employ. In all instances, accuracy improves with increasing distance from the source. It is higher in the percentage terms for potential than the vector or tensor response. The tensor errors are the largest, but they decrease more quickly with distance from the source. 
In our comparisons of relative computational efficiency, we found that the magnetic potential takes less time to compute than the vector response, which in turn takes less time to compute than the tensor gradient response. The MD method takes less time to compute than either the TSE or RP methods. The efficiency of the (GLQI and) RGLQI methods depends on the polynomial order, but the response typically takes longer to compute than it does for the other methods. The optimum method is a complex function of the desired accuracy, the size of the volume elements, the element latitude and the distance between the source and the observation. For a model of global extent with typical model element size (e.g. 1 degree horizontally and 10 km radially) and observations at altitudes of 10s to 100s of km, a mixture of methods based on the horizontal separation of the source and observation would be the optimum approach. To demonstrate the RGLQI method described within this paper, we applied it to the computation of the response for a global magnetization model for observations at 300 and 30 km altitude.
An imperialist competitive algorithm for virtual machine placement in cloud computing

NASA Astrophysics Data System (ADS)

Jamali, Shahram; Malektaji, Sepideh; Analoui, Morteza

2017-05-01

Cloud computing, the recently emerged revolution in the IT industry, is empowered by virtualisation technology. In this paradigm, the user's applications run over some virtual machines (VMs). The process of selecting proper physical machines to host these virtual machines is called virtual machine placement. It plays an important role in the resource utilisation and power efficiency of the cloud computing environment. In this paper, we propose an imperialist competitive-based algorithm for the virtual machine placement problem called ICA-VMPLC. The base optimisation algorithm is chosen to be ICA because of its ease in neighbourhood movement, good convergence rate and suitable terminology. The proposed algorithm investigates search space in a unique manner to efficiently obtain an optimal placement solution that simultaneously minimises power consumption and total resource wastage. Its final solution performance is compared with several existing methods such as grouping genetic and ant colony-based algorithms as well as a bin packing heuristic. The simulation results show that the proposed method is superior to other tested algorithms in terms of power consumption, resource wastage, CPU usage efficiency and memory usage efficiency.
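To show what a bin packing baseline for VM placement looks like, here is a first-fit-decreasing sketch over a single resource dimension. The abstract does not say which bin packing heuristic was used, so the choice of first-fit decreasing, the single-resource simplification, and the demand values are all assumptions; this is not ICA-VMPLC.

```python
def first_fit_decreasing(vm_demands, host_capacity):
    """Place VMs, largest first, on the first host that still has room.

    Returns a list of hosts, each a list of the VM demands placed on it.
    """
    hosts = []
    for demand in sorted(vm_demands, reverse=True):
        if demand > host_capacity:
            raise ValueError("VM demand exceeds the capacity of a single host")
        for placed in hosts:
            if sum(placed) + demand <= host_capacity:
                placed.append(demand)
                break
        else:
            # No existing host can take this VM, so power on a new one.
            hosts.append([demand])
    return hosts

# Hypothetical CPU demands and a host capacity of 10 units.
placement = first_fit_decreasing([4, 8, 1, 4, 2, 1], host_capacity=10)
```

Fewer active hosts generally means less power drawn, which is why packing heuristics like this serve as a natural comparison point for placement algorithms.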
Proportional Topology Optimization: A New Non-Sensitivity Method for Solving Stress Constrained and Minimum Compliance Problems and Its Implementation in MATLAB

PubMed Central

Biyikli, Emre; To, Albert C.

2015-01-01

A new topology optimization method called the Proportional Topology Optimization (PTO) is presented. As a non-sensitivity method, PTO is simple to understand, easy to implement, and is also efficient and accurate at the same time. It is implemented into two MATLAB programs to solve the stress constrained and minimum compliance problems. Descriptions of the algorithm and computer programs are provided in detail. The method is applied to solve three numerical examples for both types of problems. The method shows comparable efficiency and accuracy with an existing optimality criteria method which computes sensitivities. Also, the PTO stress constrained algorithm and minimum compliance algorithm are compared by feeding output from one algorithm to the other in an alternative manner, where the former yields lower maximum stress and volume fraction but higher compliance compared to the latter. Advantages and disadvantages of the proposed method and future works are discussed. The computer programs are self-contained and publicly shared in the website www.ptomethod.org. PMID:26678849
Computationally efficient approach for solving time dependent diffusion equation with discrete temporal convolution applied to granular particles of battery electrodes

NASA Astrophysics Data System (ADS)

Senegačnik, Jure; Tavčar, Gregor; Katrašnik, Tomaž

2015-03-01

The paper presents a computationally efficient method for solving the time dependent diffusion equation in a granule of the Li-ion battery's granular solid electrode. The method, called Discrete Temporal Convolution method (DTC), is based on a discrete temporal convolution of the analytical solution of the step function boundary value problem. This approach enables modelling concentration distribution in the granular particles for arbitrary time dependent exchange fluxes that do not need to be known a priori. It is demonstrated in the paper that the proposed method features faster computational times than finite volume/difference methods and Padé approximation at the same accuracy of the results. It is also demonstrated that all three addressed methods feature higher accuracy compared to the quasi-steady polynomial approaches when applied to simulate the current densities variations typical for mobile/automotive applications. The proposed approach can thus be considered as one of the key innovative methods enabling real-time capability of the multi particle electrochemical battery models featuring spatial and temporal resolved particle concentration profiles.
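The idea of convolving a known step response with an arbitrary input can be sketched as follows. This is a generic superposition for a linear system driven by a piecewise-constant input, not the authors' DTC implementation; in particular, the exponential `step_response` used in the demonstration is a placeholder chosen for easy checking, not the analytical solution of the diffusion problem in a granule, and all names are hypothetical.

```python
import math
from typing import Callable, List


def convolve_step_response(
    flux: List[float],
    dt: float,
    step_response: Callable[[float], float],
) -> List[float]:
    """Response of a linear system to a piecewise-constant input.

    flux[k] is held constant on [k*dt, (k+1)*dt). The input is treated as a
    sum of steps of height flux[k] - flux[k-1] switched on at time k*dt, and
    the responses to those steps are superposed.
    Returns the response at times dt, 2*dt, ..., len(flux)*dt.
    """
    if not flux:
        return []
    increments = [flux[0]] + [flux[k] - flux[k - 1] for k in range(1, len(flux))]
    response = []
    for n in range(1, len(flux) + 1):
        t_n = n * dt
        response.append(
            sum(increments[k] * step_response(t_n - k * dt) for k in range(n))
        )
    return response


def demo_step(t: float) -> float:
    """Placeholder unit step response (a first-order lag), for illustration."""
    return 1.0 - math.exp(-t)


constant_case = convolve_step_response([2.0] * 5, 0.1, demo_step)
pulse_case = convolve_step_response([1.0, 0.0], 1.0, demo_step)
print(constant_case[-1], pulse_case[-1])
```

Because the flux values are consumed one step at a time, the input does not need to be known in advance, which matches the property the abstract highlights.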
Efficient matrix approach to optical wave propagation and Linear Canonical Transforms.

PubMed

Shakir, Sami A; Fried, David L; Pease, Edwin A; Brennan, Terry J; Dolash, Thomas M

2015-10-05

The Fresnel diffraction integral form of optical wave propagation and the more general Linear Canonical Transforms (LCT) are cast into a matrix transformation form. Taking advantage of recent efficient matrix multiply algorithms, this approach promises an efficient computational and analytical tool that is competitive with FFT based methods but offers better behavior in terms of aliasing, transparent boundary condition, and flexibility in number of sampling points and computational window sizes of the input and output planes being independent. This flexibility makes the method significantly faster than FFT based propagators when only a single point, as in Strehl metrics, or a limited number of points, as in power-in-the-bucket metrics, are needed in the output observation plane.
Discontinuous Galerkin Methods and High-Speed Turbulent Flows

NASA Astrophysics Data System (ADS)

Atak, Muhammed; Larsson, Johan; Munz, Claus-Dieter

2014-11-01

Discontinuous Galerkin methods gain increasing importance within the CFD community as they combine arbitrary high order of accuracy in complex geometries with parallel efficiency. Particularly the discontinuous Galerkin spectral element method (DGSEM) is a promising candidate for both the direct numerical simulation (DNS) and large eddy simulation (LES) of turbulent flows due to its excellent scaling attributes. In this talk, we present a DNS of a compressible turbulent boundary layer along a flat plate at a free-stream Mach number of M = 2.67 and assess the computational efficiency of the DGSEM at performing high-fidelity simulations of both transitional and turbulent boundary layers. We compare the accuracy of the results as well as the computational performance to results using a high order finite difference method.
Investigation to realize a computationally efficient implementation of the high-order instantaneous-moments-based fringe analysis method

NASA Astrophysics Data System (ADS)

Gorthi, Sai Siva; Rajshekhar, Gannavarpu; Rastogi, Pramod

2010-06-01

Recently, a high-order instantaneous moments (HIM)-operator-based method was proposed for accurate phase estimation in digital holographic interferometry. The method relies on piece-wise polynomial approximation of phase and subsequent evaluation of the polynomial coefficients from the HIM operator using single-tone frequency estimation. The work presents a comparative analysis of the performance of different single-tone frequency estimation techniques, like Fourier transform followed by optimization, estimation of signal parameters by rotational invariance technique (ESPRIT), multiple signal classification (MUSIC), and iterative frequency estimation by interpolation on Fourier coefficients (IFEIF) in HIM-operator-based methods for phase estimation. Simulation and experimental results demonstrate the potential of the IFEIF technique with respect to computational efficiency and estimation accuracy.
Efficient storage, computation, and exposure of computer-generated holograms by electron-beam lithography.

PubMed

Newman, D M; Hawley, R W; Goeckel, D L; Crawford, R D; Abraham, S; Gallagher, N C

1993-05-10

An efficient storage format was developed for computer-generated holograms for use in electron-beam lithography. This method employs run-length encoding and Lempel-Ziv-Welch compression and succeeds in exposing holograms that were previously infeasible owing to the hologram's tremendous pattern-data file size. These holograms also require significant computation; thus the algorithm was implemented on a parallel computer, which improved performance by 2 orders of magnitude. The decompression algorithm was integrated into the Cambridge electron-beam machine's front-end processor. Although this provides much-needed ability, some hardware enhancements will be required in the future to overcome inadequacies in the current front-end processor that result in a lengthy exposure time.
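Run-length encoding, the first of the two compression steps mentioned, can be illustrated in a few lines. The sketch assumes a binary pattern stored as a flat list of 0/1 values; the actual hologram file format is not described in the abstract, and the function names are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def rle_encode(pixels: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def rle_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, length in runs for _ in range(length)]


row = [0, 0, 0, 1, 1, 0]
encoded = rle_encode(row)
print(encoded)  # [(0, 3), (1, 2), (0, 1)]
```

Patterns with long uniform runs shrink considerably under this scheme, while highly irregular patterns may not shrink at all, which is one reason a dictionary coder such as Lempel-Ziv-Welch is a natural second stage.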
Control mechanism of double-rotator-structure ternary optical computer

NASA Astrophysics Data System (ADS)

Kai, SONG; Liping, YAN

2017-03-01

Double-rotator-structure ternary optical processor (DRSTOP) has two characteristics, namely, giant data-bits parallel computing and reconfigurable processor, which can handle thousands of data bits in parallel, and can run much faster than computers and other optical computer systems so far. In order to put DRSTOP into practical application, this paper established a series of methods, namely, task classification method, data-bits allocation method, control information generation method, control information formatting and sending method, and decoded results obtaining method and so on. These methods form the control mechanism of DRSTOP. This control mechanism makes DRSTOP become an automated computing platform. Compared with the traditional calculation tools, DRSTOP computing platform can ease the contradiction between high energy consumption and big data computing due to greatly reducing the cost of communications and I/O. Finally, the paper designed a set of experiments for DRSTOP control mechanism to verify its feasibility and correctness. Experimental results showed that the control mechanism is correct, feasible and efficient.
Efficient and Robust Optimization for Building Energy Simulation

PubMed Central

Pourarian, Shokouh; Kearsley, Anthony; Wen, Jin; Pertzborn, Amanda

2016-01-01

Efficiently, robustly and accurately solving large sets of structured, non-linear algebraic and differential equations is one of the most computationally expensive steps in the dynamic simulation of building energy systems. Here, the efficiency, robustness and accuracy of two commonly employed solution methods are compared. The comparison is conducted using the HVACSIM+ software package, a component based building system simulation tool. The HVACSIM+ software presently employs Powell’s Hybrid method to solve systems of nonlinear algebraic equations that model the dynamics of energy states and interactions within buildings. It is shown here that the Powell’s method does not always converge to a solution. Since a myriad of other numerical methods are available, the question arises as to which method is most appropriate for building energy simulation. This paper finds considerable computational benefits result from replacing the Powell’s Hybrid method solver in HVACSIM+ with a solver more appropriate for the challenges particular to numerical simulations of buildings. Evidence is provided that a variant of the Levenberg-Marquardt solver has superior accuracy and robustness compared to the Powell’s Hybrid method presently used in HVACSIM+. PMID:27325907
Efficient and Robust Optimization for Building Energy Simulation.

PubMed

Pourarian, Shokouh; Kearsley, Anthony; Wen, Jin; Pertzborn, Amanda

2016-06-15

Efficiently, robustly and accurately solving large sets of structured, non-linear algebraic and differential equations is one of the most computationally expensive steps in the dynamic simulation of building energy systems. Here, the efficiency, robustness and accuracy of two commonly employed solution methods are compared. The comparison is conducted using the HVACSIM+ software package, a component based building system simulation tool. The HVACSIM+ software presently employs Powell's Hybrid method to solve systems of nonlinear algebraic equations that model the dynamics of energy states and interactions within buildings. It is shown here that the Powell's method does not always converge to a solution. Since a myriad of other numerical methods are available, the question arises as to which method is most appropriate for building energy simulation. This paper finds considerable computational benefits result from replacing the Powell's Hybrid method solver in HVACSIM+ with a solver more appropriate for the challenges particular to numerical simulations of buildings. Evidence is provided that a variant of the Levenberg-Marquardt solver has superior accuracy and robustness compared to the Powell's Hybrid method presently used in HVACSIM+.
Parallel 3D Mortar Element Method for Adaptive Nonconforming Meshes

NASA Technical Reports Server (NTRS)

Feng, Huiyu; Mavriplis, Catherine; VanderWijngaart, Rob; Biswas, Rupak

2004-01-01

High order methods are frequently used in computational simulation for their high accuracy. An efficient way to avoid unnecessary computation in smooth regions of the solution is to use adaptive meshes which employ fine grids only in areas where they are needed. Nonconforming spectral elements allow the grid to be flexibly adjusted to satisfy the computational accuracy requirements. The method is suitable for computational simulations of unsteady problems with very disparate length scales or unsteady moving features, such as heat transfer, fluid dynamics or flame combustion. In this work, we select the Mortar Element Method (MEM) to handle the non-conforming interfaces between elements. A new technique is introduced to efficiently implement MEM in 3-D nonconforming meshes. By introducing an "intermediate mortar", the proposed method decomposes the projection between 3-D elements and mortars into two steps. In each step, projection matrices derived in 2-D are used. The two-step method avoids explicitly forming/deriving large projection matrices for 3-D meshes, and also helps to simplify the implementation. This new technique can be used for both h- and p-type adaptation. This method is applied to an unsteady 3-D moving heat source problem. With our new MEM implementation, mesh adaptation is able to efficiently refine the grid near the heat source and coarsen the grid once the heat source passes. The savings in computational work resulting from the dynamic mesh adaptation are demonstrated by the reduction of the number of elements used and CPU time spent. MEM and mesh adaptation, respectively, bring irregularity and dynamics to the computer memory access pattern. Hence, they provide a good way to gauge the performance of computer systems when running scientific applications whose memory access patterns are irregular and unpredictable. We select a 3-D moving heat source problem as the Unstructured Adaptive (UA) grid benchmark, a new component of the NAS Parallel Benchmarks (NPB). In this paper, we present some interesting performance results of our OpenMP parallel implementation on different architectures such as the SGI Origin2000, SGI Altix, and Cray MTA-2.

Storage and computationally efficient permutations of factorized covariance and square-root information matrices

NASA Technical Reports Server (NTRS)

Muellerschoen, R. J.

1988-01-01

A unified method to permute vector-stored upper-triangular diagonal factorized covariance (UD) and vector stored upper-triangular square-root information filter (SRIF) arrays is presented. The method involves cyclical permutation of the rows and columns of the arrays and retriangularization with appropriate square-root-free fast Givens rotations or elementary slow Givens reflections. A minimal amount of computation is performed and only one scratch vector of size N is required, where N is the column dimension of the arrays. To make the method efficient for large SRIF arrays on a virtual memory machine, three additional scratch vectors each of size N are used to avoid expensive paging faults. The method discussed is compared with the methods and routines of Bierman's Estimation Subroutine Library (ESL).
Aspects of numerical and representational methods related to the finite-difference simulation of advective and dispersive transport of freshwater in a thin brackish aquifer

USGS Publications Warehouse

Merritt, M.L.

1993-01-01

The simulation of the transport of injected freshwater in a thin brackish aquifer, overlain and underlain by confining layers containing more saline water, is shown to be influenced by the choice of the finite-difference approximation method, the algorithm for representing vertical advective and dispersive fluxes, and the values assigned to parametric coefficients that specify the degree of vertical dispersion and molecular diffusion that occurs. Computed potable water recovery efficiencies will differ depending upon the choice of algorithm and approximation method, as will dispersion coefficients estimated based on the calibration of simulations to match measured data. A comparison of centered and backward finite-difference approximation methods shows that substantially different transition zones between injected and native waters are depicted by the different methods, and computed recovery efficiencies vary greatly. Standard and experimental algorithms and a variety of values for molecular diffusivity, transverse dispersivity, and vertical scaling factor were compared in simulations of freshwater storage in a thin brackish aquifer. Computed recovery efficiencies vary considerably, and appreciable differences are observed in the distribution of injected freshwater in the various cases tested. The results demonstrate both a qualitatively different description of transport using the experimental algorithms and the interrelated influences of molecular diffusion and transverse dispersion on simulated recovery efficiency. When simulating natural aquifer flow in cross-section, flushing of the aquifer occurred for all tested coefficient choices using both standard and experimental algorithms. © 1993.
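Why the choice between centered and backward differences changes the simulated transition zone can be seen in a toy problem. The sketch below advects a sharp front in one dimension with explicit time stepping; the explicit scheme, the grid, the Courant number, and the function name `advect` are assumptions for illustration and are not taken from the model used in the study.

```python
from typing import List


def advect(u0: List[float], courant: float, steps: int, scheme: str) -> List[float]:
    """Explicit 1-D advection (flow toward increasing index) of a profile.

    scheme="backward" uses the upstream difference u[i] - u[i-1];
    scheme="centered" uses (u[i+1] - u[i-1]) / 2.
    The two end values are held fixed as boundary conditions.
    """
    u = list(u0)
    for _ in range(steps):
        new = list(u)
        for i in range(1, len(u) - 1):
            if scheme == "backward":
                new[i] = u[i] - courant * (u[i] - u[i - 1])
            else:
                new[i] = u[i] - 0.5 * courant * (u[i + 1] - u[i - 1])
        u = new
    return u


# Sharp front between one water type (1.0) and another (0.0).
front = [1.0] * 10 + [0.0] * 30
backward = advect(front, 0.5, 20, "backward")
centered = advect(front, 0.5, 2, "centered")
print(min(backward), max(backward))  # stays within [0, 1], but the front is smeared
print(max(centered))                 # overshoots above 1 near the front
```

The backward scheme keeps values bounded at the cost of artificial smearing, while the centered scheme avoids that smearing but produces overshoot near the front; either effect would alter the width of a simulated transition zone.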
The Diffusion of Computer-Based Technology in K-12 Schools: Teachers' Perspectives

ERIC Educational Resources Information Center

Colandrea, John Louis

2012-01-01

Because computer technology represents a major financial outlay for school districts and is an efficient method of preparing and delivering lessons, studying the process of teacher adoption of computer use is beneficial and adds to the current body of knowledge. Because the teacher is the ultimate user of computer technology for lesson preparation…
Fast Particle Methods for Multiscale Phenomena Simulations

NASA Technical Reports Server (NTRS)

Koumoutsakos, P.; Wray, A.; Shariff, K.; Pohorille, Andrew

2000-01-01

We are developing particle methods oriented at improving computational modeling capabilities of multiscale physical phenomena in: (i) high Reynolds number unsteady vortical flows, (ii) particle laden and interfacial flows, (iii) molecular dynamics studies of nanoscale droplets and studies of the structure, functions, and evolution of the earliest living cell. The unifying computational approach involves particle methods implemented in parallel computer architectures. The inherent adaptivity, robustness and efficiency of particle methods make them a multidisciplinary computational tool capable of bridging the gap between micro-scale and continuum flow simulations. Using efficient tree data structures, multipole expansion algorithms, and improved particle-grid interpolation, particle methods allow for simulations using millions of computational elements, making possible the resolution of a wide range of length and time scales of these important physical phenomena. The current challenges in these simulations are in: [i] the proper formulation of particle methods in the molecular and continuous level for the discretization of the governing equations; [ii] the resolution of the wide range of time and length scales governing the phenomena under investigation; [iii] the minimization of numerical artifacts that may interfere with the physics of the systems under consideration; [iv] the parallelization of processes such as tree traversal and grid-particle interpolations. We are conducting simulations using vortex methods, molecular dynamics and smooth particle hydrodynamics, exploiting their unifying concepts such as: the solution of the N-body problem in parallel computers, highly accurate particle-particle and grid-particle interpolations, parallel FFTs and the formulation of processes such as diffusion in the context of particle methods. This approach enables us to transcend among seemingly unrelated areas of research.
Using quantum chemistry muscle to flex massive systems: How to respond to something perturbing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bertoni, Colleen

Computational chemistry uses the theoretical advances of quantum mechanics and the algorithmic and hardware advances of computer science to give insight into chemical problems. It is currently possible to do highly accurate quantum chemistry calculations, but the most accurate methods are very computationally expensive. Thus it is only feasible to do highly accurate calculations on small molecules, since typically more computationally efficient methods are also less accurate. The overall goal of my dissertation work has been to try to decrease the computational expense of calculations without decreasing the accuracy. In particular, my dissertation work focuses on fragmentation methods, intermolecular interaction methods, analytic gradients, and taking advantage of new hardware.
Methodical and technological aspects of creation of interactive computer learning systems

NASA Astrophysics Data System (ADS)

Vishtak, N. M.; Frolov, D. A.

2017-01-01

The article presents a methodology for the development of an interactive computer training system for training power plant. The methods used in the work are a generalization of the content of scientific and methodological sources on the use of computer-based training systems in vocational education, methods of system analysis, methods of structural and object-oriented modeling of information systems. The relevance of the development of the interactive computer training systems in the preparation of the personnel in the conditions of the educational and training centers is proved. Development stages of the computer training systems are allocated, factors of efficient use of the interactive computer training system are analysed. The algorithm of work performance at each development stage of the interactive computer training system that enables one to optimize time, financial and labor expenditure on the creation of the interactive computer training system is offered.
Two pass method and radiation interchange processing when applied to thermal-structural analysis of large space truss structures

NASA Technical Reports Server (NTRS)

Warren, Andrew H.; Arelt, Joseph E.; Lalicata, Anthony L.; Rogers, Karen M.

1993-01-01

A method of efficient and automated thermal-structural processing of very large space structures is presented. The method interfaces the finite element and finite difference techniques. It also results in a pronounced reduction of the quantity of computations, computer resources and manpower required for the task, while assuring the desired accuracy of the results.
Efficient grid-based techniques for density functional theory

NASA Astrophysics Data System (ADS)

Rodriguez-Hernandez, Juan Ignacio

Understanding the chemical and physical properties of molecules and materials at a fundamental level often requires quantum-mechanical models for these substances' electronic structure. This type of many body quantum mechanics calculation is computationally demanding, hindering its application to substances with more than a few hundred atoms. The supreme goal of much research in quantum chemistry, and the topic of this dissertation, is to develop more efficient computational algorithms for electronic structure calculations. In particular, this dissertation develops two new numerical integration techniques for computing molecular and atomic properties within conventional Kohn-Sham-Density Functional Theory (KS-DFT) of molecular electronic structure. The first of these grid-based techniques is based on the transformed sparse grid construction. In this construction, a sparse grid is generated in the unit cube and then mapped to real space according to the pro-molecular density using the conditional distribution transformation. The transformed sparse grid was implemented in the program deMon2k, where it is used as the numerical integrator for the exchange-correlation energy and potential in the KS-DFT procedure. We tested our grid by computing ground state energies, equilibrium geometries, and atomization energies. The accuracy of these test calculations shows that our grid is more efficient than some previous integration methods: our grids use fewer points to obtain the same accuracy. The transformed sparse grids were also tested for integrating, interpolating and differentiating in different dimensions (n = 1,2,3,6). The second technique is a grid-based method for computing atomic properties within QTAIM. It was also implemented in deMon2k. The performance of the method was tested by computing QTAIM atomic energies, charges, dipole moments, and quadrupole moments. For medium accuracy, our method is the fastest one we know of.
Conjugate Gradient Algorithms For Manipulator Simulation

NASA Technical Reports Server (NTRS)

Fijany, Amir; Scheid, Robert E.

1991-01-01

Report discusses applicability of conjugate-gradient algorithms to computation of forward dynamics of robotic manipulators. Rapid computation of forward dynamics essential to teleoperation and other advanced robotic applications. Part of continuing effort to find algorithms meeting requirements for increased computational efficiency and speed. Method used for iterative solution of systems of linear equations.
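For readers unfamiliar with the method, a textbook conjugate-gradient iteration for a symmetric positive-definite linear system can be sketched as below. This is the generic algorithm only; it says nothing about how the report applies it to manipulator dynamics, and the dense list-of-lists matrix, the tolerance, and the function names are assumptions for the example.

```python
from typing import List, Optional

Vector = List[float]
Matrix = List[List[float]]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def matvec(A: Matrix, v: Vector) -> Vector:
    return [dot(row, v) for row in A]


def conjugate_gradient(
    A: Matrix, b: Vector, tol: float = 1e-10, max_iter: Optional[int] = None
) -> Vector:
    """Solve A x = b for symmetric positive-definite A, starting from x = 0."""
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n
    x = [0.0] * n
    r = list(b)            # residual b - A x for the zero initial guess
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                      # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # New direction is conjugate to the previous ones with respect to A.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


solution = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(solution)  # approximately [1/11, 7/11]
```

In exact arithmetic the iteration terminates in at most n steps for an n-by-n system, and each step needs only one matrix-vector product, which is what makes the method attractive when speed matters.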
Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing

NASA Astrophysics Data System (ADS)

Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide

2015-09-01

The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.
Application of microarray analysis on computer cluster and cloud platforms.

PubMed

Bernau, C; Boulesteix, A-L; Knaus, J

2013-01-01

Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.
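The computational independence of permutation iterations can be made concrete with a small sketch. The code splits a two-sample permutation test into chunks that share no state, so each chunk could run on a separate core, cluster node, or cloud instance. The chunking scheme, the seeds, and the function names are assumptions for illustration and do not reproduce the article's microarray procedures.

```python
import random
from functools import partial
from statistics import mean
from typing import Callable, Sequence


def count_extreme(x: Sequence[float], y: Sequence[float], n_perm: int, seed: int) -> int:
    """Count permutations whose mean difference is at least the observed one.

    Each call owns its random generator, so calls are fully independent.
    """
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(mean(pooled[: len(x)]) - mean(pooled[len(x):])) >= observed:
            hits += 1
    return hits


def permutation_pvalue(
    x: Sequence[float],
    y: Sequence[float],
    n_perm: int = 2000,
    n_chunks: int = 4,
    map_fn: Callable = map,
) -> float:
    """Two-sided permutation p-value, computed chunk by chunk.

    map_fn defaults to the serial built-in map; a parallel map can be
    substituted without changing anything else.
    """
    per_chunk = n_perm // n_chunks
    worker = partial(count_extreme, x, y, per_chunk)
    hits = sum(map_fn(worker, range(n_chunks)))  # chunk index doubles as seed
    return (hits + 1) / (per_chunk * n_chunks + 1)


p_separated = permutation_pvalue([10, 11, 12, 13, 14], [0, 1, 2, 3, 4])
print(p_separated)
```

To run the chunks in parallel on one machine, the `map` method of a `concurrent.futures.ProcessPoolExecutor` can be passed as `map_fn`; the same structure carries over to a cluster scheduler or cloud instances, with only the mapping mechanism changing.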
Sensitivity analysis and approximation methods for general eigenvalue problems

NASA Technical Reports Server (NTRS)

Murthy, D. V.; Haftka, R. T.

1986-01-01

Optimization of dynamic systems involving complex non-hermitian matrices is often computationally expensive. Major contributors to the computational expense are the sensitivity analysis and reanalysis of a modified design. The present work seeks to alleviate this computational burden by identifying efficient sensitivity analysis and approximate reanalysis methods. For the algebraic eigenvalue problem involving non-hermitian matrices, algorithms for sensitivity analysis and approximate reanalysis are classified, compared and evaluated for efficiency and accuracy. Proper eigenvector normalization is discussed. An improved method for calculating derivatives of eigenvectors is proposed based on a more rational normalization condition and taking advantage of matrix sparsity. Important numerical aspects of this method are also discussed. To alleviate the problem of reanalysis, various approximation methods for eigenvalues are proposed and evaluated. Linear and quadratic approximations are based directly on the Taylor series. Several approximation methods are developed based on the generalized Rayleigh quotient for the eigenvalue problem. Approximation methods based on trace theorem give high accuracy without needing any derivatives. Operation counts for the computation of the approximations are given. General recommendations are made for the selection of appropriate approximation technique as a function of the matrix size, number of design variables, number of eigenvalues of interest and the number of design points at which approximation is sought.
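For orientation, the standard first-order sensitivity result that such methods build on can be written out. The notation below (design parameter $p$, right eigenvector $x$, left eigenvector $y$) is chosen for this note and is not necessarily the paper's; the formula assumes a simple (non-repeated) eigenvalue.

```latex
% Right and left eigenvectors of a general (non-Hermitian) matrix A(p):
%   A x = \lambda x, \qquad y^{H} A = \lambda y^{H}.
% Derivative of a simple eigenvalue with respect to a design parameter p:
\[
  \frac{\partial \lambda}{\partial p}
  = \frac{y^{H}\,\dfrac{\partial A}{\partial p}\,x}{y^{H} x}.
\]
% For the generalized problem A x = \lambda B x with y^{H} A = \lambda y^{H} B:
\[
  \frac{\partial \lambda}{\partial p}
  = \frac{y^{H}\left(\dfrac{\partial A}{\partial p}
      - \lambda\,\dfrac{\partial B}{\partial p}\right)x}{y^{H} B\,x}.
\]
% Linear (first-order Taylor) approximation for a modified design:
\[
  \lambda(p + \Delta p) \approx \lambda(p)
  + \frac{\partial \lambda}{\partial p}\,\Delta p .
\]
```

The eigenvalue derivative does not depend on how $x$ and $y$ are scaled, because any scaling cancels between numerator and denominator; eigenvector derivatives do depend on the normalization, which is why the abstract singles out the normalization condition.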
Assessing the predictive capability of randomized tree-based ensembles in streamflow modelling

NASA Astrophysics Data System (ADS)

Galelli, S.; Castelletti, A.

2013-02-01

Combining randomization methods with ensemble prediction is emerging as an effective option to balance accuracy and computational efficiency in data-driven modeling. In this paper we investigate the prediction capability of extremely randomized trees (Extra-Trees), in terms of accuracy, explanation ability and computational efficiency, in a streamflow modeling exercise. Extra-Trees are a totally randomized tree-based ensemble method that (i) alleviates the poor generalization property and tendency to overfitting of traditional standalone decision trees (e.g. CART); (ii) is computationally very efficient; and, (iii) allows to infer the relative importance of the input variables, which might help in the ex-post physical interpretation of the model. The Extra-Trees potential is analyzed on two real-world case studies (Marina catchment (Singapore) and Canning River (Western Australia)) representing two different morphoclimatic contexts comparatively with other tree-based methods (CART and M5) and parametric data-driven approaches (ANNs and multiple linear regression). Results show that Extra-Trees perform comparatively well to the best of the benchmarks (i.e. M5) in both the watersheds, while outperforming the other approaches in terms of computational requirement when adopted on large datasets. In addition, the ranking of the input variable provided can be given a physically meaningful interpretation.
Assessing the predictive capability of randomized tree-based ensembles in streamflow modelling

NASA Astrophysics Data System (ADS)

Galelli, S.; Castelletti, A.

2013-07-01

Combining randomization methods with ensemble prediction is emerging as an effective option to balance accuracy and computational efficiency in data-driven modelling. In this paper, we investigate the prediction capability of extremely randomized trees (Extra-Trees), in terms of accuracy, explanation ability and computational efficiency, in a streamflow modelling exercise. Extra-Trees are a totally randomized tree-based ensemble method that (i) alleviates the poor generalisation property and tendency to overfitting of traditional standalone decision trees (e.g. CART); (ii) is computationally efficient; and, (iii) allows to infer the relative importance of the input variables, which might help in the ex-post physical interpretation of the model. The Extra-Trees potential is analysed on two real-world case studies - Marina catchment (Singapore) and Canning River (Western Australia) - representing two different morphoclimatic contexts. The evaluation is performed against other tree-based methods (CART and M5) and parametric data-driven approaches (ANNs and multiple linear regression). Results show that Extra-Trees perform comparatively well to the best of the benchmarks (i.e. M5) in both the watersheds, while outperforming the other approaches in terms of computational requirement when adopted on large datasets. In addition, the ranking of the input variable provided can be given a physically meaningful interpretation.
Complexity control algorithm based on adaptive mode selection for interframe coding in high efficiency video coding

NASA Astrophysics Data System (ADS)

Chen, Gang; Yang, Bing; Zhang, Xiaoyun; Gao, Zhiyong

2017-07-01

The latest high efficiency video coding (HEVC) standard significantly increases the encoding complexity for improving its coding efficiency. Due to the limited computational capability of handheld devices, complexity constrained video coding has drawn great attention in recent years. A complexity control algorithm based on adaptive mode selection is proposed for interframe coding in HEVC. Considering the direct proportionality between encoding time and computational complexity, the computational complexity is measured in terms of encoding time. First, complexity is mapped to a target in terms of prediction modes. Then, an adaptive mode selection algorithm is proposed for the mode decision process. Specifically, the optimal mode combination scheme that is chosen through offline statistics is developed at low complexity. If the complexity budget has not been used up, an adaptive mode sorting method is employed to further improve coding efficiency. The experimental results show that the proposed algorithm achieves a very large complexity control range (as low as 10%) for the HEVC encoder while maintaining good rate-distortion performance. For the lowdelayP condition, compared with the direct resource allocation method and the state-of-the-art method, an average gain of 0.63 and 0.17 dB in BDPSNR is observed for 18 sequences when the target complexity is around 40%.
The multifacet graphically contracted function method. I. Formulation and implementation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shepard, Ron; Brozell, Scott R.; Gidofalvi, Gergely

2014-08-14

The basic formulation for the multifacet generalization of the graphically contracted function (MFGCF) electronic structure method is presented. The analysis includes the discussion of linear dependency and redundancy of the arc factor parameters, the computation of reduced density matrices, Hamiltonian matrix construction, spin-density matrix construction, the computation of optimization gradients for single-state and state-averaged calculations, graphical wave function analysis, and the efficient computation of configuration state function and Slater determinant expansion coefficients. Timings are given for Hamiltonian matrix element and analytic optimization gradient computations for a range of model problems for full-CI Shavitt graphs, and it is observed that both the energy and the gradient computation scale as O(N^2 n^4) for N electrons and n orbitals. The important arithmetic operations are within dense matrix-matrix product computational kernels, resulting in a computationally efficient procedure. An initial implementation of the method is used to present applications to several challenging chemical systems, including N2 dissociation, cubic H8 dissociation, the symmetric dissociation of H2O, and the insertion of Be into H2. The results are compared to the exact full-CI values and also to those of the previous single-facet GCF expansion form.
Probabilistic power flow using improved Monte Carlo simulation method with correlated wind sources

NASA Astrophysics Data System (ADS)

Bie, Pei; Zhang, Buhan; Li, Hang; Deng, Weisi; Wu, Jiasi

2017-01-01

Probabilistic Power Flow (PPF) is a very useful tool for power system steady-state analysis. However, correlation among random power injections (such as wind power) makes PPF difficult to calculate. Monte Carlo simulation (MCS) and analytical methods are the two approaches commonly used to solve PPF. MCS has high accuracy but is very time consuming. Analytical methods such as the cumulant method (CM) are computationally efficient, but computing the cumulants is inconvenient when wind power output does not follow any typical distribution, especially when correlated wind sources are considered. In this paper, an Improved Monte Carlo simulation method (IMCS) is proposed. A joint empirical distribution is used to model the output of the different wind sources. The method combines the advantages of both MCS and analytical methods: it is computationally efficient and provides sufficiently accurate solutions, which makes it well suited to on-line analysis.
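One way to picture sampling from a joint empirical distribution is to resample whole historical records, so that the outputs of all wind sources at one time stamp stay together and their correlation is preserved without fitting any parametric distribution. The sketch below only illustrates that idea and is not the IMCS algorithm of the paper; the synthetic records, the function names and the one-line "line flow" stand-in for a power-flow solve are assumptions.

```python
import random

def sample_joint_empirical(records, n_samples, rng):
    """Draw whole rows with replacement, keeping cross-source correlation."""
    return [rng.choice(records) for _ in range(n_samples)]

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def make_toy_records(n, rng):
    """Hypothetical correlated outputs (MW) of two wind farms."""
    rows = []
    for _ in range(n):
        regime = rng.random()                    # wind regime shared by both farms
        farm_a = 50.0 * regime + 5.0 * rng.random()
        farm_b = 30.0 * regime + 5.0 * rng.random()
        rows.append((farm_a, farm_b))
    return rows

rng = random.Random(0)
records = make_toy_records(500, rng)
samples = sample_joint_empirical(records, 2000, rng)

# Toy stand-in for a power-flow solve: a 100 MW load minus the wind injection.
line_flows = [100.0 - a - b for a, b in samples]
mean_flow = sum(line_flows) / len(line_flows)
sample_corr = pearson([a for a, _ in samples], [b for _, b in samples])
```

Drawing each farm's output independently from its own marginal distribution would lose the correlation that `sample_corr` reports here, which is the difficulty the abstract attributes to correlated wind sources.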
Accelerating Spaceborne SAR Imaging Using Multiple CPU/GPU Deep Collaborative Computing

PubMed Central

Zhang, Fan; Li, Guojun; Li, Wei; Hu, Wei; Hu, Yuxin

2016-01-01

With the development of synthetic aperture radar (SAR) technologies in recent years, the huge amount of remote sensing data brings challenges for real-time imaging processing. Therefore, high performance computing (HPC) methods have been presented to accelerate SAR imaging, especially GPU-based methods. In the classical GPU-based imaging algorithm, the GPU is employed to accelerate image processing by massively parallel computing, and the CPU is only used to perform auxiliary work such as data input/output (IO). However, the computing capability of the CPU is ignored and underestimated. In this work, a new deep collaborative SAR imaging method based on multiple CPUs/GPUs is proposed to achieve real-time SAR imaging. Through the proposed task partitioning and scheduling strategy, the whole image can be generated with deep collaborative multiple CPU/GPU computing. In the CPU parallel imaging part, the advanced vector extension (AVX) method is first introduced into the multi-core CPU parallel method for higher efficiency. In the GPU parallel imaging part, the bottlenecks of memory limitation and frequent data transfer are removed, and various optimization strategies are applied, such as streaming and parallel pipelining. Experimental results demonstrate that the deep CPU/GPU collaborative imaging method improves on the efficiency of SAR imaging on a single-core CPU by a factor of 270 and achieves real-time imaging, in that the imaging rate exceeds the raw data generation rate. PMID:27070606
A UNIFIED FRAMEWORK FOR VARIANCE COMPONENT ESTIMATION WITH SUMMARY STATISTICS IN GENOME-WIDE ASSOCIATION STUDIES.

PubMed

Zhou, Xiang

2017-12-01

Linear mixed models (LMMs) are among the most commonly used tools for genetic association studies. However, the standard method for estimating variance components in LMMs, the restricted maximum likelihood estimation method (REML), suffers from several important drawbacks: REML requires individual-level genotypes and phenotypes from all samples in the study, is computationally slow, and produces downward-biased estimates in case-control studies. To remedy these drawbacks, we present an alternative framework for variance component estimation, which we refer to as MQS. MQS is based on the method of moments (MoM) and the minimal norm quadratic unbiased estimation (MINQUE) criterion, and brings two seemingly unrelated methods, the renowned Haseman-Elston (HE) regression and the recent LD score regression (LDSC), into the same unified statistical framework. With this new framework, we provide an alternative but mathematically equivalent form of HE that allows for the use of summary statistics. We provide an exact estimation form of LDSC to yield unbiased and statistically more efficient estimates. A key feature of our method is its ability to pair marginal z-scores computed using all samples with SNP correlation information computed using a small random subset of individuals (or individuals from a proper reference panel), while capable of producing estimates that can be almost as accurate as if both quantities are computed using the full data. As a result, our method produces unbiased and statistically efficient estimates, and makes use of summary statistics, while it is computationally efficient for large data sets. Using simulations and applications to 37 phenotypes from 8 real data sets, we illustrate the benefits of our method for estimating and partitioning SNP heritability in population studies as well as for heritability estimation in family studies. Our method is implemented in the GEMMA software package, freely available at www.xzlab.org/software.html.

Constrained Total Generalized p-Variation Minimization for Few-View X-Ray Computed Tomography Image Reconstruction.

PubMed

Zhang, Hanming; Wang, Linyuan; Yan, Bin; Li, Lei; Cai, Ailong; Hu, Guoen

2016-01-01

Total generalized variation (TGV)-based computed tomography (CT) image reconstruction, which utilizes high-order image derivatives, is superior to total variation-based methods in terms of the preservation of edge information and the suppression of unfavorable staircase effects. However, conventional TGV regularization employs l1-based form, which is not the most direct method for maximizing sparsity prior. In this study, we propose a total generalized p-variation (TGpV) regularization model to improve the sparsity exploitation of TGV and offer efficient solutions to few-view CT image reconstruction problems. To solve the nonconvex optimization problem of the TGpV minimization model, we then present an efficient iterative algorithm based on the alternating minimization of augmented Lagrangian function. All of the resulting subproblems decoupled by variable splitting admit explicit solutions by applying alternating minimization method and generalized p-shrinkage mapping. In addition, approximate solutions that can be easily performed and quickly calculated through fast Fourier transform are derived using the proximal point method to reduce the cost of inner subproblems. The accuracy and efficiency of the simulated and real data are qualitatively and quantitatively evaluated to validate the efficiency and feasibility of the proposed method. Overall, the proposed method exhibits reasonable performance and outperforms the original TGV-based method when applied to few-view problems.
Efficient marginalization to compute protein posterior probabilities from shotgun mass spectrometry data

PubMed Central

Serang, Oliver; MacCoss, Michael J.; Noble, William Stafford

2010-01-01

The problem of identifying proteins from a shotgun proteomics experiment has not been definitively solved. Identifying the proteins in a sample requires ranking them, ideally with interpretable scores. In particular, “degenerate” peptides, which map to multiple proteins, have made such a ranking difficult to compute. The problem of computing posterior probabilities for the proteins, which can be interpreted as confidence in a protein’s presence, has been especially daunting. Previous approaches have either ignored the peptide degeneracy problem completely, addressed it by computing a heuristic set of proteins or heuristic posterior probabilities, or by estimating the posterior probabilities with sampling methods. We present a probabilistic model for protein identification in tandem mass spectrometry that recognizes peptide degeneracy. We then introduce graph-transforming algorithms that facilitate efficient computation of protein probabilities, even for large data sets. We evaluate our identification procedure on five different well-characterized data sets and demonstrate our ability to efficiently compute high-quality protein posteriors. PMID:20712337
Adjoint-Based Aerodynamic Design of Complex Aerospace Configurations

NASA Technical Reports Server (NTRS)

Nielsen, Eric J.

2016-01-01

An overview of twenty years of adjoint-based aerodynamic design research at NASA Langley Research Center is presented. Adjoint-based algorithms provide a powerful tool for efficient sensitivity analysis of complex large-scale computational fluid dynamics (CFD) simulations. Unlike alternative approaches for which computational expense generally scales with the number of design parameters, adjoint techniques yield sensitivity derivatives of a simulation output with respect to all input parameters at the cost of a single additional simulation. With modern large-scale CFD applications often requiring millions of compute hours for a single analysis, the efficiency afforded by adjoint methods is critical in realizing a computationally tractable design optimization capability for such applications.
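The cost argument in the abstract above (all sensitivities for the price of one extra solve) can be checked on a small linear model problem. The sketch below is an illustration under stated assumptions, not NASA's CFD formulation: the "simulation" is a 3x3 linear system A(p) u = b with A(p) = K + diag(p), the output is J = c·u, and K, b, c and all function names are hypothetical. Differentiating A u = b gives dJ/dp_i = -λ^T (dA/dp_i) u with A^T λ = c, which here reduces to -λ_i u_i.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[pivot] = M[pivot], M[k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for col in range(k, n + 1):
                M[r][col] -= factor * M[k][col]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):                     # back substitution
        s = sum(M[k][col] * x[col] for col in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x

# Hypothetical model problem.
K = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
b = [1.0, 0.0, 0.0]
c = [0.0, 0.0, 1.0]

def system_matrix(p):
    """A(p) = K + diag(p): each design parameter perturbs one diagonal entry."""
    return [[K[i][j] + (p[i] if i == j else 0.0) for j in range(3)] for i in range(3)]

def objective(p):
    u = solve(system_matrix(p), b)
    return sum(ci * ui for ci, ui in zip(c, u))

def adjoint_gradient(p):
    """All sensitivities from one forward solve plus one adjoint solve."""
    A = system_matrix(p)
    u = solve(A, b)                                    # forward solve
    A_transposed = [list(col) for col in zip(*A)]
    lam = solve(A_transposed, c)                       # adjoint solve
    return [-lam[i] * u[i] for i in range(len(p))]

def finite_difference_gradient(p, h=1e-6):
    """Reference gradient: one extra solve per design parameter."""
    base = objective(p)
    grad = []
    for i in range(len(p)):
        q = p[:]
        q[i] += h
        grad.append((objective(q) - base) / h)
    return grad

p = [0.5, 1.0, 1.5]
grad_adjoint = adjoint_gradient(p)
grad_fd = finite_difference_gradient(p)
```

The finite-difference reference needs one additional solve per parameter, whereas the adjoint route needs two solves in total regardless of how many parameters there are.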
Perspective: Ring-polymer instanton theory

NASA Astrophysics Data System (ADS)

Richardson, Jeremy O.

2018-05-01

Since the earliest explorations of quantum mechanics, it has been a topic of great interest that quantum tunneling allows particles to penetrate classically insurmountable barriers. Instanton theory provides a simple description of these processes in terms of dominant tunneling pathways. Using a ring-polymer discretization, an efficient computational method is obtained for applying this theory to compute reaction rates and tunneling splittings in molecular systems. Unlike other quantum-dynamics approaches, the method scales well with the number of degrees of freedom, and for many polyatomic systems, the method may provide the most accurate predictions which can be practically computed. Instanton theory thus has the capability to produce useful data for many fields of low-temperature chemistry including spectroscopy, atmospheric and astrochemistry, as well as surface science. There is however still room for improvement in the efficiency of the numerical algorithms, and new theories are under development for describing tunneling in nonadiabatic transitions.
Integrating structure-based and ligand-based approaches for computational drug design.

PubMed

Wilson, Gregory L; Lill, Markus A

2011-04-01

Methods utilized in computer-aided drug design can be classified into two major categories: structure based and ligand based, using information on the structure of the protein or on the biological and physicochemical properties of bound ligands, respectively. In recent years there has been a trend towards integrating these two methods in order to enhance the reliability and efficiency of computer-aided drug-design approaches by combining information from both the ligand and the protein. This trend resulted in a variety of methods that include: pseudoreceptor methods, pharmacophore methods, fingerprint methods and approaches integrating docking with similarity-based methods. In this article, we will describe the concepts behind each method and selected applications.
MINIVER: Miniature version of real/ideal gas aero-heating and ablation computer program

NASA Technical Reports Server (NTRS)

Hendler, D. R.

1976-01-01

Computer code is used to determine heat transfer multiplication factors, special flow field simulation techniques, different heat transfer methods, different transition criteria, crossflow simulation, and more efficient thin skin thickness optimization procedure.
Krylov subspace methods on supercomputers

NASA Technical Reports Server (NTRS)

Saad, Youcef

1988-01-01

A short survey of recent research on Krylov subspace methods with emphasis on implementation on vector and parallel computers is presented. Conjugate gradient methods have proven very useful on traditional scalar computers, and their popularity is likely to increase as three-dimensional models gain importance. A conservative approach to derive effective iterative techniques for supercomputers has been to find efficient parallel/vector implementations of the standard algorithms. The main source of difficulty in the incomplete factorization preconditionings is in the solution of the triangular systems at each step. A few approaches consisting of implementing efficient forward and backward triangular solutions are described in detail. Polynomial preconditioning as an alternative to standard incomplete factorization techniques is also discussed. Another efficient approach is to reorder the equations so as to improve the structure of the matrix to achieve better parallelism or vectorization. An overview of these and other ideas and their effectiveness or potential for different types of architectures is given.
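For readers unfamiliar with the algorithm this survey centres on, a minimal unpreconditioned conjugate gradient sketch is given here. It is a textbook formulation rather than any implementation from the report, and the matrix-free 1-D Laplacian test problem and function names are assumptions. Each iteration consists of one matrix-vector product, two dot products and three vector updates, which are the kernels whose vector and parallel implementation the survey discusses; the triangular solves required by incomplete factorization preconditioners are not shown.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def laplacian_matvec(v):
    """Matrix-free product with the 1-D Laplacian tridiag(-1, 2, -1)."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A given only x -> A x."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the initial guess x = 0
    p = list(r)                      # first search direction
    rs_old = dot(r, r)
    iterations = 0
    while rs_old ** 0.5 > tol and iterations < max_iter:
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
        iterations += 1
    return x, iterations

b = [1.0] * 20
x, iterations = conjugate_gradient(laplacian_matvec, b)
residual = max(abs(ai - bi) for ai, bi in zip(laplacian_matvec(x), b))
```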
All-atom calculation of protein free-energy profiles

NASA Astrophysics Data System (ADS)

Orioli, S.; Ianeselli, A.; Spagnolli, G.; Faccioli, P.

2017-10-01

The Bias Functional (BF) approach is a variational method which enables one to efficiently generate ensembles of reactive trajectories for complex biomolecular transitions, using ordinary computer clusters. For example, this scheme was applied to simulate in atomistic detail the folding of proteins consisting of several hundreds of amino acids and with experimental folding time of several minutes. A drawback of the BF approach is that it produces trajectories which do not satisfy microscopic reversibility. Consequently, this method cannot be used to directly compute equilibrium observables, such as free energy landscapes or equilibrium constants. In this work, we develop a statistical analysis which permits us to compute the potential of mean-force (PMF) along an arbitrary collective coordinate, by exploiting the information contained in the reactive trajectories calculated with the BF approach. We assess the accuracy and computational efficiency of this scheme by comparing its results with the PMF obtained for a small protein by means of plain molecular dynamics.
Aerodynamic optimization studies on advanced architecture computers

NASA Technical Reports Server (NTRS)

Chawla, Kalpana

1995-01-01

The approach to carrying out multi-discipline aerospace design studies in the future, especially in massively parallel computing environments, comprises choosing (1) suitable solvers to compute solutions to equations characterizing a discipline, and (2) efficient optimization methods. In addition, for aerodynamic optimization problems, (3) smart methodologies must be selected to modify the surface shape. In this research effort, a 'direct' optimization method is implemented on the Cray C-90 to improve aerodynamic design. It is coupled with an existing implicit Navier-Stokes solver, OVERFLOW, to compute flow solutions. The optimization method is chosen such that it can accommodate multi-discipline optimization in future computations. In this work, however, only single-discipline aerodynamic optimization will be included.
Computational Methods for Configurational Entropy Using Internal and Cartesian Coordinates.

PubMed

Hikiri, Simon; Yoshidome, Takashi; Ikeguchi, Mitsunori

2016-12-13

The configurational entropy of solute molecules is a crucially important quantity to study various biophysical processes. Consequently, it is necessary to establish an efficient quantitative computational method to calculate configurational entropy as accurately as possible. In the present paper, we investigate the quantitative performance of the quasi-harmonic and related computational methods, including widely used methods implemented in popular molecular dynamics (MD) software packages, compared with the Clausius method, which is capable of accurately computing the change of the configurational entropy upon temperature change. Notably, we focused on the choice of the coordinate systems (i.e., internal or Cartesian coordinates). The Boltzmann-quasi-harmonic (BQH) method using internal coordinates outperformed all the six methods examined here. The introduction of improper torsions in the BQH method improves its performance, and anharmonicity of proper torsions in proteins is identified to be the origin of the superior performance of the BQH method. In contrast, widely used methods implemented in MD packages show rather poor performance. In addition, the enhanced sampling of replica-exchange MD simulations was found to be efficient for the convergent behavior of entropy calculations. Also in folding/unfolding transitions of a small protein, Chignolin, the BQH method was reasonably accurate. However, the independent term without the correlation term in the BQH method was most accurate for the folding entropy among the methods considered in this study, because the QH approximation of the correlation term in the BQH method was no longer valid for the divergent unfolded structures.
A Study on the Methods of Assessment and Strategy of Knowledge Sharing in Computer Course

ERIC Educational Resources Information Center

Chan, Pat P. W.

2014-01-01

With the advancement of information and communication technology, collaboration and knowledge sharing through technology is facilitated which enhances the learning process and improves the learning efficiency. The purpose of this paper is to review the methods of assessment and strategy of collaboration and knowledge sharing in a computer course,…
Efficient parallelization of analytic bond-order potentials for large-scale atomistic simulations

NASA Astrophysics Data System (ADS)

Teijeiro, C.; Hammerschmidt, T.; Drautz, R.; Sutmann, G.

2016-07-01

Analytic bond-order potentials (BOPs) provide a way to compute atomistic properties with controllable accuracy. For large-scale computations of heterogeneous compounds at the atomistic level, both the computational efficiency and memory demand of BOP implementations have to be optimized. Since the evaluation of BOPs is a local operation within a finite environment, the parallelization concepts known from short-range interacting particle simulations can be applied to improve the performance of these simulations. In this work, several efficient parallelization methods for BOPs that use three-dimensional domain decomposition schemes are described. The schemes are implemented into the bond-order potential code BOPfox, and their performance is measured in a series of benchmarks. Systems of up to several millions of atoms are simulated on a high performance computing system, and parallel scaling is demonstrated for up to thousands of processors.
Parallel Implicit Runge-Kutta Methods Applied to Coupled Orbit/Attitude Propagation

NASA Astrophysics Data System (ADS)

Hatten, Noble; Russell, Ryan P.

2017-12-01

A variable-step Gauss-Legendre implicit Runge-Kutta (GLIRK) propagator is applied to coupled orbit/attitude propagation. Concepts previously shown to improve efficiency in 3DOF propagation are modified and extended to the 6DOF problem, including the use of variable-fidelity dynamics models. The impact of computing the stage dynamics of a single step in parallel is examined using up to 23 threads and 22 associated GLIRK stages; one thread is reserved for an extra dynamics function evaluation used in the estimation of the local truncation error. Efficiency is found to peak for typical examples when using approximately 8 to 12 stages for both serial and parallel implementations. Accuracy and efficiency compare favorably to explicit Runge-Kutta and linear-multistep solvers for representative scenarios. However, linear-multistep methods are found to be more efficient for some applications, particularly in a serial computing environment, or when parallelism can be applied across multiple trajectories.
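As a simplified illustration of a Gauss-Legendre implicit Runge-Kutta step, the sketch below implements the standard two-stage, fourth-order scheme with a fixed step size and a fixed-point iteration for the stage equations. It is not the propagator of the paper: the variable step size, the error estimate, the many-stage schemes and the orbit/attitude dynamics are omitted, and the harmonic-oscillator test problem and function names are assumptions. Within each sweep the stage derivative evaluations are independent of one another, which is the part of the work the abstract describes distributing across threads.

```python
import math

SQRT3 = math.sqrt(3.0)
# Butcher tableau of the two-stage Gauss-Legendre method (order 4).
A = ((0.25, 0.25 - SQRT3 / 6.0), (0.25 + SQRT3 / 6.0, 0.25))
B = (0.5, 0.5)
C = (0.5 - SQRT3 / 6.0, 0.5 + SQRT3 / 6.0)

def gauss_legendre_step(f, t, y, h, sweeps=20):
    """Advance y' = f(t, y) by one step h, solving the stages by fixed-point sweeps."""
    n = len(y)
    k = [f(t, y), f(t, y)]                    # initial guess for both stage slopes
    for _ in range(sweeps):
        stage_states = [
            [y[d] + h * sum(A[i][j] * k[j][d] for j in range(2)) for d in range(n)]
            for i in range(2)
        ]
        # The two evaluations below do not depend on each other and could run in parallel.
        k = [f(t + C[i] * h, stage_states[i]) for i in range(2)]
    return [y[d] + h * sum(B[j] * k[j][d] for j in range(2)) for d in range(n)]

def propagate(f, y0, h, steps):
    t, y = 0.0, list(y0)
    for _ in range(steps):
        y = gauss_legendre_step(f, t, y, h)
        t += h
    return y

def oscillator(t, y):
    """Hypothetical test dynamics: unit harmonic oscillator, y = (position, velocity)."""
    return [y[1], -y[0]]

final_state = propagate(oscillator, [1.0, 0.0], 0.1, 100)   # integrate to t = 10
energy = final_state[0] ** 2 + final_state[1] ** 2          # stays very close to 1
```

Fixed-point sweeps converge here because the step is small relative to the oscillator frequency; stiff problems would typically need a Newton-type solve instead.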
Very Large Scale Optimization

NASA Technical Reports Server (NTRS)

Vanderplaats, Garrett; Townsend, James C. (Technical Monitor)

2002-01-01

The purpose of this research under the NASA Small Business Innovative Research program was to develop algorithms and associated software to solve very large nonlinear, constrained optimization tasks. Key issues included efficiency, reliability, memory, and gradient calculation requirements. This report describes the general optimization problem, ten candidate methods, and detailed evaluations of four candidates. The algorithm chosen for final development is a modern recreation of a 1960s external penalty function method that uses very limited computer memory and computational time. Although of lower efficiency, the new method can solve problems orders of magnitude larger than current methods. The resulting BIGDOT software has been demonstrated on problems with 50,000 variables and about 50,000 active constraints. For unconstrained optimization, it has solved a problem in excess of 135,000 variables. The method includes a technique for solving discrete variable problems that finds a "good" design, although a theoretical optimum cannot be guaranteed. It is very scalable in that the number of function and gradient evaluations does not change significantly with increased problem size. Test cases are provided to demonstrate the efficiency and reliability of the methods and software.
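The exterior penalty idea mentioned in the abstract above can be sketched on a two-variable problem. This is a generic textbook version and not the BIGDOT algorithm: the objective, the single constraint, the penalty schedule and the plain gradient-descent inner loop are assumptions. Constraint violation is penalized quadratically, the penalty weight is increased in stages, and each stage starts from the previous solution. Only the current point and one gradient vector are stored, which hints at why such methods need little memory.

```python
def objective_gradient(x):
    """Gradient of the hypothetical objective f(x) = (x0 - 2)^2 + (x1 - 1)^2."""
    return [2.0 * (x[0] - 2.0), 2.0 * (x[1] - 1.0)]

def constraint(x):
    """Hypothetical inequality constraint g(x) = x0 + x1 - 2 <= 0."""
    return x[0] + x[1] - 2.0

def penalized_gradient(x, r):
    """Gradient of f(x) + r * max(0, g(x))^2; the penalty acts only when g > 0."""
    violation = max(0.0, constraint(x))
    grad = objective_gradient(x)
    # dg/dx = (1, 1), so the penalty adds 2 r g to each component.
    return [grad[0] + 2.0 * r * violation, grad[1] + 2.0 * r * violation]

def exterior_penalty(x0, penalties=(1.0, 10.0, 100.0, 1000.0, 10000.0), inner_iters=200):
    x = list(x0)
    for r in penalties:
        step = 1.0 / (2.0 + 4.0 * r)      # 1 / (largest curvature of the penalized function)
        for _ in range(inner_iters):
            grad = penalized_gradient(x, r)
            x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x

solution = exterior_penalty([0.0, 0.0])
violation = constraint(solution)
```

For this problem the constrained optimum is (1.5, 0.5). The iterates approach it from the infeasible side, so `violation` is small and positive, which is characteristic of exterior penalty methods.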
A fast mass spring model solver for high-resolution elastic objects

NASA Astrophysics Data System (ADS)

Zheng, Mianlun; Yuan, Zhiyong; Zhu, Weixu; Zhang, Guian

2017-03-01

Real-time simulation of elastic objects is of great importance for computer graphics and virtual reality applications. The fast mass spring model solver can achieve visually realistic simulation in an efficient way. Unfortunately, this method suffers from resolution limitations and lack of mechanical realism for a surface geometry model, which greatly restricts its application. To tackle these problems, in this paper we propose a fast mass spring model solver for high-resolution elastic objects. First, we project the complex surface geometry model into a set of uniform grid cells as cages through *cages mean value coordinate method to reflect its internal structure and mechanics properties. Then, we replace the original Cholesky decomposition method in the fast mass spring model solver with a conjugate gradient method, which can make the fast mass spring model solver more efficient for detailed surface geometry models. Finally, we propose a graphics processing unit accelerated parallel algorithm for the conjugate gradient method. Experimental results show that our method can realize efficient deformation simulation of 3D elastic objects with visual reality and physical fidelity, which has a great potential for applications in computer animation.
An evaluation method of computer usability based on human-to-computer information transmission model.

PubMed

Ogawa, K

1992-01-01

This paper proposes a new evaluation and prediction method for computer usability. This method is based on our two previously proposed information transmission measures created from a human-to-computer information transmission model. The model has three information transmission levels: the device, software, and task content levels. Two measures, called the device independent information measure (DI) and the computer independent information measure (CI), defined on the software and task content levels respectively, are given as the amount of information transmitted. Two information transmission rates are defined as DI/T and CI/T, where T is the task completion time: the device independent information transmission rate (RDI), and the computer independent information transmission rate (RCI). The method utilizes the RDI and RCI rates to evaluate relatively the usability of software and device operations on different computer systems. Experiments using three different systems, in this case a graphical information input task, confirm that the method offers an efficient way of determining computer usability.
Stability and error estimation for Component Adaptive Grid methods

NASA Technical Reports Server (NTRS)

Oliger, Joseph; Zhu, Xiaolei

1994-01-01

Component adaptive grid (CAG) methods for solving hyperbolic partial differential equations (PDEs) are discussed in this paper. Applying recent stability results for a class of numerical methods on uniform grids, the convergence of these methods for linear problems on component adaptive grids is established here. Furthermore, the computational error can be estimated on CAGs using the stability results. Using these estimates, the error can be controlled on CAGs. Thus, the solution can be computed efficiently on CAGs within a given error tolerance. Computational results for time-dependent linear problems in one and two space dimensions are presented.
Secure and Efficient Regression Analysis Using a Hybrid Cryptographic Framework: Development and Evaluation

PubMed Central

Jiang, Xiaoqian; Aziz, Md Momin Al; Wang, Shuang; Mohammed, Noman

2018-01-01

Background: Machine learning is an effective data-driven tool that is being widely used to extract valuable patterns and insights from data. Specifically, predictive machine learning models are very important in health care for clinical data analysis. The machine learning algorithms that generate predictive models often require pooling data from different sources to discover statistical patterns or correlations among different attributes of the input data. The primary challenge is to fulfill one major objective: preserving the privacy of individuals while discovering knowledge from data. Objective: Our objective was to develop a hybrid cryptographic framework for performing regression analysis over distributed data in a secure and efficient way. Methods: Existing secure computation schemes are not suitable for processing the large-scale data that are used in cutting-edge machine learning applications. We designed, developed, and evaluated a hybrid cryptographic framework that can securely perform regression analysis, a fundamental machine learning algorithm, using somewhat homomorphic encryption and a newly introduced secure hardware component, Intel Software Guard Extensions (Intel SGX), to ensure both privacy and efficiency at the same time. Results: Experimental results demonstrate that our proposed method provides a better trade-off in terms of security and efficiency than solely secure hardware-based methods. In addition, there is no approximation error: the computed model parameters exactly match the plaintext results. Conclusions: To the best of our knowledge, this kind of secure computation model using a hybrid cryptographic framework, which leverages both somewhat homomorphic encryption and Intel SGX, has not been proposed or evaluated to date. Our proposed framework ensures data security and computational efficiency at the same time. PMID:29506966
Status and future prospects of using numerical methods to study complex flows at High Reynolds numbers

NASA Technical Reports Server (NTRS)

Maccormack, R. W.

1978-01-01

The calculation of flow fields past aircraft configurations at flight Reynolds numbers is considered. Progress in devising accurate and efficient numerical methods, in understanding and modeling the physics of turbulence, and in developing reliable and powerful computer hardware is discussed. Emphasis is placed on efficient solutions to the Navier-Stokes equations.
Use of the parameterised finite element method to robustly and efficiently evolve the edge of a moving cell.

PubMed

Neilson, Matthew P; Mackenzie, John A; Webb, Steven D; Insall, Robert H

2010-11-01

In this paper we present a computational tool that enables the simulation of mathematical models of cell migration and chemotaxis on an evolving cell membrane. Recent models require the numerical solution of systems of reaction-diffusion equations on the evolving cell membrane and then the solution state is used to drive the evolution of the cell edge. Previous work involved moving the cell edge using a level set method (LSM). However, the LSM is computationally very expensive, which severely limits the practical usefulness of the algorithm. To address this issue, we have employed the parameterised finite element method (PFEM) as an alternative method for evolving a cell boundary. We show that the PFEM is far more efficient and robust than the LSM. We therefore suggest that the PFEM potentially has an essential role to play in computational modelling efforts towards the understanding of many of the complex issues related to chemotaxis.

Distributed collaborative probabilistic design for turbine blade-tip radial running clearance using support vector machine of regression

NASA Astrophysics Data System (ADS)

Fei, Cheng-Wei; Bai, Guang-Chen

2014-12-01

To improve the computational precision and efficiency of probabilistic design for mechanical dynamic assemblies such as the blade-tip radial running clearance (BTRRC) of a gas turbine, a distributed collaborative probabilistic design method based on support vector machine regression (SR), called DCSRM, is proposed by integrating the distributed collaborative response surface method with a support vector machine regression model. The mathematical model of DCSRM is established and the probabilistic design idea of DCSRM is introduced. The dynamic assembly probabilistic design of an aeroengine high-pressure turbine (HPT) BTRRC is carried out to verify the proposed DCSRM. The analysis results yield the optimal static blade-tip clearance of the HPT for designing the BTRRC and for improving the performance and reliability of the aeroengine. The comparison of methods shows that the DCSRM has high computational accuracy and high computational efficiency in BTRRC probabilistic analysis. The present research offers an effective approach to the reliability design of mechanical dynamic assemblies and enriches mechanical reliability theory and methods.
A diagonal algorithm for the method of pseudocompressibility. [for steady-state solution to incompressible Navier-Stokes equation

NASA Technical Reports Server (NTRS)

Rogers, S. E.; Kwak, D.; Chang, J. L. C.

1986-01-01

The method of pseudocompressibility has been shown to be an efficient method for obtaining a steady-state solution to the incompressible Navier-Stokes equations. Recent improvements to this method include the use of a diagonal scheme for the inversion of the equations at each iteration. The necessary transformations have been derived for the pseudocompressibility equations in generalized coordinates. The diagonal algorithm reduces the computing time necessary to obtain a steady-state solution by a factor of nearly three. Implicit viscous terms are maintained in the equations, and it has become possible to use fourth-order implicit dissipation. The steady-state solution is unchanged by the approximations resulting from the diagonalization of the equations. Computed results for flow over a two-dimensional backward-facing step and a three-dimensional cylinder mounted normal to a flat plate are presented for both the old and new algorithms. The accuracy and computing efficiency of these algorithms are compared.
A multigrid nonoscillatory method for computing high speed flows

NASA Technical Reports Server (NTRS)

Li, C. P.; Shieh, T. H.

1993-01-01

A multigrid method using different smoothers has been developed to solve the Euler equations discretized by a nonoscillatory scheme up to fourth order accuracy. The best smoothing property is provided by a five-stage Runge-Kutta technique with optimized coefficients, yet the most efficient smoother is a backward Euler technique in factored and diagonalized form. The singlegrid solution for a hypersonic, viscous conic flow is in excellent agreement with the solution obtained by the third order MUSCL and Roe's method. Mach 8 inviscid flow computations for a complete entry probe have shown that the accuracy is at least as good as the symmetric TVD scheme of Yee and Harten. The implicit multigrid method is four times more efficient than the explicit multigrid technique and 3.5 times faster than the single-grid implicit technique. For a Mach 8.7 inviscid flow over a blunt delta wing at 30 deg incidence, the CPU reduction factor from the three-level multigrid computation is 2.2 on a grid of 37 x 41 x 73 nodes.
Efficient reliability analysis of structures with the rotational quasi-symmetric point- and the maximum entropy methods

NASA Astrophysics Data System (ADS)

Xu, Jun; Dang, Chao; Kong, Fan

2017-10-01

This paper presents a new method for efficient structural reliability analysis. In this method, a rotational quasi-symmetric point method (RQ-SPM) is proposed for evaluating the fractional moments of the performance function. Then, the derivation of the performance function's probability density function (PDF) is carried out based on the maximum entropy method in which constraints are specified in terms of fractional moments. In this regard, the probability of failure can be obtained by a simple integral over the performance function's PDF. Six examples, including a finite element-based reliability analysis and a dynamic system with strong nonlinearity, are used to illustrate the efficacy of the proposed method. All the computed results are compared with those by Monte Carlo simulation (MCS). It is found that the proposed method can provide very accurate results with low computational effort.
Contour integral method for obtaining the self-energy matrices of electrodes in electron transport calculations

NASA Astrophysics Data System (ADS)

Iwase, Shigeru; Futamura, Yasunori; Imakura, Akira; Sakurai, Tetsuya; Tsukamoto, Shigeru; Ono, Tomoya

2018-05-01

We propose an efficient computational method for evaluating the self-energy matrices of electrodes to study ballistic electron transport properties in nanoscale systems. To reduce the high computational cost incurred in large systems, a contour integral eigensolver based on the Sakurai-Sugiura method combined with the shifted biconjugate gradient method is developed to solve an exponential-type eigenvalue problem for complex wave vectors. A remarkable feature of the proposed algorithm is that the numerical procedure is very similar to that of conventional band structure calculations. We implement the developed method in the framework of the real-space higher-order finite-difference scheme with nonlocal pseudopotentials. Numerical tests for a wide variety of materials validate the robustness, accuracy, and efficiency of the proposed method. As an illustration of the method, we present the electron transport property of the freestanding silicene with the line defect originating from the reversed buckled phases.
Accurate first-principles structures and energies of diversely bonded systems from an efficient density functional

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sun, Jianwei; Remsing, Richard C.; Zhang, Yubo

2016-06-13

One atom or molecule binds to another through various types of bond, the strengths of which range from several meV to several eV. Although some computational methods can provide accurate descriptions of all bond types, those methods are not efficient enough for many studies (for example, large systems, ab initio molecular dynamics and high-throughput searches for functional materials). Here, we show that the recently developed non-empirical strongly constrained and appropriately normed (SCAN) meta-generalized gradient approximation (meta-GGA) within the density functional theory framework predicts accurate geometries and energies of diversely bonded molecules and materials (including covalent, metallic, ionic, hydrogen and van der Waals bonds). This represents a significant improvement at comparable efficiency over its predecessors, the GGAs that currently dominate materials computation. Often, SCAN matches or improves on the accuracy of a computationally expensive hybrid functional, at almost-GGA cost. SCAN is therefore expected to have a broad impact on chemistry and materials science.
Increasing the sampling efficiency of protein conformational transition using velocity-scaling optimized hybrid explicit/implicit solvent REMD simulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yu, Yuqi; Wang, Jinan; Shao, Qiang, E-mail: qshao@mail.shcnc.ac.cn, E-mail: Jiye.Shi@ucb.com, E-mail: wlzhu@mail.shcnc.ac.cn

2015-03-28

The application of temperature replica exchange molecular dynamics (REMD) simulation to protein motion is limited by its huge requirement for computational resources, particularly when an explicit solvent model is used. In a previous study, we developed a velocity-scaling optimized hybrid explicit/implicit solvent REMD method in the hope of reducing the number of temperatures (replicas) while maintaining high sampling efficiency. In this study, we used this method to characterize and energetically identify the conformational transition pathway of a protein model, the N-terminal domain of calmodulin. In comparison to the standard explicit solvent REMD simulation, the hybrid REMD is much less computationally expensive and yet gives an accurate evaluation of the structural and thermodynamic properties of the conformational transition, in good agreement with the standard REMD simulation. Therefore, the hybrid REMD could greatly increase the computational efficiency and thus extend the application of REMD simulation to larger protein systems.
Accurate first-principles structures and energies of diversely bonded systems from an efficient density functional.

PubMed

Sun, Jianwei; Remsing, Richard C; Zhang, Yubo; Sun, Zhaoru; Ruzsinszky, Adrienn; Peng, Haowei; Yang, Zenghui; Paul, Arpita; Waghmare, Umesh; Wu, Xifan; Klein, Michael L; Perdew, John P

2016-09-01

One atom or molecule binds to another through various types of bond, the strengths of which range from several meV to several eV. Although some computational methods can provide accurate descriptions of all bond types, those methods are not efficient enough for many studies (for example, large systems, ab initio molecular dynamics and high-throughput searches for functional materials). Here, we show that the recently developed non-empirical strongly constrained and appropriately normed (SCAN) meta-generalized gradient approximation (meta-GGA) within the density functional theory framework predicts accurate geometries and energies of diversely bonded molecules and materials (including covalent, metallic, ionic, hydrogen and van der Waals bonds). This represents a significant improvement at comparable efficiency over its predecessors, the GGAs that currently dominate materials computation. Often, SCAN matches or improves on the accuracy of a computationally expensive hybrid functional, at almost-GGA cost. SCAN is therefore expected to have a broad impact on chemistry and materials science.
QM/QM approach to model energy disorder in amorphous organic semiconductors.

PubMed

Friederich, Pascal; Meded, Velimir; Symalla, Franz; Elstner, Marcus; Wenzel, Wolfgang

2015-02-10

It is an outstanding challenge to model the electronic properties of organic amorphous materials utilized in organic electronics. Computation of the charge carrier mobility is a challenging problem as it requires integration of morphological and electronic degrees of freedom in a coherent methodology and depends strongly on the distribution of polaron energies in the system. Here we present a QM/QM model to compute the polaron energies, combining density functional methods for molecules in the vicinity of the polaron with computationally efficient density functional based tight binding methods in the rest of the environment. For seven widely used amorphous organic semiconductor materials, we show that the calculations are accelerated by up to 1 order of magnitude without any loss in accuracy. Considering that the quantum chemical step is the efficiency bottleneck of a workflow to model the carrier mobility, these results are an important step toward accurate and efficient simulations of disordered organic semiconductors, a prerequisite for accelerated materials screening and consequent component optimization in the organic electronics industry.
Zone-boundary optimization for direct laser writing of continuous-relief diffractive optical elements.

PubMed

Korolkov, Victor P; Nasyrov, Ruslan K; Shimansky, Ruslan V

2006-01-01

Enhancing the diffraction efficiency of continuous-relief diffractive optical elements fabricated by direct laser writing is discussed. A new method of zone-boundary optimization is proposed to correct exposure data only in narrow areas along the boundaries of diffractive zones. The optimization decreases the loss of diffraction efficiency related to convolution of a desired phase profile with a writing-beam intensity distribution. A simplified stepped transition function that describes optimized exposure data near zone boundaries can be made universal for a wide range of zone periods. The approach permits a similar increase in the diffraction efficiency as an individual-pixel optimization but with less computational effort. Computer simulations demonstrated that the zone-boundary optimization for a 6 μm period grating increases the efficiency by 7% and 14.5% for 0.6 μm and 1.65 μm writing-spot diameters, respectively. A diffraction efficiency of as much as 65%-90% for 4-10 μm zone periods was obtained experimentally with this method.
Full Quantum Dynamics Simulation of a Realistic Molecular System Using the Adaptive Time-Dependent Density Matrix Renormalization Group Method.

PubMed

Yao, Yao; Sun, Ke-Wei; Luo, Zhen; Ma, Haibo

2018-01-18

The accurate theoretical interpretation of ultrafast time-resolved spectroscopy experiments relies on full quantum dynamics simulations of the investigated system, which are nevertheless computationally prohibitive for realistic molecular systems with a large number of electronic and/or vibrational degrees of freedom. In this work, we propose a unitary transformation approach for realistic vibronic Hamiltonians, which can then be treated with the adaptive time-dependent density matrix renormalization group (t-DMRG) method to efficiently evolve the nonadiabatic dynamics of a large molecular system. We demonstrate the accuracy and efficiency of this approach with an example of simulating the exciton dissociation process within an oligothiophene/fullerene heterojunction, indicating that t-DMRG can be a promising method for full quantum dynamics simulation in large chemical systems. Moreover, it is also shown that the proper vibronic features in the ultrafast electronic process can be obtained by simulating the two-dimensional (2D) electronic spectrum by virtue of the high computational efficiency of the t-DMRG method.
An efficient method for computing unsteady transonic aerodynamics of swept wings with control surfaces

NASA Technical Reports Server (NTRS)

Liu, D. D.; Kao, Y. F.; Fung, K. Y.

1989-01-01

A transonic equivalent strip (TES) method was further developed for unsteady flow computations of arbitrary wing planforms. The TES method consists of two consecutive correction steps to a given nonlinear code such as LTRAN2; namely, the chordwise mean flow correction and the spanwise phase correction. The computation procedure requires direct pressure input from other computed or measured data. Otherwise, it does not require airfoil shape or grid generation for given planforms. To validate the computed results, four swept wings of various aspect ratios, including those with control surfaces, are selected as computational examples. Overall trends in unsteady pressures are established with those obtained by XTRAN3S codes, Isogai's full potential code and measured data by NLR and RAE. In comparison with these methods, the TES has achieved considerable saving in computer time and reasonable accuracy which suggests immediate industrial applications.
Unstructured mesh methods for CFD

NASA Technical Reports Server (NTRS)

Peraire, J.; Morgan, K.; Peiro, J.

1990-01-01

Mesh generation methods for Computational Fluid Dynamics (CFD) are outlined. Geometric modeling is discussed. An advancing front method is described. Flow past a two engine Falcon aeroplane is studied. An algorithm and associated data structure called the alternating digital tree, which efficiently solves the geometric searching problem, is described. The computation of an initial approximation to the steady state solution of a given problem is described. Mesh generation for transient flows is described.
Efficient Monte Carlo Estimation of the Expected Value of Sample Information Using Moment Matching.

PubMed

Heath, Anna; Manolopoulou, Ioanna; Baio, Gianluca

2018-02-01

The Expected Value of Sample Information (EVSI) is used to calculate the economic value of a new research strategy. Although this value would be important to both researchers and funders, there are very few practical applications of the EVSI. This is due to computational difficulties associated with calculating the EVSI in practical health economic models using nested simulations. We present an approximation method for the EVSI that is framed in a Bayesian setting and is based on estimating the distribution of the posterior mean of the incremental net benefit across all possible future samples, known as the distribution of the preposterior mean. Specifically, this distribution is estimated using moment matching coupled with simulations that are available for probabilistic sensitivity analysis, which is typically mandatory in health economic evaluations. This novel approximation method is applied to a health economic model that has previously been used to assess the performance of other EVSI estimators and accurately estimates the EVSI. The computational time for this method is competitive with other methods. We have developed a new calculation method for the EVSI which is computationally efficient and accurate. This novel method relies on some additional simulation so can be expensive in models with a large computational cost.
Efficient computation of turbulent flow in ribbed passages using a non-overlapping near-wall domain decomposition method

NASA Astrophysics Data System (ADS)

Jones, Adam; Utyuzhnikov, Sergey

2017-08-01

Turbulent flow in a ribbed channel is studied using an efficient near-wall domain decomposition (NDD) method. The NDD approach is formulated by splitting the computational domain into an inner and outer region, with an interface boundary between the two. The computational mesh covers the outer region, and the flow in this region is solved using the open-source CFD code Code_Saturne with special boundary conditions on the interface boundary, called interface boundary conditions (IBCs). The IBCs are of Robin type and incorporate the effect of the inner region on the flow in the outer region. IBCs are formulated in terms of the distance from the interface boundary to the wall in the inner region. It is demonstrated that up to 90% of the region between the ribs in the ribbed passage can be removed from the computational mesh with an error on the friction factor within 2.5%. In addition, computations with NDD are faster than computations based on low Reynolds number (LRN) models by a factor of five. Different rib heights can be studied with the same mesh in the outer region without affecting the accuracy of the friction factor. This is tested with six different rib heights in an example of a design optimisation study. It is found that the friction factors computed with NDD are almost identical to the fully-resolved results. When used for inverse problems, NDD is considerably more efficient than LRN computations because only one computation needs to be performed and only one mesh needs to be generated.
Modified Method of Adaptive Artificial Viscosity for Solution of Gas Dynamics Problems on Parallel Computer Systems

NASA Astrophysics Data System (ADS)

Popov, Igor; Sukov, Sergey

2018-02-01

A modification of the adaptive artificial viscosity (AAV) method is considered. This modification is based on a one-stage time approximation and is adapted to the calculation of gas dynamics problems on unstructured grids with an arbitrary type of grid elements. The proposed numerical method has simplified logic, better performance and better parallel efficiency compared to the implementation of the original AAV method. Computer experiments provide evidence of the robustness of the method and of its convergence to the difference solution.
GPU-accelerated Modeling and Element-free Reverse-time Migration with Gauss Points Partition

NASA Astrophysics Data System (ADS)

Zhen, Z.; Jia, X.

2014-12-01

Element-free method (EFM) has been applied to seismic modeling and migration. Compared with the finite element method (FEM) and the finite difference method (FDM), it is much cheaper and more flexible because only the information of the nodes and the boundary of the study area is required in computation. In the EFM, the number of Gauss points should be consistent with the number of model nodes; otherwise the accuracy of the intermediate coefficient matrices would be harmed. Thus when we increase the nodes of the velocity model in order to obtain higher resolution, we find that the size of the computer's memory becomes a bottleneck. The original EFM can deal with at most 81×81 nodes in the case of 2G memory, as tested by Jia and Hu (2006). In order to solve the problem of storage and computation efficiency, we propose a concept of Gauss points partition (GPP), and utilize GPUs to improve the computation efficiency. Considering the characteristics of the Gaussian points, the GPP method doesn't influence the propagation of the seismic wave in the velocity model. To overcome the time-consuming computation of the stiffness matrix (K) and the mass matrix (M), we also use GPUs in our computation program. We employ the compressed sparse row (CSR) format to compress the intermediate sparse matrices and try to simplify the operations by solving the linear equations with the CULA Sparse Conjugate Gradient (CG) solver instead of the linear sparse solver 'PARDISO'. It is observed that our strategy can significantly reduce the computational time of K and M compared with the CPU-based algorithm. The model tested is the Marmousi model. The length of the model is 7425 m and the depth is 2990 m. We discretize the model with 595×298 nodes, 300×300 Gauss cells and 3×3 Gauss points in each cell. In contrast to the computational time of the conventional EFM, the GPUs-GPP approach can substantially improve the efficiency. The speedup ratio for computing K and M is 120, and the speedup ratio for RTM is 11.5. At the same time, the accuracy of imaging is not harmed. Another advantage of the GPUs-GPP method is its easy application to other numerical methods such as the FEM. Finally, in the GPUs-GPP method, the arrays require quite limited memory storage, which makes the method promising in dealing with large-scale 3D problems.
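As a rough illustration of the two ingredients named in the abstract above, CSR storage and a conjugate gradient solve, the following pure-Python sketch stores a small symmetric positive definite matrix in CSR form and solves a linear system with CG. It is not the authors' GPU implementation and does not use CULA Sparse or PARDISO; the function names (`dense_to_csr`, `csr_matvec`, `conjugate_gradient`) and the tridiagonal test matrix are assumptions chosen for illustration.

```python
# Illustrative sketch only: CPU, pure Python, hypothetical helper names.

def dense_to_csr(rows):
    """Convert a dense row-major matrix to CSR arrays (data, indices, indptr)."""
    data, indices, indptr = [], [], [0]
    for row in rows:
        for j, value in enumerate(row):
            if value != 0.0:
                data.append(value)      # nonzero value
                indices.append(j)       # its column index
        indptr.append(len(data))        # where the next row starts in data
    return data, indices, indptr


def csr_matvec(csr, x):
    """Multiply a CSR matrix by a vector, touching only stored nonzeros."""
    data, indices, indptr = csr
    y = []
    for i in range(len(indptr) - 1):
        s = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            s += data[k] * x[indices[k]]
        y.append(s)
    return y


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def conjugate_gradient(csr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite CSR matrix A."""
    x = [0.0] * len(b)
    r = list(b)                 # residual b - A x for the zero initial guess
    p = list(r)                 # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = csr_matvec(csr, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Small stiffness-like test matrix: tridiagonal (-1, 2, -1), assumed for the demo.
n = 5
K = [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(n)]
     for i in range(n)]
K_csr = dense_to_csr(K)
u = conjugate_gradient(K_csr, [1.0] * n)
print(len(K_csr[0]), "stored nonzeros out of", n * n, "entries")
print(u)
```

The memory saving comes from storing only the nonzeros plus two index arrays; for matrices such as K and M, where most entries are zero, this is what makes larger node counts fit in memory.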
A demonstration of adjoint methods for multi-dimensional remote sensing of the atmosphere and surface

NASA Astrophysics Data System (ADS)

Martin, William G. K.; Hasekamp, Otto P.

2018-01-01

In previous work, we derived the adjoint method as a computationally efficient path to three-dimensional (3D) retrievals of clouds and aerosols. In this paper we will demonstrate the use of adjoint methods for retrieving two-dimensional (2D) fields of cloud extinction. The demonstration uses a new 2D radiative transfer solver (FSDOM). This radiation code was augmented with adjoint methods to allow efficient derivative calculations needed to retrieve cloud and surface properties from multi-angle reflectance measurements. The code was then used in three synthetic retrieval studies. Our retrieval algorithm adjusts the cloud extinction field and surface albedo to minimize the measurement misfit function with a gradient-based, quasi-Newton approach. At each step we compute the value of the misfit function and its gradient with two calls to the solver FSDOM. First we solve the forward radiative transfer equation to compute the residual misfit with measurements, and second we solve the adjoint radiative transfer equation to compute the gradient of the misfit function with respect to all unknowns. The synthetic retrieval studies verify that adjoint methods are scalable to retrieval problems with many measurements and unknowns. We can retrieve the vertically-integrated optical depth of moderately thick clouds as a function of the horizontal coordinate. It is also possible to retrieve the vertical profile of clouds that are separated by clear regions. The vertical profile retrievals improve for smaller cloud fractions. This leads to the conclusion that cloud edges actually increase the amount of information that is available for retrieving the vertical profile of clouds. However, to exploit this information one must retrieve the horizontally heterogeneous cloud properties with a 2D (or 3D) model. This prototype shows that adjoint methods can efficiently compute the gradient of the misfit function. This work paves the way for the application of similar methods to 3D remote sensing problems.
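The "two solver calls per step" pattern described in the abstract above can be sketched on a toy problem. The example below is not FSDOM and involves no radiative transfer: it assumes a small linear forward model A(p) u = b, where the unknowns p are added to the diagonal of a fixed matrix, and a least-squares misfit J = 0.5 * sum((u - d)^2). One forward solve gives u, one adjoint solve A^T lam = (u - d) gives lam, and the full gradient follows as dJ/dp_k = -lam_k * u_k. All names and numbers are hypothetical.

```python
# Toy adjoint-gradient sketch under the assumptions stated above.

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


# Fixed part of the operator (nonsymmetric, so the transpose matters),
# source term b, and synthetic "measurements" d. All values are made up.
A0 = [[2.0, -1.0, 0.0], [-0.5, 2.0, -1.0], [0.0, -0.5, 2.0]]
b = [1.0, 0.0, 0.0]
d = [0.3, 0.2, 0.1]


def system_matrix(p):
    """A(p) = A0 + diag(p): each unknown modifies one diagonal entry."""
    return [[A0[i][j] + (p[i] if i == j else 0.0) for j in range(3)]
            for i in range(3)]


def misfit(p):
    u = solve(system_matrix(p), b)
    return 0.5 * sum((ui - di) ** 2 for ui, di in zip(u, d))


def misfit_and_gradient(p):
    A = system_matrix(p)
    u = solve(A, b)                                  # call 1: forward solve
    residual = [ui - di for ui, di in zip(u, d)]
    A_transposed = [list(col) for col in zip(*A)]
    lam = solve(A_transposed, residual)              # call 2: adjoint solve
    J = 0.5 * sum(r * r for r in residual)
    grad = [-lam[k] * u[k] for k in range(3)]        # dJ/dp_k = -lam_k * u_k
    return J, grad


p0 = [0.1, 0.2, 0.3]
J0, g0 = misfit_and_gradient(p0)
print(J0, g0)
```

The cost of the gradient is two solves regardless of how many unknowns there are, whereas a finite-difference gradient needs at least one extra forward solve per unknown; this is the scalability argument the abstract makes.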
A new impedance accounting for short- and long-range effects in mixed substructured formulations of nonlinear problems

NASA Astrophysics Data System (ADS)

Negrello, Camille; Gosselet, Pierre; Rey, Christian

2018-05-01

An efficient method for solving large nonlinear problems combines Newton solvers and Domain Decomposition Methods (DDM). In the DDM framework, the boundary conditions can be chosen to be primal, dual or mixed. The mixed approach presents the advantage to be eligible for the research of an optimal interface parameter (often called impedance) which can increase the convergence rate. The optimal value for this parameter is often too expensive to be computed exactly in practice: an approximate version has to be sought for, along with a compromise between efficiency and computational cost. In the context of parallel algorithms for solving nonlinear structural mechanical problems, we propose a new heuristic for the impedance which combines short and long range effects at a low computational cost.
SAGE: The Self-Adaptive Grid Code. 3

NASA Technical Reports Server (NTRS)

Davies, Carol B.; Venkatapathy, Ethiraj

1999-01-01

The multi-dimensional self-adaptive grid code, SAGE, is an important tool in the field of computational fluid dynamics (CFD). It provides an efficient method to improve the accuracy of flow solutions while simultaneously reducing computer processing time. Briefly, SAGE enhances an initial computational grid by redistributing the mesh points into more appropriate locations. The movement of these points is driven by an equal-error-distribution algorithm that utilizes the relationship between high flow gradients and excessive solution errors. The method also provides a balance between clustering points in the high gradient regions and maintaining the smoothness and continuity of the adapted grid. The latest version, Version 3, includes the ability to change the boundaries of a given grid to more efficiently enclose flow structures and provides alternative redistribution algorithms.
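The idea of redistributing points so that a gradient-based error measure is shared equally between cells can be illustrated in one dimension. The sketch below is a generic equidistribution scheme, not the SAGE algorithm: the weight function 1 + alpha * |du/dx|, the parameter `alpha`, and the function name `equidistribute` are all assumptions made for this example.

```python
import math


def equidistribute(x, u, alpha=1.0):
    """Move interior grid points so each cell holds an equal share of the
    weight w = 1 + alpha * |du/dx| (a generic 1D equidistribution sketch)."""
    n = len(x)
    # Piecewise-constant weight on each cell of the initial grid.
    w = [1.0 + alpha * abs((u[i + 1] - u[i]) / (x[i + 1] - x[i]))
         for i in range(n - 1)]
    # Cumulative weighted length at each initial grid point.
    cum = [0.0]
    for i in range(n - 1):
        cum.append(cum[-1] + w[i] * (x[i + 1] - x[i]))
    total = cum[-1]
    new_x = [x[0]]                       # boundary points stay fixed
    seg = 0
    for j in range(1, n - 1):
        target = total * j / (n - 1)     # equal share per cell
        while cum[seg + 1] < target:
            seg += 1
        frac = (target - cum[seg]) / (cum[seg + 1] - cum[seg])
        new_x.append(x[seg] + frac * (x[seg + 1] - x[seg]))
    new_x.append(x[-1])
    return new_x


# Uniform initial grid and a profile with a steep layer at x = 0.5.
x0 = [i / 20 for i in range(21)]
u0 = [math.tanh(20.0 * (xi - 0.5)) for xi in x0]
x1 = equidistribute(x0, u0)
print(sum(1 for v in x0 if 0.4 <= v <= 0.6), "points near the layer before")
print(sum(1 for v in x1 if 0.4 <= v <= 0.6), "points near the layer after")
```

The constant 1 in the weight keeps some points in the smooth regions, which loosely mirrors the balance between clustering and grid smoothness mentioned in the abstract; a larger `alpha` clusters more aggressively.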

Computationally efficient method for optical simulation of solar cells and their applications

NASA Astrophysics Data System (ADS)

Semenikhin, I.; Zanuccoli, M.; Fiegna, C.; Vyurkov, V.; Sangiorgi, E.

2013-01-01

This paper presents two novel implementations of the Differential method to solve the Maxwell equations in nanostructured optoelectronic solid state devices. The first proposed implementation is based on an improved and computationally efficient T-matrix formulation that adopts multiple-precision arithmetic to tackle the numerical instability problem which arises due to evanescent modes. The second implementation adopts an iterative approach that achieves a low computational complexity of O(N log N) or better. The proposed algorithms can handle structures with arbitrary spatial variation of the permittivity. The developed two-dimensional numerical simulator is applied to analyze the dependence of the absorption characteristics of a thin silicon slab on the morphology of the front interface and on the angle of incidence of the radiation with respect to the device surface.
Symmetry-conserving purification of quantum states within the density matrix renormalization group

DOE PAGES

Nocera, Alberto; Alvarez, Gonzalo

2016-01-28

The density matrix renormalization group (DMRG) algorithm was originally designed to efficiently compute the zero-temperature or ground-state properties of one-dimensional strongly correlated quantum systems. The development of the algorithm at finite temperature has been a topic of much interest, because of the usefulness of thermodynamic quantities in understanding the physics of condensed matter systems, and because of the increased complexity associated with efficiently computing temperature-dependent properties. The ancilla method is a DMRG technique that enables the computation of these thermodynamic quantities. In this paper, we review the ancilla method, and improve its performance by working on reduced Hilbert spaces and using canonical approaches. Furthermore we explore its applicability beyond spin systems to t-J and Hubbard models.
Reuse of imputed data in microarray analysis increases imputation efficiency

PubMed Central

Kim, Ki-Yeol; Kim, Byoung-Jin; Yi, Gwan-Su

2004-01-01

Background The imputation of missing values is necessary for the efficient use of DNA microarray data, because many clustering algorithms and some statistical analysis require a complete data set. A few imputation methods for DNA microarray data have been introduced, but the efficiency of the methods was low and the validity of imputed values in these methods had not been fully checked. Results We developed a new cluster-based imputation method called sequential K-nearest neighbor (SKNN) method. This imputes the missing values sequentially from the gene having least missing values, and uses the imputed values for the later imputation. Although it uses the imputed values, the efficiency of this new method is greatly improved in its accuracy and computational complexity over the conventional KNN-based method and other methods based on maximum likelihood estimation. The performance of SKNN was in particular higher than other imputation methods for the data with high missing rates and large number of experiments. Application of Expectation Maximization (EM) to the SKNN method improved the accuracy, but increased computational time proportional to the number of iterations. The Multiple Imputation (MI) method, which is well known but not applied previously to microarray data, showed a similarly high accuracy as the SKNN method, with slightly higher dependency on the types of data sets. Conclusions Sequential reuse of imputed data in KNN-based imputation greatly increases the efficiency of imputation. The SKNN method should be practically useful to save the data of some microarray experiments which have high amounts of missing entries. The SKNN method generates reliable imputed values which can be used for further cluster-based analysis of microarray data. PMID:15504240
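The sequential reuse described in the abstract above can be sketched as follows. This is a simplified illustration, not the published SKNN implementation: it assumes Euclidean distance over the observed columns, an unweighted mean of the k nearest complete genes, and `None` as the missing-value marker. The function name `sknn_impute` and the small data matrix are hypothetical.

```python
import math


def sknn_impute(matrix, k=2):
    """Sequential KNN imputation sketch: rows are genes, columns are
    experiments, None marks a missing value. Rows with the fewest missing
    values are imputed first and then join the pool of reference rows."""
    rows = [row[:] for row in matrix]            # work on a copy
    complete = [i for i, r in enumerate(rows) if None not in r]
    if not complete:
        raise ValueError("at least one complete row is required")
    incomplete = sorted((i for i, r in enumerate(rows) if None in r),
                        key=lambda i: rows[i].count(None))
    for i in incomplete:
        target = rows[i]
        observed = [j for j, v in enumerate(target) if v is not None]

        def distance(c):
            # Euclidean distance using only the columns observed in target.
            return math.sqrt(sum((rows[c][j] - target[j]) ** 2
                                 for j in observed))

        neighbours = sorted(complete, key=distance)[:k]
        for j, v in enumerate(target):
            if v is None:
                target[j] = sum(rows[c][j] for c in neighbours) / len(neighbours)
        complete.append(i)                       # reuse the imputed row later
    return rows


# Hypothetical expression matrix: 5 genes x 3 experiments.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.1],
    [5.0, 6.0, 7.0],
    [1.0, 2.0, None],     # one missing value: imputed first
    [1.0, None, None],    # two missing values: imputed second
]
filled = sknn_impute(data, k=2)
for row in filled:
    print(row)
```

In this example the fourth gene is filled first and then serves as a neighbour for the fifth gene, so its imputed value contributes to the later imputation, which is the reuse step the abstract credits for the improved efficiency.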
The FLAME-slab method for electromagnetic wave scattering in aperiodic slabs

NASA Astrophysics Data System (ADS)

Mansha, Shampy; Tsukerman, Igor; Chong, Y. D.

2017-12-01

The proposed numerical method, "FLAME-slab," solves electromagnetic wave scattering problems for aperiodic slab structures by exploiting short-range regularities in these structures. The computational procedure involves special difference schemes with high accuracy even on coarse grids. These schemes are based on Trefftz approximations, utilizing functions that locally satisfy the governing differential equations, as is done in the Flexible Local Approximation Method (FLAME). Radiation boundary conditions are implemented via Fourier expansions in the air surrounding the slab. When applied to ensembles of slab structures with identical short-range features, such as amorphous or quasicrystalline lattices, the method is significantly more efficient, both in runtime and in memory consumption, than traditional approaches. This efficiency is due to the fact that the Trefftz functions need to be computed only once for the whole ensemble.
Simple formalism for efficient derivatives and multi-determinant expansions in quantum Monte Carlo

DOE Office of Scientific and Technical Information (OSTI.GOV)

Filippi, Claudia, E-mail: c.filippi@utwente.nl; Assaraf, Roland, E-mail: assaraf@lct.jussieu.fr; Moroni, Saverio, E-mail: moroni@democritos.it

2016-05-21

We present a simple and general formalism to compute efficiently the derivatives of a multi-determinant Jastrow-Slater wave function, the local energy, the interatomic forces, and similar quantities needed in quantum Monte Carlo. Through a straightforward manipulation of matrices evaluated on the occupied and virtual orbitals, we obtain an efficiency equivalent to algorithmic differentiation in the computation of the interatomic forces and the optimization of the orbital parameters. Furthermore, for a large multi-determinant expansion, the significant computational gain afforded by a recently introduced table method is here extended to the local value of any one-body operator and to its derivatives, in both all-electron and pseudopotential calculations.
Information processing using a single dynamical node as complex system

PubMed Central

Appeltant, L.; Soriano, M.C.; Van der Sande, G.; Danckaert, J.; Massar, S.; Dambre, J.; Schrauwen, B.; Mirasso, C.R.; Fischer, I.

2011-01-01

Novel methods for information processing are highly desired in our information-driven society. Inspired by the brain's ability to process information, the recently introduced paradigm known as 'reservoir computing' shows that complex networks can efficiently perform computation. Here we introduce a novel architecture that reduces the usually required large number of elements to a single nonlinear node with delayed feedback. Through an electronic implementation, we experimentally and numerically demonstrate excellent performance in a speech recognition benchmark. Complementary numerical studies also show excellent performance for a time series prediction benchmark. These results prove that delay-dynamical systems, even in their simplest manifestation, can perform efficient information processing. This finding paves the way to feasible and resource-efficient technological implementations of reservoir computing. PMID:21915110
Human motion planning based on recursive dynamics and optimal control techniques

NASA Technical Reports Server (NTRS)

Lo, Janzen; Huang, Gang; Metaxas, Dimitris

2002-01-01

This paper presents an efficient optimal control and recursive dynamics-based computer animation system for simulating and controlling the motion of articulated figures. A quasi-Newton nonlinear programming technique (super-linear convergence) is implemented to solve minimum torque-based human motion-planning problems. The explicit analytical gradients needed in the dynamics are derived using a matrix exponential formulation and Lie algebra. Cubic spline functions are used to make the search space for an optimal solution finite. Based on our formulations, our method is well conditioned and robust, in addition to being computationally efficient. To better illustrate the efficiency of our method, we present results of natural looking and physically correct human motions for a variety of human motion tasks involving open and closed loop kinematic chains.
Computing the Baker-Campbell-Hausdorff series and the Zassenhaus product

NASA Astrophysics Data System (ADS)

Weyrauch, Michael; Scholz, Daniel

2009-09-01

The Baker-Campbell-Hausdorff (BCH) series and the Zassenhaus product are of fundamental importance for the theory of Lie groups and their applications in physics and physical chemistry. Standard methods for the explicit construction of the BCH and Zassenhaus terms yield polynomial representations, which must be translated into the usually required commutator representation. We prove that a new translation proposed recently yields a correct representation of the BCH and Zassenhaus terms. This representation entails fewer terms than the well-known Dynkin-Specht-Wever representation, which is of relevance for practical applications. Furthermore, various methods for the computation of the BCH and Zassenhaus terms are compared, and a new efficient approach for the calculation of the Zassenhaus terms is proposed. Mathematica implementations for the most efficient algorithms are provided together with comparisons of efficiency.
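For orientation, the leading terms of the two expansions named in this abstract are written out below in commutator form. These are the standard textbook low-order terms, not the new representation derived in the paper; the symbols X, Y, Z, and C_n are notation assumed here for illustration.

```latex
% BCH series: Z = log(e^X e^Y), shown through fourth order
Z = X + Y + \tfrac{1}{2}[X,Y]
  + \tfrac{1}{12}\bigl([X,[X,Y]] + [Y,[Y,X]]\bigr)
  - \tfrac{1}{24}[Y,[X,[X,Y]]] + \cdots

% Zassenhaus product: e^{X+Y} = e^{X} e^{Y} e^{C_2} e^{C_3} \cdots
C_2 = -\tfrac{1}{2}[X,Y], \qquad
C_3 = \tfrac{1}{3}[Y,[X,Y]] + \tfrac{1}{6}[X,[X,Y]]
```

If X and Y commute, every commutator vanishes and both expressions reduce to the ordinary exponential law.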
Algorithms for Efficient Computation of Transfer Functions for Large Order Flexible Systems

NASA Technical Reports Server (NTRS)

Maghami, Peiman G.; Giesy, Daniel P.

1998-01-01

An efficient and robust computational scheme is given for the calculation of the frequency response function of a large order, flexible system implemented with a linear, time invariant control system. Advantage is taken of the highly structured sparsity of the system matrix of the plant based on a model of the structure using normal mode coordinates. The computational time per frequency point of the new computational scheme is a linear function of system size, a significant improvement over traditional, full-matrix techniques whose computational times per frequency point range from quadratic to cubic functions of system size. This permits the practical frequency domain analysis of systems of much larger order than by traditional, full-matrix techniques. Formulations are given for both open- and closed-loop systems. Numerical examples are presented showing the advantages of the present formulation over traditional approaches, both in speed and in accuracy. Using a model with 703 structural modes, the present method was up to two orders of magnitude faster than a traditional method. The present method generally showed good to excellent accuracy throughout the range of test frequencies, while traditional methods gave adequate accuracy for lower frequencies, but generally deteriorated in performance at higher frequencies with worst case errors being many orders of magnitude larger than the correct values.
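The linear-versus-cubic cost contrast described above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes an open-loop, single-input single-output plant with no controller, and the names `modal_frf`, `full_matrix_frf`, `omega_n`, `zeta`, `b`, and `c` are hypothetical. In normal-mode coordinates the modes decouple, so the response is a sum over modes rather than a dense linear solve.

```python
import numpy as np

def modal_frf(omega_n, zeta, b, c, freqs):
    """Frequency response y/u of a structure in normal-mode coordinates.

    Assumed model: each mode i obeys
        q_i'' + 2*zeta_i*omega_i*q_i' + omega_i**2 * q_i = b_i * u
    and the output is y = sum_i c_i * q_i. The modes are decoupled, so the
    work per frequency point grows linearly with the number of modes.
    """
    omega_n, zeta, b, c = (np.asarray(v, dtype=float) for v in (omega_n, zeta, b, c))
    w = np.asarray(freqs, dtype=float)[:, None]           # shape (n_freq, 1)
    denom = omega_n**2 - w**2 + 2j * zeta * omega_n * w   # shape (n_freq, n_modes)
    return (c * b / denom).sum(axis=1)

def full_matrix_frf(omega_n, zeta, b, c, freqs):
    """Reference: dense solve of C (jw I - A)^-1 B at every frequency point."""
    omega_n, zeta, b, c = (np.asarray(v, dtype=float) for v in (omega_n, zeta, b, c))
    n = len(omega_n)
    # State vector is [q, q'], so A has the usual second-order block form.
    A = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-np.diag(omega_n**2), -np.diag(2 * zeta * omega_n)]])
    B = np.concatenate([np.zeros(n), b])
    C = np.concatenate([c, np.zeros(n)])
    eye = np.eye(2 * n)
    return np.array([C @ np.linalg.solve(1j * w * eye - A, B) for w in freqs])
```

Both functions return the same complex response; the dense version factorizes a 2n-by-2n matrix at every frequency, which is where the quadratic-to-cubic cost of full-matrix techniques comes from.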
A new approach to integrate GPU-based Monte Carlo simulation into inverse treatment plan optimization for proton therapy.

PubMed

Li, Yongbao; Tian, Zhen; Song, Ting; Wu, Zhaoxia; Liu, Yaqiang; Jiang, Steve; Jia, Xun

2017-01-07

Monte Carlo (MC)-based spot dose calculation is highly desired for inverse treatment planning in proton therapy because of its accuracy. Recent studies on biological optimization have also indicated the use of MC methods to compute relevant quantities of interest, e.g. linear energy transfer. Although GPU-based MC engines have been developed to address inverse optimization problems, their efficiency still needs to be improved. Also, the use of a large number of GPUs in MC calculation is not favorable for clinical applications. The previously proposed adaptive particle sampling (APS) method can improve the efficiency of MC-based inverse optimization by using the computationally expensive MC simulation more effectively. This method is more efficient than the conventional approach that performs spot dose calculation and optimization in two sequential steps. In this paper, we propose a computational library to perform MC-based spot dose calculation on GPU with the APS scheme. The implemented APS method performs a non-uniform sampling of the particles from pencil beam spots during the optimization process, favoring those from the high intensity spots. The library also conducts two computationally intensive matrix-vector operations frequently used when solving an optimization problem. This library design allows a streamlined integration of the MC-based spot dose calculation into an existing proton therapy inverse planning process. We tested the developed library in a typical inverse optimization system with four patient cases. The library achieved the targeted functions by supporting inverse planning in various proton therapy schemes, e.g. single field uniform dose, 3D intensity modulated proton therapy, and distal edge tracking. The efficiency was 41.6 ± 15.3% higher than the use of a GPU-based MC package in a conventional calculation scheme. The total computation time ranged between 2 and 50 min on a single GPU card depending on the problem size.
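The non-uniform sampling idea in this abstract can be sketched as follows. This is only an illustration under a stated assumption: particles are allocated to spots with probability proportional to the current spot intensity. The abstract says only that high-intensity spots are favored, so the exact weighting used by APS may differ, and the function name `allocate_particles` is hypothetical.

```python
import numpy as np

def allocate_particles(spot_intensities, n_particles, rng):
    """Split a particle budget across pencil-beam spots.

    Assumption: the sampling probability of a spot is proportional to its
    current intensity in the optimization, so high-intensity spots receive
    more simulated particles.
    """
    weights = np.asarray(spot_intensities, dtype=float)
    probabilities = weights / weights.sum()
    # One multinomial draw returns the particle count for every spot.
    return rng.multinomial(n_particles, probabilities)

rng = np.random.default_rng(0)
counts = allocate_particles([1.0, 9.0], 100_000, rng)
```

In an optimization loop the allocation would be redrawn as the spot intensities are updated, so the expensive MC simulation is spent mostly on the spots that contribute most to the dose.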
A new approach to integrate GPU-based Monte Carlo simulation into inverse treatment plan optimization for proton therapy

NASA Astrophysics Data System (ADS)

Li, Yongbao; Tian, Zhen; Song, Ting; Wu, Zhaoxia; Liu, Yaqiang; Jiang, Steve; Jia, Xun

2017-01-01

Monte Carlo (MC)-based spot dose calculation is highly desired for inverse treatment planning in proton therapy because of its accuracy. Recent studies on biological optimization have also indicated the use of MC methods to compute relevant quantities of interest, e.g. linear energy transfer. Although GPU-based MC engines have been developed to address inverse optimization problems, their efficiency still needs to be improved. Also, the use of a large number of GPUs in MC calculation is not favorable for clinical applications. The previously proposed adaptive particle sampling (APS) method can improve the efficiency of MC-based inverse optimization by using the computationally expensive MC simulation more effectively. This method is more efficient than the conventional approach that performs spot dose calculation and optimization in two sequential steps. In this paper, we propose a computational library to perform MC-based spot dose calculation on GPU with the APS scheme. The implemented APS method performs a non-uniform sampling of the particles from pencil beam spots during the optimization process, favoring those from the high intensity spots. The library also conducts two computationally intensive matrix-vector operations frequently used when solving an optimization problem. This library design allows a streamlined integration of the MC-based spot dose calculation into an existing proton therapy inverse planning process. We tested the developed library in a typical inverse optimization system with four patient cases. The library achieved the targeted functions by supporting inverse planning in various proton therapy schemes, e.g. single field uniform dose, 3D intensity modulated proton therapy, and distal edge tracking. The efficiency was 41.6 ± 15.3% higher than the use of a GPU-based MC package in a conventional calculation scheme. The total computation time ranged between 2 and 50 min on a single GPU card depending on the problem size.
A New Approach to Integrate GPU-based Monte Carlo Simulation into Inverse Treatment Plan Optimization for Proton Therapy

PubMed Central

Li, Yongbao; Tian, Zhen; Song, Ting; Wu, Zhaoxia; Liu, Yaqiang; Jiang, Steve; Jia, Xun

2016-01-01

Monte Carlo (MC)-based spot dose calculation is highly desired for inverse treatment planning in proton therapy because of its accuracy. Recent studies on biological optimization have also indicated the use of MC methods to compute relevant quantities of interest, e.g. linear energy transfer. Although GPU-based MC engines have been developed to address inverse optimization problems, their efficiency still needs to be improved. Also, the use of a large number of GPUs in MC calculation is not favorable for clinical applications. The previously proposed adaptive particle sampling (APS) method can improve the efficiency of MC-based inverse optimization by using the computationally expensive MC simulation more effectively. This method is more efficient than the conventional approach that performs spot dose calculation and optimization in two sequential steps. In this paper, we propose a computational library to perform MC-based spot dose calculation on GPU with the APS scheme. The implemented APS method performs a non-uniform sampling of the particles from pencil beam spots during the optimization process, favoring those from the high intensity spots. The library also conducts two computationally intensive matrix-vector operations frequently used when solving an optimization problem. This library design allows a streamlined integration of the MC-based spot dose calculation into an existing proton therapy inverse planning process. We tested the developed library in a typical inverse optimization system with four patient cases. The library achieved the targeted functions by supporting inverse planning in various proton therapy schemes, e.g. single field uniform dose, 3D intensity modulated proton therapy, and distal edge tracking. The efficiency was 41.6±15.3% higher than the use of a GPU-based MC package in a conventional calculation scheme. The total computation time ranged between 2 and 50 min on a single GPU card depending on the problem size. PMID:27991456
Parallel high-precision orbit propagation using the modified Picard-Chebyshev method

NASA Astrophysics Data System (ADS)

Koblick, Darin C.

2012-03-01

The modified Picard-Chebyshev method, when run in parallel, is thought to be more accurate and faster than the most efficient sequential numerical integration techniques when applied to orbit propagation problems. Previous experiments have shown that the modified Picard-Chebyshev method can have up to a one order of magnitude speedup over the 12th order Runge-Kutta-Nystrom method. For this study, the evaluation of the accuracy and computational time of the modified Picard-Chebyshev method, using the Java Astrodynamics Toolkit high-precision force model, is conducted to assess its runtime performance. Simulation results of the modified Picard-Chebyshev method, implemented in MATLAB and the MATLAB Parallel Computing Toolbox, are compared against the most efficient first and second order Ordinary Differential Equation (ODE) solvers. A total of six processors were used to assess the runtime performance of the modified Picard-Chebyshev method. It was found that for all orbit propagation test cases, where the gravity model was simulated to be of higher degree and order (above 225 to increase computational overhead), the modified Picard-Chebyshev method was faster, by as much as a factor of two, than the other ODE solvers which were tested.
A network-based multi-target computational estimation scheme for anticoagulant activities of compounds.

PubMed

Li, Qian; Li, Xudong; Li, Canghai; Chen, Lirong; Song, Jun; Tang, Yalin; Xu, Xiaojie

2011-03-22

Traditional virtual screening methods pay more attention to the predicted binding affinity between a drug molecule and a target related to a certain disease than to phenotypic data of the drug molecule against the disease system, and are therefore often less effective in discovering drugs used to treat many types of complex diseases. Virtual screening against a complex disease by general network estimation has become feasible with the development of network biology and systems biology. More effective methods of computational estimation for the whole efficacy of a compound in a complex disease system are needed, given the distinct weights of the different targets in a biological process and the standpoint that partial inhibition of several targets can be more efficient than the complete inhibition of a single target. We developed a novel approach by integrating the affinity predictions from multi-target docking studies with biological network efficiency analysis to estimate the anticoagulant activities of compounds. From the results of the network efficiency calculation for the human clotting cascade, factor Xa and thrombin were identified as the two most fragile enzymes, while the catalytic reaction mediated by complex IXa:VIIIa and the formation of the complex VIIIa:IXa were recognized as the two most fragile biological matter in the human clotting cascade system. Furthermore, the method which combined network efficiency with molecular docking scores was applied to estimate the anticoagulant activities of a series of argatroban intermediates and eight natural products respectively. The better correlation (r = 0.671) between the experimental data and the decrease of the network efficiency suggests that the approach could be a promising computational systems biology tool to aid identification of anticoagulant activities of compounds in drug discovery.
This article proposes a network-based multi-target computational estimation method for anticoagulant activities of compounds by combining network efficiency analysis with scoring function from molecular docking.
A Network-Based Multi-Target Computational Estimation Scheme for Anticoagulant Activities of Compounds

PubMed Central

Li, Canghai; Chen, Lirong; Song, Jun; Tang, Yalin; Xu, Xiaojie

2011-01-01

Background Traditional virtual screening methods pay more attention to the predicted binding affinity between a drug molecule and a target related to a certain disease than to phenotypic data of the drug molecule against the disease system, and are therefore often less effective in discovering drugs used to treat many types of complex diseases. Virtual screening against a complex disease by general network estimation has become feasible with the development of network biology and systems biology. More effective methods of computational estimation for the whole efficacy of a compound in a complex disease system are needed, given the distinct weights of the different targets in a biological process and the standpoint that partial inhibition of several targets can be more efficient than the complete inhibition of a single target. Methodology We developed a novel approach by integrating the affinity predictions from multi-target docking studies with biological network efficiency analysis to estimate the anticoagulant activities of compounds. From the results of the network efficiency calculation for the human clotting cascade, factor Xa and thrombin were identified as the two most fragile enzymes, while the catalytic reaction mediated by complex IXa:VIIIa and the formation of the complex VIIIa:IXa were recognized as the two most fragile biological matter in the human clotting cascade system. Furthermore, the method which combined network efficiency with molecular docking scores was applied to estimate the anticoagulant activities of a series of argatroban intermediates and eight natural products respectively. The better correlation (r = 0.671) between the experimental data and the decrease of the network efficiency suggests that the approach could be a promising computational systems biology tool to aid identification of anticoagulant activities of compounds in drug discovery.
Conclusions This article proposes a network-based multi-target computational estimation method for anticoagulant activities of compounds by combining network efficiency analysis with scoring function from molecular docking. PMID:21445339
Efficient scatter model for simulation of ultrasound images from computed tomography data

NASA Astrophysics Data System (ADS)

D'Amato, J. P.; Lo Vercio, L.; Rubi, P.; Fernandez Vera, E.; Barbuzza, R.; Del Fresno, M.; Larrabide, I.

2015-12-01

Background and motivation: Real-time ultrasound simulation refers to the process of computationally creating fully synthetic ultrasound images instantly. Due to the high value of specialized low cost training for healthcare professionals, there is a growing interest in the use of this technology and the development of high fidelity systems that simulate the acquisition of echographic images. The objective is to create an efficient and reproducible simulator that can run either on notebooks or desktops using low cost devices. Materials and methods: We present an interactive ultrasound simulator based on CT data. This simulator is based on ray-casting and provides real-time interaction capabilities. The simulation of scattering that is coherent with the transducer position in real time is also introduced. Such noise is produced using a simplified model of multiplicative noise and convolution with point spread functions (PSF) tailored for this purpose. Results: The generation of scattering maps was revised to improve its computational efficiency. This allowed a more efficient simulation of coherent scattering in the synthetic echographic images while providing highly realistic results. We describe some quality and performance metrics to validate these results, where a performance of up to 55 fps was achieved. Conclusion: The proposed technique for real-time scattering modeling provides realistic yet computationally efficient scatter distributions. The error between the original image and the simulated scattering image was compared for the proposed method and the state-of-the-art, showing negligible differences in its distribution.
A stochastic method for computing hadronic matrix elements

DOE PAGES

Alexandrou, Constantia; Constantinou, Martha; Dinter, Simon; ...

2014-01-24

In this study, we present a stochastic method for the calculation of baryon 3-point functions which is an alternative to the typically used sequential method offering more versatility. We analyze the scaling of the error of the stochastically evaluated 3-point function with the lattice volume and find a favorable signal to noise ratio suggesting that the stochastic method can be extended to large volumes providing an efficient approach to compute hadronic matrix elements and form factors.
Numerical simulation using vorticity-vector potential formulation

NASA Technical Reports Server (NTRS)

Tokunaga, Hiroshi

1993-01-01

An accurate and efficient computational method is needed for three-dimensional incompressible viscous flows in engineering applications. When solving turbulent shear flows directly or using a subgrid scale model, it is indispensable to resolve the small scale fluid motions as well as the large scale motions. From this point of view, the pseudo-spectral method has so far been used as the computational method. However, the finite difference and finite element methods are widely applied for computing flows of practical importance since these methods are easily applied to flows with complex geometric configurations. There are, however, several problems in applying the finite difference method to direct and large eddy simulations. Accuracy is one of the most important problems. This point was already addressed by the present author in direct simulations of the instability of plane Poiseuille flow and of the transition to turbulence. In order to obtain high efficiency, the multi-grid Poisson solver is combined with the higher-order, accurate finite difference method. The formulation method is also one of the most important problems in applying the finite difference method to incompressible turbulent flows. The three-dimensional Navier-Stokes equations have been solved so far in the primitive variables formulation. One of the major difficulties of this method is the rigorous satisfaction of the equation of continuity. In general, the staggered grid is used for the satisfaction of the solenoidal condition for the velocity field at the wall boundary. However, the velocity field satisfies the equation of continuity automatically in the vorticity-vector potential formulation. From this point of view, the vorticity-vector potential method was extended to the generalized coordinate system.
In the present article, we adopt the vorticity-vector potential formulation, the generalized coordinate system, and the 4th-order accurate difference method as the computational method. We present the computational method and apply the present method to computations of flows in a square cavity at large Reynolds number in order to investigate its effectiveness.
Improved Collision-Detection Method for Robotic Manipulator

NASA Technical Reports Server (NTRS)

Leger, Chris

2003-01-01

An improved method has been devised for the computational prediction of a collision between (1) a robotic manipulator and (2) another part of the robot or an external object in the vicinity of the robot. The method is intended to be used to test commanded manipulator trajectories in advance so that execution of the commands can be stopped before damage is done. The method involves utilization of both (1) mathematical models of the robot and its environment constructed manually prior to operation and (2) similar models constructed automatically from sensory data acquired during operation. The representation of objects in this method is simpler and more efficient (with respect to both computation time and computer memory), relative to the representations used in most prior methods. The present method was developed especially for use on a robotic land vehicle (rover) equipped with a manipulator arm and a vision system that includes stereoscopic electronic cameras. In this method, objects are represented and collisions detected by use of a previously developed technique known in the art as the method of oriented bounding boxes (OBBs). As the name of this technique indicates, an object is represented approximately, for computational purposes, by a box that encloses its outer boundary. Because many parts of a robotic manipulator are cylindrical, the OBB method has been extended in this method to enable the approximate representation of cylindrical parts by use of octagonal or other multiple-OBB assemblies denoted oriented bounding prisms (OBPs), as in the example of Figure 1. Unlike prior methods, the OBB/OBP method does not require any divisions or transcendental functions; this feature leads to greater robustness and numerical accuracy. The OBB/OBP method was selected for incorporation into the present method because it offers the best compromise between accuracy on the one hand and computational efficiency (and thus computational speed) on the other hand.
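The oriented-bounding-box overlap check mentioned above is commonly implemented with the separating-axis test, sketched below. This is an assumed, generic version of that test for two boxes and not the rover software itself; the function name `obb_overlap` and its argument layout are hypothetical. Consistent with the abstract's remark, the test needs no divisions or transcendental functions, because the candidate axes do not have to be normalized.

```python
import numpy as np

def obb_overlap(center_a, axes_a, half_a, center_b, axes_b, half_b, eps=1e-9):
    """Separating-axis test for two 3D oriented bounding boxes.

    axes_* : 3x3 arrays whose rows are the orthonormal box axes.
    half_* : half-extents of the box along each of its axes.
    Returns True when the boxes overlap, False when an axis separates them.
    """
    axes_a = np.asarray(axes_a, dtype=float)
    axes_b = np.asarray(axes_b, dtype=float)
    half_a = np.asarray(half_a, dtype=float)
    half_b = np.asarray(half_b, dtype=float)
    offset = np.asarray(center_b, dtype=float) - np.asarray(center_a, dtype=float)

    # 15 candidate axes: 3 face normals per box plus 9 edge-edge cross products.
    candidates = list(axes_a) + list(axes_b)
    candidates += [np.cross(u, v) for u in axes_a for v in axes_b]

    for axis in candidates:
        if np.dot(axis, axis) < eps:
            continue  # parallel edges give a degenerate cross product
        # Projected radius of each box on the axis. Both sides of the
        # comparison scale with the axis length, so no normalization is needed.
        radius_a = np.sum(half_a * np.abs(axes_a @ axis))
        radius_b = np.sum(half_b * np.abs(axes_b @ axis))
        if abs(np.dot(offset, axis)) > radius_a + radius_b:
            return False  # found a separating axis
    return True
```

A cylindrical link approximated by several boxes, as in the oriented bounding prisms described above, could be checked by running this test for each box in the assembly and reporting a collision if any pair overlaps.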
FORTRAN program for calculating total efficiency - specific speed characteristics of centrifugal compressors

NASA Technical Reports Server (NTRS)

Galvas, M. R.

1972-01-01

A computer program for predicting design point specific speed - efficiency characteristics of centrifugal compressors is presented with instructions for its use. The method permits rapid selection of compressor geometry that yields maximum total efficiency for a particular application. A numerical example is included to demonstrate the selection procedure.

A Novel Method for Estimating Shortwave Direct Radiative Effect of Above-cloud Aerosols over Ocean Using CALIOP and MODIS Data

NASA Technical Reports Server (NTRS)

Zhang, Z.; Meyer, K.; Platnick, S.; Oreopoulos, L.; Lee, D.; Yu, H.

2013-01-01

This paper describes an efficient and unique method for computing the shortwave direct radiative effect (DRE) of aerosol residing above low-level liquid-phase clouds using CALIOP and MODIS data. It accounts for the overlapping of aerosol and cloud rigorously by utilizing the joint histogram of cloud optical depth and cloud top pressure. Effects of sub-grid scale cloud and aerosol variations on DRE are accounted for. It is computationally efficient through using grid-level cloud and aerosol statistics, instead of pixel-level products, and a pre-computed look-up table in radiative transfer calculations. We verified that for smoke over the southeast Atlantic Ocean the method yields a seasonal mean instantaneous shortwave DRE that generally agrees with more rigorous pixel-level computation within 4%. We have also computed the annual mean instantaneous shortwave DRE of light-absorbing aerosols (i.e., smoke and polluted dust) over global ocean based on 4 yr of CALIOP and MODIS data. We found that the variability of the annual mean shortwave DRE of above-cloud light-absorbing aerosol is mainly driven by the optical depth of the underlying clouds.
A Novel Method for Estimating Shortwave Direct Radiative Effect of Above-Cloud Aerosols Using CALIOP and MODIS Data

NASA Technical Reports Server (NTRS)

Zhang, Z.; Meyer, K.; Platnick, S.; Oreopoulos, L.; Lee, D.; Yu, H.

2014-01-01

This paper describes an efficient and unique method for computing the shortwave direct radiative effect (DRE) of aerosol residing above low-level liquid-phase clouds using CALIOP and MODIS data. It accounts for the overlapping of aerosol and cloud rigorously by utilizing the joint histogram of cloud optical depth and cloud top pressure. Effects of sub-grid scale cloud and aerosol variations on DRE are accounted for. It is computationally efficient through using grid-level cloud and aerosol statistics, instead of pixel-level products, and a pre-computed look-up table in radiative transfer calculations. We verified that for smoke over the southeast Atlantic Ocean the method yields a seasonal mean instantaneous shortwave DRE that generally agrees with more rigorous pixel-level computation within 4%. We have also computed the annual mean instantaneous shortwave DRE of light-absorbing aerosols (i.e., smoke and polluted dust) over global ocean based on 4 yr of CALIOP and MODIS data. We found that the variability of the annual mean shortwave DRE of above-cloud light-absorbing aerosol is mainly driven by the optical depth of the underlying clouds.
Memory-optimized shift operator alternating direction implicit finite difference time domain method for plasma

NASA Astrophysics Data System (ADS)

Song, Wanjun; Zhang, Hou

2017-11-01

Through introducing the alternating direction implicit (ADI) technique and the memory-optimized algorithm to the shift operator (SO) finite difference time domain (FDTD) method, the memory-optimized SO-ADI FDTD for nonmagnetized collisional plasma is proposed and the corresponding formulae of the proposed method for programming are deduced. In order to further the computational efficiency, the iteration method rather than Gauss elimination method is employed to solve the equation set in the derivation of the formulae. Complicated transformations and convolutions are avoided in the proposed method compared with the Z transforms (ZT) ADI FDTD method and the piecewise linear JE recursive convolution (PLJERC) ADI FDTD method. The numerical dispersion of the SO-ADI FDTD method with different plasma frequencies and electron collision frequencies is analyzed and the appropriate ratio of grid size to the minimum wavelength is given. The accuracy of the proposed method is validated by the reflection coefficient test on a nonmagnetized collisional plasma sheet. The testing results show that the proposed method is advantageous for improving computational efficiency and saving computer memory. The reflection coefficient of a perfect electric conductor (PEC) sheet covered by multilayer plasma and the RCS of the objects coated by plasma are calculated by the proposed method and the simulation results are analyzed.
Eigenvalue sensitivity analysis of planar frames with variable joint and support locations

NASA Technical Reports Server (NTRS)

Chuang, Ching H.; Hou, Gene J. W.

1991-01-01

Two sensitivity equations are derived in this study based upon the continuum approach for eigenvalue sensitivity analysis of planar frame structures with variable joint and support locations. A variational form of an eigenvalue equation is first derived in which all of the quantities are expressed in the local coordinate system attached to each member. The material derivative of this variational equation is then sought to account for changes in each member's length and orientation resulting from the perturbation of joint and support locations. Finally, eigenvalue sensitivity equations are formulated in either domain quantities (by the domain method) or boundary quantities (by the boundary method). It is concluded that the sensitivity equation derived by the boundary method is more efficient in computation but less accurate than that of the domain method. Nevertheless, both are superior in computational efficiency to the conventional direct differentiation method and the finite difference method.
Generalized Gilat-Raubenheimer method for density-of-states calculation in photonic crystals

NASA Astrophysics Data System (ADS)

Liu, Boyuan; Johnson, Steven G.; Joannopoulos, John D.; Lu, Ling

2018-04-01

An efficient numerical algorithm is the key for accurate evaluation of density of states (DOS) in band theory. The Gilat-Raubenheimer (GR) method proposed in 1966 is an efficient linear extrapolation method which was limited to specific lattices. Here, using an affine transformation, we provide a new generalization of the original GR method to any Bravais lattice and show that it is superior to the tetrahedron method and the adaptive Gaussian broadening method. Finally, we apply our generalized GR method to compute DOS of various gyroid photonic crystals of topological degeneracies.
Efficient Wideband Numerical Simulations for Nanostructures Employing a Drude-Critical Points (DCP) Dispersive Model.

PubMed

Ren, Qiang; Nagar, Jogender; Kang, Lei; Bian, Yusheng; Werner, Ping; Werner, Douglas H

2017-05-18

A highly efficient numerical approach for simulating the wideband optical response of nano-architectures comprised of Drude-Critical Points (DCP) media (e.g., gold and silver) is proposed and validated through comparing with commercial computational software. The kernel of this algorithm is the subdomain level discontinuous Galerkin time domain (DGTD) method, which can be viewed as a hybrid of the spectral-element time-domain method (SETD) and the finite-element time-domain (FETD) method. An hp-refinement technique is applied to decrease the Degrees-of-Freedom (DoFs) and computational requirements. The collocated E-J scheme facilitates solving the auxiliary equations by converting the inversions of matrices to simpler vector manipulations. A new hybrid time stepping approach, which couples the Runge-Kutta and Newmark methods, is proposed to solve the temporal auxiliary differential equations (ADEs) with a high degree of efficiency. The advantages of this new approach, in terms of computational resource overhead and accuracy, are validated through comparison with well-known commercial software for three diverse cases, which cover both near-field and far-field properties with plane wave and lumped port sources. The presented work provides the missing link between DCP dispersive models and FETD and/or SETD based algorithms. It is a competitive candidate for numerically studying the wideband plasmonic properties of DCP media.
A parallel finite element procedure for contact-impact problems using edge-based smooth triangular element and GPU

NASA Astrophysics Data System (ADS)

Cai, Yong; Cui, Xiangyang; Li, Guangyao; Liu, Wenyang

2018-04-01

The edge-smooth finite element method (ES-FEM) can improve the computational accuracy of triangular shell elements and the mesh partition efficiency of complex models. In this paper, an approach is developed to perform explicit finite element simulations of contact-impact problems with a graphical processing unit (GPU) using a special edge-smooth triangular shell element based on ES-FEM. Of critical importance for this problem is achieving finer-grained parallelism to enable efficient data loading and to minimize communication between the device and host. Four kinds of parallel strategies are then developed to efficiently solve these ES-FEM based shell element formulas, and various optimization methods are adopted to ensure aligned memory access. Special focus is dedicated to developing an approach for the parallel construction of edge systems. A parallel hierarchy-territory contact-searching algorithm (HITA) and a parallel penalty function calculation method are embedded in this parallel explicit algorithm. Finally, the program flow is well designed, and a GPU-based simulation system is developed, using Nvidia's CUDA. Several numerical examples are presented to illustrate the high quality of the results obtained with the proposed methods. In addition, the GPU-based parallel computation is shown to significantly reduce the computing time.
Comparative analysis of autofocus functions in digital in-line phase-shifting holography.

PubMed

Fonseca, Elsa S R; Fiadeiro, Paulo T; Pereira, Manuela; Pinheiro, António

2016-09-20

Numerical reconstruction of digital holograms relies on a precise knowledge of the original object position. However, there are a number of relevant applications where this parameter is not known in advance and an efficient autofocusing method is required. This paper addresses the problem of finding optimal focusing methods for use in reconstruction of digital holograms of macroscopic amplitude and phase objects, using digital in-line phase-shifting holography in transmission mode. Fifteen autofocus measures, including spatial-, spectral-, and sparsity-based methods, were evaluated for both synthetic and experimental holograms. The Fresnel transform and the angular spectrum reconstruction methods were compared. Evaluation criteria included unimodality, accuracy, resolution, and computational cost. Autofocusing under angular spectrum propagation tends to perform better with respect to accuracy and unimodality criteria. Phase objects are, generally, more difficult to focus than amplitude objects. The normalized variance, the standard correlation, and the Tenenbaum gradient are the most reliable spatial-based metrics, combining computational efficiency with good accuracy and resolution. A good trade-off between focus performance and computational cost was found for the Fresnelet sparsity method.
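Two of the spatial focus metrics named in the abstract above, the normalized variance and the Tenenbaum gradient, can be sketched in a few lines. This is a minimal illustration using their textbook definitions, not the authors' implementation: the images are plain nested lists of amplitudes, and box_blur and best_focus are hypothetical helpers introduced only to build a toy focus stack. The hologram reconstruction itself (Fresnel transform or angular spectrum) is outside the sketch.

```python
def normalized_variance(img):
    """Intensity variance divided by the mean intensity."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    if mean == 0:
        return 0.0
    return sum((p - mean) ** 2 for p in pixels) / (len(pixels) * mean)


def tenengrad(img):
    """Tenenbaum gradient: sum of squared Sobel gradient magnitudes."""
    h, w = len(img), len(img[0])
    total = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            total += gx * gx + gy * gy
    return total


def box_blur(img):
    """Hypothetical stand-in for defocus: 3x3 mean filter, clamped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) / len(vals)
    return out


def best_focus(stack, metric):
    """Index of the image in the stack that maximizes the focus metric."""
    scores = [metric(img) for img in stack]
    return max(range(len(stack)), key=scores.__getitem__)


# Toy focus stack: a sharp step edge in the middle, blurred copies around it.
sharp = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
blur1 = box_blur(sharp)
blur2 = box_blur(blur1)
stack = [blur2, blur1, sharp, blur1, blur2]
print(best_focus(stack, normalized_variance), best_focus(stack, tenengrad))
```

Both metrics cost one pass over the image, which is consistent with the abstract's remark that spatial metrics combine computational efficiency with good accuracy.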
A space radiation transport method development

NASA Technical Reports Server (NTRS)

Wilson, J. W.; Tripathi, R. K.; Qualls, G. D.; Cucinotta, F. A.; Prael, R. E.; Norbury, J. W.; Heinbockel, J. H.; Tweed, J.

2004-01-01

Improved spacecraft shield design requires early entry of radiation constraints into the design process to maximize performance and minimize costs. As a result, we have been investigating high-speed computational procedures to allow shield analysis from the preliminary design concepts to the final design. In particular, we will discuss the progress towards a full three-dimensional and computationally efficient deterministic code for which the current HZETRN evaluates the lowest-order asymptotic term. HZETRN is the first deterministic solution to the Boltzmann equation allowing field mapping within the International Space Station (ISS) in tens of minutes using standard finite element method (FEM) geometry common to engineering design practice enabling development of integrated multidisciplinary design optimization methods. A single ray trace in ISS FEM geometry requires 14 ms and severely limits application of Monte Carlo methods to such engineering models. A potential means of improving the Monte Carlo efficiency in coupling to spacecraft geometry is given in terms of re-configurable computing and could be utilized in the final design as verification of the deterministic method optimized design. Published by Elsevier Ltd on behalf of COSPAR.
Decomposition method for fast computation of gigapixel-sized Fresnel holograms on a graphics processing unit cluster.

PubMed

Jackin, Boaz Jessie; Watanabe, Shinpei; Ootsu, Kanemitsu; Ohkawa, Takeshi; Yokota, Takashi; Hayasaki, Yoshio; Yatagai, Toyohiko; Baba, Takanobu

2018-04-20

A parallel computation method for large-size Fresnel computer-generated hologram (CGH) is reported. The method was introduced by us in an earlier report as a technique for calculating Fourier CGH from 2D object data. In this paper we extend the method to compute Fresnel CGH from 3D object data. The scale of the computation problem is also expanded to 2 gigapixels, making it closer to real application requirements. The significant feature of the reported method is its ability to avoid communication overhead and thereby fully utilize the computing power of parallel devices. The method exhibits three layers of parallelism that favor small to large scale parallel computing machines. Simulation and optical experiments were conducted to demonstrate the workability and to evaluate the efficiency of the proposed technique. A two-times improvement in computation speed has been achieved compared to the conventional method, on a 16-node cluster (one GPU per node) utilizing only one layer of parallelism. A 20-times improvement in computation speed has been estimated utilizing two layers of parallelism on a very large-scale parallel machine with 16 nodes, where each node has 16 GPUs.
Efficient computation of PDF-based characteristics from diffusion MR signal.

PubMed

Assemlal, Haz-Edine; Tschumperlé, David; Brun, Luc

2008-01-01

We present a general method for the computation of PDF-based characteristics of the tissue micro-architecture in MR imaging. The approach relies on the approximation of the MR signal by a series expansion based on Spherical Harmonics and Laguerre-Gaussian functions, followed by a simple projection step that is efficiently done in a finite dimensional space. The resulting algorithm is generic, flexible and is able to compute a large set of useful characteristics of the local tissues structure. We illustrate the effectiveness of this approach by showing results on synthetic and real MR datasets acquired in a clinical time-frame.
Efficiently modeling neural networks on massively parallel computers

NASA Technical Reports Server (NTRS)

Farber, Robert M.

1993-01-01

Neural networks are a very useful tool for analyzing and modeling complex real world systems. Applying neural network simulations to real world problems generally involves large amounts of data and massive amounts of computation. To efficiently handle the computational requirements of large problems, we have implemented at Los Alamos a highly efficient neural network compiler for serial computers, vector computers, vector parallel computers, and fine grain SIMD computers such as the CM-2 connection machine. This paper describes the mapping used by the compiler to implement feed-forward backpropagation neural networks for a SIMD (Single Instruction Multiple Data) architecture parallel computer. Thinking Machines Corporation has benchmarked our code at 1.3 billion interconnects per second (approximately 3 gigaflops) on a 64,000 processor CM-2 connection machine (Singer 1990). This mapping is applicable to other SIMD computers and can be implemented on MIMD computers such as the CM-5 connection machine. Our mapping has virtually no communications overhead with the exception of the communications required for a global summation across the processors (which has a sub-linear runtime growth on the order of O(log(number of processors))). We can efficiently model very large neural networks which have many neurons and interconnects, and our mapping can extend to arbitrarily large networks (within memory limitations) by merging the memory space of separate processors with fast adjacent processor interprocessor communications. This paper will consider the simulation of only feed-forward neural networks, although this method is extendable to recurrent networks.
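The O(log(number of processors)) cost quoted above for the global summation is what a pairwise (binary-tree) reduction gives. The sketch below simulates such a reduction serially and counts the communication rounds. It is an illustration of the general technique only: tree_sum is a hypothetical name, each list entry stands in for one processor's partial sum, and nothing here reflects the actual CM-2 code.

```python
def tree_sum(values):
    """Simulate a binary-tree global sum; return (total, communication rounds)."""
    vals = list(values)
    if not vals:
        raise ValueError("need at least one partial sum")
    n = len(vals)
    stride = 1
    rounds = 0
    while stride < n:
        # In one round, every "processor" i adds the value held by i + stride.
        # On real hardware these additions would happen simultaneously.
        for i in range(0, n - stride, 2 * stride):
            vals[i] += vals[i + stride]
        stride *= 2
        rounds += 1
    return vals[0], rounds


total, rounds = tree_sum(range(1, 9))   # 8 partial sums
print(total, rounds)                    # 36 after 3 rounds
```

Doubling the number of partial sums adds one round, so 65,536 processors need 16 rounds rather than 65,535 sequential additions.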
Improving multivariate Horner schemes with Monte Carlo tree search

NASA Astrophysics Data System (ADS)

Kuipers, J.; Plaat, A.; Vermaseren, J. A. M.; van den Herik, H. J.

2013-11-01

Optimizing the cost of evaluating a polynomial is a classic problem in computer science. For polynomials in one variable, Horner's method provides a scheme for producing a computationally efficient form. For multivariate polynomials it is possible to generalize Horner's method, but this leaves freedom in the order of the variables. Traditionally, greedy schemes like most-occurring variable first are used. This simple textbook algorithm has given remarkably efficient results. Finding better algorithms has proved difficult. In trying to improve upon the greedy scheme we have implemented Monte Carlo tree search, a recent search method from the field of artificial intelligence. This results in better Horner schemes and reduces the cost of evaluating polynomials, sometimes by factors up to two.
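The greedy baseline mentioned above (most-occurring variable first) can be sketched as follows. This is a minimal illustration, not the authors' code and not the Monte Carlo tree search variant: a polynomial is assumed to be a dict mapping exponent tuples to coefficients, and greedy_horner, evaluate, and naive_evaluate are hypothetical names. The sketch counts multiplications so that the saving over term-by-term evaluation is visible.

```python
def greedy_horner(poly, nvars):
    """Build a nested Horner tree, always factoring out the variable
    that occurs in the largest number of terms."""
    poly = {e: c for e, c in poly.items() if c != 0}
    if not poly:
        return ("const", 0)
    counts = [sum(1 for e in poly if e[i] > 0) for i in range(nvars)]
    best = max(range(nvars), key=lambda i: counts[i])
    if counts[best] == 0:
        # Only the constant term is left.
        return ("const", poly[(0,) * nvars])
    with_var, without = {}, {}
    for e, c in poly.items():
        if e[best] > 0:
            reduced = e[:best] + (e[best] - 1,) + e[best + 1:]
            with_var[reduced] = c
        else:
            without[e] = c
    # Node meaning: x_best * (with_var) + (without)
    return ("horner", best,
            greedy_horner(with_var, nvars),
            greedy_horner(without, nvars))


def evaluate(tree, point):
    """Evaluate a Horner tree; return (value, number of multiplications)."""
    if tree[0] == "const":
        return tree[1], 0
    _, var, q, r = tree
    qv, qm = evaluate(q, point)
    rv, rm = evaluate(r, point)
    return point[var] * qv + rv, qm + rm + 1


def naive_evaluate(poly, point):
    """Term-by-term evaluation; return (value, number of multiplications)."""
    total, mults = 0, 0
    for e, c in poly.items():
        term = c
        for i, power in enumerate(e):
            for _ in range(power):
                term *= point[i]
                mults += 1
        total += term
    return total, mults


# x^2*y + x*y + x + y^2 + 3, evaluated at (x, y) = (2, 3)
poly = {(2, 1): 1, (1, 1): 1, (1, 0): 1, (0, 2): 1, (0, 0): 3}
print(evaluate(greedy_horner(poly, 2), (2, 3)))   # (32, 5)
print(naive_evaluate(poly, (2, 3)))               # (32, 8)
```

The freedom the abstract refers to is the choice of `best` at each level. A search method would explore alternative variable orders instead of always taking the greedy one.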
Do Computers Improve the Drawing of a Geometrical Figure for 10 Year-Old Children?

ERIC Educational Resources Information Center

Martin, Perrine; Velay, Jean-Luc

2012-01-01

Nowadays, computer aided design (CAD) is widely used by designers. Would children learn to draw more easily and more efficiently if they were taught with computerised tools? To answer this question, we made an experiment designed to compare two methods for children to do the same drawing: the classical "pen and paper" method and a CAD…
Board-foot and Cubic-foot Volume Computing Equations for Southeastern Tree Species

Treesearch

Mackay B. Bryan; Joe P. McClure

1962-01-01

Wide acceptance of Bitterlich's (2) method of sampling, popularized in this country by Grosenbaugh (3), with adaptations such as the variable plot used by Forest Survey in the Southeast, has opened a new era in forest surveying. The efficiency of these sampling methods, accompanied by the timely availability of electronic computing machines, has made it feasible...
Efficient and accurate two-scale FE-FFT-based prediction of the effective material behavior of elasto-viscoplastic polycrystals

NASA Astrophysics Data System (ADS)

Kochmann, Julian; Wulfinghoff, Stephan; Ehle, Lisa; Mayer, Joachim; Svendsen, Bob; Reese, Stefanie

2018-06-01

Recently, two-scale FE-FFT-based methods (e.g., Spahn et al. in Comput Methods Appl Mech Eng 268:871-883, 2014; Kochmann et al. in Comput Methods Appl Mech Eng 305:89-110, 2016) have been proposed to predict the microscopic and overall mechanical behavior of heterogeneous materials. The purpose of this work is the extension to elasto-viscoplastic polycrystals, efficient and robust Fourier solvers and the prediction of micromechanical fields during macroscopic deformation processes. Assuming scale separation, the macroscopic problem is solved using the finite element method. The solution of the microscopic problem, which is embedded as a periodic unit cell (UC) in each macroscopic integration point, is found by employing fast Fourier transforms, fixed-point and Newton-Krylov methods. The overall material behavior is defined by the mean UC response. In order to ensure spatially converged micromechanical fields as well as feasible overall CPU times, an efficient but simple solution strategy for two-scale simulations is proposed. As an example, the constitutive behavior of 42CrMo4 steel is predicted during macroscopic three-point bending tests.
Efficient and accurate two-scale FE-FFT-based prediction of the effective material behavior of elasto-viscoplastic polycrystals

NASA Astrophysics Data System (ADS)

Kochmann, Julian; Wulfinghoff, Stephan; Ehle, Lisa; Mayer, Joachim; Svendsen, Bob; Reese, Stefanie

2017-09-01

Recently, two-scale FE-FFT-based methods (e.g., Spahn et al. in Comput Methods Appl Mech Eng 268:871-883, 2014; Kochmann et al. in Comput Methods Appl Mech Eng 305:89-110, 2016) have been proposed to predict the microscopic and overall mechanical behavior of heterogeneous materials. The purpose of this work is the extension to elasto-viscoplastic polycrystals, efficient and robust Fourier solvers and the prediction of micromechanical fields during macroscopic deformation processes. Assuming scale separation, the macroscopic problem is solved using the finite element method. The solution of the microscopic problem, which is embedded as a periodic unit cell (UC) in each macroscopic integration point, is found by employing fast Fourier transforms, fixed-point and Newton-Krylov methods. The overall material behavior is defined by the mean UC response. In order to ensure spatially converged micromechanical fields as well as feasible overall CPU times, an efficient but simple solution strategy for two-scale simulations is proposed. As an example, the constitutive behavior of 42CrMo4 steel is predicted during macroscopic three-point bending tests.
Item Difficulty in the Evaluation of Computer-Based Instruction: An Example from Neuroanatomy

PubMed Central

Chariker, Julia H.; Naaz, Farah; Pani, John R.

2012-01-01

This article reports large item effects in a study of computer-based learning of neuroanatomy. Outcome measures of the efficiency of learning, transfer of learning, and generalization of knowledge diverged by a wide margin across test items, with certain sets of items emerging as particularly difficult to master. In addition, the outcomes of comparisons between instructional methods changed with the difficulty of the items to be learned. More challenging items better differentiated between instructional methods. This set of results is important for two reasons. First, it suggests that instruction may be more efficient if sets of consistently difficult items are the targets of instructional methods particularly suited to them. Second, there is wide variation in the published literature regarding the outcomes of empirical evaluations of computer-based instruction. As a consequence, many questions arise as to the factors that may affect such evaluations. The present paper demonstrates that the level of challenge in the material that is presented to learners is an important factor to consider in the evaluation of a computer-based instructional system. PMID:22231801
Direct differentiation of the quasi-incompressible fluid formulation of fluid-structure interaction using the PFEM

NASA Astrophysics Data System (ADS)

Zhu, Minjie; Scott, Michael H.

2017-07-01

Accurate and efficient response sensitivities for fluid-structure interaction (FSI) simulations are important for assessing the uncertain response of coastal and off-shore structures to hydrodynamic loading. To compute gradients efficiently via the direct differentiation method (DDM) for the fully incompressible fluid formulation, approximations of the sensitivity equations are necessary, leading to inaccuracies of the computed gradients when the geometry of the fluid mesh changes rapidly between successive time steps or the fluid viscosity is nonzero. To maintain accuracy of the sensitivity computations, a quasi-incompressible fluid is assumed for the response analysis of FSI using the particle finite element method and DDM is applied to this formulation, resulting in linearized equations for the response sensitivity that are consistent with those used to compute the response. Both the response and the response sensitivity can be solved using the same unified fractional step method. FSI simulations show that although the response using the quasi-incompressible and incompressible fluid formulations is similar, only the quasi-incompressible approach gives accurate response sensitivity for viscous, turbulent flows regardless of time step size.
Item difficulty in the evaluation of computer-based instruction: an example from neuroanatomy.

PubMed

Chariker, Julia H; Naaz, Farah; Pani, John R

2012-01-01

This article reports large item effects in a study of computer-based learning of neuroanatomy. Outcome measures of the efficiency of learning, transfer of learning, and generalization of knowledge diverged by a wide margin across test items, with certain sets of items emerging as particularly difficult to master. In addition, the outcomes of comparisons between instructional methods changed with the difficulty of the items to be learned. More challenging items better differentiated between instructional methods. This set of results is important for two reasons. First, it suggests that instruction may be more efficient if sets of consistently difficult items are the targets of instructional methods particularly suited to them. Second, there is wide variation in the published literature regarding the outcomes of empirical evaluations of computer-based instruction. As a consequence, many questions arise as to the factors that may affect such evaluations. The present article demonstrates that the level of challenge in the material that is presented to learners is an important factor to consider in the evaluation of a computer-based instructional system. Copyright © 2011 American Association of Anatomists.

Optimization Algorithm for Kalman Filter Exploiting the Numerical Characteristics of SINS/GPS Integrated Navigation Systems.

PubMed

Hu, Shaoxing; Xu, Shike; Wang, Duhu; Zhang, Aiwu

2015-11-11

To address the high computational cost of the traditional Kalman filter in SINS/GPS, this paper presents a practical optimization algorithm that uses offline derivation and parallel processing based on the numerical characteristics of the system. The algorithm exploits the sparseness and/or symmetry of matrices to simplify the computational procedure, so that many invalid operations can be avoided by offline derivation using a block matrix technique. For enhanced efficiency, a new parallel computational mechanism is established by subdividing and restructuring calculation processes after analyzing the extracted "useful" data. As a result, the algorithm saves about 90% of the CPU processing time and 66% of the memory usage needed in a classical Kalman filter. Meanwhile, as a numerical approach, the method needs no precision-losing transformation or approximation of system modules, and the accuracy suffers little in comparison with the filter before computational optimization. Furthermore, since no complicated matrix theories are needed, the algorithm can be easily ported to other modified filters as a secondary optimization method to achieve further efficiency.
Computationally efficient method for Fourier transform of highly chirped pulses for laser and parametric amplifier modeling.

PubMed

Andrianov, Alexey; Szabo, Aron; Sergeev, Alexander; Kim, Arkady; Chvykov, Vladimir; Kalashnikov, Mikhail

2016-11-14

We developed an improved approach to calculate the Fourier transform of signals with arbitrarily large quadratic phase, which can be efficiently implemented in numerical simulations utilizing the fast Fourier transform. The proposed algorithm significantly reduces the computational cost of the Fourier transform of a highly chirped and stretched pulse by splitting it into two separate transforms of almost transform-limited pulses, thereby reducing the required grid size roughly by a factor of the pulse stretching. The application of our improved Fourier transform algorithm in the split-step method for numerical modeling of CPA and OPCPA shows excellent agreement with standard algorithms.
Extension of the TDCR model to compute counting efficiencies for radionuclides with complex decay schemes.

PubMed

Kossert, K; Cassette, Ph; Carles, A Grau; Jörg, G; Gostomski, Christroph Lierse V; Nähle, O; Wolf, Ch

2014-05-01

The triple-to-double coincidence ratio (TDCR) method is frequently used to measure the activity of radionuclides decaying by pure β emission or electron capture (EC). Some radionuclides with more complex decays have also been studied, but accurate calculations of decay branches which are accompanied by many coincident γ transitions have not yet been investigated. This paper describes recent extensions of the model to make efficiency computations for more complex decay schemes possible. In particular, the MICELLE2 program that applies a stochastic approach of the free parameter model was extended. With an improved code, efficiencies for β(-), β(+) and EC branches with up to seven coincident γ transitions can be calculated. Moreover, a new parametrization for the computation of electron stopping powers has been implemented to compute the ionization quenching function of 10 commercial scintillation cocktails. In order to demonstrate the capabilities of the TDCR method, the following radionuclides are discussed: (166m)Ho (complex β(-)/γ), (59)Fe (complex β(-)/γ), (64)Cu (β(-), β(+), EC and EC/γ) and (229)Th in equilibrium with its progenies (decay chain with many α, β and complex β(-)/γ transitions). © 2013 Published by Elsevier Ltd.
Efficient volumetric estimation from plenoptic data

NASA Astrophysics Data System (ADS)

Anglin, Paul; Reeves, Stanley J.; Thurow, Brian S.

2013-03-01

The commercial release of the Lytro camera, and greater availability of plenoptic imaging systems in general, have given the image processing community cost-effective tools for light-field imaging. While this data is most commonly used to generate planar images at arbitrary focal depths, reconstruction of volumetric fields is also possible. Similarly, deconvolution is a technique that is conventionally used in planar image reconstruction, or deblurring, algorithms. However, when leveraged with the ability of a light-field camera to quickly reproduce multiple focal planes within an imaged volume, deconvolution offers a computationally efficient method of volumetric reconstruction. Related research has shown that light-field imaging systems in conjunction with tomographic reconstruction techniques are also capable of estimating the imaged volume and have been successfully applied to particle image velocimetry (PIV). However, while tomographic volumetric estimation through algorithms such as multiplicative algebraic reconstruction techniques (MART) has proven to be highly accurate, it is computationally intensive. In this paper, the reconstruction problem is shown to be solvable by deconvolution. Deconvolution offers significant improvement in computational efficiency through the use of fast Fourier transforms (FFTs) when compared to other tomographic methods. This work describes a deconvolution algorithm designed to reconstruct a 3-D particle field from simulated plenoptic data. A 3-D extension of existing 2-D FFT-based refocusing techniques is presented to further improve efficiency when computing object focal stacks and system point spread functions (PSF). Reconstruction artifacts are identified; their underlying source and methods of mitigation are explored where possible, and reconstructions of simulated particle fields are provided.
Vectorized Jiles-Atherton hysteresis model

NASA Astrophysics Data System (ADS)

Szymański, Grzegorz; Waszak, Michał

2004-01-01

This paper deals with vector hysteresis modeling. A vector model consisting of individual Jiles-Atherton components placed along the principal axes is proposed. The cross-axis coupling ensures general vector model properties. Minor loops are obtained using a scaling method. The model is intended for efficient finite element method computations defined in terms of the magnetic vector potential. Numerical efficiency is ensured by a differential susceptibility approach.
An efficient numerical model for multicomponent compressible flow in fractured porous media

NASA Astrophysics Data System (ADS)

Zidane, Ali; Firoozabadi, Abbas

2014-12-01

An efficient and accurate numerical model for multicomponent compressible single-phase flow in fractured media is presented. The discrete-fracture approach is used to model the fractures where the fracture entities are described explicitly in the computational domain. We use the concept of cross-flow equilibrium in the fractures. This allows large matrix elements in the neighborhood of the fractures and considerable speed-up of the algorithm. We use an implicit finite volume (FV) scheme to solve the species mass balance equation in the fractures. This step avoids the Courant-Friedrichs-Lewy (CFL) condition and contributes to significant speed-up of the code. The hybrid mixed finite element method (MFE) is used to solve for the velocity in both the matrix and the fractures, coupled with the discontinuous Galerkin (DG) method to solve the species transport equations in the matrix. Four numerical examples are presented to demonstrate the robustness and efficiency of the proposed model. We show that the combination of the fracture cross-flow equilibrium and the implicit composition calculation in the fractures increases the computational speed 20-130 times in 2D. In 3D, one may expect an even higher computational efficiency.
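Why an implicit finite volume update sidesteps the CFL restriction can be seen in a scalar 1-D analogue. The sketch below is an assumption-laden toy, not the paper's multicomponent fracture scheme: it advects a single concentration with first-order upwind fluxes, where nu = u*dt/dx is the Courant number, and the function names are hypothetical. The implicit update c_i_new = (c_i_old + nu * c_{i-1}_new) / (1 + nu) is a weighted average of two bounded values, so it stays bounded for any nu, while the explicit update is only stable for nu <= 1.

```python
def implicit_upwind_step(c, nu, inflow):
    """One backward-Euler upwind step for dc/dt + u dc/dx = 0 with u > 0.

    The linear system is lower bidiagonal, so one forward sweep solves it.
    """
    new = []
    upstream = inflow  # new-time value in the cell to the left
    for ci in c:
        upstream = (ci + nu * upstream) / (1.0 + nu)
        new.append(upstream)
    return new


def explicit_upwind_step(c, nu, inflow):
    """One forward-Euler upwind step; stable only for nu <= 1."""
    left = [inflow] + c[:-1]
    return [ci - nu * (ci - li) for ci, li in zip(c, left)]


nu = 5.0                      # Courant number well above the explicit limit
c_imp = [0.0] * 20
c_exp = [0.0] * 20
for _ in range(10):
    c_imp = implicit_upwind_step(c_imp, nu, inflow=1.0)
    c_exp = explicit_upwind_step(c_exp, nu, inflow=1.0)

print(max(c_imp))                    # stays within [0, 1]
print(max(abs(v) for v in c_exp))    # has grown far beyond 1
```

The implicit scheme pays for its stability with extra numerical diffusion at large nu, which is the usual trade-off for taking large time steps in small fracture cells.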
Image restoration for three-dimensional fluorescence microscopy using an orthonormal basis for efficient representation of depth-variant point-spread functions

PubMed Central

Patwary, Nurmohammed; Preza, Chrysanthe

2015-01-01

A depth-variant (DV) image restoration algorithm for wide field fluorescence microscopy, using an orthonormal basis decomposition of DV point-spread functions (PSFs), is investigated in this study. The efficient PSF representation is based on a previously developed principal component analysis (PCA), which is computationally intensive. We present an approach developed to reduce the number of DV PSFs required for the PCA computation, thereby making the PCA-based approach computationally tractable for thick samples. Restoration results from both synthetic and experimental images are consistent and show that the proposed algorithm efficiently addresses depth-induced aberration using a small number of principal components. Comparison of the PCA-based algorithm with a previously developed strata-based DV restoration algorithm demonstrates that the proposed method improves performance by 50% in terms of accuracy and simultaneously reduces the processing time by 64% using comparable computational resources. PMID:26504634
Computational methods for structural load and resistance modeling

NASA Technical Reports Server (NTRS)

Thacker, B. H.; Millwater, H. R.; Harren, S. V.

1991-01-01

An automated capability for computing structural reliability considering uncertainties in both load and resistance variables is presented. The computations are carried out using an automated Advanced Mean Value iteration algorithm (AMV+) with performance functions involving load and resistance variables obtained by both explicit and implicit methods. A complete description of the procedures used is given as well as several illustrative examples, verified by Monte Carlo Analysis. In particular, the computational methods described in the paper are shown to be quite accurate and efficient for a material nonlinear structure considering material damage as a function of several primitive random variables. The results show clearly the effectiveness of the algorithms for computing the reliability of large-scale structural systems with a maximum number of resolutions.
Cost-of-illness studies based on massive data: a prevalence-based, top-down regression approach.

PubMed

Stollenwerk, Björn; Welchowski, Thomas; Vogl, Matthias; Stock, Stephanie

2016-04-01

Despite the increasing availability of routine data, no analysis method has yet been presented for cost-of-illness (COI) studies based on massive data. We aim, first, to present such a method and, second, to assess the relevance of the associated gain in numerical efficiency. We propose a prevalence-based, top-down regression approach consisting of five steps: aggregating the data; fitting a generalized additive model (GAM); predicting costs via the fitted GAM; comparing predicted costs between prevalent and non-prevalent subjects; and quantifying the stochastic uncertainty via error propagation. To demonstrate the method, it was applied to aggregated data in the context of chronic lung disease to German sickness funds data (from 1999), covering over 7.3 million insured. To assess the gain in numerical efficiency, the computational time of the innovative approach has been compared with corresponding GAMs applied to simulated individual-level data. Furthermore, the probability of model failure was modeled via logistic regression. Applying the innovative method was reasonably fast (19 min). In contrast, regarding patient-level data, computational time increased disproportionately by sample size. Furthermore, using patient-level data was accompanied by a substantial risk of model failure (about 80 % for 6 million subjects). The gain in computational efficiency of the innovative COI method seems to be of practical relevance. Furthermore, it may yield more precise cost estimates.
A Survey of Methods for Analyzing and Improving GPU Energy Efficiency

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mittal, Sparsh; Vetter, Jeffrey S

2014-01-01

Recent years have witnessed a phenomenal growth in the computational capabilities and applications of GPUs. However, this trend has also led to a dramatic increase in their power consumption. This paper surveys research works on analyzing and improving energy efficiency of GPUs. It also provides a classification of these techniques on the basis of their main research idea. Further, it attempts to synthesize research works which compare energy efficiency of GPUs with other computing systems, e.g. FPGAs and CPUs. The aim of this survey is to provide researchers with knowledge of state-of-the-art in GPU power management and motivate them to architect highly energy-efficient GPUs of tomorrow.
Need for speed: An optimized gridding approach for spatially explicit disease simulations.

PubMed

Sellman, Stefan; Tsao, Kimberly; Tildesley, Michael J; Brommesson, Peter; Webb, Colleen T; Wennergren, Uno; Keeling, Matt J; Lindström, Tom

2018-04-01

Numerical models for simulating outbreaks of infectious diseases are powerful tools for informing surveillance and control strategy decisions. However, large-scale spatially explicit models can be limited by the amount of computational resources they require, which poses a problem when multiple scenarios need to be explored to provide policy recommendations. We introduce an easily implemented method that can reduce computation time in a standard Susceptible-Exposed-Infectious-Removed (SEIR) model without introducing any further approximations or truncations. It is based on a hierarchical infection process that operates on entire groups of spatially related nodes (cells in a grid) in order to efficiently filter out large volumes of susceptible nodes that would otherwise have required expensive calculations. After the filtering of the cells, only a subset of the nodes that were originally at risk are then evaluated for actual infection. The increase in efficiency is sensitive to the exact configuration of the grid, and we describe a simple method to find an estimate of the optimal configuration of a given landscape as well as a method to partition the landscape into a grid configuration. To investigate its efficiency, we compare the introduced methods to other algorithms and evaluate computation time, focusing on simulated outbreaks of foot-and-mouth disease (FMD) on the farm population of the USA, the UK and Sweden, as well as on three randomly generated populations with varying degree of clustering. The introduced method provided up to 500 times faster calculations than pairwise computation, and consistently performed as well or better than other available methods. This enables large scale, spatially explicit simulations such as for the entire continental USA without sacrificing realism or predictive power.
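The cell-level filtering idea described above can be sketched as a two-stage draw: bound the infection probability for a whole cell, pick candidate nodes at that over-estimated rate, then thin the candidates with the true pairwise probability. The sketch below is a rough illustration under stated assumptions and is not the authors' implementation. The kernel shape, the beta value, and all function names are hypothetical, and candidates are selected by geometric skipping, which is a substitution chosen here for simplicity. Because each node is a candidate with probability p_max and is then accepted with probability p/p_max, it is infected with exactly its pairwise probability p, so no approximation is introduced.

```python
import math
import random


def kernel(d, scale=1.0):
    """Hypothetical distance kernel; any function decreasing in d would do."""
    return 1.0 / (1.0 + (d / scale) ** 2)


def infection_prob(d, beta=0.5):
    """Per-time-step infection probability at distance d (assumed form)."""
    return 1.0 - math.exp(-beta * kernel(d))


def min_distance_to_cell(point, cell):
    """Shortest distance from a point to a cell (xmin, ymin, xmax, ymax)."""
    x, y = point
    xmin, ymin, xmax, ymax = cell
    dx = max(xmin - x, 0.0, x - xmax)
    dy = max(ymin - y, 0.0, y - ymax)
    return math.hypot(dx, dy)


def infect_cell(source, cell, nodes, rng):
    """Return the susceptible nodes in `cell` infected by `source`."""
    # Upper bound: no node in the cell is closer than the cell boundary.
    p_max = infection_prob(min_distance_to_cell(source, cell))
    if p_max <= 0.0:
        return []
    infected = []
    i = -1
    while True:
        if p_max < 1.0:
            # Jump straight to the next candidate (geometric gap), so the
            # cost scales with the number of candidates, not cell size.
            u = 1.0 - rng.random()          # uniform in (0, 1]
            i += int(math.log(u) / math.log1p(-p_max)) + 1
        else:
            i += 1
        if i >= len(nodes):
            break
        # Thinning step: accept the candidate with its true probability.
        p = infection_prob(math.dist(source, nodes[i]))
        if rng.random() < p / p_max:
            infected.append(nodes[i])
    return infected


rng = random.Random(0)
cell = (1.0, 1.0, 3.0, 3.0)
nodes = [(1.1 + 0.2 * i, 1.1 + 0.2 * j) for i in range(10) for j in range(10)]
print(len(infect_cell((0.0, 0.0), cell, nodes, rng)), "of", len(nodes), "infected")
```

For distant cells p_max is tiny, so most calls return after a single random draw. The abstract attributes its speed-up over pairwise computation to this kind of saving, and notes that it depends on how the grid is configured.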
Need for speed: An optimized gridding approach for spatially explicit disease simulations

PubMed Central

Tildesley, Michael J.; Brommesson, Peter; Webb, Colleen T.; Wennergren, Uno; Lindström, Tom

2018-01-01

Numerical models for simulating outbreaks of infectious diseases are powerful tools for informing surveillance and control strategy decisions. However, large-scale spatially explicit models can be limited by the amount of computational resources they require, which poses a problem when multiple scenarios need to be explored to provide policy recommendations. We introduce an easily implemented method that can reduce computation time in a standard Susceptible-Exposed-Infectious-Removed (SEIR) model without introducing any further approximations or truncations. It is based on a hierarchical infection process that operates on entire groups of spatially related nodes (cells in a grid) in order to efficiently filter out large volumes of susceptible nodes that would otherwise have required expensive calculations. After the filtering of the cells, only a subset of the nodes that were originally at risk are then evaluated for actual infection. The increase in efficiency is sensitive to the exact configuration of the grid, and we describe a simple method to find an estimate of the optimal configuration of a given landscape as well as a method to partition the landscape into a grid configuration. To investigate its efficiency, we compare the introduced methods to other algorithms and evaluate computation time, focusing on simulated outbreaks of foot-and-mouth disease (FMD) on the farm population of the USA, the UK and Sweden, as well as on three randomly generated populations with varying degree of clustering. The introduced method provided up to 500 times faster calculations than pairwise computation, and consistently performed as well or better than other available methods. This enables large scale, spatially explicit simulations such as for the entire continental USA without sacrificing realism or predictive power. PMID:29624574
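The cell-level filtering described in this abstract can be illustrated with a small thinning sketch. This is not the authors' implementation: the function names, the `kernel` callback, and the upper bound `p_max` are hypothetical, and the sketch assumes `p_max` really is an upper bound on the per-node infection probability inside the cell (for example, the kernel evaluated at the shortest distance between two cells). Candidates are first selected with the cheap bound, and the expensive kernel is evaluated only for them, so each node is still infected with its true probability.

```python
import math
import random

def candidates(n, p_max, rng):
    """Indices in range(n) that pass an independent Bernoulli(p_max) trial.

    Uses geometric skipping, so the work is proportional to the number of
    successes rather than to n.
    """
    if p_max <= 0.0:
        return []
    if p_max >= 1.0:
        return list(range(n))
    log_q = math.log1p(-p_max)              # log(1 - p_max), accurate for small p_max
    picked = []
    i = -1
    while True:
        u = 1.0 - rng.random()              # uniform on (0, 1], avoids log(0)
        i += 1 + int(math.log(u) / log_q)   # jump over the failed trials
        if i >= n:
            return picked
        picked.append(i)

def infect_cell(infector, cell_nodes, kernel, p_max, rng):
    """Nodes of one grid cell infected by `infector` in one time step.

    Assumption: kernel(infector, node) <= p_max for every node in the cell.
    """
    infected = []
    for i in candidates(len(cell_nodes), p_max, rng):
        node = cell_nodes[i]
        p = kernel(infector, node)          # expensive call, candidates only
        if rng.random() < p / p_max:        # thinning step: p_max * (p / p_max) = p
            infected.append(node)
    return infected
```

A call such as `infect_cell(farm, cell, kernel, p_max, random.Random(42))` evaluates `kernel` only for the few nodes that survive the coarse filter, which is where the savings would come from when `p_max` is small.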
A high-order strong stability preserving Runge-Kutta method for three-dimensional full waveform modeling and inversion of anelastic models

NASA Astrophysics Data System (ADS)

Wang, N.; Shen, Y.; Yang, D.; Bao, X.; Li, J.; Zhang, W.

2017-12-01

Accurate and efficient forward modeling methods are important for high resolution full waveform inversion. Compared with the elastic case, solving the anelastic wave equation requires more computational time, because of the need to compute additional material-independent anelastic functions. A numerical scheme with a large Courant-Friedrichs-Lewy (CFL) condition number enables us to use a large time step to simulate wave propagation, which improves computational efficiency. In this work, we apply the fourth-order strong stability preserving Runge-Kutta method with an optimal CFL coefficient to solve the anelastic wave equation. We use a fourth-order DRP/opt MacCormack scheme for the spatial discretization, and we approximate the rheological behaviors of the Earth by using the generalized Maxwell body model. With a larger CFL condition number, we find that the computational efficiency is significantly improved compared with the traditional fourth-order Runge-Kutta method. Then, we apply the scattering-integral method for calculating travel time and amplitude sensitivity kernels with respect to velocity and attenuation structures. For each source, we carry out one forward simulation and save the time-dependent strain tensor. For each station, we carry out three "backward" simulations for the three components and save the corresponding strain tensors. The sensitivity kernels at each point in the medium are the convolution of the two sets of the strain tensors. Finally, we show several synthetic tests to verify the effectiveness of the strong stability preserving Runge-Kutta method in generating accurate synthetics in full waveform modeling, and in generating accurate strain tensors for calculating sensitivity kernels at regional and global scales.
Challenges in Species Tree Estimation Under the Multispecies Coalescent Model

PubMed Central

Xu, Bo; Yang, Ziheng

2016-01-01

The multispecies coalescent (MSC) model has emerged as a powerful framework for inferring species phylogenies while accounting for ancestral polymorphism and gene tree-species tree conflict. A number of methods have been developed in the past few years to estimate the species tree under the MSC. The full likelihood methods (including maximum likelihood and Bayesian inference) average over the unknown gene trees and accommodate their uncertainties properly but involve intensive computation. The approximate or summary coalescent methods are computationally fast and are applicable to genomic datasets with thousands of loci, but do not make an efficient use of information in the multilocus data. Most of them take the two-step approach of reconstructing the gene trees for multiple loci by phylogenetic methods and then treating the estimated gene trees as observed data, without accounting for their uncertainties appropriately. In this article we review the statistical nature of the species tree estimation problem under the MSC, and explore the conceptual issues and challenges of species tree estimation by focusing mainly on simple cases of three or four closely related species. We use mathematical analysis and computer simulation to demonstrate that large differences in statistical performance may exist between the two classes of methods. We illustrate that several counterintuitive behaviors may occur with the summary methods but they are due to inefficient use of information in the data by summary methods and vanish when the data are analyzed using full-likelihood methods. These include (i) unidentifiability of parameters in the model, (ii) inconsistency in the so-called anomaly zone, (iii) singularity on the likelihood surface, and (iv) deterioration of performance upon addition of more data. 
We discuss the challenges and strategies of species tree inference for distantly related species when the molecular clock is violated, and highlight the need for improving the computational efficiency and model realism of the likelihood methods as well as the statistical efficiency of the summary methods. PMID:27927902
Computation of Nonlinear Hydrodynamic Loads on Floating Wind Turbines Using Fluid-Impulse Theory: Preprint

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kok Yan Chan, G.; Sclavounos, P. D.; Jonkman, J.

2015-04-02

A hydrodynamics computer module was developed for the evaluation of the linear and nonlinear loads on floating wind turbines using a new fluid-impulse formulation for coupling with the FAST program. The recently developed formulation allows the computation of linear and nonlinear loads on floating bodies in the time domain and avoids the computationally intensive evaluation of temporal and nonlinear free-surface problems, and efficient methods are derived for its computation. The body instantaneous wetted surface is approximated by a panel mesh and the discretization of the free surface is circumvented by using the Green function. The evaluation of the nonlinear loads is based on explicit expressions derived by the fluid-impulse theory, which can be computed efficiently. Computations are presented of the linear and nonlinear loads on the MIT/NREL tension-leg platform. Comparisons were carried out with frequency-domain linear and second-order methods. Emphasis was placed on modeling accuracy of the magnitude of nonlinear low- and high-frequency wave loads in a sea state. Although fluid-impulse theory is applied to floating wind turbines in this paper, the theory is applicable to other offshore platforms as well.
Computationally Efficient Adaptive Beamformer for Ultrasound Imaging Based on QR Decomposition.

PubMed

Park, Jongin; Wi, Seok-Min; Lee, Jin S

2016-02-01

Adaptive beamforming methods for ultrasound imaging have been studied to improve image resolution and contrast. The most common approach is the minimum variance (MV) beamformer, which minimizes the power of the beamformed output while maintaining the response from the direction of interest constant. The method achieves higher resolution and better contrast than the delay-and-sum (DAS) beamformer, but it suffers from high computational cost. This cost is mainly due to the computation of the spatial covariance matrix and its inverse, which requires O(L^3) computations, where L denotes the subarray size. In this study, we propose a computationally efficient MV beamformer based on QR decomposition. The idea behind our approach is to transform the spatial covariance matrix to be a scalar matrix σI, and we subsequently obtain the apodization weights and the beamformed output without computing the matrix inverse. To do that, a QR decomposition algorithm is used, which can be executed at low cost; the computational complexity is therefore reduced to O(L^2). In addition, our approach is mathematically equivalent to the conventional MV beamformer, thereby showing equivalent performance. The simulation and experimental results support the validity of our approach.
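As a reminder of why a scalar covariance matrix removes the need for an inverse, the standard MV weight formula is written out below. The symbols are assumptions introduced here and are not named in the abstract: R is the spatial covariance matrix, a the steering vector, and w the apodization weights. In the transformed setting of the paper, a would presumably stand for the correspondingly transformed steering vector.

```latex
\[
  \mathbf{w} \;=\; \frac{\mathbf{R}^{-1}\mathbf{a}}{\mathbf{a}^{H}\mathbf{R}^{-1}\mathbf{a}},
  \qquad
  \mathbf{R} = \sigma\mathbf{I}
  \;\Longrightarrow\;
  \mathbf{w} \;=\; \frac{\sigma^{-1}\mathbf{a}}{\sigma^{-1}\,\mathbf{a}^{H}\mathbf{a}}
  \;=\; \frac{\mathbf{a}}{\mathbf{a}^{H}\mathbf{a}} .
\]
```

With R equal to σI the factor σ cancels, so the weights follow from a single inner product instead of an O(L^3) inversion.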
Bootstrap Methods: A Very Leisurely Look.

ERIC Educational Resources Information Center

Hinkle, Dennis E.; Winstead, Wayland H.

The Bootstrap method, a computer-intensive statistical method of estimation, is illustrated using a simple and efficient Statistical Analysis System (SAS) routine. The utility of the method for generating unknown parameters, including standard errors for simple statistics, regression coefficients, discriminant function coefficients, and factor…
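A minimal sketch of the bootstrap idea follows, written in Python rather than the SAS routine the record refers to (which is not reproduced here). The function name `bootstrap_se` and its parameters are illustrative assumptions: the data are resampled with replacement, the statistic is recomputed on each resample, and the spread of the replicates estimates the standard error.

```python
import random
import statistics

def bootstrap_se(data, stat, n_boot=2000, seed=0):
    """Bootstrap estimate of the standard error of stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original sample.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(stat(resample))
    # The standard deviation of the replicates is the standard-error estimate.
    return statistics.stdev(replicates)

sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9]
se_of_median = bootstrap_se(sample, statistics.median)
```

The same function works for any statistic that takes a list and returns a number, which is the main appeal of the approach for quantities without a simple closed-form standard error.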
Computational Efficiency of the Simplex Embedding Method in Convex Nondifferentiable Optimization

NASA Astrophysics Data System (ADS)

Kolosnitsyn, A. V.

2018-02-01

The simplex embedding method for solving convex nondifferentiable optimization problems is considered. A description of modifications of this method based on a shift of the cutting plane intended for cutting off the maximum number of simplex vertices is given. These modifications speed up the problem solution. A numerical comparison of the efficiency of the proposed modifications based on the numerical solution of benchmark convex nondifferentiable optimization problems is presented.
Constrained Total Generalized p-Variation Minimization for Few-View X-Ray Computed Tomography Image Reconstruction

PubMed Central

Zhang, Hanming; Wang, Linyuan; Yan, Bin; Li, Lei; Cai, Ailong; Hu, Guoen

2016-01-01

Total generalized variation (TGV)-based computed tomography (CT) image reconstruction, which utilizes high-order image derivatives, is superior to total variation-based methods in terms of the preservation of edge information and the suppression of unfavorable staircase effects. However, conventional TGV regularization employs l1-based form, which is not the most direct method for maximizing sparsity prior. In this study, we propose a total generalized p-variation (TGpV) regularization model to improve the sparsity exploitation of TGV and offer efficient solutions to few-view CT image reconstruction problems. To solve the nonconvex optimization problem of the TGpV minimization model, we then present an efficient iterative algorithm based on the alternating minimization of augmented Lagrangian function. All of the resulting subproblems decoupled by variable splitting admit explicit solutions by applying alternating minimization method and generalized p-shrinkage mapping. In addition, approximate solutions that can be easily performed and quickly calculated through fast Fourier transform are derived using the proximal point method to reduce the cost of inner subproblems. The accuracy and efficiency of the simulated and real data are qualitatively and quantitatively evaluated to validate the efficiency and feasibility of the proposed method. Overall, the proposed method exhibits reasonable performance and outperforms the original TGV-based method when applied to few-view problems. PMID:26901410
Automatic domain updating technique for improving computational efficiency of 2-D flood-inundation simulation

NASA Astrophysics Data System (ADS)

Tanaka, T.; Tachikawa, Y.; Ichikawa, Y.; Yorozu, K.

2017-12-01

Floods are among the most hazardous disasters and cause serious damage to people and property around the world. To prevent or mitigate flood damage through early warning systems and/or river management planning, numerical modelling of flood-inundation processes is essential. In the literature, flood-inundation models have been extensively developed and improved to achieve flood flow simulation with complex topography at high resolution. With increasing demands on flood-inundation modelling, its computational burden is now one of the key issues. Improvements in the computational efficiency of the full shallow water equations have been made from various perspectives, such as approximations of the momentum equations, parallelization techniques, and coarsening approaches. To support these techniques and further improve the computational efficiency of flood-inundation simulations, this study proposes an Automatic Domain Updating (ADU) method for 2-D flood-inundation simulation. The ADU method traces the wet and dry interface and automatically updates the simulation domain in response to the progress and recession of flood propagation. The updating algorithm is as follows: first, register the simulation cells potentially flooded at the initial stage (such as floodplains near river channels); then, if a registered cell is flooded, register its surrounding cells. The time for this additional process is kept small by checking only cells at the wet and dry interface. The computation time is reduced by skipping the processing of non-flooded areas. This algorithm is easily applied to any type of 2-D flood-inundation model. The proposed ADU method is implemented in 2-D local inertial equations for the Yodo River basin, Japan. Case studies for two flood events show that the simulation finishes two to ten times faster while giving the same result as that without the ADU method.
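The registration step of the updating algorithm can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the grid is taken to be rectangular with 4-neighbour connectivity, cells are `(row, col)` tuples, and the names `register_neighbours`, `active`, and `newly_wet` are hypothetical. Only the growth of the domain is shown; removing cells during flood recession is left out.

```python
def register_neighbours(active, newly_wet, n_rows, n_cols):
    """Grow the simulation domain around cells that have just become wet.

    active    : set of (row, col) cells currently registered for computation
    newly_wet : iterable of registered cells that flooded in the last step
                (the wet/dry interface, so dry interior cells are never scanned)
    Returns the updated set of registered cells.
    """
    updated = set(active)
    for r, c in newly_wet:
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            # Register only neighbours that lie inside the grid.
            if 0 <= rr < n_rows and 0 <= cc < n_cols:
                updated.add((rr, cc))
    return updated

# Start from one registered cell next to the river, then let it flood.
domain = {(1, 1)}
domain = register_neighbours(domain, [(1, 1)], n_rows=3, n_cols=3)
```

The flow solver would then loop over `domain` only, so cells that water has not yet reached cost nothing per time step.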

Fast dose kernel interpolation using Fourier transform with application to permanent prostate brachytherapy dosimetry.

PubMed

Liu, Derek; Sloboda, Ron S

2014-05-01

Boyer and Mok proposed a fast calculation method employing the Fourier transform (FT), for which calculation time is independent of the number of seeds but seed placement is restricted to calculation grid points. Here an interpolation method is described enabling unrestricted seed placement while preserving the computational efficiency of the original method. The Iodine-125 seed dose kernel was sampled and selected values were modified to optimize interpolation accuracy for clinically relevant doses. For each seed, the kernel was shifted to the nearest grid point via convolution with a unit impulse, implemented in the Fourier domain. The remaining fractional shift was performed using a piecewise third-order Lagrange filter. Implementation of the interpolation method greatly improved FT-based dose calculation accuracy. The dose distribution was accurate to within 2% beyond 3 mm from each seed. Isodose contours were indistinguishable from explicit TG-43 calculation. Dose-volume metric errors were negligible. Computation time for the FT interpolation method was essentially the same as Boyer's method. A FT interpolation method for permanent prostate brachytherapy TG-43 dose calculation was developed which expands upon Boyer's original method and enables unrestricted seed placement. The proposed method substantially improves the clinically relevant dose accuracy with negligible additional computation cost, preserving the efficiency of the original method.
Local coding based matching kernel method for image classification.

PubMed

Song, Yan; McLoughlin, Ian Vince; Dai, Li-Rong

2014-01-01

This paper mainly focuses on how to effectively and efficiently measure visual similarity for local feature based representation. Among existing methods, metrics based on Bag of Visual Word (BoV) techniques are efficient and conceptually simple, at the expense of effectiveness. By contrast, kernel based metrics are more effective, but at the cost of greater computational complexity and increased storage requirements. We show that a unified visual matching framework can be developed to encompass both BoV and kernel based metrics, in which local kernel plays an important role between feature pairs or between features and their reconstruction. Generally, local kernels are defined using Euclidean distance or its derivatives, based either explicitly or implicitly on an assumption of Gaussian noise. However, local features such as SIFT and HoG often follow a heavy-tailed distribution which tends to undermine the motivation behind Euclidean metrics. Motivated by recent advances in feature coding techniques, a novel efficient local coding based matching kernel (LCMK) method is proposed. This exploits the manifold structures in Hilbert space derived from local kernels. The proposed method combines advantages of both BoV and kernel based metrics, and achieves a linear computational complexity. This enables efficient and scalable visual matching to be performed on large scale image sets. To evaluate the effectiveness of the proposed LCMK method, we conduct extensive experiments with widely used benchmark datasets, including 15-Scenes, Caltech101/256, PASCAL VOC 2007 and 2011 datasets. Experimental results confirm the effectiveness of the relatively efficient LCMK method.
A parallel overset-curvilinear-immersed boundary framework for simulating complex 3D incompressible flows

PubMed Central

Borazjani, Iman; Ge, Liang; Le, Trung; Sotiropoulos, Fotis

2013-01-01

We develop an overset-curvilinear immersed boundary (overset-CURVIB) method in a general non-inertial frame of reference to simulate a wide range of challenging biological flow problems. The method incorporates overset-curvilinear grids to efficiently handle multi-connected geometries and increase the resolution locally near immersed boundaries. Complex bodies undergoing arbitrarily large deformations may be embedded within the overset-curvilinear background grid and treated as sharp interfaces using the curvilinear immersed boundary (CURVIB) method (Ge and Sotiropoulos, Journal of Computational Physics, 2007). The incompressible flow equations are formulated in a general non-inertial frame of reference to enhance the overall versatility and efficiency of the numerical approach. Efficient search algorithms to identify areas requiring blanking, donor cells, and interpolation coefficients for constructing the boundary conditions at grid interfaces of the overset grid are developed and implemented using efficient parallel computing communication strategies to transfer information among sub-domains. The governing equations are discretized using a second-order accurate finite-volume approach and integrated in time via an efficient fractional-step method. Various strategies for ensuring globally conservative interpolation at grid interfaces suitable for incompressible flow fractional step methods are implemented and evaluated. The method is verified and validated against experimental data, and its capabilities are demonstrated by simulating the flow past multiple aquatic swimmers and the systolic flow in an anatomic left ventricle with a mechanical heart valve implanted in the aortic position. PMID:23833331
An efficient method for the computation of Legendre moments.

PubMed

Yap, Pew-Thian; Paramesran, Raveendran

2005-12-01

Legendre moments are continuous moments; hence, when they are applied to discrete-space images, numerical approximation is involved and error occurs. This paper proposes a method to compute the exact values of the moments by mathematically integrating the Legendre polynomials over the corresponding intervals of the image pixels. Experimental results show that the values obtained match those calculated theoretically, and the image reconstructed from these moments has lower error than that of the conventional methods for the same order. Although the same set of exact Legendre moments can be obtained indirectly from the set of geometric moments, the computation time taken is much longer than the proposed method.
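The idea of integrating the polynomials over pixel intervals can be shown in one dimension. The sketch below is an assumption-laden illustration rather than the paper's algorithm: it treats a 1-D signal as piecewise constant on equal-width pixels covering [-1, 1], uses the normalization (2n+1)/2, and relies on the standard identity ∫P_n dx = (P_{n+1} − P_{n−1})/(2n+1) for n ≥ 1. The function names are hypothetical, and the 2-D image case would apply the same integral along each axis.

```python
def legendre(n, x):
    """Legendre polynomial P_n(x) via the three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def legendre_antiderivative(n, x):
    """An antiderivative of P_n evaluated at x."""
    if n == 0:
        return x
    return (legendre(n + 1, x) - legendre(n - 1, x)) / (2 * n + 1)

def exact_moment(pixels, n):
    """Order-n Legendre moment of a piecewise-constant 1-D signal on [-1, 1]."""
    count = len(pixels)
    edges = [-1.0 + 2.0 * i / count for i in range(count + 1)]
    total = 0.0
    for i, value in enumerate(pixels):
        # Exact integral of P_n over the i-th pixel interval.
        total += value * (legendre_antiderivative(n, edges[i + 1])
                          - legendre_antiderivative(n, edges[i]))
    return (2 * n + 1) / 2.0 * total
```

Because each pixel contributes a closed-form integral, no sampling of the polynomial at pixel centres is involved, which is the source of approximation error in the conventional approach.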
Finite-Element Methods for Real-Time Simulation of Surgery

NASA Technical Reports Server (NTRS)

Basdogan, Cagatay

2003-01-01

Two finite-element methods have been developed for mathematical modeling of the time-dependent behaviors of deformable objects and, more specifically, the mechanical responses of soft tissues and organs in contact with surgical tools. These methods may afford the computational efficiency needed to satisfy the requirement to obtain computational results in real time for simulating surgical procedures as described in Simulation System for Training in Laparoscopic Surgery (NPO-21192) on page 31 in this issue of NASA Tech Briefs. Simulation of the behavior of soft tissue in real time is a challenging problem because of the complexity of soft-tissue mechanics. The responses of soft tissues are characterized by nonlinearities and by spatial inhomogeneities and rate and time dependences of material properties. Finite-element methods seem promising for integrating these characteristics of tissues into computational models of organs, but they demand much central-processing-unit (CPU) time and memory, and the demand increases with the number of nodes and degrees of freedom in a given finite-element model. Hence, as finite-element models become more realistic, it becomes more difficult to compute solutions in real time. In both of the present methods, one uses approximate mathematical models trading some accuracy for computational efficiency and thereby increasing the feasibility of attaining real-time update rates. The first of these methods is based on modal analysis. In this method, one reduces the number of differential equations by selecting only the most significant vibration modes of an object (typically, a suitable number of the lowest-frequency modes) for computing deformations of the object in response to applied forces.
SparseMaps—A systematic infrastructure for reduced-scaling electronic structure methods. III. Linear-scaling multireference domain-based pair natural orbital N-electron valence perturbation theory

NASA Astrophysics Data System (ADS)

Guo, Yang; Sivalingam, Kantharuban; Valeev, Edward F.; Neese, Frank

2016-03-01

Multi-reference (MR) electronic structure methods, such as MR configuration interaction or MR perturbation theory, can provide reliable energies and properties for many molecular phenomena like bond breaking, excited states, transition states or magnetic properties of transition metal complexes and clusters. However, owing to their inherent complexity, most MR methods are still too computationally expensive for large systems. Therefore the development of more computationally attractive MR approaches is necessary to enable routine application for large-scale chemical systems. Among the state-of-the-art MR methods, second-order N-electron valence state perturbation theory (NEVPT2) is an efficient, size-consistent, and intruder-state-free method. However, there are still two important bottlenecks in practical applications of NEVPT2 to large systems: (a) the high computational cost of NEVPT2 for large molecules, even with moderate active spaces and (b) the prohibitive cost for treating large active spaces. In this work, we address problem (a) by developing a linear scaling "partially contracted" NEVPT2 method. This development uses the idea of domain-based local pair natural orbitals (DLPNOs) to form a highly efficient algorithm. As shown previously in the framework of single-reference methods, the DLPNO concept leads to an enormous reduction in computational effort while at the same time providing high accuracy (approaching 99.9% of the correlation energy), robustness, and black-box character. In the DLPNO approach, the virtual space is spanned by pair natural orbitals that are expanded in terms of projected atomic orbitals in large orbital domains, while the inactive space is spanned by localized orbitals. The active orbitals are left untouched. Our implementation features a highly efficient "electron pair prescreening" that skips the negligible inactive pairs. The surviving pairs are treated using the partially contracted NEVPT2 formalism. 
A detailed comparison between the partial and strong contraction schemes is made, with conclusions that discourage the strong contraction scheme as a basis for local correlation methods due to its non-invariance with respect to rotations in the inactive and external subspaces. A minimal set of conservatively chosen truncation thresholds controls the accuracy of the method. With the default thresholds, about 99.9% of the canonical partially contracted NEVPT2 correlation energy is recovered while the crossover of the computational cost with the already very efficient canonical method occurs reasonably early; in linear chain type compounds at a chain length of around 80 atoms. Calculations are reported for systems with more than 300 atoms and 5400 basis functions.
Polynomial-time quantum algorithm for the simulation of chemical dynamics

PubMed Central

Kassal, Ivan; Jordan, Stephen P.; Love, Peter J.; Mohseni, Masoud; Aspuru-Guzik, Alán

2008-01-01

The computational cost of exact methods for quantum simulation using classical computers grows exponentially with system size. As a consequence, these techniques can be applied only to small systems. By contrast, we demonstrate that quantum computers could exactly simulate chemical reactions in polynomial time. Our algorithm uses the split-operator approach and explicitly simulates all electron-nuclear and interelectronic interactions in quadratic time. Surprisingly, this treatment is not only more accurate than the Born–Oppenheimer approximation but faster and more efficient as well, for all reactions with more than about four atoms. This is the case even though the entire electronic wave function is propagated on a grid with appropriately short time steps. Although the preparation and measurement of arbitrary states on a quantum computer is inefficient, here we demonstrate how to prepare states of chemical interest efficiently. We also show how to efficiently obtain chemically relevant observables, such as state-to-state transition probabilities and thermal reaction rates. Quantum computers using these techniques could outperform current classical computers with 100 qubits. PMID:19033207
Computer Graphics-aided systems analysis: application to well completion design

DOE Office of Scientific and Technical Information (OSTI.GOV)

Detamore, J.E.; Sarma, M.P.

1985-03-01

The development of an engineering tool (in the form of a computer model) for solving design and analysis problems related to oil and gas well production operations is discussed. The development of the method is based on integrating the concepts of "Systems Analysis" with the techniques of "Computer Graphics". The concepts behind the method are very general in nature. This paper, however, illustrates the application of the method in solving gas well completion design problems. The use of the method will save time and improve the efficiency of such design and analysis problems. The method can be extended to other design and analysis aspects of oil and gas wells.
Method for simultaneous overlapped communications between neighboring processors in a multiple

DOEpatents

Benner, Robert E.; Gustafson, John L.; Montry, Gary R.

1991-01-01

A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system.
Aeroelasticity of wing and wing-body configurations on parallel computers

NASA Technical Reports Server (NTRS)

Byun, Chansup

1995-01-01

The objective of this research is to develop computationally efficient methods for solving aeroelasticity problems on parallel computers. Both uncoupled and coupled methods are studied in this research. For the uncoupled approach, the conventional U-g method is used to determine the flutter boundary. The generalized aerodynamic forces required are obtained by the pulse transfer-function analysis method. For the coupled approach, the fluid-structure interaction is obtained by directly coupling finite difference Euler/Navier-Stokes equations for fluids and finite element dynamics equations for structures. This capability will significantly impact many aerospace projects of national importance such as Advanced Subsonic Civil Transport (ASCT), where the structural stability margin becomes very critical at the transonic region. This research effort will have direct impact on the High Performance Computing and Communication (HPCC) Program of NASA in the area of parallel computing.
Algebraic model checking for Boolean gene regulatory networks.

PubMed

Tran, Quoc-Nam

2011-01-01

We present a computational method in which modular and Groebner bases (GB) computation in Boolean rings are used for solving problems in Boolean gene regulatory networks (BN). In contrast to other known algebraic approaches, the degree of intermediate polynomials during the calculation of Groebner bases using our method will never grow resulting in a significant improvement in running time and memory space consumption. We also show how calculation in temporal logic for model checking can be done by means of our direct and efficient Groebner basis computation in Boolean rings. We present our experimental results in finding attractors and control strategies of Boolean networks to illustrate our theoretical arguments. The results are promising. Our algebraic approach is more efficient than the state-of-the-art model checker NuSMV on BNs. More importantly, our approach finds all solutions for the BN problems.
On finite element implementation and computational techniques for constitutive modeling of high temperature composites

NASA Technical Reports Server (NTRS)

Saleeb, A. F.; Chang, T. Y. P.; Wilt, T.; Iskovitz, I.

1989-01-01

The research work performed during the past year on finite element implementation and computational techniques pertaining to high temperature composites is outlined. In the present research, two main issues are addressed: efficient geometric modeling of composite structures and expedient numerical integration techniques dealing with constitutive rate equations. In the first issue, mixed finite elements for modeling laminated plates and shells were examined in terms of numerical accuracy, locking property and computational efficiency. Element applications include (currently available) linearly elastic analysis and future extension to material nonlinearity for damage predictions and large deformations. On the material level, various integration methods to integrate nonlinear constitutive rate equations for finite element implementation were studied. These include explicit, implicit and automatic subincrementing schemes. In all cases, examples are included to illustrate the numerical characteristics of various methods that were considered.
A Proposal of a Fast Computation Method for Thermal Capacity and Voltage ATC by Means of Homotopy Functions

NASA Astrophysics Data System (ADS)

Zoka, Yoshifumi; Yorino, Naoto; Kawano, Koki; Suenari, Hiroyasu

This paper proposes a fast computation method for Available Transfer Capability (ATC) with respect to thermal and voltage magnitude limits. In the paper, ATC is formulated as an optimization problem. To make the N-1 outage contingency calculations efficient, linear sensitivity methods are applied to screen and rank all contingencies by their margins to the thermal and voltage magnitude limits, identifying the severest case. In addition, homotopy functions are used for the generator QV constraints to reduce the maximum error of the linear estimation. Then, the Primal-Dual Interior Point Method (PDIPM) is used to solve the optimization problem for the severest case only, so that the ATC solutions can be obtained efficiently. The effectiveness of the proposed method is demonstrated on the IEEE 30-, 57-, and 118-bus systems.
An efficient unstructured WENO method for supersonic reactive flows

NASA Astrophysics Data System (ADS)

Zhao, Wen-Geng; Zheng, Hong-Wei; Liu, Feng-Jun; Shi, Xiao-Tian; Gao, Jun; Hu, Ning; Lv, Meng; Chen, Si-Cong; Zhao, Hong-Da

2018-03-01

An efficient high-order numerical method for supersonic reactive flows is proposed in this article. The reactive source term and convection term are solved separately by a splitting scheme. In the reaction step, an adaptive time-step method is presented, which can greatly improve the efficiency. In the convection step, a third-order accurate weighted essentially non-oscillatory (WENO) method is adopted to reconstruct the solution on unstructured grids. Numerical results show that our new method can capture the correct propagation speed of the detonation wave exactly even on coarse grids, while high-order accuracy can be achieved in the smooth region. In addition, the proposed adaptive splitting method can greatly reduce the computational cost compared with the traditional splitting method.
Optimization of the molecular dynamics method for simulations of DNA and ion transport through biological nanopores.

PubMed

Wells, David B; Bhattacharya, Swati; Carr, Rogan; Maffeo, Christopher; Ho, Anthony; Comer, Jeffrey; Aksimentiev, Aleksei

2012-01-01

Molecular dynamics (MD) simulations have become a standard method for the rational design and interpretation of experimental studies of DNA translocation through nanopores. The MD method, however, offers a multitude of algorithms, parameters, and other protocol choices that can affect the accuracy of the resulting data as well as computational efficiency. In this chapter, we examine the most popular choices offered by the MD method, seeking an optimal set of parameters that enable the most computationally efficient and accurate simulations of DNA and ion transport through biological nanopores. In particular, we examine the influence of short-range cutoff, integration timestep and force field parameters on the temperature and concentration dependence of bulk ion conductivity, ion pairing, ion solvation energy, DNA structure, DNA-ion interactions, and the ionic current through a nanopore.
Tempest - Efficient Computation of Atmospheric Flows Using High-Order Local Discretization Methods

NASA Astrophysics Data System (ADS)

Ullrich, P. A.; Guerra, J. E.

2014-12-01

The Tempest Framework composes several compact numerical methods to easily facilitate intercomparison of atmospheric flow calculations on the sphere and in rectangular domains. This framework includes the implementations of Spectral Elements, Discontinuous Galerkin, Flux Reconstruction, and Hybrid Finite Element methods with the goal of achieving optimal accuracy in the solution of atmospheric problems. Several advantages of this approach are discussed such as: improved pressure gradient calculation, numerical stability by vertical/horizontal splitting, arbitrary order of accuracy, etc. The local numerical discretization allows for high performance parallel computation and efficient inclusion of parameterizations. These techniques are used in conjunction with a non-conformal, locally refined, cubed-sphere grid for global simulations and standard Cartesian grids for simulations at the mesoscale. A complete implementation of the methods described is demonstrated in a non-hydrostatic setting.
A robust and efficient stepwise regression method for building sparse polynomial chaos expansions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Abraham, Simon, E-mail: Simon.Abraham@ulb.ac.be; Raisee, Mehrdad; Ghorbaniasl, Ghader

2017-03-01

Polynomial Chaos (PC) expansions are widely used in various engineering fields for quantifying uncertainties arising from uncertain parameters. The computational cost of classical PC solution schemes is unaffordable as the number of deterministic simulations to be calculated grows dramatically with the number of stochastic dimensions. This considerably restricts the practical use of PC at the industrial level. A common approach to address such problems is to make use of sparse PC expansions. This paper presents a non-intrusive regression-based method for building sparse PC expansions. The most important PC contributions are detected sequentially through an automatic search procedure. The variable selection criterion is based on efficient tools relevant to probabilistic method. Two benchmark analytical functions are used to validate the proposed algorithm. The computational efficiency of the method is then illustrated by a more realistic CFD application, consisting of the non-deterministic flow around a transonic airfoil subject to geometrical uncertainties. To assess the performance of the developed methodology, a detailed comparison is made with the well established LAR-based selection technique. The results show that the developed sparse regression technique is able to identify the most significant PC contributions describing the problem. Moreover, the most important stochastic features are captured at a reduced computational cost compared to the LAR method. The results also demonstrate the superior robustness of the method by repeating the analyses using random experimental designs.
Effective Energy Simulation and Optimal Design of Side-lit Buildings with Venetian Blinds

NASA Astrophysics Data System (ADS)

Cheng, Tian

Venetian blinds are popularly used in buildings to control the amount of incoming daylight for improving visual comfort and reducing heat gains in air-conditioning systems. Studies have shown that the proper design and operation of window systems could result in significant energy savings in both lighting and cooling. However, no convenient computer tool currently allows effective and efficient optimization of the envelope of side-lit buildings with blinds. Three computer tools widely used for this purpose, Adeline, DOE2 and EnergyPlus, have been experimentally examined in this study. Results indicate that the two former tools give unacceptable accuracy due to the unrealistic assumptions adopted, while the last one may generate large errors in certain conditions. Moreover, current computer tools have to conduct hourly energy simulations, which are not necessary for life-cycle energy analysis and optimal design, to provide annual cooling loads. This is not computationally efficient and is particularly unsuitable for the optimal design of a building at the initial stage, because the impacts of many design variations and optional features have to be evaluated. A methodology is therefore developed for efficient and effective thermal and daylighting simulations and optimal design of buildings with blinds. Based on geometric optics and the radiosity method, a mathematical model is developed to reasonably simulate the daylighting behaviors of venetian blinds. Indoor illuminance at any reference point can be directly and efficiently computed. The new models have been validated with both experiments and simulations with Radiance. Validation results show that indoor illuminances computed by the new models agree well with the measured data, and the accuracy provided by them is equivalent to that of Radiance. The computational efficiency of the new models is much higher than that of Radiance as well as EnergyPlus. Two new methods are developed for the thermal simulation of buildings.
A fast Fourier transform (FFT) method is presented to avoid the root-searching process in the inverse Laplace transform of multilayered walls. Generalized explicit FFT formulae for calculating the discrete Fourier transform (DFT) are developed for the first time. They can largely facilitate the implementation of FFT. The new method also provides a basis for generating the symbolic response factors. Validation simulations show that it can generate the response factors as accurate as the analytical solutions. The second method is for direct estimation of annual or seasonal cooling loads without the need for tedious hourly energy simulations. It is validated by hourly simulation results with DOE2. Then symbolic long-term cooling load can be created by combining the two methods with thermal network analysis. The symbolic long-term cooling load can keep the design parameters of interest as symbols, which is particularly useful for the optimal design and sensitivity analysis. The methodology is applied to an office building in Hong Kong for the optimal design of building envelope. Design variables such as window-to-wall ratio, building orientation, and glazing optical and thermal properties are included in the study. Results show that the selected design values could significantly impact the energy performance of windows, and the optimal design of side-lit buildings could greatly enhance energy savings. The application example also demonstrates that the developed methodology significantly facilitates the optimal building design and sensitivity analysis, and leads to high computational efficiency.
Join the Center for Applied Scientific Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gamblin, Todd; Bremer, Timo; Van Essen, Brian

The Center for Applied Scientific Computing serves as Livermore Lab’s window to the broader computer science, computational physics, applied mathematics, and data science research communities. In collaboration with academic, industrial, and other government laboratory partners, we conduct world-class scientific research and development on problems critical to national security. CASC applies the power of high-performance computing and the efficiency of modern computational methods to the realms of stockpile stewardship, cyber and energy security, and knowledge discovery for intelligence applications.
Efficient implementation of multidimensional fast fourier transform on a distributed-memory parallel multi-node computer

DOEpatents

Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY

2012-01-10

The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
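The decomposition described above (1-D FFTs along one dimension, a redistribution, then 1-D FFTs along the other dimension) can be sketched on a single process in Python. This is only an illustration under stated assumptions: the transpose stands in for the "all-to-all" exchange, no network or random ordering is modelled, and the function names are hypothetical rather than taken from the patent.

```python
import cmath

def fft1d(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def transpose(a):
    return [list(col) for col in zip(*a)]

def fft2d(a):
    """2-D FFT built from 1-D FFTs with a transpose between the two passes."""
    stage1 = [fft1d(row) for row in a]               # first dimension, rows are independent
    redistributed = transpose(stage1)                # stand-in for the all-to-all exchange
    stage2 = [fft1d(row) for row in redistributed]   # second dimension
    return transpose(stage2)                         # back to the original layout

def dft2d_direct(a):
    """Direct 2-D DFT from the definition, used only to check fft2d."""
    rows, cols = len(a), len(a[0])
    return [[sum(a[r][c] * cmath.exp(-2j * cmath.pi * (u * r / rows + v * c / cols))
                 for r in range(rows) for c in range(cols))
             for v in range(cols)]
            for u in range(rows)]

sample = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
fast = fft2d(sample)
direct = dft2d_direct(sample)
max_diff = max(abs(fast[u][v] - direct[u][v]) for u in range(2) for v in range(4))
```

Because each row transform in a pass is independent of the others, rows can be assigned to different nodes; only the transpose step requires communication, which is why the redistribution is the part worth optimizing.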

Efficient implementation of a multidimensional fast fourier transform on a distributed-memory parallel multi-node computer

DOEpatents

Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY

2008-01-01

The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
Efficient Geometric Sound Propagation Using Visibility Culling

NASA Astrophysics Data System (ADS)

Chandak, Anish

2011-07-01

Simulating propagation of sound can improve the sense of realism in interactive applications such as video games and can lead to better designs in engineering applications such as architectural acoustics. In this thesis, we present geometric sound propagation techniques which are faster than prior methods and map well to upcoming parallel multi-core CPUs. We model specular reflections by using the image-source method and model finite-edge diffraction by using the well-known Biot-Tolstoy-Medwin (BTM) model. We accelerate the computation of specular reflections by applying novel visibility algorithms, FastV and AD-Frustum, which compute visibility from a point. We accelerate finite-edge diffraction modeling by applying a novel visibility algorithm which computes visibility from a region. Our visibility algorithms are based on frustum tracing and exploit recent advances in fast ray-hierarchy intersections, data-parallel computations, and scalable, multi-core algorithms. The AD-Frustum algorithm adapts its computation to the scene complexity and allows small errors in computing specular reflection paths for higher computational efficiency. FastV and our visibility algorithm from a region are general, object-space, conservative visibility algorithms that together significantly reduce the number of image sources compared to other techniques while preserving the same accuracy. Our geometric propagation algorithms are an order of magnitude faster than prior approaches for modeling specular reflections and two to ten times faster for modeling finite-edge diffraction. Our algorithms are interactive, scale almost linearly on multi-core CPUs, and can handle large, complex, and dynamic scenes. We also compare the accuracy of our sound propagation algorithms with other methods. Once sound propagation is performed, it is desirable to listen to the propagated sound in interactive and engineering applications. 
We can generate smooth, artifact-free output audio signals by applying efficient audio-processing algorithms. We also present the first efficient audio-processing algorithm for scenarios with simultaneously moving source and moving receiver (MS-MR) which incurs less than 25% overhead compared to static source and moving receiver (SS-MR) or moving source and static receiver (MS-SR) scenario.
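The image-source method mentioned in this abstract can be illustrated with a short Python sketch for a single first-order specular reflection. It assumes an infinite planar wall given by a point and a unit normal, and it performs no visibility test, which is exactly the part the thesis accelerates. The helper names `reflect_point` and `first_order_reflection` are hypothetical.

```python
import math

def signed_distance(p, plane_point, unit_normal):
    """Signed distance from point p to the plane along the unit normal."""
    return sum((pi - qi) * ni for pi, qi, ni in zip(p, plane_point, unit_normal))

def reflect_point(p, plane_point, unit_normal):
    """Mirror p across the plane: this is the image source."""
    d = signed_distance(p, plane_point, unit_normal)
    return tuple(pi - 2.0 * d * ni for pi, ni in zip(p, unit_normal))

def first_order_reflection(source, receiver, plane_point, unit_normal):
    """Return (image source, reflection point, path length) for one wall.

    Assumes source and receiver lie on the same side of an infinite wall.
    """
    image = reflect_point(source, plane_point, unit_normal)
    # The reflected path has the same length as the straight image-to-receiver segment.
    length = math.dist(image, receiver)
    d_img = signed_distance(image, plane_point, unit_normal)
    d_rec = signed_distance(receiver, plane_point, unit_normal)
    t = d_img / (d_img - d_rec)  # parameter where the segment crosses the plane
    hit = tuple(a + t * (b - a) for a, b in zip(image, receiver))
    return image, hit, length

# Wall is the plane y = 0 with normal +y.
image, hit, length = first_order_reflection(
    (0.0, 1.0, 0.0), (4.0, 2.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

For a finite wall, a real implementation would additionally check that the reflection point lies on the wall and that both path segments are unobstructed; reducing the number of image sources that need such checks is the role of the visibility algorithms described above.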
Distributed collaborative probabilistic design of multi-failure structure with fluid-structure interaction using fuzzy neural network of regression

NASA Astrophysics Data System (ADS)

Song, Lu-Kai; Wen, Jie; Fei, Cheng-Wei; Bai, Guang-Chen

2018-05-01

To improve the computing efficiency and precision of probabilistic design for multi-failure structures, a distributed collaborative probabilistic design method based on a fuzzy neural network of regression (FR), called DCFRM, is proposed by integrating the distributed collaborative response surface method with the fuzzy neural network regression model. The mathematical model of DCFRM is established and the probabilistic design idea with DCFRM is introduced. The probabilistic analysis of a turbine blisk involving multiple failure modes (deformation failure, stress failure and strain failure) was investigated with the proposed method by considering fluid-structure interaction. The distribution characteristics, reliability degree, and sensitivity degree of each failure mode and of the overall failure mode of the turbine blisk are obtained, which provides a useful reference for improving the performance and reliability of aeroengines. The comparison of methods shows that the DCFRM reshapes the probability of probabilistic analysis for multi-failure structure and improves the computing efficiency while keeping acceptable computational precision. Moreover, the proposed method offers a useful insight for reliability-based design optimization of multi-failure structure and thereby also enriches the theory and method of mechanical reliability design.
Robust stability of linear systems: Some computational considerations

NASA Technical Reports Server (NTRS)

Laub, A. J.

1979-01-01

The cases of both additive and multiplicative perturbations were discussed and a number of relationships between the two cases were given. A number of computational aspects of the theory were also discussed, including a proposed new method for evaluating general transfer or frequency response matrices. The new method is numerically stable and efficient, requiring only operations to update for new values of the frequency parameter.
A CUMULATIVE MIGRATION METHOD FOR COMPUTING RIGOROUS TRANSPORT CROSS SECTIONS AND DIFFUSION COEFFICIENTS FOR LWR LATTICES WITH MONTE CARLO

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhaoyuan Liu; Kord Smith; Benoit Forget

2016-05-01

A new method for computing homogenized assembly neutron transport cross sections and diffusion coefficients that is both rigorous and computationally efficient is proposed in this paper. In the limit of a homogeneous hydrogen slab, the new method is equivalent to the long-used, and only-recently-published CASMO transport method. The rigorous method is used to demonstrate the sources of inaccuracy in the commonly applied “out-scatter” transport correction. It is also demonstrated that the newly developed method is directly applicable to lattice calculations performed by Monte Carlo and is capable of computing rigorous homogenized transport cross sections for arbitrarily heterogeneous lattices. Comparisons of several common transport cross section approximations are presented for a simple problem of infinite medium hydrogen. The new method has also been applied in computing 2-group diffusion data for an actual PWR lattice from BEAVRS benchmark.
SPIP: A computer program implementing the Interaction Picture method for simulation of light-wave propagation in optical fibre

NASA Astrophysics Data System (ADS)

Balac, Stéphane; Fernandez, Arnaud

2016-02-01

The computer program SPIP is aimed at solving the Generalized Non-Linear Schrödinger equation (GNLSE), involved in optics e.g. in the modelling of light-wave propagation in an optical fibre, by the Interaction Picture method, a new efficient alternative method to the Symmetric Split-Step method. In the SPIP program a dedicated costless adaptive step-size control based on the use of a 4th order embedded Runge-Kutta method is implemented in order to speed up the resolution.
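Step-size control with an embedded Runge-Kutta pair can be sketched as follows in Python. This is not the SPIP implementation and not the Interaction Picture method: it applies the lower-order Bogacki-Shampine 3(2) pair to a scalar test equation purely to show the accept/reject logic. The function names, the safety factor 0.9, and the growth limits 0.2 and 5 are illustrative assumptions.

```python
import math

def bs23_step(f, t, y, h):
    """One Bogacki-Shampine step: 3rd-order solution plus an error estimate."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * h, y + 0.5 * h * k1)
    k3 = f(t + 0.75 * h, y + 0.75 * h * k2)
    y_high = y + h * (2.0 * k1 / 9.0 + k2 / 3.0 + 4.0 * k3 / 9.0)
    k4 = f(t + h, y_high)
    y_low = y + h * (7.0 * k1 / 24.0 + k2 / 4.0 + k3 / 3.0 + k4 / 8.0)
    # The difference between the two embedded solutions estimates the local error.
    return y_high, abs(y_high - y_low)

def integrate_adaptive(f, t0, y0, t_end, tol=1e-8, h=0.1):
    """Integrate y' = f(t, y) from t0 to t_end with adaptive step size."""
    t, y = t0, y0
    accepted = rejected = 0
    while t < t_end:
        last = t + h >= t_end
        step = t_end - t if last else h
        y_new, err = bs23_step(f, t, y, step)
        if err <= tol:
            t = t_end if last else t + step
            y = y_new
            accepted += 1
        else:
            rejected += 1
        # Rescale the step from the error estimate (error behaves like h**3).
        factor = 0.9 * (tol / err) ** (1.0 / 3.0) if err > 0.0 else 5.0
        h = step * min(5.0, max(0.2, factor))
    return y, accepted, rejected

y_end, accepted, rejected = integrate_adaptive(lambda t, y: -y, 0.0, 1.0, 1.0)
exact = math.exp(-1.0)
```

The error estimate reuses stages that are computed anyway, which is the sense in which embedded step-size control adds little extra cost per step.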
STIC: Photonic Quantum Computation through Cavity Assisted Interaction

DTIC Science & Technology

2007-12-28

PRA ; available as quant-ph/06060791. Report for the grant “Photonic Quantum Computation through Cavity Assisted Interaction” from DTO Luming Duan...cavity •B. Wang, L.-M. Duan, PRA 72 (in press, 2005) Single-photon source Photonic Quantum Computation through Cavity-Assisted Interaction H. Jeff Kimble...interaction [Duan, Wang, Kimble, PRA 05] • “Investigate more efficient methods for combating noise in photonic quantum computation ” • Partial progress
An adaptive proper orthogonal decomposition method for model order reduction of multi-disc rotor system

NASA Astrophysics Data System (ADS)

Jin, Yulin; Lu, Kuan; Hou, Lei; Chen, Yushu

2017-12-01

The proper orthogonal decomposition (POD) method is a principal and efficient tool for order reduction of high-dimensional complex systems in many research fields. However, the robustness problem of this method remains unsolved, although some modified POD methods have been proposed to address it. In this paper, a new adaptive POD method called the interpolation Grassmann manifold (IGM) method is proposed to address the weakness of the local property of the interpolation tangent-space of Grassmann manifold (ITGM) method in a wider parametric region. This method is demonstrated here on a nonlinear rotor system with 33 degrees of freedom (DOFs), a pair of liquid-film bearings, and a pedestal looseness fault. The motion region of the rotor system is divided into two parts: a simple motion region and a complex motion region. The adaptive POD method is compared with the ITGM method for large and small parameter spans in the two parametric regions to show the advantage of this method and the disadvantage of the ITGM method. The responses are compared to verify the accuracy and robustness of the adaptive POD method, and the computational efficiency is also analyzed. As a result, the new adaptive POD method shows strong robustness and high computational efficiency and accuracy over a wide range of parameters.
Sub-domain methods for collaborative electromagnetic computations

NASA Astrophysics Data System (ADS)

Soudais, Paul; Barka, André

2006-06-01

In this article, we describe a sub-domain method for electromagnetic computations based on the boundary element method. The benefits of the sub-domain method are that the computation can be split between several companies for collaborative studies and that the computation time can be reduced by one or more orders of magnitude, especially in the context of parametric studies. The accuracy and efficiency of this technique are assessed by RCS computations on an aircraft air intake with duct and rotating engine mock-up called CHANNEL. Collaborative results, obtained by combining two sets of sub-domains computed by two companies, are compared with measurements on the CHANNEL mock-up. The comparisons are made for several angular positions of the engine to show the benefits of the method for parametric studies. We also discuss the accuracy of two formulations of the sub-domain connecting scheme using edge based or modal field expansion. To cite this article: P. Soudais, A. Barka, C. R. Physique 7 (2006).
Computing Galois Groups of Eisenstein Polynomials Over P-adic Fields

NASA Astrophysics Data System (ADS)

Milstead, Jonathan

The most efficient algorithms for computing Galois groups of polynomials over global fields are based on Stauduhar's relative resolvent method. These methods are not directly generalizable to the local field case, since they require a field that contains the global field in which all roots of the polynomial can be approximated. We present splitting field-independent methods for computing the Galois group of an Eisenstein polynomial over a p-adic field. Our approach is to combine information from different disciplines. First, we make use of the ramification polygon of the polynomial, which is the Newton polygon of a related polynomial. This allows us to quickly calculate several invariants that serve to reduce the number of possible Galois groups. Algorithms by Greve and Pauli very efficiently return the Galois group of polynomials where the ramification polygon consists of one segment as well as information about the subfields of the stem field. Second, we look at the factorization of linear absolute resolvents to further narrow the pool of possible groups.
Combined multi-spectrum and orthogonal Laplacianfaces for fast CB-XLCT imaging with single-view data

NASA Astrophysics Data System (ADS)

Zhang, Haibo; Geng, Guohua; Chen, Yanrong; Qu, Xuan; Zhao, Fengjun; Hou, Yuqing; Yi, Huangjian; He, Xiaowei

2017-12-01

Cone-beam X-ray luminescence computed tomography (CB-XLCT) is an attractive hybrid imaging modality, which has the potential of monitoring the metabolic processes of nanophosphors-based drugs in vivo. Single-view data reconstruction as a key issue of CB-XLCT imaging promotes the effective study of dynamic XLCT imaging. However, it suffers from serious ill-posedness in the inverse problem. In this paper, a multi-spectrum strategy is adopted to relieve the ill-posedness of reconstruction. The strategy is based on the third-order simplified spherical harmonic approximation model. Then, an orthogonal Laplacianfaces-based method is proposed to reduce the large computational burden without degrading the imaging quality. Both simulated data and in vivo experimental data were used to evaluate the efficiency and robustness of the proposed method. The results are satisfactory in terms of both location and quantitative recovering with computational efficiency, indicating that the proposed method is practical and promising for single-view CB-XLCT imaging.
Visual saliency-based fast intracoding algorithm for high efficiency video coding

NASA Astrophysics Data System (ADS)

Zhou, Xin; Shi, Guangming; Zhou, Wei; Duan, Zhemin

2017-01-01

Intraprediction has been significantly improved in high efficiency video coding over H.264/AVC with quad-tree-based coding unit (CU) structure from size 64×64 to 8×8 and more prediction modes. However, these techniques cause a dramatic increase in computational complexity. An intracoding algorithm is proposed that consists of perceptual fast CU size decision algorithm and fast intraprediction mode decision algorithm. First, based on the visual saliency detection, an adaptive and fast CU size decision method is proposed to alleviate intraencoding complexity. Furthermore, a fast intraprediction mode decision algorithm with step halving rough mode decision method and early modes pruning algorithm is presented to selectively check the potential modes and effectively reduce the complexity of computation. Experimental results show that our proposed fast method reduces the computational complexity of the current HM to about 57% in encoding time with only 0.37% increases in BD rate. Meanwhile, the proposed fast algorithm has reasonable peak signal-to-noise ratio losses and nearly the same subjective perceptual quality.
A spectral approach for discrete dislocation dynamics simulations of nanoindentation

NASA Astrophysics Data System (ADS)

Bertin, Nicolas; Glavas, Vedran; Datta, Dibakar; Cai, Wei

2018-07-01

We present a spectral approach to perform nanoindentation simulations using three-dimensional nodal discrete dislocation dynamics. The method relies on a two step approach. First, the contact problem between an indenter of arbitrary shape and an isotropic elastic half-space is solved using a spectral iterative algorithm, and the contact pressure is fully determined on the half-space surface. The contact pressure is then used as a boundary condition of the spectral solver to determine the resulting stress field produced in the simulation volume. In both stages, the mechanical fields are decomposed into Fourier modes and are efficiently computed using fast Fourier transforms. To further improve the computational efficiency, the method is coupled with a subcycling integrator and a special approach is devised to approximate the displacement field associated with surface steps. As a benchmark, the method is used to compute the response of an elastic half-space using different types of indenter. An example of a dislocation dynamics nanoindentation simulation with complex initial microstructure is presented.
On computing the global time-optimal motions of robotic manipulators in the presence of obstacles

NASA Technical Reports Server (NTRS)

Shiller, Zvi; Dubowsky, Steven

1991-01-01

A method for computing the time-optimal motions of robotic manipulators is presented that considers the nonlinear manipulator dynamics, actuator constraints, joint limits, and obstacles. The optimization problem is reduced to a search for the time-optimal path in the n-dimensional position space. A small set of near-optimal paths is first efficiently selected from a grid, using a branch and bound search and a series of lower bound estimates on the traveling time along a given path. These paths are further optimized with a local path optimization to yield the global optimal solution. Obstacles are considered by eliminating the collision points from the tessellated space and by adding a penalty function to the motion time in the local optimization. The computational efficiency of the method stems from the reduced dimensionality of the searched space and from combining the grid search with a local optimization. The method is demonstrated in several examples for two- and six-degree-of-freedom manipulators with obstacles.
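The grid stage described above (branch and bound with a lower bound on the remaining cost, and obstacle cells removed from the grid) can be sketched in Python under strong simplifying assumptions. Here each grid move costs one unit and the lower bound is the Manhattan distance, whereas the paper bounds the travel time using manipulator dynamics. The function name and the example grid are hypothetical.

```python
def shortest_path_branch_and_bound(width, height, obstacles, start, goal):
    """Depth-first branch and bound on a 4-connected grid with unit move cost."""
    blocked = set(obstacles)  # collision cells are simply removed from the grid
    best = {"cost": float("inf"), "path": None}

    def lower_bound(cell):
        # Optimistic estimate of the remaining cost: ignores obstacles.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    def search(cell, cost, path, visited):
        # Prune any branch that cannot beat the best complete path found so far.
        if cost + lower_bound(cell) >= best["cost"]:
            return
        if cell == goal:
            best["cost"], best["path"] = cost, list(path)
            return
        x, y = cell
        candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        neighbours = [c for c in candidates
                      if 0 <= c[0] < width and 0 <= c[1] < height
                      and c not in blocked and c not in visited]
        neighbours.sort(key=lower_bound)  # explore the most promising branch first
        for nxt in neighbours:
            visited.add(nxt)
            path.append(nxt)
            search(nxt, cost + 1, path, visited)
            path.pop()
            visited.remove(nxt)

    search(start, 0, [start], {start})
    return best["cost"], best["path"]

wall = [(1, 0), (1, 1), (1, 2)]
cost, path = shortest_path_branch_and_bound(4, 4, wall, (0, 0), (3, 0))
```

The tighter the lower bound, the earlier branches are pruned. In the method of the abstract, the paths surviving this stage would then be refined by a local optimization, which this sketch omits.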
An algorithm for continuum modeling of rocks with multiple embedded nonlinearly-compliant joints [Continuum modeling of elasto-plastic media with multiple embedded nonlinearly-compliant joints]

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hurley, R. C.; Vorobiev, O. Y.; Ezzedine, S. M.

Here, we present a numerical method for modeling the mechanical effects of nonlinearly-compliant joints in elasto-plastic media. The method uses a series of strain-rate and stress update algorithms to determine joint closure, slip, and solid stress within computational cells containing multiple “embedded” joints. This work facilitates efficient modeling of nonlinear wave propagation in large spatial domains containing a large number of joints that affect bulk mechanical properties. We implement the method within the massively parallel Lagrangian code GEODYN-L and provide verification and examples. We highlight the ability of our algorithms to capture joint interactions and multiple weakness planes within individual computational cells, as well as its computational efficiency. We also discuss the motivation for developing the proposed technique: to simulate large-scale wave propagation during the Source Physics Experiments (SPE), a series of underground explosions conducted at the Nevada National Security Site (NNSS).
An algorithm for continuum modeling of rocks with multiple embedded nonlinearly-compliant joints [Continuum modeling of elasto-plastic media with multiple embedded nonlinearly-compliant joints]

DOE PAGES

Hurley, R. C.; Vorobiev, O. Y.; Ezzedine, S. M.

2017-04-06

Here, we present a numerical method for modeling the mechanical effects of nonlinearly-compliant joints in elasto-plastic media. The method uses a series of strain-rate and stress update algorithms to determine joint closure, slip, and solid stress within computational cells containing multiple “embedded” joints. This work facilitates efficient modeling of nonlinear wave propagation in large spatial domains containing a large number of joints that affect bulk mechanical properties. We implement the method within the massively parallel Lagrangian code GEODYN-L and provide verification and examples. We highlight the ability of our algorithms to capture joint interactions and multiple weakness planes within individual computational cells, as well as its computational efficiency. We also discuss the motivation for developing the proposed technique: to simulate large-scale wave propagation during the Source Physics Experiments (SPE), a series of underground explosions conducted at the Nevada National Security Site (NNSS).
Fast modal extraction in NASTRAN via the FEER computer program. [based on automatic matrix reduction method for lower modes of structures with many degrees of freedom

NASA Technical Reports Server (NTRS)

Newman, M. B.; Pipano, A.

1973-01-01

A new eigensolution routine, FEER (Fast Eigensolution Extraction Routine), used in conjunction with NASTRAN at Israel Aircraft Industries is described. The FEER program is based on an automatic matrix reduction scheme whereby the lower modes of structures with many degrees of freedom can be accurately extracted from a tridiagonal eigenvalue problem whose size is of the same order of magnitude as the number of required modes. The process is effected without arbitrary lumping of masses at selected node points or selection of nodes to be retained in the analysis set. The results of computational efficiency studies are presented, showing major arithmetic operation counts and actual computer run times of FEER as compared to other methods of eigenvalue extraction, including those available in the NASTRAN READ module. It is concluded that the tridiagonal reduction method used in FEER would serve as a valuable addition to NASTRAN for highly increased efficiency in obtaining structural vibration modes.
A fast Fourier transform on multipoles (FFTM) algorithm for solving Helmholtz equation in acoustics analysis.

PubMed

Ong, Eng Teo; Lee, Heow Pueh; Lim, Kian Meng

2004-09-01

This article presents a fast algorithm for the efficient solution of the Helmholtz equation. The method is based on the translation theory of the multipole expansions. Here, the speedup comes from the convolution nature of the translation operators, which can be evaluated rapidly using fast Fourier transform algorithms. Also, the computations of the translation operators are accelerated by using the recursive formulas developed recently by Gumerov and Duraiswami [SIAM J. Sci. Comput. 25, 1344-1381 (2003)]. It is demonstrated that the algorithm can produce good accuracy with a relatively low order of expansion. Efficiency analyses of the algorithm reveal that it has computational complexities of O(N^a), where a ranges from 1.05 to 1.24. However, this method requires substantially more memory to store the translation operators as compared to the fast multipole method. Hence, despite its simplicity in implementation, this memory requirement issue may limit the application of this algorithm to solving very large-scale problems.
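The speedup described in this abstract rests on a general fact: a discrete convolution can be evaluated with FFTs in O(N log N) operations instead of O(N^2). The following minimal NumPy sketch illustrates only that general idea. The random 1-D arrays are stand-ins chosen for illustration; this is not the FFTM translation operator from the article.

```python
import numpy as np

def direct_convolution(a, b):
    """Full linear convolution by an explicit double loop, O(len(a) * len(b))."""
    out = np.zeros(len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def fft_convolution(a, b):
    """The same linear convolution via zero-padded real FFTs, O(N log N)."""
    n = len(a) + len(b) - 1          # length of the full linear convolution
    spectrum = np.fft.rfft(a, n) * np.fft.rfft(b, n)
    return np.fft.irfft(spectrum, n)

# Illustrative data only (assumption: any real sequences would do).
rng = np.random.default_rng(0)
a = rng.standard_normal(64)
b = rng.standard_normal(64)

# Largest pointwise disagreement between the two evaluations.
max_diff = np.max(np.abs(direct_convolution(a, b) - fft_convolution(a, b)))
```

Both routines return the same values up to rounding; only the operation count differs, which is the property the abstract attributes to the translation step.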
Computationally efficient characterization of potential energy surfaces based on fingerprint distances

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schaefer, Bastian; Goedecker, Stefan, E-mail: stefan.goedecker@unibas.ch

2016-07-21

An analysis of the network defined by the potential energy minima of multi-atomic systems and their connectivity via reaction pathways that go through transition states allows us to understand important characteristics like thermodynamic, dynamic, and structural properties. Unfortunately, computing the transition states and reaction pathways in addition to the significant energetically low-lying local minima is a computationally demanding task. Here we introduce a computationally efficient method that is based on a combination of the minima hopping global optimization method and the insight that uphill barriers tend to increase with increasing structural distances of the educt and product states. This method allows us to replace the exact connectivity information and transition state energies with alternative and approximate concepts. Without adding any significant additional cost to the minima hopping global optimization approach, this method allows us to generate an approximate network of the minima, their connectivity, and a rough measure for the energy needed for their interconversion. This can be used to obtain a first qualitative idea on important physical and chemical properties by means of a disconnectivity graph analysis. Besides the physical insight obtained by such an analysis, the gained knowledge can be used to decide whether it is worthwhile to invest computational resources in an exact computation of the transition states and the reaction pathways. Furthermore, it is demonstrated that the method presented here can be used for finding physically reasonable interconversion pathways that are promising input pathways for methods like transition path sampling or discrete path sampling.
Sensitivity Analysis for Coupled Aero-structural Systems

NASA Technical Reports Server (NTRS)

Giunta, Anthony A.

1999-01-01

A novel method has been developed for calculating gradients of aerodynamic force and moment coefficients for an aeroelastic aircraft model. This method uses the Global Sensitivity Equations (GSE) to account for the aero-structural coupling, and a reduced-order modal analysis approach to condense the coupling bandwidth between the aerodynamic and structural models. Parallel computing is applied to reduce the computational expense of the numerous high fidelity aerodynamic analyses needed for the coupled aero-structural system. Good agreement is obtained between aerodynamic force and moment gradients computed with the GSE/modal analysis approach and the same quantities computed using brute-force, computationally expensive, finite difference approximations. A comparison between the computational expense of the GSE/modal analysis method and a pure finite difference approach is presented. These results show that the GSE/modal analysis approach is the more computationally efficient technique if sensitivity analysis is to be performed for two or more aircraft design parameters.
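The cost comparison in this abstract follows from how brute-force finite differences scale: each additional design parameter requires one more full analysis. The sketch below illustrates that scaling only. It does not implement the GSE/modal approach, and `toy_coefficient` is a hypothetical stand-in for an expensive aerodynamic analysis.

```python
import numpy as np

def forward_difference_gradient(f, x, h=1e-6):
    """Forward-difference gradient of f at x.

    Returns (gradient, number of evaluations of f). The count is
    len(x) + 1: one baseline analysis plus one per design parameter.
    """
    x = np.asarray(x, dtype=float)
    f0 = f(x)
    grad = np.zeros_like(x)
    for i in range(x.size):
        xp = x.copy()
        xp[i] += h                     # perturb one design parameter
        grad[i] = (f(xp) - f0) / h
    return grad, x.size + 1

def toy_coefficient(x):
    # Hypothetical smooth function standing in for a costly analysis.
    return x[0] ** 2 + 3.0 * x[0] * x[1] + np.sin(x[2])

grad, n_evals = forward_difference_gradient(toy_coefficient, [1.0, 2.0, 0.5])
```

With three parameters the loop needs four evaluations; with high-fidelity analyses, that linear growth is what makes the finite difference approach expensive.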

A DAG Scheduling Scheme on Heterogeneous Computing Systems Using Tuple-Based Chemical Reaction Optimization

PubMed Central

Jiang, Yuyi; Shao, Zhiqing; Guo, Yi

2014-01-01

A complex computing problem can be solved efficiently on a system with multiple computing nodes by dividing its implementation code into several parallel processing modules or tasks that can be formulated as directed acyclic graph (DAG) problems. The DAG jobs may be mapped to and scheduled on the computing nodes to minimize the total execution time. Searching for an optimal DAG scheduling solution is considered to be NP-complete. This paper proposes a tuple molecular structure-based chemical reaction optimization (TMSCRO) method for DAG scheduling on heterogeneous computing systems, based on a very recently proposed metaheuristic method, chemical reaction optimization (CRO). Compared with other CRO-based algorithms for DAG scheduling, the design of tuple reaction molecular structure and four elementary reaction operators of TMSCRO is more reasonable. TMSCRO also applies the concept of constrained critical paths (CCPs), constrained-critical-path directed acyclic graph (CCPDAG) and super molecule for accelerating convergence. In this paper, we have also conducted simulation experiments to verify the effectiveness and efficiency of TMSCRO upon a large set of randomly generated graphs and the graphs for real world problems. PMID:25143977
A DAG scheduling scheme on heterogeneous computing systems using tuple-based chemical reaction optimization.

PubMed

Jiang, Yuyi; Shao, Zhiqing; Guo, Yi

2014-01-01

A complex computing problem can be solved efficiently on a system with multiple computing nodes by dividing its implementation code into several parallel processing modules or tasks that can be formulated as directed acyclic graph (DAG) problems. The DAG jobs may be mapped to and scheduled on the computing nodes to minimize the total execution time. Searching for an optimal DAG scheduling solution is considered to be NP-complete. This paper proposes a tuple molecular structure-based chemical reaction optimization (TMSCRO) method for DAG scheduling on heterogeneous computing systems, based on a very recently proposed metaheuristic method, chemical reaction optimization (CRO). Compared with other CRO-based algorithms for DAG scheduling, the design of tuple reaction molecular structure and four elementary reaction operators of TMSCRO is more reasonable. TMSCRO also applies the concept of constrained critical paths (CCPs), constrained-critical-path directed acyclic graph (CCPDAG) and super molecule for accelerating convergence. In this paper, we have also conducted simulation experiments to verify the effectiveness and efficiency of TMSCRO upon a large set of randomly generated graphs and the graphs for real world problems.
Predicting Flows of Rarefied Gases

NASA Technical Reports Server (NTRS)

LeBeau, Gerald J.; Wilmoth, Richard G.

2005-01-01

DSMC Analysis Code (DAC) is a flexible, highly automated, easy-to-use computer program for predicting flows of rarefied gases -- especially flows of upper-atmospheric, propulsion, and vented gases impinging on spacecraft surfaces. DAC implements the direct simulation Monte Carlo (DSMC) method, which is widely recognized as standard for simulating flows at densities so low that the continuum-based equations of computational fluid dynamics are invalid. DAC enables users to model complex surface shapes and boundary conditions quickly and easily. The discretization of a flow field into computational grids is automated, thereby relieving the user of a traditionally time-consuming task while ensuring (1) appropriate refinement of grids throughout the computational domain, (2) determination of optimal settings for temporal discretization and other simulation parameters, and (3) satisfaction of the fundamental constraints of the method. In so doing, DAC ensures an accurate and efficient simulation. In addition, DAC can utilize parallel processing to reduce computation time. The domain decomposition needed for parallel processing is completely automated, and the software employs a dynamic load-balancing mechanism to ensure optimal parallel efficiency throughout the simulation.
An initial investigation into methods of computing transonic aerodynamic sensitivity coefficients

NASA Technical Reports Server (NTRS)

Carlson, Leland A.

1988-01-01

The initial effort was concentrated on developing the quasi-analytical approach for two-dimensional transonic flow. To keep the problem computationally efficient and straightforward, only the two-dimensional flow was considered and the problem was modeled using the transonic small perturbation equation.
Computational efficient segmentation of cell nuclei in 2D and 3D fluorescent micrographs

NASA Astrophysics Data System (ADS)

De Vylder, Jonas; Philips, Wilfried

2011-02-01

This paper proposes a new segmentation technique developed for the segmentation of cell nuclei in both 2D and 3D fluorescent micrographs. The proposed method can deal with both blurred edges and touching nuclei. Because it uses a dual scan line algorithm, it is both memory efficient and computationally efficient, making it interesting for the analysis of images coming from high throughput systems or the analysis of 3D microscopic images. Experiments show good results, i.e. recall of over 0.98.
High-Level Data-Abstraction System

NASA Technical Reports Server (NTRS)

Fishwick, P. A.

1986-01-01

Communication with data-base processor flexible and efficient. High Level Data Abstraction (HILDA) system is three-layer system supporting data-abstraction features of Intel data-base processor (DBP). Purpose of HILDA establishment of flexible method of efficiently communicating with DBP. Power of HILDA lies in its extensibility with regard to syntax and semantic changes. HILDA's high-level query language readily modified. Offers powerful potential to computer sites where DBP attached to DEC VAX-series computer. HILDA system written in Pascal and FORTRAN 77 for interactive execution.
ACCESS 1: Approximation Concepts Code for Efficient Structural Synthesis program documentation and user's guide

NASA Technical Reports Server (NTRS)

Miura, H.; Schmit, L. A., Jr.

1976-01-01

The program documentation and user's guide for the ACCESS-1 computer program is presented. ACCESS-1 is a research oriented program which implements a collection of approximation concepts to achieve excellent efficiency in structural synthesis. The finite element method is used for structural analysis and general mathematical programming algorithms are applied in the design optimization procedure. Implementation of the computer program, preparation of input data and basic program structure are described, and three illustrative examples are given.
A New Method for Computing Three-Dimensional Capture Fraction in Heterogeneous Regional Systems using the MODFLOW Adjoint Code

NASA Astrophysics Data System (ADS)

Clemo, T. M.; Ramarao, B.; Kelly, V. A.; Lavenue, M.

2011-12-01

Capture is a measure of the impact of groundwater pumping upon groundwater and surface water systems. The computation of capture through analytical or numerical methods has been the subject of articles in the literature for several decades (Bredehoeft et al., 1982). Most recently Leake et al. (2010) described a systematic way to produce capture maps in three-dimensional systems using a numerical perturbation approach in which capture from streams was computed using unit rate pumping at many locations within a MODFLOW model. The Leake et al. (2010) method advances the current state of computing capture. A limitation stems from the computational demand required by the perturbation approach wherein days or weeks of computational time might be required to obtain a robust measure of capture. In this paper, we present an efficient method to compute capture in three-dimensional systems based upon adjoint states. The efficiency of the adjoint method will enable uncertainty analysis to be conducted on capture calculations. The USGS and INTERA have collaborated to extend the MODFLOW Adjoint code (Clemo, 2007) to include stream-aquifer interaction and have applied it to one of the examples used in Leake et al. (2010), the San Pedro Basin MODFLOW model. With five layers and 140,800 grid blocks per layer, the San Pedro Basin model provided an ideal example data set to compare the capture computed from the perturbation and the adjoint methods. The capture fraction map produced from the perturbation method for the San Pedro Basin model required significant computational time to compute and therefore the locations for the pumping wells were limited to 1530 locations in layer 4. The 1530 direct simulations of capture require approximately 76 CPU hours. Had capture been simulated in each grid block in each layer, as is done in the adjoint method, the CPU time would have been on the order of 4 years.
The MODFLOW-Adjoint produced the capture fraction map of the San Pedro Basin model at 704,000 grid blocks (140,800 grid blocks x 5 layers) in just 6 minutes. The capture fraction maps from the perturbation and adjoint methods agree closely. The results of this study indicate that the adjoint capture method and its associated computational efficiency will enable scientists and engineers facing water resource management decisions to evaluate the sensitivity and uncertainty of impacts to regional water resource systems as part of groundwater supply strategies. Bredehoeft, J.D., S.S. Papadopulos, and H.H. Cooper Jr, Groundwater: The water budget myth. In Scientific Basis of Water-Resources Management, ed. National Research Council (U.S.), Geophysical Study Committee, 51-57. Washington D.C.: National Academy Press, 1982. Clemo, Tom, MODFLOW-2005 Ground-Water Model-Users Guide to Adjoint State based Sensitivity Process (ADJ), BSU CGISS 07-01, Center for the Geophysical Investigation of the Shallow Subsurface, Boise State University, 2007. Leake, S.A., H.W. Reeves, and J.E. Dickinson, A New Capture Fraction Method to Map How Pumpage Affects Surface Water Flow, Ground Water, 48(5), 670-700, 2010.
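The "on the order of 4 years" estimate in this abstract can be reproduced from the figures it reports. The short calculation below extrapolates the perturbation cost linearly to every grid block. The linear scaling and the treatment of the 6-minute adjoint run as directly comparable compute time are assumptions made here for illustration.

```python
# Figures reported in the abstract.
perturbation_runs = 1530
perturbation_cpu_hours = 76.0
blocks_per_layer = 140_800
layers = 5
adjoint_minutes = 6.0

# Assumption: every perturbation run costs about the same.
cpu_hours_per_run = perturbation_cpu_hours / perturbation_runs
total_blocks = blocks_per_layer * layers

# Cost of one perturbation run per grid block, in hours and in years.
perturbation_hours = cpu_hours_per_run * total_blocks
perturbation_years = perturbation_hours / (24 * 365)

# Ratio to the adjoint run (assumed comparable compute time).
adjoint_hours = adjoint_minutes / 60.0
speedup = perturbation_hours / adjoint_hours
```

Under these assumptions the extrapolated perturbation cost comes out close to four years, consistent with the abstract, and the ratio to the adjoint run is several hundred thousand.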
Improved Ant Colony Clustering Algorithm and Its Performance Study

PubMed Central

Gao, Wei

2016-01-01

Clustering analysis is used in many disciplines and applications; it is an important tool that descriptively identifies homogeneous groups of objects based on attribute values. The ant colony clustering algorithm is a swarm-intelligent method used for clustering problems that is inspired by the behavior of ant colonies that cluster their corpses and sort their larvae. A new abstraction ant colony clustering algorithm using a data combination mechanism is proposed to improve the computational efficiency and accuracy of the ant colony clustering algorithm. The abstraction ant colony clustering algorithm is used to cluster benchmark problems, and its performance is compared with the ant colony clustering algorithm and other methods used in existing literature. Based on similar computational difficulties and complexities, the results show that the abstraction ant colony clustering algorithm produces results that are not only more accurate but also more efficiently determined than the ant colony clustering algorithm and the other methods. Thus, the abstraction ant colony clustering algorithm can be used for efficient multivariate data clustering. PMID:26839533
An efficient algorithm using matrix methods to solve wind tunnel force-balance equations

NASA Technical Reports Server (NTRS)

Smith, D. L.

1972-01-01

An iterative procedure applying matrix methods to accomplish an efficient algorithm for automatic computer reduction of wind-tunnel force-balance data has been developed. Balance equations are expressed in a matrix form that is convenient for storing balance sensitivities and interaction coefficient values for online or offline batch data reduction. The convergence of the iterative values to a unique solution of this system of equations is investigated, and it is shown that for balances which satisfy the criteria discussed, this type of solution does occur. Methods for making sensitivity adjustments and initial load effect considerations in wind-tunnel applications are also discussed, and the logic for determining the convergence accuracy limits for the iterative solution is given. This more efficient data reduction program is compared with the technique presently in use at the NASA Langley Research Center, and computational times on the order of one-third or less are demonstrated by use of this new program.
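An iterative matrix reduction of this kind can be sketched as a fixed-point iteration. The form used below is an assumption, since the abstract does not state its equations: readings are modeled as R = C1 F + C2 q(F), with q(F) the pairwise load products. All coefficients are hypothetical values chosen so the interactions are weak enough for the iteration to converge.

```python
import numpy as np

def quadratic_terms(loads):
    """All products F_i * F_j with i <= j (assumed form of the interaction terms)."""
    i, j = np.triu_indices(loads.size)
    return loads[i] * loads[j]

def reduce_balance(readings, c1, c2, tol=1e-12, max_iter=100):
    """Fixed-point iteration F <- C1^{-1} (R - C2 q(F)).

    Returns (loads, number of iterations used).
    """
    loads = np.linalg.solve(c1, readings)          # first-order estimate
    for iteration in range(1, max_iter + 1):
        new_loads = np.linalg.solve(c1, readings - c2 @ quadratic_terms(loads))
        if np.max(np.abs(new_loads - loads)) < tol:
            return new_loads, iteration
        loads = new_loads
    raise RuntimeError("iteration did not converge")

# Hypothetical three-component balance with weak interactions.
c1 = np.array([[1.00, 0.02, 0.00],
               [0.01, 1.00, 0.03],
               [0.00, 0.02, 1.00]])
c2 = 1e-3 * np.ones((3, 6))
true_loads = np.array([10.0, -5.0, 2.0])

# Synthesize readings from known loads, then recover the loads.
readings = c1 @ true_loads + c2 @ quadratic_terms(true_loads)
loads, n_iter = reduce_balance(readings, c1, c2)
```

Storing the sensitivities (`c1`) and interaction coefficients (`c2`) as matrices is what makes batch reduction convenient; convergence depends on the interaction terms being small relative to the first-order sensitivities.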
Parallelized Stochastic Cutoff Method for Long-Range Interacting Systems

NASA Astrophysics Data System (ADS)

Endo, Eishin; Toga, Yuta; Sasaki, Munetaka

2015-07-01

We present a method of parallelizing the stochastic cutoff (SCO) method, which is a Monte-Carlo method for long-range interacting systems. After interactions are eliminated by the SCO method, we subdivide a lattice into noninteracting interpenetrating sublattices. This subdivision enables us to parallelize the Monte-Carlo calculation in the SCO method. Such subdivision is found by numerically solving the vertex coloring of a graph created by the SCO method. We use an algorithm proposed by Kuhn and Wattenhofer to solve the vertex coloring by parallel computation. This method was applied to a two-dimensional magnetic dipolar system on an L × L square lattice to examine its parallelization efficiency. The result showed that, in the case of L = 2304, the speed of computation increased about 102 times by parallel computation with 288 processors.
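The subdivision step in this abstract amounts to coloring a graph so that sites sharing a color do not interact and can be updated concurrently. The sketch below uses a simple sequential greedy coloring, not the parallel Kuhn and Wattenhofer algorithm the authors use. The nearest-neighbor periodic lattice is a hypothetical stand-in for the graph of bonds that survive the SCO step.

```python
def greedy_coloring(neighbors):
    """Give each vertex the smallest color not used by its colored neighbors."""
    colors = {}
    for v in sorted(neighbors):
        used = {colors[u] for u in neighbors[v] if u in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors

def square_lattice_neighbors(L):
    """Nearest-neighbor bonds on an L x L periodic lattice (illustrative graph)."""
    nbrs = {}
    for x in range(L):
        for y in range(L):
            nbrs[(x, y)] = [((x + 1) % L, y), ((x - 1) % L, y),
                            (x, (y + 1) % L), (x, (y - 1) % L)]
    return nbrs

neighbors = square_lattice_neighbors(4)
colors = greedy_coloring(neighbors)

# Group sites by color: each group is a set of mutually noninteracting sites.
sublattices = {}
for site, c in colors.items():
    sublattices.setdefault(c, []).append(site)
```

Each entry of `sublattices` could then be swept in parallel, one color at a time, because no two sites within a color share a bond.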
An efficient and guaranteed stable numerical method for continuous modeling of infiltration and redistribution with a shallow dynamic water table

NASA Astrophysics Data System (ADS)

Lai, Wencong; Ogden, Fred L.; Steinke, Robert C.; Talbot, Cary A.

2015-03-01

We have developed a one-dimensional numerical method to simulate infiltration and redistribution in the presence of a shallow dynamic water table. This method builds upon the Green-Ampt infiltration with Redistribution (GAR) model and incorporates features from the Talbot-Ogden (T-O) infiltration and redistribution method in a discretized moisture content domain. The redistribution scheme is more physically meaningful than the capillary weighted redistribution scheme in the T-O method. Groundwater dynamics are considered in this new method instead of a hydrostatic groundwater front. It is also computationally more efficient than the T-O method. Motion of water in the vadose zone due to infiltration, redistribution, and interactions with capillary groundwater is described by ordinary differential equations. Numerical solutions to these equations are computationally less expensive than solutions of the highly nonlinear Richards' (1931) partial differential equation. We present results from numerical tests on 11 soil types using multiple rain pulses with different boundary conditions, with and without a shallow water table, and compare against the numerical solution of Richards' equation (RE). Results from the new method are in satisfactory agreement with RE solutions in terms of ponding time, deponding time, infiltration rate, and cumulative infiltrated depth. The new method, which we call "GARTO", can be used as an alternative to the RE for 1-D coupled surface and groundwater models in general situations with homogeneous soils and a dynamic water table. The GARTO method represents a significant advance in simulating groundwater-surface water interactions because it very closely matches the RE solution while being computationally efficient, with guaranteed mass conservation, and no stability limitations that can affect RE solvers in the case of a near-surface water table.
Recurrent neural network based virtual detection line

NASA Astrophysics Data System (ADS)

Kadikis, Roberts

2018-04-01

The paper proposes an efficient method for detection of moving objects in the video. The objects are detected when they cross a virtual detection line. Only the pixels of the detection line are processed, which makes the method computationally efficient. A Recurrent Neural Network processes these pixels. The machine learning approach allows one to train a model that works in different and changing outdoor conditions. Also, the same network can be trained for various detection tasks, which is demonstrated by the tests on vehicle and people counting. In addition, the paper proposes a method for semi-automatic acquisition of labeled training data. The labeling method is used to create training and testing datasets, which in turn are used to train and evaluate the accuracy and efficiency of the detection method. The method shows similar accuracy as the alternative efficient methods but provides greater adaptability and usability for different tasks.
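The efficiency argument in this abstract is that only the pixels on the detection line are passed to the network. The sketch below shows that input reduction on synthetic frames. The recurrent network itself is not implemented, and the frame size, the horizontal line, and the moving block are hypothetical choices for illustration.

```python
import numpy as np

def detection_line_sequence(frames, row, col_start, col_end):
    """Return an array of shape (n_frames, line_length) holding only the
    pixels on a horizontal detection line in each frame."""
    frames = np.asarray(frames)
    return frames[:, row, col_start:col_end]

# Synthetic video: 20 grayscale frames of 48 x 64 pixels, with a bright
# 6 x 10 block moving downward by 2 rows per frame.
frames = np.zeros((20, 48, 64), dtype=np.float32)
for t in range(20):
    top = 2 * t
    frames[t, top:top + 6, 20:30] = 1.0

# Keep only row 24; this sequence is what a recurrent model would consume.
line = detection_line_sequence(frames, row=24, col_start=0, col_end=64)

# Share of the original pixel data that is actually processed.
fraction_kept = line.size / frames.size
```

In this example the line is one row out of 48, so about 2% of the pixels per frame would reach the model, and the block shows up in `line` only in the frames where it crosses row 24.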
A strategy for improved computational efficiency of the method of anchored distributions

NASA Astrophysics Data System (ADS)

Over, Matthew William; Yang, Yarong; Chen, Xingyuan; Rubin, Yoram

2013-06-01

This paper proposes a strategy for improving the computational efficiency of model inversion using the method of anchored distributions (MAD) by "bundling" similar model parametrizations in the likelihood function. Inferring the likelihood function typically requires a large number of forward model (FM) simulations for each possible model parametrization; as a result, the process is quite expensive. To ease this prohibitive cost, we present an approximation for the likelihood function called bundling that relaxes the requirement for high quantities of FM simulations. This approximation redefines the conditional statement of the likelihood function as the probability of a set of similar model parametrizations "bundle" replicating field measurements, which we show is neither a model reduction nor a sampling approach to improving the computational efficiency of model inversion. To evaluate the effectiveness of these modifications, we compare the quality of predictions and computational cost of bundling relative to a baseline MAD inversion of 3-D flow and transport model parameters. Additionally, to aid understanding of the implementation we provide a tutorial for bundling in the form of a sample data set and script for the R statistical computing language. For our synthetic experiment, bundling achieved a 35% reduction in overall computational cost and had a limited negative impact on predicted probability distributions of the model parameters. Strategies for minimizing error in the bundling approximation, for enforcing similarity among the sets of model parametrizations, and for identifying convergence of the likelihood function are also presented.
Modeling bioluminescent photon transport in tissue based on Radiosity-diffusion model

NASA Astrophysics Data System (ADS)

Sun, Li; Wang, Pu; Tian, Jie; Zhang, Bo; Han, Dong; Yang, Xin

2010-03-01

Bioluminescence tomography (BLT) is one of the most important non-invasive optical molecular imaging modalities. The model for the bioluminescent photon propagation plays a significant role in the bioluminescence tomography study. Because of its high computational efficiency, the diffusion approximation (DA) is generally applied in bioluminescence tomography. But the diffusion equation is valid only in highly scattering and weakly absorbing regions and fails in non-scattering or low-scattering tissues, such as a cyst in the breast, the cerebrospinal fluid (CSF) layer of the brain and the synovial fluid layer in the joints. A hybrid Radiosity-diffusion model is proposed in this paper for dealing with the non-scattering regions within diffusing domains. This hybrid method incorporates a priori information on the geometry of non-scattering regions, which can be acquired by magnetic resonance imaging (MRI) or x-ray computed tomography (CT). Then the model is implemented using a finite element method (FEM) to ensure high computational efficiency. Finally, we demonstrate that the method is comparable with the Monte Carlo (MC) method, which is regarded as a 'gold standard' for photon transport simulation.
Phase unwrapping with graph cuts optimization and dual decomposition acceleration for 3D high-resolution MRI data.

PubMed

Dong, Jianwu; Chen, Feng; Zhou, Dong; Liu, Tian; Yu, Zhaofei; Wang, Yi

2017-03-01

Existence of low SNR regions and rapid-phase variations pose challenges to spatial phase unwrapping algorithms. Global optimization-based phase unwrapping methods are widely used, but are significantly slower than greedy methods. In this paper, dual decomposition acceleration is introduced to speed up a three-dimensional graph cut-based phase unwrapping algorithm. The phase unwrapping problem is formulated as a global discrete energy minimization problem, whereas the technique of dual decomposition is used to increase the computational efficiency by splitting the full problem into overlapping subproblems and enforcing the congruence of overlapping variables. Using three dimensional (3D) multiecho gradient echo images from an agarose phantom and five brain hemorrhage patients, we compared this proposed method with an unaccelerated graph cut-based method. Experimental results show up to 18-fold acceleration in computation time. Dual decomposition significantly improves the computational efficiency of 3D graph cut-based phase unwrapping algorithms. Magn Reson Med 77:1353-1358, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
Bypassing the malfunction junction in warm dense matter simulations

NASA Astrophysics Data System (ADS)

Cangi, Attila; Pribram-Jones, Aurora

2015-03-01

Simulation of warm dense matter requires computational methods that capture both quantum and classical behavior efficiently under high-temperature and high-density conditions. The state-of-the-art approach to model electrons and ions under those conditions is density functional theory molecular dynamics, but this method's computational cost skyrockets as temperatures and densities increase. We propose finite-temperature potential functional theory as an in-principle-exact alternative that suffers no such drawback. In analogy to the zero-temperature theory developed previously, we derive an orbital-free free energy approximation through a coupling-constant formalism. Our density approximation and its associated free energy approximation demonstrate the method's accuracy and efficiency. A.C. has been partially supported by NSF Grant CHE-1112442. A.P.J. is supported by DOE Grant DE-FG02-97ER25308.
Unsteady Fast Random Particle Mesh method for efficient prediction of tonal and broadband noises of a centrifugal fan unit

NASA Astrophysics Data System (ADS)

Heo, Seung; Cheong, Cheolung; Kim, Taehoon

2015-09-01

In this study, an efficient numerical method is proposed for predicting tonal and broadband noises of a centrifugal fan unit. The proposed method is based on Hybrid Computational Aero-Acoustic (H-CAA) techniques combined with the Unsteady Fast Random Particle Mesh (U-FRPM) method. The U-FRPM method is developed by extending the FRPM method proposed by Ewert et al. and is utilized to synthesize turbulent flow fields from unsteady RANS solutions. The H-CAA technique combined with the U-FRPM method is applied to predict broadband as well as tonal noises of a centrifugal fan unit in a household refrigerator. First, the unsteady flow field driven by a rotating fan is computed by solving the RANS equations with Computational Fluid Dynamic (CFD) techniques. Main source regions around the rotating fan are identified by examining the computed flow fields. Then, turbulent flow fields in the main source regions are synthesized by applying the U-FRPM method. The acoustic analogy is applied to model acoustic sources in the main source regions. Finally, the centrifugal fan noise is predicted by feeding the modeled acoustic sources into an acoustic solver based on the Boundary Element Method (BEM). The sound spectral levels predicted using the current numerical method show good agreement with the measured spectra at the Blade Pass Frequencies (BPFs) as well as in the high frequency range. Furthermore, the present method enables quantitative assessment of the relative contributions of identified source regions to the sound field by comparing the predicted sound pressure spectra due to the modeled sources.
Study of flutter related computational procedures for minimum weight structural sizing of advanced aircraft, supplemental data

NASA Technical Reports Server (NTRS)

Oconnell, R. F.; Hassig, H. J.; Radovcich, N. A.

1975-01-01

Computational aspects of (1) flutter optimization (minimization of structural mass subject to specified flutter requirements), (2) methods for solving the flutter equation, and (3) efficient methods for computing generalized aerodynamic force coefficients in the repetitive analysis environment of computer-aided structural design are discussed. Specific areas included: a two-dimensional Regula Falsi approach to solving the generalized flutter equation; method of incremented flutter analysis and its applications; the use of velocity potential influence coefficients in a five-matrix product formulation of the generalized aerodynamic force coefficients; options for computational operations required to generate generalized aerodynamic force coefficients; theoretical considerations related to optimization with one or more flutter constraints; and expressions for derivatives of flutter-related quantities with respect to design variables.
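For readers unfamiliar with the Regula Falsi approach named in this abstract, the sketch below shows the classic one-dimensional version applied to a scalar equation. It is not the two-dimensional variant for the generalized flutter equation described in the report, and the cubic test function is a hypothetical example.

```python
def regula_falsi(f, a, b, tol=1e-10, max_iter=200):
    """Find a root of f in [a, b] by false position.

    Requires f(a) and f(b) to have opposite signs. Each step replaces one
    end of the bracket with the x-intercept of the chord through the ends.
    """
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("root is not bracketed")
    c = a
    for _ in range(max_iter):
        c = b - fb * (b - a) / (fb - fa)   # x-intercept of the chord
        fc = f(c)
        if abs(fc) < tol:
            return c
        if fa * fc < 0:
            b, fb = c, fc                  # root lies in [a, c]
        else:
            a, fa = c, fc                  # root lies in [c, b]
    return c

def cubic(x):
    # Hypothetical test equation with a single real root between 2 and 3.
    return x ** 3 - 2.0 * x - 5.0

root = regula_falsi(cubic, 2.0, 3.0)
```

The bracket always contains the root, which makes the method robust, although convergence is only linear when one end of the bracket stays fixed, as it does for this convex example.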
Efficient visibility encoding for dynamic illumination in direct volume rendering.

PubMed

Kronander, Joel; Jönsson, Daniel; Löw, Joakim; Ljung, Patric; Ynnerman, Anders; Unger, Jonas

2012-03-01

We present an algorithm that enables real-time dynamic shading in direct volume rendering using general lighting, including directional lights, point lights, and environment maps. Real-time performance is achieved by encoding local and global volumetric visibility using spherical harmonic (SH) basis functions stored in an efficient multiresolution grid over the extent of the volume. Our method enables high-frequency shadows in the spatial domain, but is limited to a low-frequency approximation of visibility and illumination in the angular domain. In a first pass, level of detail (LOD) selection in the grid is based on the current transfer function setting. This enables rapid online computation and SH projection of the local spherical distribution of visibility information. Using a piecewise integration of the SH coefficients over the local regions, the global visibility within the volume is then computed. By representing the light sources using their SH projections, the integral over lighting, visibility, and isotropic phase functions can be efficiently computed during rendering. The utility of our method is demonstrated in several examples showing the generality and interactive performance of the approach.

Strategies for efficient numerical implementation of hybrid multi-scale agent-based models to describe biological systems

PubMed Central

Cilfone, Nicholas A.; Kirschner, Denise E.; Linderman, Jennifer J.

2015-01-01

Biologically related processes operate across multiple spatiotemporal scales. For computational modeling methodologies to mimic this biological complexity, individual scale models must be linked in ways that allow for dynamic exchange of information across scales. A powerful methodology is to combine a discrete modeling approach, agent-based models (ABMs), with continuum models to form hybrid models. Hybrid multi-scale ABMs have been used to simulate emergent responses of biological systems. Here, we review two aspects of hybrid multi-scale ABMs: linking individual scale models and efficiently solving the resulting model. We discuss the computational choices associated with aspects of linking individual scale models while simultaneously maintaining model tractability. We demonstrate implementations of existing numerical methods in the context of hybrid multi-scale ABMs. Using an example model describing Mycobacterium tuberculosis infection, we show relative computational speeds of various combinations of numerical methods. Efficient linking and solution of hybrid multi-scale ABMs is key to model portability, modularity, and their use in understanding biological phenomena at a systems level. PMID:26366228
High-order computational fluid dynamics tools for aircraft design

PubMed Central

Wang, Z. J.

2014-01-01

Most forecasts predict an annual airline traffic growth rate between 4.5 and 5% in the foreseeable future. To sustain that growth, the environmental impact of aircraft cannot be ignored. Future aircraft must have much better fuel economy, dramatically less greenhouse gas emissions and noise, in addition to better performance. Many technical breakthroughs must take place to achieve the aggressive environmental goals set up by governments in North America and Europe. One of these breakthroughs will be physics-based, highly accurate and efficient computational fluid dynamics and aeroacoustics tools capable of predicting complex flows over the entire flight envelope and through an aircraft engine, and computing aircraft noise. Some of these flows are dominated by unsteady vortices of disparate scales, often highly turbulent, and they call for higher-order methods. As these tools will be integral components of a multi-disciplinary optimization environment, they must be efficient to impact design. Ultimately, the accuracy, efficiency, robustness, scalability and geometric flexibility will determine which methods will be adopted in the design process. This article explores these aspects and identifies pacing items. PMID:25024419
Redundancy management for efficient fault recovery in NASA's distributed computing system

NASA Technical Reports Server (NTRS)

Malek, Miroslaw; Pandya, Mihir; Yau, Kitty

1991-01-01

The management of redundancy in computer systems was studied and guidelines were provided for the development of NASA's fault-tolerant distributed systems. Fault recovery and reconfiguration mechanisms were examined. A theoretical foundation was laid for redundancy management by efficient reconfiguration methods and algorithmic diversity. Algorithms were developed to optimize the resources for embedding of computational graphs of tasks in the system architecture and reconfiguration of these tasks after a failure has occurred. The computational structure represented by a path and the complete binary tree was considered and the mesh and hypercube architectures were targeted for their embeddings. The innovative concept of Hybrid Algorithm Technique was introduced. This new technique provides a mechanism for obtaining fault tolerance while exhibiting improved performance.
Novel patch modelling method for efficient simulation and prediction uncertainty analysis of multi-scale groundwater flow and transport processes

NASA Astrophysics Data System (ADS)

Sreekanth, J.; Moore, Catherine

2018-04-01

The application of global sensitivity and uncertainty analysis techniques to groundwater models of deep sedimentary basins is typically challenged by large computational burdens combined with associated numerical stability issues. The highly parameterized approaches required for exploring the predictive uncertainty associated with the heterogeneous hydraulic characteristics of multiple aquifers and aquitards in these sedimentary basins exacerbate these issues. A novel Patch Modelling Methodology is proposed for improving the computational feasibility of stochastic modelling analysis of large-scale and complex groundwater models. The method incorporates a nested groundwater modelling framework that enables efficient simulation of groundwater flow and transport across multiple spatial and temporal scales. The method also allows different processes to be simulated within different model scales. Existing nested model methodologies are extended by employing 'joining predictions' for extrapolating prediction-salient information from one model scale to the next. This establishes a feedback mechanism supporting the transfer of information from child models to parent models as well as parent models to child models in a computationally efficient manner. This feedback mechanism is simple and flexible and ensures that while the salient small scale features influencing larger scale prediction are transferred back to the larger scale, this does not require the live coupling of models. This method allows the modelling of multiple groundwater flow and transport processes using separate groundwater models that are built for the appropriate spatial and temporal scales, within a stochastic framework, while also removing the computational burden associated with live model coupling. The utility of the method is demonstrated by application to an actual large scale aquifer injection scheme in Australia.
XPRESS: eXascale PRogramming Environment and System Software

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brightwell, Ron; Sterling, Thomas; Koniges, Alice

The XPRESS Project is one of four major projects of the DOE Office of Science Advanced Scientific Computing Research X-stack Program initiated in September, 2012. The purpose of XPRESS is to devise an innovative system software stack to enable practical and useful exascale computing around the end of the decade with near-term contributions to efficient and scalable operation of trans-Petaflops performance systems in the next two to three years; both for DOE mission-critical applications. To this end, XPRESS directly addresses critical challenges in computing of efficiency, scalability, and programmability through introspective methods of dynamic adaptive resource management and task scheduling.
Three-dimensional photoacoustic tomography based on graphics-processing-unit-accelerated finite element method.

PubMed

Peng, Kuan; He, Ling; Zhu, Ziqiang; Tang, Jingtian; Xiao, Jiaying

2013-12-01

Compared with commonly used analytical reconstruction methods, the frequency-domain finite element method (FEM) based approach has proven to be an accurate and flexible algorithm for photoacoustic tomography. However, the FEM-based algorithm is computationally demanding, especially for three-dimensional cases. To enhance the algorithm's efficiency, in this work a parallel computational strategy is implemented in the framework of the FEM-based reconstruction algorithm using a graphic-processing-unit parallel frame named the "compute unified device architecture." A series of simulation experiments is carried out to test the accuracy and accelerating effect of the improved method. The results obtained indicate that the parallel calculation does not change the accuracy of the reconstruction algorithm, while its computational cost is significantly reduced by a factor of 38.9 with a GTX 580 graphics card using the improved method.
A scalable parallel black oil simulator on distributed memory parallel computers

NASA Astrophysics Data System (ADS)

Wang, Kun; Liu, Hui; Chen, Zhangxin

2015-11-01

This paper presents our work on developing a parallel black oil simulator for distributed memory computers based on our in-house parallel platform. The parallel simulator is designed to overcome the performance issues of common simulators that are implemented for personal computers and workstations. The finite difference method is applied to discretize the black oil model. In addition, some advanced techniques are employed to strengthen the robustness and parallel scalability of the simulator, including an inexact Newton method, matrix decoupling methods, and algebraic multigrid methods. A new multi-stage preconditioner is proposed to accelerate the solution of linear systems from the Newton methods. Numerical experiments show that our simulator is scalable and efficient, and is capable of simulating extremely large-scale black oil problems with tens of millions of grid blocks using thousands of MPI processes on parallel computers.
A Lightweight Protocol for Secure Video Streaming

PubMed Central

Morkevicius, Nerijus; Bagdonas, Kazimieras

2018-01-01

The Internet of Things (IoT) introduces many new challenges which cannot be solved using traditional cloud and host computing models. A new architecture known as fog computing is emerging to address these technological and security gaps. Traditional security paradigms focused on providing perimeter-based protections and client/server point to point protocols (e.g., Transport Layer Security (TLS)) are no longer the best choices for addressing new security challenges in fog computing end devices, where energy and computational resources are limited. In this paper, we present a lightweight secure streaming protocol for the fog computing “Fog Node-End Device” layer. This protocol is lightweight, connectionless, supports broadcast and multicast operations, and is able to provide data source authentication, data integrity, and confidentiality. The protocol is based on simple and energy efficient cryptographic methods, such as Hash Message Authentication Codes (HMAC) and symmetrical ciphers, and uses modified User Datagram Protocol (UDP) packets to embed authentication data into streaming data. Data redundancy could be added to improve reliability in lossy networks. The experimental results summarized in this paper confirm that the proposed method efficiently uses energy and computational resources and at the same time provides security properties on par with the Datagram TLS (DTLS) standard. PMID:29757988
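A minimal sketch of the data-source authentication idea, assuming a shared symmetric key: an HMAC tag is appended to each datagram body and checked by the receiver. The packet layout (4-byte sequence number, payload, 32-byte tag), the key, and the function names `seal` and `open_sealed` are hypothetical; the abstract does not give the protocol's actual packet format, and encryption is left out here.

```python
import hashlib
import hmac
import struct

TAG_LEN = 32  # SHA-256 digest size in bytes


def seal(key: bytes, seq: int, payload: bytes) -> bytes:
    """Build a datagram body: 4-byte sequence number, payload, then HMAC tag."""
    header = struct.pack("!I", seq)
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


def open_sealed(key: bytes, datagram: bytes):
    """Return (seq, payload) if the tag verifies, otherwise None."""
    if len(datagram) < 4 + TAG_LEN:
        return None
    body, tag = datagram[:-TAG_LEN], datagram[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # compare_digest runs in constant time, so it does not leak where the tags differ.
    if not hmac.compare_digest(tag, expected):
        return None
    (seq,) = struct.unpack("!I", body[:4])
    return seq, body[4:]


key = b"hypothetical-shared-key"
packet = seal(key, 7, b"video-frame-bytes")
```

Because each datagram carries its own tag, a receiver can verify packets independently, which suits connectionless and multicast delivery where no per-client handshake state exists.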
A Lightweight Protocol for Secure Video Streaming.

PubMed

Venčkauskas, Algimantas; Morkevicius, Nerijus; Bagdonas, Kazimieras; Damaševičius, Robertas; Maskeliūnas, Rytis

2018-05-14

The Internet of Things (IoT) introduces many new challenges which cannot be solved using traditional cloud and host computing models. A new architecture known as fog computing is emerging to address these technological and security gaps. Traditional security paradigms focused on providing perimeter-based protections and client/server point to point protocols (e.g., Transport Layer Security (TLS)) are no longer the best choices for addressing new security challenges in fog computing end devices, where energy and computational resources are limited. In this paper, we present a lightweight secure streaming protocol for the fog computing "Fog Node-End Device" layer. This protocol is lightweight, connectionless, supports broadcast and multicast operations, and is able to provide data source authentication, data integrity, and confidentiality. The protocol is based on simple and energy efficient cryptographic methods, such as Hash Message Authentication Codes (HMAC) and symmetrical ciphers, and uses modified User Datagram Protocol (UDP) packets to embed authentication data into streaming data. Data redundancy could be added to improve reliability in lossy networks. The experimental results summarized in this paper confirm that the proposed method efficiently uses energy and computational resources and at the same time provides security properties on par with the Datagram TLS (DTLS) standard.
A time-efficient algorithm for implementing the Catmull-Clark subdivision method

NASA Astrophysics Data System (ADS)

Ioannou, G.; Savva, A.; Stylianou, V.

2015-10-01

Splines are the most popular methods in Figure Modeling and CAGD (Computer Aided Geometric Design) for generating smooth surfaces from a number of control points. The control points define the shape of a figure, and splines calculate the required number of points which, when displayed on a computer screen, produce a smooth surface. However, spline methods are based on a rectangular topological structure of points, i.e., a two-dimensional table of vertices, and thus cannot generate complex figures, such as human and animal bodies, whose complex structure does not allow them to be defined by a regular rectangular grid. On the other hand, surface subdivision methods, which are derived from splines, generate surfaces which are defined by an arbitrary topology of control points. This is the reason that during the last fifteen years subdivision methods have taken the lead over regular spline methods in all areas of modeling in both industry and research. The cost of executing computer software developed to read control points and calculate the surface is run-time, because the surface structure required for handling arbitrary topological grids is very complicated. Many software programs have been developed to implement subdivision surfaces; however, not many algorithms are documented in the literature to support developers in writing efficient code. This paper aims to assist programmers by presenting a time-efficient algorithm for implementing subdivision splines. The Catmull-Clark method, which is the most popular of the subdivision methods, has been employed to illustrate the algorithm.
Efficient Unsteady Flow Visualization with High-Order Access Dependencies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Jiang; Guo, Hanqi; Yuan, Xiaoru

We present a novel high-order access dependencies based model for efficient pathline computation in unsteady flow visualization. By taking longer access sequences into account to model more sophisticated data access patterns in particle tracing, our method greatly improves the accuracy and reliability in data access prediction. In our work, high-order access dependencies are calculated by tracing uniformly-seeded pathlines in both forward and backward directions in a preprocessing stage. The effectiveness of our proposed approach is demonstrated through a parallel particle tracing framework with high-order data prefetching. Results show that our method achieves higher data locality and hence improves the efficiency of pathline computation.
Robust large-scale parallel nonlinear solvers for simulations.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bader, Brett William; Pawlowski, Roger Patrick; Kolda, Tamara Gibson

2005-11-01

This report documents research to develop robust and efficient solution techniques for solving large-scale systems of nonlinear equations. The most widely used method for solving systems of nonlinear equations is Newton's method. While much research has been devoted to augmenting Newton-based solvers (usually with globalization techniques), little has been devoted to exploring the application of different models. Our research has been directed at evaluating techniques using different models than Newton's method: a lower order model, Broyden's method, and a higher order model, the tensor method. We have developed large-scale versions of each of these models and have demonstrated their use in important applications at Sandia. Broyden's method replaces the Jacobian with an approximation, allowing codes that cannot evaluate a Jacobian or have an inaccurate Jacobian to converge to a solution. Limited-memory methods, which have been successful in optimization, allow us to extend this approach to large-scale problems. We compare the robustness and efficiency of Newton's method, modified Newton's method, Jacobian-free Newton-Krylov method, and our limited-memory Broyden method. Comparisons are carried out for large-scale applications of fluid flow simulations and electronic circuit simulations. Results show that, in cases where the Jacobian was inaccurate or could not be computed, Broyden's method converged in some cases where Newton's method failed to converge. We identify conditions where Broyden's method can be more efficient than Newton's method. We also present modifications to a large-scale tensor method, originally proposed by Bouaricha, for greater efficiency, better robustness, and wider applicability. Tensor methods are an alternative to Newton-based methods and are based on computing a step based on a local quadratic model rather than a linear model. 
The advantage of Bouaricha's method is that it can use any existing linear solver, which makes it simple to write and easily portable. However, the method usually takes twice as long to solve as Newton-GMRES on general problems because it solves two linear systems at each iteration. In this paper, we discuss modifications to Bouaricha's method for a practical implementation, including a special globalization technique and other modifications for greater efficiency. We present numerical results showing computational advantages over Newton-GMRES on some realistic problems. We further discuss a new approach for dealing with singular (or ill-conditioned) matrices. In particular, we modify an algorithm for identifying a turning point so that an increasingly ill-conditioned Jacobian does not prevent convergence.
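To make the idea of replacing the Jacobian with an approximation concrete, here is a small dense sketch of Broyden's method that maintains an estimate `H` of the inverse Jacobian through rank-one updates. This is the textbook full-memory form, not the limited-memory large-scale version developed in the report; the identity starting guess, the stopping rule, and the test system `demo_system` are assumptions made for illustration.

```python
def broyden(F, x0, tol=1e-10, max_iter=100):
    """Solve F(x) = 0 using Broyden's rank-one update of an inverse-Jacobian estimate H."""
    n = len(x0)
    x = list(x0)
    fx = F(x)
    # Assumption: start from H = I because no Jacobian is available.
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_iter):
        if max(abs(v) for v in fx) < tol:
            break
        # Quasi-Newton step s = -H F(x).
        s = [-sum(H[i][j] * fx[j] for j in range(n)) for i in range(n)]
        x = [x[i] + s[i] for i in range(n)]
        f_new = F(x)
        y = [f_new[i] - fx[i] for i in range(n)]
        fx = f_new
        Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
        sTH = [sum(s[i] * H[i][j] for i in range(n)) for j in range(n)]
        denom = sum(s[i] * Hy[i] for i in range(n))
        if denom == 0.0:
            break  # update undefined; return the current iterate
        # Sherman-Morrison form of the update: H += (s - H y)(s^T H) / (s^T H y).
        for i in range(n):
            for j in range(n):
                H[i][j] += (s[i] - Hy[i]) * sTH[j] / denom
    return x


def demo_system(v):
    # Hypothetical mildly nonlinear test system, not from the report.
    x, y = v
    return [x + 0.1 * y * y - 1.0, y + 0.1 * x * x - 1.0]


solution = broyden(demo_system, [0.0, 0.0])
```

Only evaluations of `F` are needed, which is why this family of methods remains usable when a code cannot supply an accurate Jacobian.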
A comparison of several methods of solving nonlinear regression groundwater flow problems

USGS Publications Warehouse

Cooley, Richard L.

1985-01-01

Computational efficiency and computer memory requirements for four methods of minimizing functions were compared for four test nonlinear-regression steady state groundwater flow problems. The fastest methods were the Marquardt and quasi-linearization methods, which required almost identical computer times and numbers of iterations; the next fastest was the quasi-Newton method, and last was the Fletcher-Reeves method, which did not converge in 100 iterations for two of the problems. The fastest method per iteration was the Fletcher-Reeves method, and this was followed closely by the quasi-Newton method. The Marquardt and quasi-linearization methods were slower. For all four methods the speed per iteration was directly related to the number of parameters in the model. However, this effect was much more pronounced for the Marquardt and quasi-linearization methods than for the other two. Hence the quasi-Newton (and perhaps Fletcher-Reeves) method might be more efficient than either the Marquardt or quasi-linearization methods if the number of parameters in a particular model were large, although this remains to be proven. The Marquardt method required somewhat less central memory than the quasi-linearization method for three of the four problems. For all four problems the quasi-Newton method required roughly two thirds to three quarters of the memory required by the Marquardt method, and the Fletcher-Reeves method required slightly less memory than the quasi-Newton method. Memory requirements were not excessive for any of the four methods.
Efficient parallel implicit methods for rotary-wing aerodynamics calculations

NASA Astrophysics Data System (ADS)

Wissink, Andrew M.

Euler/Navier-Stokes Computational Fluid Dynamics (CFD) methods are commonly used for prediction of the aerodynamics and aeroacoustics of modern rotary-wing aircraft. However, their widespread application to large complex problems is limited by lack of adequate computing power. Parallel processing offers the potential for dramatic increases in computing power, but most conventional implicit solution methods are inefficient in parallel and new techniques must be adopted to realize its potential. This work proposes alternative implicit schemes for Euler/Navier-Stokes rotary-wing calculations which are robust and efficient in parallel. The first part of this work proposes an efficient parallelizable modification of the Lower Upper-Symmetric Gauss Seidel (LU-SGS) implicit operator used in the well-known Transonic Unsteady Rotor Navier Stokes (TURNS) code. The new hybrid LU-SGS scheme couples a point-relaxation approach of the Data Parallel-Lower Upper Relaxation (DP-LUR) algorithm for inter-processor communication with the Symmetric Gauss Seidel algorithm of LU-SGS for on-processor computations. With the modified operator, TURNS is implemented in parallel using Message Passing Interface (MPI) for communication. Numerical performance and parallel efficiency are evaluated on the IBM SP2 and Thinking Machines CM-5 multi-processors for a variety of steady-state and unsteady test cases. The hybrid LU-SGS scheme maintains the numerical performance of the original LU-SGS algorithm in all cases and shows a good degree of parallel efficiency. It experiences a higher degree of robustness than DP-LUR for third-order upwind solutions. The second part of this work examines use of Krylov subspace iterative solvers for the nonlinear CFD solutions. The hybrid LU-SGS scheme is used as a parallelizable preconditioner. Two iterative methods are tested, Generalized Minimum Residual (GMRES) and Orthogonal s-Step Generalized Conjugate Residual (OSGCR). 
The Newton method demonstrates good parallel performance on the IBM SP2, with OSGCR giving slightly better performance than GMRES on large numbers of processors. For steady and quasi-steady calculations, the convergence rate is accelerated but the overall solution time remains about the same as the standard hybrid LU-SGS scheme. For unsteady calculations, however, the Newton method maintains a higher degree of time-accuracy which allows the use of larger timesteps and results in CPU savings of 20-35%.
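For readers unfamiliar with the symmetric Gauss-Seidel building block named above, the following sketch shows one forward-then-backward sweep on a small dense linear system. It is a generic textbook relaxation, not the LU-SGS operator of the TURNS code or the hybrid scheme of this work; the matrix, right-hand side, and sweep count are hypothetical.

```python
def sgs_sweep(A, b, x):
    """One symmetric Gauss-Seidel sweep (forward, then backward), updating x in place."""
    n = len(b)
    for order in (range(n), range(n - 1, -1, -1)):
        for i in order:
            # Use the newest available values of all other unknowns.
            off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - off_diag) / A[i][i]
    return x


# Hypothetical diagonally dominant test system; the exact solution is [1, 2, 3].
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x = [0.0, 0.0, 0.0]
for _ in range(30):
    sgs_sweep(A, b, x)
```

Each update depends on values just computed in the same sweep. That sequential dependence is what makes Gauss-Seidel-type sweeps awkward across processor boundaries, and it is the motivation for pairing them with a point-relaxation scheme for inter-processor data.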
Computational Issues in Damping Identification for Large Scale Problems

NASA Technical Reports Server (NTRS)

Pilkey, Deborah L.; Roe, Kevin P.; Inman, Daniel J.

1997-01-01

Two damping identification methods are tested for efficiency in large-scale applications. One is an iterative routine, and the other a least squares method. Numerical simulations have been performed on multiple degree-of-freedom models to test the effectiveness of the algorithm and the usefulness of parallel computation for the problems. High Performance Fortran is used to parallelize the algorithm. Tests were performed using the IBM-SP2 at NASA Ames Research Center. The least squares method tested incurs high communication costs, which reduces the benefit of high performance computing. This method's memory requirement grows at a very rapid rate meaning that larger problems can quickly exceed available computer memory. The iterative method's memory requirement grows at a much slower pace and is able to handle problems with 500+ degrees of freedom on a single processor. This method benefits from parallelization, and significant speedup can be seen for problems of 100+ degrees-of-freedom.
Self-consistent clustering analysis: an efficient multiscale scheme for inelastic heterogeneous materials

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, Z.; Bessa, M. A.; Liu, W.K.

A predictive computational theory is shown for modeling complex, hierarchical materials ranging from metal alloys to polymer nanocomposites. The theory can capture complex mechanisms such as plasticity and failure that span across multiple length scales. This general multiscale material modeling theory relies on sound principles of mathematics and mechanics, and a cutting-edge reduced order modeling method named self-consistent clustering analysis (SCA) [Zeliang Liu, M.A. Bessa, Wing Kam Liu, “Self-consistent clustering analysis: An efficient multi-scale scheme for inelastic heterogeneous materials,” Comput. Methods Appl. Mech. Engrg. 306 (2016) 319–341]. SCA reduces by several orders of magnitude the computational cost of micromechanical and concurrent multiscale simulations, while retaining the microstructure information. This remarkable increase in efficiency is achieved with a data-driven clustering method. Computationally expensive operations are performed in the so-called offline stage, where degrees of freedom (DOFs) are agglomerated into clusters. The interaction tensor of these clusters is computed. In the online or predictive stage, the Lippmann-Schwinger integral equation is solved cluster-wise using a self-consistent scheme to ensure solution accuracy and avoid path dependence. To construct a concurrent multiscale model, this scheme is applied at each material point in a macroscale structure, replacing a conventional constitutive model with the average response computed from the microscale model using just the SCA online stage. A regularized damage theory is incorporated in the microscale that avoids the mesh and RVE size dependence that commonly plagues microscale damage calculations. The SCA method is illustrated with two cases: a carbon fiber reinforced polymer (CFRP) structure with the concurrent multiscale model and an application to fatigue prediction for additively manufactured metals. 
For the CFRP problem, a speed up estimated to be about 43,000 is achieved by using the SCA method, as opposed to FE², enabling the solution of an otherwise computationally intractable problem. The second example uses a crystal plasticity constitutive law and computes the fatigue potency of extrinsic microscale features such as voids. This shows that local stress and strain are captured sufficiently well by SCA. This model has been incorporated in a process-structure-properties prediction framework for process design in additive manufacturing.
Visualization assisted by parallel processing

NASA Astrophysics Data System (ADS)

Lange, B.; Rey, H.; Vasques, X.; Puech, W.; Rodriguez, N.

2011-01-01

This paper discusses the experimental results of our visualization model for data extracted from sensors. The objective of this paper is to find a computationally efficient method to produce a real-time rendering visualization for a large amount of data. We develop a visualization method to monitor the temperature variance of a data center. Sensors are placed on three layers and do not cover the whole room. We use a particle paradigm to interpolate the sensor data. Particles model the "space" of the room. In this work we use a partition of the particle set, using two mathematical methods: Delaunay triangulation and Voronoi cells. Avis and Bhattacharya present these two algorithms in. Particles provide information on the room temperature at different coordinates over time. To locate and update particle data we define a computational cost function. To solve this function efficiently, we use a client-server paradigm. The server computes the data and the client displays it on different kinds of hardware. This paper is organized as follows. The first part presents related algorithms used to visualize large flows of data. The second part presents the different platforms and methods used, which were evaluated in order to determine the best solution for the proposed task. The benchmark uses the computational cost of our algorithm, which is based on locating particles relative to sensors and on updating particle values. The benchmark was done on a personal computer using CPU, multi-core programming, GPU programming and hybrid GPU/CPU. GPU programming is a growing method in the research field; it allows real-time rendering instead of precomputed rendering. To improve our results, we ran our algorithm on a High Performance Computing (HPC) system; this benchmark was used to improve the multi-core method. HPC is commonly used in data visualization (astronomy, physics, etc.) for improving the rendering and achieving real-time performance.
An O(N) and parallel approach to integral problems by a kernel-independent fast multipole method: Application to polarization and magnetization of interacting particles

NASA Astrophysics Data System (ADS)

Jiang, Xikai; Li, Jiyuan; Zhao, Xujun; Qin, Jian; Karpeev, Dmitry; Hernandez-Ortiz, Juan; de Pablo, Juan J.; Heinonen, Olle

2016-08-01

Large classes of materials systems in physics and engineering are governed by magnetic and electrostatic interactions. Continuum or mesoscale descriptions of such systems can be cast in terms of integral equations, whose direct computational evaluation requires O(N^2) operations, where N is the number of unknowns. Such a scaling, which arises from the many-body nature of the relevant Green's function, has precluded wide-spread adoption of integral methods for solution of large-scale scientific and engineering problems. In this work, a parallel computational approach is presented that relies on using scalable open source libraries and utilizes a kernel-independent Fast Multipole Method (FMM) to evaluate the integrals in O(N) operations, with O(N) memory cost, thereby substantially improving the scalability and efficiency of computational integral methods. We demonstrate the accuracy, efficiency, and scalability of our approach in the context of two examples. In the first, we solve a boundary value problem for a ferroelectric/ferromagnetic volume in free space. In the second, we solve an electrostatic problem involving polarizable dielectric bodies in an unbounded dielectric medium. The results from these test cases show that our proposed parallel approach, which is built on a kernel-independent FMM, can enable highly efficient and accurate simulations and allow for considerable flexibility in a broad range of applications.
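The quadratic cost of direct evaluation is easy to see in code. The sketch below computes a potential at every particle by summing over all other particles, which is the baseline an FMM is designed to replace; it is not an FMM. The 1/r kernel (a Coulomb-like Green's function with physical constants dropped), the particle positions, and the charges are assumptions chosen for illustration.

```python
import math


def direct_potentials(positions, charges):
    """Potential at each particle due to all others by direct summation: O(N^2) work."""
    n = len(positions)
    phi = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i != j:
                # Assumed kernel: q_j / |r_i - r_j|.
                phi[i] += charges[j] / math.dist(positions[i], positions[j])
    return phi


# Three hypothetical point charges on the x-axis.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
charges = [1.0, 2.0, -1.0]
phi = direct_potentials(positions, charges)
```

The nested loop performs N(N-1) kernel evaluations, so doubling the number of particles roughly quadruples the work. Removing that growth is what an O(N) method such as the FMM achieves by approximating the contribution of distant groups of particles.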
Selection of bi-level image compression method for reduction of communication energy in wireless visual sensor networks

NASA Astrophysics Data System (ADS)

Khursheed, Khursheed; Imran, Muhammad; Ahmad, Naeem; O'Nils, Mattias

2012-06-01

Wireless Visual Sensor Network (WVSN) is an emerging field which combines an image sensor, an on-board computation unit, a communication component and an energy source. Compared to the traditional wireless sensor network, which operates on one-dimensional data such as temperature and pressure values, WVSN operates on two-dimensional data (images), which requires higher processing power and communication bandwidth. Normally, WVSNs are deployed in areas where installation of wired solutions is not feasible. The energy budget in these networks is limited to the batteries, because of the wireless nature of the application. Due to the limited availability of energy, the processing at Visual Sensor Nodes (VSN) and communication from VSN to server should consume as little energy as possible. Transmission of raw images wirelessly consumes a lot of energy and requires higher communication bandwidth. Data compression methods reduce data efficiently and hence will be effective in reducing communication cost in WVSN. In this paper, we have compared the compression efficiency and complexity of six well-known bi-level image compression methods. The focus is to determine the compression algorithms which can efficiently compress bi-level images and whose computational complexity is suitable for the computational platform used in WVSNs. These results can be used as a road map for selection of compression methods for different sets of constraints in WVSN.
Analytic energy gradients for orbital-optimized MP3 and MP2.5 with the density-fitting approximation: An efficient implementation.

PubMed

Bozkaya, Uğur

2018-03-15

Efficient implementations of analytic gradients for the orbital-optimized MP3 and MP2.5 and their standard versions with the density-fitting approximation, which are denoted as DF-MP3, DF-MP2.5, DF-OMP3, and DF-OMP2.5, are presented. The DF-MP3, DF-MP2.5, DF-OMP3, and DF-OMP2.5 methods are applied to a set of alkanes and noncovalent interaction complexes to compare the computational cost with the conventional MP3, MP2.5, OMP3, and OMP2.5. Our results demonstrate that the density-fitted perturbation theory (DF-MP) methods considered substantially reduce the computational cost compared to conventional MP methods. The efficiency of our DF-MP methods arises from the reduced input/output (I/O) time and the acceleration of gradient related terms, such as computations of particle density and generalized Fock matrices (PDMs and GFM), solution of the Z-vector equation, back-transformations of PDMs and GFM, and evaluation of analytic gradients in the atomic orbital basis. Further, application results show that errors introduced by the DF approach are negligible. Mean absolute errors for bond lengths of a molecular set, with the cc-pCVQZ basis set, are 0.0001-0.0002 Å. © 2017 Wiley Periodicals, Inc.

Dynamic response analysis of structure under time-variant interval process model

NASA Astrophysics Data System (ADS)

Xia, Baizhan; Qin, Yuan; Yu, Dejie; Jiang, Chao

2016-10-01

Due to the aggressiveness of environmental factors, the variation of the dynamic load, the degeneration of material properties, and the wear of machine surfaces, parameters related to the structure are distinctly time-variant. The typical model for time-variant uncertainties is the random process model, which is constructed on the basis of a large number of samples. In this work, we propose a time-variant interval process model that can be effectively used to deal with time-variant uncertainties with limited information. Two methods are then presented for the dynamic response analysis of the structure under the time-variant interval process model. The first is the direct Monte Carlo method (DMCM), whose computational burden is relatively high. The second is the Monte Carlo method based on the Chebyshev polynomial expansion (MCM-CPE), whose computational efficiency is high. In MCM-CPE, the dynamic response of the structure is approximated by Chebyshev polynomials, which can be efficiently calculated, and the variational range of the dynamic response is then estimated from the samples yielded by the Monte Carlo method. To address the dependency phenomenon of interval operations, affine arithmetic is integrated into the Chebyshev polynomial expansion. The computational effectiveness and efficiency of MCM-CPE are verified by two numerical examples: a spring-mass-damper system and a shell structure.
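The surrogate idea in the abstract above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' MCM-CPE: it treats a single scalar interval parameter rather than a time-variant interval process, omits the affine-arithmetic step, and uses a hypothetical closed-form `response` function (the displacement cos(sqrt(k) t) of an undamped unit-mass oscillator at t = 1) in place of a structural solver. The names `chebyshev_coeffs`, `chebyshev_eval`, and `estimate_range` are illustrative.

```python
import math
import random


def response(stiffness):
    """Hypothetical 'expensive' model: displacement at t = 1 of an undamped
    unit-mass oscillator released from x = 1 (an assumption for illustration)."""
    return math.cos(math.sqrt(stiffness) * 1.0)


def chebyshev_coeffs(func, lo, hi, degree):
    """Fit a Chebyshev expansion to func on [lo, hi] from degree + 1 model runs."""
    n = degree + 1
    nodes = [math.cos(math.pi * (k + 0.5) / n) for k in range(n)]
    values = [func(0.5 * (hi - lo) * t + 0.5 * (hi + lo)) for t in nodes]
    coeffs = [
        2.0 / n * sum(v * math.cos(j * math.pi * (k + 0.5) / n)
                      for k, v in enumerate(values))
        for j in range(n)
    ]
    coeffs[0] *= 0.5  # the constant term enters the series with weight 1/2
    return coeffs


def chebyshev_eval(coeffs, lo, hi, x):
    """Evaluate the expansion at x with the Clenshaw recurrence."""
    t = (2.0 * x - lo - hi) / (hi - lo)
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = 2.0 * t * b1 - b2 + c, b1
    return t * b1 - b2 + coeffs[0]


def estimate_range(coeffs, lo, hi, samples, seed=0):
    """Monte Carlo on the cheap surrogate: sample the interval, track min and max."""
    rng = random.Random(seed)
    outputs = [chebyshev_eval(coeffs, lo, hi, rng.uniform(lo, hi))
               for _ in range(samples)]
    return min(outputs), max(outputs)


LO, HI = 3.5, 4.5  # assumed stiffness interval
coeffs = chebyshev_coeffs(response, LO, HI, degree=6)
lower, upper = estimate_range(coeffs, LO, HI, samples=2000)
```

The expensive model is called only seven times to build the expansion; the 2000 Monte Carlo samples touch only the polynomial, which is where the efficiency gain over direct sampling would come from.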
Parallel implementation of the particle simulation method with dynamic load balancing: Toward realistic geodynamical simulation

NASA Astrophysics Data System (ADS)

Furuichi, M.; Nishiura, D.

2015-12-01

Fully Lagrangian methods such as Smoothed Particle Hydrodynamics (SPH) and the Discrete Element Method (DEM) have been widely used to solve continuum and particle motions in the computational geodynamics field. These mesh-free methods are suitable for problems with complex geometries and boundaries. In addition, their Lagrangian nature allows non-diffusive advection, which is useful for tracking history-dependent properties (e.g. rheology) of the material. These potential advantages over mesh-based methods offer effective numerical applications to geophysical flow and tectonic processes, for example, tsunamis with a free surface and floating bodies, magma intrusion with rock fracture, and shear zone pattern generation in granular deformation. To investigate such geodynamical problems with particle-based methods, millions to billions of particles are required for a realistic simulation. Parallel computing is therefore important for handling such a huge computational cost. An efficient parallel implementation of SPH and DEM methods is, however, known to be difficult, especially on distributed-memory architectures. Lagrangian methods inherently show a workload imbalance problem when parallelized with a fixed domain in space, because particles move around and workloads change during the simulation. Dynamic load balancing is therefore a key technique for performing large-scale SPH and DEM simulations. In this work, we present a parallel implementation technique for the SPH and DEM methods that utilizes dynamic load balancing algorithms, aimed at high-resolution simulation over large domains on massively parallel supercomputer systems. Our method treats the imbalances in the execution time of each MPI process as the nonlinear term of the parallel domain decomposition and minimizes them with a Newton-like iteration method. To perform flexible domain decomposition in space, the slice-grid algorithm is used. Numerical tests show that our approach is suitable for handling particles with different calculation costs (e.g. boundary particles) as well as heterogeneous computer architectures. We analyze the parallel efficiency and scalability on supercomputer systems (K-computer, Earth simulator 3, etc.).
A novel scene management technology for complex virtual battlefield environment

NASA Astrophysics Data System (ADS)

Sheng, Changchong; Jiang, Libing; Tang, Bo; Tang, Xiaoan

2018-04-01

Efficient scene management of virtual environments is an important research topic in real-time computer visualization and has a decisive influence on rendering efficiency. However, traditional scene management methods are not suitable for complex virtual battlefield environments. This paper combines the advantages of traditional scene graph technology and spatial data structure methods. Following the idea of separating management from rendering, a loose object-oriented scene graph structure is established to manage the entity model data in the scene, and a performance-based quad-tree structure is created for traversal and rendering. In addition, a collaborative update relationship between these two tree structures is designed to achieve efficient scene management. Compared with the previous scene management method, this method is more efficient and meets the needs of real-time visualization.
The computational complexity of elliptic curve integer sub-decomposition (ISD) method

NASA Astrophysics Data System (ADS)

Ajeena, Ruma Kareem K.; Kamarulhaili, Hailiza

2014-07-01

The idea of the GLV method of Gallant, Lambert and Vanstone (Crypto 2001) is considered a foundation stone for building a new procedure to compute the elliptic curve scalar multiplication. This procedure, integer sub-decomposition (ISD), computes any multiple kP of an elliptic curve point P that has a large prime order n, using two low-degree endomorphisms ψ1 and ψ2 of the elliptic curve E over the prime field Fp. The sub-decomposition of values k1 and k2 that are not bounded by ±C√n gives new integers k11, k12, k21 and k22 that are bounded by ±C√n and can be computed by solving the closest vector problem in a lattice. The ISD method increases the percentage of successful computations of the scalar multiplication, which improves the computational efficiency in comparison with the general method for computing scalar multiplication on elliptic curves over prime fields. This paper presents the mechanism of the ISD method and sheds light mainly on the computational complexity of the ISD approach, which is determined by computing the cost of operations. These operations include elliptic curve operations and finite field operations.
Extension of a streamwise upwind algorithm to a moving grid system

NASA Technical Reports Server (NTRS)

Obayashi, Shigeru; Goorjian, Peter M.; Guruswamy, Guru P.

1990-01-01

A new streamwise upwind algorithm was derived to compute unsteady flow fields with the use of a moving-grid system. The temporally nonconservative LU-ADI (lower-upper-factored, alternating-direction-implicit) method was applied for time-marching computations. A comparison of the temporally nonconservative method with a time-conservative implicit upwind method indicates that the solutions are insensitive to the conservative properties of the implicit solvers when practical time steps are used. Using this new method, computations were made for an oscillating wing at a transonic Mach number. The computed results confirm that the present upwind scheme captures the shock motion better than the central-difference scheme based on the Beam-Warming algorithm. The new upwind option of the code allows larger time steps and thus is more efficient, even though it requires slightly more computational time per time step than the central-difference option.
Supersonic nonlinear potential analysis

NASA Technical Reports Server (NTRS)

Siclari, M. J.

1984-01-01

The NCOREL computer code was established to compute supersonic flow fields of wings and bodies. The method encompasses an implicit finite difference transonic relaxation method to solve the full potential equation in a spherical coordinate system. Two basic topics that broaden the applicability and usefulness of the present method, which is encompassed within the computer code NCOREL for the treatment of supersonic flow problems, were studied. The first topic is that of computing efficiency. Accelerated schemes are in use for transonic flow problems. One such scheme is the approximate factorization (AF) method, and an AF scheme for the supersonic flow problem is developed. The second topic is the computation of wake flows. The proper modeling of wake flows is important for multicomponent configurations such as wing-body and multiple lifting surfaces, where the wake of one lifting surface has a pronounced effect on a downstream body or other lifting surfaces.
Numerical methods for engine-airframe integration

DOE Office of Scientific and Technical Information (OSTI.GOV)

Murthy, S.N.B.; Paynter, G.C.

1986-01-01

Various papers on numerical methods for engine-airframe integration are presented. The individual topics considered include: scientific computing environment for the 1980s, overview of prediction of complex turbulent flows, numerical solutions of the compressible Navier-Stokes equations, elements of computational engine/airframe integrations, computational requirements for efficient engine installation, application of CAE and CFD techniques to complete tactical missile design, CFD applications to engine/airframe integration, and application of a second-generation low-order panel method to powerplant installation studies. Also addressed are: three-dimensional flow analysis of turboprop inlet and nacelle configurations, application of computational methods to the design of large turbofan engine nacelles, comparison of full potential and Euler solution algorithms for aeropropulsive flow field computations, subsonic/transonic, supersonic nozzle flows and nozzle integration, subsonic/transonic prediction capabilities for nozzle/afterbody configurations, three-dimensional viscous design methodology of supersonic inlet systems for advanced technology aircraft, and a user's technology assessment.
Simple, efficient allocation of modelling runs on heterogeneous clusters with MPI

USGS Publications Warehouse

Donato, David I.

2017-01-01

In scientific modelling and computation, the choice of an appropriate method for allocating tasks for parallel processing depends on the computational setting and on the nature of the computation. The allocation of independent but similar computational tasks, such as modelling runs or Monte Carlo trials, among the nodes of a heterogeneous computational cluster is a special case that has not been specifically evaluated previously. A simulation study shows that a method of on-demand (that is, worker-initiated) pulling from a bag of tasks in this case leads to reliably short makespans for computational jobs despite heterogeneity both within and between cluster nodes. A simple reference implementation in the C programming language with the Message Passing Interface (MPI) is provided.
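The on-demand allocation described in the abstract above can be illustrated with a small discrete-event simulation in Python. This is a hedged sketch, not the C/MPI reference implementation the abstract mentions: the worker speeds, task costs, and function names (`makespan_static`, `makespan_on_demand`) are assumptions chosen only to show why worker-initiated pulling tolerates heterogeneity.

```python
import heapq


def makespan_static(task_costs, speeds):
    """Pre-assign tasks round-robin; every worker runs its fixed share."""
    finish = [0.0] * len(speeds)
    for i, cost in enumerate(task_costs):
        worker = i % len(speeds)
        finish[worker] += cost / speeds[worker]
    return max(finish)


def makespan_on_demand(task_costs, speeds):
    """Each worker pulls the next task from the bag as soon as it becomes idle."""
    # Min-heap of (time at which the worker becomes idle, worker index).
    idle = [(0.0, worker) for worker in range(len(speeds))]
    heapq.heapify(idle)
    end = 0.0
    for cost in task_costs:
        time, worker = heapq.heappop(idle)  # earliest-idle worker takes the task
        time += cost / speeds[worker]
        end = max(end, time)
        heapq.heappush(idle, (time, worker))
    return end


costs = [1.0] * 40               # 40 similar, independent modelling runs
speeds = [1.0, 1.0, 4.0, 0.5]    # assumed relative node speeds (heterogeneous)
static = makespan_static(costs, speeds)
pulled = makespan_on_demand(costs, speeds)
```

With static round-robin assignment the slowest node dictates the makespan, whereas with pulling the fast node simply ends up taking more tasks. In an MPI setting the heap would be replaced by workers sending requests to a coordinator process.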
Fast methods to numerically integrate the Reynolds equation for gas fluid films

NASA Technical Reports Server (NTRS)

Dimofte, Florin

1992-01-01

The alternating direction implicit (ADI) method is adopted, modified, and applied to the Reynolds equation for thin, gas fluid films. An efficient code is developed to predict both the steady-state and dynamic performance of an aerodynamic journal bearing. An alternative approach is shown for hybrid journal gas bearings by using Liebmann's iterative solution (LIS) for elliptic partial differential equations. The results are compared with known design criteria from experimental data. The developed methods show good accuracy and very short computer running time in comparison with methods based on matrix inversion. The computer codes need a small amount of memory and can be run on either personal computers or mainframe systems.
Parallel algorithms for computation of the manipulator inertia matrix

NASA Technical Reports Server (NTRS)

Amin-Javaheri, Masoud; Orin, David E.

1989-01-01

The development of an O(log₂N) parallel algorithm for the manipulator inertia matrix is presented. It is based on the most efficient serial algorithm, which uses the composite rigid body method. Recursive doubling is used to reformulate the linear recurrence equations that are required to compute the diagonal elements of the matrix. It results in O(log₂N) levels of computation. Computation of the off-diagonal elements involves N linear recurrences of varying size, and a new method, which avoids redundant computation of position and orientation transforms for the manipulator, is developed. The O(log₂N) algorithm is presented in both equation and graphic forms, which clearly show the parallelism inherent in the algorithm.
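Recursive doubling, as named in the abstract above, can be sketched on a scalar first-order linear recurrence x_i = a_i x_{i-1} + b_i. The Python below is an illustration under that simplifying assumption; the paper applies the idea to spatial inertia recurrences, which are not reproduced here, and the function names are hypothetical. Each step is treated as an affine map, and maps are composed pairwise so that only about log₂N dependent levels remain.

```python
def compose(later, earlier):
    """Affine map 'later' applied after 'earlier': x -> a2 * (a1 * x + b1) + b2."""
    a2, b2 = later
    a1, b1 = earlier
    return (a2 * a1, a2 * b1 + b2)


def recurrence_by_doubling(a, b, x0):
    """Solve x_i = a[i] * x_{i-1} + b[i] and return (values, number of levels)."""
    maps = list(zip(a, b))
    n = len(maps)
    levels = 0
    stride = 1
    while stride < n:
        # Every entry of a level depends only on the previous level, so on
        # parallel hardware all n compositions could run at the same time.
        maps = [compose(maps[i], maps[i - stride]) if i >= stride else maps[i]
                for i in range(n)]
        stride *= 2
        levels += 1
    # maps[i] now holds the composition of steps 0..i; apply it to x0.
    return [A * x0 + B for A, B in maps], levels


def recurrence_sequential(a, b, x0):
    """Reference: the usual N-step serial evaluation."""
    out, x = [], x0
    for ai, bi in zip(a, b):
        x = ai * x + bi
        out.append(x)
    return out


a = [2, 3, 1, -1, 2]
b = [1, 0, 4, 2, -3]
doubled, levels = recurrence_by_doubling(a, b, 1)
serial = recurrence_sequential(a, b, 1)
```

The doubling version does more total arithmetic than the serial loop; the benefit is that the chain of dependent steps shrinks from N to about log₂N, which only pays off when the work within a level is actually executed in parallel.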
Computer Program for Thin Wire Antenna over a Perfectly Conducting Ground Plane. [using Galerkin's method and sinusoidal bases]

NASA Technical Reports Server (NTRS)

Richmond, J. H.

1974-01-01

A computer program is presented for a thin-wire antenna over a perfect ground plane. The analysis is performed in the frequency domain, and the exterior medium is free space. The antenna may have finite conductivity and lumped loads. The output data includes the current distribution, impedance, radiation efficiency, and gain. The program uses sinusoidal bases and Galerkin's method.
The truncated conjugate gradient (TCG), a non-iterative/fixed-cost strategy for computing polarization in molecular dynamics: Fast evaluation of analytical forces

NASA Astrophysics Data System (ADS)

Aviat, Félix; Lagardère, Louis; Piquemal, Jean-Philip

2017-10-01

In a recent paper [F. Aviat et al., J. Chem. Theory Comput. 13, 180-190 (2017)], we proposed the Truncated Conjugate Gradient (TCG) approach to compute the polarization energy and forces in polarizable molecular simulations. The method consists in truncating the conjugate gradient algorithm at a fixed predetermined order leading to a fixed computational cost and can thus be considered "non-iterative." This gives the possibility to derive analytical forces avoiding the usual energy conservation (i.e., drifts) issues occurring with iterative approaches. A key point concerns the evaluation of the analytical gradients, which is more complex than that with a usual solver. In this paper, after reviewing the present state of the art of polarization solvers, we detail a viable strategy for the efficient implementation of the TCG calculation. The complete cost of the approach is then measured as it is tested using a multi-time step scheme and compared to timings using usual iterative approaches. We show that the TCG methods are more efficient than traditional techniques, making it a method of choice for future long molecular dynamics simulations using polarizable force fields where energy conservation matters. We detail the various steps required for the implementation of the complete method by software developers.
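The truncation idea in the abstract above can be sketched as a conjugate gradient solver that stops after a fixed number of steps instead of iterating to a convergence threshold. The Python below is a minimal illustration, not the authors' TCG implementation: the 3x3 matrix and right-hand side are made-up stand-ins for a polarization matrix and field vector, no preconditioning or refinement is included, and the analytical force evaluation discussed in the paper is not shown.

```python
def matvec(A, v):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def truncated_cg(A, b, order):
    """Run at most `order` conjugate gradient steps from a zero initial guess.

    A is assumed symmetric positive definite. Because the step count is fixed
    in advance, the result is a closed-form function of A and b with a
    predictable cost.
    """
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x for x = 0
    p = list(r)              # first search direction
    rr = dot(r, r)
    for _ in range(order):
        if rr == 0.0:        # already exact; nothing left to correct
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]        # assumed small symmetric positive definite matrix
b = [1.0, 2.0, 3.0]
x1 = truncated_cg(A, b, order=1)
x3 = truncated_cg(A, b, order=3)
```

At order 1 the result is simply a scaled copy of `b`; for this 3x3 system, order 3 already reproduces the exact solution up to rounding. A low fixed order trades some accuracy for a cost that is known before the simulation starts.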
The truncated conjugate gradient (TCG), a non-iterative/fixed-cost strategy for computing polarization in molecular dynamics: Fast evaluation of analytical forces.

PubMed

Aviat, Félix; Lagardère, Louis; Piquemal, Jean-Philip

2017-10-28

In a recent paper [F. Aviat et al., J. Chem. Theory Comput. 13, 180-190 (2017)], we proposed the Truncated Conjugate Gradient (TCG) approach to compute the polarization energy and forces in polarizable molecular simulations. The method consists in truncating the conjugate gradient algorithm at a fixed predetermined order leading to a fixed computational cost and can thus be considered "non-iterative." This gives the possibility to derive analytical forces avoiding the usual energy conservation (i.e., drifts) issues occurring with iterative approaches. A key point concerns the evaluation of the analytical gradients, which is more complex than that with a usual solver. In this paper, after reviewing the present state of the art of polarization solvers, we detail a viable strategy for the efficient implementation of the TCG calculation. The complete cost of the approach is then measured as it is tested using a multi-time step scheme and compared to timings using usual iterative approaches. We show that the TCG methods are more efficient than traditional techniques, making it a method of choice for future long molecular dynamics simulations using polarizable force fields where energy conservation matters. We detail the various steps required for the implementation of the complete method by software developers.
Single-step reinitialization and extending algorithms for level-set based multi-phase flow simulations

NASA Astrophysics Data System (ADS)

Fu, Lin; Hu, Xiangyu Y.; Adams, Nikolaus A.

2017-12-01

We propose efficient single-step formulations for reinitialization and extending algorithms, which are critical components of level-set based interface-tracking methods. The level-set field is reinitialized with a single-step (non iterative) "forward tracing" algorithm. A minimum set of cells is defined that describes the interface, and reinitialization employs only data from these cells. Fluid states are extrapolated or extended across the interface by a single-step "backward tracing" algorithm. Both algorithms, which are motivated by analogy to ray-tracing, avoid multiple block-boundary data exchanges that are inevitable for iterative reinitialization and extending approaches within a parallel-computing environment. The single-step algorithms are combined with a multi-resolution conservative sharp-interface method and validated by a wide range of benchmark test cases. We demonstrate that the proposed reinitialization method achieves second-order accuracy in conserving the volume of each phase. The interface location is invariant to reapplication of the single-step reinitialization. Generally, we observe smaller absolute errors than for standard iterative reinitialization on the same grid. The computational efficiency is higher than for the standard and typical high-order iterative reinitialization methods. We observe a 2- to 6-times efficiency improvement over the standard method for serial execution. The proposed single-step extending algorithm, which is commonly employed for assigning data to ghost cells with ghost-fluid or conservative interface interaction methods, shows about 10-times efficiency improvement over the standard method while maintaining same accuracy. Despite their simplicity, the proposed algorithms offer an efficient and robust alternative to iterative reinitialization and extending methods for level-set based multi-phase simulations.
Efficient segmentation of 3D fluoroscopic datasets from mobile C-arm

NASA Astrophysics Data System (ADS)

Styner, Martin A.; Talib, Haydar; Singh, Digvijay; Nolte, Lutz-Peter

2004-05-01

The emerging mobile fluoroscopic 3D technology linked with a navigation system combines the advantages of CT-based and C-arm-based navigation. The intra-operative, automatic segmentation of 3D fluoroscopy datasets enables the combined visualization of surgical instruments and anatomical structures for enhanced planning, surgical eye-navigation and landmark digitization. We performed a thorough evaluation of several segmentation algorithms using a large set of data from different anatomical regions and man-made phantom objects. The analyzed segmentation methods include automatic thresholding, morphological operations, an adapted region growing method and an implicit 3D geodesic snake method. In regard to computational efficiency, all methods performed within acceptable limits on a standard Desktop PC (30sec-5min). In general, the best results were obtained with datasets from long bones, followed by extremities. The segmentations of spine, pelvis and shoulder datasets were generally of poorer quality. As expected, the threshold-based methods produced the worst results. The combined thresholding and morphological operations methods were considered appropriate for a smaller set of clean images. The region growing method performed generally much better in regard to computational efficiency and segmentation correctness, especially for datasets of joints, and lumbar and cervical spine regions. The less efficient implicit snake method was able to additionally remove wrongly segmented skin tissue regions. This study presents a step towards efficient intra-operative segmentation of 3D fluoroscopy datasets, but there is room for improvement. Next, we plan to study model-based approaches for datasets from the knee and hip joint region, which would be thenceforth applied to all anatomical regions in our continuing development of an ideal segmentation procedure for 3D fluoroscopic images.
Derek Vigil-Fowler | NREL

Science.gov Websites

simulation methods for materials physics and chemistry, with particular expertise in post-DFT, high accuracy methods such as the GW approximation for electronic structure and random phase approximation (RPA) total the art in computational methods, including efficient methods for including the effects of substrates
Parallel computing techniques for rotorcraft aerodynamics

NASA Astrophysics Data System (ADS)

Ekici, Kivanc

The modification of unsteady three-dimensional Navier-Stokes codes for application on massively parallel and distributed computing environments is investigated. The Euler/Navier-Stokes code TURNS (Transonic Unsteady Rotor Navier-Stokes) was chosen as a test bed because of its wide use by universities and industry. For the efficient implementation of TURNS on parallel computing systems, two algorithmic changes are developed. First, major modifications are made to the implicit operator originally used in TURNS, Lower-Upper Symmetric Gauss-Seidel (LU-SGS). Second, an inexact Newton method coupled with a Krylov subspace iterative method (Newton-Krylov method) is applied. Both techniques have been tried previously for the Euler equations mode of the code. In this work, we have extended the methods to the Navier-Stokes mode. Several new implicit operators were tried because of convergence problems of traditional operators with the high cell aspect ratio (CAR) grids needed for viscous calculations on structured grids. Promising results for both Euler and Navier-Stokes cases are presented for these operators. For the efficient implementation of Newton-Krylov methods in the Navier-Stokes mode of TURNS, efficient preconditioners must be used. The parallel implicit operators used in the previous step are employed as preconditioners and the results are compared. The Message Passing Interface (MPI) protocol has been used because of its portability to various parallel architectures. It should be noted that the proposed methodology is general and can be applied to several other CFD codes (e.g. OVERFLOW).
Utilizing fast multipole expansions for efficient and accurate quantum-classical molecular dynamics simulations

NASA Astrophysics Data System (ADS)

Schwörer, Magnus; Lorenzen, Konstantin; Mathias, Gerald; Tavan, Paul

2015-03-01

Recently, a novel approach to hybrid quantum mechanics/molecular mechanics (QM/MM) molecular dynamics (MD) simulations has been suggested [Schwörer et al., J. Chem. Phys. 138, 244103 (2013)]. Here, the forces acting on the atoms are calculated by grid-based density functional theory (DFT) for a solute molecule and by a polarizable molecular mechanics (PMM) force field for a large solvent environment composed of several 10³-10⁵ molecules as negative gradients of a DFT/PMM hybrid Hamiltonian. The electrostatic interactions are efficiently described by a hierarchical fast multipole method (FMM). Adopting recent progress of this FMM technique [Lorenzen et al., J. Chem. Theory Comput. 10, 3244 (2014)], which particularly entails a strictly linear scaling of the computational effort with the system size, and adapting this revised FMM approach to the computation of the interactions between the DFT and PMM fragments of a simulation system, here, we show how one can further enhance the efficiency and accuracy of such DFT/PMM-MD simulations. The resulting gain of total performance, as measured for alanine dipeptide (DFT) embedded in water (PMM) by the product of the gains in efficiency and accuracy, amounts to about one order of magnitude. We also demonstrate that the jointly parallelized implementation of the DFT and PMM-MD parts of the computation enables the efficient use of high-performance computing systems. The associated software is available online.
Computer-intensive simulation of solid-state NMR experiments using SIMPSON.

PubMed

Tošner, Zdeněk; Andersen, Rasmus; Stevensson, Baltzar; Edén, Mattias; Nielsen, Niels Chr; Vosegaard, Thomas

2014-09-01

Conducting large-scale solid-state NMR simulations requires fast computer software, potentially in combination with efficient computational resources, to complete within a reasonable time frame. Such simulations may involve large spin systems, multiple-parameter fitting of experimental spectra, or multiple-pulse experiment design using parameter scan, non-linear optimization, or optimal control procedures. To efficiently accommodate such simulations, we here present an improved version of the widely distributed open-source SIMPSON NMR simulation software package adapted to contemporary high performance hardware setups. The software is optimized for fast performance on standard stand-alone computers, multi-core processors, and large clusters of identical nodes. We describe the novel features for fast computation, including internal matrix manipulations, propagator setups and acquisition strategies. For efficient calculation of powder averages, we implemented the interpolation method of Alderman, Solum, and Grant, as well as the recently introduced fast Wigner transform interpolation technique. The potential of the optimal control toolbox is greatly enhanced by higher precision gradients in combination with the efficient optimization algorithm known as limited memory Broyden-Fletcher-Goldfarb-Shanno. In addition, advanced parallelization can be used in all types of calculations, providing significant time reductions. SIMPSON thus reflects current knowledge in the field of numerical simulations of solid-state NMR experiments. The efficiency and novel features are demonstrated on representative simulations. Copyright © 2014 Elsevier Inc. All rights reserved.
On the computational aspects of comminution in discrete element method

NASA Astrophysics Data System (ADS)

Chaudry, Mohsin Ali; Wriggers, Peter

2018-04-01

In this paper, computational aspects of crushing/comminution of granular materials are addressed. For crushing, a maximum tensile stress-based criterion is used. A crushing model in the discrete element method (DEM) is prone to problems of mass conservation and reduction in the critical time step. The first problem is addressed by using an iterative scheme which, depending on geometric voids, recovers the mass of a particle. In addition, a global-local framework for the DEM problem is proposed, which tends to alleviate the local unstable motion of particles and increases the computational efficiency.

Finite element analysis and computer graphics visualization of flow around pitching and plunging airfoils

NASA Technical Reports Server (NTRS)

Bratanow, T.; Ecer, A.

1973-01-01

A general computational method for analyzing unsteady flow around pitching and plunging airfoils was developed. The finite element method was applied in developing an efficient numerical procedure for the solution of equations describing the flow around airfoils. The numerical results were employed in conjunction with computer graphics techniques to produce visualization of the flow. The investigation involved mathematical model studies of flow in two phases: (1) analysis of a potential flow formulation and (2) analysis of an incompressible, unsteady, viscous flow from Navier-Stokes equations.
Efficient Conservative Reformulation Schemes for Lithium Intercalation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Urisanga, PC; Rife, D; De, S

Porous electrode theory coupled with transport and reaction mechanisms is a widely used technique to model Li-ion batteries employing an appropriate discretization or approximation for solid phase diffusion with electrode particles. One of the major difficulties in simulating Li-ion battery models is the need to account for solid phase diffusion in a second radial dimension r, which increases the computation time/cost to a great extent. Various methods that reduce the computational cost have been introduced to treat this phenomenon, but most of them do not guarantee mass conservation. The aim of this paper is to introduce an inherently mass conserving yet computationally efficient method for solid phase diffusion based on Lobatto III A quadrature. This paper also presents coupling of the new solid phase reformulation scheme with a macro-homogeneous porous electrode theory based pseudo 2D model for Li-ion battery. (C) The Author(s) 2015. Published by ECS. All rights reserved.
Finite volume model for two-dimensional shallow environmental flow

USGS Publications Warehouse

Simoes, F.J.M.

2011-01-01

This paper presents the development of a two-dimensional, depth-integrated, unsteady, free-surface model based on the shallow water equations. The development was motivated by the desire to balance computational efficiency and accuracy through selective and conjunctive use of different numerical techniques. The base framework of the discrete model uses Godunov methods on unstructured triangular grids, but the solution technique emphasizes the use of a high-resolution Riemann solver where needed, switching to a simpler and computationally more efficient upwind finite volume technique in the smooth regions of the flow. Explicit time marching is accomplished with strong stability preserving Runge-Kutta methods, with additional acceleration techniques for steady-state computations. A simplified mass-preserving algorithm is used to deal with wet/dry fronts. The model is applied to several benchmark cases that show the interplay of the diverse solution techniques.
Design of a Variational Multiscale Method for Turbulent Compressible Flows

NASA Technical Reports Server (NTRS)

Diosady, Laslo Tibor; Murman, Scott M.

2013-01-01

A spectral-element framework is presented for the simulation of subsonic compressible high-Reynolds-number flows. The focus of the work is maximizing the efficiency of the computational schemes to enable unsteady simulations with a large number of spatial and temporal degrees of freedom. A collocation scheme is combined with optimized computational kernels to provide a residual evaluation with computational cost independent of order of accuracy up to 16th order. The optimized residual routines are used to develop a low-memory implicit scheme based on a matrix-free Newton-Krylov method. A preconditioner based on the finite-difference diagonalized ADI scheme is developed which maintains the low memory of the matrix-free implicit solver, while providing improved convergence properties. Emphasis on low memory usage throughout the solver development is leveraged to implement a coupled space-time DG solver which may offer further efficiency gains through adaptivity in both space and time.
Precise and fast spatial-frequency analysis using the iterative local Fourier transform.

PubMed

Lee, Sukmock; Choi, Heejoo; Kim, Dae Wook

2016-09-19

The use of the discrete Fourier transform has decreased since the introduction of the fast Fourier transform (fFT), which is a numerically efficient computing process. This paper presents the iterative local Fourier transform (ilFT), a set of new processing algorithms that iteratively apply the discrete Fourier transform within a local and optimal frequency domain. The new technique achieves 2¹⁰ times higher frequency resolution than the fFT within a comparable computation time. The method's superb computing efficiency, high resolution, spectrum zoom-in capability, and overall performance are evaluated and compared to other advanced high-resolution Fourier transform techniques, such as the fFT combined with several fitting methods. The effectiveness of the ilFT is demonstrated through the data analysis of a set of Talbot self-images (1280 × 1024 pixels) obtained with an experimental setup using grating in a diverging beam produced by a coherent point source.
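The idea of re-applying the discrete Fourier transform on a progressively narrower local frequency grid can be sketched as follows. This Python is a simplified reading of the abstract above, not the authors' ilFT: the grid size, number of refinement levels, test signal, and function names (`dft_at`, `refine_peak`) are assumptions, and the example is one-dimensional rather than image-based.

```python
import cmath


def dft_at(signal, freq):
    """Magnitude of the DFT sum evaluated at a (possibly fractional) bin index."""
    n = len(signal)
    return abs(sum(s * cmath.exp(-2j * cmath.pi * freq * k / n)
                   for k, s in enumerate(signal)))


def refine_peak(signal, levels=4, points=10):
    """Locate the dominant frequency, refining the grid by `points` per level."""
    n = len(signal)
    # Coarse pass: ordinary integer-bin DFT magnitudes.
    center = max(range(n), key=lambda k: dft_at(signal, k))
    step = 1.0
    for _ in range(levels):
        # Evaluate the DFT only on a local grid of 2 * points + 1 frequencies
        # spanning one previous step on either side of the current peak.
        step /= points
        grid = [center + j * step for j in range(-points, points + 1)]
        center = max(grid, key=lambda f: dft_at(signal, f))
    return center


N = 64
TRUE_FREQ = 12.3456  # assumed test frequency, in units of DFT bins
signal = [cmath.exp(2j * cmath.pi * TRUE_FREQ * k / N) for k in range(N)]
estimate = refine_peak(signal)
```

Each level costs only 21 single-frequency sums over the 64 samples, so four levels reach a grid spacing of 10⁻⁴ bins without enlarging the transform. Zero-padding a transform to the same spacing would need a far longer array.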
A Bitslice Implementation of Anderson's Attack on A5/1

NASA Astrophysics Data System (ADS)

Bulavintsev, Vadim; Semenov, Alexander; Zaikin, Oleg; Kochemazov, Stepan

2018-03-01

The A5/1 keystream generator is part of the Global System for Mobile Communications (GSM) protocol, employed in cellular networks all over the world. Its cryptographic resistance has been extensively analyzed in dozens of papers. However, almost all corresponding methods either employ specific hardware or require an extensive preprocessing stage and significant amounts of memory. In the present study, a bitslice variant of Anderson's attack on A5/1 is implemented. It requires very little computer memory and no preprocessing. Moreover, the attack can be made even more efficient by harnessing the computing power of modern Graphics Processing Units (GPUs). As a result, using commonly available GPUs, this method can quite efficiently recover the secret key using only 64 bits of keystream. To test the performance of the implementation, a volunteer computing project was launched. Ten instances of A5/1 cryptanalysis were successfully solved in this project in a single week.
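Bitslicing stores bit i of many independent cipher instances in one machine word, so a single bitwise operation advances all instances at once. The sketch below shows only a bitsliced A5/1 clocking step checked against a scalar reference; it is not Anderson's attack and not the authors' implementation. The register lengths, taps, and clocking bits are the commonly published A5/1 parameters rather than anything stated in the abstract, and all function names are hypothetical.

```python
import random

LANES = 64                      # number of cipher instances processed in parallel
MASK = (1 << LANES) - 1
# (length, feedback taps, clocking bit) for R1, R2, R3: commonly published values
SPEC = [(19, (13, 16, 17, 18), 8), (22, (20, 21), 10), (23, (7, 20, 21, 22), 10)]

def step_scalar(regs):
    """Reference: clock one A5/1 instance; regs is three lists of 0/1 bits."""
    clock_bits = [regs[k][SPEC[k][2]] for k in range(3)]
    maj = 1 if sum(clock_bits) >= 2 else 0
    new_regs = []
    for k, (_, taps, _) in enumerate(SPEC):
        reg = regs[k]
        if clock_bits[k] == maj:        # a register moves only if it agrees with the majority
            feedback = 0
            for t in taps:
                feedback ^= reg[t]
            reg = [feedback] + reg[:-1]
        new_regs.append(list(reg))
    return new_regs, new_regs[0][18] ^ new_regs[1][21] ^ new_regs[2][22]

def step_bitsliced(regs):
    """Clock LANES instances at once; regs[k][i] holds bit i of register k for every lane."""
    clock_bits = [regs[k][SPEC[k][2]] for k in range(3)]
    a, b, c = clock_bits
    maj = (a & b) | (a & c) | (b & c)   # majority computed lane by lane
    new_regs = []
    for k, (_, taps, _) in enumerate(SPEC):
        reg = regs[k]
        move = ~(clock_bits[k] ^ maj) & MASK   # lanes in which register k is clocked
        stay = ~move & MASK
        feedback = 0
        for t in taps:
            feedback ^= reg[t]
        shifted = [feedback] + reg[:-1]
        # Branch-free select: take the shifted bit where clocked, the old bit elsewhere.
        new_regs.append([(move & s) | (stay & r) for s, r in zip(shifted, reg)])
    return new_regs, new_regs[0][18] ^ new_regs[1][21] ^ new_regs[2][22]

def pack(states):
    """Convert a list of per-instance register triples into bitsliced form."""
    return [[sum(state[k][i] << lane for lane, state in enumerate(states))
             for i in range(SPEC[k][0])] for k in range(3)]

# Compare both versions on random (hypothetical) register states.
rng = random.Random(1)
states = [[[rng.randint(0, 1) for _ in range(length)] for length, _, _ in SPEC]
          for _ in range(LANES)]
sliced = pack(states)
all_match = True
for _ in range(100):
    sliced, out_word = step_bitsliced(sliced)
    for lane in range(LANES):
        states[lane], bit = step_scalar(states[lane])
        all_match = all_match and ((out_word >> lane) & 1) == bit
```

The data-dependent "clock or hold" branch of the scalar version becomes a mask-and-select in the bitsliced version, which is what makes the approach suit wide SIMD units and GPUs.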
A 3D inversion for all-space magnetotelluric data with static shift correction

NASA Astrophysics Data System (ADS)

Zhang, Kun

2017-04-01

Based on previous studies of static shift correction and 3D inversion algorithms, we improve the NLCG 3D inversion method and propose a new static shift correction method that works within the inversion. The static shift correction method is based on 3D theory and real data. The static shift can be detected by quantitative analysis of the apparent parameters (apparent resistivity and impedance phase) of MT in the high-frequency range, and the correction is completed during the inversion. The method is an automatic computer processing technique with no additional cost, and it avoids additional field work and indoor processing while giving good results. The 3D inversion algorithm is improved (Zhang et al., 2013) based on the NLCG method of Newman & Alumbaugh (2000) and Rodi & Mackie (2001). For the algorithm, we added a parallel structure, improved the computational efficiency, reduced the memory requirement, and added topographic and marine factors. As a result, the 3D inversion can run on a general-purpose PC with high efficiency and accuracy, and MT data from surface, seabed, and underground stations can all be used in the inversion algorithm.
Spectral difference Lanczos method for efficient time propagation in quantum control theory

NASA Astrophysics Data System (ADS)

Farnum, John D.; Mazziotti, David A.

2004-04-01

Spectral difference methods represent the real-space Hamiltonian of a quantum system as a banded matrix which possesses the accuracy of the discrete variable representation (DVR) and the efficiency of finite differences. When applied to time-dependent quantum mechanics, spectral differences enhance the efficiency of propagation methods for evolving the Schrödinger equation. We develop a spectral difference Lanczos method which is computationally more economical than the sinc-DVR Lanczos method, the split-operator technique, and even the fast-Fourier-Transform Lanczos method. Application of fast propagation is made to quantum control theory where chirped laser pulses are designed to dissociate both diatomic and polyatomic molecules. The specificity of the chirped laser fields is also tested as a possible method for molecular identification and discrimination.
Structural Analysis Methods for Structural Health Management of Future Aerospace Vehicles

NASA Technical Reports Server (NTRS)

Tessler, Alexander

2007-01-01

Two finite element based computational methods, Smoothing Element Analysis (SEA) and the inverse Finite Element Method (iFEM), are reviewed, and examples of their use for structural health monitoring are discussed. Due to their versatility, robustness, and computational efficiency, the methods are well suited for real-time structural health monitoring of future space vehicles, large space structures, and habitats. The methods may be effectively employed to enable real-time processing of sensing information, specifically for identifying three-dimensional deformed structural shapes as well as the internal loads. In addition, they may be used in conjunction with evolutionary algorithms to design optimally distributed sensors. These computational tools have demonstrated substantial promise for utilization in future Structural Health Management (SHM) systems.
Time-Domain Computation Of Electromagnetic Fields In MMICs

NASA Technical Reports Server (NTRS)

Lansing, Faiza S.; Rascoe, Daniel L.

1995-01-01

Maxwell's equations solved on three-dimensional, conformed orthogonal grids by finite-difference techniques. Method of computing frequency-dependent electrical parameters of monolithic microwave integrated circuit (MMIC) involves time-domain computation of propagation of electromagnetic field in response to excitation by single pulse at input terminal, followed by computation of Fourier transforms to obtain frequency-domain response from time-domain response. Parameters computed include electric and magnetic fields, voltages, currents, impedances, scattering parameters, and effective dielectric constants. Powerful and efficient means for analyzing performance of even complicated MMIC.
Methods and devices for determining quality of services of storage systems

DOEpatents

Seelam, Seetharami R [Yorktown Heights, NY; Teller, Patricia J [Las Cruces, NM

2012-01-17

Methods and systems for allowing access to computer storage systems. Multiple requests from multiple applications can be received and processed efficiently to allow traffic from multiple customers to access the storage system concurrently.
Speed-up of the volumetric method of moments for the approximate RCS of large arbitrary-shaped dielectric targets

NASA Astrophysics Data System (ADS)

Moreno, Javier; Somolinos, Álvaro; Romero, Gustavo; González, Iván; Cátedra, Felipe

2017-08-01

A method for the rigorous computation of the electromagnetic scattering of large dielectric volumes is presented. One goal is to simplify the analysis of large dielectric targets with translational symmetries by taking advantage of their Toeplitz symmetry. The matrix-fill stage of the Method of Moments is then performed efficiently because the number of coupling terms to compute is reduced. The Multilevel Fast Multipole Method is applied to solve the problem. Structured meshes are obtained efficiently to approximate the dielectric volumes. The regular mesh grid is built from parallelepipeds whose centres have been identified as internal to the target. The ray casting algorithm is used to classify the parallelepiped centres. It may become a bottleneck when too many points are evaluated in volumes defined by parametric surfaces, so a hierarchical algorithm is proposed to minimize the number of evaluations. Measurements and analytical results are included for validation purposes.
Efficient method for computing the maximum-likelihood quantum state from measurements with additive Gaussian noise.

PubMed

Smolin, John A; Gambetta, Jay M; Smith, Graeme

2012-02-17

We provide an efficient method for computing the maximum-likelihood mixed quantum state (with density matrix ρ) given a set of measurement outcomes in a complete orthonormal operator basis subject to Gaussian noise. Our method works by first changing basis yielding a candidate density matrix μ which may have nonphysical (negative) eigenvalues, and then finding the nearest physical state under the 2-norm. Our algorithm takes at worst O(d⁴) for the basis change plus O(d³) for finding ρ where d is the dimension of the quantum state. In the special case where the measurement basis is strings of Pauli operators, the basis change takes only O(d³) as well. The workhorse of the algorithm is a new linear-time method for finding the closest probability distribution (in Euclidean distance) to a set of real numbers summing to one.
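The last step, finding the probability distribution closest in Euclidean distance to real numbers that sum to one, can be sketched as follows. This is an illustration based on the standard construction for this projection (zero out the smallest entries and spread their deficit evenly over the rest); the function name `nearest_probabilities` is hypothetical, and the abstract does not confirm that this matches the authors' code. The sketch sorts its input, so it is linear only after sorting, for example when the eigenvalues already arrive ordered.

```python
def nearest_probabilities(values):
    """Return the probability vector closest (Euclidean) to `values`.

    Assumes the entries of `values` sum to one; some may be negative.
    """
    # Indices ordered from largest to smallest value.
    order = sorted(range(len(values)), key=lambda k: values[k], reverse=True)
    mu = [values[k] for k in order]

    i = len(mu)       # number of entries still allowed to be nonzero
    acc = 0.0         # running sum of the entries that were zeroed out
    # Walk up from the smallest entry: zero it if it would remain negative
    # after receiving its equal share of the accumulated deficit.
    while mu[i - 1] + acc / i < 0:
        acc += mu[i - 1]
        i -= 1

    result = [0.0] * len(values)
    for j in range(i):
        result[order[j]] = mu[j] + acc / i   # spread the deficit evenly
    return result

# Hypothetical candidate spectrum with one nonphysical (negative) entry.
probs = nearest_probabilities([0.6, -0.1, 0.5])   # approximately [0.55, 0.0, 0.45]
```

Because the inputs sum to one, the loop always stops with at least one surviving entry, and the output is nonnegative and still sums to one.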
Method and system for efficient video compression with low-complexity encoder

NASA Technical Reports Server (NTRS)

Chen, Jun (Inventor); He, Dake (Inventor); Sheinin, Vadim (Inventor); Jagmohan, Ashish (Inventor); Lu, Ligang (Inventor)

2012-01-01

Disclosed are a method and system for video compression, wherein the video encoder has low computational complexity and high compression efficiency. The disclosed system comprises a video encoder and a video decoder, wherein the method for encoding includes the steps of: converting a source frame into a space-frequency representation; estimating conditional statistics of at least one vector of space-frequency coefficients; estimating encoding rates based on the said conditional statistics; and applying Slepian-Wolf codes with the said computed encoding rates. The preferred method for decoding includes the steps of: generating a side-information vector of frequency coefficients based on previously decoded source data, encoder statistics, and previous reconstructions of the source frequency vector; and performing Slepian-Wolf decoding of at least one source frequency vector based on the generated side-information, the Slepian-Wolf code bits and the encoder statistics.
A single-loop optimization method for reliability analysis with second order uncertainty

NASA Astrophysics Data System (ADS)

Xie, Shaojun; Pan, Baisong; Du, Xiaoping

2015-08-01

Reliability analysis may involve random variables and interval variables. In addition, some of the random variables may have interval distribution parameters owing to limited information. This kind of uncertainty is called second order uncertainty. This article develops an efficient reliability method for problems involving the three aforementioned types of uncertain input variables. The analysis produces the maximum and minimum reliability and is computationally demanding because two loops are needed: a reliability analysis loop with respect to random variables and an interval analysis loop for extreme responses with respect to interval variables. The first order reliability method and nonlinear optimization are used for the two loops, respectively. For computational efficiency, the two loops are combined into a single loop by treating the Karush-Kuhn-Tucker (KKT) optimal conditions of the interval analysis as constraints. Three examples are presented to demonstrate the proposed method.
A Fast Gradient Method for Nonnegative Sparse Regression With Self-Dictionary

NASA Astrophysics Data System (ADS)

Gillis, Nicolas; Luce, Robert

2018-01-01

A nonnegative matrix factorization (NMF) can be computed efficiently under the separability assumption, which asserts that all the columns of the given input data matrix belong to the cone generated by a (small) subset of them. The provably most robust methods to identify these conic basis columns are based on nonnegative sparse regression and self dictionaries, and require the solution of large-scale convex optimization problems. In this paper we study a particular nonnegative sparse regression model with self dictionary. As opposed to previously proposed models, this model yields a smooth optimization problem where the sparsity is enforced through linear constraints. We show that the Euclidean projection on the polyhedron defined by these constraints can be computed efficiently, and propose a fast gradient method to solve our model. We compare our algorithm with several state-of-the-art methods on synthetic data sets and real-world hyperspectral images.
Job Scheduling with Efficient Resource Monitoring in Cloud Datacenter

PubMed Central

Loganathan, Shyamala; Mukherjee, Saswati

2015-01-01

Cloud computing is an on-demand computing model that uses virtualization technology to provide cloud resources to users in the form of virtual machines through the internet. Being an adaptable technology, cloud computing is an excellent alternative for organizations that want to form their own private cloud. Since resources are limited in these private clouds, maximizing resource utilization and guaranteeing service to the user are the ultimate goals. For that, efficient scheduling is needed. This research reports on an efficient data structure for resource management and a resource scheduling technique in a private cloud environment, and discusses a cloud model. The proposed scheduling algorithm considers the types of jobs and the resource availability in its scheduling decision. Finally, we conducted simulations using CloudSim and compared our algorithm with other existing methods, such as V-MCT and priority scheduling algorithms. PMID:26473166
Application of augmented-Lagrangian methods in meteorology: Comparison of different conjugate-gradient codes for large-scale minimization

NASA Technical Reports Server (NTRS)

Navon, I. M.

1984-01-01

A Lagrange multiplier method using techniques developed by Bertsekas (1982) was applied to solving the problem of enforcing simultaneous conservation of the nonlinear integral invariants of the shallow water equations on a limited area domain. This application of nonlinear constrained optimization is of the large dimensional type and the conjugate gradient method was found to be the only computationally viable method for the unconstrained minimization. Several conjugate-gradient codes were tested and compared for increasing accuracy requirements. Robustness and computational efficiency were the principal criteria.
Automatic registration of terrestrial point clouds based on panoramic reflectance images and efficient BaySAC

NASA Astrophysics Data System (ADS)

Kang, Zhizhong

2013-10-01

This paper presents a new approach to automatic registration of terrestrial laser scanning (TLS) point clouds utilizing a novel robust estimation method, an efficient BaySAC (BAYes SAmpling Consensus). The proposed method directly generates reflectance images from 3D point clouds and then uses the SIFT algorithm to extract keypoints and identify corresponding image points. The 3D corresponding points, from which the transformation parameters between point clouds are computed, are acquired by mapping the 2D ones onto the point cloud. To remove falsely accepted correspondences, we implement a conditional sampling method that selects the n data points with the highest inlier probabilities as a hypothesis set and updates the inlier probability of each data point using a simplified Bayes' rule to improve computational efficiency. The prior probability is estimated by verifying the distance invariance between correspondences. The proposed approach is tested on four data sets acquired by three different scanners. The results show that, compared with RANSAC, BaySAC requires fewer iterations and lower computational cost when the hypothesis set is contaminated with more outliers. The registration results also indicate that the proposed algorithm can achieve high registration accuracy on all experimental data sets.
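The conditional sampling loop described above (pick the n most probable inliers, test the hypothesis, lower their probabilities if it fails) can be sketched in a few lines. The abstract does not state the paper's simplified Bayes' rule, so the update below is the plain Bayes' rule under the assumption that inlier events are independent; the function names and the example probabilities are hypothetical.

```python
import math

def select_hypothesis(probs, n):
    """Indices of the n data points with the highest inlier probabilities."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:n]

def update_after_failure(probs, hypothesis):
    """Bayes update after a hypothesis set is found to contain an outlier.

    Assumes independent inlier events, so the probability that the whole
    set is outlier-free is the product of its members' probabilities:
        P(i inlier | set contaminated) = (P_i - P_all) / (1 - P_all)
    """
    p_all = math.prod(probs[i] for i in hypothesis)
    if p_all >= 1.0:
        # Degenerate case: the set was certain to be clean; nothing to update.
        return list(probs)
    updated = list(probs)
    for i in hypothesis:
        updated[i] = (probs[i] - p_all) / (1.0 - p_all)
    return updated

# Hypothetical prior inlier probabilities for five correspondences.
priors = [0.9, 0.8, 0.7, 0.4, 0.3]
first = select_hypothesis(priors, 2)            # the two most probable points
posteriors = update_after_failure(priors, first)
second = select_hypothesis(posteriors, 2)       # a different set is tried next
```

Points that took part in a failed hypothesis lose probability, so the next deterministic selection moves on to other candidates without random resampling.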
