Sample records for computationally efficient reduced-order

  1. Reduced-Order Modeling: New Approaches for Computational Physics

    NASA Technical Reports Server (NTRS)

    Beran, Philip S.; Silva, Walter A.

    2001-01-01

    In this paper, we review the development of new reduced-order modeling techniques and discuss their applicability to various problems in computational physics. Emphasis is given to methods based on Volterra series representations and the proper orthogonal decomposition. Results are reported for different nonlinear systems to provide clear examples of the construction and use of reduced-order models, particularly in the multi-disciplinary field of computational aeroelasticity. Unsteady aerodynamic and aeroelastic behaviors of two-dimensional and three-dimensional geometries are described. Large increases in computational efficiency are obtained through the use of reduced-order models, thereby justifying the initial computational expense of constructing these models and motivating their use for multi-disciplinary design analysis.
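
    The snapshot form of the proper orthogonal decomposition invoked here reduces to a single SVD of a snapshot matrix; a minimal numpy sketch of that standard construction (the toy data and function names are ours, not the paper's):

```python
import numpy as np

def pod_basis(snapshots, energy=0.999):
    """POD basis from a snapshot matrix (n_dof x n_snapshots).

    Keeps the leading left singular vectors that capture the requested
    fraction of snapshot energy (sum of squared singular values).
    """
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    frac = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(frac, energy)) + 1
    return U[:, :r], s[:r]

# Toy usage: 200 snapshots of a travelling wave sampled on 500 grid points.
x = np.linspace(0.0, 1.0, 500)[:, None]
t = np.linspace(0.0, 1.0, 200)[None, :]
X = np.sin(2 * np.pi * (x - 0.3 * t)) + 0.1 * np.sin(6 * np.pi * x) * np.cos(4 * np.pi * t)
Phi, sv = pod_basis(X)
print(Phi.shape)  # (500, r) with r far smaller than 200
```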

  2. Reduced-Order Models for the Aeroelastic Analysis of Ares Launch Vehicles

    NASA Technical Reports Server (NTRS)

    Silva, Walter A.; Vatsa, Veer N.; Biedron, Robert T.

    2010-01-01

    This document presents the development and application of unsteady aerodynamic, structural dynamic, and aeroelastic reduced-order models (ROMs) for the ascent aeroelastic analysis of the Ares I-X flight test and Ares I crew launch vehicles using the unstructured-grid, aeroelastic FUN3D computational fluid dynamics (CFD) code. The purpose of this work is to perform computationally-efficient aeroelastic response calculations that would be prohibitively expensive via computation of multiple full-order aeroelastic FUN3D solutions. These efficient aeroelastic ROM solutions provide valuable insight regarding the aeroelastic sensitivity of the vehicles to various parameters over a range of dynamic pressures.

  3. REVEAL: An Extensible Reduced Order Model Builder for Simulation and Modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Agarwal, Khushbu; Sharma, Poorva; Ma, Jinliang

    2013-04-30

    Many science domains need to build computationally efficient and accurate representations of high fidelity, computationally expensive simulations. These computationally efficient versions are known as reduced-order models. This paper presents the design and implementation of a novel reduced-order model (ROM) builder, the REVEAL toolset. This toolset generates ROMs based on science- and engineering-domain specific simulations executed on high performance computing (HPC) platforms. The toolset encompasses a range of sampling and regression methods that can be used to generate a ROM, automatically quantifies the ROM accuracy, and provides support for an iterative approach to improve ROM accuracy. REVEAL is designed to be extensible in order to utilize the core functionality with any simulator that has published input and output formats. It also defines programmatic interfaces to include new sampling and regression techniques so that users can ‘mix and match’ mathematical techniques to best suit the characteristics of their model. In this paper, we describe the architecture of REVEAL and demonstrate its usage with a computational fluid dynamics model used in carbon capture.

  4. Distributing and storing data efficiently by means of special datasets in the ATLAS collaboration

    NASA Astrophysics Data System (ADS)

    Köneke, Karsten; ATLAS Collaboration

    2011-12-01

    With the start of the LHC physics program, the ATLAS experiment started to record vast amounts of data. This data has to be distributed and stored on the world-wide computing grid in a smart way in order to enable an effective and efficient analysis by physicists. This article describes how the ATLAS collaboration chose to create specialized reduced datasets in order to efficiently use computing resources and facilitate physics analyses.

  5. A Reduced-Order Model for Efficient Simulation of Synthetic Jet Actuators

    NASA Technical Reports Server (NTRS)

    Yamaleev, Nail K.; Carpenter, Mark H.

    2003-01-01

    A new reduced-order model of multidimensional synthetic jet actuators that combines the accuracy and conservation properties of full numerical simulation methods with the efficiency of simplified zero-order models is proposed. The multidimensional actuator is simulated by solving the time-dependent compressible quasi-1-D Euler equations, while the diaphragm is modeled as a moving boundary. The governing equations are approximated with a fourth-order finite difference scheme on a moving mesh such that one of the mesh boundaries coincides with the diaphragm. The reduced-order model of the actuator has several advantages. In contrast to the 3-D models, this approach provides conservation of mass, momentum, and energy. Furthermore, the new method is computationally much more efficient than the multidimensional Navier-Stokes simulation of the actuator cavity flow, while providing practically the same accuracy in the exterior flowfield. The most distinctive feature of the present model is its ability to predict the resonance characteristics of synthetic jet actuators; this is not practical when using the 3-D models because of the computational cost involved. Numerical results demonstrating the accuracy of the new reduced-order model and its limitations are presented.

  6. Alternative Modal Basis Selection Procedures For Reduced-Order Nonlinear Random Response Simulation

    NASA Technical Reports Server (NTRS)

    Przekop, Adam; Guo, Xinyun; Rizzi, Stephen A.

    2012-01-01

    Three procedures to guide selection of an efficient modal basis in a nonlinear random response analysis are examined. One method is based only on proper orthogonal decomposition, while the other two additionally involve smooth orthogonal decomposition. Acoustic random response problems are employed to assess the performance of the three modal basis selection approaches. A thermally post-buckled beam exhibiting snap-through behavior, a shallowly curved arch in the auto-parametric response regime and a plate structure are used as numerical test articles. The results of a computationally taxing full-order analysis in physical degrees of freedom are taken as the benchmark for comparison with the results from the three reduced-order analyses. For the cases considered, all three methods are shown to produce modal bases resulting in accurate and computationally efficient reduced-order nonlinear simulations.
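
    Smooth orthogonal decomposition, used above alongside POD, weighs the displacement covariance against the velocity covariance through a generalized symmetric eigenproblem. The sketch below follows the common textbook formulation under our own assumptions; the paper's calibrated procedure may differ:

```python
import numpy as np
from scipy.linalg import eigh

def sod_modes(Q, dt):
    """Smooth orthogonal decomposition of displacement histories.

    Q  : (n_samples x n_dof) displacement time series
    dt : sampling interval, used to form velocities by differencing

    Solves Sigma_q v = lam * Sigma_dq v, where Sigma_q and Sigma_dq are
    the displacement and velocity covariances; a large lam flags a mode
    that is both energetic and smooth in time.
    """
    V = np.gradient(Q, dt, axis=0)        # finite-difference velocities
    Sq = Q.T @ Q / Q.shape[0]
    Sv = V.T @ V / V.shape[0]
    lam, vecs = eigh(Sq, Sv)              # generalized symmetric eigenproblem
    order = np.argsort(-lam)
    return lam[order], vecs[:, order]

# Toy usage: two-DOF response with one slow coherent mode buried in noise.
rng = np.random.default_rng(0)
t = np.arange(0.0, 20.0, 0.01)
Q = np.outer(np.sin(t), [1.0, 0.5]) + 0.05 * rng.standard_normal((t.size, 2))
lam, modes = sod_modes(Q, 0.01)
print(modes[:, 0])  # roughly parallel to [1.0, 0.5], up to scale
```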

  7. Equivalent model construction for a non-linear dynamic system based on an element-wise stiffness evaluation procedure and reduced analysis of the equivalent system

    NASA Astrophysics Data System (ADS)

    Kim, Euiyoung; Cho, Maenghyo

    2017-11-01

    In most non-linear analyses, the construction of a system matrix uses a large amount of computation time, comparable to the computation time required by the solving process. If the process for computing non-linear internal force matrices is substituted with an effective equivalent model that enables the bypass of numerical integrations and assembly processes used in matrix construction, efficiency can be greatly enhanced. A stiffness evaluation procedure (STEP) establishes non-linear internal force models using polynomial formulations of displacements. To efficiently identify an equivalent model, the method has evolved such that it is based on a reduced-order system. The reduction process, however, makes the equivalent model difficult to parameterize, which significantly affects the efficiency of the optimization process. In this paper, therefore, a new STEP, E-STEP, is proposed. Based on the element-wise nature of the finite element model, the stiffness evaluation is carried out element-by-element in the full domain. Since the unit of computation for the stiffness evaluation is restricted by element size, and since the computation is independent, the equivalent model can be constructed efficiently in parallel, even in the full domain. Due to the element-wise nature of the construction procedure, the equivalent E-STEP model is easily characterized by design parameters. Various reduced-order modeling techniques can be applied to the equivalent system in a manner similar to how they are applied in the original system. The reduced-order model based on E-STEP is successfully demonstrated for the dynamic analyses of non-linear structural finite element systems under varying design parameters.

  8. Alternative Modal Basis Selection Procedures for Nonlinear Random Response Simulation

    NASA Technical Reports Server (NTRS)

    Przekop, Adam; Guo, Xinyun; Rizzi, Stephen A.

    2010-01-01

    Three procedures to guide selection of an efficient modal basis in a nonlinear random response analysis are examined. One method is based only on proper orthogonal decomposition, while the other two additionally involve smooth orthogonal decomposition. Acoustic random response problems are employed to assess the performance of the three modal basis selection approaches. A thermally post-buckled beam exhibiting snap-through behavior, a shallowly curved arch in the auto-parametric response regime and a plate structure are used as numerical test articles. The results of the three reduced-order analyses are compared with the results of the computationally taxing simulation in the physical degrees of freedom. For the cases considered, all three methods are shown to produce modal bases resulting in accurate and computationally efficient reduced-order nonlinear simulations.

  9. Wall Shear Stress Distribution in a Patient-Specific Cerebral Aneurysm Model using Reduced Order Modeling

    NASA Astrophysics Data System (ADS)

    Han, Suyue; Chang, Gary Han; Schirmer, Clemens; Modarres-Sadeghi, Yahya

    2016-11-01

    We construct a reduced-order model (ROM) to study the Wall Shear Stress (WSS) distributions in image-based, patient-specific aneurysm models. The magnitude of WSS has been shown to be a critical factor in the growth and rupture of human aneurysms. We start the process by running a training case using a Computational Fluid Dynamics (CFD) simulation with time-varying flow parameters, such that these parameters cover the range of parameters of interest. The method of snapshot Proper Orthogonal Decomposition (POD) is utilized to construct the reduced-order bases using the training CFD simulation. The resulting ROM enables us to study the flow patterns and the WSS distributions over a range of system parameters in a computationally efficient manner with a relatively small number of modes. This enables comprehensive analysis of the model system across a range of physiological conditions without the need to re-compute the simulation for small changes in the system parameters.

  10. Preserving Lagrangian Structure in Nonlinear Model Reduction with Application to Structural Dynamics

    DOE PAGES

    Carlberg, Kevin; Tuminaro, Ray; Boggs, Paul

    2015-03-11

    Our work proposes a model-reduction methodology that preserves Lagrangian structure and achieves computational efficiency in the presence of high-order nonlinearities and arbitrary parameter dependence. As such, the resulting reduced-order model retains key properties such as energy conservation and symplectic time-evolution maps. We focus on parameterized simple mechanical systems subjected to Rayleigh damping and external forces, and consider an application to nonlinear structural dynamics. To preserve structure, the method first approximates the system's "Lagrangian ingredients" (the Riemannian metric, the potential-energy function, the dissipation function, and the external force) and subsequently derives reduced-order equations of motion by applying the (forced) Euler-Lagrange equation with these quantities. Moreover, from the algebraic perspective, key contributions include two efficient techniques for approximating parameterized reduced matrices while preserving symmetry and positive definiteness: matrix gappy proper orthogonal decomposition and reduced-basis sparsification. Our results for a parameterized truss-structure problem demonstrate the practical importance of preserving Lagrangian structure and illustrate the proposed method's merits: it reduces computation time while maintaining high accuracy and stability, in contrast to existing nonlinear model-reduction techniques that do not preserve structure.

  11. Preserving Lagrangian Structure in Nonlinear Model Reduction with Application to Structural Dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carlberg, Kevin; Tuminaro, Ray; Boggs, Paul

    Our work proposes a model-reduction methodology that preserves Lagrangian structure and achieves computational efficiency in the presence of high-order nonlinearities and arbitrary parameter dependence. As such, the resulting reduced-order model retains key properties such as energy conservation and symplectic time-evolution maps. We focus on parameterized simple mechanical systems subjected to Rayleigh damping and external forces, and consider an application to nonlinear structural dynamics. To preserve structure, the method first approximates the system's "Lagrangian ingredients" (the Riemannian metric, the potential-energy function, the dissipation function, and the external force) and subsequently derives reduced-order equations of motion by applying the (forced) Euler-Lagrange equation with these quantities. Moreover, from the algebraic perspective, key contributions include two efficient techniques for approximating parameterized reduced matrices while preserving symmetry and positive definiteness: matrix gappy proper orthogonal decomposition and reduced-basis sparsification. Our results for a parameterized truss-structure problem demonstrate the practical importance of preserving Lagrangian structure and illustrate the proposed method's merits: it reduces computation time while maintaining high accuracy and stability, in contrast to existing nonlinear model-reduction techniques that do not preserve structure.

  12. Reduced-order modelling of parameter-dependent, linear and nonlinear dynamic partial differential equation models.

    PubMed

    Shah, A A; Xing, W W; Triantafyllidis, V

    2017-04-01

    In this paper, we develop reduced-order models for dynamic, parameter-dependent, linear and nonlinear partial differential equations using proper orthogonal decomposition (POD). The main challenges are to accurately and efficiently approximate the POD bases for new parameter values and, in the case of nonlinear problems, to efficiently handle the nonlinear terms. We use a Bayesian nonlinear regression approach to learn the snapshots of the solutions and the nonlinearities for new parameter values. Computational efficiency is ensured by using manifold learning to perform the emulation in a low-dimensional space. The accuracy of the method is demonstrated on a linear and a nonlinear example, with comparisons with a global basis approach.
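
    The emulation workflow described here can be illustrated schematically: build one shared POD basis from training snapshots, regress the reduced coordinates against the parameter, and reconstruct at unseen parameter values. The sketch below substitutes a plain polynomial fit for the paper's Bayesian regression and manifold learning, so it illustrates the workflow only:

```python
import numpy as np

# Training snapshots u(x; mu) for a dozen parameter values mu.
x = np.linspace(0.0, 1.0, 400)
mus = np.linspace(0.5, 2.0, 12)
X = np.stack([np.exp(-mu * x) * np.sin(4 * np.pi * x) for mu in mus], axis=1)

# One shared POD basis from all training snapshots.
U, s, _ = np.linalg.svd(X, full_matrices=False)
Phi = U[:, :4]
A = Phi.T @ X                          # reduced coordinates, (4 x n_train)

# Regress each reduced coordinate on mu (stand-in for the Bayesian emulator).
coeffs = [np.polyfit(mus, A[i], deg=4) for i in range(A.shape[0])]

def predict(mu_new):
    a = np.array([np.polyval(c, mu_new) for c in coeffs])
    return Phi @ a                     # emulated snapshot at the new parameter

truth = np.exp(-1.25 * x) * np.sin(4 * np.pi * x)
print(np.linalg.norm(predict(1.25) - truth))  # small emulation error
```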

  13. Reduced-order modelling of parameter-dependent, linear and nonlinear dynamic partial differential equation models

    PubMed Central

    Xing, W. W.; Triantafyllidis, V.

    2017-01-01

    In this paper, we develop reduced-order models for dynamic, parameter-dependent, linear and nonlinear partial differential equations using proper orthogonal decomposition (POD). The main challenges are to accurately and efficiently approximate the POD bases for new parameter values and, in the case of nonlinear problems, to efficiently handle the nonlinear terms. We use a Bayesian nonlinear regression approach to learn the snapshots of the solutions and the nonlinearities for new parameter values. Computational efficiency is ensured by using manifold learning to perform the emulation in a low-dimensional space. The accuracy of the method is demonstrated on a linear and a nonlinear example, with comparisons with a global basis approach. PMID:28484327

  14. A Very High Order, Adaptable MESA Implementation for Aeroacoustic Computations

    NASA Technical Reports Server (NTRS)

    Dyson, Rodger W.; Goodrich, John W.

    2000-01-01

    Since computational efficiency and wave resolution scale with accuracy, the ideal would be infinitely high accuracy for problems with widely varying wavelength scales. Currently, many computational aeroacoustics methods are limited to 4th-order accurate Runge-Kutta methods in time, which limits their resolution and efficiency. However, a new procedure for implementing the Modified Expansion Solution Approximation (MESA) schemes, based upon Hermitian divided differences, is presented which extends the effective accuracy of the MESA schemes to 57th order in space and time when using 128-bit floating point precision. This new approach has the advantages of reducing round-off error, being easy to program, and being more computationally efficient than previous approaches. Its accuracy is limited only by the floating point hardware. The advantages of this new approach are demonstrated by solving the linearized Euler equations in an open bi-periodic domain. A 500th-order MESA scheme can now be created in seconds, making these schemes ideally suited for the next generation of high performance 256-bit (double quadruple) or higher precision computers. This ease of creation makes it possible to adapt the algorithm to the mesh in time instead of its converse: this is ideal for resolving varying wavelength scales which occur in noise generation simulations. And finally, the sources of round-off error which affect the very high-order methods are examined, and remedies are provided that effectively increase the accuracy of the MESA schemes while using current computer technology.

  15. Proper Orthogonal Decomposition in Optimal Control of Fluids

    NASA Technical Reports Server (NTRS)

    Ravindran, S. S.

    1999-01-01

    In this article, we present a reduced-order modeling approach suitable for active control of fluid dynamical systems, based on proper orthogonal decomposition (POD). The rationale behind reduced-order modeling is that numerical simulation of the Navier-Stokes equations is still too costly for the purpose of optimization and control of unsteady flows. We examine the possibility of obtaining reduced-order models that reduce the computational complexity associated with the Navier-Stokes equations while capturing the essential dynamics by using the POD. The POD allows extraction of an optimal set of basis functions, perhaps few in number, from a computational or experimental database through an eigenvalue analysis. The solution is then obtained as a linear combination of this optimal set of basis functions by means of Galerkin projection. This makes it attractive for optimal control and estimation of systems governed by partial differential equations. We here use it in active control of fluid flows governed by the Navier-Stokes equations. We show that the resulting reduced-order model can be very efficient for the computations of optimization and control problems in unsteady flows. Finally, implementational issues and numerical experiments are presented for simulations and optimal control of fluid flow through channels.
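
    For a linear PDE the Galerkin step amounts to projecting the full operator onto the POD basis once; thereafter, time stepping happens entirely in the reduced coordinates. A minimal sketch on a 1-D heat equation (our toy stand-in, not the Navier-Stokes control setting of the paper):

```python
import numpy as np

n, dt, nu = 64, 5e-5, 1.0
# Full-order operator: 1-D Laplacian with homogeneous Dirichlet boundaries.
A = nu * (n + 1) ** 2 * (np.diag(-2.0 * np.ones(n))
                         + np.diag(np.ones(n - 1), 1)
                         + np.diag(np.ones(n - 1), -1))

# One full-order run supplies the snapshot database.
x = np.linspace(0.0, 1.0, n + 2)[1:-1]
u0 = np.sin(np.pi * x) + 0.5 * np.sin(3 * np.pi * x)
u, snaps = u0.copy(), []
for _ in range(400):
    u = u + dt * A @ u                 # explicit Euler, full order
    snaps.append(u.copy())
Phi = np.linalg.svd(np.array(snaps).T, full_matrices=False)[0][:, :3]

# Galerkin projection: the reduced operator is assembled once.
Ar = Phi.T @ A @ Phi                   # 3x3 instead of 64x64
a = Phi.T @ u0
for _ in range(400):
    a = a + dt * Ar @ a                # time stepping in reduced coordinates
print(np.linalg.norm(Phi @ a - u))     # ROM vs. full order at final time
```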

  16. Efficient 3D geometric and Zernike moments computation from unstructured surface meshes.

    PubMed

    Pozo, José María; Villa-Uriol, Maria-Cruz; Frangi, Alejandro F

    2011-03-01

    This paper introduces and evaluates a fast exact algorithm and a series of faster approximate algorithms for the computation of 3D geometric moments from an unstructured surface mesh of triangles. Being based on the object surface reduces the computational complexity of these algorithms with respect to volumetric grid-based algorithms. In contrast, it can only be applied for the computation of geometric moments of homogeneous objects. This advantage and restriction are shared with other proposed algorithms based on the object boundary. The proposed exact algorithm reduces the computational complexity for computing geometric moments up to order N with respect to previously proposed exact algorithms, from N^9 to N^6. The approximate series algorithm appears as a power series on the ratio between triangle size and object size, which can be truncated at any desired degree. The higher the number and quality of the triangles, the better the approximation. This approximate algorithm reduces the computational complexity to N^3. In addition, the paper introduces a fast algorithm for the computation of 3D Zernike moments from the computed geometric moments, with a computational complexity of N^4, while the previously proposed algorithm is of order N^6. The error introduced by the proposed approximate algorithms is evaluated on different shapes, and the cost-benefit ratio in terms of error and computational time is analyzed for different moment orders.
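
    At low orders the boundary-based idea is easy to see: each surface triangle together with the origin spans a signed tetrahedron, and summing the tetrahedron contributions gives exact moments of the enclosed homogeneous solid. A numpy sketch of that standard identity for the zeroth and first moments only (the paper's contribution is the efficient extension to high orders and to Zernike moments):

```python
import numpy as np

def moments_up_to_first(verts, tris):
    """Exact volume and centroid of a closed, outward-oriented triangle mesh.

    Each triangle (v0, v1, v2) plus the origin spans a signed tetrahedron;
    summing signed volumes and first moments over all triangles yields the
    exact moments of the enclosed homogeneous solid.
    """
    v0, v1, v2 = (verts[tris[:, i]] for i in range(3))
    vol6 = np.einsum('ij,ij->i', v0, np.cross(v1, v2))   # 6 x signed volume
    volume = vol6.sum() / 6.0
    # First moment of an origin-tetrahedron: vol * (v0 + v1 + v2 + 0) / 4.
    first = ((v0 + v1 + v2) * vol6[:, None]).sum(axis=0) / 24.0
    return volume, first / volume                        # volume, centroid

# Check on a unit tetrahedron: volume 1/6, centroid (0.25, 0.25, 0.25).
verts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
tris = np.array([[0, 2, 1], [0, 1, 3], [0, 3, 2], [1, 2, 3]])
print(moments_up_to_first(verts, tris))
```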

  17. Minimized state complexity of quantum-encoded cryptic processes

    NASA Astrophysics Data System (ADS)

    Riechers, Paul M.; Mahoney, John R.; Aghamohammadi, Cina; Crutchfield, James P.

    2016-05-01

    The predictive information required for proper trajectory sampling of a stochastic process can be more efficiently transmitted via a quantum channel than a classical one. This recent discovery allows quantum information processing to drastically reduce the memory necessary to simulate complex classical stochastic processes. It also points to a new perspective on the intrinsic complexity that nature must employ in generating the processes we observe. The quantum advantage increases with codeword length: the length of process sequences used in constructing the quantum communication scheme. In analogy with the classical complexity measure, statistical complexity, we use this reduced communication cost as an entropic measure of state complexity in the quantum representation. Previously difficult to compute, the quantum advantage is expressed here in closed form using spectral decomposition. This allows for efficient numerical computation of the quantum-reduced state complexity at all encoding lengths, including infinite. Additionally, it makes clear how finite-codeword reduction in state complexity is controlled by the classical process's cryptic order, and it allows asymptotic analysis of infinite-cryptic-order processes.

  18. Efficient Strategies for Estimating the Spatial Coherence of Backscatter

    PubMed Central

    Hyun, Dongwoon; Crowley, Anna Lisa C.; Dahl, Jeremy J.

    2017-01-01

    The spatial coherence of ultrasound backscatter has been proposed to reduce clutter in medical imaging, to measure the anisotropy of the scattering source, and to improve the detection of blood flow. These techniques rely on correlation estimates that are obtained using computationally expensive strategies. In this study, we assess existing spatial coherence estimation methods and propose three computationally efficient modifications: a reduced kernel, a downsampled receive aperture, and the use of an ensemble correlation coefficient. The proposed methods are implemented in simulation and in vivo studies. Reducing the kernel to a single sample improved computational throughput and improved axial resolution. Downsampling the receive aperture was found to have negligible effect on estimator variance, and improved computational throughput by an order of magnitude for a downsample factor of 4. The ensemble correlation estimator demonstrated lower variance than the currently used average correlation. Combining the three methods, the throughput was improved 105-fold in simulation with a downsample factor of 4 and 20-fold in vivo with a downsample factor of 2. PMID:27913342
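
    The ensemble-correlation idea can be written in a few lines: at each lag, pool the correlation numerator and the two energy terms over all channel pairs and samples before dividing, rather than averaging per-pair correlation coefficients. A schematic numpy version; the data layout and pooling details are our assumptions, not the authors' implementation:

```python
import numpy as np

def spatial_coherence(rf, max_lag=10):
    """Lag-wise spatial coherence of delayed receive-channel data.

    rf : (n_channels x n_samples) focused channel signals for one kernel.
    Ensemble estimator: the numerator and energy terms are pooled over
    all channel pairs at a given lag before the division, which lowers
    estimator variance relative to averaging per-pair coefficients.
    """
    n_ch = rf.shape[0]
    R = np.empty(max_lag + 1)
    for m in range(max_lag + 1):
        a, b = rf[: n_ch - m], rf[m:]
        num = np.sum(a * b)
        den = np.sqrt(np.sum(a * a) * np.sum(b * b))
        R[m] = num / den
    return R

rf = np.random.default_rng(0).standard_normal((64, 128))  # channels x samples
print(spatial_coherence(rf, max_lag=5))  # near zero for incoherent noise
```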

  19. Low rank approach to computing first and higher order derivatives using automatic differentiation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Reed, J. A.; Abdel-Khalik, H. S.; Utke, J.

    2012-07-01

    This manuscript outlines a new approach for increasing the efficiency of applying automatic differentiation (AD) to large scale computational models. By using the principles of the Efficient Subspace Method (ESM), low rank approximations of the derivatives for first and higher orders can be calculated using minimized computational resources. The output obtained from nuclear reactor calculations typically has a much smaller numerical rank compared to the number of inputs and outputs. This rank deficiency can be exploited to reduce the number of derivatives that need to be calculated using AD. The effective rank can be determined according to ESM by computing derivatives with AD at random inputs. Reduced or pseudo variables are then defined and new derivatives are calculated with respect to the pseudo variables. Two different AD packages are used: OpenAD and Rapsodia. OpenAD is used to determine the effective rank and the subspace that contains the derivatives. Rapsodia is then used to calculate derivatives with respect to the pseudo variables for the desired order. The overall approach is applied to two simple problems and to MATWS, a safety code for sodium cooled reactors. (authors)
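
    The rank-probing step can be sketched generically: collect derivatives along a handful of random input directions (finite differences below, standing in for AD), then count the singular values of the collected responses. Everything in this sketch, including the tolerance rule, is our illustrative assumption rather than the ESM implementation:

```python
import numpy as np

def effective_rank(f, x0, n_probes=20, eps=1e-6, tol=1e-8):
    """Estimate the effective rank of df/dx at x0 by random probing.

    Directional derivatives along random unit vectors are collected as
    columns; the number of singular values above tol * s_max then
    approximates the numerical rank of the Jacobian.
    """
    rng = np.random.default_rng(0)
    cols = []
    for _ in range(n_probes):
        d = rng.standard_normal(x0.size)
        d /= np.linalg.norm(d)
        cols.append((f(x0 + eps * d) - f(x0 - eps * d)) / (2 * eps))
    s = np.linalg.svd(np.column_stack(cols), compute_uv=False)
    return int(np.sum(s > tol * s[0]))

# Toy model: 50 outputs depending on 30 inputs through only 3 combinations.
W = np.random.default_rng(1).standard_normal((3, 30))
V = np.random.default_rng(2).standard_normal((50, 3))
f = lambda x: V @ np.tanh(W @ x)
print(effective_rank(f, np.zeros(30)))  # 3
```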

  20. On real-space Density Functional Theory for non-orthogonal crystal systems: Kronecker product formulation of the kinetic energy operator

    NASA Astrophysics Data System (ADS)

    Sharma, Abhiraj; Suryanarayana, Phanish

    2018-05-01

    We present an accurate and efficient real-space Density Functional Theory (DFT) framework for the ab initio study of non-orthogonal crystal systems. Specifically, employing a local reformulation of the electrostatics, we develop a novel Kronecker product formulation of the real-space kinetic energy operator that significantly reduces the number of operations associated with the Laplacian-vector multiplication, the dominant cost in practical computations. In particular, we reduce the scaling with respect to finite-difference order from quadratic to linear, thereby significantly bridging the gap in computational cost between non-orthogonal and orthogonal systems. We verify the accuracy and efficiency of the proposed methodology through selected examples.
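
    The Kronecker structure means the 3-D operator never has to be assembled: applying the 1-D difference matrix along each axis of the grid array is algebraically identical to multiplying by D⊗I⊗I + I⊗D⊗I + I⊗I⊗D. A small numpy demonstration of that identity (illustrative only, not the paper's DFT implementation):

```python
import numpy as np

def laplacian_apply(u, D):
    """Apply a finite-difference Laplacian to a 3-D grid array u.

    Equivalent to (D kron I kron I + I kron D kron I + I kron I kron D)
    acting on vec(u), but costs three 1-D operator sweeps instead of one
    huge matrix-vector product.
    """
    return (np.einsum('ia,ajk->ijk', D, u)
            + np.einsum('jb,ibk->ijk', D, u)
            + np.einsum('kc,ijc->ijk', D, u))

# Second-order 1-D difference matrix on n points (Dirichlet, unit spacing).
n = 8
D = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1))
u = np.random.default_rng(0).standard_normal((n, n, n))

# Verify against the explicitly assembled Kronecker-sum operator.
I = np.eye(n)
L = (np.kron(np.kron(D, I), I) + np.kron(np.kron(I, D), I)
     + np.kron(np.kron(I, I), D))
assert np.allclose(laplacian_apply(u, D).ravel(), L @ u.ravel())
```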

  1. Data-driven reduced order models for effective yield strength and partitioning of strain in multiphase materials

    NASA Astrophysics Data System (ADS)

    Latypov, Marat I.; Kalidindi, Surya R.

    2017-10-01

    There is a critical need for the development and verification of practically useful multiscale modeling strategies for simulating the mechanical response of multiphase metallic materials with heterogeneous microstructures. In this contribution, we present data-driven reduced order models for effective yield strength and strain partitioning in such microstructures. These models are built employing the recently developed framework of Materials Knowledge Systems that employ 2-point spatial correlations (or 2-point statistics) for the quantification of the heterostructures and principal component analyses for their low-dimensional representation. The models are calibrated to a large collection of finite element (FE) results obtained for a diverse range of microstructures with various sizes, shapes, and volume fractions of the phases. The performance of the models is evaluated by comparing the predictions of yield strength and strain partitioning in two-phase materials with the corresponding predictions from a classical self-consistent model as well as results of full-field FE simulations. The reduced-order models developed in this work show an excellent combination of accuracy and computational efficiency, and therefore present an important advance towards computationally efficient microstructure-sensitive multiscale modeling frameworks.

  2. POD/DEIM reduced-order strategies for efficient four dimensional variational data assimilation

    NASA Astrophysics Data System (ADS)

    Ştefănescu, R.; Sandu, A.; Navon, I. M.

    2015-08-01

    This work studies reduced order modeling (ROM) approaches to speed up the solution of variational data assimilation problems with large scale nonlinear dynamical models. It is shown that a key requirement for a successful reduced order solution is that reduced order Karush-Kuhn-Tucker conditions accurately represent their full order counterparts. In particular, accurate reduced order approximations are needed for the forward and adjoint dynamical models, as well as for the reduced gradient. New strategies to construct reduced order bases are developed for proper orthogonal decomposition (POD) ROM data assimilation using both Galerkin and Petrov-Galerkin projections. For the first time POD, tensorial POD, and discrete empirical interpolation method (DEIM) are employed to develop reduced data assimilation systems for a geophysical flow model, namely, the two dimensional shallow water equations. Numerical experiments confirm the theoretical framework for Galerkin projection. In the case of Petrov-Galerkin projection, stabilization strategies must be considered for the reduced order models. The new reduced order shallow water data assimilation system provides analyses similar to those produced by the full resolution data assimilation system in one tenth of the computational time.
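
    The DEIM ingredient named here selects interpolation indices greedily from a POD basis of the nonlinear term; a compact numpy rendition of the standard greedy algorithm as it is usually written (a generic sketch, not this paper's code):

```python
import numpy as np

def deim_indices(U):
    """Greedy DEIM point selection for a nonlinear-term basis U (n x m).

    Each step picks the grid point where the next basis vector is worst
    represented by interpolation at the points already chosen.
    """
    n, m = U.shape
    idx = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, m):
        P = np.array(idx)
        c = np.linalg.solve(U[P, :j], U[P, j])  # interpolatory coefficients
        r = U[:, j] - U[:, :j] @ c              # reconstruction residual
        idx.append(int(np.argmax(np.abs(r))))
    return np.array(idx)

U = np.linalg.qr(np.random.default_rng(0).standard_normal((100, 8)))[0]
print(deim_indices(U))  # 8 interpolation points out of 100 grid points
```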

  3. Extreme learning machine for reduced order modeling of turbulent geophysical flows.

    PubMed

    San, Omer; Maulik, Romit

    2018-04-01

    We investigate the application of artificial neural networks to stabilize proper orthogonal decomposition-based reduced order models for quasistationary geophysical turbulent flows. An extreme learning machine concept is introduced for computing an eddy-viscosity closure dynamically to incorporate the effects of the truncated modes. We consider a four-gyre wind-driven ocean circulation problem as our prototype setting to assess the performance of the proposed data-driven approach. Our framework provides a significant reduction in computational time and effectively retains the dynamics of the full-order model during the forward simulation period beyond the training data set. Furthermore, we show that the method is robust for larger choices of time steps and can be used as an efficient and reliable tool for long time integration of general circulation models.
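
    An extreme learning machine is a single-hidden-layer network whose input weights stay random and fixed, so "training" is one linear least-squares solve for the output weights. A generic numpy sketch of the concept (a toy regression, not the eddy-viscosity closure itself):

```python
import numpy as np

class ELM:
    """Extreme learning machine: random hidden layer, least-squares readout."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((n_hidden, n_in))
        self.b = rng.standard_normal(n_hidden)
        self.beta = None

    def _h(self, X):
        return np.tanh(X @ self.W.T + self.b)   # fixed random features

    def fit(self, X, y):
        # No backpropagation: a single linear solve fits the readout.
        self.beta = np.linalg.lstsq(self._h(X), y, rcond=None)[0]
        return self

    def predict(self, X):
        return self._h(X) @ self.beta

# Toy regression standing in for a smooth closure map.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, (500, 3))
y = np.sin(X[:, 0]) * X[:, 1] + 0.5 * X[:, 2] ** 2
model = ELM(3, 100).fit(X, y)
print(np.abs(model.predict(X) - y).max())  # small training residual
```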

  4. Extreme learning machine for reduced order modeling of turbulent geophysical flows

    NASA Astrophysics Data System (ADS)

    San, Omer; Maulik, Romit

    2018-04-01

    We investigate the application of artificial neural networks to stabilize proper orthogonal decomposition-based reduced order models for quasistationary geophysical turbulent flows. An extreme learning machine concept is introduced for computing an eddy-viscosity closure dynamically to incorporate the effects of the truncated modes. We consider a four-gyre wind-driven ocean circulation problem as our prototype setting to assess the performance of the proposed data-driven approach. Our framework provides a significant reduction in computational time and effectively retains the dynamics of the full-order model during the forward simulation period beyond the training data set. Furthermore, we show that the method is robust for larger choices of time steps and can be used as an efficient and reliable tool for long time integration of general circulation models.

  5. Modeling the effect of shroud contact and friction dampers on the mistuned response of turbopumps

    NASA Technical Reports Server (NTRS)

    Griffin, Jerry H.; Yang, M.-T.

    1994-01-01

    The contract has been revised. Under the revised scope of work a reduced order model has been developed that can be used to predict the steady-state response of mistuned bladed disks. The approach has been implemented in a computer code, LMCC. It is concluded that: the reduced order model displays structural fidelity comparable to that of a finite element model of an entire bladed disk system with significantly improved computational efficiency; and, when the disk is stiff, both the finite element model and LMCC predict significantly more amplitude variation than was predicted by earlier models. This second result may have important practical ramifications, especially in the case of integrally bladed disks.

  6. A high-order multiscale finite-element method for time-domain acoustic-wave modeling

    NASA Astrophysics Data System (ADS)

    Gao, Kai; Fu, Shubin; Chung, Eric T.

    2018-05-01

    Accurate and efficient wave equation modeling is vital for many applications in fields such as acoustics, electromagnetics, and seismology. However, solving the wave equation in large-scale and highly heterogeneous models is usually computationally expensive because the computational cost is directly proportional to the number of grids in the model. We develop a novel high-order multiscale finite-element method to reduce the computational cost of time-domain acoustic-wave equation numerical modeling by solving the wave equation on a coarse mesh based on the multiscale finite-element theory. In contrast to existing multiscale finite-element methods that use only first-order multiscale basis functions, our new method constructs high-order multiscale basis functions from local elliptic problems which are closely related to the Gauss-Lobatto-Legendre quadrature points in a coarse element. Essentially, these basis functions are not only determined by the order of Legendre polynomials, but also by local medium properties, and therefore can effectively convey the fine-scale information to the coarse-scale solution with high-order accuracy. Numerical tests show that our method can significantly reduce the computation time while maintaining high accuracy for wave equation modeling in highly heterogeneous media by solving the corresponding discrete system only on the coarse mesh with the new high-order multiscale basis functions.

  7. A high-order multiscale finite-element method for time-domain acoustic-wave modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gao, Kai; Fu, Shubin; Chung, Eric T.

    Accurate and efficient wave equation modeling is vital for many applications in fields such as acoustics, electromagnetics, and seismology. However, solving the wave equation in large-scale and highly heterogeneous models is usually computationally expensive because the computational cost is directly proportional to the number of grids in the model. We develop a novel high-order multiscale finite-element method to reduce the computational cost of time-domain acoustic-wave equation numerical modeling by solving the wave equation on a coarse mesh based on the multiscale finite-element theory. In contrast to existing multiscale finite-element methods that use only first-order multiscale basis functions, our new method constructs high-order multiscale basis functions from local elliptic problems which are closely related to the Gauss–Lobatto–Legendre quadrature points in a coarse element. Essentially, these basis functions are not only determined by the order of Legendre polynomials, but also by local medium properties, and therefore can effectively convey the fine-scale information to the coarse-scale solution with high-order accuracy. Numerical tests show that our method can significantly reduce the computation time while maintaining high accuracy for wave equation modeling in highly heterogeneous media by solving the corresponding discrete system only on the coarse mesh with the new high-order multiscale basis functions.

  8. A high-order multiscale finite-element method for time-domain acoustic-wave modeling

    DOE PAGES

    Gao, Kai; Fu, Shubin; Chung, Eric T.

    2018-02-04

    Accurate and efficient wave equation modeling is vital for many applications in fields such as acoustics, electromagnetics, and seismology. However, solving the wave equation in large-scale and highly heterogeneous models is usually computationally expensive because the computational cost is directly proportional to the number of grids in the model. We develop a novel high-order multiscale finite-element method to reduce the computational cost of time-domain acoustic-wave equation numerical modeling by solving the wave equation on a coarse mesh based on the multiscale finite-element theory. In contrast to existing multiscale finite-element methods that use only first-order multiscale basis functions, our new method constructs high-order multiscale basis functions from local elliptic problems which are closely related to the Gauss–Lobatto–Legendre quadrature points in a coarse element. Essentially, these basis functions are not only determined by the order of Legendre polynomials, but also by local medium properties, and therefore can effectively convey the fine-scale information to the coarse-scale solution with high-order accuracy. Numerical tests show that our method can significantly reduce the computation time while maintaining high accuracy for wave equation modeling in highly heterogeneous media by solving the corresponding discrete system only on the coarse mesh with the new high-order multiscale basis functions.

  9. A transfer function type of simplified electrochemical model with modified boundary conditions and Padé approximation for Li-ion battery: Part 1. lithium concentration estimation

    NASA Astrophysics Data System (ADS)

    Yuan, Shifei; Jiang, Lei; Yin, Chengliang; Wu, Hongjie; Zhang, Xi

    2017-06-01

    To guarantee the safety, high efficiency and long lifetime of a lithium-ion battery, an advanced battery management system requires a physically meaningful yet computationally efficient battery model. The pseudo-two dimensional (P2D) electrochemical model can provide physical information about the lithium concentration and potential distributions across the cell dimension. However, the extensive computation burden caused by the temporal and spatial discretization limits its real-time application. In this research, we propose a new simplified electrochemical model (SEM) by modifying the boundary conditions for the electrolyte diffusion equations, which significantly facilitates the analytical solving process. Then, to obtain a reduced order transfer function, the Padé approximation method is adopted to simplify the derived transcendental impedance solution. The proposed model with the reduced order transfer function is computationally light and preserves physical meaning through the presence of parameters such as the solid/electrolyte diffusion coefficients (Ds and De) and the particle radius. The simulation illustrates that the proposed simplified model maintains high accuracy for electrolyte phase concentration (Ce) predictions, with 0.8% and 0.24% modeling error respectively, when compared to the rigorous model under 1C-rate pulse charge/discharge and urban dynamometer driving schedule (UDDS) profiles. Meanwhile, this simplified model yields a significantly reduced computational burden, which benefits its real-time application.
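
    The Padé step named here starts from the Taylor coefficients of the transcendental transfer function: the [m/n] approximant's denominator comes from a small linear solve and its numerator from a convolution. A generic sketch of that classical construction (not the battery model itself; scipy.interpolate.pade offers a packaged version):

```python
import numpy as np
from math import factorial

def pade(c, m, n):
    """[m/n] Pade approximant from Taylor coefficients c[0..m+n].

    Returns numerator and denominator coefficients in ascending powers,
    with the denominator normalized so that b[0] = 1.
    """
    c = np.asarray(c, dtype=float)
    # Denominator: linear system matching orders m+1 .. m+n of the series.
    C = np.array([[c[m + k - j] if m + k - j >= 0 else 0.0
                   for j in range(1, n + 1)] for k in range(1, n + 1)])
    b = np.concatenate(([1.0], np.linalg.solve(C, -c[m + 1:m + n + 1])))
    # Numerator: convolution of the series with the denominator.
    a = np.array([sum(b[j] * c[i - j] for j in range(min(i, n) + 1))
                  for i in range(m + 1)])
    return a, b

# Example: [2/2] approximant of exp(x), evaluated at x = 1.
c = [1.0 / factorial(k) for k in range(5)]
a, b = pade(c, 2, 2)
print(np.polyval(a[::-1], 1.0) / np.polyval(b[::-1], 1.0))  # 19/7 ~ 2.714
```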

  10. Estimation of Sonic Fatigue by Reduced-Order Finite Element Based Analyses

    NASA Technical Reports Server (NTRS)

    Rizzi, Stephen A.; Przekop, Adam

    2006-01-01

    A computationally efficient, reduced-order method is presented for prediction of sonic fatigue of structures exhibiting geometrically nonlinear response. A procedure to determine the nonlinear modal stiffness using commercial finite element codes allows the coupled nonlinear equations of motion in physical degrees of freedom to be transformed to a smaller coupled system of equations in modal coordinates. The nonlinear modal system is first solved using a computationally light equivalent linearization solution to determine if the structure responds to the applied loading in a nonlinear fashion. If so, a higher fidelity numerical simulation in modal coordinates is undertaken to more accurately determine the nonlinear response. Comparisons of displacement and stress response obtained from the reduced-order analyses are made with results obtained from numerical simulation in physical degrees-of-freedom. Fatigue life predictions from nonlinear modal and physical simulations are made using the rainflow cycle counting method in a linear cumulative damage analysis. Results computed for a simple beam structure under a random acoustic loading demonstrate the effectiveness of the approach and compare favorably with results obtained from the solution in physical degrees-of-freedom.

  11. An efficient unstructured WENO method for supersonic reactive flows

    NASA Astrophysics Data System (ADS)

    Zhao, Wen-Geng; Zheng, Hong-Wei; Liu, Feng-Jun; Shi, Xiao-Tian; Gao, Jun; Hu, Ning; Lv, Meng; Chen, Si-Cong; Zhao, Hong-Da

    2018-03-01

    An efficient high-order numerical method for supersonic reactive flows is proposed in this article. The reactive source term and convection term are solved separately by splitting scheme. In the reaction step, an adaptive time-step method is presented, which can improve the efficiency greatly. In the convection step, a third-order accurate weighted essentially non-oscillatory (WENO) method is adopted to reconstruct the solution in the unstructured grids. Numerical results show that our new method can capture the correct propagation speed of the detonation wave exactly even in coarse grids, while high order accuracy can be achieved in the smooth region. In addition, the proposed adaptive splitting method can reduce the computational cost greatly compared with the traditional splitting method.
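
    The splitting idea transfers to a toy setting: advance convection with one explicit step, then sub-cycle the stiff reaction ODE with a step size adapted to the current reaction rate. A scalar advection-reaction sketch under our own simplifications (first-order upwind and a logistic source instead of the authors' WENO scheme and detonation chemistry):

```python
import numpy as np

def step_split(u, dt, dx, a, omega, tol=0.1):
    """One splitting step for u_t + a u_x = omega * u * (1 - u), a > 0.

    Convection: one explicit first-order upwind step (periodic domain).
    Reaction: sub-cycled explicit Euler whose local step is capped so
    the fastest-reacting cell changes by at most tol per sub-step.
    """
    u = u - a * dt / dx * (u - np.roll(u, 1))      # convection step
    t = 0.0
    while t < dt:                                  # adaptive reaction cycling
        rate = omega * u * (1.0 - u)
        h = min(dt - t, tol / (np.abs(rate).max() + 1e-30))
        u = u + h * rate
        t += h
    return u

x = np.linspace(0.0, 1.0, 200, endpoint=False)
u = np.where(x < 0.2, 1.0, 0.0)
for _ in range(100):
    u = step_split(u, dt=2e-3, dx=x[1] - x[0], a=1.0, omega=50.0)
print(u.max(), u.min())  # a reacting front advected to the right
```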

  12. User participation in the development of the human/computer interface for control centers

    NASA Technical Reports Server (NTRS)

    Broome, Richard; Quick-Campbell, Marlene; Creegan, James; Dutilly, Robert

    1996-01-01

    Technological advances coupled with the requirements to reduce operations staffing costs led to the demand for efficient, technologically-sophisticated mission operations control centers. The control center under development for the earth observing system (EOS) is considered. The users are involved in the development of a control center in order to ensure that it is cost-efficient and flexible. A number of measures were implemented in the EOS program in order to encourage user involvement in the area of human-computer interface development. The following user participation exercises carried out in relation to the system analysis and design are described: the shadow participation of the programmers during a day of operations; the flight operations personnel interviews; and the analysis of the flight operations team tasks. The user participation in the interface prototype development, the prototype evaluation, and the system implementation are reported on. The involvement of the users early in the development process enables the requirements to be better understood and the cost to be reduced.

  13. An Efficient Adaptive Angle-Doppler Compensation Approach for Non-Sidelooking Airborne Radar STAP

    PubMed Central

    Shen, Mingwei; Yu, Jia; Wu, Di; Zhu, Daiyin

    2015-01-01

    In this study, the effects of non-sidelooking airborne radar clutter dispersion on space-time adaptive processing (STAP) is considered, and an efficient adaptive angle-Doppler compensation (EAADC) approach is proposed to improve the clutter suppression performance. In order to reduce the computational complexity, the reduced-dimension sparse reconstruction (RDSR) technique is introduced into the angle-Doppler spectrum estimation to extract the required parameters for compensating the clutter spectral center misalignment. Simulation results to demonstrate the effectiveness of the proposed algorithm are presented. PMID:26053755

  14. High accurate interpolation of NURBS tool path for CNC machine tools

    NASA Astrophysics Data System (ADS)

    Liu, Qiang; Liu, Huan; Yuan, Songmei

    2016-09-01

    Feedrate fluctuation caused by approximation errors of interpolation methods has great effects on machining quality in NURBS interpolation, but at present few methods can efficiently eliminate it or reduce it to a satisfactory level without sacrificing computational efficiency. In order to solve this problem, a highly accurate interpolation method for NURBS tool paths is proposed. The proposed method can efficiently reduce the feedrate fluctuation by forming a quartic equation with respect to the curve parameter increment, which can be efficiently solved by analytic methods in real time. Theoretically, the proposed method can totally eliminate the feedrate fluctuation for any 2nd-degree NURBS curve and can interpolate 3rd-degree NURBS curves with minimal feedrate fluctuation. Moreover, a smooth feedrate planning algorithm is also proposed to generate smooth tool motion while considering multiple constraints and scheduling errors through an efficient planning strategy. Experiments are conducted to verify the feasibility and applicability of the proposed method. This research presents a novel NURBS interpolation method with not only high accuracy but also satisfactory computational efficiency.

  15. POD/DEIM reduced-order strategies for efficient four dimensional variational data assimilation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ştefănescu, R., E-mail: rstefane@vt.edu; Sandu, A., E-mail: sandu@cs.vt.edu; Navon, I.M., E-mail: inavon@fsu.edu

    2015-08-15

    This work studies reduced order modeling (ROM) approaches to speed up the solution of variational data assimilation problems with large scale nonlinear dynamical models. It is shown that a key requirement for a successful reduced order solution is that reduced order Karush–Kuhn–Tucker conditions accurately represent their full order counterparts. In particular, accurate reduced order approximations are needed for the forward and adjoint dynamical models, as well as for the reduced gradient. New strategies to construct reduced order bases are developed for proper orthogonal decomposition (POD) ROM data assimilation using both Galerkin and Petrov–Galerkin projections. For the first time POD, tensorial POD, and discrete empirical interpolation method (DEIM) are employed to develop reduced data assimilation systems for a geophysical flow model, namely, the two dimensional shallow water equations. Numerical experiments confirm the theoretical framework for Galerkin projection. In the case of Petrov–Galerkin projection, stabilization strategies must be considered for the reduced order models. The new reduced order shallow water data assimilation system provides analyses similar to those produced by the full resolution data assimilation system in one tenth of the computational time.

  16. An integral-factorized implementation of the driven similarity renormalization group second-order multireference perturbation theory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hannon, Kevin P.; Li, Chenyang; Evangelista, Francesco A., E-mail: francesco.evangelista@emory.edu

    2016-05-28

    We report an efficient implementation of a second-order multireference perturbation theory based on the driven similarity renormalization group (DSRG-MRPT2) [C. Li and F. A. Evangelista, J. Chem. Theory Comput. 11, 2097 (2015)]. Our implementation employs factorized two-electron integrals to avoid storage of large four-index intermediates. It also exploits the block structure of the reference density matrices to reduce the computational cost to that of second-order Møller–Plesset perturbation theory. Our new DSRG-MRPT2 implementation is benchmarked on ten naphthyne isomers using basis sets up to quintuple-ζ quality. We find that the singlet-triplet splittings (Δ_ST) of the naphthyne isomers strongly depend on the equilibrium structures. For a consistent set of geometries, the Δ_ST values predicted by the DSRG-MRPT2 are in good agreement with those computed by the reduced multireference coupled cluster theory with singles, doubles, and perturbative triples.

  17. Efficient Parallel Algorithm For Direct Numerical Simulation of Turbulent Flows

    NASA Technical Reports Server (NTRS)

    Moitra, Stuti; Gatski, Thomas B.

    1997-01-01

    A distributed algorithm for a high-order-accurate finite-difference approach to the direct numerical simulation (DNS) of transition and turbulence in compressible flows is described. This work has two major objectives. The first objective is to demonstrate that parallel and distributed-memory machines can be successfully and efficiently used to solve computationally intensive and input/output intensive algorithms of the DNS class. The second objective is to show that the computational complexity involved in solving the tridiagonal systems inherent in the DNS algorithm can be reduced by algorithm innovations that obviate the need to use a parallelized tridiagonal solver.
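
    For reference, the tridiagonal systems in question are classically solved by the Thomas algorithm, whose forward sweep makes each unknown depend on the previous one; that serial dependency is exactly what a distributed DNS solver must reorganize or avoid. A textbook serial version (not the paper's parallel reformulation):

```python
import numpy as np

def thomas(a, b, c, d):
    """Solve a tridiagonal system; a, b, c are the sub-, main-, and
    super-diagonals (a[0] and c[-1] unused), d is the right-hand side.

    The forward sweep couples each unknown to its predecessor, which is
    the serial dependency distributed-memory solvers must work around.
    """
    n = len(b)
    cp, dp = np.empty(n), np.empty(n)
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

n = 6
a, b, c = np.full(n, -1.0), np.full(n, 2.0), np.full(n, -1.0)
x = thomas(a, b, c, np.ones(n))
A = np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)
assert np.allclose(A @ x, np.ones(n))
```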

  18. Automated Approach to Very High-Order Aeroacoustic Computations. Revision

    NASA Technical Reports Server (NTRS)

    Dyson, Rodger W.; Goodrich, John W.

    2001-01-01

    Computational aeroacoustics requires efficient, high-resolution simulation tools. For smooth problems, this is best accomplished with very high-order in space and time methods on small stencils. However, the complexity of highly accurate numerical methods can inhibit their practical application, especially in irregular geometries. This complexity is reduced by using a special form of Hermite divided-difference spatial interpolation on Cartesian grids, and a Cauchy-Kowalewski recursion procedure for time advancement. In addition, a stencil constraint tree reduces the complexity of interpolating grid points that are located near wall boundaries. These procedures are used to automatically develop and implement very high-order methods (>15) for solving the linearized Euler equations that can achieve less than one grid point per wavelength resolution away from boundaries by including spatial derivatives of the primitive variables at each grid point. The accuracy of stable surface treatments is currently limited to 11th order for grid-aligned boundaries and to 2nd order for irregular boundaries.

  19. An Automated Approach to Very High Order Aeroacoustic Computations in Complex Geometries

    NASA Technical Reports Server (NTRS)

    Dyson, Rodger W.; Goodrich, John W.

    2000-01-01

    Computational aeroacoustics requires efficient, high-resolution simulation tools. And for smooth problems, this is best accomplished with very high order in space and time methods on small stencils. But the complexity of highly accurate numerical methods can inhibit their practical application, especially in irregular geometries. This complexity is reduced by using a special form of Hermite divided-difference spatial interpolation on Cartesian grids, and a Cauchy-Kowalewski recursion procedure for time advancement. In addition, a stencil constraint tree reduces the complexity of interpolating grid points that are located near wall boundaries. These procedures are used to automatically develop and implement very high order methods (>15) for solving the linearized Euler equations that can achieve less than one grid point per wavelength resolution away from boundaries by including spatial derivatives of the primitive variables at each grid point. The accuracy of stable surface treatments is currently limited to 11th order for grid aligned boundaries and to 2nd order for irregular boundaries.

  20. Enhancing battery efficiency for pervasive health-monitoring systems based on electronic textiles.

    PubMed

    Zheng, Nenggan; Wu, Zhaohui; Lin, Man; Yang, Laurence Tianruo

    2010-03-01

    Electronic textiles are regarded as one of the most important computation platforms for future computer-assisted health-monitoring applications. In these novel systems, multiple batteries are used in order to prolong operational lifetime, which is a significant metric of system usability. However, due to the nonlinear features of batteries, computing systems with multiple batteries cannot achieve the same battery efficiency as those powered by a monolithic battery of equal capacity. In this paper, we propose an algorithm aiming to maximize battery efficiency globally for computer-assisted health-care systems with multiple batteries. Based on an accurate analytical battery model, the concept of weighted battery fatigue degree is introduced and a novel battery-scheduling algorithm called predicted weighted fatigue degree least first (PWFDLF) is developed. We also discuss two alternatives examined while developing PWFDLF: a weighted round-robin (WRR) policy and a greedy algorithm achieving the highest local battery efficiency, which reduces to the sequential discharging policy. Evaluation results show that a considerable improvement in battery efficiency can be obtained by PWFDLF under various battery configurations and current profiles, compared to conventional sequential and WRR discharging policies.

  1. Efficient algorithms and implementations of entropy-based moment closures for rarefied gases

    NASA Astrophysics Data System (ADS)

    Schaerer, Roman Pascal; Bansal, Pratyuksh; Torrilhon, Manuel

    2017-07-01

    We present efficient algorithms and implementations of the 35-moment system equipped with the maximum-entropy closure in the context of rarefied gases. While closures based on the principle of entropy maximization have been shown to yield very promising results for moderately rarefied gas flows, the computational cost of these closures is in general much higher than for closure theories with explicit closed-form expressions of the closing fluxes, such as Grad's classical closure. Following an approach similar to Garrett et al. (2015) [13], we investigate efficient implementations of the computationally expensive numerical quadrature method used for the moment evaluations of the maximum-entropy distribution by mapping its inherent fine-grained parallelism onto the parallelism offered by multi-core processors and graphics cards. We show that using a single graphics card as an accelerator allows speed-ups of two orders of magnitude when compared to a serial CPU implementation. To accelerate the time-to-solution for steady-state problems, we propose a new semi-implicit time discretization scheme. The resulting nonlinear system of equations is solved with a Newton-type method in the Lagrange multipliers of the dual optimization problem in order to reduce the computational cost. Additionally, fully explicit time-stepping schemes of first and second order accuracy are presented. We investigate the accuracy and efficiency of the numerical schemes for several numerical test cases, including a steady-state shock-structure problem.

  2. Development of efficient time-evolution method based on three-term recurrence relation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Akama, Tomoko, E-mail: a.tomo---s-b-l-r@suou.waseda.jp; Kobayashi, Osamu; Nanbu, Shinkoh, E-mail: shinkoh.nanbu@sophia.ac.jp

    The advantage of the real-time (RT) propagation method is a direct solution of the time-dependent Schrödinger equation, which describes frequency properties as well as all dynamics of a molecular system composed of electrons and nuclei in quantum physics and chemistry. Its applications have been limited by computational feasibility, as the evaluation of the time-evolution operator is computationally demanding. In this article, a new efficient time-evolution method based on the three-term recurrence relation (3TRR) was proposed to reduce the time-consuming numerical procedure. The basic formula of this approach was derived by introducing a transformation of the operator using the arcsine function. Since this operator transformation causes transformation of time, we derived the relation between original and transformed time. The formula was adapted to assess the performance of the RT time-dependent Hartree-Fock (RT-TDHF) method and the time-dependent density functional theory. Compared to the commonly used fourth-order Runge-Kutta method, our new approach decreased the computational time of the RT-TDHF calculation by about a factor of four, showing the 3TRR formula to be an efficient time-evolution method for reducing computational cost.

  3. In-Process Items on LCS.

    ERIC Educational Resources Information Center

    Russell, Thyra K.

    Morris Library at Southern Illinois University computerized its technical processes using the Library Computer System (LCS), which was implemented in the library to streamline order processing by: (1) providing up-to-date online files to track in-process items; (2) encouraging quick, efficient accessing of information; (3) reducing manual files;…

  4. Reduced-Order Aerothermoelastic Analysis of Hypersonic Vehicle Structures

    NASA Astrophysics Data System (ADS)

    Falkiewicz, Nathan J.

    Design and simulation of hypersonic vehicles require consideration of a variety of disciplines due to the highly coupled nature of the flight regime. In order to capture all of the potential effects on vehicle dynamics, one must consider the aerodynamics, aerodynamic heating, heat transfer, and structural dynamics as well as the interactions between these disciplines. The problem is further complicated by the large computational expense involved in capturing all of these effects and their interactions in a full-order sense. While high-fidelity modeling techniques exist for each of these disciplines, the use of such techniques is computationally infeasible in a vehicle design and control system simulation setting for such a highly coupled problem. Early in the design stage, many iterations of analyses may need to be carried out as the vehicle design matures, thus requiring quick analysis turnaround time. Additionally, the number of states used in the analyses must be small enough to allow for efficient control simulation and design. As a result, alternatives to full-order models must be considered. This dissertation presents a fully coupled, reduced-order aerothermoelastic framework for the modeling and analysis of hypersonic vehicle structures. The reduced-order transient thermal solution is a modal solution based on the proper orthogonal decomposition. The reduced-order structural dynamic model is based on projection of the equations of motion onto a Ritz modal subspace that is identified a priori. The reduced-order models are assembled into a time-domain aerothermoelastic simulation framework which uses a partitioned time-marching scheme to account for the disparate time scales of the associated physics. The aerothermoelastic modeling framework is outlined, along with the formulations associated with the unsteady aerodynamics, aerodynamic heating, transient thermal response, and structural dynamics. Results demonstrate the accuracy of the reduced-order transient thermal and structural dynamic models under variation in boundary conditions and flight conditions. The framework is applied to representative hypersonic vehicle control surface structures and a variety of studies are conducted to assess the impact of aerothermoelastic effects on hypersonic vehicle dynamics. The results presented in this dissertation demonstrate the ability of the proposed framework to perform efficient aerothermoelastic analysis.
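
    As a hedged sketch of the POD-based reduction step described above, using a generic linear heat-conduction system with random stand-in matrices rather than the dissertation's finite element operators (all names and sizes are illustrative):

      import numpy as np

      rng = np.random.default_rng(0)
      n = 200                                    # full-order DoFs (stand-in FE model)
      A = rng.standard_normal((n, n))
      K = A @ A.T / n + np.eye(n)                # SPD "conduction" matrix
      C = np.eye(n)                              # "capacitance" matrix
      q = rng.standard_normal(n)                 # thermal load

      def step(T, dt):                           # backward Euler for C dT/dt = -K T + q
          return np.linalg.solve(C + dt * K, C @ T + dt * q)

      # Full-order snapshots -> POD basis via the SVD.
      dt, nsteps = 0.01, 100
      T, snaps = np.zeros(n), []
      for _ in range(nsteps):
          T = step(T, dt)
          snaps.append(T)
      Phi = np.linalg.svd(np.array(snaps).T, full_matrices=False)[0][:, :5]

      # Galerkin projection: reduced operators are 5x5 instead of 200x200.
      Kr, Cr, qr = Phi.T @ K @ Phi, Phi.T @ C @ Phi, Phi.T @ q
      a = np.zeros(5)
      for _ in range(nsteps):
          a = np.linalg.solve(Cr + dt * Kr, Cr @ a + dt * qr)
      print(np.linalg.norm(Phi @ a - T) / np.linalg.norm(T))   # small relative error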

  5. Development of Reduced-Order Models for Aeroelastic and Flutter Prediction Using the CFL3Dv6.0 Code

    NASA Technical Reports Server (NTRS)

    Silva, Walter A.; Bartels, Robert E.

    2002-01-01

    A reduced-order model (ROM) is developed for aeroelastic analysis using the CFL3D version 6.0 computational fluid dynamics (CFD) code, recently developed at the NASA Langley Research Center. This latest version of the flow solver includes a deforming mesh capability, a modal structural definition for nonlinear aeroelastic analyses, and a parallelization capability that provides a significant increase in computational efficiency. Flutter results for the AGARD 445.6 Wing computed using CFL3D v6.0 are presented, including discussion of associated computational costs. Modal impulse responses of the unsteady aerodynamic system are then computed using the CFL3Dv6 code and transformed into state-space form. Important numerical issues associated with the computation of the impulse responses are presented. The unsteady aerodynamic state-space ROM is then combined with a state-space model of the structure to create an aeroelastic simulation using the MATLAB/SIMULINK environment. The MATLAB/SIMULINK ROM is used to rapidly compute aeroelastic transients including flutter. The ROM shows excellent agreement with the aeroelastic analyses computed using the CFL3Dv6.0 code directly.
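
    The abstract does not name the algorithm used to transform the impulse responses into state-space form; the eigensystem realization algorithm (ERA) is one standard choice, sketched here for a single-input single-output system with manufactured Markov parameters:

      import numpy as np

      def era(markov, nstates):
          """ERA: realize (A, B, C) from impulse-response samples h[1], h[2], ..."""
          m = (len(markov) - 1) // 2
          H0 = np.array([[markov[i + j + 1] for j in range(m)] for i in range(m)])
          H1 = np.array([[markov[i + j + 2] for j in range(m)] for i in range(m)])
          U, s, Vt = np.linalg.svd(H0)
          U, s, Vt = U[:, :nstates], s[:nstates], Vt[:nstates]
          isq = np.diag(1.0 / np.sqrt(s))
          A = isq @ U.T @ H1 @ Vt.T @ isq
          B = (np.diag(np.sqrt(s)) @ Vt)[:, :1]
          C = (U @ np.diag(np.sqrt(s)))[:1, :]
          return A, B, C

      # Markov parameters h[k] = C A^(k-1) B of a known 2-state system.
      At = np.array([[0.9, 0.1], [0.0, 0.7]])
      Bt, Ct = np.array([[1.0], [0.5]]), np.array([[1.0, 0.0]])
      h = [0.0] + [(Ct @ np.linalg.matrix_power(At, k) @ Bt).item() for k in range(41)]
      A, B, C = era(h, 2)
      recon = [(C @ np.linalg.matrix_power(A, k) @ B).item() for k in range(10)]
      print(np.allclose(recon, h[1:11]))         # realized model matches the data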

  6. An Efficient Finite Element Framework to Assess Flexibility Performances of SMA Self-Expandable Carotid Artery Stents

    PubMed Central

    Ferraro, Mauro; Auricchio, Ferdinando; Boatti, Elisa; Scalet, Giulia; Conti, Michele; Morganti, Simone; Reali, Alessandro

    2015-01-01

    Computer-based simulations are nowadays widely exploited for the prediction of the mechanical behavior of different biomedical devices. In this context, structural finite element analyses (FEA) are currently the preferred computational tool to evaluate the stent response under bending. This work aims at developing a computational framework based on linear and higher order FEA to evaluate the flexibility of self-expandable carotid artery stents. In particular, numerical simulations involving large deformations and inelastic shape memory alloy constitutive modeling are performed, and the results suggest that the employment of higher order FEA allows one to accurately represent the computational domain and to obtain a better approximation of the solution with a greatly reduced number of degrees of freedom with respect to linear FEA. Moreover, when buckling occurs, higher order FEA shows a superior capability of reproducing the associated nonlinear local effects. PMID:26184329

  7. An Efficient Two-Tier Causal Protocol for Mobile Distributed Systems

    PubMed Central

    Dominguez, Eduardo Lopez; Pomares Hernandez, Saul E.; Gomez, Gustavo Rodriguez; Medina, Maria Auxilio

    2013-01-01

    Causal ordering is a useful tool for mobile distributed systems (MDS) to reduce the non-determinism induced by three main aspects: host mobility, asynchronous execution, and unpredictable communication delays. Several causal protocols for MDS exist. Most of them, in order to reduce the overhead and the computational cost over wireless channels and mobile hosts (MH), ensure causal ordering at and according to the causal view of the Base Stations. Nevertheless, these protocols introduce certain disadvantages, such as unnecessary inhibition in the delivery of messages. In this paper, we present an efficient causal protocol for groupware that satisfies the MDS's constraints, avoiding unnecessary inhibitions and ensuring causal delivery based on the view of the MHs. One interesting aspect of our protocol is that it dynamically adapts the causal information attached to each message based on the number of messages with an immediate dependency relation, which is not directly proportional to the number of MHs. PMID:23585828

  8. Computationally efficient algorithm for high sampling-frequency operation of active noise control

    NASA Astrophysics Data System (ADS)

    Rout, Nirmal Kumar; Das, Debi Prasad; Panda, Ganapati

    2015-05-01

    In high sampling-frequency operation of an active noise control (ANC) system, the lengths of the secondary path estimate and the ANC filter are very long. This increases the computational complexity of the conventional filtered-x least mean square (FXLMS) algorithm. To reduce the computational complexity of a long order ANC system using the FXLMS algorithm, frequency domain block ANC algorithms have been proposed in the past. These full block frequency domain ANC algorithms are associated with some disadvantages, such as large block delay, quantization error due to the computation of large size transforms, and implementation difficulties in existing low-end DSP hardware. To overcome these shortcomings, a partitioned block ANC algorithm is newly proposed, in which the long filters in the ANC system are divided into a number of equal partitions and suitably assembled to perform the FXLMS algorithm in the frequency domain. The complexity of this proposed frequency domain partitioned block FXLMS (FPBFXLMS) algorithm is quite reduced compared to the conventional FXLMS algorithm. It is further reduced by merging one fast Fourier transform (FFT)-inverse fast Fourier transform (IFFT) combination to derive the reduced structure FPBFXLMS (RFPBFXLMS) algorithm. Computational complexity analyses for different filter orders and partition sizes are presented. Systematic computer simulations are carried out for both of the proposed partitioned block ANC algorithms to show their accuracy compared to the time domain FXLMS algorithm.
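
    For orientation, a hedged sketch of the baseline time-domain FXLMS update that the proposed FPBFXLMS partitions into frequency-domain blocks; the paths, filter lengths, and step size below are synthetic stand-ins, and the secondary path estimate is taken as exact:

      import numpy as np

      rng = np.random.default_rng(1)
      n = 20000
      x = rng.standard_normal(n)                 # reference noise signal
      P = np.r_[np.zeros(5), rng.standard_normal(59) * 0.1]   # primary path
      S = np.array([0.0, 0.0, 0.8, 0.3, 0.1])    # secondary path
      S_hat = S.copy()                           # its (here perfect) estimate
      L = 128                                    # ANC filter length
      w = np.zeros(L)
      d = np.convolve(x, P)[:n]                  # disturbance at the error mic

      xbuf, fbuf, ybuf = np.zeros(L), np.zeros(L), np.zeros(len(S))
      mu, err = 1e-3, np.zeros(n)
      for i in range(n):
          xbuf = np.roll(xbuf, 1); xbuf[0] = x[i]
          y = w @ xbuf                           # anti-noise sample
          ybuf = np.roll(ybuf, 1); ybuf[0] = y
          e = d[i] + S @ ybuf                    # residual at the error mic
          xf = S_hat @ xbuf[:len(S_hat)]         # filtered reference x'(i)
          fbuf = np.roll(fbuf, 1); fbuf[0] = xf
          w -= mu * e * fbuf                     # FXLMS weight update
          err[i] = e
      print(np.mean(err[:2000]**2), np.mean(err[-2000:]**2))   # MSE should drop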

  9. Model's sparse representation based on reduced mixed GMsFE basis methods

    NASA Astrophysics Data System (ADS)

    Jiang, Lijian; Li, Qiuqi

    2017-06-01

    In this paper, we propose a model's sparse representation based on reduced mixed generalized multiscale finite element (GMsFE) basis methods for elliptic PDEs with random inputs. A typical application for the elliptic PDEs is the flow in heterogeneous random porous media. The mixed generalized multiscale finite element method (GMsFEM) is one of the accurate and efficient approaches to solve the flow problem on a coarse grid and obtain the velocity with local mass conservation. When the inputs of the PDEs are parameterized by the random variables, the GMsFE basis functions usually depend on the random parameters. This leads to a large number of degrees of freedom for the mixed GMsFEM and substantially impacts the computational efficiency. In order to overcome the difficulty, we develop reduced mixed GMsFE basis methods such that the multiscale basis functions are independent of the random parameters and span a low-dimensional space. To this end, a greedy algorithm is used to find a set of optimal samples from a training set scattered in the parameter space. Reduced mixed GMsFE basis functions are constructed based on the optimal samples using two optimal sampling strategies: basis-oriented cross-validation and proper orthogonal decomposition. Although the dimension of the space spanned by the reduced mixed GMsFE basis functions is much smaller than the dimension of the original full order model, the online computation still depends on the number of coarse degrees of freedom. To significantly improve the online computation, we integrate the reduced mixed GMsFE basis methods with sparse tensor approximation and obtain a sparse representation for the model's outputs. The sparse representation is very efficient for evaluating the model's outputs for many instances of parameters. To illustrate the efficacy of the proposed methods, we present a few numerical examples for elliptic PDEs with multiscale and random inputs. In particular, a two-phase flow model in random porous media is simulated by the proposed sparse representation method.
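
    A hedged sketch of a generic greedy snapshot selection for a reduced basis, in the spirit of the training-set search described above (using a plain projection-error criterion rather than the paper's basis-oriented cross-validation; the toy parametric problem is ours):

      import numpy as np

      def greedy_basis(snapshots, nbasis):
          """Pick, at each step, the snapshot worst approximated by the basis."""
          S = np.array(snapshots, dtype=float).T     # columns = candidate snapshots
          k = int(np.argmax(np.linalg.norm(S, axis=0)))
          Q, idx = S[:, [k]] / np.linalg.norm(S[:, k]), [k]
          for _ in range(nbasis - 1):
              R = S - Q @ (Q.T @ S)                  # projection residuals
              k = int(np.argmax(np.linalg.norm(R, axis=0)))
              Q = np.column_stack([Q, R[:, k] / np.linalg.norm(R[:, k])])
              idx.append(k)
          return Q, idx

      # Training set: solutions u(x; mu) of a toy parametric problem.
      x = np.linspace(0.0, 1.0, 100)
      snaps = [np.sin(mu * np.pi * x) / mu for mu in np.linspace(0.5, 5.0, 50)]
      Q, chosen = greedy_basis(snaps, 8)
      worst = max(np.linalg.norm(s - Q @ (Q.T @ s)) for s in snaps)
      print(chosen, worst)                           # worst training projection error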

  10. Model's sparse representation based on reduced mixed GMsFE basis methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jiang, Lijian, E-mail: ljjiang@hnu.edu.cn; Li, Qiuqi, E-mail: qiuqili@hnu.edu.cn

    2017-06-01

    In this paper, we propose a model's sparse representation based on reduced mixed generalized multiscale finite element (GMsFE) basis methods for elliptic PDEs with random inputs. A typical application for the elliptic PDEs is the flow in heterogeneous random porous media. The mixed generalized multiscale finite element method (GMsFEM) is one of the accurate and efficient approaches to solve the flow problem on a coarse grid and obtain the velocity with local mass conservation. When the inputs of the PDEs are parameterized by the random variables, the GMsFE basis functions usually depend on the random parameters. This leads to a large number of degrees of freedom for the mixed GMsFEM and substantially impacts the computational efficiency. In order to overcome the difficulty, we develop reduced mixed GMsFE basis methods such that the multiscale basis functions are independent of the random parameters and span a low-dimensional space. To this end, a greedy algorithm is used to find a set of optimal samples from a training set scattered in the parameter space. Reduced mixed GMsFE basis functions are constructed based on the optimal samples using two optimal sampling strategies: basis-oriented cross-validation and proper orthogonal decomposition. Although the dimension of the space spanned by the reduced mixed GMsFE basis functions is much smaller than the dimension of the original full order model, the online computation still depends on the number of coarse degrees of freedom. To significantly improve the online computation, we integrate the reduced mixed GMsFE basis methods with sparse tensor approximation and obtain a sparse representation for the model's outputs. The sparse representation is very efficient for evaluating the model's outputs for many instances of parameters. To illustrate the efficacy of the proposed methods, we present a few numerical examples for elliptic PDEs with multiscale and random inputs. In particular, a two-phase flow model in random porous media is simulated by the proposed sparse representation method.

  11. Parallel, adaptive finite element methods for conservation laws

    NASA Technical Reports Server (NTRS)

    Biswas, Rupak; Devine, Karen D.; Flaherty, Joseph E.

    1994-01-01

    We construct parallel finite element methods for the solution of hyperbolic conservation laws in one and two dimensions. Spatial discretization is performed by a discontinuous Galerkin finite element method using a basis of piecewise Legendre polynomials. Temporal discretization utilizes a Runge-Kutta method. Dissipative fluxes and projection limiting prevent oscillations near solution discontinuities. A posteriori estimates of spatial errors are obtained by a p-refinement technique using superconvergence at Radau points. The resulting method is of high order and may be parallelized efficiently on MIMD computers. We compare results using different limiting schemes and demonstrate parallel efficiency through computations on an NCUBE/2 hypercube. We also present results using adaptive h- and p-refinement to reduce the computational cost of the method.

  12. Sensitivity Analysis for Coupled Aero-structural Systems

    NASA Technical Reports Server (NTRS)

    Giunta, Anthony A.

    1999-01-01

    A novel method has been developed for calculating gradients of aerodynamic force and moment coefficients for an aeroelastic aircraft model. This method uses the Global Sensitivity Equations (GSE) to account for the aero-structural coupling, and a reduced-order modal analysis approach to condense the coupling bandwidth between the aerodynamic and structural models. Parallel computing is applied to reduce the computational expense of the numerous high fidelity aerodynamic analyses needed for the coupled aero-structural system. Good agreement is obtained between aerodynamic force and moment gradients computed with the GSE/modal analysis approach and the same quantities computed using brute-force, computationally expensive, finite difference approximations. A comparison between the computational expense of the GSE/modal analysis method and a pure finite difference approach is presented. These results show that the GSE/modal analysis approach is the more computationally efficient technique if sensitivity analysis is to be performed for two or more aircraft design parameters.

  13. Large-scale 3D geoelectromagnetic modeling using parallel adaptive high-order finite element method

    DOE PAGES

    Grayver, Alexander V.; Kolev, Tzanio V.

    2015-11-01

    Here, we have investigated the use of the adaptive high-order finite-element method (FEM) for geoelectromagnetic modeling. Because high-order FEM is challenging from the numerical and computational points of view, most published finite-element studies in geoelectromagnetics use the lowest order formulation. Solution of the resulting large system of linear equations poses the main practical challenge. We have developed a fully parallel and distributed robust and scalable linear solver based on the optimal block-diagonal and auxiliary space preconditioners. The solver was found to be efficient for high finite element orders, unstructured and nonconforming locally refined meshes, a wide range of frequencies, large conductivity contrasts, and number of degrees of freedom (DoFs). Furthermore, the presented linear solver is in essence algebraic; i.e., it acts on the matrix-vector level and thus requires no information about the discretization, boundary conditions, or physical source used, making it readily efficient for a wide range of electromagnetic modeling problems. To get accurate solutions at reduced computational cost, we have also implemented goal-oriented adaptive mesh refinement. The numerical tests indicated that if highly accurate modeling results were required, the high-order FEM in combination with the goal-oriented local mesh refinement required less computational time and DoFs than the lowest order adaptive FEM.

  14. Large-scale 3D geoelectromagnetic modeling using parallel adaptive high-order finite element method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grayver, Alexander V.; Kolev, Tzanio V.

    Here, we have investigated the use of the adaptive high-order finite-element method (FEM) for geoelectromagnetic modeling. Because high-order FEM is challenging from the numerical and computational points of view, most published finite-element studies in geoelectromagnetics use the lowest order formulation. Solution of the resulting large system of linear equations poses the main practical challenge. We have developed a fully parallel and distributed robust and scalable linear solver based on the optimal block-diagonal and auxiliary space preconditioners. The solver was found to be efficient for high finite element orders, unstructured and nonconforming locally refined meshes, a wide range of frequencies, large conductivity contrasts, and number of degrees of freedom (DoFs). Furthermore, the presented linear solver is in essence algebraic; i.e., it acts on the matrix-vector level and thus requires no information about the discretization, boundary conditions, or physical source used, making it readily efficient for a wide range of electromagnetic modeling problems. To get accurate solutions at reduced computational cost, we have also implemented goal-oriented adaptive mesh refinement. The numerical tests indicated that if highly accurate modeling results were required, the high-order FEM in combination with the goal-oriented local mesh refinement required less computational time and DoFs than the lowest order adaptive FEM.

  15. An Accurate and Computationally Efficient Model for Membrane-Type Circular-Symmetric Micro-Hotplates

    PubMed Central

    Khan, Usman; Falconi, Christian

    2014-01-01

    Ideally, the design of high-performance micro-hotplates would require a large number of simulations because of the existence of many important design parameters as well as the possibly crucial effects of both spread and drift. However, the computational cost of FEM simulations, which are the only available tool for accurately predicting the temperature in micro-hotplates, is very high. As a result, micro-hotplate designers generally have no effective simulation-tools for the optimization. In order to circumvent these issues, here, we propose a model for practical circular-symmetric micro-hotplates which takes advantage of modified Bessel functions, a computationally efficient matrix approach for considering the relevant boundary conditions, Taylor linearization for modeling the Joule heating and radiation losses, and an external-region-segmentation strategy in order to accurately take into account radiation losses in the entire micro-hotplate. The proposed model is almost as accurate as FEM simulations and two to three orders of magnitude more computationally efficient (e.g., 45 s versus more than 8 h). The residual errors, which are mainly associated with the undesired heating in the electrical contacts, are small (e.g., a few degrees Celsius for an 800 °C operating temperature) and, for important analyses, almost constant. Therefore, we also introduce a computationally-easy single-FEM-compensation strategy in order to reduce the residual errors to about 1 °C. As illustrative examples of the power of our approach, we report the systematic investigation of a spread in the membrane thermal conductivity and of combined variations of both ambient and bulk temperatures. Our model enables a much faster characterization of micro-hotplates and, thus, a much more effective optimization prior to fabrication. PMID:24763214

  16. A space efficient flexible pivot selection approach to evaluate determinant and inverse of a matrix.

    PubMed

    Jafree, Hafsa Athar; Imtiaz, Muhammad; Inayatullah, Syed; Khan, Fozia Hanif; Nizami, Tajuddin

    2014-01-01

    This paper presents new simple approaches for evaluating the determinant and inverse of a matrix. The choice of pivot has been kept arbitrary, which reduces the error when solving an ill-conditioned system. Computation of the determinant of a matrix has been made more efficient by avoiding unnecessary data storage and by reducing the order of the matrix at each iteration, while dictionary notation [1] has been incorporated for computing the matrix inverse, thereby saving unnecessary calculations. These algorithms are highly classroom-oriented, easy to use, and easily implemented by students. By taking advantage of the flexibility in pivot selection, one may easily avoid the development of fractions. Unlike the matrix inversion methods of [2] and [3], the presented algorithms obviate the use of permutations and inverse permutations.
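
    A hedged sketch of the order-reduction idea with a freely chosen pivot (the function names are illustrative, not the authors' notation); exact rational arithmetic makes visible how a judicious pivot choice can avoid fractions altogether:

      from fractions import Fraction

      def det_flexible_pivot(A, pick=None):
          """Determinant by repeated order reduction about an arbitrary pivot.
          `pick(A)` returns a (row, col) position; any nonzero entry works."""
          A = [[Fraction(a) for a in row] for row in A]
          n, det = len(A), Fraction(1)
          for _ in range(n):
              p, q = pick(A) if pick else max(
                  ((i, j) for i in range(len(A)) for j in range(len(A))),
                  key=lambda ij: abs(A[ij[0]][ij[1]]))
              if A[p][q] == 0:
                  return Fraction(0)               # singular matrix
              det *= A[p][q] * (-1) ** (p + q)     # cofactor sign of the pivot
              piv = A[p][q]
              # Eliminate, then delete the pivot row and column (order reduction).
              A = [[A[i][j] - A[i][q] * A[p][j] / piv
                    for j in range(len(A)) if j != q]
                   for i in range(len(A)) if i != p]
          return det

      print(det_flexible_pivot([[2, 1, 3], [0, 4, 1], [5, 2, 0]]))   # -59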

  17. An Efficient Algorithm for TUCKALS3 on Data with Large Numbers of Observation Units.

    ERIC Educational Resources Information Center

    Kiers, Henk A. L.; And Others

    1992-01-01

    A modification of the TUCKALS3 algorithm is proposed that handles three-way arrays of order I x J x K for any I. The reduced work space needed for storing data and increased execution speed make the modified algorithm very suitable for use on personal computers. (SLD)

  18. Reduced-Order Models Based on Linear and Nonlinear Aerodynamic Impulse Responses

    NASA Technical Reports Server (NTRS)

    Silva, Walter A.

    1999-01-01

    This paper discusses a method for the identification and application of reduced-order models based on linear and nonlinear aerodynamic impulse responses. The Volterra theory of nonlinear systems and an appropriate kernel identification technique are described. Insight into the nature of kernels is provided by applying the method to the nonlinear Riccati equation in a non-aerodynamic application. The method is then applied to a nonlinear aerodynamic model of an RAE 2822 supercritical airfoil undergoing plunge motions using the CFL3D Navier-Stokes flow solver with the Spalart-Allmaras turbulence model. Results demonstrate the computational efficiency of the technique.
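
    For the linear part of the theory, a hedged sketch: the first-order Volterra kernel of a linear system is its impulse response, and the ROM prediction is a discrete convolution. A toy recursion stands in for the CFD solver, and the higher-order kernels that capture nonlinear effects are omitted:

      import numpy as np

      def solver(u):
          """Toy linear 'solver' standing in for the unsteady aerodynamic response."""
          y, s = np.zeros(len(u)), 0.0
          for i, ui in enumerate(u):
              s = 0.9 * s + 0.3 * ui             # internal state
              y[i] = s + 0.05 * ui
          return y

      N = 100
      impulse = np.zeros(N); impulse[0] = 1.0
      h1 = solver(impulse)                       # first-order kernel = impulse response

      u = np.sin(0.2 * np.arange(N))             # arbitrary input motion
      y_rom = np.convolve(u, h1)[:N]             # ROM prediction by convolution
      print(np.allclose(y_rom, solver(u)))       # exact for a purely linear system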

  19. Reduced Order Models Based on Linear and Nonlinear Aerodynamic Impulse Responses

    NASA Technical Reports Server (NTRS)

    Silva, Walter A.

    1999-01-01

    This paper discusses a method for the identification and application of reduced-order models based on linear and nonlinear aerodynamic impulse responses. The Volterra theory of nonlinear systems and an appropriate kernel identification technique are described. Insight into the nature of kernels is provided by applying the method to the nonlinear Riccati equation in a non-aerodynamic application. The method is then applied to a nonlinear aerodynamic model of an RAE 2822 supercritical airfoil undergoing plunge motions using the CFL3D Navier-Stokes flow solver with the Spalart-Allmaras turbulence model. Results demonstrate the computational efficiency of the technique.

  20. Molecular dynamics simulations in hybrid particle-continuum schemes: Pitfalls and caveats

    NASA Astrophysics Data System (ADS)

    Stalter, S.; Yelash, L.; Emamy, N.; Statt, A.; Hanke, M.; Lukáčová-Medvid'ová, M.; Virnau, P.

    2018-03-01

    Heterogeneous multiscale methods (HMM) combine molecular accuracy of particle-based simulations with the computational efficiency of continuum descriptions to model flow in soft matter liquids. In these schemes, molecular simulations typically pose a computational bottleneck, which we investigate in detail in this study. We find that it is preferable to simulate many small systems as opposed to a few large systems, and that a choice of a simple isokinetic thermostat is typically sufficient while thermostats such as Lowe-Andersen allow for simulations at elevated viscosity. We discuss suitable choices for time steps and finite-size effects which arise in the limit of very small simulation boxes. We also argue that if colloidal systems are considered as opposed to atomistic systems, the gap between microscopic and macroscopic simulations regarding time and length scales is significantly smaller. We propose a novel reduced-order technique for the coupling to the macroscopic solver, which allows us to approximate a non-linear stress-strain relation efficiently and thus further reduce computational effort of microscopic simulations.

  1. Accelerated solution of discrete ordinates approximation to the Boltzmann transport equation via model reduction

    DOE PAGES

    Tencer, John; Carlberg, Kevin; Larsen, Marvin; ...

    2017-06-17

    Radiation heat transfer is an important phenomenon in many physical systems of practical interest. When participating media is important, the radiative transfer equation (RTE) must be solved for the radiative intensity as a function of location, time, direction, and wavelength. In many heat-transfer applications, a quasi-steady assumption is valid, thereby removing time dependence. The dependence on wavelength is often treated through a weighted sum of gray gases (WSGG) approach. The discrete ordinates method (DOM) is one of the most common methods for approximating the angular (i.e., directional) dependence. The DOM exactly solves for the radiative intensity for a finite number of discrete ordinate directions and computes approximations to integrals over the angular space using a quadrature rule; the chosen ordinate directions correspond to the nodes of this quadrature rule. This paper applies a projection-based model-reduction approach to make high-order quadrature computationally feasible for the DOM for purely absorbing applications. First, the proposed approach constructs a reduced basis from (high-fidelity) solutions of the radiative intensity computed at a relatively small number of ordinate directions. Then, the method computes inexpensive approximations of the radiative intensity at the (remaining) quadrature points of a high-order quadrature using a reduced-order model constructed from the reduced basis. Finally, this results in a much more accurate solution than might have been achieved using only the ordinate directions used to compute the reduced basis. One- and three-dimensional test problems highlight the efficiency of the proposed method.

  2. Accelerated solution of discrete ordinates approximation to the Boltzmann transport equation via model reduction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tencer, John; Carlberg, Kevin; Larsen, Marvin

    Radiation heat transfer is an important phenomenon in many physical systems of practical interest. When participating media is important, the radiative transfer equation (RTE) must be solved for the radiative intensity as a function of location, time, direction, and wavelength. In many heat-transfer applications, a quasi-steady assumption is valid, thereby removing time dependence. The dependence on wavelength is often treated through a weighted sum of gray gases (WSGG) approach. The discrete ordinates method (DOM) is one of the most common methods for approximating the angular (i.e., directional) dependence. The DOM exactly solves for the radiative intensity for a finite number of discrete ordinate directions and computes approximations to integrals over the angular space using a quadrature rule; the chosen ordinate directions correspond to the nodes of this quadrature rule. This paper applies a projection-based model-reduction approach to make high-order quadrature computationally feasible for the DOM for purely absorbing applications. First, the proposed approach constructs a reduced basis from (high-fidelity) solutions of the radiative intensity computed at a relatively small number of ordinate directions. Then, the method computes inexpensive approximations of the radiative intensity at the (remaining) quadrature points of a high-order quadrature using a reduced-order model constructed from the reduced basis. Finally, this results in a much more accurate solution than might have been achieved using only the ordinate directions used to compute the reduced basis. One- and three-dimensional test problems highlight the efficiency of the proposed method.

  3. The application of dynamic programming in production planning

    NASA Astrophysics Data System (ADS)

    Wu, Run

    2017-05-01

    Nowadays, with the popularity of computers, various industries and fields widely apply computer information technology, which brings about a huge demand for a variety of application software. In order to develop software meeting various needs with the most economical cost and the best quality, programmers must design efficient algorithms. A superior algorithm not only solves the problem at hand, but also maximizes the benefits while generating the smallest overhead. As one of the common algorithmic techniques, dynamic programming is used for solving problems that exhibit optimal substructure. When solving problems with a large number of sub-problems that require repetitive calculations, the ordinary recursive method consumes exponential time, whereas a dynamic programming algorithm can reduce the time complexity to the polynomial level; from this we can conclude that dynamic programming is very efficient compared to other approaches, reducing the computational complexity and enriching the computational results. In this paper, we expound the concept, basic elements, properties, core ideas, solving steps, and difficulties of the dynamic programming algorithm and, in addition, establish a dynamic programming model of the production planning problem.
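
    As a concrete, hedged illustration of the kind of production-planning dynamic program the paper formulates (the cost model and all parameter values below are our own illustrative assumptions):

      def plan_production(demand, capacity, max_inv, c_prod, c_hold):
          """Choose per-period production to meet demand at minimum
          production-plus-holding cost; the DP state is the inventory level."""
          T, INF = len(demand), float("inf")
          best = [[INF] * (max_inv + 1) for _ in range(T)] + [[0.0] * (max_inv + 1)]
          choice = [[0] * (max_inv + 1) for _ in range(T)]
          for t in range(T - 1, -1, -1):
              for s in range(max_inv + 1):
                  for p in range(capacity + 1):        # production quantity
                      s2 = s + p - demand[t]           # inventory carried forward
                      if 0 <= s2 <= max_inv and best[t + 1][s2] < INF:
                          cost = c_prod * p + c_hold * s2 + best[t + 1][s2]
                          if cost < best[t][s]:
                              best[t][s], choice[t][s] = cost, p
          s, plan = 0, []                              # recover the optimal schedule
          for t in range(T):
              plan.append(choice[t][s])
              s += choice[t][s] - demand[t]
          return best[0][0], plan

      # Demand of 6 in period 2 exceeds the per-period capacity of 5, so the
      # optimal plan pre-builds one unit of inventory: (16.2, [4, 5, 2, 5]).
      print(plan_production([3, 6, 2, 5], capacity=5, max_inv=4, c_prod=1.0, c_hold=0.2))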

  4. Efficient algorithms and implementations of entropy-based moment closures for rarefied gases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schaerer, Roman Pascal, E-mail: schaerer@mathcces.rwth-aachen.de; Bansal, Pratyuksh; Torrilhon, Manuel

    We present efficient algorithms and implementations of the 35-moment system equipped with the maximum-entropy closure in the context of rarefied gases. While closures based on the principle of entropy maximization have been shown to yield very promising results for moderately rarefied gas flows, the computational cost of these closures is in general much higher than for closure theories with explicit closed-form expressions of the closing fluxes, such as Grad's classical closure. Following an approach similar to that of Garrett et al. (2015), we investigate efficient implementations of the computationally expensive numerical quadrature method used for the moment evaluations of the maximum-entropy distribution by mapping its inherent fine-grained parallelism onto the parallelism offered by multi-core processors and graphics cards. We show that using a single graphics card as an accelerator allows speed-ups of two orders of magnitude when compared to a serial CPU implementation. To accelerate the time-to-solution for steady-state problems, we propose a new semi-implicit time discretization scheme. The resulting nonlinear system of equations is solved with a Newton-type method in the Lagrange multipliers of the dual optimization problem in order to reduce the computational cost. Additionally, fully explicit time-stepping schemes of first and second order accuracy are presented. We investigate the accuracy and efficiency of the numerical schemes for several numerical test cases, including a steady-state shock-structure problem.

  5. Algorithms and analyses for stochastic optimization for turbofan noise reduction using parallel reduced-order modeling

    NASA Astrophysics Data System (ADS)

    Yang, Huanhuan; Gunzburger, Max

    2017-06-01

    Simulation-based optimization of acoustic liner design in a turbofan engine nacelle for noise reduction purposes can dramatically reduce the cost and time needed for experimental designs. Because uncertainties are inevitable in the design process, a stochastic optimization algorithm is posed based on the conditional value-at-risk measure so that an ideal acoustic liner impedance is determined that is robust in the presence of uncertainties. A parallel reduced-order modeling framework is developed that dramatically improves the computational efficiency of the stochastic optimization solver for a realistic nacelle geometry. The reduced stochastic optimization solver takes less than 500 seconds to execute. In addition, well-posedness and finite element error analyses of the state system and optimization problem are provided.

  6. Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment.

    PubMed

    Meng, Bowen; Pratx, Guillem; Xing, Lei

    2011-12-01

    Four-dimensional CT (4DCT) and cone beam CT (CBCT) are widely used in radiation therapy for accurate tumor target definition and localization. However, high-resolution and dynamic image reconstruction is computationally demanding because of the large amount of data processed. Efficient use of these imaging techniques in the clinic requires high-performance computing. The purpose of this work is to develop a novel ultrafast, scalable and reliable image reconstruction technique for 4D CBCT/CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large-scale medical physics problems in a cloud computing environment. In this work, we accelerated the Feldkamp-Davis-Kress (FDK) algorithm by porting it to Hadoop, an open-source MapReduce implementation. Gated phases from a 4DCT scan were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and backproject subsets of projections, and a Reduce function to aggregate those partial backprojections into the whole volume. MapReduce automatically parallelized the reconstruction process on a large cluster of computer nodes. As a validation, reconstruction of a digital phantom and an acquired CatPhan 600 phantom was performed on a commercial cloud computing environment using the proposed 4D CBCT/CT reconstruction algorithm. Speedup of reconstruction time is found to be roughly linear with the number of nodes employed. For instance, greater than 10 times speedup was achieved using 200 nodes for all cases, compared to the same code executed on a single machine. Without modifying the code, faster reconstruction is readily achievable by allocating more nodes in the cloud computing environment. Root mean square error between the images obtained using MapReduce and a single-threaded reference implementation was on the order of 10⁻⁷. Our study also proved that cloud computing with MapReduce is fault tolerant: the reconstruction completed successfully with identical results even when half of the nodes were manually terminated in the middle of the process. An ultrafast, reliable and scalable 4D CBCT/CT reconstruction method was developed using the MapReduce framework. Unlike other parallel computing approaches, the parallelization and speedup required little modification of the original reconstruction code. MapReduce provides an efficient and fault tolerant means of solving large-scale computing problems in a cloud computing environment.
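
    A hedged, pure-Python sketch of the MapReduce formalism described above: Map tasks backproject subsets of projections and a Reduce step aggregates the partial volumes. The backprojection below is a simplified stand-in rather than the FDK implementation, and no actual Hadoop API is used:

      import numpy as np
      from functools import reduce

      def backproject(shape, projection, angle):
          """Stand-in backprojection: smear a 1D projection across the volume."""
          H, W = shape
          ys, xs = np.mgrid[0:H, 0:W]
          c, s = np.cos(angle), np.sin(angle)
          t = ((xs - W / 2) * c + (ys - H / 2) * s + len(projection) / 2).astype(int)
          np.clip(t, 0, len(projection) - 1, out=t)
          return projection[t]

      def map_fn(task):
          """Map: backproject one subset of projections into a partial volume."""
          shape, subset = task
          return sum(backproject(shape, proj, ang) for proj, ang in subset)

      def reduce_fn(a, b):
          """Reduce: aggregate partial backprojections into the whole volume."""
          return a + b

      # Fake acquisition: 60 projections split across 6 map tasks.
      shape, nproj = (64, 64), 60
      projections = [(np.random.rand(96), k * np.pi / nproj) for k in range(nproj)]
      tasks = [(shape, projections[i::6]) for i in range(6)]
      volume = reduce(reduce_fn, map(map_fn, tasks))       # the MapReduce pipeline
      print(volume.shape, float(volume.mean()))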

  7. Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment

    PubMed Central

    Meng, Bowen; Pratx, Guillem; Xing, Lei

    2011-01-01

    Purpose: Four-dimensional CT (4DCT) and cone beam CT (CBCT) are widely used in radiation therapy for accurate tumor target definition and localization. However, high-resolution and dynamic image reconstruction is computationally demanding because of the large amount of data processed. Efficient use of these imaging techniques in the clinic requires high-performance computing. The purpose of this work is to develop a novel ultrafast, scalable and reliable image reconstruction technique for 4D CBCT/CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large-scale medical physics problems in a cloud computing environment. Methods: In this work, we accelerated the Feldkamp–Davis–Kress (FDK) algorithm by porting it to Hadoop, an open-source MapReduce implementation. Gated phases from a 4DCT scan were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and backproject subsets of projections, and a Reduce function to aggregate those partial backprojections into the whole volume. MapReduce automatically parallelized the reconstruction process on a large cluster of computer nodes. As a validation, reconstruction of a digital phantom and an acquired CatPhan 600 phantom was performed on a commercial cloud computing environment using the proposed 4D CBCT/CT reconstruction algorithm. Results: Speedup of reconstruction time is found to be roughly linear with the number of nodes employed. For instance, greater than 10 times speedup was achieved using 200 nodes for all cases, compared to the same code executed on a single machine. Without modifying the code, faster reconstruction is readily achievable by allocating more nodes in the cloud computing environment. Root mean square error between the images obtained using MapReduce and a single-threaded reference implementation was on the order of 10⁻⁷. Our study also proved that cloud computing with MapReduce is fault tolerant: the reconstruction completed successfully with identical results even when half of the nodes were manually terminated in the middle of the process. Conclusions: An ultrafast, reliable and scalable 4D CBCT/CT reconstruction method was developed using the MapReduce framework. Unlike other parallel computing approaches, the parallelization and speedup required little modification of the original reconstruction code. MapReduce provides an efficient and fault tolerant means of solving large-scale computing problems in a cloud computing environment. PMID:22149842

  8. Reduced order modeling of head related transfer functions for virtual acoustic displays

    NASA Astrophysics Data System (ADS)

    Willhite, Joel A.; Frampton, Kenneth D.; Grantham, D. Wesley

    2003-04-01

    The purpose of this work is to improve the computational efficiency in acoustic virtual applications by creating and testing reduced order models of the head related transfer functions used in localizing sound sources. State space models of varying order were generated from zero-elevation Head Related Impulse Responses (HRIRs) using Kung's singular value decomposition (SVD) technique. The inputs to the models are the desired azimuths of the virtual sound sources (from −90 deg to +90 deg, in 10 deg increments) and the outputs are the left and right ear impulse responses. Trials were conducted in an anechoic chamber in which subjects were exposed to real sounds emitted by individual speakers across a numbered speaker array, phantom sources generated from the original HRIRs, and phantom sound sources generated with the different reduced order state space models. The error in the perceived direction of the phantom sources generated from the reduced order models was compared to the errors in localization using the original HRIRs.

  9. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carpentier, J.L.; Di Bono, P.J.; Tournebise, P.J.

    The efficient bounding method for DC contingency analysis is improved using reciprocity properties. Knowing the consequences of the outage of a branch, these properties provide the consequences on that branch of various kinds of outages. This is used in order to reduce computation times and to get rid of some difficulties, such as those occurring when a branch flow is close to its limit before the outage. Compensation, sparse vector, sparse inverse and bounding techniques are also used. A program has been implemented for single branch outages and tested on an actual French 650-bus EHV network. Computation times are 60% of those of the efficient bounding method. The relevant algorithm is described in detail in the first part of this paper. In the second part, reciprocity properties and bounding formulas are extended to multiple branch outages and to multiple generator or load outages. An algorithm is proposed in order to handle all these cases simultaneously.

  10. Inner Space Perturbation Theory in Matrix Product States: Replacing Expensive Iterative Diagonalization.

    PubMed

    Ren, Jiajun; Yi, Yuanping; Shuai, Zhigang

    2016-10-11

    We propose an inner space perturbation theory (isPT) to replace the expensive iterative diagonalization in the standard density matrix renormalization group (DMRG) theory. The retained reduced density matrix eigenstates are partitioned into an active space and a secondary space. The first-order wave function and the second- and third-order energies are easily computed by using a one-step Davidson iteration. Our formulation has several advantages, including (i) keeping a balance between efficiency and accuracy, (ii) capturing more entanglement with the same amount of computational time, and (iii) recovery of the standard DMRG when all the basis states belong to the active space. Numerical examples for the polyacenes and periacene show that the efficiency gain is considerable and the accuracy loss due to the perturbation treatment is very small when half of the total basis states belong to the active space. Moreover, the perturbation calculations converge in all our numerical examples.

  11. Improved Feature Matching for Mobile Devices with IMU.

    PubMed

    Masiero, Andrea; Vettore, Antonio

    2016-08-05

    Thanks to the recent diffusion of low-cost high-resolution digital cameras and to the development of mostly automated procedures for image-based 3D reconstruction, the popularity of photogrammetry for environment surveys has been constantly increasing in recent years. Automatic feature matching is an important step in successfully completing the photogrammetric 3D reconstruction: this step is the fundamental basis for the subsequent estimation of the geometry of the scene. This paper reconsiders the feature matching problem when dealing with smart mobile devices (e.g., when using the standard camera embedded in a smartphone as the imaging sensor). More specifically, this paper aims at exploiting the information on camera movements provided by the inertial navigation system (INS) in order to make the feature matching step more robust and, possibly, computationally more efficient. First, a revised version of the affine scale-invariant feature transform (ASIFT) is considered: this version reduces the computational complexity of the original ASIFT, while still ensuring an increase in correct feature matches with respect to SIFT. Furthermore, a new two-step procedure for the estimation of the essential matrix E (and the camera pose) is proposed in order to increase its estimation robustness and computational efficiency.

  12. Flutter Analysis for Turbomachinery Using Volterra Series

    NASA Technical Reports Server (NTRS)

    Liou, Meng-Sing; Yao, Weigang

    2014-01-01

    The objective of this paper is to describe an accurate and efficient reduced order modeling method for aeroelastic (AE) analysis and for determining the flutter boundary. Without losing accuracy, we develop a reduced order model based on the Volterra series to achieve significant savings in computational cost. The aerodynamic force is provided by a high-fidelity solution of the Reynolds-averaged Navier-Stokes (RANS) equations; the structural mode shapes are determined from finite element analysis. The fluid-structure coupling is then modeled by a state-space formulation with the structural displacement as input and the aerodynamic force as output, which in turn acts as an external force in the aeroelastic displacement equation for providing the structural deformation. NASA's rotor 67 blade is used to study its aeroelastic characteristics under the designated operating condition. First, the CFD results are validated against measured data available for the steady state condition. Then, the accuracy of the developed reduced order model is verified against the full-order solutions. Finally, the aeroelastic solutions of the blade are computed and a flutter boundary is identified, suggesting that the rotor, with the material properties chosen for the study, is structurally stable at the operating condition and free of flutter.

  13. Fractional modeling of viscoelasticity in 3D cerebral arteries and aneurysms

    NASA Astrophysics Data System (ADS)

    Yu, Yue; Perdikaris, Paris; Karniadakis, George Em

    2016-10-01

    We develop efficient numerical methods for fractional order PDEs, and employ them to investigate viscoelastic constitutive laws for arterial wall mechanics. Recent simulations using one-dimensional models [1] have indicated that fractional order models may offer a more powerful alternative for modeling the arterial wall response, exhibiting reduced sensitivity to parametric uncertainties compared with the integer-calculus-based models. Here, we study three-dimensional (3D) fractional PDEs that naturally model the continuous relaxation properties of soft tissue, and for the first time employ them to simulate flow structure interactions for patient-specific brain aneurysms. To deal with the high memory requirements and in order to accelerate the numerical evaluation of hereditary integrals, we employ a fast convolution method [2] that reduces the memory cost to O(log(N)) and the computational complexity to O(N log(N)). Furthermore, we combine the fast convolution with high-order backward differentiation to achieve third-order time integration accuracy. We confirm that in 3D viscoelastic simulations, the integer order models strongly depend on the relaxation parameters, while the fractional order models are less sensitive. As an application to long-time simulations in complex geometries, we also apply the method to modeling fluid-structure interaction of a 3D patient-specific compliant cerebral artery with an aneurysm. Taken together, our findings demonstrate that fractional calculus can be employed effectively in modeling complex behavior of materials in realistic 3D time-dependent problems if properly designed efficient algorithms are employed to overcome the extra memory requirements and computational complexity associated with the non-local character of fractional derivatives.
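
    The paper accelerates the hereditary integral with a fast convolution; as a hedged baseline illustrating what such methods accelerate, here is the direct O(N^2) L1 approximation of a Caputo derivative (a standard textbook scheme, not the authors' method):

      import numpy as np
      from math import gamma

      def caputo_l1(u, dt, alpha):
          """Direct L1 approximation of D^alpha u at every step: an O(N^2)
          hereditary sum over the full solution history."""
          N = len(u) - 1
          k = np.arange(N + 1)
          b = (k + 1) ** (1 - alpha) - k ** (1 - alpha)    # L1 weights
          c = dt ** (-alpha) / gamma(2 - alpha)
          d = np.zeros(N + 1)
          for n in range(1, N + 1):
              du = np.diff(u[:n + 1])                      # u_{j+1} - u_j
              d[n] = c * np.sum(b[:n][::-1] * du)          # weights b_{n-1-j}
          return d

      # Check against the exact result D^alpha t = t^(1-alpha) / Gamma(2-alpha);
      # the L1 scheme is exact for piecewise-linear u, so the error is round-off.
      alpha, dt = 0.5, 1e-3
      t = np.arange(0.0, 1.0 + dt, dt)
      print(np.max(np.abs(caputo_l1(t, dt, alpha)[1:] - t[1:] ** 0.5 / gamma(1.5))))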

  14. Fractional modeling of viscoelasticity in 3D cerebral arteries and aneurysms

    PubMed Central

    Perdikaris, Paris; Karniadakis, George Em

    2017-01-01

    We develop efficient numerical methods for fractional order PDEs, and employ them to investigate viscoelastic constitutive laws for arterial wall mechanics. Recent simulations using one-dimensional models [1] have indicated that fractional order models may offer a more powerful alternative for modeling the arterial wall response, exhibiting reduced sensitivity to parametric uncertainties compared with the integer-calculus-based models. Here, we study three-dimensional (3D) fractional PDEs that naturally model the continuous relaxation properties of soft tissue, and for the first time employ them to simulate flow structure interactions for patient-specific brain aneurysms. To deal with the high memory requirements and in order to accelerate the numerical evaluation of hereditary integrals, we employ a fast convolution method [2] that reduces the memory cost to O(log(N)) and the computational complexity to O(N log(N)). Furthermore, we combine the fast convolution with high-order backward differentiation to achieve third-order time integration accuracy. We confirm that in 3D viscoelastic simulations, the integer order models strongly depend on the relaxation parameters, while the fractional order models are less sensitive. As an application to long-time simulations in complex geometries, we also apply the method to modeling fluid–structure interaction of a 3D patient-specific compliant cerebral artery with an aneurysm. Taken together, our findings demonstrate that fractional calculus can be employed effectively in modeling complex behavior of materials in realistic 3D time-dependent problems if properly designed efficient algorithms are employed to overcome the extra memory requirements and computational complexity associated with the non-local character of fractional derivatives. PMID:29104310

  15. A partitioned model order reduction approach to rationalise computational expenses in nonlinear fracture mechanics

    PubMed Central

    Kerfriden, P.; Goury, O.; Rabczuk, T.; Bordas, S.P.A.

    2013-01-01

    We propose in this paper a reduced order modelling technique based on domain partitioning for parametric problems of fracture. We show that coupling domain decomposition and projection-based model order reduction makes it possible to focus the numerical effort where it is most needed: around the zones where damage propagates. No a priori knowledge of the damage pattern is required, the extraction of the corresponding spatial regions being based solely on algebra. The efficiency of the proposed approach is demonstrated numerically with an example relevant to engineering fracture. PMID:23750055

  16. Preliminary Evaluation of MapReduce for High-Performance Climate Data Analysis

    NASA Technical Reports Server (NTRS)

    Duffy, Daniel Q.; Schnase, John L.; Thompson, John H.; Freeman, Shawn M.; Clune, Thomas L.

    2012-01-01

    MapReduce is an approach to high-performance analytics that may be useful to data intensive problems in climate research. It offers an analysis paradigm that uses clusters of computers and combines distributed storage of large data sets with parallel computation. We are particularly interested in the potential of MapReduce to speed up basic operations common to a wide range of analyses. In order to evaluate this potential, we are prototyping a series of canonical MapReduce operations over a test suite of observational and climate simulation datasets. Our initial focus has been on averaging operations over arbitrary spatial and temporal extents within Modern Era Retrospective-Analysis for Research and Applications (MERRA) data. Preliminary results suggest this approach can improve efficiencies within data intensive analytic workflows.

  17. Efficient electronic structure theory via hierarchical scale-adaptive coupled-cluster formalism: I. Theory and computational complexity analysis

    NASA Astrophysics Data System (ADS)

    Lyakh, Dmitry I.

    2018-03-01

    A novel reduced-scaling, general-order coupled-cluster approach is formulated by exploiting hierarchical representations of many-body tensors, combined with the recently suggested formalism of scale-adaptive tensor algebra. Inspired by the hierarchical techniques from the renormalisation group approach, H/H2-matrix algebra and fast multipole method, the computational scaling reduction in our formalism is achieved via coarsening of quantum many-body interactions at larger interaction scales, thus imposing a hierarchical structure on many-body tensors of coupled-cluster theory. In our approach, the interaction scale can be defined on any appropriate Euclidean domain (spatial domain, momentum-space domain, energy domain, etc.). We show that the hierarchically resolved many-body tensors can reduce the storage requirements to O(N), where N is the number of simulated quantum particles. Subsequently, we prove that any connected many-body diagram consisting of a finite number of arbitrary-order tensors, e.g. an arbitrary coupled-cluster diagram, can be evaluated in O(N log N) floating-point operations. On top of that, we suggest an additional approximation to further reduce the computational complexity of higher order coupled-cluster equations, i.e. equations involving higher than double excitations, which otherwise would introduce a large prefactor into the formal O(N log N) scaling.

  18. Efficient LIDAR Point Cloud Data Managing and Processing in a Hadoop-Based Distributed Framework

    NASA Astrophysics Data System (ADS)

    Wang, C.; Hu, F.; Sha, D.; Han, X.

    2017-10-01

    Light Detection and Ranging (LiDAR) is one of the most promising technologies in surveying and mapping, city management, forestry, object recognition, computer vision engineering, and other fields. However, it is challenging to efficiently store, query and analyze the high-resolution 3D LiDAR data due to its volume and complexity. In order to improve the productivity of LiDAR data processing, this study proposes a Hadoop-based framework to efficiently manage and process LiDAR data in a distributed and parallel manner, which takes advantage of Hadoop's storage and computing ability. At the same time, the Point Cloud Library (PCL), an open-source project for 2D/3D image and point cloud processing, is integrated with HDFS and MapReduce to conduct the LiDAR data analysis algorithms provided by PCL in a parallel fashion. The experiment results show that the proposed framework can efficiently manage and process big LiDAR data.

  19. Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

    NASA Technical Reports Server (NTRS)

    Sun, Xian-He

    1997-01-01

    Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as the Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high-performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan was to 1) develop highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporate the newly developed algorithms into actual simulation packages. This work plan has been fully achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, and (2) using a compact scheme to gain high order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.

  20. Technique for Very High Order Nonlinear Simulation and Validation

    NASA Technical Reports Server (NTRS)

    Dyson, Rodger W.

    2001-01-01

    Finding the sources of sound in large nonlinear fields via direct simulation currently requires excessive computational cost. This paper describes a simple technique for efficiently solving the multidimensional nonlinear Euler equations that significantly reduces this cost and demonstrates a useful approach for validating high order nonlinear methods. Methods of up to 15th-order accuracy in space and time were compared, and it is shown that an algorithm with a fixed design accuracy approaches its maximal utility and then its usefulness exponentially decays unless higher accuracy is used. It is concluded that at least a 7th-order method is required to efficiently propagate a harmonic wave using the nonlinear Euler equations to a distance of 5 wavelengths while maintaining an overall error tolerance that is low enough to capture both the mean flow and the acoustics.

  1. A High-Order Low-Order Algorithm with Exponentially Convergent Monte Carlo for Thermal Radiative Transfer

    DOE PAGES

    Bolding, Simon R.; Cleveland, Mathew Allen; Morel, Jim E.

    2016-10-21

    In this paper, we have implemented a new high-order low-order (HOLO) algorithm for solving thermal radiative transfer problems. The low-order (LO) system is based on the spatial and angular moments of the transport equation and a linear-discontinuous finite-element spatial representation, producing equations similar to the standard S2 equations. The LO solver is fully implicit in time and efficiently resolves the nonlinear temperature dependence at each time step. The high-order (HO) solver utilizes exponentially convergent Monte Carlo (ECMC) to give a globally accurate solution for the angular intensity to a fixed-source pure-absorber transport problem. This global solution is used to compute consistency terms, which require the HO and LO solutions to converge toward the same solution. The use of ECMC allows for the efficient reduction of statistical noise in the Monte Carlo solution, reducing inaccuracies introduced through the LO consistency terms. Finally, we compare results with an implicit Monte Carlo code for one-dimensional gray test problems and demonstrate the efficiency of ECMC over standard Monte Carlo in this HOLO algorithm.

  2. Efficient model reduction of parametrized systems by matrix discrete empirical interpolation

    NASA Astrophysics Data System (ADS)

    Negri, Federico; Manzoni, Andrea; Amsallem, David

    2015-12-01

    In this work, we apply a Matrix version of the so-called Discrete Empirical Interpolation (MDEIM) for the efficient reduction of nonaffine parametrized systems arising from the discretization of linear partial differential equations. Dealing with affinely parametrized operators is crucial in order to enhance the online solution of reduced-order models (ROMs). However, in many cases such an affine decomposition is not readily available, and must be recovered through (often) intrusive procedures, such as the empirical interpolation method (EIM) and its discrete variant DEIM. In this paper we show that MDEIM represents a very efficient approach to deal with complex physical and geometrical parametrizations in a non-intrusive, efficient and purely algebraic way. We propose different strategies to combine MDEIM with a state approximation resulting either from a reduced basis greedy approach or Proper Orthogonal Decomposition. A posteriori error estimates accounting for the MDEIM error are also developed in the case of parametrized elliptic and parabolic equations. Finally, the capability of MDEIM to generate accurate and efficient ROMs is demonstrated on the solution of two computationally-intensive classes of problems occurring in engineering contexts, namely PDE-constrained shape optimization and parametrized coupled problems.
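
    As background for the machinery the abstract builds on: given a basis of snapshot modes, DEIM greedily selects interpolation indices, and MDEIM applies the same idea to vectorized system matrices. A minimal sketch of the classic greedy index selection (standard DEIM only, not the paper's full MDEIM pipeline; the example basis is synthetic):

        import numpy as np

        def deim_indices(U):
            """Greedy DEIM selection of interpolation indices from basis U (n x m)."""
            n, m = U.shape
            idx = [int(np.argmax(np.abs(U[:, 0])))]
            for k in range(1, m):
                # Interpolate column k using the indices chosen so far ...
                c = np.linalg.solve(U[idx, :k], U[idx, k])
                # ... and add the row where the interpolation residual peaks.
                r = U[:, k] - U[:, :k] @ c
                idx.append(int(np.argmax(np.abs(r))))
            return np.array(idx)

        # Example: an orthonormal basis of 5 modes on a 100-point grid.
        U, _ = np.linalg.qr(np.random.default_rng(0).standard_normal((100, 5)))
        print(deim_indices(U))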

  3. hp-Adaptive time integration based on the BDF for viscous flows

    NASA Astrophysics Data System (ADS)

    Hay, A.; Etienne, S.; Pelletier, D.; Garon, A.

    2015-06-01

    This paper presents a procedure based on the Backward Differentiation Formulas of order 1 to 5 to obtain efficient time integration of the incompressible Navier-Stokes equations. The adaptive algorithm performs both stepsize and order selection to control, respectively, the solution accuracy and the computational efficiency of the time integration process. The stepsize selection (h-adaptivity) is based on a local error estimate and an error controller to guarantee that the numerical solution accuracy is within a user-prescribed tolerance. The order selection (p-adaptivity) relies on the idea that low-accuracy solutions can be computed efficiently by low-order time integrators while accurate solutions require high-order time integrators to keep the computational time low. The selection is based on a stability test that detects growing numerical noise and deems a method of order p stable if there is no method of lower order that delivers the same solution accuracy for a larger stepsize. Hence, it guarantees both that (1) the method of integration operates inside its stability region and (2) the time integration procedure is computationally efficient. The proposed time integration procedure also features time-step rejection and quarantine mechanisms, a modified Newton method with a predictor, and dense output techniques to compute the solution at off-step points.
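
    The h-adaptive half of such schemes typically uses the classic local-error controller. A minimal sketch of that controller (the safety factor and clipping bounds are common defaults, not the paper's specific tuning):

        def new_stepsize(h, err, tol, p, safety=0.9, fac_min=0.2, fac_max=5.0):
            """Standard local-error step-size controller for a method of order p."""
            fac = safety * (tol / err) ** (1.0 / (p + 1))
            return h * min(fac_max, max(fac_min, fac))  # clip to avoid wild step jumps

        # e.g. a BDF2 step whose error estimate is 4x the tolerance gets shrunk:
        print(new_stepsize(h=0.01, err=4e-6, tol=1e-6, p=2))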

  4. Effective orthorhombic anisotropic models for wavefield extrapolation

    NASA Astrophysics Data System (ADS)

    Ibanez-Jacome, Wilson; Alkhalifah, Tariq; Waheed, Umair bin

    2014-09-01

    Wavefield extrapolation in orthorhombic anisotropic media incorporates complicated but realistic models to reproduce wave propagation phenomena in the Earth's subsurface. Compared with the representations used for simpler symmetries, such as transversely isotropic or isotropic, orthorhombic models require an extended and more elaborated formulation that also involves more expensive computational processes. The acoustic assumption yields more efficient description of the orthorhombic wave equation that also provides a simplified representation for the orthorhombic dispersion relation. However, such representation is hampered by the sixth-order nature of the acoustic wave equation, as it also encompasses the contribution of shear waves. To reduce the computational cost of wavefield extrapolation in such media, we generate effective isotropic inhomogeneous models that are capable of reproducing the first-arrival kinematic aspects of the orthorhombic wavefield. First, in order to compute traveltimes in vertical orthorhombic media, we develop a stable, efficient and accurate algorithm based on the fast marching method. The derived orthorhombic acoustic dispersion relation, unlike the isotropic or transversely isotropic ones, is represented by a sixth order polynomial equation with the fastest solution corresponding to outgoing P waves in acoustic media. The effective velocity models are then computed by evaluating the traveltime gradients of the orthorhombic traveltime solution, and using them to explicitly evaluate the corresponding inhomogeneous isotropic velocity field. The inverted effective velocity fields are source dependent and produce equivalent first-arrival kinematic descriptions of wave propagation in orthorhombic media. We extrapolate wavefields in these isotropic effective velocity models using the more efficient isotropic operator, and the results compare well, especially kinematically, with those obtained from the more expensive anisotropic extrapolator.

  5. A computationally efficient ductile damage model accounting for nucleation and micro-inertia at high triaxialities

    DOE PAGES

    Versino, Daniele; Bronkhorst, Curt Allan

    2018-01-31

    The computational formulation of a micro-mechanical material model for the dynamic failure of ductile metals is presented in this paper. The statistical nature of porosity initiation is accounted for by introducing an arbitrary probability density function which describes the pores' nucleation pressures. Each micropore within the representative volume element is modeled as a thick spherical shell made of plastically incompressible material. The treatment of porosity by a distribution of thick-walled spheres also allows for the inclusion of micro-inertia effects under conditions of shock and dynamic loading. The second-order ordinary differential equation governing the microscopic porosity evolution is solved with a robust implicit procedure. A new Chebyshev collocation method is employed to approximate the porosity distribution, and remapping is used to optimize memory usage. The adaptive approximation of the porosity distribution leads to a reduction of computational time and memory usage of up to two orders of magnitude. Moreover, the proposed model affords consistent performance: changing the nucleation pressure probability density function and/or the applied strain rate does not reduce the accuracy or computational efficiency of the material model. The numerical performance of the model and algorithms presented is tested against three problems for high-density tantalum: single void, one-dimensional uniaxial strain, and two-dimensional plate impact. Here, the results using the integration and algorithmic advances suggest a significant improvement in computational efficiency and accuracy over previous treatments for dynamic loading conditions.

  6. Inverse finite-size scaling for high-dimensional significance analysis

    NASA Astrophysics Data System (ADS)

    Xu, Yingying; Puranen, Santeri; Corander, Jukka; Kabashima, Yoshiyuki

    2018-06-01

    We propose an efficient procedure for significance determination in high-dimensional dependence learning based on surrogate data testing, termed inverse finite-size scaling (IFSS). The IFSS method is based on our discovery of a universal scaling property of random matrices which enables inference about signal behavior from surrogate data of much smaller scale than the dimensionality of the original data. As a motivating example, we demonstrate the procedure for ultra-high-dimensional Potts models with on the order of 10^10 parameters. IFSS reduces the computational effort of the data-testing procedure by several orders of magnitude, making it very efficient for practical purposes. This approach thus holds considerable potential for generalization to other types of complex models.

  7. On the placement of active members in adaptive truss structures for vibration control

    NASA Technical Reports Server (NTRS)

    Lu, L.-Y.; Utku, S.; Wada, B. K.

    1992-01-01

    The problem of optimal placement of active members which are used for vibration control in adaptive truss structures is investigated. The control scheme is based on the method of eigenvalue assignment as a means of shaping the transient response of the controlled adaptive structures, and the minimization of the required control action is considered as the optimization criterion. To this end, a performance index which measures the control strokes of the active members is formulated in an efficient way. In order to reduce the computational burden, particularly for the case where the locations of active members have to be selected from a large set of available sites, several heuristic search schemes are proposed for obtaining near-optimal locations. The proposed schemes significantly reduce the computational complexity of placing multiple active members to the order of that of placing a single active member.
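
    The abstract does not spell out the heuristics; a generic greedy placement of the kind often used for such problems can be sketched as follows. The cost function here is a hypothetical stand-in for the control-stroke performance index, and the site set is illustrative:

        def greedy_placement(sites, n_active, cost):
            """Pick n_active member locations one at a time, each minimizing `cost`."""
            chosen, remaining = [], list(sites)
            for _ in range(n_active):
                best = min(remaining, key=lambda s: cost(chosen + [s]))
                chosen.append(best)
                remaining.remove(best)
            return chosen

        # Toy usage with a stand-in objective that favors low-numbered sites.
        print(greedy_placement(sites=range(10), n_active=3, cost=lambda sel: sum(sel)))

    Each of the n_active selections scans the remaining sites once, which is what reduces the complexity of placing multiple members to roughly that of repeated single-member placement.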

  8. Efficient Ab initio Modeling of Random Multicomponent Alloys

    DOE PAGES

    Jiang, Chao; Uberuaga, Blas P.

    2016-03-08

    In this Letter, we present a novel small set of ordered structures (SSOS) method that allows extremely efficient ab initio modeling of random multi-component alloys. Using inverse II-III spinel oxides and equiatomic quinary bcc (so-called high-entropy) alloys as examples, we demonstrate that an SSOS can achieve the same accuracy as a large supercell or a well-converged cluster expansion, but with significantly reduced computational cost. In particular, because of this efficiency, a large number of quinary alloy compositions can be quickly screened, leading to the identification of several new possible high-entropy alloy chemistries. Furthermore, the SSOS method developed here can be broadly useful for the rapid computational design of multi-component materials, especially those with a large number of alloying elements, a challenging problem for other approaches.

  9. A Systems Approach to High Performance Buildings: A Computational Systems Engineering R&D Program to Increase DoD Energy Efficiency

    DTIC Science & Technology

    2012-02-01

    [Only table-of-contents fragments survive in this record: "Reduced-Order Modeling and Control Design for Low Energy Building Ventilation and Space Conditioning Systems" and "Reduced-Order Building Energy Models" (Appendix D). The excerpted section focuses on the modeling and control of airflow in buildings.]

  10. Simulating coupled dynamics of a rigid-flexible multibody system and compressible fluid

    NASA Astrophysics Data System (ADS)

    Hu, Wei; Tian, Qiang; Hu, HaiYan

    2018-04-01

    As a continuation of the authors' previous studies, a new parallel computation approach is proposed to simulate the coupled dynamics of a rigid-flexible multibody system and a compressible fluid. In this approach, the smoothed particle hydrodynamics (SPH) method is used to model the compressible fluid, while the natural coordinate formulation (NCF) and absolute nodal coordinate formulation (ANCF) are used to model the rigid and flexible bodies, respectively. In order to model the compressible fluid properly and efficiently via the SPH method, three measures are taken. The first is to use a Riemann solver to cope with the fluid compressibility, the second is to define virtual SPH particles to model the dynamic interaction between the fluid and the multibody system, and the third is to impose periodic inflow and outflow boundary conditions to reduce the number of SPH particles involved in the computation. Afterwards, a parallel computation strategy based on the graphics processing unit (GPU) is proposed to detect the neighboring SPH particles and to solve the dynamic equations of the SPH particles, improving the computational efficiency. Meanwhile, the generalized-alpha algorithm is used to solve the dynamic equations of the multibody system. Finally, four case studies are given to validate the proposed parallel computation approach.

  11. A Parallel First-Order Linear Recurrence Solver.

    DTIC Science & Technology

    1986-09-01

    ... (log2 M) steps, but did not discuss any specific parallel implementation. Gajski [GAJ81] improved upon this result by performing the SIMD computation... solves a series of reduced recurrences of size p^2. However, when N = p^2, our approach reduces to that of [GAJ81], except that Gajski presents the... existing SIMD algorithms to solve R<N,1>; the SIMD algorithm presented by Gajski [GAJ81] can be most efficiently mapped to a unidirectional ring

  12. Vibration extraction based on fast NCC algorithm and high-speed camera.

    PubMed

    Lei, Xiujun; Jin, Yi; Guo, Jie; Zhu, Chang'an

    2015-09-20

    In this study, a high-speed camera system is developed to perform vibration measurement in real time and to avoid the added mass introduced by conventional contact measurements. The proposed system consists of a notebook computer and a high-speed camera which can capture images at up to 1000 frames per second. In order to process the captured images on the computer, the normalized cross-correlation (NCC) template tracking algorithm with subpixel accuracy is introduced. Additionally, a modified local search algorithm based on NCC is proposed to reduce the computation time and to increase efficiency significantly. The modified algorithm can accomplish one displacement extraction 10 times faster than traditional template matching, without installing any target panel onto the structures. Two experiments were carried out under laboratory and outdoor conditions to validate the accuracy and efficiency of the system in practice. The results demonstrate the high accuracy and efficiency of the camera system in extracting vibration signals.
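
    For reference, the NCC score at a candidate offset and a brute-force integer-pixel search (the baseline that the paper's local search accelerates) can be sketched as follows; the modified local search and subpixel refinement from the paper are not reproduced here, and the test image is synthetic:

        import numpy as np

        def ncc(patch, template):
            """Normalized cross-correlation of two equally sized arrays."""
            a = patch - patch.mean()
            b = template - template.mean()
            denom = np.sqrt((a * a).sum() * (b * b).sum())
            return float((a * b).sum() / denom) if denom > 0 else 0.0

        def match(image, template):
            """Exhaustive integer-pixel NCC search over all offsets."""
            th, tw = template.shape
            best, best_pos = -1.0, (0, 0)
            for i in range(image.shape[0] - th + 1):
                for j in range(image.shape[1] - tw + 1):
                    s = ncc(image[i:i + th, j:j + tw], template)
                    if s > best:
                        best, best_pos = s, (i, j)
            return best_pos, best

        rng = np.random.default_rng(0)
        img = rng.random((64, 64))
        print(match(img, img[20:28, 30:38].copy()))  # -> ((20, 30), ~1.0)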

  13. Improve the efficiency of the Cartesian tensor based fast multipole method for Coulomb interaction using the traces

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huang, He; Luo, Li-Shi; Li, Rui

    To compute the non-oscillating mutual interaction for a system of N points, the fast multipole method (FMM) has an efficiency that scales linearly with the number of points. Specifically, for the Coulomb interaction, FMM can be constructed using either spherical harmonic functions or totally symmetric Cartesian tensors. In this paper, we show that the efficiency of the Cartesian tensor-based FMM for the Coulomb interaction can be significantly improved by exploiting the traces of the Cartesian tensors in the calculation, reducing the number of independent elements of the n-th rank totally symmetric Cartesian tensor from (n + 1)(n + 2)/2 to 2n + 1. The computational complexity of the FMM operations is analyzed and expressed as polynomials in the highest rank of the Cartesian tensors. For most operations, the complexity is reduced by one order. Numerical examples regarding the convergence and the efficiency of the new algorithm are demonstrated. As a result, a reduction of computation time of up to 50% has been observed for a moderate number of points and rank of tensors.

  14. Improve the efficiency of the Cartesian tensor based fast multipole method for Coulomb interaction using the traces

    DOE PAGES

    Huang, He; Luo, Li-Shi; Li, Rui; ...

    2018-05-17

    To compute the non-oscillating mutual interaction for a system of N points, the fast multipole method (FMM) has an efficiency that scales linearly with the number of points. Specifically, for the Coulomb interaction, FMM can be constructed using either spherical harmonic functions or totally symmetric Cartesian tensors. In this paper, we show that the efficiency of the Cartesian tensor-based FMM for the Coulomb interaction can be significantly improved by exploiting the traces of the Cartesian tensors in the calculation, reducing the number of independent elements of the n-th rank totally symmetric Cartesian tensor from (n + 1)(n + 2)/2 to 2n + 1. The computational complexity of the FMM operations is analyzed and expressed as polynomials in the highest rank of the Cartesian tensors. For most operations, the complexity is reduced by one order. Numerical examples regarding the convergence and the efficiency of the new algorithm are demonstrated. As a result, a reduction of computation time of up to 50% has been observed for a moderate number of points and rank of tensors.

  15. Large-scale inverse model analyses employing fast randomized data reduction

    NASA Astrophysics Data System (ADS)

    Lin, Youzuo; Le, Ellen B.; O'Malley, Daniel; Vesselinov, Velimir V.; Bui-Thanh, Tan

    2017-08-01

    When the number of observations is large, it is computationally challenging to apply classical inverse modeling techniques. We have developed a new computationally efficient technique for solving inverse problems with a large number of observations (e.g., on the order of 10^7 or greater). Our method, which we call the randomized geostatistical approach (RGA), is built upon the principal component geostatistical approach (PCGA). We employ a data reduction technique combined with the PCGA to improve the computational efficiency and reduce the memory usage. Specifically, we employ a randomized numerical linear algebra technique based on a so-called "sketching" matrix to effectively reduce the dimension of the observations without losing the information content needed for the inverse analysis. In this way, the computational and memory costs for RGA scale with the information content rather than the size of the calibration data. Our algorithm is coded in Julia and implemented in the MADS open-source high-performance computational framework (http://mads.lanl.gov). We apply our new inverse modeling method to invert for a synthetic transmissivity field. Compared to a standard geostatistical approach (GA), our method is more efficient when the number of observations is large. Most importantly, our method is capable of solving larger inverse problems than the standard GA and PCGA approaches. Therefore, our new model inversion method is a powerful tool for solving large-scale inverse problems. The method can be applied in any field and is not limited to hydrogeological applications such as the characterization of aquifer heterogeneity.
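
    The core of the sketching step is easy to illustrate: multiply the long observation vector by a short random matrix so that norms, and hence misfit information, are approximately preserved. A minimal sketch with illustrative sizes (the actual RGA embeds structured randomized linear algebra inside PCGA, not this bare Gaussian sketch):

        import numpy as np

        rng = np.random.default_rng(0)
        n_obs, k = 20_000, 200                    # observation count vs. sketch size
        y = rng.standard_normal(n_obs)            # stand-in for the calibration data
        S = rng.standard_normal((k, n_obs)) / np.sqrt(k)  # Gaussian sketching matrix
        y_sk = S @ y                              # reduced data: k values, not n_obs

        # Johnson-Lindenstrauss-type property: the norm is preserved on average.
        print(np.linalg.norm(y), np.linalg.norm(y_sk))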

  16. Improving multivariate Horner schemes with Monte Carlo tree search

    NASA Astrophysics Data System (ADS)

    Kuipers, J.; Plaat, A.; Vermaseren, J. A. M.; van den Herik, H. J.

    2013-11-01

    Optimizing the cost of evaluating a polynomial is a classic problem in computer science. For polynomials in one variable, Horner's method provides a scheme for producing a computationally efficient form. For multivariate polynomials it is possible to generalize Horner's method, but this leaves freedom in the order of the variables. Traditionally, greedy schemes like most-occurring variable first are used. This simple textbook algorithm has given remarkably efficient results. Finding better algorithms has proved difficult. In trying to improve upon the greedy scheme we have implemented Monte Carlo tree search, a recent search method from the field of artificial intelligence. This results in better Horner schemes and reduces the cost of evaluating polynomials, sometimes by factors up to two.
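
    For context, the univariate baseline that the multivariate schemes generalize: Horner's rule evaluates a degree-n polynomial with n multiplications instead of the O(n^2) of naive powering. The MCTS-driven variable ordering from the paper is not reproduced here:

        def horner(coeffs, x):
            """Evaluate p(x) = sum_i coeffs[i] * x**i with len(coeffs)-1 multiplies."""
            acc = 0.0
            for c in reversed(coeffs):
                acc = acc * x + c
            return acc

        # p(x) = 1 + 2x + 3x^2 at x = 2 -> 17
        print(horner([1.0, 2.0, 3.0], 2.0))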

  17. Computationally efficient multibody simulations

    NASA Technical Reports Server (NTRS)

    Ramakrishnan, Jayant; Kumar, Manoj

    1994-01-01

    Computationally efficient approaches to the solution of the dynamics of multibody systems are presented in this work. The computational efficiency is derived from both the algorithmic and implementational standpoint. Order(n) approaches provide a new formulation of the equations of motion eliminating the assembly and numerical inversion of a system mass matrix as required by conventional algorithms. Computational efficiency is also gained in the implementation phase by the symbolic processing and parallel implementation of these equations. Comparison of this algorithm with existing multibody simulation programs illustrates the increased computational efficiency.

  18. Data compression strategies for ptychographic diffraction imaging

    NASA Astrophysics Data System (ADS)

    Loetgering, Lars; Rose, Max; Treffer, David; Vartanyants, Ivan A.; Rosenhahn, Axel; Wilhein, Thomas

    2017-12-01

    Ptychography is a computational imaging method for solving inverse scattering problems. To date, the high amount of redundancy present in ptychographic data sets requires computer memory that is orders of magnitude larger than the retrieved information. Here, we propose and compare data compression strategies that significantly reduce the amount of data required for wavefield inversion. Information metrics are used to measure the amount of data redundancy present in ptychographic data. Experimental results demonstrate the technique to be memory efficient and stable in the presence of systematic errors such as partial coherence and noise.

  19. An efficient computer based wavelets approximation method to solve Fuzzy boundary value differential equations

    NASA Astrophysics Data System (ADS)

    Alam Khan, Najeeb; Razzaq, Oyoon Abdul

    2016-03-01

    In the present work a wavelets approximation method is employed to solve fuzzy boundary value differential equations (FBVDEs). Essentially, a truncated Legendre wavelets series together with the Legendre wavelets operational matrix of the derivative is utilized to convert an FBVDE into a simple computational problem by reducing it to a system of fuzzy algebraic linear equations. The capability of the scheme is investigated on second-order FBVDEs considered under generalized H-differentiability. Solutions are represented graphically, showing the competency and accuracy of the method.

  20. Model Order Reduction Algorithm for Estimating the Absorption Spectrum

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Van Beeumen, Roel; Williams-Young, David B.; Kasper, Joseph M.

    The ab initio description of the spectral interior of the absorption spectrum poses both a theoretical and computational challenge for modern electronic structure theory. Due to the often spectrally dense character of this domain in the quantum propagator's eigenspectrum for medium-to-large sized systems, traditional approaches based on the partial diagonalization of the propagator often encounter oscillatory and stagnating convergence. Electronic structure methods which solve the molecular response problem through the solution of spectrally shifted linear systems, such as the complex polarization propagator, offer an alternative approach which is agnostic to the underlying spectral density or domain location. This generality comes at a seemingly high computational cost associated with solving a large linear system for each spectral shift in some discretization of the spectral domain of interest. In this work, we present a novel, adaptive solution to this high computational overhead based on model order reduction techniques via interpolation. Model order reduction reduces the computational complexity of mathematical models and is ubiquitous in the simulation of dynamical systems and control theory. The efficiency and effectiveness of the proposed algorithm in the ab initio prediction of X-ray absorption spectra is demonstrated using a test set of challenging water clusters which are spectrally dense in the neighborhood of the oxygen K-edge. On the basis of a single, user-defined tolerance we automatically determine the order of the reduced models and approximate the absorption spectrum up to the given tolerance. We also illustrate that, for the systems studied, the automatically determined model order increases logarithmically with the problem dimension, compared to a linear increase of the number of eigenvalues within the energy window. Furthermore, we observed that the computational cost of the proposed algorithm only scales quadratically with respect to the problem dimension.

  1. Higher-order adaptive finite-element methods for Kohn–Sham density functional theory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Motamarri, P.; Nowak, M.R.; Leiter, K.

    2013-11-15

    We present an efficient computational approach to perform real-space electronic structure calculations using an adaptive higher-order finite-element discretization of Kohn–Sham density-functional theory (DFT). To this end, we develop an a priori mesh-adaption technique to construct a close to optimal finite-element discretization of the problem. We further propose an efficient solution strategy for solving the discrete eigenvalue problem by using spectral finite-elements in conjunction with Gauss–Lobatto quadrature, and a Chebyshev acceleration technique for computing the occupied eigenspace. The proposed approach has been observed to provide a staggering 100–200-fold computational advantage over the solution of a generalized eigenvalue problem. Using the proposed solution procedure, we investigate the computational efficiency afforded by higher-order finite-element discretizations of the Kohn–Sham DFT problem. Our studies suggest that staggering computational savings—of the order of 1000-fold—relative to linear finite-elements can be realized, for both all-electron and local pseudopotential calculations, by using higher-order finite-element discretizations. On all the benchmark systems studied, we observe diminishing returns in computational savings beyond the sixth-order for accuracies commensurate with chemical accuracy, suggesting that the hexic spectral-element may be an optimal choice for the finite-element discretization of the Kohn–Sham DFT problem. A comparative study of the computational efficiency of the proposed higher-order finite-element discretizations suggests that the performance of the finite-element basis is competitive with the plane-wave discretization for non-periodic local pseudopotential calculations, and compares to the Gaussian basis for all-electron calculations to within an order of magnitude. Further, we demonstrate the capability of the proposed approach to compute the electronic structure of a metallic system containing 1688 atoms using modest computational resources, and good scalability of the present implementation up to 192 processors.

  2. Localized basis functions and other computational improvements in variational nonorthogonal basis function methods for quantum mechanical scattering problems involving chemical reactions

    NASA Technical Reports Server (NTRS)

    Schwenke, David W.; Truhlar, Donald G.

    1990-01-01

    The Generalized Newton Variational Principle for 3D quantum mechanical reactive scattering is briefly reviewed. Then three techniques are described which improve the efficiency of the computations. First, the fact that the Hamiltonian is Hermitian is used to reduce the number of integrals computed, and then the properties of localized basis functions are exploited in order to eliminate redundant work in the integral evaluation. A new type of localized basis function with desirable properties is suggested. It is shown how partitioned matrices can be used with localized basis functions to reduce the amount of work required to handle the complex boundary conditions. The new techniques do not introduce any approximations into the calculations, so they may be used to obtain converged solutions of the Schroedinger equation.

  3. Performance of a reduced-order FSI model for flow-induced vocal fold vibration

    NASA Astrophysics Data System (ADS)

    Chang, Siyuan; Luo, Haoxiang; Luo's lab Team

    2016-11-01

    Vocal fold vibration during speech production involves a three-dimensional unsteady glottal jet flow and three-dimensional nonlinear tissue mechanics. A full 3D fluid-structure interaction (FSI) model is computationally expensive even though it provides the most accurate information about the system. On the other hand, an efficient reduced-order FSI model is useful for fast simulation and analysis of the vocal fold dynamics, which is often needed in procedures such as optimization and parameter estimation. In this work, we study the performance of a reduced-order model as compared with the corresponding full 3D model in terms of its accuracy in predicting the vibration frequency and deformation mode. In the reduced-order model, we use a 1D flow model coupled with a 3D tissue model. Two different hyperelastic tissue behaviors are assumed. In addition, the vocal fold thickness and subglottal pressure are varied for systematic comparison. The results show that the reduced-order model provides predictions consistent with the full 3D model across different tissue material assumptions and subglottal pressures. However, the vocal fold thickness has the greatest effect on the model accuracy, especially when the vocal fold is thin. Supported by the NSF.

  4. Attitude Determination Algorithm based on Relative Quaternion Geometry of Velocity Incremental Vectors for Cost Efficient AHRS Design

    NASA Astrophysics Data System (ADS)

    Lee, Byungjin; Lee, Young Jae; Sung, Sangkyung

    2018-05-01

    A novel attitude determination method is investigated that is computationally efficient and implementable on low-cost sensors and embedded platforms. A recent result on attitude reference system design is adapted to further develop a three-dimensional attitude determination algorithm based on relative velocity incremental measurements. For this, velocity incremental vectors, computed respectively from the INS and GPS at different update rates, are compared to generate the filter measurement for attitude estimation. In the quaternion-based Kalman filter configuration, an Euler-like attitude perturbation angle is introduced to reduce the filter states and simplify the propagation processes. Furthermore, assuming a small-angle approximation between attitude update periods, it is shown that the reduced-order filter greatly simplifies the propagation processes. For performance verification, both simulation and experimental studies are completed. A low-cost MEMS IMU and a GPS receiver are employed for system integration, and comparison with the true trajectory or a high-grade navigation system demonstrates the performance of the proposed algorithm.

  5. A High-Order Low-Order Algorithm with Exponentially Convergent Monte Carlo for Thermal Radiative Transfer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bolding, Simon R.; Cleveland, Mathew Allen; Morel, Jim E.

    In this paper, we have implemented a new high-order low-order (HOLO) algorithm for solving thermal radiative transfer problems. The low-order (LO) system is based on the spatial and angular moments of the transport equation and a linear-discontinuous finite-element spatial representation, producing equations similar to the standard S2 equations. The LO solver is fully implicit in time and efficiently resolves the nonlinear temperature dependence at each time step. The high-order (HO) solver utilizes exponentially convergent Monte Carlo (ECMC) to give a globally accurate solution for the angular intensity to a fixed-source pure-absorber transport problem. This global solution is used to compute consistency terms, which require the HO and LO solutions to converge toward the same solution. The use of ECMC allows for the efficient reduction of statistical noise in the Monte Carlo solution, reducing inaccuracies introduced through the LO consistency terms. Finally, we compare results with an implicit Monte Carlo code for one-dimensional gray test problems and demonstrate the efficiency of ECMC over standard Monte Carlo in this HOLO algorithm.

  6. Application of the MacCormack scheme to overland flow routing for high-spatial resolution distributed hydrological model

    NASA Astrophysics Data System (ADS)

    Zhang, Ling; Nan, Zhuotong; Liang, Xu; Xu, Yi; Hernández, Felipe; Li, Lianxia

    2018-03-01

    Although process-based distributed hydrological models (PDHMs) have been evolving rapidly over the last few decades, their extensive application is still challenged by computational expense. This study attempted, for the first time, to apply the numerically efficient MacCormack algorithm to overland flow routing in a representative high-spatial-resolution PDHM, the distributed hydrology-soil-vegetation model (DHSVM), in order to improve its computational efficiency. The analytical verification indicates that both the semi and full versions of the MacCormack scheme exhibit robust numerical stability and are more computationally efficient than the conventional explicit linear scheme. The full version outperforms the semi version in terms of simulation accuracy when the same time step is adopted. The semi-MacCormack scheme was implemented into DHSVM (version 3.1.2) to solve the kinematic wave equations for overland flow routing. The performance and practicality of the enhanced DHSVM-MacCormack model were assessed by performing two groups of modeling experiments in the Mercer Creek watershed, a small urban catchment near Bellevue, Washington. The experiments show that DHSVM-MacCormack can considerably improve the computational efficiency without compromising the simulation accuracy of the original DHSVM model. More specifically, with the same computational environment and model settings, the computational time required by DHSVM-MacCormack can be reduced to several dozen minutes for a simulation period of three months (in contrast to a day and a half for the original DHSVM model) without noticeable sacrifice of accuracy. The MacCormack scheme proves to be applicable to overland flow routing in DHSVM, which implies that it can be coupled into other PDHMs to either significantly improve their computational efficiency or make kinematic wave routing for high-resolution modeling computationally feasible.
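
    The MacCormack scheme itself is a two-stage predictor-corrector. A minimal sketch for the linear advection skeleton u_t + c*u_x = 0 on a periodic grid (DHSVM's routing solves the nonlinear kinematic wave equations; this shows only the scheme's structure, with illustrative grid and pulse):

        import numpy as np

        def maccormack_advect(u, c, dx, dt, steps):
            """MacCormack predictor-corrector for u_t + c*u_x = 0, periodic BCs."""
            lam = c * dt / dx                                # Courant number, |lam| <= 1
            for _ in range(steps):
                pred = u - lam * (np.roll(u, -1) - u)        # predictor: forward diff
                u = 0.5 * (u + pred - lam * (pred - np.roll(pred, 1)))  # corrector
            return u

        x = np.linspace(0.0, 1.0, 200, endpoint=False)
        u0 = np.exp(-200.0 * (x - 0.3) ** 2)                 # initial pulse
        u1 = maccormack_advect(u0, c=1.0, dx=x[1] - x[0], dt=0.4 * (x[1] - x[0]), steps=250)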

  7. Developing an eLearning tool formalizing in YAWL the guidelines used in a transfusion medicine service.

    PubMed

    Russo, Paola; Piazza, Miriam; Leonardi, Giorgio; Roncoroni, Layla; Russo, Carlo; Spadaro, Salvatore; Quaglini, Silvana

    2012-01-01

    Blood transfusion is a complex activity subject to a high risk of potentially fatal errors. The development and application of computer-based systems could help reduce the error rate, playing a fundamental role in improving the quality of care. This poster presents an eLearning tool, currently under development, that formalizes the guidelines of the transfusion process. The system, implemented in YAWL (Yet Another Workflow Language), will be used to train personnel in order to improve the efficiency of care and to reduce errors.

  8. Fast Computation of Solvation Free Energies with Molecular Density Functional Theory: Thermodynamic-Ensemble Partial Molar Volume Corrections.

    PubMed

    Sergiievskyi, Volodymyr P; Jeanmairet, Guillaume; Levesque, Maximilien; Borgis, Daniel

    2014-06-05

    Molecular density functional theory (MDFT) offers an efficient implicit-solvent method to estimate molecular solvation free energies while conserving a fully molecular representation of the solvent. Even within a second-order approximation for the free-energy functional, the so-called homogeneous reference fluid approximation, we show that the hydration free energies computed for a data set of 500 organic compounds are of similar quality to those obtained from molecular dynamics free-energy perturbation simulations, at a computational cost reduced by 2-3 orders of magnitude. This requires introducing the proper partial volume correction to transform the results from the grand canonical to the isobaric-isothermal ensemble that is pertinent to experiments. We show that this correction can be extended to 3D-RISM calculations, giving a sound theoretical justification to empirical partial molar volume corrections that have been proposed recently.

  9. A parallelization method for time periodic steady state in simulation of radio frequency sheath dynamics

    NASA Astrophysics Data System (ADS)

    Kwon, Deuk-Chul; Shin, Sung-Sik; Yu, Dong-Hun

    2017-10-01

    In order to reduce the computing time in simulations of radio frequency (rf) plasma sources, various numerical schemes have been developed. It is well known that the upwind, exponential, and power-law schemes can efficiently overcome the limitation on the grid size for fluid transport simulations of high-density plasma discharges. Also, the semi-implicit method is a well-known numerical scheme for overcoming the limitation on the simulation time step. However, despite remarkable advances in numerical techniques and computing power over the last few decades, efficient multi-dimensional modeling of low-temperature plasma discharges has remained a considerable challenge. In particular, time periodic steady state problems, such as capacitively coupled plasma discharges and rf sheath dynamics, have been difficult to parallelize in time because the plasma parameter values from the previous time step are used to calculate new values at each time step. Therefore, we present a parallelization method for time periodic steady state problems that uses period-slices. In order to evaluate the efficiency of the developed method, one-dimensional fluid simulations of rf sheath dynamics are conducted. The results show that speedup can be achieved by using a multithreading method.

  10. Certified dual-corrected radiation patterns of phased antenna arrays by offline–online order reduction of finite-element models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sommer, A., E-mail: a.sommer@lte.uni-saarland.de; Farle, O., E-mail: o.farle@lte.uni-saarland.de; Dyczij-Edlinger, R., E-mail: edlinger@lte.uni-saarland.de

    2015-10-15

    This paper presents a fast numerical method for computing certified far-field patterns of phased antenna arrays over broad frequency bands as well as wide ranges of steering and look angles. The proposed scheme combines finite-element analysis, dual-corrected model-order reduction, and empirical interpolation. To assure the reliability of the results, improved a posteriori error bounds for the radiated power and directive gain are derived. Both the reduced-order model and the error-bounds algorithm feature offline–online decomposition. A real-world example is provided to demonstrate the efficiency and accuracy of the suggested approach.

  11. ADM For Solving Linear Second-Order Fredholm Integro-Differential Equations

    NASA Astrophysics Data System (ADS)

    Karim, Mohd F.; Mohamad, Mahathir; Saifullah Rusiman, Mohd; Che-Him, Norziha; Roslan, Rozaini; Khalid, Kamil

    2018-04-01

    In this paper, we apply the Adomian Decomposition Method (ADM) to numerically analyse linear second-order Fredholm integro-differential equations. The approximate solutions of the problems are calculated using the Maple package. Some numerical examples are considered to illustrate the ADM for solving this class of equations, and the results are compared with the existing exact solutions. Thus, the Adomian decomposition method can be the best alternative method for solving linear second-order Fredholm integro-differential equations: it converges to the exact solution quickly and at the same time reduces the computational work. The results obtained by ADM show its capability and efficiency for solving these equations.

  12. Composite SAR imaging using sequential joint sparsity

    NASA Astrophysics Data System (ADS)

    Sanders, Toby; Gelb, Anne; Platte, Rodrigo B.

    2017-06-01

    This paper investigates accurate and efficient ℓ1 regularization methods for generating synthetic aperture radar (SAR) images. Although ℓ1 regularization algorithms are already employed in SAR imaging, practical and efficient implementation in terms of real-time imaging remains a challenge. Here we demonstrate that fast numerical operators can be used to robustly implement ℓ1 regularization methods that are as efficient as, or more efficient than, traditional approaches such as back projection, while providing superior image quality. In particular, we develop a sequential joint sparsity model for composite SAR imaging which naturally combines the joint sparsity methodology with composite SAR. Our technique, which can be implemented using standard, fractional, or higher-order total variation regularization, is able to reduce the effects of speckle and other noisy artifacts with little additional computational cost. Finally, we show that generalizing total variation regularization to non-integer and higher orders provides improved flexibility and robustness for SAR imaging.

  13. Complete set of invariants of a 4th order tensor: the 12 tasks of HARDI from ternary quartics.

    PubMed

    Papadopoulo, Théo; Ghosh, Aurobrata; Deriche, Rachid

    2014-01-01

    Invariants play a crucial role in Diffusion MRI. In DTI (2nd order tensors), invariant scalars (FA, MD) have been successfully used in clinical applications. But DTI has limitations and HARDI models (e.g. 4th order tensors) have been proposed instead. These, however, lack invariant features and computing them systematically is challenging. We present a simple and systematic method to compute a functionally complete set of invariants of a non-negative 3D 4th order tensor with respect to SO3. Intuitively, this transforms the tensor's non-unique ternary quartic (TQ) decomposition (from Hilbert's theorem) to a unique canonical representation independent of orientation - the invariants. The method consists of two steps. In the first, we reduce the 18 degrees-of-freedom (DOF) of a TQ representation by 3-DOFs via an orthogonal transformation. This transformation is designed to enhance a rotation-invariant property of choice of the 3D 4th order tensor. In the second, we further reduce 3-DOFs via a 3D rotation transformation of coordinates to arrive at a canonical set of invariants to SO3 of the tensor. The resulting invariants are, by construction, (i) functionally complete, (ii) functionally irreducible (if desired), (iii) computationally efficient and (iv) reversible (mappable to the TQ coefficients or shape); which is the novelty of our contribution in comparison to prior work. Results from synthetic and real data experiments validate the method and indicate its importance.

  14. SmartVeh: Secure and Efficient Message Access Control and Authentication for Vehicular Cloud Computing.

    PubMed

    Huang, Qinlong; Yang, Yixian; Shi, Yuxiang

    2018-02-24

    With the growing number of vehicles and popularity of various services in vehicular cloud computing (VCC), message exchanging among vehicles under traffic conditions and in emergency situations is one of the most pressing demands, and has attracted significant attention. However, it is an important challenge to authenticate the legitimate sources of broadcast messages and achieve fine-grained message access control. In this work, we propose SmartVeh, a secure and efficient message access control and authentication scheme in VCC. A hierarchical, attribute-based encryption technique is utilized to achieve fine-grained and flexible message sharing, which ensures that vehicles whose persistent or dynamic attributes satisfy the access policies can access the broadcast message with equipped on-board units (OBUs). Message authentication is enforced by integrating an attribute-based signature, which achieves message authentication and maintains the anonymity of the vehicles. In order to reduce the computations of the OBUs in the vehicles, we outsource the heavy computations of encryption, decryption and signing to a cloud server and road-side units. The theoretical analysis and simulation results reveal that our secure and efficient scheme is suitable for VCC.

  15. SmartVeh: Secure and Efficient Message Access Control and Authentication for Vehicular Cloud Computing

    PubMed Central

    Yang, Yixian; Shi, Yuxiang

    2018-01-01

    With the growing number of vehicles and popularity of various services in vehicular cloud computing (VCC), message exchanging among vehicles under traffic conditions and in emergency situations is one of the most pressing demands, and has attracted significant attention. However, it is an important challenge to authenticate the legitimate sources of broadcast messages and achieve fine-grained message access control. In this work, we propose SmartVeh, a secure and efficient message access control and authentication scheme in VCC. A hierarchical, attribute-based encryption technique is utilized to achieve fine-grained and flexible message sharing, which ensures that vehicles whose persistent or dynamic attributes satisfy the access policies can access the broadcast message with equipped on-board units (OBUs). Message authentication is enforced by integrating an attribute-based signature, which achieves message authentication and maintains the anonymity of the vehicles. In order to reduce the computations of the OBUs in the vehicles, we outsource the heavy computations of encryption, decryption and signing to a cloud server and road-side units. The theoretical analysis and simulation results reveal that our secure and efficient scheme is suitable for VCC. PMID:29495269

  16. More efficient optimization of long-term water supply portfolios

    NASA Astrophysics Data System (ADS)

    Kirsch, Brian R.; Characklis, Gregory W.; Dillard, Karen E. M.; Kelley, C. T.

    2009-03-01

    The use of temporary transfers, such as options and leases, has grown as utilities attempt to meet increases in demand while reducing dependence on the expansion of costly infrastructure capacity (e.g., reservoirs). Earlier work has been done to construct optimal portfolios comprising firm capacity and transfers, using decision rules that determine the timing and volume of transfers. However, such work has only focused on the short-term (e.g., 1-year scenarios), which limits the utility of these planning efforts. Developing multiyear portfolios can lead to the exploration of a wider range of alternatives but also increases the computational burden. This work utilizes a coupled hydrologic-economic model to simulate the long-term performance of a city's water supply portfolio. This stochastic model is linked with an optimization search algorithm that is designed to handle the high-frequency, low-amplitude noise inherent in many simulations, particularly those involving expected values. This noise is detrimental to the accuracy and precision of the optimized solution and has traditionally been controlled by investing greater computational effort in the simulation. However, the increased computational effort can be substantial. This work describes the integration of a variance reduction technique (control variate method) within the simulation/optimization as a means of more efficiently identifying minimum cost portfolios. Random variation in model output (i.e., noise) is moderated using knowledge of random variations in stochastic input variables (e.g., reservoir inflows, demand), thereby reducing the computing time by 50% or more. Using these efficiency gains, water supply portfolios are evaluated over a 10-year period in order to assess their ability to reduce costs and adapt to demand growth, while still meeting reliability goals. As a part of the evaluation, several multiyear option contract structures are explored and compared.
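
    The control variate idea at the heart of this speedup is simple to sketch: correct the Monte Carlo estimate of the output using an input whose mean is known. The toy quantities below, an exponential "inflow" with known mean 1 and a capped "cost", are illustrative stand-ins for the coupled hydrologic-economic model, not its actual variables:

        import numpy as np

        rng = np.random.default_rng(1)
        x = rng.exponential(1.0, 100_000)        # stochastic input with known E[x] = 1
        f = np.minimum(x, 2.0)                   # toy "portfolio cost" output
        beta = np.cov(f, x)[0, 1] / x.var()      # near-optimal control-variate weight
        plain = f.mean()                         # standard Monte Carlo estimate
        cv = plain - beta * (x.mean() - 1.0)     # corrected estimate, lower variance
        print(plain, cv)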

  17. Progress of High Efficiency Centrifugal Compressor Simulations Using TURBO

    NASA Technical Reports Server (NTRS)

    Kulkarni, Sameer; Beach, Timothy A.

    2017-01-01

    Three-dimensional, time-accurate, and phase-lagged computational fluid dynamics (CFD) simulations of the High Efficiency Centrifugal Compressor (HECC) stage were generated using the TURBO solver. Changes to the TURBO Parallel Version 4 source code were made in order to properly model the no-slip boundary condition along the spinning hub region for centrifugal impellers. A startup procedure was developed to generate a converged flow field in TURBO. This procedure initialized computations on a coarsened mesh generated by the Turbomachinery Gridding System (TGS) and relied on a method of systematically increasing wheel speed and backpressure. Baseline design-speed TURBO results generally overpredicted total pressure ratio, adiabatic efficiency, and the choking flow rate of the HECC stage as compared with the design-intent CFD results of Code Leo. Including diffuser fillet geometry in the TURBO computation resulted in a 0.6 percent reduction in the choking flow rate and led to a better match with design-intent CFD. Diffuser fillets reduced annulus cross-sectional area but also reduced corner separation, and thus blockage, in the diffuser passage. It was found that the TURBO computations are somewhat insensitive to inlet total pressure changing from the TURBO default inlet pressure of 14.7 pounds per square inch (101.35 kilopascals) down to 11.0 pounds per square inch (75.83 kilopascals), the inlet pressure of the component test. Off-design tip clearance was modeled in TURBO in two computations: one in which the blade tip geometry was trimmed by 12 mils (0.3048 millimeters), and another in which the hub flow path was moved to reflect a 12-mil axial shift in the impeller hub, creating a step at the hub. The one-dimensional results of these two computations indicate non-negligible differences between the two modeling approaches.

  18. Efficient parallelization of analytic bond-order potentials for large-scale atomistic simulations

    NASA Astrophysics Data System (ADS)

    Teijeiro, C.; Hammerschmidt, T.; Drautz, R.; Sutmann, G.

    2016-07-01

    Analytic bond-order potentials (BOPs) provide a way to compute atomistic properties with controllable accuracy. For large-scale computations of heterogeneous compounds at the atomistic level, both the computational efficiency and memory demand of BOP implementations have to be optimized. Since the evaluation of BOPs is a local operation within a finite environment, the parallelization concepts known from short-range interacting particle simulations can be applied to improve the performance of these simulations. In this work, several efficient parallelization methods for BOPs that use three-dimensional domain decomposition schemes are described. The schemes are implemented into the bond-order potential code BOPfox, and their performance is measured in a series of benchmarks. Systems of up to several millions of atoms are simulated on a high performance computing system, and parallel scaling is demonstrated for up to thousands of processors.

  19. Motion Planning of Two Stacker Cranes in a Large-Scale Automated Storage/Retrieval System

    NASA Astrophysics Data System (ADS)

    Kung, Yiheng; Kobayashi, Yoshimasa; Higashi, Toshimitsu; Ota, Jun

    We propose a method for reducing the computational time of motion planning for stacker cranes. Most automated storage/retrieval systems (AS/RSs) are equipped with only one stacker crane. However, this is logistically limiting, and greater work efficiency in warehouses, such as those using two stacker cranes, is required. In this paper, a warehouse with two stacker cranes working simultaneously is considered. Unlike warehouses with only one crane, trajectory planning with two cranes is very difficult: since the two cranes work together, a proper trajectory must be planned to avoid collision. However, verifying collisions is complicated and requires a considerable amount of computational time. As transport jobs in AS/RSs occur randomly, motion planning cannot be conducted in advance, so planning an appropriate trajectory within a restricted duration is a difficult task. We thereby address the problem of motion planning requiring extensive calculation time. As a solution, we propose a "free-step" to simplify the procedure of collision verification and reduce the computational time. We also propose a method to reschedule the order of collision verification so that an appropriate trajectory can be found in less time. With the proposed method, the calculation time is reduced to less than 1/300 of that achieved in previous research.

  20. A new adaptively central-upwind sixth-order WENO scheme

    NASA Astrophysics Data System (ADS)

    Huang, Cong; Chen, Li Li

    2018-03-01

    In this paper, we propose a new sixth-order WENO scheme for solving one-dimensional hyperbolic conservation laws. The new WENO reconstruction has three properties: (1) it is central in smooth regions for low dissipation, and upwind near discontinuities for numerical stability; (2) it is a convex combination of four linear reconstructions, of which one is sixth order and the others are third order; (3) its linear weights can be any positive numbers, with the requirement that their sum equal one. Furthermore, we propose a simple smoothness indicator for the sixth-order linear reconstruction; this smoothness indicator not only distinguishes smooth regions from discontinuities exactly, but also reduces the computational cost, making it more efficient than the classical one.

  1. An Improved Clustering Algorithm of Tunnel Monitoring Data for Cloud Computing

    PubMed Central

    Zhong, Luo; Tang, KunHao; Li, Lin; Yang, Guang; Ye, JingJing

    2014-01-01

    With the rapid development of urban construction, the number of urban tunnels is increasing and the data they produce are becoming more and more complex. As a result, traditional clustering algorithms cannot handle the mass of tunnel monitoring data. To solve this problem, an improved parallel clustering algorithm based on k-means has been proposed. It is a clustering algorithm that uses MapReduce within cloud computing to process the data. It not only has the advantage of handling mass data but is also more efficient. Moreover, it is able to compute the average dissimilarity degree of each cluster in order to clean out abnormal data. PMID:24982971
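
    The map/reduce split in a parallel k-means iteration is straightforward to sketch on a single machine: the "map" phase labels each point with its nearest centroid and the "reduce" phase averages each cluster. This illustrative numpy version omits the Hadoop distribution and the dissimilarity-based cleaning step, and the data are synthetic:

        import numpy as np

        def kmeans_step(points, centroids):
            """One map/reduce-style k-means iteration."""
            # "Map": label each point with its nearest centroid.
            d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
            labels = d.argmin(axis=1)
            # "Reduce": average the points assigned to each centroid.
            new_centroids = np.array([
                points[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
                for k in range(len(centroids))
            ])
            return new_centroids, labels

        rng = np.random.default_rng(0)
        pts = rng.random((1000, 2))
        cents, labels = kmeans_step(pts, pts[:3].copy())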

  2. A vector radiative transfer model for coupled atmosphere and ocean systems based on successive order of scattering method.

    PubMed

    Zhai, Peng-Wang; Hu, Yongxiang; Trepte, Charles R; Lucker, Patricia L

    2009-02-16

    A vector radiative transfer model has been developed for coupled atmosphere and ocean systems based on the Successive Order of Scattering (SOS) method. The emphasis of this study is to make the model easy to use and computationally efficient. The model provides the full Stokes vector at arbitrary locations which can be conveniently specified by users. It is capable of tracking and labeling different sources of the photons that are measured, e.g. water-leaving radiance and reflected sky light. The model also has the capability to separate fluorescence from multiply scattered sunlight. The delta-fit technique has been adopted to reduce the computational time associated with strongly forward-peaked scattering phase matrices. The exponential-linear approximation has been used to reduce the number of discretized vertical layers while maintaining accuracy. This model is developed to serve the remote sensing community in harvesting physical parameters from multi-platform, multi-sensor measurements that target different components of the atmosphere-ocean system.

  3. Stabilization Approaches for Linear and Nonlinear Reduced Order Models

    NASA Astrophysics Data System (ADS)

    Rezaian, Elnaz; Wei, Mingjun

    2017-11-01

    It has been a major concern to establish reduced order models (ROMs) as reliable representatives of the dynamics inherent in high-fidelity simulations while achieving fast computation. In practice, this comes down to the stability and accuracy of the ROMs. Given the inviscid nature of the Euler equations, stability becomes more challenging to achieve, especially where moving discontinuities exist. Originally unstable linear and nonlinear ROMs are stabilized here by two approaches. First, a hybrid method is developed by integrating two different stabilization algorithms; at the same time, the symmetry inner product is introduced in the generation of the ROMs for its known robust behavior for compressible flows. Results have shown a notable improvement in computational efficiency and robustness compared to similar approaches. Second, a new stabilization algorithm is developed specifically for nonlinear ROMs. This method adopts Particle Swarm Optimization to enforce a bounded ROM response with minimum discrepancy between the high-fidelity simulation and the ROM outputs. Promising results are obtained in its application to the nonlinear ROM of an inviscid fluid flow with discontinuities. Supported by ARL.

  4. Robust simulation of buckled structures using reduced order modeling

    NASA Astrophysics Data System (ADS)

    Wiebe, R.; Perez, R. A.; Spottswood, S. M.

    2016-09-01

    Lightweight metallic structures are a mainstay in aerospace engineering. For these structures, stability, rather than strength, is often the critical limit state in design. For example, buckling of panels and stiffeners may occur during emergency high-g maneuvers, while in supersonic and hypersonic aircraft it may be induced by thermal stresses. The longstanding solution to such challenges was to increase the sizing of the structural members, which is counter to the ever-present need to minimize weight for reasons of efficiency and performance. In this work we present some recent results in the area of reduced order modeling of post-buckled thin beams. A thorough parametric study of the response of a beam to changing harmonic loading parameters, which is useful in exposing complex phenomena and exercising numerical models, is presented. Two error metrics that use, but require no time stepping of, a (computationally expensive) truth model are also introduced. The error metrics are applied to several interesting forcing parameter cases identified from the parametric study and are shown to yield useful information about the quality of a candidate reduced order model. Parametric studies, especially when considering forcing and structural geometry parameters, coupled environments, and uncertainties, would be computationally intractable with finite element models. The goal is to make rapid simulation of complex nonlinear dynamic behavior possible for distributed systems via fast and accurate reduced order models. This ability is crucial in allowing designers to rigorously probe the robustness of their designs to account for variations in loading, structural imperfections, and other uncertainties.

  5. Self-consistent clustering analysis: an efficient multiscale scheme for inelastic heterogeneous materials

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Z.; Bessa, M. A.; Liu, W.K.

    A predictive computational theory is presented for modeling complex, hierarchical materials ranging from metal alloys to polymer nanocomposites. The theory can capture complex mechanisms such as plasticity and failure that span multiple length scales. This general multiscale material modeling theory relies on sound principles of mathematics and mechanics, and a cutting-edge reduced order modeling method named self-consistent clustering analysis (SCA) [Zeliang Liu, M.A. Bessa, Wing Kam Liu, “Self-consistent clustering analysis: An efficient multi-scale scheme for inelastic heterogeneous materials,” Comput. Methods Appl. Mech. Engrg. 306 (2016) 319–341]. SCA reduces by several orders of magnitude the computational cost of micromechanical and concurrent multiscale simulations, while retaining the microstructure information. This remarkable increase in efficiency is achieved with a data-driven clustering method. Computationally expensive operations are performed in the so-called offline stage, where degrees of freedom (DOFs) are agglomerated into clusters and the interaction tensor of these clusters is computed. In the online or predictive stage, the Lippmann-Schwinger integral equation is solved cluster-wise using a self-consistent scheme to ensure solution accuracy and avoid path dependence. To construct a concurrent multiscale model, this scheme is applied at each material point in a macroscale structure, replacing a conventional constitutive model with the average response computed from the microscale model using just the SCA online stage. A regularized damage theory is incorporated in the microscale that avoids the mesh and RVE size dependence that commonly plagues microscale damage calculations. The SCA method is illustrated with two cases: a carbon fiber reinforced polymer (CFRP) structure with the concurrent multiscale model, and an application to fatigue prediction for additively manufactured metals. For the CFRP problem, a speed-up estimated at about 43,000 is achieved by using the SCA method as opposed to FE2, enabling the solution of an otherwise computationally intractable problem. The second example uses a crystal plasticity constitutive law and computes the fatigue potency of extrinsic microscale features such as voids, showing that local stress and strain are captured sufficiently well by SCA. This model has been incorporated in a process-structure-properties prediction framework for process design in additive manufacturing.
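    The offline agglomeration step can be sketched with an off-the-shelf clustering of per-point response data; this is a rough illustration assuming k-means on rows of elastic strain-concentration data (random stand-ins here), not the authors' exact pipeline:

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    # One row per material point: its (stand-in) strain-concentration entries.
    A = rng.normal(size=(10000, 6))
    labels = KMeans(n_clusters=16, n_init=10, random_state=0).fit_predict(A)
    # Online, the Lippmann-Schwinger equation is then solved cluster-wise:
    # 10000 material points have been reduced to 16 unknown strain states.
    ```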

  6. Simulation and optimization of pressure swing adsorption systems using reduced-order modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Agarwal, A.; Biegler, L.; Zitney, S.

    2009-01-01

    Over the past three decades, pressure swing adsorption (PSA) processes have been widely used as energy-efficient gas separation techniques, especially for high-purity hydrogen purification from refinery gases. Models for PSA processes are multiple instances of partial differential equations (PDEs) in time and space with periodic boundary conditions that link the processing steps together. The solution of this coupled stiff PDE system is governed by steep fronts moving with time. As a result, the optimization of such systems represents a significant computational challenge to current differential algebraic equation (DAE) optimization techniques and nonlinear programming algorithms. Model reduction is one approach to generate cost-efficient low-order models, which can be used as surrogate models in the optimization problems. This study develops a reduced-order model (ROM) based on proper orthogonal decomposition (POD), which is a low-dimensional approximation to a dynamic PDE-based model. The proposed method leads to a DAE system of significantly lower order, thus replacing the one obtained from spatial discretization and making the optimization problem computationally efficient. The method has been applied to the dynamic coupled PDE-based model of a two-bed four-step PSA process for the separation of hydrogen from methane. Separate ROMs have been developed for each operating step, with different POD modes for each of them. A significant reduction in the number of state variables has been achieved. The reduced-order model has been successfully used to maximize hydrogen recovery by manipulating operating pressures, step times, and feed and regeneration velocities, while meeting product purity and tight bounds on these parameters. Current results indicate that the proposed ROM methodology is a promising surrogate modeling technique for cost-effective optimization purposes.
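    The core of a POD-based ROM is a truncated SVD of a snapshot matrix; a minimal sketch, where the snapshot data and energy criterion are generic placeholders rather than the PSA model's actual states:

    ```python
    import numpy as np

    def pod_basis(snapshots, energy=0.9999):
        """Columns of `snapshots` are spatial states at successive times;
        keep the fewest left singular vectors capturing `energy` of the
        squared singular-value spectrum."""
        U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
        frac = np.cumsum(s ** 2) / np.sum(s ** 2)
        r = int(np.searchsorted(frac, energy)) + 1
        return U[:, :r], r

    # A Galerkin projection then evolves r coefficients a(t) with
    # state ~ U_r @ a(t), replacing the large spatially discretized DAE system.
    ```

    Building one such basis per operating step, as the study does, keeps each reduced model small while respecting the distinct front dynamics of each step.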

  7. Parallel discontinuous Galerkin FEM for computing hyperbolic conservation law on unstructured grids

    NASA Astrophysics Data System (ADS)

    Ma, Xinrong; Duan, Zhijian

    2018-04-01

    High-order discontinuous Galerkin finite element methods (DGFEM) are known to be effective for solving the Euler and Navier-Stokes equations on unstructured grids, but they consume considerable computational resources. An efficient parallel algorithm is presented for solving the compressible Euler equations. Moreover, a multigrid strategy based on a three-stage, third-order TVD Runge-Kutta scheme is used to improve the computational efficiency of DGFEM and accelerate the convergence of the solution of the unsteady compressible Euler equations. To keep each processor load-balanced, a domain decomposition method is employed. Numerical experiments were performed for inviscid transonic flows around the NACA0012 airfoil and the M6 wing. The results indicate that the parallel algorithm improves speed-up and efficiency significantly and is well suited to computing complex flows.

  8. Reducing the computational footprint for real-time BCPNN learning

    PubMed Central

    Vogginger, Bernhard; Schüffny, René; Lansner, Anders; Cederström, Love; Partzsch, Johannes; Höppner, Sebastian

    2015-01-01

    The implementation of synaptic plasticity in neural simulation or neuromorphic hardware is usually very resource-intensive, often requiring a compromise between efficiency and flexibility. A versatile, but computationally expensive, plasticity mechanism is provided by the Bayesian Confidence Propagation Neural Network (BCPNN) paradigm. Building upon Bayesian statistics, and having clear links to biological plasticity processes, the BCPNN learning rule has been applied in many fields, ranging from data classification, associative memory, reward-based learning, and probabilistic inference to cortical attractor memory networks. In the spike-based version of this learning rule the pre-, postsynaptic, and coincident activity is traced in three low-pass-filtering stages, requiring a total of eight state variables, whose dynamics are typically simulated with the fixed step size Euler method. We derive analytic solutions allowing an efficient event-driven implementation of this learning rule. Further speedup is achieved first by rewriting the model, which reduces the number of basic arithmetic operations per update by one half, and second by using look-up tables for the frequently calculated exponential decay. Ultimately, in a typical use case, the simulation using our approach is more than one order of magnitude faster than with the fixed step size Euler method. Aiming for a small memory footprint per BCPNN synapse, we also evaluate the use of fixed-point numbers for the state variables and assess the number of bits required to achieve the same or better accuracy than with the conventional explicit Euler method. All of this will allow a real-time simulation of a reduced cortex model based on BCPNN in high performance computing. More importantly, with the analytic solution at hand and due to the reduced memory bandwidth, the learning rule can be efficiently implemented in dedicated or existing digital neuromorphic hardware. PMID:25657618
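    The heart of the analytic event-driven update is exact exponential decay applied only when a spike arrives, instead of Euler-stepping every trace at every timestep. A minimal one-trace sketch, assuming a plain exponential low-pass stage; the class name and unit increment are illustrative, and the actual rule chains three such stages across eight state variables:

    ```python
    import numpy as np

    class ExpTrace:
        """Event-driven low-pass trace: dz/dt = -z/tau is integrated
        analytically between events rather than stepped with Euler."""
        def __init__(self, tau):
            self.tau, self.z, self.t_last = tau, 0.0, 0.0

        def on_spike(self, t, increment=1.0):
            self.z *= np.exp(-(t - self.t_last) / self.tau)  # exact decay
            self.z += increment
            self.t_last = t
            return self.z
    ```

    The exp() call is exactly the quantity the authors replace with a look-up table, and the state z is what they store in fixed point to shrink the per-synapse footprint.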

  9. Reducing the computational footprint for real-time BCPNN learning.

    PubMed

    Vogginger, Bernhard; Schüffny, René; Lansner, Anders; Cederström, Love; Partzsch, Johannes; Höppner, Sebastian

    2015-01-01

    The implementation of synaptic plasticity in neural simulation or neuromorphic hardware is usually very resource-intensive, often requiring a compromise between efficiency and flexibility. A versatile, but computationally expensive, plasticity mechanism is provided by the Bayesian Confidence Propagation Neural Network (BCPNN) paradigm. Building upon Bayesian statistics, and having clear links to biological plasticity processes, the BCPNN learning rule has been applied in many fields, ranging from data classification, associative memory, reward-based learning, and probabilistic inference to cortical attractor memory networks. In the spike-based version of this learning rule the pre-, postsynaptic, and coincident activity is traced in three low-pass-filtering stages, requiring a total of eight state variables, whose dynamics are typically simulated with the fixed step size Euler method. We derive analytic solutions allowing an efficient event-driven implementation of this learning rule. Further speedup is achieved first by rewriting the model, which reduces the number of basic arithmetic operations per update by one half, and second by using look-up tables for the frequently calculated exponential decay. Ultimately, in a typical use case, the simulation using our approach is more than one order of magnitude faster than with the fixed step size Euler method. Aiming for a small memory footprint per BCPNN synapse, we also evaluate the use of fixed-point numbers for the state variables and assess the number of bits required to achieve the same or better accuracy than with the conventional explicit Euler method. All of this will allow a real-time simulation of a reduced cortex model based on BCPNN in high performance computing. More importantly, with the analytic solution at hand and due to the reduced memory bandwidth, the learning rule can be efficiently implemented in dedicated or existing digital neuromorphic hardware.

  10. Reverse time migration by Krylov subspace reduced order modeling

    NASA Astrophysics Data System (ADS)

    Basir, Hadi Mahdavi; Javaherian, Abdolrahim; Shomali, Zaher Hossein; Firouz-Abadi, Roohollah Dehghani; Gholamy, Shaban Ali

    2018-04-01

    Imaging is a key step in seismic data processing. To date, a myriad of advanced pre-stack depth migration approaches have been developed; however, reverse time migration (RTM) is still considered the high-end imaging algorithm. The main performance costs of reverse time migration are the intensive computation of the forward and backward simulations, the time consumption, and the memory allocation related to the imaging condition. Based on reduced order modeling, we propose an algorithm that addresses all of the aforementioned factors. Our method uses the Krylov subspace method to compute certain mode shapes of the velocity model, which serve as an orthogonal basis for the reduced order model. Reverse time migration by reduced order modeling lends itself to highly parallel computation and strongly reduces the memory requirement of reverse time migration. The synthetic model results showed that the suggested method can decrease the computational costs of reverse time migration by several orders of magnitude compared with reverse time migration by the finite element method.
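    The offline step, computing a few mode shapes as an orthogonal reduced basis, can be sketched with a sparse Lanczos eigensolver. This is an illustration only: a 1D Laplacian stands in for the discretized wave operator, and the mode count is arbitrary.

    ```python
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    n = 2000
    # Stand-in symmetric positive definite operator (1D Laplacian stencil).
    L = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csc")
    # Lanczos (ARPACK, shift-invert about 0) returns the lowest 30 mode shapes
    # as an orthogonal basis V.
    vals, V = spla.eigsh(L, k=30, sigma=0)
    # Forward and backward wavefields are then advanced as u ~ V @ a(t),
    # which is what cuts both the computation and the imaging-condition memory.
    ```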

  11. Reducing Bottlenecks to Improve the Efficiency of the Lung Cancer Care Delivery Process: A Process Engineering Modeling Approach to Patient-Centered Care.

    PubMed

    Ju, Feng; Lee, Hyo Kyung; Yu, Xinhua; Faris, Nicholas R; Rugless, Fedoria; Jiang, Shan; Li, Jingshan; Osarogiagbon, Raymond U

    2017-12-01

    The process of lung cancer care from initial lesion detection to treatment is complex, involving multiple steps, each introducing the potential for substantial delays. Identifying the steps with the greatest delays enables a focused effort to improve the timeliness of care delivery without sacrificing quality. We retrospectively reviewed clinical events from initial detection, through histologic diagnosis, radiologic and invasive staging, and medical clearance, to surgery, for all patients who had an attempted resection of a suspected lung cancer in a community healthcare system. We used a computer process modeling approach to evaluate delays in care delivery in order to identify potential 'bottlenecks' in waiting time, the reduction of which could produce greater care efficiency. We also conducted 'what-if' analyses to predict the relative impact of simulated changes in the care delivery process, to determine the most efficient pathways to surgery. The waiting time between radiologic lesion detection and diagnostic biopsy and the waiting time from radiologic staging to surgery were the two most critical bottlenecks impeding efficient care delivery (more than 3 times larger than the other waiting times). Additionally, instituting surgical consultation prior to cardiac consultation for medical clearance, and decreasing the waiting time between CT scans and diagnostic biopsies, were potentially the most impactful measures to reduce care delays before surgery. Rigorous computer simulation modeling, using clinical data, can provide useful information for identifying areas in which process engineering can improve the efficiency of care delivery for patients who receive surgery for lung cancer.

  12. Computationally efficient methods for modelling laser wakefield acceleration in the blowout regime

    NASA Astrophysics Data System (ADS)

    Cowan, B. M.; Kalmykov, S. Y.; Beck, A.; Davoine, X.; Bunkers, K.; Lifschitz, A. F.; Lefebvre, E.; Bruhwiler, D. L.; Shadwick, B. A.; Umstadter, D. P.

    2012-08-01

    Electron self-injection and acceleration until dephasing in the blowout regime is studied for a set of initial conditions typical of recent experiments with 100-terawatt-class lasers. Two different approaches to computationally efficient, fully explicit, 3D particle-in-cell modelling are examined. First, the Cartesian code vorpal (Nieter, C. and Cary, J. R. 2004 VORPAL: a versatile plasma simulation code. J. Comput. Phys. 196, 538) using a perfect-dispersion electromagnetic solver precisely describes the laser pulse and bubble dynamics, taking advantage of coarser resolution in the propagation direction, with a proportionally larger time step. Using third-order splines for macroparticles helps suppress the sampling noise while keeping the usage of computational resources modest. The second way to reduce the simulation load is using reduced-geometry codes. In our case, the quasi-cylindrical code calder-circ (Lifschitz, A. F. et al. 2009 Particle-in-cell modelling of laser-plasma interaction using Fourier decomposition. J. Comput. Phys. 228(5), 1803-1814) uses decomposition of fields and currents into a set of poloidal modes, while the macroparticles move in the Cartesian 3D space. Cylindrical symmetry of the interaction allows using just two modes, reducing the computational load to roughly that of a planar Cartesian simulation while preserving the 3D nature of the interaction. This significant economy of resources allows using fine resolution in the direction of propagation and a small time step, making numerical dispersion vanishingly small, together with a large number of particles per cell, enabling good particle statistics. Quantitative agreement of two simulations indicates that these are free of numerical artefacts. Both approaches thus retrieve the physically correct evolution of the plasma bubble, recovering the intrinsic connection of electron self-injection to the nonlinear optical evolution of the driver.

  13. Identification of Computational and Experimental Reduced-Order Models

    NASA Technical Reports Server (NTRS)

    Silva, Walter A.; Hong, Moeljo S.; Bartels, Robert E.; Piatak, David J.; Scott, Robert C.

    2003-01-01

    The identification of computational and experimental reduced-order models (ROMs) for the analysis of unsteady aerodynamic responses and for efficient aeroelastic analyses is presented. For the identification of a computational aeroelastic ROM, the CFL3Dv6.0 computational fluid dynamics (CFD) code is used. Flutter results for the AGARD 445.6 Wing and for a Rigid Semispan Model (RSM) computed using CFL3Dv6.0 are presented, including discussion of associated computational costs. Modal impulse responses of the unsteady aerodynamic system are computed using the CFL3Dv6.0 code and transformed into state-space form. The unsteady aerodynamic state-space ROM is then combined with a state-space model of the structure to create an aeroelastic simulation using the MATLAB/SIMULINK environment. The MATLAB/SIMULINK ROM is then used to rapidly compute aeroelastic transients, including flutter. The ROM shows excellent agreement with the aeroelastic analyses computed using the CFL3Dv6.0 code directly. For the identification of experimental unsteady pressure ROMs, results are presented for two configurations: the RSM and a Benchmark Supercritical Wing (BSCW). Both models were used to acquire unsteady pressure data due to pitching oscillations on the Oscillating Turntable (OTT) system at the Transonic Dynamics Tunnel (TDT). A deconvolution scheme involving a step input in pitch and the resultant step response in pressure, for several pressure transducers, is used to identify the unsteady pressure impulse responses. The identified impulse responses are then used to predict the pressure responses due to pitching oscillations at several frequencies. Comparisons with the experimental data are then presented.

  14. Quantification of uncertainties in the tsunami hazard for Cascadia using statistical emulation

    NASA Astrophysics Data System (ADS)

    Guillas, S.; Day, S. J.; Joakim, B.

    2016-12-01

    We present new high-resolution tsunami wave propagation and coastal inundation simulations for the Cascadia region in the Pacific Northwest. The coseismic representation in this analysis is novel, and more realistic than in previous studies, as we jointly parametrize multiple aspects of the seabed deformation. Due to the large computational cost of such simulators, statistical emulation is required to carry out uncertainty quantification tasks, since emulators efficiently approximate simulators. The emulator replaces the tsunami model VOLNA with a fast surrogate, so we are able to efficiently propagate uncertainties from the source characteristics to wave heights and thus probabilistically assess tsunami hazard for Cascadia. We employ a new method for the design of the computer experiments that reduces the number of runs while maintaining good approximation properties of the emulator. Out of the initial nine parameters, mostly describing the geometry and time variation of the seabed deformation, we drop two because they turn out to have no influence on the resulting tsunami waves at the coast. We model the impact of another parameter linearly, as its influence on the wave heights is identified as linear. We combine this screening approach with the sequential design algorithm MICE (Mutual Information for Computer Experiments), which adaptively selects the input values at which to run the computer simulator so as to maximize the expected information gain (mutual information) over the input space. As a result, the emulation is made possible and accurate. Starting from distributions of the source parameters that encapsulate geophysical knowledge of the possible source characteristics, we derive distributions of the tsunami wave heights along the coastline.
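    The emulation step can be sketched with a standard Gaussian-process regressor: fit a small design of simulator runs, then predict (with uncertainty) at unseen source parameters. The stand-in function, design size, and seven retained parameters below are illustrative assumptions; the study's emulator is for the VOLNA simulator with a MICE-chosen design.

    ```python
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    rng = np.random.default_rng(1)
    X = rng.uniform(0.0, 1.0, size=(40, 7))        # 7 retained source parameters
    y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1]        # hypothetical simulator output
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=np.ones(7)),
                                  normalize_y=True).fit(X, y)
    mean, std = gp.predict(rng.uniform(0.0, 1.0, size=(5, 7)), return_std=True)
    # `std` is what a sequential design criterion (such as MICE) exploits to
    # pick the next simulator run with the greatest expected information gain.
    ```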

  15. Performance of a reduced-order FSI model for flow-induced vocal fold vibration

    NASA Astrophysics Data System (ADS)

    Luo, Haoxiang; Chang, Siyuan; Chen, Ye; Rousseau, Bernard; PhonoSim Team

    2017-11-01

    Vocal fold vibration during speech production involves a three-dimensional unsteady glottal jet flow and three-dimensional nonlinear tissue mechanics. A full 3D fluid-structure interaction (FSI) model is computationally expensive, even though it provides the most accurate information about the system. On the other hand, an efficient reduced-order FSI model is useful for fast simulation and analysis of the vocal fold dynamics, which can be applied in procedures such as optimization and parameter estimation. In this work, we study the performance of a reduced-order model as compared with the corresponding full 3D model in terms of its accuracy in predicting the vibration frequency and deformation mode. In the reduced-order model, we use a 1D flow model coupled with a 3D tissue model that is the same as in the full 3D model. Two different hyperelastic tissue behaviors are assumed. In addition, the vocal fold thickness and subglottal pressure are varied for systematic comparison. The results show that the reduced-order model provides predictions consistent with the full 3D model across different tissue material assumptions and subglottal pressures. However, the vocal fold thickness has the most effect on the model accuracy, especially when the vocal fold is thin.

  16. Total variation-based neutron computed tomography

    NASA Astrophysics Data System (ADS)

    Barnard, Richard C.; Bilheux, Hassina; Toops, Todd; Nafziger, Eric; Finney, Charles; Splitter, Derek; Archibald, Rick

    2018-05-01

    We solve the neutron computed tomography reconstruction problem via an inverse problem formulation with a total variation penalty. In the case of highly under-resolved angular measurements, the total variation penalty suppresses the high-frequency artifacts which appear in filtered back projections. In order to compute solutions for this problem efficiently, we implement a variation of the split Bregman algorithm; due to the error-forgetting nature of the algorithm, the computational cost of each update can be significantly reduced via very inexact approximate linear solvers. We demonstrate the effectiveness of the algorithm in the severely undersampled angular case, using synthetic test problems as well as data obtained from a high-flux neutron source. The algorithm removes artifacts and can even roughly capture small features when an extremely low number of angles is used.
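    The TV piece of each split Bregman iteration reduces to a cheap soft-thresholding ("shrinkage") update of the auxiliary gradient variable; a minimal sketch, with the variable names and the placement of the regularization parameter as generic assumptions:

    ```python
    import numpy as np

    def shrink(v, thresh):
        """Soft threshold, the closed-form solution of the d-subproblem:
        d = shrink(grad_u + b, thresh), applied elementwise."""
        return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)
    ```

    The complementary u-subproblem needs only an inexact linear solve (a few relaxation or CG sweeps); the algorithm's error-forgetting property is what makes such crude solves acceptable, which is where the cost savings reported above come from.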

  17. Using Neural Net Technology To Enhance the Efficiency of a Computer Adaptive Testing Application.

    ERIC Educational Resources Information Center

    Van Nelson, C.; Henriksen, Larry W.

    The potential for computer adaptive testing (CAT) has been well documented. In order to improve the efficiency of this process, it may be possible to utilize a neural network, or more specifically, a back propagation neural network. The paper asserts that in order to accomplish this end, it must be shown that grouping examinees by ability as…

  18. Spin-neurons: A possible path to energy-efficient neuromorphic computers

    NASA Astrophysics Data System (ADS)

    Sharad, Mrigank; Fan, Deliang; Roy, Kaushik

    2013-12-01

    Recent years have witnessed growing interest in the field of brain-inspired computing based on neural-network architectures. In order to translate the related algorithmic models into powerful, yet energy-efficient cognitive-computing hardware, computing devices beyond CMOS may need to be explored. The suitability of such devices to this field of computing would strongly depend upon how closely their physical characteristics match the essential computing primitives employed in such models. In this work, we discuss the rationale for applying emerging spin-torque devices to bio-inspired computing. Recent spin-torque experiments have shown the path to low-current, low-voltage, and high-speed magnetization switching in nano-scale magnetic devices. Such magneto-metallic, current-mode spin-torque switches can mimic the analog summing and "thresholding" operation of an artificial neuron with high energy efficiency. Comparison with a CMOS-based analog circuit model of a neuron shows that "spin-neurons" (spin-based circuit models of neurons) can achieve more than two orders of magnitude lower energy and beyond three orders of magnitude reduction in energy-delay product. The application of spin-neurons can therefore be an attractive option for the neuromorphic computers of the future.

  19. Spin-neurons: A possible path to energy-efficient neuromorphic computers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sharad, Mrigank; Fan, Deliang; Roy, Kaushik

    Recent years have witnessed growing interest in the field of brain-inspired computing based on neural-network architectures. In order to translate the related algorithmic models into powerful, yet energy-efficient cognitive-computing hardware, computing devices beyond CMOS may need to be explored. The suitability of such devices to this field of computing would strongly depend upon how closely their physical characteristics match the essential computing primitives employed in such models. In this work, we discuss the rationale for applying emerging spin-torque devices to bio-inspired computing. Recent spin-torque experiments have shown the path to low-current, low-voltage, and high-speed magnetization switching in nano-scale magnetic devices. Such magneto-metallic, current-mode spin-torque switches can mimic the analog summing and “thresholding” operation of an artificial neuron with high energy efficiency. Comparison with a CMOS-based analog circuit model of a neuron shows that “spin-neurons” (spin-based circuit models of neurons) can achieve more than two orders of magnitude lower energy and beyond three orders of magnitude reduction in energy-delay product. The application of spin-neurons can therefore be an attractive option for the neuromorphic computers of the future.

  20. Training Maneuver Evaluation for Reduced Order Modeling of Stability & Control Properties Using Computational Fluid Dynamics

    DTIC Science & Technology

    2013-03-01

    In this thesis (AFIT-ENY-13-M-28, by Craig Curtis), training maneuvers are evaluated for reduced order modeling of stability and control properties using computational fluid dynamics; a reduced order model is created, and previous research in this area of study is examined along with its application to this work.

  1. Finite element dynamic analysis of soft tissues using state-space model.

    PubMed

    Iorga, Lucian N; Shan, Baoxiang; Pelegri, Assimina A

    2009-04-01

    A finite element (FE) model is employed to investigate the dynamic response of soft tissues under external excitations, particularly corresponding to the case of harmonic motion imaging. A solid 3D mixed 'u-p' element S8P0 is implemented to capture the near-incompressibility inherent in soft tissues. Two important aspects of the structural modelling of these tissues are studied: the influence of viscous damping on the dynamic response and, following FE modelling, a state-space formulation that evaluates the efficiency of several order-reduction methods. It is illustrated that the order of the mathematical model can be significantly reduced while preserving the accuracy of the observed system dynamics. Thus, the reduced-order state-space representation of soft tissues for general dynamic analysis significantly reduces the computational cost and provides a unitary framework for the 'forward' simulation and 'inverse' estimation of soft tissues. Moreover, the results suggest that damping in soft tissue is significant, effectively cancelling the contribution of all but the first few vibration modes.
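    The reported mode-truncation effect can be sketched by projecting a state-space model onto its least-damped eigenmodes; this is a rough illustration, not necessarily the paper's chosen reduction method, and for a real-valued ROM the complex-conjugate mode pairs must be kept together:

    ```python
    import numpy as np

    def modal_truncation(A, B, C, r):
        """Keep the r slowest-decaying eigenmodes of dx/dt = Ax + Bu, y = Cx;
        with heavy damping, as reported for soft tissue, the discarded
        fast-decaying modes contribute little to the observed dynamics."""
        w, V = np.linalg.eig(A)
        idx = np.argsort(-w.real)[:r]      # least-damped modes first
        Vr = V[:, idx]
        Wr = np.linalg.pinv(Vr)
        return Wr @ A @ Vr, Wr @ B, C @ Vr
    ```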

  2. Non-conforming finite-element formulation for cardiac electrophysiology: an effective approach to reduce the computation time of heart simulations without compromising accuracy

    NASA Astrophysics Data System (ADS)

    Hurtado, Daniel E.; Rojas, Guillermo

    2018-04-01

    Computer simulations constitute a powerful tool for studying the electrical activity of the human heart, but the computational effort remains prohibitively high. In order to recover accurate conduction velocities and wavefront shapes, the mesh size in linear element (Q1) formulations cannot exceed 0.1 mm. Here we propose a novel non-conforming finite-element formulation for the non-linear cardiac electrophysiology problem that results in accurate wavefront shapes and a lower mesh dependence of the conduction velocity, while retaining the same number of global degrees of freedom as Q1 formulations. As a result, coarser discretizations of cardiac domains can be employed in simulations without significant loss of accuracy, thus reducing the overall computational effort. We demonstrate the applicability of our formulation in biventricular simulations using a coarse mesh size of ˜ 1 mm, and show that the activation wave pattern closely follows that obtained in fine-mesh simulations at a fraction of the computation time, thus improving the accuracy-efficiency trade-off of cardiac simulations.

  3. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Versino, Daniele; Bronkhorst, Curt Allan

    The computational formulation of a micro-mechanical material model for the dynamic failure of ductile metals is presented in this paper. The statistical nature of porosity initiation is accounted for by introducing an arbitrary probability density function which describes the pore nucleation pressures. Each micropore within the representative volume element is modeled as a thick spherical shell made of plastically incompressible material. The treatment of porosity by a distribution of thick-walled spheres also allows for the inclusion of micro-inertia effects under conditions of shock and dynamic loading. The second-order ordinary differential equation governing the microscopic porosity evolution is solved with a robust implicit procedure. A new Chebyshev collocation method is employed to approximate the porosity distribution, and remapping is used to optimize memory usage. The adaptive approximation of the porosity distribution leads to a reduction of computational time and memory usage of up to two orders of magnitude. Moreover, the proposed model affords consistent performance: changing the nucleation pressure probability density function and/or the applied strain rate does not reduce the accuracy or computational efficiency of the material model. The numerical performance of the model and algorithms presented is tested against three problems for high-density tantalum: a single void, one-dimensional uniaxial strain, and two-dimensional plate impact. The results using the integration and algorithmic advances suggest a significant improvement in computational efficiency and accuracy over previous treatments for dynamic loading conditions.
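    The collocation idea, representing the porosity distribution over nucleation pressure by Chebyshev coefficients instead of a dense grid, can be sketched with NumPy's Chebyshev utilities; the node count and stand-in density below are illustrative:

    ```python
    import numpy as np

    n = 16
    nodes = np.cos(np.pi * np.arange(n + 1) / n)      # Chebyshev-Gauss-Lobatto points
    f = np.exp(-4.0 * (nodes - 0.2) ** 2)             # stand-in porosity density
    coeffs = np.polynomial.chebyshev.chebfit(nodes, f, n)
    # 17 coefficients now represent the whole distribution; evaluating or
    # remapping it anywhere is a chebval call rather than a grid lookup.
    dense = np.polynomial.chebyshev.chebval(np.linspace(-1, 1, 200), coeffs)
    ```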

  4. Combined multi-spectrum and orthogonal Laplacianfaces for fast CB-XLCT imaging with single-view data

    NASA Astrophysics Data System (ADS)

    Zhang, Haibo; Geng, Guohua; Chen, Yanrong; Qu, Xuan; Zhao, Fengjun; Hou, Yuqing; Yi, Huangjian; He, Xiaowei

    2017-12-01

    Cone-beam X-ray luminescence computed tomography (CB-XLCT) is an attractive hybrid imaging modality, which has the potential of monitoring the metabolic processes of nanophosphors-based drugs in vivo. Single-view data reconstruction as a key issue of CB-XLCT imaging promotes the effective study of dynamic XLCT imaging. However, it suffers from serious ill-posedness in the inverse problem. In this paper, a multi-spectrum strategy is adopted to relieve the ill-posedness of reconstruction. The strategy is based on the third-order simplified spherical harmonic approximation model. Then, an orthogonal Laplacianfaces-based method is proposed to reduce the large computational burden without degrading the imaging quality. Both simulated data and in vivo experimental data were used to evaluate the efficiency and robustness of the proposed method. The results are satisfactory in terms of both location and quantitative recovering with computational efficiency, indicating that the proposed method is practical and promising for single-view CB-XLCT imaging.

  5. A diagonal algorithm for the method of pseudocompressibility. [for steady-state solution to incompressible Navier-Stokes equation

    NASA Technical Reports Server (NTRS)

    Rogers, S. E.; Kwak, D.; Chang, J. L. C.

    1986-01-01

    The method of pseudocompressibility has been shown to be an efficient method for obtaining a steady-state solution to the incompressible Navier-Stokes equations. Recent improvements to this method include the use of a diagonal scheme for the inversion of the equations at each iteration. The necessary transformations have been derived for the pseudocompressibility equations in generalized coordinates. The diagonal algorithm reduces the computing time necessary to obtain a steady-state solution by a factor of nearly three. Implicit viscous terms are maintained in the equations, and it has become possible to use fourth-order implicit dissipation. The steady-state solution is unchanged by the approximations resulting from the diagonalization of the equations. Computed results for flow over a two-dimensional backward-facing step and a three-dimensional cylinder mounted normal to a flat plate are presented for both the old and new algorithms. The accuracy and computing efficiency of these algorithms are compared.

  6. Complementary Reliability-Based Decodings of Binary Linear Block Codes

    NASA Technical Reports Server (NTRS)

    Fossorier, Marc P. C.; Lin, Shu

    1997-01-01

    This correspondence presents a hybrid reliability-based decoding algorithm which combines the reprocessing method based on the most reliable basis and a generalized Chase-type algebraic decoder based on the least reliable positions. It is shown that reprocessing with a simple additional algebraic decoding effort achieves significant coding gain. For long codes, the order of reprocessing required to achieve asymptotic optimum error performance is reduced by approximately 1/3. This significantly reduces the computational complexity, especially for long codes. Also, a more efficient criterion for stopping the decoding process is derived based on the knowledge of the algebraic decoding solution.

  7. Development of a nonlinear vortex method

    NASA Technical Reports Server (NTRS)

    Kandil, O. A.

    1982-01-01

    A steady and unsteady Nonlinear Hybrid Vortex (NHV) method for low-aspect-ratio wings at large angles of attack is developed. The method uses vortex panels with a first-order vorticity distribution (equivalent to a second-order doublet distribution) to calculate the induced velocity in the near field using closed-form expressions. In the far field, the distributed vorticity is reduced to concentrated vortex lines and the simpler Biot-Savart law is employed. The method is applied to rectangular wings in steady and unsteady flows without any restriction on the order of magnitude of the disturbances in the flow field. The numerical results show that the method accurately predicts the distributed aerodynamic loads and that it is of acceptable computational efficiency.

  8. Nonlinear Dynamic Model-Based Multiobjective Sensor Network Design Algorithm for a Plant with an Estimator-Based Control System

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Paul, Prokash; Bhattacharyya, Debangsu; Turton, Richard

    Here, a novel sensor network design (SND) algorithm is developed for maximizing process efficiency while minimizing sensor network cost for a nonlinear dynamic process with an estimator-based control system. The multiobjective optimization problem is solved following a lexicographic approach where the process efficiency is maximized first followed by minimization of the sensor network cost. The partial net present value, which combines the capital cost due to the sensor network and the operating cost due to deviation from the optimal efficiency, is proposed as an alternative objective. The unscented Kalman filter is considered as the nonlinear estimator. The large-scale combinatorial optimization problem is solved using a genetic algorithm. The developed SND algorithm is applied to an acid gas removal (AGR) unit as part of an integrated gasification combined cycle (IGCC) power plant with CO2 capture. Due to the computational expense, a reduced order nonlinear model of the AGR process is identified and parallel computation is performed during implementation.

  9. Nonlinear Dynamic Model-Based Multiobjective Sensor Network Design Algorithm for a Plant with an Estimator-Based Control System

    DOE PAGES

    Paul, Prokash; Bhattacharyya, Debangsu; Turton, Richard; ...

    2017-06-06

    Here, a novel sensor network design (SND) algorithm is developed for maximizing process efficiency while minimizing sensor network cost for a nonlinear dynamic process with an estimator-based control system. The multiobjective optimization problem is solved following a lexicographic approach where the process efficiency is maximized first followed by minimization of the sensor network cost. The partial net present value, which combines the capital cost due to the sensor network and the operating cost due to deviation from the optimal efficiency, is proposed as an alternative objective. The unscented Kalman filter is considered as the nonlinear estimator. The large-scale combinatorial optimization problem is solved using a genetic algorithm. The developed SND algorithm is applied to an acid gas removal (AGR) unit as part of an integrated gasification combined cycle (IGCC) power plant with CO2 capture. Due to the computational expense, a reduced order nonlinear model of the AGR process is identified and parallel computation is performed during implementation.

  10. Unified transform architecture for AVC, AVS, VC-1 and HEVC high-performance codecs

    NASA Astrophysics Data System (ADS)

    Dias, Tiago; Roma, Nuno; Sousa, Leonel

    2014-12-01

    A unified architecture for the fast and efficient computation of the set of two-dimensional (2-D) transforms adopted by the most recent state-of-the-art digital video standards is presented in this paper. In contrast to other designs with similar functionality, the presented architecture is supported on a scalable, modular, and completely configurable processing structure. This flexible structure not only allows the architecture to be easily reconfigured to support different transform kernels, but also permits its resizing to efficiently support transforms of different orders (e.g. order-4, order-8, order-16 and order-32). Consequently, it is not only highly suitable for realizing high-performance multi-standard transform cores, but it also offers highly efficient implementations of specialized processing structures addressing only the reduced subset of transforms used by a specific video standard. The experimental results obtained by prototyping several configurations of this processing structure in a Xilinx Virtex-7 FPGA show the superior performance and hardware efficiency levels provided by the proposed unified architecture for the implementation of transform cores for the Advanced Video Coding (AVC), Audio Video coding Standard (AVS), VC-1 and High Efficiency Video Coding (HEVC) standards. In addition, such results also demonstrate the ability of this processing structure to realize multi-standard transform cores supporting all the standards mentioned above that are capable of processing the 8k Ultra High Definition Television (UHDTV) video format (7,680 × 4,320 at 30 fps) in real time.

  11. An efficient numerical technique for calculating thermal spreading resistance

    NASA Technical Reports Server (NTRS)

    Gale, E. H., Jr.

    1977-01-01

    An efficient numerical technique for solving the equations resulting from finite difference analyses of fields governed by Poisson's equation is presented. The method is direct (noniterative) and the computer work required varies with the square of the order of the coefficient matrix. The computational work required varies with the cube of this order for standard inversion techniques, e.g., Gaussian elimination, Jordan, Doolittle, etc.

  12. Large-Eddy/Lattice Boltzmann Simulations of Micro-blowing Strategies for Subsonic and Supersonic Drag Control

    NASA Technical Reports Server (NTRS)

    Menon, Suresh

    2003-01-01

    This report summarizes the progress made in the first 8 to 9 months of this research. The Lattice Boltzmann Equation (LBE) methodology for Large-Eddy Simulations (LES) of microblowing has been validated using a jet-in-crossflow test configuration. In this study, the flow intake is also simulated to allow the interaction to occur naturally. The Lattice Boltzmann Equation Large-Eddy Simulation (LBELES) approach not only captures the flow features, such as hairpin vortices and the recirculation behind the jet, but also shows better agreement with experiments than previous RANS predictions. The LBELES is shown to be computationally very efficient and, therefore, a viable method for simulating the injection process. Two strategies have been developed to simulate the multi-hole injection process as in the experiment. In order to allow natural interaction between the injected fluid and the primary stream, the flow intakes for all the holes have to be simulated. The LBE method is computationally efficient but is still 3D in nature, so there may be some computational penalty. In order to study a large number of holes, a new 1D subgrid model has been developed that simulates a reduced form of the Navier-Stokes equations in these holes.

  13. On a turbulent wall model to predict hemolysis numerically in medical devices

    NASA Astrophysics Data System (ADS)

    Lee, Seunghun; Chang, Minwook; Kang, Seongwon; Hur, Nahmkeon; Kim, Wonjung

    2017-11-01

    Analyzing the degradation of red blood cells is very important for medical devices with blood flows. Blood shear stress has been recognized as the most dominant factor for hemolysis in medical devices. Compared to laminar flows, turbulent flows have higher shear stress values in the regions near the wall. When predicting hemolysis numerically, this phenomenon can require a very fine mesh and large computational resources. To resolve this issue, the purpose of this study is to develop a turbulent wall model that predicts hemolysis more efficiently. To decrease the numerical error of hemolysis prediction at coarse grid resolution, we divide the computational domain into two regions and apply a different approach to each. In the near-wall region, with its steep velocity gradient, an analytic approach using a modeled velocity profile is applied to reduce the numerical error and thus allow a coarse grid resolution; we adopt the Van Driest law as the model for the mean velocity profile. In the region far from the wall, a regular numerical discretization is applied. The proposed turbulent wall model is evaluated for several turbulent flows inside a cannula and centrifugal pumps. The results show that the proposed turbulent wall model improves the computational efficiency of hemolysis prediction significantly for engineering applications.
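    A sketch of the Van Driest profile such a near-wall model builds on: integrating du+/dy+ = 2 / (1 + sqrt(1 + 4 l+^2)) with mixing length l+ = kappa y+ (1 - exp(-y+/A+)). The constants and the midpoint-style integration are conventional choices, not necessarily those of the paper.

    ```python
    import numpy as np

    def van_driest_uplus(y_plus, kappa=0.41, a_plus=26.0):
        """Integrate the Van Driest mean-velocity profile in wall units;
        a wall model evaluates near-wall shear from this analytic profile
        instead of resolving the steep gradient with a fine mesh."""
        u = np.zeros_like(y_plus)
        for i in range(1, len(y_plus)):
            ym = 0.5 * (y_plus[i] + y_plus[i - 1])
            lp = kappa * ym * (1.0 - np.exp(-ym / a_plus))
            dudy = 2.0 / (1.0 + np.sqrt(1.0 + 4.0 * lp * lp))
            u[i] = u[i - 1] + dudy * (y_plus[i] - y_plus[i - 1])
        return u

    y = np.linspace(0.0, 300.0, 3001)
    print(van_driest_uplus(y)[-1])   # approaches the log law (1/kappa) ln(y+) + B
    ```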

  14. An adaptive discontinuous Galerkin solver for aerodynamic flows

    NASA Astrophysics Data System (ADS)

    Burgess, Nicholas K.

    This work considers the accuracy, efficiency, and robustness of an unstructured high-order accurate discontinuous Galerkin (DG) solver for computational fluid dynamics (CFD). Recently, there has been a drive to reduce the discretization error of CFD simulations using high-order methods on unstructured grids. However, high-order methods are often criticized for lacking robustness and having high computational cost. The goal of this work is to investigate methods that enhance the robustness of high-order discontinuous Galerkin (DG) methods on unstructured meshes, while maintaining low computational cost and high accuracy of the numerical solutions. This work investigates robustness enhancement of high-order methods by examining effective non-linear solvers, shock capturing methods, turbulence model discretizations and adaptive refinement techniques. The goal is to develop an all-encompassing solver that can simulate a large range of physical phenomena, where all aspects of the solver work together to achieve a robust, efficient and accurate solution strategy. The components and framework for a robust high-order accurate solver that is capable of solving viscous, Reynolds Averaged Navier-Stokes (RANS) and shocked flows are presented. In particular, this work discusses robust discretizations of the turbulence model equation used to close the RANS equations, as well as stable shock capturing strategies that are applicable across a wide range of discretization orders and applicable to very strong shock waves. Furthermore, refinement techniques are considered as both efficiency and robustness enhancement strategies. Additionally, efficient non-linear solvers based on multigrid and Krylov subspace methods are presented. The accuracy, efficiency, and robustness of the solver are demonstrated using a variety of challenging aerodynamic test problems, which include turbulent high-lift and viscous hypersonic flows. Adaptive mesh refinement was found to play a critical role in obtaining a robust and efficient high-order accurate flow solver. A goal-oriented error estimation technique has been developed to estimate the discretization error of simulation outputs. For high-order discretizations, it is shown that functional output error super-convergence can be obtained, provided the discretization satisfies a property known as dual consistency. The dual consistency of the DG methods developed in this work is shown via mathematical analysis and numerical experimentation. Goal-oriented error estimation is also used to drive an hp-adaptive mesh refinement strategy, where a combination of mesh or h-refinement, and order or p-enrichment, is employed based on the smoothness of the solution. The results demonstrate that the combination of goal-oriented error estimation and hp-adaptation yields superior accuracy, as well as enhanced robustness and efficiency for a variety of aerodynamic flows including flows with strong shock waves. This work demonstrates that DG discretizations can be the basis of an accurate, efficient, and robust CFD solver. Furthermore, enhancing the robustness of DG methods does not adversely impact the accuracy or efficiency of the solver for challenging and complex flow problems. In particular, when considering the computation of shocked flows, this work demonstrates that the available shock capturing techniques are sufficiently accurate and robust, particularly when used in conjunction with adaptive mesh refinement.
This work also demonstrates that robust solutions of the Reynolds Averaged Navier-Stokes (RANS) and turbulence model equations can be obtained for complex and challenging aerodynamic flows. In this context, the most robust strategy was determined to be a low-order turbulence model discretization coupled to a high-order discretization of the RANS equations. Although RANS solutions using high-order accurate discretizations of the turbulence model were obtained, the behavior of current-day RANS turbulence models discretized to high order was found to be problematic, leading to solver robustness issues. This suggests that future work is warranted in the area of turbulence model formulation for use with high-order discretizations. Alternatively, the use of Large-Eddy Simulation (LES) subgrid scale models with high-order DG methods offers the potential to leverage the high accuracy of these methods for very high fidelity turbulent simulations. This thesis has developed the algorithmic improvements that will lay the foundation for the development of a three-dimensional high-order flow solution strategy that can be used as the basis for future LES simulations.

  15. A model and variance reduction method for computing statistical outputs of stochastic elliptic partial differential equations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vidal-Codina, F., E-mail: fvidal@mit.edu; Nguyen, N.C., E-mail: cuongng@mit.edu; Giles, M.B., E-mail: mike.giles@maths.ox.ac.uk

    We present a model and variance reduction method for the fast and reliable computation of statistical outputs of stochastic elliptic partial differential equations. Our method consists of three main ingredients: (1) the hybridizable discontinuous Galerkin (HDG) discretization of elliptic partial differential equations (PDEs), which allows us to obtain high-order accurate solutions of the governing PDE; (2) the reduced basis method for a new HDG discretization of the underlying PDE to enable real-time solution of the parameterized PDE in the presence of stochastic parameters; and (3) a multilevel variance reduction method that exploits the statistical correlation among the different reduced basis approximations and the high-fidelity HDG discretization to accelerate the convergence of the Monte Carlo simulations. The multilevel variance reduction method provides efficient computation of the statistical outputs by shifting most of the computational burden from the high-fidelity HDG approximation to the reduced basis approximations. Furthermore, we develop a posteriori error estimates for our approximations of the statistical outputs. Based on these error estimates, we propose an algorithm for optimally choosing both the dimensions of the reduced basis approximations and the sizes of Monte Carlo samples to achieve a given error tolerance. We provide numerical examples to demonstrate the performance of the proposed method.
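    Ingredient (3) can be illustrated with the simplest two-level version of such an estimator: many cheap surrogate samples plus a few expensive samples of the surrogate-to-high-fidelity difference. The solvers below are hypothetical stand-ins for a reduced basis approximation and the HDG discretization.

    ```python
    import numpy as np

    def two_level_mc(coarse, fine, n_coarse, n_fine, rng):
        """E[fine] = E[coarse] + E[fine - coarse]: estimate the first term
        with many cheap samples and the small-variance correction with few."""
        e_coarse = np.mean([coarse(x) for x in rng.standard_normal(n_coarse)])
        e_corr = np.mean([fine(x) - coarse(x) for x in rng.standard_normal(n_fine)])
        return e_coarse + e_corr

    rng = np.random.default_rng(2)
    est = two_level_mc(lambda x: np.tanh(x),                    # cheap surrogate
                       lambda x: np.tanh(x) + 0.01 * x ** 2,    # "high-fidelity"
                       n_coarse=20000, n_fine=200, rng=rng)
    ```

    The a posteriori error estimates in the record above play the role of choosing n_coarse, n_fine, and the basis dimensions optimally for a target tolerance.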

  16. A High-Order Finite Spectral Volume Method for Conservation Laws on Unstructured Grids

    NASA Technical Reports Server (NTRS)

    Wang, Z. J.; Liu, Yen; Kwak, Dochan (Technical Monitor)

    2001-01-01

    A time-accurate, high-order, conservative, yet efficient method named the Finite Spectral Volume (FSV) method is developed for conservation laws on unstructured grids. The concept of a 'spectral volume' is introduced to achieve high-order accuracy in an efficient manner, similar to spectral element and multi-domain spectral methods. In addition, each spectral volume is further subdivided into control volumes (CVs), and cell-averaged data from these control volumes is used to reconstruct a high-order approximation in the spectral volume. Riemann solvers are used to compute the fluxes at spectral volume boundaries, and the cell-averaged state variables in the control volumes are then updated independently. Furthermore, TVD (Total Variation Diminishing) and TVB (Total Variation Bounded) limiters are introduced in the FSV method to remove or reduce spurious oscillations near discontinuities. A very desirable feature of the FSV method is that the reconstruction is carried out only once, and analytically, and is the same for all cells of the same type, and that the reconstruction stencil is always non-singular, in contrast to the memory- and CPU-intensive reconstruction in a high-order finite volume (FV) method. It is discussed why the FSV method is significantly more efficient than high-order FV and discontinuous Galerkin (DG) methods. Fundamental properties of the FSV method are studied, and high-order accuracy is demonstrated for several model problems with and without discontinuities.

  17. Reduced-Order Modeling: Cooperative Research and Development at the NASA Langley Research Center

    NASA Technical Reports Server (NTRS)

    Silva, Walter A.; Beran, Philip S.; Cesnik, Carlos E. S.; Guendel, Randal E.; Kurdila, Andrew; Prazenica, Richard J.; Librescu, Liviu; Marzocca, Piergiovanni; Raveh, Daniella E.

    2001-01-01

    Cooperative research and development activities at the NASA Langley Research Center (LaRC) involving reduced-order modeling (ROM) techniques are presented. Emphasis is given to reduced-order methods and analyses based on Volterra series representations, although some recent results using Proper Orthogonal Decomposition (POD) are discussed as well. Results are reported for a variety of computational and experimental nonlinear systems to provide clear examples of the use of reduced-order models, particularly within the field of computational aeroelasticity. The need for and the relative performance (speed, accuracy, and robustness) of reduced-order modeling strategies is documented. The development of unsteady aerodynamic state-space models directly from computational fluid dynamics analyses is presented in addition to analytical and experimental identifications of Volterra kernels. Finally, future directions for this research activity are summarized.

  18. Elimination sequence optimization for SPAR

    NASA Technical Reports Server (NTRS)

    Hogan, Harry A.

    1986-01-01

    SPAR is a large-scale computer program for finite element structural analysis. The program allows user specification of the order in which the joints of a structure are to be eliminated since this order can have significant influence over solution performance, in terms of both storage requirements and computer time. An efficient elimination sequence can improve performance by over 50% for some problems. Obtaining such sequences, however, requires the expertise of an experienced user and can take hours of tedious effort to effect. Thus, an automatic elimination sequence optimizer would enhance productivity by reducing the analysts' problem definition time and by lowering computer costs. Two possible methods for automating the elimination sequence specifications were examined. Several algorithms based on the graph theory representations of sparse matrices were studied with mixed results. Significant improvement in the program performance was achieved, but sequencing by an experienced user still yields substantially better results. The initial results provide encouraging evidence that the potential benefits of such an automatic sequencer would be well worth the effort.

  19. Computational Burden Resulting from Image Recognition of High Resolution Radar Sensors

    PubMed Central

    López-Rodríguez, Patricia; Fernández-Recio, Raúl; Bravo, Ignacio; Gardel, Alfredo; Lázaro, José L.; Rufo, Elena

    2013-01-01

    This paper presents a methodology for high resolution radar image generation and automatic target recognition, emphasizing the computational cost involved in the process. In order to obtain focused inverse synthetic aperture radar (ISAR) images, certain signal processing algorithms must be applied to the information sensed by the radar. From the actual data collected by the radar, the stages and algorithms needed to obtain ISAR images are reviewed, including high resolution range profile generation, motion compensation and ISAR formation. Target recognition is achieved by comparing the generated set of actual ISAR images with a database of ISAR images generated by electromagnetic software. High resolution radar image generation and target recognition processes are burdensome and time consuming, so to determine the most suitable implementation platform an analysis of the computational complexity is of great interest. To this end, and since target identification must be completed in real time, the computational burden of the two processes, image generation and comparison with a database, is explained separately. Conclusions are drawn about implementation platforms and calculation efficiency in order to reduce time consumption in a possible future implementation. PMID:23609804

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Turner, A.; Davis, A.; University of Wisconsin-Madison, Madison, WI 53706

    CCFE perform Monte-Carlo transport simulations on large and complex tokamak models such as ITER. Such simulations are challenging since streaming and deep penetration effects are equally important. In order to make such simulations tractable, both variance reduction (VR) techniques and parallel computing are used. It has been found that the application of VR techniques in such models significantly reduces the efficiency of parallel computation due to 'long histories'. VR in MCNP can be accomplished using energy-dependent weight windows. The weight window represents an 'average behaviour' of particles, and large deviations in the arriving weight of a particle give rise to extreme amounts of splitting being performed and a long history. When running on parallel clusters, a long history can have a detrimental effect on the parallel efficiency - if one process is computing the long history, the other CPUs complete their batch of histories and wait idle. Furthermore some long histories have been found to be effectively intractable. To combat this effect, CCFE has developed an adaptation of MCNP which dynamically adjusts the WW where a large weight deviation is encountered. The method effectively 'de-optimises' the WW, reducing the VR performance but this is offset by a significant increase in parallel efficiency. Testing with a simple geometry has shown the method does not bias the result. This 'long history method' has enabled CCFE to significantly improve the performance of MCNP calculations for ITER on parallel clusters, and will be beneficial for any geometry combining streaming and deep penetration effects.

  2. Wind Farm Flow Modeling using an Input-Output Reduced-Order Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Annoni, Jennifer; Gebraad, Pieter; Seiler, Peter

    Wind turbines in a wind farm operate individually to maximize their own power regardless of the impact of aerodynamic interactions on neighboring turbines. There is the potential to increase power and reduce overall structural loads by properly coordinating turbines. To perform control design and analysis, a model needs to be of low computational cost but retain the necessary dynamics seen in high-fidelity models. The objective of this work is to obtain a reduced-order model that represents the full-order flow computed using a high-fidelity model. A variety of methods, including proper orthogonal decomposition and dynamic mode decomposition, can be used to extract the dominant flow structures and obtain a reduced-order model. In this paper, we combine proper orthogonal decomposition with a system identification technique to produce an input-output reduced-order model. This technique is used to construct a reduced-order model of the flow within a two-turbine array computed using a large-eddy simulation.
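
    As a rough illustration of the first step described above, the snippet below computes a POD basis from a snapshot matrix via the singular value decomposition; the data here are random placeholders, not LES output, and the system identification step that maps inputs to modal coordinates is omitted.

      import numpy as np

      rng = np.random.default_rng(1)
      # snapshot matrix: each column is one flow-field snapshot (random data
      # stands in here for large-eddy simulation output)
      X = rng.standard_normal((5000, 200))

      # POD modes are the left singular vectors of the mean-subtracted snapshots
      Xc = X - X.mean(axis=1, keepdims=True)
      U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

      r = 10                      # number of retained modes
      Phi = U[:, :r]              # POD basis
      a = Phi.T @ Xc              # modal coordinates, the states for system ID
      energy = (s[:r]**2).sum() / (s**2).sum()
      print(f"{r} modes capture {100 * energy:.1f}% of the snapshot energy")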

  3. A computationally efficient parallel Levenberg-Marquardt algorithm for highly parameterized inverse model analyses

    NASA Astrophysics Data System (ADS)

    Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.

    2016-09-01

    Inverse modeling seeks model parameters given a set of observations. However, for practical problems, because the number of measurements is often large and the model parameters are also numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2-D and a random hydraulic conductivity field in 3-D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multicore computational environment. Therefore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate to large-scale problems.
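
    A simplified sketch of a Krylov-based Levenberg-Marquardt step is shown below, using SciPy's LSQR (a Golub-Kahan/Krylov least-squares solver) on a toy exponential-fitting problem. The subspace-recycling trick across damping parameters described in the abstract is not reproduced here; the damping schedule and the toy model are illustrative assumptions.

      import numpy as np
      from scipy.sparse.linalg import lsqr

      def lm_step(J, r, damping):
          # Krylov (Golub-Kahan) solve of the damped least-squares subproblem
          # min ||J dx + r||^2 + damping^2 ||dx||^2, avoiding a dense QR/SVD
          return lsqr(J, -r, damp=damping)[0]

      # toy problem: fit m(t) = exp(x0*t) + x1 to synthetic data
      t = np.linspace(0.0, 1.0, 50)
      data = np.exp(1.3 * t) + 0.5

      x = np.array([0.5, 0.0])
      for damping in (1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6):
          r = np.exp(x[0] * t) + x[1] - data                      # residual
          J = np.column_stack([t * np.exp(x[0] * t), np.ones_like(t)])  # Jacobian
          x = x + lm_step(J, r, damping)
      print("recovered parameters (target [1.3, 0.5]):", x.round(4))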

  4. A Proposal of a Fast Computation Method for Thermal Capacity and Voltage ATC by Means of Homotopy Functions

    NASA Astrophysics Data System (ADS)

    Zoka, Yoshifumi; Yorino, Naoto; Kawano, Koki; Suenari, Hiroyasu

    This paper proposes a fast computation method for Available Transfer Capability (ATC) with respect to thermal and voltage magnitude limits. In the paper, ATC is formulated as an optimization problem. In order to achieve efficiency in the N-1 outage contingency calculations, linear sensitivity methods are applied for screening and ranking all contingency selections with respect to the thermal and voltage magnitude limit margins to identify the severest case. In addition, homotopy functions are used for the generator QV constraints to reduce the maximum error of the linear estimation. Then, the Primal-Dual Interior Point Method (PDIPM) is used to solve the optimization problem for the severest case only, so that the ATC solutions can be obtained efficiently. The effectiveness of the proposed method is demonstrated on the IEEE 30-, 57-, and 118-bus systems.

  5. Contour integral method for obtaining the self-energy matrices of electrodes in electron transport calculations

    NASA Astrophysics Data System (ADS)

    Iwase, Shigeru; Futamura, Yasunori; Imakura, Akira; Sakurai, Tetsuya; Tsukamoto, Shigeru; Ono, Tomoya

    2018-05-01

    We propose an efficient computational method for evaluating the self-energy matrices of electrodes to study ballistic electron transport properties in nanoscale systems. To reduce the high computational cost incurred in large systems, a contour integral eigensolver based on the Sakurai-Sugiura method combined with the shifted biconjugate gradient method is developed to solve an exponential-type eigenvalue problem for complex wave vectors. A remarkable feature of the proposed algorithm is that the numerical procedure is very similar to that of conventional band structure calculations. We implement the developed method in the framework of the real-space higher-order finite-difference scheme with nonlocal pseudopotentials. Numerical tests for a wide variety of materials validate the robustness, accuracy, and efficiency of the proposed method. As an illustration of the method, we present the electron transport property of the freestanding silicene with the line defect originating from the reversed buckled phases.

  6. One-dimensional nonlinear theory for rectangular helix traveling-wave tube

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fu, Chengfang, E-mail: fchffchf@126.com; Zhao, Bo; Yang, Yudong

    A 1-D nonlinear theory of a rectangular helix traveling-wave tube (TWT) interacting with a ribbon beam is presented in this paper. The RF field is modeled by a transmission line equivalent circuit, the ribbon beam is divided into a sequence of thin rectangular electron discs with the same cross section as the beam, and the charges are assumed to be uniformly distributed over these discs. Then a method of computing the space-charge field by solving Green's function in the Cartesian coordinate system is fully described. Nonlinear partial differential equations for field amplitudes and Lorentz force equations for particles are solved numerically using the fourth-order Runge-Kutta technique. The tube's gain, output power, and efficiency of the above TWT are computed. The results show that increasing the cross section of the ribbon beam will improve a rectangular helix TWT's efficiency and reduce the saturated length.

  7. Tensor hypercontraction density fitting. I. Quartic scaling second- and third-order Møller-Plesset perturbation theory

    NASA Astrophysics Data System (ADS)

    Hohenstein, Edward G.; Parrish, Robert M.; Martínez, Todd J.

    2012-07-01

    Many approximations have been developed to help deal with the O(N^4) growth of the electron repulsion integral (ERI) tensor, where N is the number of one-electron basis functions used to represent the electronic wavefunction. Of these, the density fitting (DF) approximation is currently the most widely used despite the fact that it is often incapable of altering the underlying scaling of computational effort with respect to molecular size. We present a method for exploiting sparsity in three-center overlap integrals through tensor decomposition to obtain a low-rank approximation to density fitting (tensor hypercontraction density fitting or THC-DF). This new approximation reduces the 4th-order ERI tensor to a product of five matrices, simultaneously reducing the storage requirement as well as increasing the flexibility to regroup terms and reduce scaling behavior. As an example, we demonstrate such a scaling reduction for second- and third-order perturbation theory (MP2 and MP3), showing that both can be carried out in O(N^4) operations. This should be compared to the usual scaling behavior of O(N^5) and O(N^6) for MP2 and MP3, respectively. The THC-DF technique can also be applied to other methods in electronic structure theory, such as coupled-cluster and configuration interaction, promising significant gains in computational efficiency and storage reduction.
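
    The following sketch illustrates the shape of the THC factorization with random placeholder factors: a four-index ERI-like tensor is assembled from five two-index matrices. The matrix names and sizes are illustrative assumptions, not values from the paper.

      import numpy as np

      rng = np.random.default_rng(2)
      n, M = 10, 30                      # basis functions, THC grid points

      X = rng.standard_normal((n, M))    # collocation factor matrix
      Z = rng.standard_normal((M, M))    # core coupling matrix

      # THC form: (pq|rs) ~ sum_PQ X[p,P] X[q,P] Z[P,Q] X[r,Q] X[s,Q],
      # i.e., a 4-index tensor assembled from five 2-index matrices
      eri = np.einsum('pP,qP,PQ,rQ,sQ->pqrs', X, X, Z, X, X)
      print(eri.shape)                   # (10, 10, 10, 10)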

  8. The thermodynamic efficiency of computations made in cells across the range of life

    NASA Astrophysics Data System (ADS)

    Kempes, Christopher P.; Wolpert, David; Cohen, Zachary; Pérez-Mercader, Juan

    2017-11-01

    Biological organisms must perform computation as they grow, reproduce and evolve. Moreover, ever since Landauer's bound was proposed, it has been known that all computation has some thermodynamic cost, and that the same computation can be achieved with greater or smaller thermodynamic cost depending on how it is implemented. Accordingly an important issue concerning the evolution of life is assessing the thermodynamic efficiency of the computations performed by organisms. This issue is interesting both from the perspective of how close life has come to maximally efficient computation (presumably under the pressure of natural selection), and from the practical perspective of what efficiencies we might hope that engineered biological computers might achieve, especially in comparison with current computational systems. Here we show that the computational efficiency of translation, defined as free energy expended per amino acid operation, outperforms the best supercomputers by several orders of magnitude, and is only about an order of magnitude worse than the Landauer bound. However, this efficiency depends strongly on the size and architecture of the cell in question. In particular, we show that the useful efficiency of an amino acid operation, defined as the bulk energy per amino acid polymerization, decreases for increasing bacterial size and converges to the polymerization cost of the ribosome. This cost of the largest bacteria does not change in cells as we progress through the major evolutionary shifts to both single- and multicellular eukaryotes. However, the rates of total computation per unit mass are non-monotonic in bacteria with increasing cell size, and also change across different biological architectures, including the shift from unicellular to multicellular eukaryotes. This article is part of the themed issue 'Reconceptualizing the origins of life'.
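
    For a sense of scale, the Landauer bound referenced above is straightforward to evaluate; the short computation below uses standard constants at an assumed temperature of 300 K.

      import math

      kB = 1.380649e-23          # Boltzmann constant, J/K
      T = 300.0                  # assumed temperature, K

      bound = kB * T * math.log(2)
      print(f"Landauer bound at {T:.0f} K: {bound:.3e} J per bit erased")
      # ~2.87e-21 J; the abstract's claim is that translation expends only about
      # an order of magnitude more free energy than this per amino acid operation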

  9. Aerosol Drug Delivery During Noninvasive Positive Pressure Ventilation: Effects of Intersubject Variability and Excipient Enhanced Growth

    PubMed Central

    Walenga, Ross L.; Kaviratna, Anubhav; Hindle, Michael

    2017-01-01

    Background: Nebulized aerosol drug delivery during the administration of noninvasive positive pressure ventilation (NPPV) is commonly implemented. While studies have shown improved patient outcomes for this therapeutic approach, aerosol delivery efficiency is reported to be low with high variability in lung-deposited dose. Excipient enhanced growth (EEG) aerosol delivery is a newly proposed technique that may improve drug delivery efficiency and reduce intersubject aerosol delivery variability when coupled with NPPV. Materials and Methods: A combined approach using in vitro experiments and computational fluid dynamics (CFD) was used to characterize aerosol delivery efficiency during NPPV in two new nasal cavity models that include face mask interfaces. Mesh nebulizer and in-line dry powder inhaler (DPI) sources of conventional and EEG aerosols were both considered. Results: Based on validated steady-state CFD predictions, EEG aerosol delivery improved lung penetration fraction (PF) values by factors ranging from 1.3 to 6.4 compared with conventional-sized aerosols. Furthermore, intersubject variability in lung PF was very high for conventional aerosol sizes (relative differences between subjects in the range of 54.5%–134.3%) and was reduced by an order of magnitude with the EEG approach (relative differences between subjects in the range of 5.5%–17.4%). Realistic in vitro experiments of cyclic NPPV demonstrated similar trends in lung delivery to those observed with the steady-state simulations, but with lower lung delivery efficiencies. Reaching the lung delivery efficiencies reported with the steady-state simulations of 80%–90% will require synchronization of aerosol administration during inspiration and reducing the size of the EEG aerosol delivery unit. Conclusions: The EEG approach enabled high-efficiency lung delivery of aerosols administered during NPPV and reduced intersubject aerosol delivery variability by an order of magnitude. Use of an in-line DPI device that connects to the NPPV mask appears to be a convenient method to rapidly administer an EEG aerosol and synchronize the delivery with inspiration. PMID:28075194

  10. Speech recognition for embedded automatic positioner for laparoscope

    NASA Astrophysics Data System (ADS)

    Chen, Xiaodong; Yin, Qingyun; Wang, Yi; Yu, Daoyin

    2014-07-01

    In this paper a novel speech recognition methodology based on the Hidden Markov Model (HMM) is proposed for an embedded Automatic Positioner for Laparoscope (APL), which includes a fixed-point ARM processor as its core. The APL system is designed to assist the doctor in laparoscopic surgery by implementing the doctor's vocal control of the laparoscope. Real-time response to voice commands demands an efficient speech recognition algorithm for the APL. In order to reduce computation cost without significant loss in recognition accuracy, both arithmetic and algorithmic optimizations are applied in the presented method. First, relying mostly on arithmetic optimizations, a fixed-point frontend for speech feature analysis is built according to the ARM processor's characteristics. Then a fast likelihood computation algorithm is used to reduce the computational complexity of the HMM-based recognition algorithm. The experimental results show that the method keeps the recognition time within 0.5 s while achieving accuracy higher than 99%, demonstrating its ability to achieve real-time vocal control of the APL.

  11. Design of k-Space Channel Combination Kernels and Integration with Parallel Imaging

    PubMed Central

    Beatty, Philip J.; Chang, Shaorong; Holmes, James H.; Wang, Kang; Brau, Anja C. S.; Reeder, Scott B.; Brittain, Jean H.

    2014-01-01

    Purpose: In this work, a new method is described for producing local k-space channel combination kernels using a small amount of low-resolution multichannel calibration data. Additionally, this work describes how these channel combination kernels can be combined with local k-space unaliasing kernels produced by the calibration phase of parallel imaging methods such as GRAPPA, PARS and ARC. Methods: Experiments were conducted to evaluate both the image quality and computational efficiency of the proposed method compared to a channel-by-channel parallel imaging approach with image-space sum-of-squares channel combination. Results: Results indicate comparable image quality overall, with some very minor differences seen in reduced field-of-view imaging. It was demonstrated that this method enables a speed up in computation time on the order of 3–16X for 32-channel data sets. Conclusion: The proposed method enables high quality channel combination to occur earlier in the reconstruction pipeline, reducing computational and memory requirements for image reconstruction. PMID:23943602

  12. Computing Bounds on Resource Levels for Flexible Plans

    NASA Technical Reports Server (NTRS)

    Muscettola, Nicola; Rijsman, David

    2009-01-01

    A new algorithm efficiently computes the tightest exact bound on the levels of resources induced by a flexible activity plan. Tightness of bounds is extremely important for computations involved in planning because tight bounds can save potentially exponential amounts of search (through early backtracking and detection of solutions), relative to looser bounds. The bound computed by the new algorithm, denoted the resource-level envelope, constitutes the measure of maximum and minimum consumption of resources at any time for all fixed-time schedules in the flexible plan. At each time, the envelope guarantees that there are two fixed-time instantiations: one that produces the minimum level and one that produces the maximum level. Therefore, the resource-level envelope is the tightest possible resource-level bound for a flexible plan because any tighter bound would exclude the contribution of at least one fixed-time schedule. If the resource-level envelope can be computed efficiently, one could substitute looser bounds that are currently used in the inner cores of constraint-posting scheduling algorithms, with the potential for great improvements in performance. What is needed to reduce the cost of computation is an algorithm, the measure of complexity of which is no greater than a low-degree polynomial in N (where N is the number of activities). The new algorithm satisfies this need. In this algorithm, the computation of resource-level envelopes is based on a novel combination of (1) the theory of shortest paths in the temporal-constraint network for the flexible plan and (2) the theory of maximum flows for a flow network derived from the temporal and resource constraints. The measure of asymptotic complexity of the algorithm is O(N · O(maxflow(N))), where O(x) denotes an amount of computing time or a number of arithmetic operations proportional to a number of the order of x, and O(maxflow(N)) is the measure of complexity (and thus of cost) of a maximum-flow algorithm applied to an auxiliary flow network of 2N nodes. The algorithm is believed to be efficient in practice; experimental analysis shows the practical cost of maxflow to be as low as O(N^1.5). The algorithm could be enhanced following at least two approaches. In the first approach, incremental subalgorithms for the computation of the envelope could be developed. By use of temporal scanning of the events in the temporal network, it may be possible to significantly reduce the size of the networks on which it is necessary to run the maximum-flow subalgorithm, thereby significantly reducing the time required for envelope calculation. In the second approach, the practical effectiveness of resource envelopes in the inner loops of search algorithms could be tested for multi-capacity resource scheduling. This testing would include inner-loop backtracking and termination tests and variable- and value-ordering heuristics that exploit the properties of resource envelopes more directly.
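
    The maximum-flow building block at the heart of the algorithm can be sketched with an off-the-shelf solver. The tiny network below is a hypothetical stand-in, not the paper's actual construction from temporal and resource constraints, and uses the networkx library.

      import networkx as nx

      # hypothetical tiny network: producer events feed consumer events subject
      # to precedence; edge capacities encode resource amounts
      G = nx.DiGraph()
      G.add_edge("s", "p1", capacity=3)   # event p1 produces 3 units
      G.add_edge("s", "p2", capacity=2)   # event p2 produces 2 units
      G.add_edge("p1", "c1", capacity=3)  # precedence allows c1 after p1
      G.add_edge("p2", "c1", capacity=2)
      G.add_edge("c1", "t", capacity=4)   # event c1 consumes 4 units

      flow_value, _ = nx.maximum_flow(G, "s", "t")
      print("maximum flow:", flow_value)  # 4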

  13. Control mechanism of double-rotator-structure ternary optical computer

    NASA Astrophysics Data System (ADS)

    Kai, SONG; Liping, YAN

    2017-03-01

    The double-rotator-structure ternary optical processor (DRSTOP) has two key characteristics, namely, giant data-bit parallel computing and a reconfigurable processor; it can handle thousands of data bits in parallel and can run much faster than electronic computers and other optical computing systems to date. In order to put DRSTOP into practical application, this paper establishes a series of methods, namely, a task classification method, a data-bits allocation method, a control information generation method, a control information formatting and sending method, and a decoded-results obtaining method. These methods form the control mechanism of DRSTOP. This control mechanism makes DRSTOP an automated computing platform. Compared with traditional calculation tools, the DRSTOP computing platform can ease the contradiction between high energy consumption and big-data computing by greatly reducing the cost of communications and I/O. Finally, the paper designs a set of experiments for the DRSTOP control mechanism to verify its feasibility and correctness. Experimental results showed that the control mechanism is correct, feasible and efficient.

  14. NASA HPCC Technology for Aerospace Analysis and Design

    NASA Technical Reports Server (NTRS)

    Schulbach, Catherine H.

    1999-01-01

    The Computational Aerosciences (CAS) Project is part of NASA's High Performance Computing and Communications Program. Its primary goal is to accelerate the availability of high-performance computing technology to the US aerospace community, thus providing the US aerospace community with key tools necessary to reduce design cycle times and increase fidelity in order to improve safety, efficiency and capability of future aerospace vehicles. A complementary goal is to hasten the emergence of a viable commercial market within the aerospace community to the advantage of the domestic computer hardware and software industry. The CAS Project selects representative aerospace problems (especially design) and uses them to focus efforts on advancing aerospace algorithms and applications, systems software, and computing machinery to demonstrate vast improvements in system performance and capability over the life of the program. Recent demonstrations have served to assess the benefits of possible performance improvements while reducing the risk of adopting high-performance computing technology. This talk will discuss past accomplishments in providing technology to the aerospace community, present efforts, and future goals. For example, the times to do full combustor and compressor simulations (of aircraft engines) have been reduced by factors of 320:1 and 400:1, respectively. While this has enabled new capabilities in engine simulation, the goal of an overnight, dynamic, multi-disciplinary, 3-dimensional simulation of an aircraft engine is still years away and will require new generations of high-end technology.

  15. Capture Efficiency of Biocompatible Magnetic Nanoparticles in Arterial Flow: A Computer Simulation for Magnetic Drug Targeting.

    PubMed

    Lunnoo, Thodsaphon; Puangmali, Theerapong

    2015-12-01

    The primary limitation of magnetic drug targeting (MDT) relates to the strength of an external magnetic field which decreases with increasing distance. Small nanoparticles (NPs) displaying superparamagnetic behaviour are also required in order to reduce embolization in the blood vessel. The small NPs, however, make it difficult to vector NPs and keep them in the desired location. The aims of this work were to investigate parameters influencing the capture efficiency of the drug carriers in mimicked arterial flow. In this work, we computationally modelled and evaluated capture efficiency in MDT with COMSOL Multiphysics 4.4. The studied parameters were (i) magnetic nanoparticle size, (ii) three classes of magnetic cores (Fe3O4, Fe2O3, and Fe), and (iii) the thickness of biocompatible coating materials (Au, SiO2, and PEG). It was found that the capture efficiency of small particles decreased with decreasing size and was less than 5 % for magnetic particles in the superparamagnetic regime. The thickness of non-magnetic coating materials did not significantly influence the capture efficiency of MDT. It was difficult to capture small drug carriers (D<200 nm) in the arterial flow. We suggest that the MDT with high-capture efficiency can be obtained in small vessels and low-blood velocities such as micro-capillary vessels.

  16. Reduced-order modeling with sparse polynomial chaos expansion and dimension reduction for evaluating the impact of CO2 and brine leakage on groundwater

    NASA Astrophysics Data System (ADS)

    Liu, Y.; Zheng, L.; Pau, G. S. H.

    2016-12-01

    A careful assessment of the risk associated with geologic CO2 storage is critical to the deployment of large-scale storage projects. While numerical modeling is an indispensable tool for risk assessment, there has been an increasing need to consider and address uncertainties in the numerical models. However, uncertainty analyses have been significantly hindered by the computational complexity of the model. As a remedy, reduced-order models (ROMs), which serve as computationally efficient surrogates for high-fidelity models (HFMs), have been employed. The ROM is constructed at the expense of an initial set of HFM simulations, and afterwards can be relied upon to predict the model output values at minimal cost. The ROM presented here is part of the National Risk Assessment Partnership (NRAP) and intends to predict the water quality change in groundwater in response to hypothetical CO2 and brine leakage. The HFM on which the ROM is based is a multiphase flow and reactive transport model, with a 3-D heterogeneous flow field and complex chemical reactions including aqueous complexation, mineral dissolution/precipitation, adsorption/desorption via surface complexation, and cation exchange. Reduced-order modeling techniques based on polynomial basis expansion, such as polynomial chaos expansion (PCE), are widely used in the literature. However, the accuracy of such ROMs can be affected by the sparse structure of the coefficients of the expansion. Failing to identify vanishing polynomial coefficients introduces unnecessary sampling errors, the accumulation of which deteriorates the accuracy of the ROMs. To address this issue, we treat the PCE as a sparse Bayesian learning (SBL) problem, and sparsity is obtained by detecting and including only the non-zero PCE coefficients one at a time, iteratively selecting the most contributing coefficients. The computational complexity of predicting the entire 3-D concentration fields is further mitigated by a dimension reduction procedure, proper orthogonal decomposition (POD). Our numerical results show that utilizing the sparse structure and POD significantly enhances the accuracy and efficiency of the ROMs, laying the basis for further analyses that necessitate a large number of model simulations.
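
    A greedy stand-in for the sparse coefficient selection described above can be sketched with orthogonal matching pursuit, which likewise adds the most contributing PCE terms one at a time; the abstract's sparse Bayesian learning formulation and the POD step are not reproduced, and the problem sizes and synthetic model are assumptions.

      import numpy as np
      from numpy.polynomial.hermite_e import hermeval
      from sklearn.linear_model import OrthogonalMatchingPursuit

      rng = np.random.default_rng(3)
      N, P = 200, 15                     # samples, candidate PCE terms (1-D demo)
      xi = rng.standard_normal(N)        # standard normal germ

      # design matrix of probabilists' Hermite polynomials He_0 .. He_{P-1}
      Psi = np.column_stack([hermeval(xi, np.eye(P)[k]) for k in range(P)])

      # synthetic output that is genuinely sparse in the PCE basis
      y = 2.0 * Psi[:, 1] - 0.7 * Psi[:, 4] + 0.01 * rng.standard_normal(N)

      omp = OrthogonalMatchingPursuit(n_nonzero_coefs=3, fit_intercept=False)
      omp.fit(Psi, y)
      print("selected PCE terms:", np.nonzero(omp.coef_)[0])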

  17. Machine learning to construct reduced-order models and scaling laws for reactive-transport applications

    NASA Astrophysics Data System (ADS)

    Mudunuru, M. K.; Karra, S.; Vesselinov, V. V.

    2017-12-01

    The efficiency of many hydrogeological applications such as reactive-transport and contaminant remediation depends largely on the macroscopic mixing occurring in the aquifer. In the case of remediation activities, it is fundamental to enhance and control mixing through the structure of the flow field, which is shaped by groundwater pumping/extraction, heterogeneity, and anisotropy of the flow medium. However, the relative importance of these hydrogeological parameters for understanding the mixing process is not well studied. This is partially because understanding and quantifying mixing requires multiple runs of high-fidelity numerical simulations for various subsurface model inputs. Typically, high-fidelity simulations of existing subsurface models take hours to complete on several thousands of processors. As a result, they may not be feasible for studying the importance and impact of model inputs on mixing. Hence, there is a pressing need to develop computationally efficient models to accurately predict the desired QoIs for remediation and reactive-transport applications. An attractive way to construct computationally efficient models is through reduced-order modeling using machine learning. These approaches can substantially improve our capabilities to model and predict the remediation process. Reduced-Order Models (ROMs) are similar to analytical solutions or lookup tables; however, the way in which ROMs are constructed is different. Here, we present a physics-informed machine learning (ML) framework to construct ROMs based on high-fidelity numerical simulations. First, random forests, the F-test, and mutual information are used to evaluate the importance of model inputs. Second, support vector machines (SVMs) are used to construct ROMs based on these inputs. These ROMs are then used to understand mixing under perturbed vortex flows. Finally, we construct scaling laws for certain important QoIs such as degree of mixing and product yield. The dependence of scaling-law parameters on model inputs is evaluated using cluster analysis. We demonstrate application of the developed method to model analyses of reactive transport and contaminant remediation at the Los Alamos National Laboratory (LANL) chromium contamination sites. The developed method is directly applicable to analyses of alternative site remediation scenarios.
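
    A minimal sketch of the two-stage workflow described above (importance ranking, then an SVM-based ROM) on synthetic data; the input names and the 0.05 importance threshold are illustrative assumptions, not values from the study.

      import numpy as np
      from sklearn.ensemble import RandomForestRegressor
      from sklearn.svm import SVR

      rng = np.random.default_rng(4)
      # hypothetical inputs: e.g., permeability, pumping rate, anisotropy, porosity
      X = rng.uniform(0.0, 1.0, size=(500, 4))
      y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1] + 0.02 * rng.standard_normal(500)

      # step 1: rank the importance of model inputs with a random forest
      rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
      print("input importances:", rf.feature_importances_.round(3))

      # step 2: fit an SVM-based ROM using only the inputs that matter
      keep = rf.feature_importances_ > 0.05          # illustrative threshold
      rom = SVR(C=10.0, epsilon=0.01).fit(X[:, keep], y)
      print("ROM R^2 on training data:", round(rom.score(X[:, keep], y), 3))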

  18. Fast restoration approach for motion blurred image based on deconvolution under the blurring paths

    NASA Astrophysics Data System (ADS)

    Shi, Yu; Song, Jie; Hua, Xia

    2015-12-01

    For real-time motion deblurring, it is of utmost importance to achieve a higher processing speed at about the same image quality. This paper presents a fast Richardson-Lucy motion deblurring approach that removes motion blur by rotating the blurred image along its blurring paths. The computational time is thereby reduced sharply by using the one-dimensional Fast Fourier Transform within a one-dimensional Richardson-Lucy method. In order to obtain accurate transformation results, an interpolation method is incorporated to fetch the gray values. Experimental results demonstrate that the proposed approach is efficient and effective in reducing motion blur along the blurring paths.
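
    A bare-bones version of the 1-D FFT-accelerated Richardson-Lucy iteration applied to one row of a path-rotated image might look as follows; the rotation and interpolation steps from the abstract are omitted, and the 9-sample linear motion blur kernel is an assumption.

      import numpy as np

      def fft_conv(a, k):
          # circular convolution via the 1-D FFT (the speed source in the abstract)
          return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(k)))

      def richardson_lucy_1d(blurred, psf, iters=30):
          est = np.full_like(blurred, blurred.mean())
          psf_adj = np.roll(psf[::-1], 1)   # adjoint kernel for circular conv
          for _ in range(iters):
              ratio = blurred / np.maximum(fft_conv(est, psf), 1e-12)
              est *= fft_conv(ratio, psf_adj)
          return est

      # one row of a path-rotated image with an assumed 9-sample motion blur
      x = np.zeros(128); x[40:45] = 1.0
      psf = np.zeros(128); psf[:9] = 1.0 / 9.0
      restored = richardson_lucy_1d(fft_conv(x, psf), psf)
      print("peak of restored row:", float(restored.max()))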

  19. Rail Mounted Gantry Crane Scheduling Optimization in Railway Container Terminal Based on Hybrid Handling Mode

    PubMed Central

    Zhu, Xiaoning

    2014-01-01

    Rail mounted gantry crane (RMGC) scheduling is important in reducing the makespan of handling operations and improving container handling efficiency. In this paper, we present an RMGC scheduling optimization model whose objective is to determine an optimized handling sequence that minimizes RMGC idle load time across handling tasks. An ant colony optimization algorithm is proposed to obtain near-optimal solutions. Computational experiments on a specific railway container terminal are conducted to illustrate the proposed model and solution algorithm. The results show that the proposed method is effective in reducing the idle load time of the RMGC. PMID:25538768

  20. Minimizing Dispersion in FDTD Methods with CFL Limit Extension

    NASA Astrophysics Data System (ADS)

    Sun, Chen

    The CFL extension in FDTD methods is receiving considerable attention as a way to reduce computational effort and save simulation time. One of the major issues in CFL extension methods is the increased dispersion. We formulate a decomposition of the FDTD equations to study the behaviour of the dispersion, and construct and propose a compensation scheme to reduce the dispersion under CFL extension. We further study the CFL extension in an FDTD subgridding case, where we improve the accuracy by acting only on the FDTD equations of the fine grid. Numerical results confirm the efficiency of the proposed method for minimising dispersion.

  1. Comparison of Mixing Characteristics for Several Fuel Injectors on an Open Plate and in a Ducted Flowpath Configuration at Hypervelocity Flow Conditions

    NASA Technical Reports Server (NTRS)

    Drozda, Tomasz G.; Shenoy, Rajiv R.; Passe, Bradley J.; Baurle, Robert A.; Drummond, J. Philip

    2017-01-01

    In order to reduce the cost and complexity associated with fuel injection and mixing experiments for high-speed flows, and to further enable optical access to the test section for nonintrusive diagnostics, the Enhanced Injection and Mixing Project (EIMP) utilizes an open flat plate configuration to characterize inert mixing properties of various fuel injectors for hypervelocity applications. The experiments also utilize reduced total temperature conditions to alleviate the need for hardware cooling. The use of "cold" flows and non-reacting mixtures for mixing experiments is not new, and has been extensively utilized as a screening technique for scramjet fuel injectors. The impact of reduced facility-air total temperature, and the use of inert fuel simulants, such as helium, on the mixing character of the flow has been assessed in previous numerical studies by the authors. Mixing performance was characterized for three different injectors: a strut, a ramp, and a flushwall. The present study focuses on the impact of using an open plate to approximate mixing in the duct. Toward this end, Reynolds-averaged simulations (RAS) were performed for the three fuel injectors in an open plate configuration and in a duct. The mixing parameters of interest, such as mixing efficiency and total pressure recovery, are then computed and compared for the two configurations. In addition to mixing efficiency and total pressure recovery, the combustion efficiency and thrust potential are also computed for the reacting simulations.

  2. Auxiliary Parameter MCMC for Exponential Random Graph Models

    NASA Astrophysics Data System (ADS)

    Byshkin, Maksym; Stivala, Alex; Mira, Antonietta; Krause, Rolf; Robins, Garry; Lomi, Alessandro

    2016-11-01

    Exponential random graph models (ERGMs) are a well-established family of statistical models for analyzing social networks. Computational complexity has so far limited the appeal of ERGMs for the analysis of large social networks. Efficient computational methods are highly desirable in order to extend the empirical scope of ERGMs. In this paper we report results of a research project on the development of snowball sampling methods for ERGMs. We propose an auxiliary parameter Markov chain Monte Carlo (MCMC) algorithm for sampling from the relevant probability distributions. The method is designed to decrease the number of allowed network states without worsening the mixing of the Markov chains, and suggests a new approach for the development of MCMC samplers for ERGMs. We demonstrate the method on both simulated and actual (empirical) network data and show that it reduces CPU time for parameter estimation by an order of magnitude compared to current MCMC methods.

  3. An efficient formulation of robot arm dynamics for control and computer simulation

    NASA Astrophysics Data System (ADS)

    Lee, C. S. G.; Nigam, R.

    This paper describes an efficient formulation of the dynamic equations of motion of industrial robots based on the Lagrange formulation of d'Alembert's principle. This formulation, as applied to a PUMA robot arm, results in a set of closed form second order differential equations with cross product terms. They are not as efficient in computation as those formulated by the Newton-Euler method, but provide a better analytical model for control analysis and computer simulation. Computational complexities of this dynamic model together with other models are tabulated for discussion.

  4. Higher Order Time Integration Schemes for the Unsteady Navier-Stokes Equations on Unstructured Meshes

    NASA Technical Reports Server (NTRS)

    Jothiprasad, Giridhar; Mavriplis, Dimitri J.; Caughey, David A.

    2002-01-01

    The rapid increase in available computational power over the last decade has enabled higher resolution flow simulations and more widespread use of unstructured grid methods for complex geometries. While much of this effort has been focused on steady-state calculations in the aerodynamics community, the need to accurately predict off-design conditions, which may involve substantial amounts of flow separation, points to the need to efficiently simulate unsteady flow fields. Accurate unsteady flow simulations can easily require several orders of magnitude more computational effort than a corresponding steady-state simulation. For this reason, techniques for improving the efficiency of unsteady flow simulations are required in order to make such calculations feasible in the foreseeable future. The purpose of this work is to investigate possible reductions in computer time due to the choice of an efficient time-integration scheme from a series of schemes differing in the order of time-accuracy, and by the use of more efficient techniques to solve the nonlinear equations which arise while using implicit time-integration schemes. This investigation is carried out in the context of a two-dimensional unstructured mesh laminar Navier-Stokes solver.

  5. Approximation for discrete Fourier transform and application in study of three-dimensional interacting electron gas.

    PubMed

    Yan, Xin-Zhong

    2011-07-01

    The discrete Fourier transform is approximated by summing over part of the terms with corresponding weights. The approximation significantly reduces the computer memory storage requirement and enhances numerical computation efficiency by several orders of magnitude without losing accuracy. As an example, we apply the algorithm to study the three-dimensional interacting electron gas under the renormalized-ring-diagram approximation, where the Green's function needs to be self-consistently solved. We present the results for the chemical potential, compressibility, free energy, entropy, and specific heat of the system. The ground-state energy obtained by the present calculation is compared with the existing results of Monte Carlo simulation and random-phase approximation.
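
    The flavor of the approximation can be sketched as follows: for a smooth, decaying sequence, a strided partial sum with compensating weights can stand in for the full transform at low frequencies. This is a generic illustration, not the paper's actual weighting scheme.

      import numpy as np

      N = 4096
      n = np.arange(N)
      f = np.exp(-n / 300.0)             # smooth, decaying sequence

      m = 2                               # a low-frequency DFT bin
      full = np.sum(f * np.exp(-2j * np.pi * m * n / N))

      k = 8                               # keep every k-th term, weight by k
      idx = n[::k]
      partial = k * np.sum(f[idx] * np.exp(-2j * np.pi * m * idx / N))

      print("full |F_m|:   ", round(abs(full), 3))
      print("partial |F_m|:", round(abs(partial), 3), f"({N // k} terms)")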

  6. Nonlinear Reduced-Order Analysis with Time-Varying Spatial Loading Distributions

    NASA Technical Reports Server (NTRS)

    Przekop, Adam

    2008-01-01

    Oscillating shocks acting in combination with high-intensity acoustic loadings present a challenge to the design of resilient hypersonic flight vehicle structures. This paper addresses some features of this loading condition and certain aspects of a nonlinear reduced-order analysis with emphasis on system identification leading to formation of a robust modal basis. The nonlinear dynamic response of a composite structure subject to the simultaneous action of locally strong oscillating pressure gradients and high-intensity acoustic loadings is considered. The reduced-order analysis used in this work has been previously demonstrated to be both computationally efficient and accurate for time-invariant spatial loading distributions, provided that an appropriate modal basis is used. The challenge of the present study is to identify a suitable basis for loadings with time-varying spatial distributions. Using a proper orthogonal decomposition and modal expansion, it is shown that such a basis can be developed. The basis is made more robust by incrementally expanding it to account for changes in the location, frequency and span of the oscillating pressure gradient.

  7. A power-efficient communication system between brain-implantable devices and external computers.

    PubMed

    Yao, Ning; Lee, Heung-No; Chang, Cheng-Chun; Sclabassi, Robert J; Sun, Mingui

    2007-01-01

    In this paper, we propose a power-efficient communication system for linking a brain-implantable device to an external system. For battery-powered implantable devices, the processor and the transmitter power should be reduced in order to both conserve battery power and reduce the health risks associated with transmission. To accomplish this, a joint source-channel coding/decoding system is devised. Low-density generator matrix (LDGM) codes are used in our system due to their low encoding complexity. The power cost for signal processing within the implantable device is greatly reduced by avoiding explicit source encoding. Raw data, which is highly correlated, is transmitted. At the receiver, a Markov chain source correlation model is utilized to approximate and capture the correlation of the raw data. A turbo iterative receiver algorithm is designed which connects the Markov chain source model to the LDGM decoder in a turbo-iterative way. Simulation results show that the proposed system can save 1 to 2.5 dB of transmission power.

  8. Efficient combination of a 3D Quasi-Newton inversion algorithm and a vector dual-primal finite element tearing and interconnecting method

    NASA Astrophysics Data System (ADS)

    Voznyuk, I.; Litman, A.; Tortel, H.

    2015-08-01

    A Quasi-Newton method for reconstructing the constitutive parameters of three-dimensional (3D) penetrable scatterers from scattered field measurements is presented. This method is adapted for handling large-scale electromagnetic problems while keeping the memory requirement and the time flexibility as low as possible. The forward scattering problem is solved by applying the finite-element tearing and interconnecting full-dual-primal (FETI-FDP2) method, which shares the same spirit as domain decomposition methods for finite element methods. The idea is to split the computational domain into smaller non-overlapping sub-domains in order to simultaneously solve local sub-problems. Various strategies are proposed in order to efficiently couple the inversion algorithm with the FETI-FDP2 method: a separation into permanent and non-permanent sub-domains is performed, iterative solvers are favored for resolving the interface problem, and a marching-on-in-anything initial guess selection further accelerates the process. The computational burden is also reduced by applying the adjoint state vector methodology. Finally, the inversion algorithm is tested against measurements extracted from the 3D Fresnel database.

  9. A time-space domain stereo finite difference method for 3D scalar wave propagation

    NASA Astrophysics Data System (ADS)

    Chen, Yushu; Yang, Guangwen; Ma, Xiao; He, Conghui; Song, Guojie

    2016-11-01

    The time-space domain finite difference methods reduce numerical dispersion effectively by minimizing the error in the joint time-space domain. However, their interpolating coefficients are related to the Courant numbers, leading to significant extra time cost for loading the coefficients consecutively according to velocity in heterogeneous models. In the present study, we develop a time-space domain stereo finite difference (TSSFD) method for the 3D scalar wave equation. The method propagates both the displacements and their gradients simultaneously to keep more information of the wavefields, and minimizes the maximum phase velocity error directly using constant interpolation coefficients for different Courant numbers. We obtain the optimal constant coefficients by combining the truncated Taylor series approximation and the time-space domain optimization, and adjust the coefficients to improve the stability condition. Subsequent investigation shows that the TSSFD can suppress numerical dispersion effectively with high computational efficiency. The maximum phase velocity error of the TSSFD is just 3.09% even with only 2 sampling points per minimum wavelength when the Courant number is 0.4. Numerical experiments show that to generate wavefields with no visible numerical dispersion, the computational efficiency of the TSSFD is 576.9%, 193.5%, 699.0%, and 191.6% of that of the 4th-order Lax-Wendroff correction (LWC) method, the 8th-order LWC method, the 4th-order staggered grid (SG) method, and the 8th-order optimal finite difference (OFD) method, respectively. Meanwhile, the TSSFD is compatible with the unsplit convolutional perfectly matched layer (CPML) boundary condition for absorbing artificial boundaries. The efficiency and capability to handle complex velocity models make it an attractive tool in imaging methods such as acoustic reverse time migration (RTM).

  10. Solving large-scale PDE-constrained Bayesian inverse problems with Riemann manifold Hamiltonian Monte Carlo

    NASA Astrophysics Data System (ADS)

    Bui-Thanh, T.; Girolami, M.

    2014-11-01

    We consider the Riemann manifold Hamiltonian Monte Carlo (RMHMC) method for solving statistical inverse problems governed by partial differential equations (PDEs). The Bayesian framework is employed to cast the inverse problem into the task of statistical inference whose solution is the posterior distribution in infinite dimensional parameter space conditional upon observation data and Gaussian prior measure. We discretize both the likelihood and the prior using the H1-conforming finite element method together with a matrix transfer technique. The power of the RMHMC method is that it exploits the geometric structure induced by the PDE constraints of the underlying inverse problem. Consequently, each RMHMC posterior sample is almost uncorrelated/independent from the others providing statistically efficient Markov chain simulation. However this statistical efficiency comes at a computational cost. This motivates us to consider computationally more efficient strategies for RMHMC. At the heart of our construction is the fact that for Gaussian error structures the Fisher information matrix coincides with the Gauss-Newton Hessian. We exploit this fact in considering a computationally simplified RMHMC method combining state-of-the-art adjoint techniques and the superiority of the RMHMC method. Specifically, we first form the Gauss-Newton Hessian at the maximum a posteriori point and then use it as a fixed constant metric tensor throughout RMHMC simulation. This eliminates the need for the computationally costly differential geometric Christoffel symbols, which in turn greatly reduces computational effort at a corresponding loss of sampling efficiency. We further reduce the cost of forming the Fisher information matrix by using a low rank approximation via a randomized singular value decomposition technique. This is efficient since a small number of Hessian-vector products are required. The Hessian-vector product in turn requires only two extra PDE solves using the adjoint technique. Various numerical results up to 1025 parameters are presented to demonstrate the ability of the RMHMC method in exploring the geometric structure of the problem to propose (almost) uncorrelated/independent samples that are far away from each other, and yet the acceptance rate is almost unity. The results also suggest that for the PDE models considered the proposed fixed metric RMHMC can attain almost as high a quality performance as the original RMHMC, i.e. generating (almost) uncorrelated/independent samples, while being two orders of magnitude less computationally expensive.
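
    The low-rank step mentioned above can be sketched with a standard randomized SVD; here a dense toy matrix stands in for the Gauss-Newton Hessian, whose action would in practice be supplied by adjoint PDE solves.

      import numpy as np

      def randomized_low_rank(matvec, n, rank, oversample=10, seed=0):
          # randomized range finder + small SVD: only rank+oversample
          # matrix-vector products are needed, each of which (per the abstract)
          # costs two extra PDE solves via the adjoint technique
          rng = np.random.default_rng(seed)
          Omega = rng.standard_normal((n, rank + oversample))
          Y = np.column_stack([matvec(Omega[:, j]) for j in range(Omega.shape[1])])
          Q, _ = np.linalg.qr(Y)
          B = np.column_stack([matvec(Q[:, j]) for j in range(Q.shape[1])])
          U_small, s, _ = np.linalg.svd(Q.T @ B)
          return Q @ U_small[:, :rank], s[:rank]

      # dense toy matrix with a fast-decaying spectrum stands in for the Hessian
      n = 400
      rng = np.random.default_rng(6)
      V, _ = np.linalg.qr(rng.standard_normal((n, n)))
      H = V @ np.diag(2.0 ** -np.arange(n, dtype=float)) @ V.T
      U, s = randomized_low_rank(lambda v: H @ v, n, rank=20)
      print("leading approximate eigenvalues:", s[:5].round(4))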

  11. Model-order reduction of lumped parameter systems via fractional calculus

    NASA Astrophysics Data System (ADS)

    Hollkamp, John P.; Sen, Mihir; Semperlotti, Fabio

    2018-04-01

    This study investigates the use of fractional order differential models to simulate the dynamic response of non-homogeneous discrete systems and to achieve efficient and accurate model order reduction. The traditional integer order approach to the simulation of non-homogeneous systems dictates the use of numerical solutions and often imposes stringent compromises between accuracy and computational performance. Fractional calculus provides an alternative approach where complex dynamical systems can be modeled with compact fractional equations that not only can still guarantee analytical solutions, but can also enable high levels of order reduction without compromising on accuracy. Different approaches are explored in order to transform the integer order model into a reduced order fractional model able to match the dynamic response of the initial system. Analytical and numerical results show that, under certain conditions, an exact match is possible and the resulting fractional differential models have both a complex and frequency-dependent order of the differential operator. The implications of this type of approach for both model order reduction and model synthesis are discussed.
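
    For reference, a common starting point for such models is the Caputo fractional derivative of order alpha; the study's reduced models generalize the order to complex, frequency-dependent values, but the standard real-order operator reads, in LaTeX notation:

      {}^{C}D_t^{\alpha} u(t)
        = \frac{1}{\Gamma(1-\alpha)} \int_0^t \frac{u'(\tau)}{(t-\tau)^{\alpha}} \, \mathrm{d}\tau,
        \qquad 0 < \alpha < 1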

  12. Aerodynamic design optimization using sensitivity analysis and computational fluid dynamics

    NASA Technical Reports Server (NTRS)

    Baysal, Oktay; Eleshaky, Mohamed E.

    1991-01-01

    A new and efficient method is presented for aerodynamic design optimization, which is based on a computational fluid dynamics (CFD)-sensitivity analysis algorithm. The method is applied to design a scramjet-afterbody configuration for an optimized axial thrust. The Euler equations are solved for the inviscid analysis of the flow, which in turn provides the objective function and the constraints. The CFD analysis is then coupled with the optimization procedure that uses a constrained minimization method. The sensitivity coefficients, i.e., gradients of the objective function and the constraints, needed for the optimization are obtained using a quasi-analytical method rather than the traditional brute-force method of finite difference approximations. During the one-dimensional search of the optimization procedure, an approximate flow analysis (predicted flow) based on a first-order Taylor series expansion is used to reduce the computational cost. Finally, the sensitivity of the optimum objective function to various design parameters, which are kept constant during the optimization, is computed to predict new optimum solutions. The flow analyses of the demonstrative example are compared with the experimental data. It is shown that the method is more efficient than the traditional methods.

  13. Sub-domain methods for collaborative electromagnetic computations

    NASA Astrophysics Data System (ADS)

    Soudais, Paul; Barka, André

    2006-06-01

    In this article, we describe a sub-domain method for electromagnetic computations based on the boundary element method. The benefits of the sub-domain method are that the computation can be split between several companies for collaborative studies, and that the computation time can be reduced by one or more orders of magnitude, especially in the context of parametric studies. The accuracy and efficiency of this technique are assessed by RCS computations on an aircraft air intake with duct and rotating engine mock-up called CHANNEL. Collaborative results, obtained by combining two sets of sub-domains computed by two companies, are compared with measurements on the CHANNEL mock-up. The comparisons are made for several angular positions of the engine to show the benefits of the method for parametric studies. We also discuss the accuracy of two formulations of the sub-domain connecting scheme using edge-based or modal field expansion.

  14. Reduced-Order Modeling for Flutter/LCO Using Recurrent Artificial Neural Network

    NASA Technical Reports Server (NTRS)

    Yao, Weigang; Liou, Meng-Sing

    2012-01-01

    The present study demonstrates the efficacy of a recurrent artificial neural network to provide a high-fidelity time-dependent nonlinear reduced-order model (ROM) for flutter/limit-cycle oscillation (LCO) modeling. An artificial neural network is a relatively straightforward nonlinear method for modeling an input-output relationship from a set of known data, for which we use the radial basis function (RBF) with its parameters determined through a training process. The resulting RBF neural network, however, is only static and is not yet adequate for application to problems of a dynamic nature. The recurrent neural network method [1] is applied to construct a reduced-order model from a series of high-fidelity time-dependent aeroelastic simulation data. Once the RBF neural network ROM is constructed properly, an accurate approximate solution can be obtained at a fraction of the cost of a full-order computation. The method derived during the study has been validated for predicting nonlinear aerodynamic forces in transonic flow and is capable of accurate flutter/LCO simulations. The obtained results indicate that the present recurrent RBF neural network is accurate and efficient for nonlinear aeroelastic system analysis.
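
    A static radial basis function fit of an input-output map, the building block the abstract extends with recurrence, can be sketched as follows; the two-dimensional inputs and tanh test function are placeholders, and the feedback of delayed outputs that makes the network recurrent is omitted.

      import numpy as np

      def rbf_design(X, centers, width):
          # Gaussian RBF design matrix Phi[i, j] = exp(-||x_i - c_j||^2 / 2w^2)
          d2 = ((X[:, None, :] - centers[None, :, :])**2).sum(axis=2)
          return np.exp(-d2 / (2.0 * width**2))

      rng = np.random.default_rng(7)
      X = rng.uniform(-1.0, 1.0, (200, 2))      # e.g., pitch/plunge states
      y = np.tanh(2.0 * X[:, 0]) * X[:, 1]      # stand-in aerodynamic response

      centers = X[rng.choice(200, size=25, replace=False)]
      Phi = rbf_design(X, centers, width=0.4)
      w, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # train the output weights

      rmse = np.sqrt(np.mean((Phi @ w - y) ** 2))
      print("training RMSE:", round(float(rmse), 4))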

  15. Facilitating NASA Earth Science Data Processing Using Nebula Cloud Computing

    NASA Technical Reports Server (NTRS)

    Pham, Long; Chen, Aijun; Kempler, Steven; Lynnes, Christopher; Theobald, Michael; Asghar, Esfandiari; Campino, Jane; Vollmer, Bruce

    2011-01-01

    Cloud Computing has been implemented in several commercial arenas. The NASA Nebula Cloud Computing platform is an Infrastructure as a Service (IaaS) built in 2008 at NASA Ames Research Center and in 2010 at GSFC. Nebula is an open-source Cloud platform intended to: a) make NASA realize significant cost savings through efficient resource utilization, reduced energy consumption, and reduced labor costs; b) provide an easier way for NASA scientists and researchers to efficiently explore and share large and complex data sets; and c) allow customers to provision, manage, and decommission computing capabilities on an as-needed basis.

  16. Identification of reduced-order thermal therapy models using thermal MR images: theory and validation.

    PubMed

    Niu, Ran; Skliar, Mikhail

    2012-07-01

    In this paper, we develop and validate a method to identify computationally efficient site- and patient-specific models of ultrasound thermal therapies from MR thermal images. The models of the specific absorption rate of the transduced energy and the temperature response of the therapy target are identified in the reduced basis of the proper orthogonal decomposition of thermal images, acquired in response to a mild thermal test excitation. The method permits dynamic re-identification of the treatment models during the therapy by recursively utilizing newly acquired images. Such adaptation is particularly important during high-temperature therapies, which are known to substantially and rapidly change tissue properties and blood perfusion. The developed theory was validated for the case of focused ultrasound heating of a tissue phantom. The experimental and computational results indicate that the developed approach produces accurate low-dimensional treatment models despite temporal and spatial noise in MR images and a slow image acquisition rate.
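
    The reduced-basis identification step can be sketched as a POD of a snapshot matrix whose columns are vectorized thermal images; the synthetic "images" and the 99% energy cutoff below are assumptions for illustration.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    n_pixels, n_frames = 4096, 60
    x = np.linspace(0, 1, n_pixels)
    # Synthetic heating response: two smooth spatial modes with slow weights, plus noise
    snapshots = (np.outer(np.exp(-((x - 0.5) ** 2) / 0.01), 1 - np.exp(-0.1 * np.arange(n_frames)))
                 + 0.2 * np.outer(np.sin(np.pi * x), np.exp(-0.05 * np.arange(n_frames)))
                 + 0.01 * rng.standard_normal((n_pixels, n_frames)))   # imaging noise

    # POD: left singular vectors of the snapshot matrix form the reduced basis
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(energy, 0.99) + 1)     # modes capturing 99% of the energy
    basis = U[:, :r]

    # Low-dimensional temperature coordinates; new images can be reduced the same way
    a = basis.T @ snapshots
    print(f"retained {r} of {n_frames} modes; reconstruction error "
          f"{np.linalg.norm(basis @ a - snapshots) / np.linalg.norm(snapshots):.3e}")
    ```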

  17. Data-driven Applications for the Sun-Earth System

    NASA Astrophysics Data System (ADS)

    Kondrashov, D. A.

    2016-12-01

    Advances in observational and data mining techniques allow extracting information from the large volume of Sun-Earth observational data that can be assimilated into first-principles physical models. However, equations governing Sun-Earth phenomena are typically nonlinear, complex, and high-dimensional. The high computational demand of solving the full governing equations over a large range of scales precludes the use of a variety of useful assimilative tools that rely on applied mathematical and statistical techniques for quantifying uncertainty and predictability. Effective use of such tools requires the development of computationally efficient methods to facilitate fusion of data with models. This presentation will provide an overview of various existing as well as newly developed data-driven techniques adopted from atmospheric and oceanic sciences that have proved useful for space physics applications, such as a computationally efficient implementation of the Kalman filter in radiation belt modeling, solar wind gap-filling by Singular Spectrum Analysis, and a low-rank procedure for assimilation of low-altitude ionospheric magnetic perturbations into the Lyon-Fedder-Mobarry (LFM) global magnetospheric model. Reduced-order non-Markovian inverse modeling and novel data-adaptive decompositions of Sun-Earth datasets will also be demonstrated.

  18. Modeling of the WSTF frictional heating apparatus in high pressure systems

    NASA Technical Reports Server (NTRS)

    Skowlund, Christopher T.

    1992-01-01

    In order to develop a computer program able to model the frictional heating of metals in high-pressure oxygen or nitrogen, a number of additions have been made to the frictional heating model originally developed for tests in low-pressure helium. These additions include: (1) a physical property package for the gases to account for departures from the ideal gas state; (2) two methods for spatial discretization (finite differences with quadratic interpolation, or orthogonal collocation on finite elements) which substantially reduce the computer time required to solve the transient heat balance; (3) more efficient programs for the integration of the ordinary differential equations resulting from the discretization of the partial differential equations; and (4) two methods for determining the best-fit parameters via minimization of the mean square error (either a direct-search multivariable simplex method or a modified Levenberg-Marquardt algorithm). The resulting computer program has been shown to be accurate, efficient and robust for determining the heat flux or friction coefficient vs. time at the interface of the stationary and rotating samples.

  19. Spectral-based propagation schemes for time-dependent quantum systems with application to carbon nanotubes

    NASA Astrophysics Data System (ADS)

    Chen, Zuojing; Polizzi, Eric

    2010-11-01

    Effective modeling and numerical spectral-based propagation schemes are proposed for addressing the challenges in time-dependent quantum simulations of systems ranging from atoms, molecules, and nanostructures to emerging nanoelectronic devices. While time-dependent Hamiltonian problems can be formally solved by propagating the solutions along tiny simulation time steps, a direct numerical treatment is often considered too computationally demanding. In this paper, however, we propose to go beyond these limitations by introducing high-performance numerical propagation schemes to compute the solution of the time-ordered evolution operator. In addition to the direct Hamiltonian diagonalizations that can be efficiently performed using the new eigenvalue solver FEAST, we have designed a Gaussian propagation scheme and a basis-transformed propagation scheme (BTPS) which considerably reduce the simulation time needed per time interval. It is outlined that BTPS offers the best computational efficiency, allowing new perspectives in time-dependent simulations. Finally, these numerical schemes are applied to study the ac response of a (5,5) carbon nanotube within a three-dimensional real-space mesh framework.
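
    A minimal sketch of a diagonalization-based propagation step, assuming a small random Hermitian matrix in place of the device Hamiltonian and numpy's dense eigensolver in place of FEAST.

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    n, dt, steps = 64, 0.01, 200
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    H = (A + A.conj().T) / 2                      # Hermitian "Hamiltonian"

    # Diagonalize once; the short-time propagator is U = V exp(-i*Lambda*dt) V^H
    vals, V = np.linalg.eigh(H)
    U = V @ np.diag(np.exp(-1j * vals * dt)) @ V.conj().T

    psi = np.zeros(n, complex)
    psi[0] = 1.0                                  # initial state
    for _ in range(steps):
        psi = U @ psi                             # one time step per multiplication

    print("norm preserved:", abs(np.linalg.norm(psi) - 1.0) < 1e-10)
    ```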

  20. An efficient implementation of a high-order filter for a cubed-sphere spectral element model

    NASA Astrophysics Data System (ADS)

    Kang, Hyun-Gyu; Cheong, Hyeong-Bin

    2017-03-01

    A parallel-scalable, isotropic, scale-selective spatial filter was developed for the cubed-sphere spectral element model on the sphere. The filter equation is a high-order elliptic (Helmholtz) equation based on the spherical Laplacian operator, which is transformed into cubed-sphere local coordinates. The Laplacian operator is discretized on the computational domain, i.e., on each cell, by the spectral element method with Gauss-Lobatto Lagrange interpolating polynomials (GLLIPs) as the orthogonal basis functions. On the global domain, the discrete filter equation yields a linear system represented by a highly sparse matrix. The density of this matrix increases quadratically (linearly) with the order of the GLLIP (order of the filter), and the linear system is solved in O(Ng) operations, where Ng is the total number of grid points. The solution, obtained by a row reduction method, demonstrated the typical accuracy and convergence rate of the cubed-sphere spectral element method. To achieve computational efficiency on parallel computers, the linear system was treated by an inverse matrix method (a sparse matrix-vector multiplication). The density of the inverse matrix was lowered to only a few times that of the original sparse matrix without degrading the accuracy of the solution. For better computational efficiency, a local-domain high-order filter was introduced: the filter equation is applied to multiple cells, and only the central cell is then used to reconstruct the filtered field. The parallel efficiency of applying the inverse matrix method to the global- and local-domain filter was evaluated by the scalability on a distributed-memory parallel computer. The scale-selective performance of the filter was demonstrated on Earth topography. The usefulness of the filter as a hyper-viscosity for the vorticity equation was also demonstrated.
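
    The structure of the filter solve can be sketched in one dimension: a high-order Helmholtz-type operator assembled as a sparse matrix and applied as a linear solve. The periodic 1-D grid, filter order, and strength below are assumptions, and a direct sparse solve stands in for the paper's row-reduction/inverse-matrix machinery.

    ```python
    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    n, L = 256, 2 * np.pi
    dx = L / n
    lap = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n), format="lil")
    lap[0, n - 1] = 1.0                   # periodic wrap-around
    lap[n - 1, 0] = 1.0
    lap = lap.tocsc() / dx**2

    p, nu = 2, 1e-4                       # filter of order 2p with strength nu
    A = sp.identity(n, format="csc") + (-1) ** p * nu * (lap @ lap)

    x = np.linspace(0.0, L, n, endpoint=False)
    field = np.sin(x) + 0.3 * np.sin(25 * x)       # smooth signal + small scales
    filtered = spla.spsolve(A, field)              # scale-selective damping

    spec = np.abs(np.fft.rfft(filtered)) * 2 / n
    print(f"wavenumber-1 amplitude {spec[1]:.3f}, wavenumber-25 amplitude {spec[25]:.4f}")
    ```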

  1. On-line implementation of nonlinear parameter estimation for the Space Shuttle main engine

    NASA Technical Reports Server (NTRS)

    Buckland, Julia H.; Musgrave, Jeffrey L.; Walker, Bruce K.

    1992-01-01

    We investigate the performance of a nonlinear estimation scheme applied to the estimation of several parameters in a performance model of the Space Shuttle Main Engine. The nonlinear estimator is based upon the extended Kalman filter, which has been augmented to provide estimates of several key performance variables. The estimated parameters are directly related to the efficiency of both the low-pressure and high-pressure fuel turbopumps. Decreases in the parameter estimates may be interpreted as degradations in turbine and/or pump efficiencies, which can be useful measures for an online health monitoring algorithm. This paper extends previous work, which has focused on off-line parameter estimation, by investigating the filter's on-line potential from a computational standpoint. In addition, we examine the robustness of the algorithm to unmodeled dynamics. The filter uses a reduced-order model of the engine that includes only fuel-side dynamics. The on-line results produced during this study are comparable to off-line results generated previously. The results show that the parameter estimates are sensitive to dynamics not included in the filter model. Off-line results using an extended Kalman filter with a full-order engine model to address the robustness problems of the reduced-order model are also presented.
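
    Joint state/parameter estimation with an augmented extended Kalman filter can be sketched on a scalar toy model; the "efficiency-like" parameter, noise levels, and dynamics below are illustrative assumptions, not the SSME performance model.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)
    eta_true, n_steps = 0.85, 200
    u = 0.5 * np.ones(n_steps)

    # Augmented state z = [x, eta]; the parameter eta follows a random-walk model
    z = np.array([0.0, 1.0])              # deliberately wrong initial eta
    P = np.diag([1.0, 0.5])
    Q = np.diag([1e-4, 1e-6])             # process noise (tiny drift on eta)
    R = np.array([[1e-2]])                # measurement noise
    H = np.array([[1.0, 0.0]])            # we measure x only

    x_true = 0.0
    for k in range(n_steps):
        x_true = eta_true * x_true + u[k]
        y = x_true + 0.1 * rng.standard_normal()

        # Predict: f(z) = [eta*x + u, eta]; F is its Jacobian
        x_hat, eta_hat = z
        z = np.array([eta_hat * x_hat + u[k], eta_hat])
        F = np.array([[eta_hat, x_hat], [0.0, 1.0]])
        P = F @ P @ F.T + Q

        # Update with the scalar measurement
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        z = z + K @ (np.array([y]) - H @ z)
        P = (np.eye(2) - K @ H) @ P

    print(f"estimated efficiency-like parameter eta = {z[1]:.3f} (true {eta_true})")
    ```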

  2. Efficient implementation of one- and two-component analytical energy gradients in exact two-component theory

    NASA Astrophysics Data System (ADS)

    Franzke, Yannick J.; Middendorf, Nils; Weigend, Florian

    2018-03-01

    We present an efficient algorithm for one- and two-component analytical energy gradients with respect to nuclear displacements in the exact two-component decoupling approach to the one-electron Dirac equation (X2C). Our approach is a generalization of the spin-free ansatz by Cheng and Gauss [J. Chem. Phys. 135, 084114 (2011)], where the perturbed one-electron Hamiltonian is calculated by solving a first-order response equation. Computational costs are drastically reduced by applying the diagonal local approximation to the unitary decoupling transformation (DLU) [D. Peng and M. Reiher, J. Chem. Phys. 136, 244108 (2012)] to the X2C Hamiltonian. The introduced error is found to be almost negligible, as the mean absolute error of the optimized structures amounts to only 0.01 pm. Our implementation in TURBOMOLE is also available within the finite nucleus model based on a Gaussian charge distribution. For an X2C/DLU gradient calculation, computational effort scales cubically with the molecular size, while storage increases quadratically. The efficiency is demonstrated in calculations of large silver clusters and organometallic iridium complexes.

  3. Exponential Methods for the Time Integration of Schroedinger Equation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cano, B.; Gonzalez-Pachon, A.

    2010-09-30

    We consider exponential methods of second order in time in order to integrate the cubic nonlinear Schroedinger equation. We are interested in taking advantage of the special structure of this equation. Therefore, we look at the symmetry, symplecticity and approximation of invariants of the proposed methods, which allows integration up to long times with reasonable accuracy. Computational efficiency is also our aim. We therefore carry out numerical computations to compare the methods considered, and conclude that explicit Lawson schemes projected on the norm of the solution are an efficient tool to integrate this equation.

  4. A computationally efficient parallel Levenberg-Marquardt algorithm for highly parameterized inverse model analyses

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.

    Inverse modeling seeks model parameters given a set of observations. However, because for practical problems the number of measurements is often large and the model parameters are numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate- to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace, such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2D and a random hydraulic conductivity field in 3D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multi-core computational environment. Furthermore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate- to large-scale problems.

  5. A computationally efficient parallel Levenberg-Marquardt algorithm for highly parameterized inverse model analyses

    DOE PAGES

    Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.

    2016-09-01

    Inverse modeling seeks model parameters given a set of observations. However, because for practical problems the number of measurements is often large and the model parameters are numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate- to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace, such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2D and a random hydraulic conductivity field in 3D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multi-core computational environment. Furthermore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate- to large-scale problems.
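
    The subspace-recycling trick rests on a shift-invariance property: the Krylov subspace of (J^T J + lambda*I) generated from J^T r is the same for every damping parameter lambda, so one basis serves all of them. The following sketch uses a dense random Jacobian as a stand-in for the sensitivity matrix; it illustrates the idea, not the MADS implementation.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    m, n, k = 300, 120, 30                 # residuals, parameters, subspace size
    J = rng.standard_normal((m, n)) / np.sqrt(m)
    r = rng.standard_normal(m)

    A = J.T @ J
    b = -J.T @ r

    # Orthonormal Krylov basis V for span{b, Ab, ..., A^(k-1) b}
    V = np.zeros((n, k))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(1, k):
        w = A @ V[:, j - 1]
        w -= V[:, :j] @ (V[:, :j].T @ w)   # Gram-Schmidt orthogonalization
        V[:, j] = w / np.linalg.norm(w)

    T = V.T @ A @ V                        # k-by-k projected operator, built once

    for lam in (1e-2, 1e-1, 1.0):          # recycle V for every damping parameter
        y = np.linalg.solve(T + lam * np.eye(k), V.T @ b)
        delta = V @ y                      # approximate Levenberg-Marquardt step
        exact = np.linalg.solve(A + lam * np.eye(n), b)
        print(f"lambda={lam:5.2f}  relative step error "
              f"{np.linalg.norm(delta - exact) / np.linalg.norm(exact):.2e}")
    ```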

  6. Cache and energy efficient algorithms for Nussinov's RNA Folding.

    PubMed

    Zhao, Chunchun; Sahni, Sartaj

    2017-12-06

    An RNA folding/RNA secondary structure prediction algorithm determines the non-nested/pseudoknot-free structure by maximizing the number of complementary base pairs and minimizing the energy. Several implementations of Nussinov's classical RNA folding algorithm have been proposed. Our focus is to obtain run-time and energy efficiency by reducing the number of cache misses. Three cache-efficient algorithms, ByRow, ByRowSegment and ByBox, for Nussinov's RNA folding are developed. Using a simple LRU cache model, we show that the Classical algorithm of Nussinov has the highest number of cache misses, followed by the algorithms Transpose (Li et al.), ByRow, ByRowSegment, and ByBox (in this order). Extensive experiments conducted on four computational platforms (Xeon E5, AMD Athlon 64 X2, Intel I7 and PowerPC A2) using two programming languages (C and Java) show that our cache-efficient algorithms are also efficient in terms of run time and energy. Our benchmarking shows that, depending on the computational platform and programming language, either ByRow or ByBox gives the best run-time and energy performance. The C versions of these algorithms reduce run time by as much as 97.2% and energy consumption by as much as 88.8% relative to Classical, and by as much as 56.3% and 57.8% relative to Transpose. The Java versions reduce run time by as much as 98.3% relative to Classical and by as much as 75.2% relative to Transpose. Transpose achieves run-time and energy efficiency at the expense of memory, as it takes twice the memory required by Classical. The memory required by ByRow, ByRowSegment, and ByBox is the same as that of Classical. As a result, using the same amount of memory, the algorithms proposed by us can solve problems up to 40% larger than those solvable by Transpose.
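
    For reference, the recurrence that all of these traversals reorder is the classical Nussinov dynamic program; a minimal version (assuming a unit score per pair and a small hairpin constraint) is sketched below.

    ```python
    # Classical Nussinov recurrence (unit score per pair, hairpin of at least
    # `min_loop` unpaired bases); the cache-efficient ByRow/ByBox variants
    # reorder exactly this computation to reduce cache misses.
    def nussinov(seq, min_loop=1):
        """Maximum number of complementary, non-crossing base pairs."""
        pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
        n = len(seq)
        dp = [[0] * n for _ in range(n)]
        for span in range(min_loop + 1, n):        # increasing subsequence length
            for i in range(n - span):
                j = i + span
                best = dp[i][j - 1]                # j left unpaired
                for t in range(i, j - min_loop):   # j paired with some t
                    if (seq[t], seq[j]) in pairs:
                        left = dp[i][t - 1] if t > i else 0
                        best = max(best, left + dp[t + 1][j - 1] + 1)
                dp[i][j] = best
        return dp[0][n - 1] if n else 0

    print(nussinov("GGGAAAUCC"))   # pairs G-C, G-C, G-U -> 3
    ```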

  7. Accurate modeling of plasma acceleration with arbitrary order pseudo-spectral particle-in-cell methods

    DOE PAGES

    Jalas, S.; Dornmair, I.; Lehe, R.; ...

    2017-03-20

    Particle in Cell (PIC) simulations are a widely used tool for the investigation of both laser- and beam-driven plasma acceleration. It is a known issue that the beam quality can be artificially degraded by numerical Cherenkov radiation (NCR), resulting primarily from an incorrectly modeled dispersion relation. Pseudo-spectral solvers featuring infinite-order stencils can strongly reduce NCR - or even suppress it - and are therefore well suited to correctly model the beam properties. For efficient parallelization of the PIC algorithm, however, localized solvers are inevitable. Arbitrary-order pseudo-spectral methods provide this needed locality. Yet, these methods can again be prone to NCR. In this paper, we show that acceptably low solver orders are sufficient to correctly model the physics of interest, while allowing for parallel computation by domain decomposition.

  8. Efficient Computation of Closed-loop Frequency Response for Large Order Flexible Systems

    NASA Technical Reports Server (NTRS)

    Maghami, Peiman G.; Giesy, Daniel P.

    1997-01-01

    An efficient and robust computational scheme is given for the calculation of the frequency response function of a large-order, flexible system implemented with a linear, time-invariant control system. Advantage is taken of the highly structured sparsity of the system matrix of the plant, based on a model of the structure using normal mode coordinates. The computational time per frequency point of the new computational scheme is a linear function of system size, a significant improvement over traditional, full-matrix techniques whose computational times per frequency point range from quadratic to cubic functions of system size. This permits the practical frequency-domain analysis of systems of much larger order than by traditional, full-matrix techniques. Formulations are given for both open- and closed-loop systems. Numerical examples are presented showing the advantages of the present formulation over traditional approaches, both in speed and in accuracy. Using a model with 703 structural modes, a speed-up of almost two orders of magnitude was observed while accuracy improved by up to 5 decimal places.
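
    The key structural point — a modal (block-diagonal) system matrix makes each frequency point a set of independent small solves — can be sketched as follows, with random modal data standing in for a real structural model.

    ```python
    # Sketch of exploiting modal structure: in normal mode coordinates each mode
    # responds independently, so a frequency point costs O(n) instead of the
    # O(n^3) of a dense (jw*I - A) solve. Modal data below is synthetic.
    import numpy as np

    rng = np.random.default_rng(4)
    n_modes = 500
    omega_n = np.sort(rng.uniform(1, 100, n_modes))   # natural frequencies
    zeta = 0.02                                       # modal damping ratio
    B = rng.standard_normal(n_modes)                  # modal input participation
    C = rng.standard_normal(n_modes)                  # modal output shape

    def freq_response(w):
        # Each mode: q_i(jw) = B_i / (omega_i^2 - w^2 + 2j*zeta*omega_i*w)
        denom = omega_n**2 - w**2 + 2j * zeta * omega_n * w
        return C @ (B / denom)                        # O(n) per frequency point

    freqs = np.linspace(0.1, 120, 2000)
    H = np.array([freq_response(w) for w in freqs])
    print("peak |H|:", np.abs(H).max())
    ```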

  9. Development of Boundary Condition Independent Reduced Order Thermal Models using Proper Orthogonal Decomposition

    NASA Astrophysics Data System (ADS)

    Raghupathy, Arun; Ghia, Karman; Ghia, Urmila

    2008-11-01

    Compact Thermal Models (CTMs) to represent IC packages have traditionally been developed using the DELPHI-based (DEvelopment of Libraries of PHysical models for an Integrated design) methodology. The drawbacks of this method are presented, and an alternative method is proposed. A reduced-order model that provides the complete thermal information accurately with fewer computational resources can be effectively used in system-level simulations. Proper Orthogonal Decomposition (POD), a statistical method, can be used to reduce the order of the degrees of freedom or variables of the computations for such a problem. POD along with the Galerkin projection allows us to create reduced-order models that reproduce the characteristics of the system with a considerable reduction in computational resources while maintaining a high level of accuracy. The goal of this work is to show that this method can be applied to obtain a boundary-condition-independent reduced-order thermal model for complex components. The methodology is applied to the 1D transient heat equation.
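
    A compact sketch of the POD-Galerkin pipeline on the closing example, the 1D transient heat equation: full-order snapshots, an SVD basis, and a projected low-order system. Grid size, time step, and mode count are illustrative assumptions.

    ```python
    import numpy as np

    n, alpha, dt, steps = 200, 1.0, 1e-5, 2000
    dx = 1.0 / (n + 1)
    A = (np.diag(-2 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1)) * alpha / dx**2

    # Full-order solve (explicit Euler, dt below the stability limit dx^2/2)
    u = np.exp(-((np.linspace(dx, 1 - dx, n) - 0.3) ** 2) / 0.002)   # hot spot
    snaps = [u.copy()]
    for _ in range(steps):
        u = u + dt * (A @ u)
        snaps.append(u.copy())
    S = np.array(snaps).T                  # columns are snapshots

    # POD basis and Galerkin-projected operator
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    Phi = U[:, :5]                         # 5 modes instead of 200 unknowns
    Ar = Phi.T @ A @ Phi

    # Reduced simulation, then lift back to the full space
    a = Phi.T @ S[:, 0]
    for _ in range(steps):
        a = a + dt * (Ar @ a)
    err = np.linalg.norm(Phi @ a - S[:, -1]) / np.linalg.norm(S[:, -1])
    print(f"ROM error at final time: {err:.2e}")
    ```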

  10. Fast prediction and evaluation of eccentric inspirals using reduced-order models

    NASA Astrophysics Data System (ADS)

    Barta, Dániel; Vasúth, Mátyás

    2018-06-01

    A large number of theoretically predicted waveforms are required by matched-filtering searches for the gravitational-wave signals produced by compact binary coalescence. In order to substantially alleviate the computational burden in gravitational-wave searches and parameter estimation without degrading the signal detectability, we propose a novel reduced-order-model (ROM) approach with applications to adiabatic 3PN-accurate inspiral waveforms of nonspinning sources that evolve on either highly or slightly eccentric orbits. We provide a singular-value-decomposition-based reduced-basis method in the frequency domain to generate reduced-order approximations of any gravitational waves with acceptable accuracy and precision within the parameter range of the model. We construct efficient reduced bases comprised of a relatively small number of the most relevant waveforms over the three-dimensional parameter space covered by the template bank (total mass 2.15 M⊙ ≤ M ≤ 215 M⊙, mass ratio 0.01 ≤ q ≤ 1, and initial orbital eccentricity 0 ≤ e0 ≤ 0.95). The ROM is designed to predict signals in the frequency band from 10 Hz to 2 kHz for aLIGO and aVirgo design sensitivity. Besides moderating the data reduction, finer sampling of fiducial templates improves the accuracy of the surrogates. A considerable increase in speedup, from several hundred to thousands, can be achieved by evaluating surrogates for low-mass systems, especially when combined with high eccentricity.

  11. Efficient Unsteady Flow Visualization with High-Order Access Dependencies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Jiang; Guo, Hanqi; Yuan, Xiaoru

    We present a novel model based on high-order access dependencies for efficient pathline computation in unsteady flow visualization. By taking longer access sequences into account to model more sophisticated data access patterns in particle tracing, our method greatly improves the accuracy and reliability of data access prediction. In our work, high-order access dependencies are calculated by tracing uniformly seeded pathlines in both forward and backward directions in a preprocessing stage. The effectiveness of our proposed approach is demonstrated through a parallel particle tracing framework with high-order data prefetching. Results show that our method achieves higher data locality and hence improves the efficiency of pathline computation.

  12. Accurate modeling of plasma acceleration with arbitrary order pseudo-spectral particle-in-cell methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jalas, S.; Dornmair, I.; Lehe, R.

    Particle in Cell (PIC) simulations are a widely used tool for the investigation of both laser- and beam-driven plasma acceleration. It is a known issue that the beam quality can be artificially degraded by numerical Cherenkov radiation (NCR), resulting primarily from an incorrectly modeled dispersion relation. Pseudo-spectral solvers featuring infinite-order stencils can strongly reduce NCR - or even suppress it - and are therefore well suited to correctly model the beam properties. For efficient parallelization of the PIC algorithm, however, localized solvers are inevitable. Arbitrary-order pseudo-spectral methods provide this needed locality. Yet, these methods can again be prone to NCR. In this paper, we show that acceptably low solver orders are sufficient to correctly model the physics of interest, while allowing for parallel computation by domain decomposition.

  13. Semi-supervised Machine Learning for Analysis of Hydrogeochemical Data and Models

    NASA Astrophysics Data System (ADS)

    Vesselinov, Velimir; O'Malley, Daniel; Alexandrov, Boian; Moore, Bryan

    2017-04-01

    Data- and model-based analyses such as uncertainty quantification, sensitivity analysis, and decision support using complex physics models with numerous model parameters typically require a huge number of model evaluations (on the order of 10^6). Furthermore, model simulations of complex physics may require substantial computational time. For example, accounting for simultaneously occurring physical processes such as fluid flow and biogeochemical reactions in a heterogeneous porous medium may require several hours of wall-clock computational time. To address these issues, we have developed a novel methodology for semi-supervised machine learning based on Non-negative Matrix Factorization (NMF) coupled with customized k-means clustering. The algorithm allows for automated, robust Blind Source Separation (BSS) of groundwater types (contamination sources) based on model-free analyses of observed hydrogeochemical data. We have also developed reduced-order modeling tools coupling support vector regression (SVR), genetic algorithms (GA), and artificial and convolutional neural networks (ANN/CNN). SVR is applied to predict the model behavior within the prior uncertainty ranges associated with the model parameters. ANN and CNN procedures are applied to upscale heterogeneity of the porous medium. In the upscaling process, fine-scale high-resolution models of heterogeneity are applied to inform coarse-resolution models, which have improved computational efficiency while capturing the impact of fine-scale effects at the coarse scale of interest. These techniques are tested independently on a series of synthetic problems. We also present a decision analysis related to contaminant remediation where the developed reduced-order models are applied to reproduce groundwater flow and contaminant transport in a synthetic heterogeneous aquifer. The tools are coded in Julia and are part of the MADS high-performance computational framework (https://github.com/madsjulia/Mads.jl).

  14. Improving Design Efficiency for Large-Scale Heterogeneous Circuits

    NASA Astrophysics Data System (ADS)

    Gregerson, Anthony

    Despite increases in logic density, many Big Data applications must still be partitioned across multiple computing devices in order to meet their strict performance requirements. Among the most demanding of these applications is high-energy physics (HEP), which uses complex computing systems consisting of thousands of FPGAs and ASICs to process the sensor data created by experiments at particle accelerators such as the Large Hadron Collider (LHC). Designing such computing systems is challenging due to the scale of the systems, the exceptionally high-throughput and low-latency performance constraints that necessitate application-specific hardware implementations, the requirement that algorithms are efficiently partitioned across many devices, and the possible need to update the implemented algorithms during the lifetime of the system. In this work, we describe our research to develop flexible architectures for implementing such large-scale circuits on FPGAs. In particular, this work is motivated by (but not limited in scope to) high-energy physics algorithms for the Compact Muon Solenoid (CMS) experiment at the LHC. To make efficient use of logic resources in multi-FPGA systems, we introduce Multi-Personality Partitioning, a novel form of the graph partitioning problem, and present partitioning algorithms that can significantly improve resource utilization on heterogeneous devices while also reducing inter-chip connections. To reduce the high communication costs of Big Data applications, we also introduce Information-Aware Partitioning, a partitioning method that analyzes the data content of application-specific circuits, characterizes their entropy, and selects circuit partitions that enable efficient compression of data between chips. We employ our information-aware partitioning method to improve the performance of the hardware validation platform for evaluating new algorithms for the CMS experiment. Together, these research efforts help to improve the efficiency and decrease the cost of developing the large-scale, heterogeneous circuits needed to enable large-scale applications in high-energy physics and other important areas.

  15. Parallel Computing Strategies for Irregular Algorithms

    NASA Technical Reports Server (NTRS)

    Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

    2002-01-01

    Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.

  16. BWM*: A Novel, Provable, Ensemble-based Dynamic Programming Algorithm for Sparse Approximations of Computational Protein Design.

    PubMed

    Jou, Jonathan D; Jain, Swati; Georgiev, Ivelin S; Donald, Bruce R

    2016-06-01

    Sparse energy functions that ignore long range interactions between residue pairs are frequently used by protein design algorithms to reduce computational cost. Current dynamic programming algorithms that fully exploit the optimal substructure produced by these energy functions only compute the GMEC. This disproportionately favors the sequence of a single, static conformation and overlooks better binding sequences with multiple low-energy conformations. Provable, ensemble-based algorithms such as A* avoid this problem, but A* cannot guarantee better performance than exhaustive enumeration. We propose a novel, provable, dynamic programming algorithm called Branch-Width Minimization* (BWM*) to enumerate a gap-free ensemble of conformations in order of increasing energy. Given a branch-decomposition of branch-width w for an n-residue protein design with at most q discrete side-chain conformations per residue, BWM* returns the sparse GMEC in O([Formula: see text]) time and enumerates each additional conformation in merely O([Formula: see text]) time. We define a new measure, Total Effective Search Space (TESS), which can be computed efficiently a priori before BWM* or A* is run. We ran BWM* on 67 protein design problems and found that TESS discriminated between BWM*-efficient and A*-efficient cases with 100% accuracy. As predicted by TESS and validated experimentally, BWM* outperforms A* in 73% of the cases and computes the full ensemble or a close approximation faster than A*, enumerating each additional conformation in milliseconds. Unlike A*, the performance of BWM* can be predicted in polynomial time before running the algorithm, which gives protein designers the power to choose the most efficient algorithm for their particular design problem.

  17. An effective and secure key-management scheme for hierarchical access control in E-medicine system.

    PubMed

    Odelu, Vanga; Das, Ashok Kumar; Goswami, Adrijit

    2013-04-01

    Recently several hierarchical access control schemes have been proposed in the literature to provide security for e-medicine systems. However, most of them are either insecure against the man-in-the-middle attack or require high storage and computational overheads. Wu and Chen proposed a key management method to solve dynamic access control problems in a user hierarchy based on a hybrid cryptosystem. Though their scheme improves computational efficiency over Nikooghadam et al.'s approach, it suffers from large storage space for public parameters in the public domain and computational inefficiency due to costly elliptic curve point multiplication. Recently, Nikooghadam and Zakerolhosseini showed that Wu-Chen's scheme is vulnerable to the man-in-the-middle attack. In order to remedy this security weakness in Wu-Chen's scheme, they proposed a secure scheme which is again based on ECC (elliptic curve cryptography) and an efficient one-way hash function. However, their scheme incurs huge computational cost for providing verification of public information in the public domain, as their scheme uses an ECC digital signature which is costly when compared to a symmetric-key cryptosystem. In this paper, we propose an effective access control scheme in a user hierarchy which is based only on a symmetric-key cryptosystem and an efficient one-way hash function. We show that our scheme significantly reduces the storage space for both public and private domains, as well as the computational complexity, when compared to Wu-Chen's scheme, Nikooghadam-Zakerolhosseini's scheme, and other related schemes. Through informal and formal security analysis, we further show that our scheme is secure against different attacks, including the man-in-the-middle attack. Moreover, dynamic access control problems in our scheme are also solved efficiently compared to other related schemes, making our scheme much more suitable for practical applications of e-medicine systems.
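
    The symmetric-key/hash-function flavor of such schemes can be sketched generically: each security class derives descendant keys by hashing its own secret with the child's public identifier, so access flows only downward. This is a generic illustration, not the authors' exact scheme; all names below are hypothetical.

    ```python
    # Generic hash-based hierarchical key derivation: a parent class can re-derive
    # any descendant's key, but the one-way hash prevents the reverse.
    import hashlib

    def derive_key(parent_key: bytes, child_id: str) -> bytes:
        # SHA-256 makes recovering the parent key from a child key infeasible
        return hashlib.sha256(parent_key + child_id.encode()).digest()

    root = b"hospital-master-secret"                  # held by the top security class
    cardiology = derive_key(root, "dept:cardiology")
    dr_smith = derive_key(cardiology, "role:physician:smith")

    # The root holder can re-derive dr_smith's key on demand; dr_smith cannot
    # invert the hash to obtain the cardiology or root keys.
    assert derive_key(derive_key(root, "dept:cardiology"),
                      "role:physician:smith") == dr_smith
    print("derived key:", dr_smith.hex()[:16], "...")
    ```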

  18. MHOST: An efficient finite element program for inelastic analysis of solids and structures

    NASA Technical Reports Server (NTRS)

    Nakazawa, S.

    1988-01-01

    An efficient finite element program for 3-D inelastic analysis of gas turbine hot-section components was constructed and validated. A novel mixed iterative solution strategy is derived from the augmented Hu-Washizu variational principle in order to nodally interpolate coordinates, displacements, deformation, strains, stresses and material properties. A series of increasingly sophisticated material models incorporated in MHOST includes elasticity, secant plasticity, infinitesimal and finite deformation plasticity, creep, and the unified viscoplastic constitutive model proposed by Walker. A library of high-performance elements is built into this computer program utilizing the concepts of selective reduced integration and independent strain interpolation. A family of efficient solution algorithms is implemented in MHOST for linear and nonlinear equation solution, including the classical Newton-Raphson, modified, quasi- and secant Newton methods with optional line search, and the conjugate gradient method.

  19. ADVANCED COMPUTATIONAL METHODS IN DOSE MODELING: APPLICATION OF COMPUTATIONAL BIOPHYSICAL TRANSPORT, COMPUTATIONAL CHEMISTRY, AND COMPUTATIONAL BIOLOGY

    EPA Science Inventory

    Computational toxicology (CompTox) leverages the significant gains in computing power and computational techniques (e.g., numerical approaches, structure-activity relationships, bioinformatics) realized over the last few years, thereby reducing costs and increasing efficiency i...

  20. Multigrid treatment of implicit continuum diffusion

    NASA Astrophysics Data System (ADS)

    Francisquez, Manaure; Zhu, Ben; Rogers, Barrett

    2017-10-01

    Implicit treatment of diffusive terms of various differential orders common in continuum mechanics modeling, such as computational fluid dynamics, is investigated with spectral and multigrid algorithms in non-periodic 2D domains. In doubly periodic time dependent problems these terms can be efficiently and implicitly handled by spectral methods, but in non-periodic systems solved with distributed memory parallel computing and 2D domain decomposition, this efficiency is lost for large numbers of processors. We built and present here a multigrid algorithm for these types of problems which outperforms a spectral solution that employs the highly optimized FFTW library. This multigrid algorithm is not only suitable for high performance computing but may also be able to efficiently treat implicit diffusion of arbitrary order by introducing auxiliary equations of lower order. We test these solvers for fourth and sixth order diffusion with idealized harmonic test functions as well as a turbulent 2D magnetohydrodynamic simulation. It is also shown that an anisotropic operator without cross-terms can improve model accuracy and speed, and we examine the impact that the various diffusion operators have on the energy, the enstrophy, and the qualitative aspect of a simulation. This work was supported by DOE-SC-0010508. This research used resources of the National Energy Research Scientific Computing Center (NERSC).
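
    The auxiliary-equation idea — rewriting an implicit high-order diffusion step so that only second-order operators appear — can be sketched in one dimension; a direct sparse solve stands in for the multigrid cycles, and the grid, viscosity, and time step below are illustrative assumptions.

    ```python
    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    n = 256
    dx = 1.0 / (n + 1)                    # interior points of a Dirichlet grid
    nu, dt = 1e-3, 1e-4
    lap = sp.diags([1, -2, 1], [-1, 0, 1], shape=(n, n), format="csc") / dx**2
    I = sp.identity(n, format="csc")

    # Backward Euler for u_t = -nu*Lap^2(u), written with v = Lap(u) so only
    # second-order blocks appear:  u + dt*nu*Lap(v) = u_old,  Lap(u) - v = 0
    A = sp.bmat([[I, dt * nu * lap],
                 [lap, -I]], format="csc")

    x = np.linspace(dx, 1 - dx, n)
    u = np.sin(np.pi * x) + 0.2 * np.sin(20 * np.pi * x)
    for _ in range(10):
        rhs = np.concatenate([u, np.zeros(n)])
        u = spla.spsolve(A, rhs)[:n]      # implicit step damps high wavenumbers

    # Fourier sine coefficients before/after: mode 20 decays, mode 1 survives
    c1 = 2 * dx * np.sum(u * np.sin(np.pi * x))
    c20 = 2 * dx * np.sum(u * np.sin(20 * np.pi * x))
    print(f"mode-1 amplitude {c1:.3f}, mode-20 amplitude {c20:.2e}")
    ```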

  1. Tailoring Quantum Dot Assemblies to Extend Exciton Coherence Times and Improve Exciton Transport

    NASA Astrophysics Data System (ADS)

    Seward, Kenton; Lin, Zhibin; Lusk, Mark

    2012-02-01

    The motion of excitons through nanostructured assemblies plays a central role in a wide range of physical phenomena including quantum computing, molecular electronics, photosynthetic processes, excitonic transistors and light-emitting diodes. All of these technologies are severely handicapped, though, by quasi-particle lifetimes on the order of a nanosecond. Exciton transport must therefore be as efficient as possible in order to move excitons meaningful distances. This is problematic for assemblies of small Si quantum dots (QDs), where excitons quickly localize and entangle with dot phonon modes. The ensuing exciton transport is then characterized by a classical random walk reduced to very short distances because of efficient recombination. We use a combination of master-equation (Haken-Strobl) formalism and density functional theory to estimate the rate of decoherence in Si QD assemblies and its impact on exciton mobility. Exciton-phonon coupling and Coulomb interactions are calculated as a function of dot size, spacing and termination to minimize the rate of intra-dot phonon entanglement. This extends the time over which more efficient exciton transport, characterized by partial coherence, can be maintained.

  2. Marc Henry de Frahan | NREL

    Science.gov Websites

    Computing Project, Marc develops high-fidelity turbulence models to enhance simulation accuracy and efficient numerical algorithms for future high-performance computing hardware architectures. Research interests: high-performance computing; high-order numerical methods for computational fluid dynamics; fluid...

  3. Efficient implementation of multidimensional fast fourier transform on a distributed-memory parallel multi-node computer

    DOEpatents

    Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY

    2012-01-10

    The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.

  4. Efficient implementation of a multidimensional fast fourier transform on a distributed-memory parallel multi-node computer

    DOEpatents

    Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY

    2008-01-01

    The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
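
    The decomposition both patents exploit — a multidimensional FFT as successive 1-D FFT passes separated by an all-to-all redistribution — can be checked on a single node, with a transpose standing in for the network redistribution.

    ```python
    import numpy as np

    a = np.random.default_rng(5).standard_normal((64, 64))

    step1 = np.fft.fft(a, axis=0)                # 1-D FFTs along the first dimension
    redistributed = step1.T                      # "all-to-all": rows move between nodes
    step2 = np.fft.fft(redistributed, axis=0).T  # 1-D FFTs along the second dimension

    assert np.allclose(step2, np.fft.fft2(a))    # matches the direct 2-D transform
    print("two transposed 1-D passes reproduce the 2-D FFT")
    ```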

  5. Reducing Vehicle Weight and Improving U.S. Energy Efficiency Using Integrated Computational Materials Engineering

    NASA Astrophysics Data System (ADS)

    Joost, William J.

    2012-09-01

    Transportation accounts for approximately 28% of U.S. energy consumption with the majority of transportation energy derived from petroleum sources. Many technologies such as vehicle electrification, advanced combustion, and advanced fuels can reduce transportation energy consumption by improving the efficiency of cars and trucks. Lightweight materials are another important technology that can improve passenger vehicle fuel efficiency by 6-8% for each 10% reduction in weight while also making electric and alternative vehicles more competitive. Despite the opportunities for improved efficiency, widespread deployment of lightweight materials for automotive structures is hampered by technology gaps most often associated with performance, manufacturability, and cost. In this report, the impact of reduced vehicle weight on energy efficiency is discussed with a particular emphasis on quantitative relationships determined by several researchers. The most promising lightweight materials systems are described along with a brief review of the most significant technical barriers to their implementation. For each material system, the development of accurate material models is critical to support simulation-intensive processing and structural design for vehicles; improved models also contribute to an integrated computational materials engineering (ICME) approach for addressing technical barriers and accelerating deployment. The value of computational techniques is described by considering recent ICME and computational materials science success stories with an emphasis on applying problem-specific methods.

  6. Multiscale Reactive Molecular Dynamics

    DTIC Science & Technology

    2012-08-15

    ...biology cannot be described without considering electronic and nuclear-level dynamics and their coupling to slower, cooperative motions of the system. These inherently multiscale problems require computationally efficient and accurate methods to ... condensed-phase systems with computational efficiency orders of magnitude greater than currently possible with ab initio simulation methods, thus ...

  7. CMOS: Efficient Clustered Data Monitoring in Sensor Networks

    PubMed Central

    2013-01-01

    Tiny and smart sensors enable applications that access a network of hundreds or thousands of sensors. Thus, recently, many researchers have paid attention to wireless sensor networks (WSNs). The limitation of energy is critical since most sensors are battery-powered and it is very difficult to replace batteries in cases where sensor networks are deployed outdoors. Data transmission between sensor nodes needs more energy than computation in a sensor node. In order to reduce the energy consumption of sensors, we present an approximate data gathering technique, called CMOS, based on the Kalman filter. The goal of CMOS is to efficiently obtain the sensor readings within a certain error bound. In our approach, spatially close sensors are grouped as a cluster. Since a cluster header generates approximate readings of member nodes, a user query can be answered efficiently using the cluster headers. In addition, we suggest an energy-efficient clustering method to distribute the energy consumption of cluster headers. Our simulation results with synthetic data demonstrate the efficiency and accuracy of our proposed technique. PMID:24459444

  8. CMOS: efficient clustered data monitoring in sensor networks.

    PubMed

    Min, Jun-Ki

    2013-01-01

    Tiny and smart sensors enable applications that access a network of hundreds or thousands of sensors. Thus, recently, many researchers have paid attention to wireless sensor networks (WSNs). The limitation of energy is critical since most sensors are battery-powered and it is very difficult to replace batteries in cases where sensor networks are deployed outdoors. Data transmission between sensor nodes needs more energy than computation in a sensor node. In order to reduce the energy consumption of sensors, we present an approximate data gathering technique, called CMOS, based on the Kalman filter. The goal of CMOS is to efficiently obtain the sensor readings within a certain error bound. In our approach, spatially close sensors are grouped as a cluster. Since a cluster header generates approximate readings of member nodes, a user query can be answered efficiently using the cluster headers. In addition, we suggest an energy-efficient clustering method to distribute the energy consumption of cluster headers. Our simulation results with synthetic data demonstrate the efficiency and accuracy of our proposed technique.
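
    The error-bounded gathering idea can be sketched with a shared predictor: sensor and cluster header run the same model, and the sensor transmits only when the header's prediction would violate the bound. A simple linear-trend predictor stands in for the Kalman filter here; the drift data and bound are assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)
    bound = 0.5
    readings = np.cumsum(0.1 + 0.2 * rng.standard_normal(500))   # drifting signal

    trend, transmissions = 0.1, 1          # first reading is always transmitted
    approx = [readings[0]]
    for r in readings[1:]:
        predicted = approx[-1] + trend     # header-side prediction (no radio)
        if abs(r - predicted) > bound:     # sensor detects a bound violation
            transmissions += 1
            approx.append(r)               # transmit the true reading
        else:
            approx.append(predicted)       # header keeps using the prediction

    err = np.abs(np.array(approx) - readings).max()
    print(f"sent {transmissions}/{len(readings)} readings, "
          f"max error {err:.3f} <= bound {bound}")
    ```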

  9. Function-Based Algorithms for Biological Sequences

    ERIC Educational Resources Information Center

    Mohanty, Pragyan Sheela P.

    2015-01-01

    Two problems at two different abstraction levels of computational biology are studied. At the molecular level, efficient pattern matching algorithms in DNA sequences are presented. For gene order data, an efficient data structure is presented capable of storing all gene re-orderings in a systematic manner. A common characteristic of presented…

  10. Cycle-averaged dynamics of a periodically driven, closed-loop circulation model

    NASA Technical Reports Server (NTRS)

    Heldt, T.; Chang, J. L.; Chen, J. J. S.; Verghese, G. C.; Mark, R. G.

    2005-01-01

    Time-varying elastance models have been used extensively in the past to simulate the pulsatile nature of cardiovascular waveforms. Frequently, however, one is interested in dynamics that occur over longer time scales, in which case a detailed simulation of each cardiac contraction becomes computationally burdensome. In this paper, we apply circuit-averaging techniques to a periodically driven, closed-loop, three-compartment recirculation model. The resultant cycle-averaged model is linear and time invariant, and greatly reduces the computational burden. It is also amenable to systematic order reduction methods that lead to further efficiencies. Despite its simplicity, the averaged model captures the dynamics relevant to the representation of a range of cardiovascular reflex mechanisms.

  11. Automated Development of Accurate Algorithms and Efficient Codes for Computational Aeroacoustics

    NASA Technical Reports Server (NTRS)

    Goodrich, John W.; Dyson, Rodger W.

    1999-01-01

    The simulation of sound generation and propagation in three space dimensions with realistic aircraft components is a very large time-dependent computation with fine details. Simulations in open domains with embedded objects require accurate and robust algorithms for propagation, for artificial inflow and outflow boundaries, and for the definition of geometrically complex objects. The development, implementation, and validation of methods for solving these demanding problems are being carried out to support the NASA pillar goals for reducing aircraft noise levels. Our goal is to provide algorithms which are sufficiently accurate and efficient to produce usable results rapidly enough to allow design engineers to study the effects on sound levels of design changes in propulsion systems, and in the integration of propulsion systems with airframes. There is a lack of design tools for these purposes at this time. Our technical approach to this problem combines the development of new algorithms with the use of Mathematica and Unix utilities to automate the algorithm development, code implementation, and validation. We use explicit methods to ensure effective implementation by domain decomposition for SPMD parallel computing. There are several orders of magnitude difference in the computational efficiencies of the algorithms which we have considered. We currently have new artificial inflow and outflow boundary conditions that are stable, accurate, and unobtrusive, with implementations that match the accuracy and efficiency of the propagation methods. The artificial numerical boundary treatments have been proven to have solutions which converge to the full open-domain problems, so that the error from the boundary treatments can be driven as low as is required. The purpose of this paper is to briefly present a method for developing highly accurate algorithms for computational aeroacoustics, the use of computer automation in this process, and a brief survey of the algorithms that have resulted from this work. A review of computational aeroacoustics has recently been given by Lele.

  12. Reduced Order Models for Dynamic Behavior of Elastomer Damping Devices

    NASA Astrophysics Data System (ADS)

    Morin, B.; Legay, A.; Deü, J.-F.

    2016-09-01

    In the context of passive damping, various mechanical systems from the space industry use elastomer components (shock absorbers, silent blocks, flexible joints...). The material of these devices has frequency-, temperature- and amplitude-dependent characteristics. The associated numerical models, using viscoelastic and hyperelastic constitutive behaviour, may become computationally too expensive during a design process. The aim of this work is to propose efficient reduced viscoelastic models of rubber devices. The first step is to choose an accurate material model that represents the viscoelasticity. The second step is to reduce the rubber device finite element model to a super-element that keeps the frequency dependence. This reduced model is first built by taking into account the fact that the device's interfaces are much more rigid than the rubber core. To make use of this difference, kinematical constraints enforce the rigid-body motion of these interfaces, reducing the rubber device model to only twelve interface dofs (three rotations and three translations per face). The super-element is then built using a component mode synthesis method. As an application, the dynamic behavior of a structure supported by four hourglass-shaped rubber devices under harmonic loads is analysed to show the efficiency of the proposed approach.

  13. Boson Sampling with Single-Photon Fock States from a Bright Solid-State Source.

    PubMed

    Loredo, J C; Broome, M A; Hilaire, P; Gazzano, O; Sagnes, I; Lemaitre, A; Almeida, M P; Senellart, P; White, A G

    2017-03-31

    A boson-sampling device is a quantum machine expected to perform tasks intractable for a classical computer, yet requiring minimal nonclassical resources as compared to full-scale quantum computers. Photonic implementations to date have employed sources based on inefficient processes that only simulate heralded single-photon statistics when strongly reducing emission probabilities. Boson sampling with only single-photon input has thus never been realized. Here, we report on a boson-sampling device operated with a bright solid-state source of single-photon Fock states with high photon-number purity: the emission from an efficient and deterministic quantum dot-micropillar system is demultiplexed into three partially indistinguishable single photons, with a single-photon purity 1-g^{(2)}(0) of 0.990±0.001, interfering in a linear optics network. Our demultiplexed source is between 1 and 2 orders of magnitude more efficient than current heralded multiphoton sources based on spontaneous parametric down-conversion, allowing us to complete the boson-sampling experiment faster than previous equivalent implementations.

  14. Comparison of OPC job prioritization schemes to generate data for mask manufacturing

    NASA Astrophysics Data System (ADS)

    Lewis, Travis; Veeraraghavan, Vijay; Jantzen, Kenneth; Kim, Stephen; Park, Minyoung; Russell, Gordon; Simmons, Mark

    2015-03-01

    Delivering mask-ready OPC-corrected data to the mask shop on time is critical for a foundry to meet the cycle time commitment for a new product. With current OPC compute-resource-sharing technology, different job scheduling algorithms are possible, such as priority-based resource allocation and fair-share resource allocation. In order to maximize computer cluster efficiency, minimize the cost of data processing, and deliver data on schedule, the trade-offs of each scheduling algorithm need to be understood. Using actual production jobs, each of the scheduling algorithms will be tested in a production tape-out environment. Each scheduling algorithm will be judged on its ability to deliver data on schedule, and the trade-offs associated with each method will be analyzed. It is now possible to introduce advanced scheduling algorithms to the OPC data processing environment to meet the goals of on-time delivery of mask-ready OPC data while maximizing efficiency and reducing cost.

  15. Compressive Spectral Method for the Simulation of the Nonlinear Gravity Waves

    PubMed Central

    Bayındır, Cihan

    2016-01-01

    In this paper an approach for decreasing the computational effort required for spectral simulations of fully nonlinear ocean waves is introduced. The proposed approach utilizes the compressive sampling algorithm and depends on the idea of using a smaller number of spectral components compared to the classical spectral method. After performing the time integration with a smaller number of spectral components and using the compressive sampling technique, it is shown that the ocean wave field can be reconstructed with significantly better efficiency compared to the classical spectral method. For the sparse ocean wave model in the frequency domain, fully nonlinear ocean waves with a JONSWAP spectrum are considered. By implementation of a high-order spectral method it is shown that the proposed methodology can simulate the linear and the fully nonlinear ocean waves with negligible difference in accuracy and with great efficiency, reducing the computation time significantly, especially for large time evolutions. PMID:26911357
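
    The abstract does not spell out the reconstruction algorithm; orthogonal matching pursuit is one standard compressive-sampling solver and serves here as a minimal stand-in. The measurement matrix, sparsity level, and sizes are invented for illustration.

    ```python
    import numpy as np

    def omp(A, y, n_nonzero):
        """Orthogonal matching pursuit: recover a sparse x with y ~= A @ x."""
        support = []
        r = y.copy()
        for _ in range(n_nonzero):
            # Pick the column most correlated with the current residual.
            support.append(int(np.argmax(np.abs(A.T @ r))))
            # Re-fit on the enlarged support, then update the residual.
            coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
            r = y - A[:, support] @ coef
        x = np.zeros(A.shape[1])
        x[support] = coef
        return x

    # A 5-sparse spectrum of length 256 observed through 60 random projections.
    rng = np.random.default_rng(0)
    x_true = np.zeros(256)
    x_true[rng.choice(256, 5, replace=False)] = rng.normal(size=5)
    A = rng.normal(size=(60, 256)) / np.sqrt(60)
    x_rec = omp(A, A @ x_true, n_nonzero=5)
    print(np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true))
    ```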

  16. Dynamic optimization of distributed biological systems using robust and efficient numerical techniques.

    PubMed

    Vilas, Carlos; Balsa-Canto, Eva; García, Maria-Sonia G; Banga, Julio R; Alonso, Antonio A

    2012-07-02

    Systems biology allows the analysis of biological systems behavior under different conditions through in silico experimentation. The possibility of perturbing biological systems in different manners calls for the design of perturbations to achieve particular goals. Examples include the design of a chemical stimulation to maximize the amplitude of a given cellular signal or to achieve a desired pattern in pattern formation systems. Such design problems can be mathematically formulated as dynamic optimization problems, which are particularly challenging when the system is described by partial differential equations. This work addresses the numerical solution of such dynamic optimization problems for spatially distributed biological systems. The usual nonlinear and large-scale nature of the mathematical models related to this class of systems and the presence of constraints on the optimization problems impose a number of difficulties, such as the presence of suboptimal solutions, which call for robust and efficient numerical techniques. Here, the use of a control vector parameterization approach combined with efficient and robust hybrid global optimization methods and a reduced-order model methodology is proposed. The capabilities of this strategy are illustrated by solving two challenging problems: bacterial chemotaxis and the FitzHugh-Nagumo model. In the chemotaxis problem the objective was to efficiently compute the time-varying optimal concentration of chemoattractant on one of the spatial boundaries in order to achieve predefined cell distribution profiles. Results are in agreement with those previously published in the literature. The FitzHugh-Nagumo problem is also efficiently solved and illustrates very well how dynamic optimization may be used to force a system to evolve from an undesired to a desired pattern with a reduced number of actuators. The presented methodology can be used for the efficient dynamic optimization of generic distributed biological systems.
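
    A minimal sketch of control vector parameterization under toy assumptions: the scalar dynamics, target, bounds and segment count below are invented, and a local SciPy optimizer stands in for the paper's hybrid global method.

    ```python
    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import minimize

    # Control vector parameterization: u(t) is piecewise constant on a grid and
    # the segment levels become the decision variables of a finite-dimensional NLP.
    T, n_seg = 1.0, 10
    edges = np.linspace(0.0, T, n_seg + 1)

    def objective(levels):
        def rhs(t, x):
            seg = min(int(np.searchsorted(edges, t, side='right')) - 1, n_seg - 1)
            return [-x[0] + levels[seg]]        # toy dynamics dx/dt = -x + u(t)
        x_T = solve_ivp(rhs, (0.0, T), [0.0]).y[0, -1]
        # Track a terminal target with a small control-effort penalty.
        return (x_T - 0.5) ** 2 + 1e-3 * np.sum(levels ** 2) * (T / n_seg)

    res = minimize(objective, np.zeros(n_seg), bounds=[(0.0, 2.0)] * n_seg)
    print(res.x)    # optimized piecewise-constant control profile
    ```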

  17. A high-order vertex-based central ENO finite-volume scheme for three-dimensional compressible flows

    DOE PAGES

    Charest, Marc R.J.; Canfield, Thomas R.; Morgan, Nathaniel R.; ...

    2015-03-11

    High-order discretization methods offer the potential to reduce the computational cost associated with modeling compressible flows. However, it is difficult to obtain accurate high-order discretizations of conservation laws that do not produce spurious oscillations near discontinuities, especially on multi-dimensional unstructured meshes. A novel, high-order, central essentially non-oscillatory (CENO) finite-volume method that does not have these difficulties is proposed for tetrahedral meshes. The proposed unstructured method is vertex-based, which differs from existing cell-based CENO formulations, and uses a hybrid reconstruction procedure that switches between two different solution representations. It applies a high-order k-exact reconstruction in smooth regions and a limited linear reconstruction when discontinuities are encountered. Both reconstructions use a single, central stencil for all variables, making the application of CENO to arbitrary unstructured meshes relatively straightforward. The new approach was applied to the conservation equations governing compressible flows and assessed in terms of accuracy and computational cost. For all problems considered, which included various function reconstructions and idealized flows, CENO demonstrated excellent reliability and robustness. Up to fifth-order accuracy was achieved in smooth regions and essentially non-oscillatory solutions were obtained near discontinuities. The high-order schemes were also more computationally efficient for high-accuracy solutions, i.e., they took less wall time than the lower-order schemes to achieve a desired level of error. In one particular case, it took a factor of 24 less wall-time to obtain a given level of error with the fourth-order CENO scheme than to obtain the same error with the second-order scheme.

  18. Estimating Function Approaches for Spatial Point Processes

    NASA Astrophysics Data System (ADS)

    Deng, Chong

    Spatial point pattern data consist of locations of events that are often of interest in biological and ecological studies. Such data are commonly viewed as a realization from a stochastic process called a spatial point process. To fit a parametric spatial point process model to such data, likelihood-based methods have been widely studied. However, while maximum likelihood estimation is often too computationally intensive for Cox and cluster processes, pairwise likelihood methods such as composite likelihood and Palm likelihood usually suffer from a loss of information due to ignoring the correlation among pairs. For many types of correlated data other than spatial point processes, when likelihood-based approaches are not desirable, estimating functions have been widely used for model fitting. In this dissertation, we explore estimating function approaches for fitting spatial point process models. These approaches, which are based on asymptotically optimal estimating function theories, can be used to incorporate the correlation among data and yield more efficient estimators. We conducted a series of studies to demonstrate that these estimating function approaches are good alternatives that balance the trade-off between computational complexity and estimation efficiency. First, we propose a new estimating procedure that improves the efficiency of the pairwise composite likelihood method in estimating clustering parameters. Our approach combines estimating functions derived from pairwise composite likelihood estimation and estimating functions that account for correlations among the pairwise contributions. Our method can be used to fit a variety of parametric spatial point process models and can yield more efficient estimators for the clustering parameters than pairwise composite likelihood estimation. We demonstrate its efficacy through a simulation study and an application to the longleaf pine data. Second, we further explore the quasi-likelihood approach for fitting the second-order intensity function of spatial point processes. However, the original second-order quasi-likelihood is barely feasible due to the intense computation and high memory requirement needed to solve a large linear system. Motivated by the existence of geometric regular patterns in stationary point processes, we find a lower-dimensional representation of the optimal weight function and propose a reduced second-order quasi-likelihood approach. Through a simulation study, we show that the proposed method not only demonstrates superior performance in fitting the clustering parameter but also relaxes the constraint on the tuning parameter H. Third, we study the quasi-likelihood-type estimating function that is optimal in a certain class of first-order estimating functions for estimating the regression parameter in spatial point process models. Then, by using a novel spectral representation, we construct an implementation that is computationally much more efficient and can be applied to a more general setup than the original quasi-likelihood method.

  19. Model Order Reduction for the fast solution of 3D Stokes problems and its application in geophysical inversion

    NASA Astrophysics Data System (ADS)

    Ortega Gelabert, Olga; Zlotnik, Sergio; Afonso, Juan Carlos; Díez, Pedro

    2017-04-01

    The determination of the present-day physical state of the thermal and compositional structure of the Earth's lithosphere and sub-lithospheric mantle is one of the main goals in modern lithospheric research. These data are essential to build Earth evolution models and to reproduce many geophysical observables (e.g. elevation, gravity anomalies, travel time data, heat flow, etc.), as well as to understand the relationships between them. Determining the lithospheric state involves the solution of high-resolution inverse problems and, consequently, the solution of many direct models is required. The main objective of this work is to contribute to existing inversion techniques by improving the estimation of elevation (topography) through the inclusion of a dynamic component arising from sub-lithospheric mantle flow. In order to do so, we implement an efficient Reduced Order Method (ROM) built upon classic finite elements. ROM significantly reduces the computational cost of solving a family of problems, for example all the direct models that are required in the solution of the inverse problem. The strategy of the method consists in creating a (reduced) basis of solutions, so that when a new problem has to be solved, its solution is sought within the basis instead of attempting to solve the problem itself. In order to check the Reduced Basis approach, we implemented the method in a 3D domain reproducing a portion of the Earth down to 400 km depth. Within the domain the Stokes equation is solved with realistic viscosities and densities. The different realizations (the family of problems) are created by varying viscosities and densities in a similar way as would happen in an inversion problem. The Reduced Basis method is shown to be an extremely efficient solver for the Stokes equation in this context.
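
    A minimal reduced-basis sketch under stand-in assumptions (a generic affine parametrized linear system rather than the Stokes equations): snapshots are compressed with an SVD, and new parameter values are handled by solving only the small projected system.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    n = 400
    A0 = np.diag(np.arange(1.0, n + 1))            # invented stand-ins for FE matrices
    A1 = rng.normal(size=(n, n)); A1 = 0.01 * (A1 + A1.T)
    b = rng.normal(size=n)
    A = lambda mu: A0 + mu * A1                    # affine parameter dependence

    # Offline: solve the full problem at a few training parameters (snapshots)
    # and compress the snapshots into an orthonormal reduced basis V.
    snapshots = np.column_stack(
        [np.linalg.solve(A(mu), b) for mu in np.linspace(0.1, 2.0, 8)])
    V, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    V = V[:, s > 1e-10 * s[0]]

    # Online: for a new parameter, assemble and solve only the projected system.
    mu = 1.37
    x_rb = V @ np.linalg.solve(V.T @ A(mu) @ V, V.T @ b)
    print(np.linalg.norm(x_rb - np.linalg.solve(A(mu), b)) / np.linalg.norm(b))
    ```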

  20. Speedup computation of HD-sEMG signals using a motor unit-specific electrical source model.

    PubMed

    Carriou, Vincent; Boudaoud, Sofiane; Laforet, Jeremy

    2018-01-23

    Nowadays, bio-reliable modeling of muscle contraction is becoming more accurate and complex. This increasing complexity induces a significant increase in computation time, which prevents the use of such models in certain applications and studies. Accordingly, the aim of this work is to significantly reduce the computation time of high-density surface electromyogram (HD-sEMG) generation. This is done through a new model of motor unit (MU)-specific electrical source based on the fibers composing the MU. In order to assess the efficiency of this approach, we computed the normalized root mean square error (NRMSE) between MU action potentials (MUAPs) generated with the usual fiber electrical sources and with the MU-specific electrical source. This NRMSE was computed for five different simulation sets wherein hundreds of MUAPs are generated and summed into HD-sEMG signals. The obtained results display less than 2% error on the generated signals compared to the same signals generated with fiber electrical sources. Moreover, the computation time of the HD-sEMG signal generation model is reduced by about 90% compared to the fiber electrical source model. Using this model with MU electrical sources, we can simulate HD-sEMG signals of a physiological muscle (hundreds of MUs) in less than an hour on a classical workstation. Graphical Abstract: Overview of the simulation of HD-sEMG signals using the fiber scale and the MU scale; upscaling the electrical source to the MU scale reduces the computation time by 90% while inducing only small deviations in the simulated HD-sEMG signals.

  1. Reduced-order model for dynamic optimization of pressure swing adsorption processes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Agarwal, A.; Biegler, L.; Zitney, S.

    2007-01-01

    Over the past decades, pressure swing adsorption (PSA) processes have been widely used as energy-efficient gas and liquid separation techniques, especially for high purity hydrogen purification from refinery gases. The separation processes are based on solid-gas equilibrium and operate under periodic transient conditions. Models for PSA processes are therefore multiple instances of partial differential equations (PDEs) in time and space with periodic boundary conditions that link the processing steps together. The solution of this coupled stiff PDE system is governed by steep concentration and temperature fronts moving with time. As a result, the optimization of such systems for either design or operation represents a significant computational challenge to current differential algebraic equation (DAE) optimization techniques and nonlinear programming algorithms. Model reduction is one approach to generate cost-efficient low-order models which can be used as surrogate models in the optimization problems. This study develops a reduced-order model (ROM) based on proper orthogonal decomposition (POD), which is a low-dimensional approximation to a dynamic PDE-based model. Initially, a representative ensemble of solutions of the dynamic PDE system is constructed by solving a higher-order discretization of the model using the method of lines, a two-stage approach that discretizes the PDEs in space and then integrates the resulting DAEs over time. Next, the ROM method applies the Karhunen-Loeve expansion to derive a small set of empirical eigenfunctions (POD modes) which are used as basis functions within a Galerkin projection framework to derive a low-order DAE system that accurately describes the dominant dynamics of the PDE system. The proposed method leads to a DAE system of significantly lower order, thus replacing the one obtained from spatial discretization and making the optimization problem computationally efficient. The method has been applied to the dynamic coupled PDE-based model of a two-bed four-step PSA process for separation of hydrogen from methane. Separate ROMs have been developed for each operating step, with different POD modes for each of them. A significant reduction in the number of states has been achieved. The gas-phase mole fraction, solid-state loading and temperature profiles from the low-order ROM and from the high-order simulations have been compared. Moreover, the profiles for a different set of inputs and parameter values fed to the same ROM were compared with the accurate profiles from the high-order simulations. Current results indicate that the proposed ROM methodology is a promising surrogate modeling technique for cost-effective optimization purposes. Moreover, deviations of the ROM for different sets of inputs and parameters suggest that a recalibration of the model is required for the optimization studies. Results for these cases will also be presented.
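
    A sketch of the POD/Galerkin recipe the abstract describes, on a stand-in linear ODE system instead of the PSA model: collect snapshots, extract POD modes via the SVD (equivalent to the Karhunen-Loeve expansion of the snapshot ensemble), and project the dynamics.

    ```python
    import numpy as np
    from scipy.integrate import solve_ivp

    # Full-order surrogate: a linear diffusion-like ODE system dx/dt = A x.
    n = 200
    A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    x0 = np.exp(-0.05 * (np.arange(n) - n / 2) ** 2)

    # Step 1: collect snapshots of the full model along a trajectory.
    t_snap = np.linspace(0.0, 5.0, 60)
    X = solve_ivp(lambda t, x: A @ x, (0.0, 5.0), x0, t_eval=t_snap).y

    # Step 2: POD modes are the left singular vectors of the snapshot matrix.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    V = U[:, :8]                        # keep the 8 dominant modes

    # Step 3: Galerkin projection gives a small ODE in modal coordinates.
    Ar = V.T @ A @ V
    a = solve_ivp(lambda t, a: Ar @ a, (0.0, 5.0), V.T @ x0, t_eval=t_snap).y
    print(np.linalg.norm(V @ a - X) / np.linalg.norm(X))   # relative ROM error
    ```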

  2. Efficient computation of photonic crystal waveguide modes with dispersive material.

    PubMed

    Schmidt, Kersten; Kappeler, Roman

    2010-03-29

    The optimization of PhC waveguides is a key issue for successfully designing PhC devices. Since this design task is computationally expensive, efficient methods are demanded. The available codes for computing photonic bands are also applied to PhC waveguides. They are reliable but not very efficient, which is even more pronounced for dispersive material. We present a method based on higher order finite elements with curved cells, which solves for the band structure while directly taking into account the dispersiveness of the materials. This is accomplished by reformulating the wave equations as a linear eigenproblem in the complex wave-vectors k. For this method, we demonstrate high efficiency in the computation of guided PhC waveguide modes through a convergence analysis.
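
    The paper's finite element matrices are not given here; the sketch below only illustrates the quoted reformulation idea, turning a quadratic eigenproblem in the wavevector k into a linear generalized eigenproblem by companion linearization.

    ```python
    import numpy as np
    from scipy.linalg import eig

    def quadratic_evp(K0, K1, K2):
        """Eigen-wavevectors k of (K0 + k K1 + k^2 K2) x = 0 via linearization."""
        n = K0.shape[0]
        I, Z = np.eye(n), np.zeros((n, n))
        # With y = k x, the quadratic pencil becomes a linear generalized problem.
        L = np.block([[Z, I], [-K0, -K1]])
        R = np.block([[I, Z], [Z, K2]])
        k, vecs = eig(L, R)
        return k, vecs[:n]          # eigenvalues and the x-part of eigenvectors

    # Scalar check: k^2 - 3k + 2 = 0 has roots k = 1, 2.
    k, _ = quadratic_evp(np.array([[2.0]]), np.array([[-3.0]]), np.array([[1.0]]))
    print(np.sort(k.real))
    ```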

  3. Efficient conjugate gradient algorithms for computation of the manipulator forward dynamics

    NASA Technical Reports Server (NTRS)

    Fijany, Amir; Scheid, Robert E.

    1989-01-01

    The applicability of conjugate gradient algorithms for computation of the manipulator forward dynamics is investigated. The redundancies in the previously proposed conjugate gradient algorithm are analyzed. A new version is developed which, by avoiding these redundancies, achieves significantly greater efficiency. A preconditioned conjugate gradient algorithm is also presented. A diagonal matrix whose elements are the diagonal elements of the inertia matrix is proposed as the preconditioner. In order to increase the computational efficiency, an algorithm is developed which exploits the synergism between the computation of the diagonal elements of the inertia matrix and that required by the conjugate gradient algorithm.
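
    A minimal NumPy sketch of the preconditioning idea described above, with a diagonal (Jacobi) preconditioner standing in for the diagonal of the inertia matrix; the test matrix is an invented SPD stand-in, not a manipulator model.

    ```python
    import numpy as np

    def pcg(A, b, M_diag, tol=1e-10, max_iter=200):
        """Conjugate gradients with a diagonal (Jacobi) preconditioner."""
        x = np.zeros_like(b)
        r = b - A @ x
        z = r / M_diag               # applying M^{-1} is trivial for a diagonal
        p = z.copy()
        rz = r @ z
        for _ in range(max_iter):
            Ap = A @ p
            alpha = rz / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            if np.linalg.norm(r) < tol:
                break
            z = r / M_diag
            rz_new = r @ z
            p = z + (rz_new / rz) * p
            rz = rz_new
        return x

    # Example: an SPD "inertia-like" matrix; diag(A) is the preconditioner.
    rng = np.random.default_rng(2)
    B = rng.normal(size=(50, 50))
    A = B @ B.T + 50 * np.eye(50)
    b = rng.normal(size=50)
    x = pcg(A, b, np.diag(A))
    print(np.linalg.norm(A @ x - b))
    ```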

  4. Rapid Calculation of Spacecraft Trajectories Using Efficient Taylor Series Integration

    NASA Technical Reports Server (NTRS)

    Scott, James R.; Martini, Michael C.

    2011-01-01

    A variable-order, variable-step Taylor series integration algorithm was implemented in NASA Glenn's SNAP (Spacecraft N-body Analysis Program) code. SNAP is a high-fidelity trajectory propagation program that can propagate the trajectory of a spacecraft about virtually any body in the solar system. The Taylor series algorithm's very high order accuracy and excellent stability properties lead to large reductions in computer time relative to the code's existing 8th order Runge-Kutta scheme. Head-to-head comparison on near-Earth, lunar, Mars, and Europa missions showed that Taylor series integration is 15.8 times faster than Runge-Kutta on average, and is more accurate. These speedups were obtained for calculations involving central body, other body, thrust, and drag forces. Similar speedups have been obtained for calculations that include the J2 spherical harmonic for central body gravitation. The algorithm includes a step size selection method that directly calculates the step size and never requires a repeat step. High-order Taylor series integration algorithms have been shown to provide major reductions in computer time over conventional integration methods in numerous scientific applications. The objective here was to directly implement Taylor series integration in an existing trajectory analysis code and demonstrate that large reductions in computer time (order of magnitude) could be achieved while simultaneously maintaining high accuracy. This software greatly accelerates the calculation of spacecraft trajectories. At each time level, the spacecraft position, velocity, and mass are expanded in a high-order Taylor series whose coefficients are obtained through efficient differentiation arithmetic. This makes it possible to take very large time steps at minimal cost, resulting in large savings in computer time. The Taylor series algorithm is implemented primarily through three subroutines: (1) a driver routine that automatically introduces auxiliary variables, sets up initial conditions, and integrates; (2) a routine that calculates system reduced derivatives using recurrence relations for quotients and products; and (3) a routine that determines the step size and sums the series. The order of accuracy used in a trajectory calculation is arbitrary and can be set by the user. The algorithm directly calculates the motion of other planetary bodies and does not require ephemeris files (except to start the calculation). The code also runs with Taylor series and Runge-Kutta used interchangeably for different phases of a mission.
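
    A minimal illustration of Taylor series integration with recurrence relations, for the toy equation y' = y^2, whose product nonlinearity admits a Cauchy-product recurrence; the step size and order below are arbitrary choices, not SNAP's.

    ```python
    import numpy as np

    def taylor_step(y0, h, order=20):
        """One step of Taylor integration for y' = y^2: the coefficients are
        built by a recurrence (the product rule of differentiation arithmetic),
        then the series is summed at t = h by Horner evaluation."""
        a = np.zeros(order + 1)
        a[0] = y0
        for n in range(order):
            # Cauchy product: coefficient of t^n in y(t)^2.
            a[n + 1] = np.dot(a[:n + 1], a[n::-1]) / (n + 1)
        y = 0.0
        for c in a[::-1]:
            y = y * h + c
        return y

    # March to t = 0.5 for y(0) = 1; the exact solution is 1 / (1 - t).
    y, t, h = 1.0, 0.0, 0.05
    while t < 0.5 - 1e-12:
        y = taylor_step(y, h)
        t += h
    print(y, 1.0 / (1.0 - t))   # high-order accuracy with large steps
    ```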

  5. Compute as Fast as the Engineers Can Think! ULTRAFAST COMPUTING TEAM FINAL REPORT

    NASA Technical Reports Server (NTRS)

    Biedron, R. T.; Mehrotra, P.; Nelson, M. L.; Preston, M. L.; Rehder, J. J.; Rogers, J. L.; Rudy, D. H.; Sobieski, J.; Storaasli, O. O.

    1999-01-01

    This report documents findings and recommendations by the Ultrafast Computing Team (UCT). In the period 10-12/98, UCT reviewed design case scenarios for a supersonic transport and a reusable launch vehicle to derive computing requirements necessary for support of a design process with efficiency so radically improved that human thought rather than the computer paces the process. Assessment of the present computing capability against the above requirements indicated a need for further improvement in computing speed by several orders of magnitude to reduce time to solution from tens of hours to seconds in major applications. Evaluation of the trends in computer technology revealed a potential to attain the postulated improvement by further increases of single processor performance combined with massively parallel processing in a heterogeneous environment. However, utilization of massively parallel processing to its full capability will require redevelopment of the engineering analysis and optimization methods, including invention of new paradigms. To that end UCT recommends initiation of a new activity at LaRC called Computational Engineering for development of new methods and tools geared to the new computer architectures in disciplines, their coordination, and validation and benefit demonstration through applications.

  6. Accurate Monotonicity-Preserving Schemes With Runge-Kutta Time Stepping

    NASA Technical Reports Server (NTRS)

    Suresh, A.; Huynh, H. T.

    1997-01-01

    A new class of high-order monotonicity-preserving schemes for the numerical solution of conservation laws is presented. The interface value in these schemes is obtained by limiting a higher-order polynomial reconstruction. The limiting is designed to preserve accuracy near extrema and to work well with Runge-Kutta time stepping. Computational efficiency is enhanced by a simple test that determines whether the limiting procedure is needed. Numerical results for linear advection in one dimension, as well as for the Euler equations, confirm the schemes' high accuracy, good shock resolution, and computational efficiency.

  7. Algorithms for Efficient Computation of Transfer Functions for Large Order Flexible Systems

    NASA Technical Reports Server (NTRS)

    Maghami, Peiman G.; Giesy, Daniel P.

    1998-01-01

    An efficient and robust computational scheme is given for the calculation of the frequency response function of a large order, flexible system implemented with a linear, time invariant control system. Advantage is taken of the highly structured sparsity of the system matrix of the plant based on a model of the structure using normal mode coordinates. The computational time per frequency point of the new computational scheme is a linear function of system size, a significant improvement over traditional full-matrix techniques whose computational times per frequency point range from quadratic to cubic functions of system size. This permits the practical frequency domain analysis of systems of much larger order than by traditional full-matrix techniques. Formulations are given for both open- and closed-loop systems. Numerical examples are presented showing the advantages of the present formulation over traditional approaches, both in speed and in accuracy. Using a model with 703 structural modes, the present method was up to two orders of magnitude faster than a traditional method. The present method generally showed good to excellent accuracy throughout the range of test frequencies, while traditional methods gave adequate accuracy for lower frequencies, but generally deteriorated in performance at higher frequencies, with worst case errors being many orders of magnitude times the correct values.
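
    A sketch of why normal-mode coordinates give linear cost per frequency point: with uncoupled modal equations the frequency response is a sum over modes, evaluated below by broadcasting. The open-loop single-input/single-output setting and the random modal data are simplifying assumptions; the paper also treats closed-loop systems.

    ```python
    import numpy as np

    def modal_frf(omega, wn, zeta, b, c):
        """Frequency response in normal-mode coordinates: each frequency point
        costs O(n_modes), i.e. linear in system size."""
        denom = wn**2 - omega[:, None]**2 + 2j * zeta * wn * omega[:, None]
        return (c * b / denom).sum(axis=1)

    # 703 structural modes and one input/output pair, as an invented stand-in.
    rng = np.random.default_rng(3)
    n = 703
    wn = np.sort(rng.uniform(1.0, 1e3, n))      # natural frequencies (rad/s)
    zeta = np.full(n, 0.01)                     # modal damping ratios
    b, c = rng.normal(size=n), rng.normal(size=n)
    H = modal_frf(np.linspace(0.1, 500.0, 2048), wn, zeta, b, c)
    print(H.shape)
    ```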

  8. Computational methods for aerodynamic design using numerical optimization

    NASA Technical Reports Server (NTRS)

    Peeters, M. F.

    1983-01-01

    Five methods to increase the computational efficiency of aerodynamic design using numerical optimization, by reducing the computer time required to perform gradient calculations, are examined. The most promising method consists of drastically reducing the size of the computational domain on which aerodynamic calculations are made during gradient calculations. Since a gradient calculation requires the solution of the flow about an airfoil whose geometry was slightly perturbed from a base airfoil, the flow about the base airfoil is used to determine boundary conditions on the reduced computational domain. This method worked well in subcritical flow.

  9. An alternative low-loss stack topology for vanadium redox flow battery: Comparative assessment

    NASA Astrophysics Data System (ADS)

    Moro, Federico; Trovò, Andrea; Bortolin, Stefano; Del Col, Davide; Guarnieri, Massimo

    2017-02-01

    Two vanadium redox flow battery topologies have been compared. In the conventional series stack, bipolar plates connect cells electrically in series and hydraulically in parallel. The alternative topology consists of cells connected in parallel inside stacks by means of monopolar plates in order to reduce shunt currents along channels and manifolds. Channelled and flat current collectors interposed between cells were considered in both topologies. In order to compute the stack losses, an equivalent circuit model of a VRFB cell was built from a 2D FEM multiphysics numerical model based on Comsol®, accounting for coupled electrical, electrochemical, and charge and mass transport phenomena. Shunt currents were computed inside the cells with 3D FEM models and in the piping and manifolds by means of equivalent circuits solved with Matlab®. Hydraulic losses were computed with analytical models in piping and manifolds and with 3D numerical analyses based on ANSYS Fluent® in the cell porous electrodes. Total losses in the alternative topology turned out to be one order of magnitude lower than in an equivalent conventional battery. The alternative topology with channelled current collectors exhibits the lowest shunt currents and hydraulic losses, with round-trip efficiency higher by about 10%, as compared to the conventional topology.

  10. AN EFFICIENT HIGHER-ORDER FAST MULTIPOLE BOUNDARY ELEMENT SOLUTION FOR POISSON-BOLTZMANN BASED MOLECULAR ELECTROSTATICS

    PubMed Central

    Bajaj, Chandrajit; Chen, Shun-Chuan; Rand, Alexander

    2011-01-01

    In order to compute polarization energy of biomolecules, we describe a boundary element approach to solving the linearized Poisson-Boltzmann equation. Our approach combines several important features including the derivative boundary formulation of the problem and a smooth approximation of the molecular surface based on the algebraic spline molecular surface. State of the art software for numerical linear algebra and the kernel independent fast multipole method is used for both simplicity and efficiency of our implementation. We perform a variety of computational experiments, testing our method on a number of actual proteins involved in molecular docking and demonstrating the effectiveness of our solver for computing molecular polarization energy. PMID:21660123

  11. Efficient parallel resolution of the simplified transport equations in mixed-dual formulation

    NASA Astrophysics Data System (ADS)

    Barrault, M.; Lathuilière, B.; Ramet, P.; Roman, J.

    2011-03-01

    A reactivity computation consists of computing the highest eigenvalue of a generalized eigenvalue problem, for which an inverse power algorithm is commonly used. Very fine models are difficult to treat with our sequential solver, based on the simplified transport equations, in terms of memory consumption and computational time. A first implementation of a Lagrangian-based domain decomposition method leads to poor parallel efficiency because of an increase in the power iterations [1]. In order to obtain high parallel efficiency, we improve the parallelization scheme by changing the location of the loop over the subdomains in the overall algorithm and by benefiting from the characteristics of the Raviart-Thomas finite element. The new parallel algorithm still allows us to locally adapt the numerical scheme (mesh, finite element order). However, it can be significantly optimized for the matching-grid case. The good behavior of the new parallelization scheme is demonstrated for the matching-grid case on several hundreds of nodes for computations based on a pin-by-pin discretization.
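
    A minimal sketch of the outer eigenvalue iteration described above, for a generic generalized problem A phi = (1/k) F phi with invented toy operators; the factorization of A is reused across iterations, which is why the efficiency of the inner solver dominates the overall cost.

    ```python
    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import splu

    def power_iteration(A, F, tol=1e-8, max_iter=2000):
        """Dominant eigenvalue k of A phi = (1/k) F phi via power iteration
        on A^{-1} F, with a Rayleigh-quotient estimate of k."""
        lu = splu(A.tocsc())
        phi = np.ones(A.shape[0])
        phi /= np.linalg.norm(phi)
        k = 0.0
        for _ in range(max_iter):
            w = lu.solve(F @ phi)        # one inner solve per outer iteration
            k_new = phi @ w
            phi = w / np.linalg.norm(w)
            if abs(k_new - k) < tol:
                break
            k = k_new
        return k, phi

    # Toy stand-ins: diffusion-like operator A, uniform fission-like source F.
    n = 100
    A = sp.diags([-1.0, 2.1, -1.0], [-1, 0, 1], shape=(n, n))
    F = 0.9 * sp.identity(n)
    print(power_iteration(A, F)[0])
    ```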

  12. An Efficient Scheme for Updating Sparse Cholesky Factors

    NASA Technical Reports Server (NTRS)

    Raghavan, Padma

    2002-01-01

    Raghavan had earlier developed the software package DSCPACK which can be used for solving sparse linear systems where the coefficient matrix is symmetric and positive definite (this project was not funded by NASA but by agencies such as NSF). DSCPACK-S is the serial code and DSCPACK-P is a parallel implementation suitable for multiprocessors or networks-of-workstations with message passing using MPI. The main algorithm used is the Cholesky factorization of a sparse symmetric positive definite matrix A = LL^T. The code can also compute the factorization A = LDL^T. The complexity of the software arises from several factors relating to the sparsity of the matrix A. A sparse N x N matrix A typically has fewer than cN nonzeros, where c is a small constant. If the matrix were dense, it would have O(N^2) nonzeros. The most complicated part of such sparse Cholesky factorization relates to fill-in, i.e., zeros in the original matrix that become nonzeros in the factor L. An efficient implementation depends to a large extent on complex data structures and on techniques from graph theory to reduce, identify, and manage fill. DSCPACK is based on an efficient multifrontal implementation with fill-managing algorithms and implementation arising from earlier research by Raghavan and others. Sparse Cholesky factorization is typically a four-step process: (1) ordering to compute a fill-reducing numbering, (2) symbolic factorization to determine the nonzero structure of L, (3) numeric factorization to compute L, and (4) triangular solution to solve L^T x = y and Ly = b. The first two steps are symbolic and are performed using the graph of the matrix. The numeric factorization step is of dominant cost and there are several schemes for improving performance by exploiting the nested and dense structure of groups of columns in the factor. The latter are aimed at better utilization of the cache-memory hierarchy on modern processors to prevent cache misses and provide execution rates (operations/second) that are close to the peak rates for dense matrix computations. Currently, DSCPACK is being used in an application at NASA directed by J. Newman and M. James. We propose the implementation of efficient schemes for updating the LL^T or LDL^T factors computed in DSCPACK-S to meet the computational requirements of their project. A brief description is provided in the next section.
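
    SciPy has no sparse Cholesky, so the sketch below uses a sparse LU of an SPD matrix to illustrate step (1) of the four-step process: how a fill-reducing ordering (here reverse Cuthill-McKee) changes the number of nonzeros in the factor. The test matrix is an invented stand-in.

    ```python
    import scipy.sparse as sp
    from scipy.sparse.csgraph import reverse_cuthill_mckee
    from scipy.sparse.linalg import splu

    # A sparse SPD test matrix (invented stand-in for a structural matrix).
    n = 300
    R = sp.random(n, n, density=0.02, random_state=4)
    A = (R @ R.T + n * sp.identity(n)).tocsr()

    perm = reverse_cuthill_mckee(A, symmetric_mode=True)
    Ap = A[perm][:, perm]

    # permc_spec='NATURAL' stops SuperLU from re-permuting internally, so the
    # factor fill-in reflects the ordering we supplied.
    for name, mat in (("natural", A), ("RCM", Ap)):
        lu = splu(mat.tocsc(), permc_spec='NATURAL')
        print(name, "nonzeros in L:", lu.L.nnz)
    ```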

  13. Framework for computationally efficient optimal irrigation scheduling using ant colony optimization

    USDA-ARS?s Scientific Manuscript database

    A general optimization framework is introduced with the overall goal of reducing search space size and increasing the computational efficiency of evolutionary algorithm application for optimal irrigation scheduling. The framework achieves this goal by representing the problem in the form of a decisi...

  14. A Reduced Order Model for Whole-Chip Thermal Analysis of Microfluidic Lab-on-a-Chip Systems

    PubMed Central

    Wang, Yi; Song, Hongjun; Pant, Kapil

    2013-01-01

    This paper presents a Krylov subspace projection-based Reduced Order Model (ROM) for whole microfluidic chip thermal analysis, including conjugate heat transfer. Two key steps in the reduced order modeling procedure are described in detail: (1) the acquisition of a 3D full-scale computational model in state-space form to capture the dynamic thermal behavior of the entire microfluidic chip; and (2) the model order reduction using the Block Arnoldi algorithm to markedly lower the dimension of the full-scale model. Case studies using a practically relevant thermal microfluidic chip are undertaken to establish the capability and to evaluate the computational performance of the reduced order modeling technique. The ROM is compared against the full-scale model and exhibits good agreement in spatiotemporal thermal profiles (<0.5% relative error in pertinent time scales) and over three orders-of-magnitude acceleration in computational speed. The salient model reusability and real-time simulation capability render it amenable to operational optimization and in-line thermal control and management of microfluidic systems and devices. PMID:24443647
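
    A sketch of the Block Arnoldi projection step on an invented stand-in state-space model; moment-matching variants often build the Krylov space on A^{-1} instead, a detail the abstract does not fix.

    ```python
    import numpy as np

    def block_arnoldi(A, B, m):
        """Orthonormal basis of the block Krylov space span{B, AB, ..., A^{m-1}B},
        used to project a large state-space model onto a small subspace."""
        Q, _ = np.linalg.qr(B)
        V = [Q]
        for _ in range(m - 1):
            W = A @ V[-1]
            for Vj in V:                     # block modified Gram-Schmidt
                W -= Vj @ (Vj.T @ W)
            Q, _ = np.linalg.qr(W)
            V.append(Q)
        return np.hstack(V)

    # Invented thermal-like state-space model dx/dt = A x + B u, y = C x.
    rng = np.random.default_rng(5)
    n, p = 500, 2
    A = -np.diag(rng.uniform(1.0, 100.0, n)) + 0.1 * rng.normal(size=(n, n)) / n
    B = rng.normal(size=(n, p))
    C = rng.normal(size=(1, n))

    V = block_arnoldi(A, B, m=10)             # 20-dimensional reduced space
    Ar, Br, Cr = V.T @ A @ V, V.T @ B, C @ V  # projected (reduced) matrices
    print(V.shape, Ar.shape)
    ```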

  15. A Semi-Vectorization Algorithm to Synthesis of Gravitational Anomaly Quantities on the Earth

    NASA Astrophysics Data System (ADS)

    Abdollahzadeh, M.; Eshagh, M.; Najafi Alamdari, M.

    2009-04-01

    The Earth's gravitational potential can be expressed by the well-known spherical harmonic expansion. The computational time of summing up this expansion is an important practical issue, which can be reduced by an efficient numerical algorithm. This paper proposes such a method for block-wise synthesis of the anomaly quantities on the Earth's surface using vectorization. Full vectorization means transforming the summations into simple matrix and vector products, which is not practical for matrices with large dimensions. Here a semi-vectorization algorithm is proposed to avoid working with large vectors and matrices. It speeds up the computations by using one loop for the summation, either over degrees or over orders. The former is a good option for synthesizing the anomaly quantities on the Earth's surface considering a digital elevation model (DEM). This approach is more efficient than the two-step method, which computes the quantities on the reference ellipsoid and continues them upward to the Earth's surface. The algorithm has been coded in MATLAB, which synthesizes a global grid of 5′ × 5′ (about 9 million points) of gravity anomaly or geoid height using a geopotential model to degree 360 in 10000 seconds on an ordinary computer with 2 GB of RAM.

  16. Post-Optimality Analysis In Aerospace Vehicle Design

    NASA Technical Reports Server (NTRS)

    Braun, Robert D.; Kroo, Ilan M.; Gage, Peter J.

    1993-01-01

    This analysis pertains to the applicability of optimal sensitivity information to aerospace vehicle design. An optimal sensitivity (or post-optimality) analysis refers to computations performed once the initial optimization problem is solved. These computations may be used to characterize the design space about the present solution and infer changes in this solution as a result of constraint or parameter variations, without reoptimizing the entire system. The present analysis demonstrates that post-optimality information generated through first-order computations can be used to accurately predict the effect of constraint and parameter perturbations on the optimal solution. This assessment is based on the solution of an aircraft design problem in which the post-optimality estimates are shown to be within a few percent of the true solution over the practical range of constraint and parameter variations. Through solution of a reusable, single-stage-to-orbit, launch vehicle design problem, this optimal sensitivity information is also shown to improve the efficiency of the design process. For a hierarchically decomposed problem, this computational efficiency is realized by estimating the main-problem objective gradient through optimal sensitivity calculations. By reducing the need for finite differentiation of a re-optimized subproblem, a significant decrease in the number of objective function evaluations required to reach the optimal solution is obtained.

  17. Real Time Metrics and Analysis of Integrated Arrival, Departure, and Surface Operations

    NASA Technical Reports Server (NTRS)

    Sharma, Shivanjli; Fergus, John

    2017-01-01

    A real time dashboard was developed to present users with notifications and integrated information regarding airport surface operations. The dashboard is a supplement to capabilities and tools that incorporate arrival, departure, and surface air-traffic operations concepts in a NextGen environment. As trajectory-based departure scheduling and collaborative decision making tools are introduced in order to reduce delays and uncertainties in taxi and climb operations across the National Airspace System, users across a number of roles benefit from a real time system that enables common situational awareness. In addition to shared situational awareness, the dashboard offers the ability to compute real time metrics and analysis to inform users about the capacity, predictability, and efficiency of the system as a whole. This paper describes the architecture of the real time dashboard as well as an initial set of metrics computed on operational data. The potential impact of the real time dashboard is studied at the site identified for initial deployment and demonstration in 2017: Charlotte Douglas International Airport. Analysis and metrics computed in real time illustrate the opportunity to provide common situational awareness and inform users of metrics across delay, throughput, taxi time, and airport capacity. In addition, common awareness of delays and the impact of takeoff and departure restrictions stemming from traffic flow management initiatives are explored. The potential of the real time tool to inform the predictability and efficiency of using a trajectory-based departure scheduling system is also discussed.

  18. Constrained Total Generalized p-Variation Minimization for Few-View X-Ray Computed Tomography Image Reconstruction.

    PubMed

    Zhang, Hanming; Wang, Linyuan; Yan, Bin; Li, Lei; Cai, Ailong; Hu, Guoen

    2016-01-01

    Total generalized variation (TGV)-based computed tomography (CT) image reconstruction, which utilizes high-order image derivatives, is superior to total variation-based methods in terms of the preservation of edge information and the suppression of unfavorable staircase effects. However, conventional TGV regularization employs an l1-based form, which is not the most direct way to maximize the sparsity prior. In this study, we propose a total generalized p-variation (TGpV) regularization model to improve the sparsity exploitation of TGV and offer efficient solutions to few-view CT image reconstruction problems. To solve the nonconvex optimization problem of the TGpV minimization model, we then present an efficient iterative algorithm based on alternating minimization of the augmented Lagrangian function. All of the resulting subproblems decoupled by variable splitting admit explicit solutions by applying the alternating minimization method and generalized p-shrinkage mapping. In addition, approximate solutions that can be easily performed and quickly calculated through the fast Fourier transform are derived using the proximal point method to reduce the cost of the inner subproblems. Results on simulated and real data are qualitatively and quantitatively evaluated to validate the efficiency and feasibility of the proposed method. Overall, the proposed method exhibits reasonable performance and outperforms the original TGV-based method when applied to few-view problems.
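
    One common form of the generalized p-shrinkage mapping (due to Chartrand) is sketched below; whether the paper uses exactly this variant is not stated in the abstract, so treat the formula as an assumption. For p = 1 it reduces to ordinary soft thresholding.

    ```python
    import numpy as np

    def p_shrink(x, lam, p):
        """Generalized p-shrinkage: sign(x) * max(|x| - lam^(2-p) |x|^(p-1), 0).
        p < 1 penalizes large entries less, a closer proxy to l0 sparsity."""
        mag = np.abs(x)
        pw = np.power(mag, p - 1.0, out=np.zeros_like(mag), where=mag > 0)
        return np.sign(x) * np.maximum(mag - lam ** (2.0 - p) * pw, 0.0)

    x = np.array([-3.0, -0.5, 0.0, 0.2, 2.0])
    print(p_shrink(x, lam=0.5, p=1.0))   # soft thresholding
    print(p_shrink(x, lam=0.5, p=0.5))   # nonconvex p-shrinkage
    ```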

  19. Numerical simulation of the vortical flow around a pitching airfoil

    NASA Astrophysics Data System (ADS)

    Fu, Xiang; Li, Gaohua; Wang, Fuxin

    2017-04-01

    In order to study the dynamic behaviors of the flapping wing, the vortical flow around a pitching NACA0012 airfoil is investigated. The unsteady flow field is obtained by a very efficient zonal procedure based on the velocity-vorticity formulation, and the Reynolds number based on the chord length of the airfoil is set to 1 million. The zonal procedure divides the whole computational domain into three zones: a potential flow zone, a boundary layer zone and a Navier-Stokes zone. Since vorticity is absent in the potential flow zone, the vorticity transport equation needs to be solved only in the boundary layer and Navier-Stokes zones. Moreover, the boundary layer equations are solved in the boundary layer zone. This arrangement drastically reduces the computation time compared with traditional numerical methods. After the flow field computation, the evolution of the vortices around the airfoil is analyzed in detail.

  20. Scalable software-defined optical networking with high-performance routing and wavelength assignment algorithms.

    PubMed

    Lee, Chankyun; Cao, Xiaoyuan; Yoshikane, Noboru; Tsuritani, Takehiro; Rhee, June-Koo Kevin

    2015-10-19

    The feasibility of software-defined optical networking (SDON) for practical applications critically depends on the scalability of centralized control performance. In this paper, highly scalable routing and wavelength assignment (RWA) algorithms are investigated on an OpenFlow-based SDON testbed for proof-of-concept demonstration. Efficient RWA algorithms are proposed to achieve high network capacity with reduced computation cost, which is a significant attribute in a scalable centralized-control SDON. The proposed heuristic RWA algorithms differ in the order in which requests are processed and in the procedures for routing table updates. Combined with a shortest-path-based routing algorithm, a hottest-request-first processing policy that considers demand intensity and end-to-end distance information offers both the highest network throughput and acceptable computation scalability. We further investigate the trade-off relationship between network throughput and computation complexity in the routing table update procedure through a simulation study.
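
    A toy sketch of the RWA ingredients named above: requests sorted hottest-first (approximated here by end-to-end distance only), shortest-path routing, and first-fit wavelength assignment. The graph and demand set are invented.

    ```python
    import networkx as nx

    def rwa_first_fit(G, requests, n_wavelengths):
        """Shortest-path routing with first-fit wavelength assignment."""
        used = {tuple(sorted(e)): set() for e in G.edges}  # busy wavelengths per link
        assignment = {}
        order = sorted(requests, reverse=True,
                       key=lambda r: nx.shortest_path_length(G, *r))
        for s, d in order:
            path = nx.shortest_path(G, s, d)
            links = [tuple(sorted(e)) for e in zip(path, path[1:])]
            # First-fit: lowest-index wavelength free on every link of the path.
            for w in range(n_wavelengths):
                if all(w not in used[e] for e in links):
                    for e in links:
                        used[e].add(w)
                    assignment[(s, d)] = (path, w)
                    break       # requests that find no wavelength are blocked
        return assignment

    G = nx.cycle_graph(6)
    print(rwa_first_fit(G, [(0, 3), (1, 4), (2, 5)], n_wavelengths=2))
    ```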

  1. Raster Scan Computer Image Generation (CIG) System Based On Refresh Memory

    NASA Astrophysics Data System (ADS)

    Dichter, W.; Doris, K.; Conkling, C.

    1982-06-01

    A full color, Computer Image Generation (CIG) raster visual system has been developed which provides a high level of training sophistication by utilizing advanced semiconductor technology and innovative hardware and firmware techniques. Double buffered refresh memory and efficient algorithms eliminate the problem of conventional raster line ordering by allowing the generated image to be stored in a random fashion. Modular design techniques and simplified architecture provide significant advantages in reduced system cost, standardization of parts, and high reliability. The major system components are a general purpose computer to perform interfacing and data base functions; a geometric processor to define the instantaneous scene image; a display generator to convert the image to a video signal; an illumination control unit which provides final image processing; and a CRT monitor for display of the completed image. Additional optional enhancements include texture generators, increased edge and occultation capability, curved surface shading, and data base extensions.

  2. Muons in the CMS High Level Trigger System

    NASA Astrophysics Data System (ADS)

    Verwilligen, Piet; CMS Collaboration

    2016-04-01

    The trigger systems of LHC detectors play a fundamental role in defining the physics capabilities of the experiments. A reduction of several orders of magnitude in the rate of collected events, with respect to the proton-proton bunch crossing rate generated by the LHC, is mandatory to cope with the limits imposed by the readout and storage system. An accurate and efficient online selection mechanism is thus required to fulfill this task while keeping the acceptance of physics signals maximal. The CMS experiment operates a two-level trigger system. First, a Level-1 Trigger (L1T) system, implemented using custom-designed electronics, is designed to reduce the event rate to a limit compatible with the CMS Data Acquisition (DAQ) capabilities. A High Level Trigger system (HLT) follows, aimed at further reducing the rate of collected events finally stored for analysis purposes. The latter consists of a streamlined version of the CMS offline reconstruction software and operates on a computer farm. It runs algorithms optimized to make a trade-off between computational complexity, rate reduction and high selection efficiency. With the computing power available in 2012, the maximum reconstruction time at HLT was about 200 ms per event, at the nominal L1T rate of 100 kHz. An efficient selection of muons at HLT, as well as an accurate measurement of their properties, such as transverse momentum and isolation, is fundamental for the CMS physics programme. The performance of the muon HLT for single and double muon triggers achieved in Run I will be presented. Results from new developments, aimed at improving the performance of the algorithms for the harsher scenarios of collisions per event (pile-up) and luminosity expected in Run II, will also be discussed.

  3. An efficient and portable SIMD algorithm for charge/current deposition in Particle-In-Cell codes

    NASA Astrophysics Data System (ADS)

    Vincenti, H.; Lobet, M.; Lehe, R.; Sasanka, R.; Vay, J.-L.

    2017-01-01

    In current computer architectures, data movement (from die to network) is by far the most energy-consuming part of an algorithm (≈20 pJ/word on-die to ≈10,000 pJ/word on the network). To increase memory locality at the hardware level and reduce energy consumption related to data movement, future exascale computers tend to use many-core processors on each compute node, with a reduced clock speed to allow for efficient cooling. To compensate for the frequency decrease, machine vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD register length is expected to double every four years. As a consequence, Particle-In-Cell (PIC) codes will have to achieve good vectorization to fully take advantage of these upcoming architectures. In this paper, we present a new algorithm that allows for efficient and portable SIMD vectorization of current/charge deposition routines that are, along with the field gathering routines, among the most time consuming parts of the PIC algorithm. Our new algorithm uses a particular data structure that takes into account memory alignment constraints and avoids gather/scatter instructions that can significantly affect vectorization performance on current CPUs. The new algorithm was successfully implemented in the 3D skeleton PIC code PICSAR and tested on Haswell Xeon processors (AVX2, 256-bit-wide data registers). Results show a speed-up factor of 2 to 2.5 in double precision for particle shape factors of orders 1-3. The new algorithm can be applied as is on future KNL (Knights Landing) architectures that will include AVX-512 instruction sets with 512-bit register lengths (8 doubles/16 singles).
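
    The paper's kernels use intrinsics-friendly data layouts in a compiled language; the NumPy sketch below only illustrates the underlying scatter-add deposition on a structure-of-arrays layout for an order-1 shape factor in 1D.

    ```python
    import numpy as np

    # Charge deposition: scatter each particle's charge to its cell's grid
    # nodes; np.add.at is an unbuffered scatter-add, the serial analogue of
    # the conflict-prone step that SIMD PIC codes must reorganize.
    rng = np.random.default_rng(6)
    n_cells, n_part = 64, 100_000
    x = rng.uniform(0.0, 1.0, n_part)            # particle positions (SoA layout)
    q = np.full(n_part, 1.0 / n_part)            # particle charges

    dx = 1.0 / n_cells
    i = np.floor(x / dx).astype(np.int64)        # cell index of each particle
    w = x / dx - i                               # linear (order-1) shape weight

    rho = np.zeros(n_cells + 1)
    np.add.at(rho, i, q * (1.0 - w))             # deposit to left grid node
    np.add.at(rho, i + 1, q * w)                 # deposit to right grid node
    print(rho.sum())                             # total charge is conserved: 1.0
    ```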

  4. Large-Scale Optimization for Bayesian Inference in Complex Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Willcox, Karen; Marzouk, Youssef

    2013-11-12

    The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focused on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. The project was a collaborative effort among MIT, the University of Texas at Austin, Georgia Institute of Technology, and Sandia National Laboratories. The research was directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. The MIT--Sandia component of the SAGUARO Project addressed the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas--Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to achieve their speedups.

  5. Final Report: Large-Scale Optimization for Bayesian Inference in Complex Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ghattas, Omar

    2013-10-15

    The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focuses on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. Our research is directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. Our efforts are integrated in the context of a challenging testbed problem that considers subsurface reacting flow and transport. The MIT component of the SAGUARO Project addresses the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas-Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to achieve their speedups.

  6. GPU computing of compressible flow problems by a meshless method with space-filling curves

    NASA Astrophysics Data System (ADS)

    Ma, Z. H.; Wang, H.; Pu, S. H.

    2014-04-01

    A graphic processing unit (GPU) implementation of a meshless method for solving compressible flow problems is presented in this paper. A least-squares fit is used to discretize the spatial derivatives of the Euler equations and an upwind scheme is applied to estimate the flux terms. The compute unified device architecture (CUDA) C programming model is employed to efficiently and flexibly port the meshless solver from CPU to GPU. Considering the data locality of randomly distributed points, space-filling curves are adopted to renumber the points in order to improve the memory performance. Detailed evaluations are first carried out to assess the accuracy and conservation property of the underlying numerical method. Then the GPU-accelerated flow solver is used to solve external steady flows over aerodynamic configurations. Representative results are validated through extensive comparisons with the experimental, finite volume or other available reference solutions. Performance analysis reveals that the running time of simulations is significantly reduced, with impressive (more than an order of magnitude) speedups achieved.
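
    A sketch of one common space-filling-curve renumbering, the Z-order (Morton) curve; the abstract does not say which curve the paper uses, so this is an assumed stand-in.

    ```python
    import numpy as np

    def morton2d(ix, iy, bits=16):
        """Interleave coordinate bits into a Z-order (Morton) key."""
        code = np.zeros(ix.shape, dtype=np.uint64)
        for b in range(bits):
            code |= ((ix >> np.uint64(b)) & np.uint64(1)) << np.uint64(2 * b)
            code |= ((iy >> np.uint64(b)) & np.uint64(1)) << np.uint64(2 * b + 1)
        return code

    # Renumber a random point cloud so that spatial neighbours become close in
    # memory, which is the locality effect exploited on the GPU.
    rng = np.random.default_rng(7)
    pts = rng.uniform(0.0, 1.0, size=(10_000, 2))
    cells = (pts * (1 << 16)).astype(np.uint64)      # quantize onto a 2^16 grid
    order = np.argsort(morton2d(cells[:, 0], cells[:, 1]))
    pts = pts[order]                                 # renumbered points
    ```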

  7. Development of a General Form CO2 and Brine Flux Input Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mansoor, K.; Sun, Y.; Carroll, S.

    2014-08-01

    The National Risk Assessment Partnership (NRAP) project is developing a science-based toolset for the quantitative analysis of the potential risks associated with changes in groundwater chemistry from CO2 injection. In order to address uncertainty probabilistically, NRAP is developing efficient, reduced-order models (ROMs) as part of its approach. These ROMs are built from detailed, physics-based process models to provide confidence in the predictions over a range of conditions. The ROMs are designed to reproduce accurately the predictions from the computationally intensive process models at a fraction of the computational time, thereby allowing the utilization of Monte Carlo methods to probe variability in key parameters. This report presents the procedures used to develop a generalized model for CO2 and brine leakage fluxes based on the output of a numerical wellbore simulation. The resulting generalized parameters and ranges reported here will be used for the development of third-generation groundwater ROMs.

  8. Image restoration for three-dimensional fluorescence microscopy using an orthonormal basis for efficient representation of depth-variant point-spread functions

    PubMed Central

    Patwary, Nurmohammed; Preza, Chrysanthe

    2015-01-01

    A depth-variant (DV) image restoration algorithm for wide field fluorescence microscopy, using an orthonormal basis decomposition of DV point-spread functions (PSFs), is investigated in this study. The efficient PSF representation is based on a previously developed principal component analysis (PCA), which is computationally intensive. We present an approach developed to reduce the number of DV PSFs required for the PCA computation, thereby making the PCA-based approach computationally tractable for thick samples. Restoration results from both synthetic and experimental images show consistency and demonstrate that the proposed algorithm efficiently addresses depth-induced aberration using a small number of principal components. Comparison of the PCA-based algorithm with a previously developed strata-based DV restoration algorithm demonstrates that the proposed method improves performance by 50% in terms of accuracy and simultaneously reduces the processing time by 64% using comparable computational resources. PMID:26504634

  9. On the study of control effectiveness and computational efficiency of reduced Saint-Venant model in model predictive control of open channel flow

    NASA Astrophysics Data System (ADS)

    Xu, M.; van Overloop, P. J.; van de Giesen, N. C.

    2011-02-01

    Model predictive control (MPC) of open channel flow is becoming an important tool in water management. The complexity of the prediction model has a large influence on the MPC application in terms of control effectiveness and computational efficiency. The Saint-Venant equations, called the SV model in this paper, and the Integrator Delay (ID) model are, respectively, accurate but computationally costly, and simple but restricted in the flow changes they allow. In this paper, a reduced Saint-Venant (RSV) model is developed through a model reduction technique, Proper Orthogonal Decomposition (POD), applied to the SV equations. The RSV model keeps the main flow dynamics and functions over a large flow range but is easier to implement in MPC. In the test case of a modeled canal reach, the numbers of states and disturbances in the RSV model are about 45 and 16 times smaller than in the SV model, respectively. The computational time of MPC with the RSV model is significantly reduced, while the controller remains effective. Thus, the RSV model is a promising means to balance control effectiveness and computational efficiency.

  10. Autonomic Closure for Turbulent Flows Using Approximate Bayesian Computation

    NASA Astrophysics Data System (ADS)

    Doronina, Olga; Christopher, Jason; Hamlington, Peter; Dahm, Werner

    2017-11-01

    Autonomic closure is a new technique for achieving fully adaptive and physically accurate closure of coarse-grained turbulent flow governing equations, such as those solved in large eddy simulations (LES). Although autonomic closure has been shown in recent a priori tests to more accurately represent unclosed terms than do dynamic versions of traditional LES models, the computational cost of the approach makes it challenging to implement for simulations of practical turbulent flows at realistically high Reynolds numbers. The optimization step used in the approach introduces large matrices that must be inverted and is highly memory intensive. In order to reduce memory requirements, here we propose to use approximate Bayesian computation (ABC) in place of the optimization step, thereby yielding a computationally-efficient implementation of autonomic closure that trades memory-intensive for processor-intensive computations. The latter challenge can be overcome as co-processors such as general purpose graphical processing units become increasingly available on current generation petascale and exascale supercomputers. In this work, we outline the formulation of ABC-enabled autonomic closure and present initial results demonstrating the accuracy and computational cost of the approach.
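
    The ABC ingredient itself is simple to illustrate in isolation. Below is a toy rejection-ABC sketch — a stand-in forward model and summary statistic, not the autonomic-closure formulation — showing how candidate parameters are kept only when their simulated statistics land within a tolerance of the observed ones:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(theta, n=500):
    """Toy forward model: Gaussian data with unknown mean theta."""
    return rng.normal(theta, 1.0, size=n)

# "Observed" data generated with a true parameter we then pretend not to know.
theta_true = 0.7
observed = simulate(theta_true)
s_obs = observed.mean()                  # summary statistic

# Rejection ABC: sample from the prior, keep draws whose simulated
# statistic lands within eps of the observed statistic.
eps, accepted = 0.05, []
for _ in range(20000):
    theta = rng.uniform(-2.0, 2.0)       # prior draw
    s_sim = simulate(theta).mean()
    if abs(s_sim - s_obs) < eps:
        accepted.append(theta)

post = np.array(accepted)
print(f"accepted {post.size} draws, posterior mean ~ {post.mean():.3f}")
```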

  11. Scientific Discovery through Advanced Computing (SciDAC-3) Partnership Project Annual Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hoffman, Forest M.; Bochev, Pavel B.; Cameron-Smith, Philip J.

    The Applying Computationally Efficient Schemes for BioGeochemical Cycles (ACES4BGC) project is advancing the predictive capabilities of Earth System Models (ESMs) by reducing two of the largest sources of uncertainty, aerosols and biospheric feedbacks, with a highly efficient computational approach. In particular, this project is implementing and optimizing new computationally efficient tracer advection algorithms for large numbers of tracer species; adding important biogeochemical interactions between the atmosphere, land, and ocean models; and applying uncertainty quantification (UQ) techniques to constrain process parameters and evaluate uncertainties in feedbacks between biogeochemical cycles and the climate system.

  12. Sparsity enabled cluster reduced-order models for control

    NASA Astrophysics Data System (ADS)

    Kaiser, Eurika; Morzyński, Marek; Daviller, Guillaume; Kutz, J. Nathan; Brunton, Bingni W.; Brunton, Steven L.

    2018-01-01

    Characterizing and controlling nonlinear, multi-scale phenomena are central goals in science and engineering. Cluster-based reduced-order modeling (CROM) was introduced to exploit the underlying low-dimensional dynamics of complex systems. CROM builds a data-driven discretization of the Perron-Frobenius operator, resulting in a probabilistic model for ensembles of trajectories. A key advantage of CROM is that it embeds nonlinear dynamics in a linear framework, which enables the application of standard linear techniques to the nonlinear system. CROM is typically computed on high-dimensional data; however, access to and computations on this full-state data limit the online implementation of CROM for prediction and control. Here, we address this key challenge by identifying a small subset of critical measurements to learn an efficient CROM, referred to as sparsity-enabled CROM. In particular, we leverage compressive measurements to faithfully embed the cluster geometry and preserve the probabilistic dynamics. Further, we show how to identify fewer optimized sensor locations tailored to a specific problem that outperform random measurements. Both of these sparsity-enabled sensing strategies significantly reduce the burden of data acquisition and processing for low-latency in-time estimation and control. We illustrate this unsupervised learning approach on three different high-dimensional nonlinear dynamical systems from fluids with increasing complexity, with one application in flow control. Sparsity-enabled CROM is a critical facilitator for real-time implementation on high-dimensional systems where full-state information may be inaccessible.
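
    The two core steps of CROM — clustering the snapshots, then estimating a transition matrix between clusters as a data-driven discretization of the Perron-Frobenius operator — can be sketched compactly. A minimal illustration with synthetic data and scikit-learn's KMeans (the sensor-selection machinery that makes the approach sparsity-enabled is omitted):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)

# Synthetic time-ordered snapshots (rows) of a high-dimensional state;
# stand-ins for the velocity fields in the fluid-flow applications.
T, n, K = 2000, 64, 10
snapshots = np.cumsum(rng.standard_normal((T, n)), axis=0)

# Step 1: partition state space into K clusters (coarse states).
labels = KMeans(n_clusters=K, n_init=10, random_state=0).fit_predict(snapshots)

# Step 2: count one-step transitions between cluster labels and normalize
# rows -> Markov transition matrix, a discrete Perron-Frobenius operator.
P = np.zeros((K, K))
for a, b in zip(labels[:-1], labels[1:]):
    P[a, b] += 1.0
P /= np.maximum(P.sum(axis=1, keepdims=True), 1.0)  # guard empty rows

# The model now propagates probability over clusters, not full states.
p = np.zeros(K)
p[labels[0]] = 1.0                       # start in the first cluster
for _ in range(50):
    p = p @ P
print("cluster occupation probabilities:", np.round(p, 3))
```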

  13. AN OVERVIEW OF REDUCED ORDER MODELING TECHNIQUES FOR SAFETY APPLICATIONS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mandelli, D.; Alfonsi, A.; Talbot, P.

    2016-10-01

    The RISMC project is developing new advanced simulation-based tools to perform Computational Risk Analysis (CRA) for the existing fleet of U.S. nuclear power plants (NPPs). These tools numerically model not only the thermal-hydraulic behavior of the reactors' primary and secondary systems, but also external event temporal evolution and component/system ageing. Thus, this is not only a multi-physics problem being addressed, but also a multi-scale problem (both spatial, µm-mm-m, and temporal, seconds-hours-years). As part of the RISMC CRA approach, a large amount of computationally-expensive simulation runs may be required. An important aspect is that even though computational power is growing, the overall computational cost of a RISMC analysis using brute-force methods may not be viable for certain cases. A solution being evaluated to address this computational issue is the use of reduced order modeling techniques. During FY2015, we investigated and applied reduced order modeling techniques to decrease the RISMC analysis computational cost by decreasing the number of simulation runs; for this we used surrogate models instead of the actual simulation codes. This article focuses on the use of reduced order modeling techniques that can be applied to RISMC analyses in order to generate, analyze, and visualize data. In particular, we focus on surrogate models that approximate the simulation results but in a much faster time (microseconds instead of hours/days).
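
    Surrogate models of the kind described here are typically regressors trained on a limited budget of expensive simulation runs and then sampled millions of times in a Monte Carlo loop. A generic sketch of that pattern, with a cheap stand-in function playing the role of the expensive simulation code and a Gaussian-process regressor as the surrogate (one of many possible choices):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(3)

def expensive_code(x):
    """Stand-in for an hours-long simulation run; two input parameters."""
    return np.sin(3 * x[:, 0]) * np.exp(-x[:, 1] ** 2)

# A small design of experiments: the only runs of the "real" code we pay for.
X_train = rng.uniform(-1, 1, size=(40, 2))
y_train = expensive_code(X_train)

surrogate = GaussianProcessRegressor(kernel=RBF(length_scale=0.5),
                                     normalize_y=True).fit(X_train, y_train)

# Monte Carlo on the surrogate: microseconds per sample instead of hours.
X_mc = rng.uniform(-1, 1, size=(100_000, 2))
y_mc = surrogate.predict(X_mc)
print(f"MC mean {y_mc.mean():.4f}, 95th percentile {np.percentile(y_mc, 95):.4f}")
```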

  14. Efficient Jacobian inversion for the control of simple robot manipulators

    NASA Technical Reports Server (NTRS)

    Fijany, Amir; Bejczy, Antal K.

    1988-01-01

    Symbolic inversion of the Jacobian matrix for spherical wrist arms is investigated. It is shown that, taking advantage of the simple geometry of these arms, the closed-form solution of the system Q = J^-1 X, representing a transformation from task space to joint space, can be obtained very efficiently. The solutions for PUMA, Stanford, and a six-revolute-joint coplanar arm, along with all singular points, are presented. The solution for each joint variable is found as an explicit function of the singular points, which provides better insight into the effect of different singular points on the motion and force exertion of each individual joint. For the above arms, the computational cost of the solution is of the same order as that of the forward kinematic solution, and it is significantly reduced if the forward kinematic solution has already been obtained. A comparison with previous methods shows that this method is the most efficient to date.

  15. Computationally efficient confidence intervals for cross-validated area under the ROC curve estimates.

    PubMed

    LeDell, Erin; Petersen, Maya; van der Laan, Mark

    In binary classification problems, the area under the ROC curve (AUC) is commonly used to evaluate the performance of a prediction model. Often, it is combined with cross-validation in order to assess how the results will generalize to an independent data set. In order to evaluate the quality of an estimate for cross-validated AUC, we obtain an estimate of its variance. For massive data sets, the process of generating a single performance estimate can be computationally expensive. Additionally, when using a complex prediction method, the process of cross-validating a predictive model on even a relatively small data set can still require a large amount of computation time. Thus, in many practical settings, the bootstrap is a computationally intractable approach to variance estimation. As an alternative to the bootstrap, we demonstrate a computationally efficient influence curve based approach to obtaining a variance estimate for cross-validated AUC.

  16. Computationally efficient confidence intervals for cross-validated area under the ROC curve estimates

    PubMed Central

    Petersen, Maya; van der Laan, Mark

    2015-01-01

    In binary classification problems, the area under the ROC curve (AUC) is commonly used to evaluate the performance of a prediction model. Often, it is combined with cross-validation in order to assess how the results will generalize to an independent data set. In order to evaluate the quality of an estimate for cross-validated AUC, we obtain an estimate of its variance. For massive data sets, the process of generating a single performance estimate can be computationally expensive. Additionally, when using a complex prediction method, the process of cross-validating a predictive model on even a relatively small data set can still require a large amount of computation time. Thus, in many practical settings, the bootstrap is a computationally intractable approach to variance estimation. As an alternative to the bootstrap, we demonstrate a computationally efficient influence curve based approach to obtaining a variance estimate for cross-validated AUC. PMID:26279737
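
    The influence-curve idea can be made concrete with DeLong-style structural components, a closely related influence-function construction for the (non-cross-validated) empirical AUC; the paper's estimator aggregates analogous per-observation contributions across cross-validation folds. A sketch under those caveats:

```python
import numpy as np
from scipy.stats import norm

def auc_with_ci(scores, labels, alpha=0.05):
    """Empirical AUC with a DeLong-style (influence-function) CI."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    m, n = len(pos), len(neg)
    # Pairwise comparisons: 1 if a positive outranks a negative, 0.5 on ties.
    comp = (pos[:, None] > neg[None, :]) + 0.5 * (pos[:, None] == neg[None, :])
    auc = comp.mean()
    v10 = comp.mean(axis=1)              # influence contribution per positive
    v01 = comp.mean(axis=0)              # influence contribution per negative
    var = v10.var(ddof=1) / m + v01.var(ddof=1) / n
    half = norm.ppf(1 - alpha / 2) * np.sqrt(var)
    return auc, (auc - half, auc + half)

rng = np.random.default_rng(4)
labels = rng.integers(0, 2, size=2000)
scores = labels + rng.normal(0, 1.5, size=2000)   # noisy classifier scores
auc, ci = auc_with_ci(scores, labels)
print(f"AUC {auc:.3f}, 95% CI ({ci[0]:.3f}, {ci[1]:.3f})")
```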

  17. Computational Benefits Using an Advanced Concatenation Scheme Based on Reduced Order Models for RF Structures

    NASA Astrophysics Data System (ADS)

    Heller, Johann; Flisgen, Thomas; van Rienen, Ursula

    The computation of electromagnetic fields and parameters derived thereof for lossless radio frequency (RF) structures filled with isotropic media is an important task for the design and operation of particle accelerators. Unfortunately, these computations are often highly demanding with regard to computational effort. The entire computational demand of the problem can be reduced using decomposition schemes in order to solve the field problems on standard workstations. This paper presents one of the first detailed comparisons between the recently proposed state-space concatenation approach (SSC) and a direct computation for an accelerator cavity with coupler-elements that break the rotational symmetry.

  18. AREVA Developments for an Efficient and Reliable use of Monte Carlo codes for Radiation Transport Applications

    NASA Astrophysics Data System (ADS)

    Chapoutier, Nicolas; Mollier, François; Nolin, Guillaume; Culioli, Matthieu; Mace, Jean-Reynald

    2017-09-01

    In the context of the rise of Monte Carlo transport calculations for all kinds of applications, AREVA recently improved its suite of engineering tools in order to produce an efficient Monte Carlo workflow. Monte Carlo codes, such as MCNP or TRIPOLI, are recognized as reference codes for dealing with a large range of radiation transport problems. However, the inherent drawbacks of these codes - laborious input file creation and long computation times - contrast with the maturity of their treatment of the physical phenomena. The goal of the recent AREVA developments was to reach an efficiency similar to that of other mature engineering disciplines such as finite element analysis (e.g. structural or fluid dynamics). Among the main objectives, the creation of a graphical user interface offering CAD tools for geometry creation and other graphical features dedicated to the radiation field (source definition, tally definition) has been reached. Computation times are drastically reduced compared to a few years ago thanks to the use of massive parallel runs and, above all, the implementation of hybrid variance reduction techniques. Engineering teams are now able to deliver much more prompt support to any nuclear project dealing with reactors or fuel cycle facilities, from the conceptual phase to decommissioning.

  19. Non-Evolutionary Algorithms for Scheduling Dependent Tasks in Distributed Heterogeneous Computing Environments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wayne F. Boyer; Gurdeep S. Hura

    2005-09-01

    The problem of obtaining an optimal matching and scheduling of interdependent tasks in distributed heterogeneous computing (DHC) environments is well known to be NP-hard. In a DHC system, task execution time depends on the machine to which a task is assigned, and task precedence constraints are represented by a directed acyclic graph. Recent research in evolutionary techniques has shown that genetic algorithms usually obtain more efficient schedules than other known algorithms. We propose a non-evolutionary random scheduling (RS) algorithm for efficient matching and scheduling of inter-dependent tasks in a DHC system. RS is a succession of randomized task orderings and a heuristic mapping from task order to schedule. Randomized task ordering is effectively a topological sort where the outcome may be any possible task order for which the task precedence constraints are maintained. A detailed comparison to existing evolutionary techniques (GA and PSGA) shows the proposed algorithm is less complex than evolutionary techniques, computes schedules in less time, and requires less memory and fewer tuning parameters. Simulation results show that the average schedules produced by RS are approximately as efficient as PSGA schedules for all cases studied and clearly more efficient than PSGA for certain cases. The standard formulation for the scheduling problem addressed in this paper is Rm|prec|Cmax.

  20. Massively parallel algorithms for real-time wavefront control of a dense adaptive optics system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fijany, A.; Milman, M.; Redding, D.

    1994-12-31

    In this paper massively parallel algorithms and architectures for real-time wavefront control of a dense adaptive optics system (SELENE) are presented. The authors have already shown that the computation of a near optimal control algorithm for SELENE can be reduced to the solution of a discrete Poisson equation on a regular domain. Although this represents an optimal computation, due to the large size of the system and the high sampling rate requirement, the implementation of this control algorithm poses a computationally challenging problem, since it demands a sustained computational throughput of the order of 10 GFlops. They develop a novel algorithm, designated the Fast Invariant Imbedding algorithm, which offers a massive degree of parallelism with simple communication and synchronization requirements. Due to these features, this algorithm is significantly more efficient than other Fast Poisson Solvers for implementation on massively parallel architectures. The authors also discuss two massively parallel, algorithmically specialized architectures for low-cost and optimal implementation of the Fast Invariant Imbedding algorithm.

  1. A robust computational technique for model order reduction of two-time-scale discrete systems via genetic algorithms.

    PubMed

    Alsmadi, Othman M K; Abo-Hammour, Zaer S

    2015-01-01

    A robust computational technique for model order reduction (MOR) of multi-time-scale discrete systems (single-input single-output (SISO) and multi-input multi-output (MIMO)) is presented in this paper. This work is motivated by the singular perturbation of multi-time-scale systems, where some specific dynamics may not have significant influence on the overall system behavior. The new approach is proposed using genetic algorithms (GA), with the advantages of obtaining a reduced order model, maintaining the exact dominant dynamics in the reduced order model, and minimizing the steady state error. The reduction process is performed by obtaining an upper triangular transformed matrix of the system state matrix defined in state space representation, along with the elements of the B, C, and D matrices. The GA computational procedure is based on maximizing a fitness function corresponding to the response deviation between the full and reduced order models. The proposed computational intelligence MOR method is compared to recently published work on MOR techniques, where simulation results show the potential and advantages of the new approach.
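
    The general flavor of GA-based MOR can be conveyed with a toy example: encode a candidate reduced model as a real-valued genome and evolve a population to minimize the step-response deviation from the full model. This is a generic sketch, not the authors' upper-triangular transformation procedure; the system matrices below are arbitrary stand-ins:

```python
import numpy as np

rng = np.random.default_rng(5)

# Full-order stable SISO system x' = A x + b u, y = c x (order n = 6).
n, r = 6, 2
A = np.diag(-np.linspace(0.5, 6.0, n)) + 0.1 * rng.standard_normal((n, n))
b = rng.standard_normal(n)
c = rng.standard_normal(n)

dt, steps = 0.01, 800

def step_response(Am, bm, cm):
    x, y = np.zeros(len(bm)), np.empty(steps)
    for k in range(steps):               # explicit Euler, unit step input
        x = x + dt * (Am @ x + bm)
        y[k] = cm @ x
    return y

y_full = step_response(A, b, c)

def decode(g):
    """Genome -> (Ar, br, cr) of an order-r candidate model."""
    Ar = g[:r * r].reshape(r, r) - 2.0 * np.eye(r)  # bias toward stability
    return Ar, g[r * r:r * r + r], g[r * r + r:]

def fitness(g):
    y = step_response(*decode(g))
    if not np.all(np.isfinite(y)):
        return -1e12                     # penalize unstable candidates
    return -np.mean((y - y_full) ** 2)   # response-deviation fitness

dim, pop_size = r * r + 2 * r, 40
pop = rng.standard_normal((pop_size, dim))
for gen in range(150):
    fit = np.array([fitness(g) for g in pop])
    children = [pop[np.argmax(fit)].copy()]          # elitism: keep the best
    while len(children) < pop_size:
        i, j = rng.integers(pop_size, size=2)
        parent = pop[i] if fit[i] > fit[j] else pop[j]        # tournament
        children.append(parent + 0.1 * rng.standard_normal(dim))  # mutation
    pop = np.array(children)

print("best step-response MSE:", -max(fitness(g) for g in pop))
```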

  2. Adaptive reduction of constitutive model-form error using a posteriori error estimation techniques

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bishop, Joseph E.; Brown, Judith Alice

    In engineering practice, models are typically kept as simple as possible for ease of setup and use, computational efficiency, maintenance, and overall reduced complexity to achieve robustness. In solid mechanics, a simple and efficient constitutive model may be favored over one that is more predictive, but is difficult to parameterize, is computationally expensive, or is simply not available within a simulation tool. In order to quantify the modeling error due to the choice of a relatively simple and less predictive constitutive model, we adopt the use of a posteriori model-form error-estimation techniques. Based on local error indicators in the energy norm, an algorithm is developed for reducing the modeling error by spatially adapting the material parameters in the simpler constitutive model. The resulting material parameters are not material properties per se, but depend on the given boundary-value problem. As a first step to the more general nonlinear case, we focus here on linear elasticity in which the “complex” constitutive model is general anisotropic elasticity and the chosen simpler model is isotropic elasticity. As a result, the algorithm for adaptive error reduction is demonstrated using two examples: (1) a transversely-isotropic plate with a hole subjected to tension, and (2) a transversely-isotropic tube with two side holes subjected to torsion.

  3. Large Eddy Simulations of Colorless Distributed Combustion Systems

    NASA Astrophysics Data System (ADS)

    Abdulrahman, Husam F.; Jaberi, Farhad; Gupta, Ashwani

    2014-11-01

    Development of efficient and low-emission colorless distributed combustion (CDC) systems for gas turbine applications requires careful examination of the role of various flow and combustion parameters. Numerical simulations of CDC in a laboratory-scale combustor have been conducted to carefully examine the effects of these parameters on the CDC. The computational model is based on a hybrid modeling approach combining large eddy simulation (LES) with the filtered mass density function (FMDF) equations, solved with high order numerical methods and complex chemical kinetics. The simulated combustor operates on the principle of high temperature air combustion (HiTAC) and has been shown to significantly reduce NOx and CO emissions while improving the reaction pattern factor and stability, without using any flame stabilizer and with low pressure drop and noise. The focus of the current work is to investigate the mixing of air and hydrocarbon fuels and the non-premixed and premixed reactions within the combustor via the LES/FMDF with reduced chemical kinetic mechanisms, for the same flow conditions and configurations investigated experimentally. The main goal is to develop better CDC with higher mixing and efficiency, ultra-low emission levels, and optimum residence time. The computational results establish the consistency and reliability of LES/FMDF and its Lagrangian-Eulerian numerical methodology.

  4. Adaptive reduction of constitutive model-form error using a posteriori error estimation techniques

    DOE PAGES

    Bishop, Joseph E.; Brown, Judith Alice

    2018-06-15

    In engineering practice, models are typically kept as simple as possible for ease of setup and use, computational efficiency, maintenance, and overall reduced complexity to achieve robustness. In solid mechanics, a simple and efficient constitutive model may be favored over one that is more predictive, but is difficult to parameterize, is computationally expensive, or is simply not available within a simulation tool. In order to quantify the modeling error due to the choice of a relatively simple and less predictive constitutive model, we adopt the use of a posteriori model-form error-estimation techniques. Based on local error indicators in the energy norm, an algorithm is developed for reducing the modeling error by spatially adapting the material parameters in the simpler constitutive model. The resulting material parameters are not material properties per se, but depend on the given boundary-value problem. As a first step to the more general nonlinear case, we focus here on linear elasticity in which the “complex” constitutive model is general anisotropic elasticity and the chosen simpler model is isotropic elasticity. As a result, the algorithm for adaptive error reduction is demonstrated using two examples: (1) a transversely-isotropic plate with a hole subjected to tension, and (2) a transversely-isotropic tube with two side holes subjected to torsion.

  5. Novel metaheuristic for parameter estimation in nonlinear dynamic biological systems

    PubMed Central

    Rodriguez-Fernandez, Maria; Egea, Jose A; Banga, Julio R

    2006-01-01

    Background We consider the problem of parameter estimation (model calibration) in nonlinear dynamic models of biological systems. Due to the frequent ill-conditioning and multi-modality of many of these problems, traditional local methods usually fail (unless initialized with very good guesses of the parameter vector). In order to surmount these difficulties, global optimization (GO) methods have been suggested as robust alternatives. Currently, deterministic GO methods cannot solve problems of realistic size within this class in reasonable computation times. In contrast, certain types of stochastic GO methods have shown promising results, although the computational cost remains large. Rodriguez-Fernandez and coworkers have presented hybrid stochastic-deterministic GO methods which could reduce computation time by one order of magnitude while guaranteeing robustness. Our goal here was to further reduce the computational effort without losing robustness. Results We have developed a new procedure based on the scatter search methodology for nonlinear optimization of dynamic models of arbitrary (or even unknown) structure (i.e. black-box models). In this contribution, we describe and apply this novel metaheuristic, inspired by recent developments in the field of operations research, to a set of complex identification problems and we make a critical comparison with respect to the previous (above mentioned) successful methods. Conclusion Robust and efficient methods for parameter estimation are of key importance in systems biology and related areas. The new metaheuristic presented in this paper aims to ensure the proper solution of these problems by adopting a global optimization approach, while keeping the computational effort under reasonable values. This new metaheuristic was applied to a set of three challenging parameter estimation problems of nonlinear dynamic biological systems, outperforming very significantly all the methods previously used for these benchmark problems. PMID:17081289

  6. Novel metaheuristic for parameter estimation in nonlinear dynamic biological systems.

    PubMed

    Rodriguez-Fernandez, Maria; Egea, Jose A; Banga, Julio R

    2006-11-02

    We consider the problem of parameter estimation (model calibration) in nonlinear dynamic models of biological systems. Due to the frequent ill-conditioning and multi-modality of many of these problems, traditional local methods usually fail (unless initialized with very good guesses of the parameter vector). In order to surmount these difficulties, global optimization (GO) methods have been suggested as robust alternatives. Currently, deterministic GO methods cannot solve problems of realistic size within this class in reasonable computation times. In contrast, certain types of stochastic GO methods have shown promising results, although the computational cost remains large. Rodriguez-Fernandez and coworkers have presented hybrid stochastic-deterministic GO methods which could reduce computation time by one order of magnitude while guaranteeing robustness. Our goal here was to further reduce the computational effort without losing robustness. We have developed a new procedure based on the scatter search methodology for nonlinear optimization of dynamic models of arbitrary (or even unknown) structure (i.e. black-box models). In this contribution, we describe and apply this novel metaheuristic, inspired by recent developments in the field of operations research, to a set of complex identification problems and we make a critical comparison with respect to the previous (above mentioned) successful methods. Robust and efficient methods for parameter estimation are of key importance in systems biology and related areas. The new metaheuristic presented in this paper aims to ensure the proper solution of these problems by adopting a global optimization approach, while keeping the computational effort under reasonable values. This new metaheuristic was applied to a set of three challenging parameter estimation problems of nonlinear dynamic biological systems, outperforming very significantly all the methods previously used for these benchmark problems.
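
    The flavor of such global calibration can be illustrated with a stock stochastic GO method, SciPy's differential evolution, standing in for the scatter-search metaheuristic. A toy problem with a logistic growth model (hypothetical data and parameters) fitted from noisy observations:

```python
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(6)

# Logistic growth x(t) = K / (1 + (K/x0 - 1) exp(-r t)): a tiny stand-in
# for a nonlinear dynamic biological model with unknown (r, K).
t = np.linspace(0, 10, 60)
x0, r_true, K_true = 0.1, 1.2, 3.0
data = K_true / (1 + (K_true / x0 - 1) * np.exp(-r_true * t))
data += rng.normal(0, 0.05, size=t.size)  # measurement noise

def cost(theta):
    r, K = theta
    model = K / (1 + (K / x0 - 1) * np.exp(-r * t))
    return np.sum((model - data) ** 2)    # least-squares calibration

# Global search over box bounds; no initial guess required.
result = differential_evolution(cost, bounds=[(0.01, 5.0), (0.5, 10.0)],
                                seed=0, tol=1e-8)
print("estimated (r, K):", np.round(result.x, 3))
```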

  7. Computationally efficient simulation of unsteady aerodynamics using POD on the fly

    NASA Astrophysics Data System (ADS)

    Moreno-Ramos, Ruben; Vega, José M.; Varas, Fernando

    2016-12-01

    Modern industrial aircraft design requires a large amount of sufficiently accurate aerodynamic and aeroelastic simulations. Current computational fluid dynamics (CFD) solvers with aeroelastic capabilities, such as the NASA URANS unstructured solver FUN3D, require very large computational resources. Since a very large amount of simulation is necessary, the CFD cost is just unaffordable in an industrial production environment and must be significantly reduced. Thus, a more inexpensive, yet sufficiently precise solver is strongly needed. An opportunity to approach this goal could follow some recent results (Terragni and Vega 2014 SIAM J. Appl. Dyn. Syst. 13 330-65 Rapun et al 2015 Int. J. Numer. Meth. Eng. 104 844-68) on an adaptive reduced order model that combines ‘on the fly’ a standard numerical solver (to compute some representative snapshots), proper orthogonal decomposition (POD) (to extract modes from the snapshots), Galerkin projection (onto the set of POD modes), and several additional ingredients such as projecting the equations using a limited amount of points and fairly generic mode libraries. When applied to the complex Ginzburg-Landau equation, the method produces acceleration factors (comparing with standard numerical solvers) of the order of 20 and 300 in one and two space dimensions, respectively. Unfortunately, the extension of the method to unsteady, compressible flows around deformable geometries requires new approaches to deal with deformable meshes, high-Reynolds numbers, and compressibility. A first step in this direction is presented considering the unsteady compressible, two-dimensional flow around an oscillating airfoil using a CFD solver in a rigidly moving mesh. POD on the Fly gives results whose accuracy is comparable to that of the CFD solver used to compute the snapshots.

  8. Probabilistic methods for rotordynamics analysis

    NASA Technical Reports Server (NTRS)

    Wu, Y.-T.; Torng, T. Y.; Millwater, H. R.; Fossum, A. F.; Rheinfurth, M. H.

    1991-01-01

    This paper summarizes the development of the methods and a computer program to compute the probability of instability of dynamic systems that can be represented by a system of second-order ordinary linear differential equations. Two instability criteria based upon the eigenvalues or Routh-Hurwitz test functions are investigated. Computational methods based on a fast probability integration concept and an efficient adaptive importance sampling method are proposed to perform efficient probabilistic analysis. A numerical example is provided to demonstrate the methods.
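
    The eigenvalue-based instability criterion invites a brute-force Monte Carlo baseline: sample the uncertain parameters, assemble the first-order system matrix, and count the samples with an eigenvalue in the right half-plane. The fast probability integration and adaptive importance sampling in the paper are refinements of exactly this estimate. A sketch with hypothetical distributions and a small gyroscopically coupled system:

```python
import numpy as np

rng = np.random.default_rng(7)

# Second-order system M q'' + C q' + K q = 0 recast in first-order form;
# damping and stiffness values are treated as random (hypothetical numbers).
def first_order_matrix(c, k):
    M = np.eye(2)
    C = np.array([[c, -0.3], [0.3, c]])   # gyroscopic-like coupling
    K = np.array([[k, 0.0], [0.0, k]])
    Minv = np.linalg.inv(M)
    return np.block([[np.zeros((2, 2)), np.eye(2)],
                     [-Minv @ K, -Minv @ C]])

samples, unstable = 20_000, 0
for _ in range(samples):
    c = rng.normal(0.05, 0.03)            # uncertain damping
    k = rng.normal(1.0, 0.1)              # uncertain stiffness
    eig = np.linalg.eigvals(first_order_matrix(c, k))
    if np.any(eig.real > 0.0):            # eigenvalue instability criterion
        unstable += 1

print(f"P(instability) ~ {unstable / samples:.4f}")
```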

  9. Electroneutral models for dynamic Poisson-Nernst-Planck systems

    NASA Astrophysics Data System (ADS)

    Song, Zilong; Cao, Xiulei; Huang, Huaxiong

    2018-01-01

    The Poisson-Nernst-Planck (PNP) system is a standard model for describing ion transport. In many applications, e.g., ions in biological tissues, the presence of thin boundary layers poses both modeling and computational challenges. In this paper, we derive simplified electroneutral (EN) models where the thin boundary layers are replaced by effective boundary conditions. There are two major advantages of EN models. First, it is much cheaper to solve them numerically. Second, EN models are easier to deal with compared to the original PNP system; therefore, it would also be easier to derive macroscopic models for cellular structures using EN models. Even though the approach used here is applicable to higher-dimensional cases, this paper mainly focuses on the one-dimensional system, including the general multi-ion case. Using systematic asymptotic analysis, we derive a variety of effective boundary conditions directly applicable to the EN system for the bulk region. This EN system can be solved directly and efficiently without computing the solution in the boundary layer. The derivation is based on matched asymptotics, and the key idea is to bring back higher-order contributions into the effective boundary conditions. For Dirichlet boundary conditions, the higher-order terms can be neglected and the classical results (continuity of electrochemical potential) are recovered. For flux boundary conditions, higher-order terms account for the accumulation of ions in the boundary layer, and neglecting them leads to physically incorrect solutions. To validate the EN model, numerical computations are carried out for several examples. Our results show that solving the EN model is much more efficient than solving the original PNP system. When implemented with the Hodgkin-Huxley model, the EN model significantly reduces computational time without sacrificing solution accuracy, since it allows for relatively large mesh and time-step sizes.

  10. Software Framework for Advanced Power Plant Simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    John Widmann; Sorin Munteanu; Aseem Jain

    2010-08-01

    This report summarizes the work accomplished during the Phase II development effort of the Advanced Process Engineering Co-Simulator (APECS). The objective of the project is to develop the tools to efficiently combine high-fidelity computational fluid dynamics (CFD) models with process modeling software. During the course of the project, a robust integration controller was developed that can be used in any CAPE-OPEN compliant process modeling environment. The controller mediates the exchange of information between the process modeling software and the CFD software. Several approaches to reducing the time disparity between CFD simulations and process modeling have been investigated and implemented. These include enabling the CFD models to be run on a remote cluster and enabling multiple CFD models to be run simultaneously. Furthermore, computationally fast reduced-order models (ROMs) have been developed that can be 'trained' using the results from CFD simulations and then used directly within flowsheets. Unit operation models (both CFD and ROMs) can be uploaded to a model database and shared between multiple users.

  11. An LFMCW detector with new structure and FRFT based differential distance estimation method.

    PubMed

    Yue, Kai; Hao, Xinhong; Li, Ping

    2016-01-01

    This paper describes a linear frequency modulated continuous wave (LFMCW) detector which is designed for a collision avoidance radar. This detector can estimate the distance between the detector and pedestrians or vehicles, thereby helping to reduce the likelihood of traffic accidents. The detector consists of a transceiver and a signal processor. A novel structure based on the intermediate frequency signal (IFS) is designed for the transceiver, which differs from the traditional LFMCW transceiver using the beat frequency signal (BFS) based structure. In the signal processor, a novel fractional Fourier transform (FRFT) based differential distance estimation (DDE) method is used to detect the distance. The new IFS based structure helps the FRFT based DDE method reduce computational complexity, because it does not require scanning for the optimal FRFT order. Low computational complexity ensures the feasibility of practical applications. Simulations are carried out and the results demonstrate the efficiency of the detector designed in this paper.

  12. Identification of Reduced-Order Thermal Therapy Models Using Thermal MR Images: Theory and Validation

    PubMed Central

    2013-01-01

    In this paper, we develop and validate a method to identify computationally efficient site- and patient-specific models of ultrasound thermal therapies from MR thermal images. The models of the specific absorption rate of the transduced energy and the temperature response of the therapy target are identified in the reduced basis of proper orthogonal decomposition of thermal images, acquired in response to a mild thermal test excitation. The method permits dynamic reidentification of the treatment models during the therapy by recursively utilizing newly acquired images. Such adaptation is particularly important during high-temperature therapies, which are known to substantially and rapidly change tissue properties and blood perfusion. The developed theory was validated for the case of focused ultrasound heating of a tissue phantom. The experimental and computational results indicate that the developed approach produces accurate low-dimensional treatment models despite temporal and spatial noises in MR images and slow image acquisition rate. PMID:22531754

  13. Multiscale Methods for Accurate, Efficient, and Scale-Aware Models of the Earth System

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goldhaber, Steve; Holland, Marika

    The major goal of this project was to contribute improvements to the infrastructure of an Earth System Model in order to support research in the Multiscale Methods for Accurate, Efficient, and Scale-Aware Models of the Earth System project. In support of this, the NCAR team accomplished two main tasks: improving input/output performance of the model and improving atmospheric model simulation quality. Improvement of the performance and scalability of data input and diagnostic output within the model required a new infrastructure which can efficiently handle the unstructured grids common in multiscale simulations. This allows for a more computationally efficient model, enabling more years of Earth System simulation. The quality of the model simulations was improved by reducing grid-point noise in the spectral element version of the Community Atmosphere Model (CAM-SE). This was achieved by running the physics of the model using grid-cell data on a finite-volume grid.

  14. Center for Efficient Exascale Discretizations Software Suite

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kolev, Tzanio; Dobrev, Veselin; Tomov, Vladimir

    The CEED software suite is a collection of generally applicable software tools focusing on the following computational motifs: PDE discretizations on unstructured meshes, high-order finite element and spectral element methods, and unstructured adaptive mesh refinement. All of this software is being developed as part of CEED, a co-design Center for Efficient Exascale Discretizations, within DOE's Exascale Computing Project (ECP).

  15. An efficient dynamic load balancing algorithm

    NASA Astrophysics Data System (ADS)

    Lagaros, Nikos D.

    2014-01-01

    In engineering problems, randomness and uncertainties are inherent. Robust design procedures, formulated in the framework of multi-objective optimization, have been proposed in order to take these sources of randomness and uncertainty into account. Such design procedures require orders of magnitude more computational effort than conventional analysis or optimum design processes, since a very large number of finite element analyses must be performed. It is therefore imperative to exploit the capabilities of computing resources in order to deal with this kind of problem. In particular, parallel computing can be implemented at the level of metaheuristic optimization, by exploiting the physical parallelization feature of the nondominated sorting evolution strategies method, as well as at the level of the repeated structural analyses required for assessing the behavioural constraints and for calculating the objective functions. In this study an efficient dynamic load balancing algorithm for optimum exploitation of available computing resources is proposed and, without loss of generality, is applied to computing the desired Pareto front. In such problems the computation of the complete Pareto front with feasible designs only constitutes a very challenging task. The proposed algorithm achieves linear speedup factors and almost 100% speedup factor values with reference to the sequential procedure.
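
    The essence of dynamic load balancing — let each idle worker pull the next pending analysis rather than preassigning fixed chunks — is easy to demonstrate with a process pool. A minimal sketch in which the finite element analysis is replaced by a stand-in function with uneven run times:

```python
import time
from concurrent.futures import ProcessPoolExecutor, as_completed
import numpy as np

def analysis(design_id: int) -> float:
    """Stand-in for one finite element analysis; run time varies by design."""
    rng = np.random.default_rng(design_id)
    time.sleep(rng.uniform(0.01, 0.2))    # uneven cost per task
    return float(rng.normal())            # e.g. an objective-function value

if __name__ == "__main__":
    designs = range(100)
    t0 = time.perf_counter()
    results = {}
    # Dynamic balancing: workers pull tasks as they finish, so cheap tasks
    # never sit idle behind expensive ones as with static chunking.
    with ProcessPoolExecutor(max_workers=4) as pool:
        futures = {pool.submit(analysis, d): d for d in designs}
        for fut in as_completed(futures):
            results[futures[fut]] = fut.result()
    print(f"{len(results)} analyses in {time.perf_counter() - t0:.2f}s")
```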

  16. Three-dimensional inverse modelling of damped elastic wave propagation in the Fourier domain

    NASA Astrophysics Data System (ADS)

    Petrov, Petr V.; Newman, Gregory A.

    2014-09-01

    3-D full waveform inversion (FWI) of seismic wavefields is routinely implemented with explicit time-stepping simulators. A clear advantage of explicit time stepping is the avoidance of solving the large-scale implicit linear systems that arise with frequency domain formulations. However, FWI using explicit time stepping may require a very fine time step and (as a consequence) significant computational resources and run times. If the computational challenges of wavefield simulation can be effectively handled, an FWI scheme implemented within the frequency domain utilizing only a few frequencies offers a cost-effective alternative to FWI in the time domain. We have therefore implemented a 3-D FWI scheme for elastic wave propagation in the Fourier domain. To overcome the computational bottleneck in wavefield simulation, we have exploited an efficient Krylov iterative solver for the elastic wave equations approximated with second and fourth order finite differences. The solver does not exploit multilevel preconditioning for wavefield simulation, but is coupled efficiently to the inversion iteration workflow to reduce computational cost. The workflow is best described as a series of sequential inversion experiments, where in the case of seismic reflection acquisition geometries, the data have been laddered such that we first image highly damped data, followed by data where damping is systematically reduced. The key to our modelling approach is its ability to take advantage of solver efficiency when the elastic wavefields are damped. As the inversion experiment progresses, damping is significantly reduced, effectively simulating non-damped wavefields in the Fourier domain. While the cost of the forward simulation increases as damping is reduced, this is counterbalanced by the cost of the outer inversion iteration, which is reduced because of the better starting model obtained from the more heavily damped wavefield used in the previous inversion experiment. For cross-well data, it is also possible to launch a successful inversion experiment without laddering the damping constants. With this type of acquisition geometry, the solver is still quite effective using a small fixed damping constant. To avoid cycle skipping, we also employ a multiscale imaging approach, in which the frequency content of the data is also laddered (with the data now including both reflection and cross-well acquisition geometries). Thus the inversion process is launched using low frequency data to first recover the long spatial wavelengths of the image. With this image as a new starting model, adding higher frequency data refines and enhances the resolution of the image. FWI using laddered frequencies with an efficient damping scheme enables reconstructing elastic attributes of the subsurface at a resolution that approaches half the smallest wavelength utilized to image the subsurface. We show the possibility of effectively carrying out such reconstructions using two to six frequencies, depending upon the application. Using the proposed FWI scheme, massively parallel computing resources are essential for reasonable execution times.

  17. An accurate and efficient reliability-based design optimization using the second order reliability method and improved stability transformation method

    NASA Astrophysics Data System (ADS)

    Meng, Zeng; Yang, Dixiong; Zhou, Huanlin; Yu, Bo

    2018-05-01

    The first order reliability method has been extensively adopted for reliability-based design optimization (RBDO), but it shows inaccuracy in calculating the failure probability for highly nonlinear performance functions. Thus, the second order reliability method is required to evaluate the reliability accurately. However, its application to RBDO is quite challenging owing to the expensive computational cost incurred by the repeated reliability evaluations and Hessian calculations of the probabilistic constraints. In this article, a new improved stability transformation method is proposed to search for the most probable point efficiently, and the Hessian matrix is calculated by the symmetric rank-one update. The computational capability of the proposed method is illustrated and compared to existing RBDO approaches through three mathematical and two engineering examples. The comparison results indicate that the proposed method is very efficient and accurate, providing an alternative tool for RBDO of engineering structures.
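
    The symmetric rank-one (SR1) update mentioned above avoids explicit Hessian evaluation by applying a secant-style correction built from successive gradient differences. A minimal sketch of the update and a check on a quadratic, where repeated updates should recover the exact Hessian:

```python
import numpy as np

def sr1_update(B, s, y, eps=1e-8):
    """Symmetric rank-one Hessian update.

    B: current Hessian approximation; s = x_new - x_old;
    y = grad_new - grad_old. Skips the update when the standard
    denominator safeguard fails, to avoid numerical blow-up.
    """
    r = y - B @ s
    denom = r @ s
    if abs(denom) < eps * np.linalg.norm(r) * np.linalg.norm(s):
        return B                          # safeguard: skip this update
    return B + np.outer(r, r) / denom

# Quick check on a quadratic f(x) = 0.5 x^T H x, where grad = H x,
# so y = H s exactly and repeated updates should recover H.
H = np.array([[4.0, 1.0], [1.0, 3.0]])
B = np.eye(2)
rng = np.random.default_rng(8)
for _ in range(20):
    s = rng.standard_normal(2)
    B = sr1_update(B, s, H @ s)
print("recovered Hessian:\n", np.round(B, 6))
```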

  18. Air injection test on a Kaplan turbine: prototype - model comparison

    NASA Astrophysics Data System (ADS)

    Angulo, M.; Rivetti, A.; Díaz, L.; Liscia, S.

    2016-11-01

    Air injection is a well-known means of reducing pressure pulsation magnitudes in turbines, especially of the Francis type. In the case of large Kaplan designs, even though less common, it could be a solution for mitigating the vibrations that arise when the tip vortex cavitation phenomenon becomes erosive and induces structural vibrations. In order to study this alternative, aeration tests were performed on a Kaplan turbine at model and prototype scales. The research focused on the efficiency of different injected air flow rates in reducing vibrations, especially at the draft tube and the discharge ring, and also on the magnitude of the efficiency drop. It was found that the results at both scales present the same trend, in particular for vibration levels at the discharge ring. The efficiency drop was overestimated in the model tests, while on the prototype it was less than 0.2% for all power outputs. On the prototype, air has a beneficial effect in reducing pressure fluctuations up to an air flow rate of 0.2 ‰. On the model, high speed image processing helped to quantify the volume of tip vortex cavitation, which is strongly correlated with the vibration level. The hydrophone measurements did not capture the cavitation intensity when air was injected; on the prototype, however, it was detected by a sonometer installed at the draft tube access gallery.

  19. Beyond mean-field approximations for accurate and computationally efficient models of on-lattice chemical kinetics

    NASA Astrophysics Data System (ADS)

    Pineda, M.; Stamatakis, M.

    2017-07-01

    Modeling the kinetics of surface catalyzed reactions is essential for the design of reactors and chemical processes. The majority of microkinetic models employ mean-field approximations, which lead to an approximate description of catalytic kinetics by assuming spatially uncorrelated adsorbates. On the other hand, kinetic Monte Carlo (KMC) methods provide a discrete-space continuous-time stochastic formulation that enables an accurate treatment of spatial correlations in the adlayer, but at significant computational cost. In this work, we use the so-called cluster mean-field approach to develop higher order approximations that systematically increase the accuracy of kinetic models by treating spatial correlations at a progressively higher level of detail. We further demonstrate our approach on a reduced model for NO oxidation incorporating first nearest-neighbor lateral interactions and construct a sequence of approximations of increasing accuracy, which we compare with KMC and mean-field. The latter is found to perform rather poorly, overestimating the turnover frequency by several orders of magnitude for this system. On the other hand, our approximations, while more computationally intense than the traditional mean-field treatment, still achieve tremendous computational savings compared to KMC simulations, thereby opening the way for employing them in multiscale modeling frameworks.
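
    For contrast with the cluster expansions discussed above, the KMC baseline can be illustrated with a rejection-free (Gillespie-type) loop on a toy adsorption-desorption lattice model, for which mean-field happens to be exact. The rates and lattice are illustrative only, not the NO oxidation model of the paper:

```python
import numpy as np

rng = np.random.default_rng(9)

# 1-D lattice, sites empty (0) or occupied (1); adsorption/desorption only.
L, k_ads, k_des = 100, 1.0, 0.5
lattice = np.zeros(L, dtype=int)

t, t_end, coverage_sum, n_obs = 0.0, 200.0, 0.0, 0
while t < t_end:
    empty = np.flatnonzero(lattice == 0)
    occ = np.flatnonzero(lattice == 1)
    rates = np.array([k_ads * empty.size, k_des * occ.size])
    total = rates.sum()
    t += rng.exponential(1.0 / total)     # Gillespie: exponential waiting time
    if rng.random() < rates[0] / total:   # pick event class by its rate
        lattice[rng.choice(empty)] = 1    # adsorption on a random empty site
    else:
        lattice[rng.choice(occ)] = 0      # desorption from an occupied site
    coverage_sum += lattice.mean()
    n_obs += 1

# For this non-interacting model, mean-field is exact:
# steady-state coverage = k_ads / (k_ads + k_des) = 2/3.
print(f"KMC event-averaged coverage {coverage_sum / n_obs:.3f} (mean-field 2/3)")
```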

  20. Efficient Kriging Algorithms

    NASA Technical Reports Server (NTRS)

    Memarsadeghi, Nargess

    2011-01-01

    More efficient versions of an interpolation method, called kriging, have been introduced in order to reduce its traditionally high computational cost. Written in C++, these approaches were tested on both synthetic and real data. Kriging is a best unbiased linear estimator and suitable for interpolation of scattered data points. Kriging has long been used in the geostatistic and mining communities, but is now being researched for use in the image fusion of remotely sensed data. This allows a combination of data from various locations to be used to fill in any missing data from any single location. To arrive at the faster algorithms, sparse SYMMLQ iterative solver, covariance tapering, Fast Multipole Methods (FMM), and nearest neighbor searching techniques were used. These implementations were used when the coefficient matrix in the linear system is symmetric, but not necessarily positive-definite.
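
    At its core, kriging reduces to a single symmetric linear solve: covariances between the sample points form the coefficient matrix, and covariances to the query point form the right-hand side. A compact ordinary-kriging sketch with an assumed Gaussian covariance model; the faster variants in the article replace the dense solve below with iterative solvers, tapering, and neighbor searches. Note that the augmented matrix is symmetric but not positive-definite, which is precisely why solvers such as SYMMLQ apply:

```python
import numpy as np

rng = np.random.default_rng(10)

def cov(d, sill=1.0, length=0.3):
    """Gaussian covariance model as a function of distance."""
    return sill * np.exp(-(d / length) ** 2)

# Scattered sample points with values from a smooth test function.
pts = rng.uniform(0, 1, size=(50, 2))
vals = np.sin(4 * pts[:, 0]) + np.cos(3 * pts[:, 1])

def ordinary_kriging(query):
    n = len(pts)
    d = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
    # Augmented system enforces unbiasedness (weights sum to 1).
    K = np.zeros((n + 1, n + 1))
    K[:n, :n] = cov(d)
    K[n, :n] = K[:n, n] = 1.0
    rhs = np.append(cov(np.linalg.norm(pts - query, axis=1)), 1.0)
    w = np.linalg.solve(K, rhs)           # symmetric, indefinite system
    return w[:n] @ vals

q = np.array([0.5, 0.5])
print(f"kriged value at {q}: {ordinary_kriging(q):.3f} "
      f"(truth {np.sin(2.0) + np.cos(1.5):.3f})")
```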

  1. Design of efficient stiffened shells of revolution

    NASA Technical Reports Server (NTRS)

    Majumder, D. K.; Thornton, W. A.

    1976-01-01

    A method to produce efficient piecewise uniform stiffened shells of revolution is presented. The approach uses a first order differential equation formulation for the shell prebuckling and buckling analyses and the necessary conditions for an optimum design are derived by a variational approach. A variety of local yielding and buckling constraints and the general buckling constraint are included in the design process. The local constraints are treated by means of an interior penalty function and the general buckling load is treated by means of an exterior penalty function. This allows the general buckling constraint to be included in the design process only when it is violated. The self-adjoint nature of the prebuckling and buckling formulations is used to reduce the computational effort. Results for four conical shells and one spherical shell are given.

  2. Higher-order methods for simulations on quantum computers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sornborger, A.T.; Stewart, E.D.

    1999-09-01

    To implement many-qubit gates for use in quantum simulations on quantum computers efficiently, we develop and present methods for reexpressing exp[-i(H_1 + H_2 + ...)Δt] as a product of factors exp[-iH_1 Δt], exp[-iH_2 Δt], ..., which is accurate to third or fourth order in Δt. The methods we derive are an extended form of the symplectic method, and can also be used for integration of classical Hamiltonians on classical computers. We derive both integral and irrational methods, and find the most efficient methods in both cases.
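
    The claim is easy to check numerically for matrices: the plain product exp(-iH_1 Δt) exp(-iH_2 Δt) differs from exp(-i(H_1+H_2)Δt) at second order in Δt, while the symmetric splitting is accurate to second order (error at third order per step); the paper's constructions extend this to third and fourth order. A sketch with random Hermitian matrices:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(11)

def rand_hermitian(n):
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

H1, H2 = rand_hermitian(8), rand_hermitian(8)

for dt in (0.1, 0.05, 0.025):
    exact = expm(-1j * (H1 + H2) * dt)
    lie = expm(-1j * H1 * dt) @ expm(-1j * H2 * dt)          # 1st order
    strang = (expm(-1j * H1 * dt / 2) @ expm(-1j * H2 * dt)  # 2nd order
              @ expm(-1j * H1 * dt / 2))
    # Halving dt should shrink the first error ~4x and the second ~8x.
    print(f"dt={dt:<6} first-order err {np.linalg.norm(lie - exact):.2e}  "
          f"symmetric err {np.linalg.norm(strang - exact):.2e}")
```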

  3. Edge-Based Efficient Search over Encrypted Data Mobile Cloud Storage

    PubMed Central

    Liu, Fang; Cai, Zhiping; Xiao, Nong; Zhao, Ziming

    2018-01-01

    Smart sensor-equipped mobile devices sense, collect, and process data generated by the edge network to achieve intelligent control, but such mobile devices usually have limited storage and computing resources. Mobile cloud storage provides a promising solution owing to its rich storage resources, great accessibility, and low cost. But it also brings a risk of information leakage. The encryption of sensitive data is the basic step to resist this risk. However, deploying a high complexity encryption and decryption algorithm on mobile devices would greatly increase the burden of terminal operation and the difficulty of implementing the necessary privacy protection algorithm. In this paper, we propose ENSURE (EfficieNt and SecURE), an efficient and secure encrypted search architecture over mobile cloud storage. ENSURE is inspired by edge computing. It allows mobile devices to offload computation intensive tasks onto the edge server to achieve high efficiency. Besides, to protect data security, it reduces the information acquired by the untrusted cloud by hiding the relevance between query keywords and search results from the cloud. Experiments on a real data set show that ENSURE reduces computation time by 15% to 49% and saves energy consumption by 38% to 69% per query. PMID:29652810

  4. Edge-Based Efficient Search over Encrypted Data Mobile Cloud Storage.

    PubMed

    Guo, Yeting; Liu, Fang; Cai, Zhiping; Xiao, Nong; Zhao, Ziming

    2018-04-13

    Smart sensor-equipped mobile devices sense, collect, and process data generated by the edge network to achieve intelligent control, but such mobile devices usually have limited storage and computing resources. Mobile cloud storage provides a promising solution owing to its rich storage resources, great accessibility, and low cost. But it also brings a risk of information leakage. The encryption of sensitive data is the basic step to resist this risk. However, deploying a high complexity encryption and decryption algorithm on mobile devices would greatly increase the burden of terminal operation and the difficulty of implementing the necessary privacy protection algorithm. In this paper, we propose ENSURE (EfficieNt and SecURE), an efficient and secure encrypted search architecture over mobile cloud storage. ENSURE is inspired by edge computing. It allows mobile devices to offload computation intensive tasks onto the edge server to achieve high efficiency. Besides, to protect data security, it reduces the information acquired by the untrusted cloud by hiding the relevance between query keywords and search results from the cloud. Experiments on a real data set show that ENSURE reduces computation time by 15% to 49% and saves energy consumption by 38% to 69% per query.

  5. Multiscale Methods, Parallel Computation, and Neural Networks for Real-Time Computer Vision.

    NASA Astrophysics Data System (ADS)

    Battiti, Roberto

    1990-01-01

    This thesis presents new algorithms for low and intermediate level computer vision. The guiding ideas in the presented approach are those of hierarchical and adaptive processing, concurrent computation, and supervised learning. Processing of the visual data at different resolutions is used not only to reduce the amount of computation necessary to reach the fixed point, but also to produce a more accurate estimation of the desired parameters. The presented adaptive multiple scale technique is applied to the problem of motion field estimation. Different parts of the image are analyzed at a resolution that is chosen in order to minimize the error in the coefficients of the differential equations to be solved. Tests with video-acquired images show that velocity estimation is more accurate over a wide range of motion with respect to the homogeneous scheme. In some cases introduction of explicit discontinuities coupled to the continuous variables can be used to avoid propagation of visual information from areas corresponding to objects with different physical and/or kinematic properties. The human visual system uses concurrent computation in order to process the vast amount of visual data in "real-time." Although with different technological constraints, parallel computation can be used efficiently for computer vision. All the presented algorithms have been implemented on medium grain distributed memory multicomputers with a speed-up approximately proportional to the number of processors used. A simple two-dimensional domain decomposition assigns regions of the multiresolution pyramid to the different processors. The inter-processor communication needed during the solution process is proportional to the linear dimension of the assigned domain, so that efficiency is close to 100% if a large region is assigned to each processor. Finally, learning algorithms are shown to be a viable technique to engineer computer vision systems for different applications starting from multiple-purpose modules. In the last part of the thesis a well known optimization method (the Broyden-Fletcher-Goldfarb-Shanno memoryless quasi-Newton method) is applied to simple classification problems and shown to be superior to the "error back-propagation" algorithm for numerical stability, automatic selection of parameters, and convergence properties.

  6. Towards an accurate representation of electrostatics in classical force fields: Efficient implementation of multipolar interactions in biomolecular simulations

    NASA Astrophysics Data System (ADS)

    Sagui, Celeste; Pedersen, Lee G.; Darden, Thomas A.

    2004-01-01

    The accurate simulation of biologically active macromolecules faces serious limitations that originate in the treatment of electrostatics in the empirical force fields. The current use of "partial charges" is a significant source of errors, since these vary widely with different conformations. By contrast, the molecular electrostatic potential (MEP) obtained through the use of a distributed multipole moment description, has been shown to converge to the quantum MEP outside the van der Waals surface, when higher order multipoles are used. However, in spite of the considerable improvement to the representation of the electronic cloud, higher order multipoles are not part of current classical biomolecular force fields due to the excessive computational cost. In this paper we present an efficient formalism for the treatment of higher order multipoles in Cartesian tensor formalism. The Ewald "direct sum" is evaluated through a McMurchie-Davidson formalism [L. McMurchie and E. Davidson, J. Comput. Phys. 26, 218 (1978)]. The "reciprocal sum" has been implemented in three different ways: using an Ewald scheme, a particle mesh Ewald (PME) method, and a multigrid-based approach. We find that even though the use of the McMurchie-Davidson formalism considerably reduces the cost of the calculation with respect to the standard matrix implementation of multipole interactions, the calculation in direct space remains expensive. When most of the calculation is moved to reciprocal space via the PME method, the cost of a calculation where all multipolar interactions (up to hexadecapole-hexadecapole) are included is only about 8.5 times more expensive than a regular AMBER 7 [D. A. Pearlman et al., Comput. Phys. Commun. 91, 1 (1995)] implementation with only charge-charge interactions. The multigrid implementation is slower but shows very promising results for parallelization. It provides a natural way to interface with continuous, Gaussian-based electrostatics in the future. It is hoped that this new formalism will facilitate the systematic implementation of higher order multipoles in classical biomolecular force fields.

  7. PIC codes for plasma accelerators on emerging computer architectures (GPUS, Multicore/Manycore CPUS)

    NASA Astrophysics Data System (ADS)

    Vincenti, Henri

    2016-03-01

    The advent of exascale computers will enable 3D simulations of new laser-plasma interaction regimes that were previously out of reach of current petascale computers. However, the paradigm used to write current PIC codes will have to change in order to fully exploit the potential of these new computing architectures. Indeed, achieving exascale computing facilities in the next decade will be a great challenge in terms of energy consumption and will imply hardware developments directly impacting our way of implementing PIC codes. As data movement (from die to network) is by far the most energy consuming part of an algorithm, future computers will tend to increase memory locality at the hardware level and reduce energy consumption related to data movement by using more and more cores on each compute node ('fat nodes') that will have a reduced clock speed to allow for efficient cooling. To compensate for the frequency decrease, CPU vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD register length is expected to double every four years. GPUs also have a reduced clock speed per core and can process Multiple Instructions on Multiple Data (MIMD). At the software level, Particle-In-Cell (PIC) codes will thus have to achieve both good memory locality and vectorization (for multicore/manycore CPUs) to fully take advantage of these upcoming architectures. In this talk, we present the portable solutions we implemented in our high performance skeleton PIC code PICSAR to achieve both good memory locality and cache reuse as well as good vectorization on SIMD architectures. We also present the portable solutions used to parallelize the pseudo-spectral quasi-cylindrical code FBPIC on GPUs using the Numba Python compiler.

  8. Projection-Based Reduced Order Modeling for Spacecraft Thermal Analysis

    NASA Technical Reports Server (NTRS)

    Qian, Jing; Wang, Yi; Song, Hongjun; Pant, Kapil; Peabody, Hume; Ku, Jentung; Butler, Charles D.

    2015-01-01

    This paper presents a mathematically rigorous, subspace projection-based reduced order modeling (ROM) methodology and an integrated framework to automatically generate reduced order models for spacecraft thermal analysis. Two key steps in the reduced order modeling procedure are described: (1) the acquisition of a full-scale spacecraft model in ordinary differential equation (ODE) and differential algebraic equation (DAE) form to resolve its dynamic thermal behavior; and (2) the model reduction step, which markedly reduces the dimension of the full-scale model. Specifically, proper orthogonal decomposition (POD) in conjunction with the discrete empirical interpolation method (DEIM) and trajectory piecewise-linear (TPWL) methods are developed to address the strong nonlinear thermal effects due to coupled conductive and radiative heat transfer in the spacecraft environment. Case studies using NASA-relevant satellite models are undertaken to verify the capability and to assess the computational performance of the ROM technique in terms of speed-up and error relative to the full-scale model. The ROM exhibits excellent agreement in spatiotemporal thermal profiles (<0.5% relative error on pertinent time scales) along with substantial computational acceleration (up to two orders of magnitude speed-up) over the full-scale analysis. These findings establish the feasibility of ROM to perform rational and computationally affordable thermal analysis, develop reliable thermal control strategies for spacecraft, and greatly reduce development cycle times and costs.
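
    As a minimal illustration of the POD step (the first ingredient of the methodology above; DEIM and TPWL are not sketched here), a reduced basis can be extracted from full-order snapshots via the SVD:

      import numpy as np

      def pod_basis(snapshots, energy=0.999):
          """Extract a POD basis from a snapshot matrix of shape
          (n_dof, n_snapshots); keep enough modes to capture `energy`
          of the snapshot variance."""
          U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
          cum = np.cumsum(s**2) / np.sum(s**2)
          r = int(np.searchsorted(cum, energy)) + 1
          return U[:, :r]                           # reduced basis Phi

      # Galerkin projection of a linear operator A onto the basis:
      #   A_r = Phi.T @ A @ Phi, so the reduced state evolves in r << n_dof dims.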

  9. 3 CFR 13513 - Executive Order 13513 of October 1, 2009. Federal Leadership on Reducing Text Messaging While...

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    Executive Order 13513 of October 1, 2009 (EO 13513), Federal Leadership on Reducing Text Messaging While Driving... leadership in improving safety on our roads and highways and to enhance the efficiency of Federal contracting...

  10. Design and implementation of a hybrid sub-band acoustic echo canceller (AEC)

    NASA Astrophysics Data System (ADS)

    Bai, Mingsian R.; Yang, Cheng-Ken; Hur, Ker-Nan

    2009-04-01

    An efficient method is presented for implementing an acoustic echo canceller (AEC) that makes use of a hybrid sub-band approach. The hybrid system comprises a fixed processor and an adaptive filter in each sub-band. The AEC aims at reducing the echo resulting from acoustic feedback in loudspeaker-enclosure-microphone (LEM) systems such as teleconferencing and hands-free systems. In order to cancel the acoustic echo efficiently, various processing architectures, including fixed filters, hybrid processors, and sub-band structures, are investigated. A double-talk detector is incorporated into the proposed AEC to prevent the adaptive filter from diverging in double-talk situations. A de-correlation filter is also used alongside sub-band processing to enhance the performance and efficiency of the AEC. All algorithms are implemented and verified on the platform of a fixed-point digital signal processor (DSP). The AECs are evaluated in terms of cancellation performance and computational complexity. In addition, listening tests are conducted to assess the subjective performance of the AECs. From the results, the proposed hybrid sub-band AEC was found to be the most effective method in terms of echo reduction and timbral quality.
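
    The adaptive stage in each sub-band is typically a normalized LMS (NLMS) filter; the fullband sketch below illustrates that core update, leaving out the sub-band analysis/synthesis filter banks, the fixed processor, and the double-talk detector.

      import numpy as np

      def nlms_echo_canceller(x, d, n_taps=128, mu=0.5, eps=1e-8):
          """Normalized LMS adaptive filter of the kind run inside each
          sub-band of an AEC. x: far-end (loudspeaker) signal, d: microphone
          signal containing the echo. Returns the echo-cancelled error signal."""
          w = np.zeros(n_taps)                      # adaptive filter taps
          e = np.zeros(len(x))
          for n in range(n_taps, len(x)):
              u = x[n - n_taps:n][::-1]             # most recent samples first
              e[n] = d[n] - w @ u                   # residual after echo estimate
              w += mu * e[n] * u / (u @ u + eps)    # normalized tap update
          return e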

  11. Efficient Strategies for Predictive Cell-Level Control of Lithium-Ion Batteries

    NASA Astrophysics Data System (ADS)

    Xavier, Marcelo A.

    This dissertation introduces a set of state-space-based model predictive control (MPC) algorithms tailored to models with a non-zero feedthrough term, which accounts for the ohmic resistance inherent in battery dynamics. MPC is applied here to the problem of regulating cell-level measures of performance for lithium-ion batteries; the control methodologies are first used to compute a fast-charging profile that respects input, output, and state constraints (i.e., input current, terminal voltage, and state of charge) for an equivalent circuit model (ECM) of the battery cell, and later extended to a linearized physics-based reduced-order model (PBROM). The novelty of this work can be summarized as follows: (1) the MPC variants are applied to a physics-based reduced-order model in order to make use of the available set of internal electrochemical variables and mitigate internal mechanisms of cell degradation (e.g., lithium plating); (2) we develop a dual-mode MPC closed-loop paradigm suited to the battery control problem, with the objective of reducing computational effort by solving simpler optimization routines while guaranteeing stability; and (3) we develop a new use of predictive control in which MPC is employed as a "smart sensor" for power estimation. Results are presented that show the comparative performance of the MPC algorithms for both the ECM and the PBROM. These results highlight that dual-mode MPC can deliver optimal input current profiles using a shorter horizon while still guaranteeing stability. Additionally, rigorous mathematical developments are presented for the MPC algorithms. The use of MPC as a "smart sensor" is an appealing method for power estimation, since MPC permits a fully dynamic input profile that achieves performance right at the constraint boundaries. MPC is therefore expected to produce accurate power limits at each sample time when compared to the bisection method [1], which assumes constant input values over the prediction interval.
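
    To make the feedthrough point concrete, the sketch below implements one step of an unconstrained receding-horizon MPC for a generic state-space model x+ = A x + B u, y = C x + D u, where a non-zero D plays the role of the cell's ohmic resistance (the applied current shows up instantly in the terminal voltage). This is a minimal illustration with hypothetical matrices, not the dissertation's constrained dual-mode algorithms.

      import numpy as np

      def mpc_step(A, B, C, D, x0, y_ref, N=10, rho=1e-3):
          """One unconstrained receding-horizon MPC step for
          x+ = A x + B u, y = C x + D u (scalar input/output).
          The non-zero feedthrough D models the ohmic voltage drop:
          the current u appears instantly in the output y."""
          # Prediction: y_k = C A^k x0 + sum_{j<k} C A^(k-1-j) B u_j + D u_k
          F = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(N)])
          G = np.zeros((N, N))
          for k in range(N):
              G[k, k] = D
              for j in range(k):
                  G[k, j] = (C @ np.linalg.matrix_power(A, k - 1 - j) @ B).item()
          # Minimize ||G U + F x0 - y_ref||^2 + rho ||U||^2 over the inputs U
          U = np.linalg.solve(G.T @ G + rho * np.eye(N), G.T @ (y_ref - F @ x0))
          return U[0]                    # apply only the first input

      # Toy single-state cell: slow state dynamics plus instantaneous ohmic drop.
      A = np.array([[0.99]]); B = np.array([[0.05]])
      C = np.array([[1.0]]);  D = -0.02          # hypothetical values
      u0 = mpc_step(A, B, C, D, x0=np.array([3.6]), y_ref=np.full(10, 4.0))

    In a closed loop one would apply the returned input, advance the plant one step, and re-solve at the new state.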

  12. Evaluation of Geometrically Nonlinear Reduced Order Models with Nonlinear Normal Modes

    DOE PAGES

    Kuether, Robert J.; Deaner, Brandon J.; Hollkamp, Joseph J.; ...

    2015-09-15

    Several reduced-order modeling strategies have been developed to create low-order models of geometrically nonlinear structures from detailed finite element models, allowing one to compute the dynamic response of the structure at a dramatically reduced cost. However, the parameters of these reduced-order models are estimated by applying a series of static loads to the finite element model, and the quality of the reduced-order model can be highly sensitive to the amplitudes of the static load cases used and to the type/number of modes used in the basis. Our paper proposes to combine reduced-order modeling and numerical continuation to estimate the nonlinear normal modes of geometrically nonlinear finite element models. Not only does this make it possible to compute the nonlinear normal modes far more quickly than existing approaches, but the nonlinear normal modes are also shown to be an excellent metric by which the quality of the reduced-order model can be assessed. Hence, the second contribution of this work is to demonstrate how nonlinear normal modes can be used as a metric by which nonlinear reduced-order models can be compared. Various reduced-order models with hardening nonlinearities are compared for two different structures to demonstrate these concepts: a clamped-clamped beam model, and a more complicated finite element model of an exhaust panel cover.

  13. A Machine-Learning and Filtering Based Data Assimilation Framework for Geologic Carbon Sequestration Monitoring Optimization

    NASA Astrophysics Data System (ADS)

    Chen, B.; Harp, D. R.; Lin, Y.; Keating, E. H.; Pawar, R.

    2017-12-01

    Monitoring is a crucial aspect of geologic carbon sequestration (GCS) risk management. It has gained importance as a means to ensure CO2 is safely and permanently stored underground throughout the lifecycle of a GCS project. Three issues are often involved in a monitoring project: (i) where is the optimal location to place the monitoring well(s), (ii) what type of data (pressure, rate, and/or CO2 concentration) should be measured, and (iii) what is the optimal frequency to collect the data. In order to address these important issues, a filtering-based data assimilation procedure is developed to perform the monitoring optimization. The optimal monitoring strategy is selected based on the uncertainty reduction of the objective of interest (e.g., cumulative CO2 leak) across all potential monitoring strategies. To reduce the computational cost of the filtering-based data assimilation process, two machine-learning algorithms, Support Vector Regression (SVR) and Multivariate Adaptive Regression Splines (MARS), are used to develop computationally efficient reduced-order models (ROMs) from full numerical simulations of CO2 and brine flow. The proposed framework for GCS monitoring optimization is demonstrated with two examples: a simple 3D synthetic case and a real field case, the Rock Springs Uplift carbon storage site in southwestern Wyoming.
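
    A minimal sketch of the ROM idea, using scikit-learn's SVR as named in the text; the training arrays are hypothetical placeholders standing in for input/output pairs harvested from full CO2/brine flow simulations.

      import numpy as np
      from sklearn.svm import SVR

      # Hypothetical training pairs: each row collects simulator inputs
      # (e.g., permeabilities, injection rates); the target is the quantity
      # of interest (e.g., cumulative CO2 leak) from the full simulation.
      X_train = np.random.rand(200, 5)
      y_train = np.random.rand(200)

      rom = SVR(kernel="rbf", C=10.0, epsilon=0.01)
      rom.fit(X_train, y_train)

      # The fitted ROM now stands in for the simulator inside the
      # filtering-based data assimilation loop:
      y_pred = rom.predict(np.random.rand(1000, 5))   # cheap evaluations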

  14. Hybrid RANS-LES using high order numerical methods

    NASA Astrophysics Data System (ADS)

    Henry de Frahan, Marc; Yellapantula, Shashank; Vijayakumar, Ganesh; Knaus, Robert; Sprague, Michael

    2017-11-01

    Understanding the impact of wind turbine wake dynamics on downstream turbines is particularly important for the design of efficient wind farms. Due to their tractable computational cost, hybrid RANS/LES models are an attractive framework for simulating separated flows such as the wake dynamics behind a wind turbine. High-order numerical methods can be computationally efficient and provide increased accuracy in simulating complex flows. In the context of LES, high-order numerical methods have shown some success in predictions of turbulent flows. However, the specifics of hybrid RANS-LES models, including the transition region between the two modeling frameworks, pose unique challenges for high-order numerical methods. In this work, we study the effect of increasing the order of accuracy of the numerical scheme in simulations of canonical turbulent flows using RANS, LES, and hybrid RANS-LES models. We describe the interactions between filtering, model transition, and order of accuracy and their effect on turbulence quantities such as kinetic energy spectra, boundary layer evolution, and dissipation rate. This work was funded by the U.S. Department of Energy, Exascale Computing Project, under Contract No. DE-AC36-08-GO28308 with the National Renewable Energy Laboratory.

  15. Propulsive efficiency of frog swimming with different feet and swimming patterns

    PubMed Central

    Jizhuang, Fan; Wei, Zhang; Bowen, Yuan; Gangfeng, Liu

    2017-01-01

    ABSTRACT Aquatic and terrestrial animals have different swimming performances and mechanical efficiencies based on their different swimming methods. To explore propulsion in swimming frogs, this study calculated mechanical efficiencies based on data describing aquatic and terrestrial webbed-foot shapes and swimming patterns. First, a simplified frog model and dynamic equation were established, and hydrodynamic forces on the foot were computed according to computational fluid dynamics calculations. Then, a two-link mechanism was used to stand in for the diverse and complicated hind legs found in different frog species, in order to simplify the input work calculation. Joint torques were derived based on the virtual work principle to compute the efficiency of foot propulsion. Finally, the foot shapes and swimming patterns were combined to compute propulsive efficiency. The aquatic frog demonstrated a propulsive efficiency (43.11%) between those of drag-based and lift-based propulsion, while the terrestrial frog efficiency (29.58%) fell within the range of drag-based propulsion. The results illustrate that the swimming pattern is the main factor determining swimming performance and efficiency. PMID:28302669

  16. Efficient Reformulation of the Thermoelastic Higher-order Theory for Fgms

    NASA Technical Reports Server (NTRS)

    Bansal, Yogesh; Pindera, Marek-Jerzy; Arnold, Steven M. (Technical Monitor)

    2002-01-01

    Functionally graded materials (FGMs) are characterized by spatially variable microstructures which are introduced to satisfy given performance requirements. The microstructural gradation gives rise to continuously or discretely changing material properties which complicate FGM analysis. Various techniques have been developed during the past several decades for analyzing traditional composites and many of these have been adapted for the analysis of FGMs. Most of the available techniques use the so-called uncoupled approach in order to analyze graded structures. These techniques ignore the effect of microstructural gradation by employing specific spatial material property variations that are either assumed or obtained by local homogenization. The higher-order theory for functionally graded materials (HOTFGM) is a coupled approach developed by Aboudi et al. (1999) which takes the effect of microstructural gradation into consideration and does not ignore the local-global interaction of the spatially variable inclusion phase(s). Despite its demonstrated utility, however, the original formulation of the higher-order theory is computationally intensive. Herein, an efficient reformulation of the original higher-order theory for two-dimensional elastic problems is developed and validated. A local-global conductivity and stiffness matrix approach is used to reduce the number of equations involved. In this approach, surface-averaged quantities are the primary variables, replacing the volume-averaged quantities employed in the original formulation. The reformulation decreases the size of the global conductivity and stiffness matrices by approximately sixty percent. Various thermal, mechanical, and combined thermomechanical problems are analyzed in order to validate the accuracy of the reformulated theory through comparison with analytical and finite-element solutions. The presented results illustrate the efficiency of the reformulation and its advantages in analyzing functionally graded materials.

  17. Photon-trapping microstructures enable high-speed high-efficiency silicon photodiodes

    NASA Astrophysics Data System (ADS)

    Gao, Yang; Cansizoglu, Hilal; Polat, Kazim G.; Ghandiparsi, Soroush; Kaya, Ahmet; Mamtaz, Hasina H.; Mayet, Ahmed S.; Wang, Yinan; Zhang, Xinzhi; Yamada, Toshishige; Devine, Ekaterina Ponizovskaya; Elrefaie, Aly F.; Wang, Shih-Yuan; Islam, M. Saif

    2017-04-01

    High-speed, high-efficiency photodetectors play an important role in optical communication links that are increasingly being used in data centres to handle higher volumes of data traffic and higher bandwidths, as big data and cloud computing continue to grow exponentially. Monolithic integration of optical components with signal-processing electronics on a single silicon chip is of paramount importance in the drive to reduce cost and improve performance. We report the first demonstration of micro- and nanoscale holes enabling light trapping in a silicon photodiode, which exhibits an ultrafast impulse response (full-width at half-maximum) of 30 ps and a high efficiency of more than 50%, for use in data-centre optical communications. The photodiode uses micro- and nanostructured holes to enhance, by an order of magnitude, the absorption efficiency of a thin intrinsic layer of less than 2 µm thickness and is designed for a data rate of 20 gigabits per second or higher at a wavelength of 850 nm. Further optimization can improve the efficiency to more than 70%.

  18. Improving Computational Efficiency of Prediction in Model-Based Prognostics Using the Unscented Transform

    NASA Technical Reports Server (NTRS)

    Daigle, Matthew John; Goebel, Kai Frank

    2010-01-01

    Model-based prognostics captures system knowledge in the form of physics-based models of components, and how they fail, in order to obtain accurate predictions of end of life (EOL). EOL is predicted based on the estimated current state distribution of a component and expected profiles of future usage. In general, this requires simulations of the component using the underlying models. In this paper, we develop a simulation-based prediction methodology that achieves computational efficiency by performing only the minimal number of simulations needed in order to accurately approximate the mean and variance of the complete EOL distribution. This is performed through the use of the unscented transform, which predicts the means and covariances of a distribution passed through a nonlinear transformation. In this case, the EOL simulation acts as that nonlinear transformation. In this paper, we review the unscented transform, and describe how this concept is applied to efficient EOL prediction. As a case study, we develop a physics-based model of a solenoid valve, and perform simulation experiments to demonstrate improved computational efficiency without sacrificing prediction accuracy.
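
    The following is a generic sketch of the unscented transform itself, with the nonlinear function f standing in for the EOL simulation; the sigma-point scaling via kappa follows a common formulation (kappa = 3 - n is a typical choice) and is not necessarily the paper's exact parameterization.

      import numpy as np

      def unscented_transform(f, mean, cov, kappa=1.0):
          """Propagate a Gaussian (mean, cov) through a nonlinear function f
          using 2n+1 sigma points; returns the approximated output mean/covariance."""
          n = len(mean)
          S = np.linalg.cholesky((n + kappa) * cov)   # sigma-point spread
          sigma = [mean] + [mean + S[:, i] for i in range(n)] \
                         + [mean - S[:, i] for i in range(n)]
          w = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))
          w[0] = kappa / (n + kappa)
          ys = np.array([f(s) for s in sigma])        # f plays the EOL simulation
          y_mean = w @ ys
          y_cov = sum(wi * np.outer(yi - y_mean, yi - y_mean)
                      for wi, yi in zip(w, ys))
          return y_mean, y_cov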

  19. Derivation of general analytic gradient expressions for density-fitted post-Hartree-Fock methods: An efficient implementation for the density-fitted second-order Møller–Plesset perturbation theory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bozkaya, Uğur, E-mail: ugur.bozkaya@atauni.edu.tr

    General analytic gradient expressions (with the frozen-core approximation) are presented for density-fitted post-HF methods. An efficient implementation of frozen-core analytic gradients for second-order Møller–Plesset perturbation theory (MP2) with the density-fitting (DF) approximation (applied to both reference and correlation energies), denoted DF-MP2, is reported. The DF-MP2 method is applied to a set of alkanes, conjugated dienes, and noncovalent interaction complexes to compare the computational cost of single-point analytic gradients with MP2 with the resolution-of-the-identity approach (RI-MP2) [F. Weigend and M. Häser, Theor. Chem. Acc. 97, 331 (1997); R. A. Distasio, R. P. Steele, Y. M. Rhee, Y. Shao, and M. Head-Gordon, J. Comput. Chem. 28, 839 (2007)]. In the RI-MP2 method, the DF approach is used only for the correlation energy. Our results demonstrate that the DF-MP2 method substantially accelerates the RI-MP2 method for analytic gradient computations due to the reduced input/output (I/O) time. Because in the DF-MP2 method the DF approach is used for both reference and correlation energies, the storage of 4-index electron repulsion integrals (ERIs) is avoided; 3-index ERI tensors are employed instead. Further, as in the case of the integrals, our gradient equation completely avoids construction or storage of the 4-index two-particle density matrix (TPDM); instead we use 2- and 3-index TPDMs. Hence, the I/O bottleneck of a gradient computation is significantly reduced. The costs of the generalized-Fock matrix (GFM), the TPDM, the solution of the Z-vector equations, the back-transformation of the TPDM, and the integral derivatives are all substantially reduced when the DF approach is used for the entire energy expression. Further application results show that the DF approach introduces negligible errors for closed-shell reaction energies and equilibrium bond lengths.
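
    The storage saving comes from keeping only 3-index quantities. A schematic numpy illustration (shapes and values are hypothetical) of how a block of 4-index ERIs would be reassembled on the fly from a DF tensor B:

      import numpy as np

      # Density fitting keeps only 3-index tensors; 4-index ERIs are
      # reassembled on demand: (pq|rs) ~ sum_Q B[p,q,Q] * B[r,s,Q].
      nbf, naux = 40, 120                    # hypothetical basis sizes
      B = np.random.rand(nbf, nbf, naux)     # would come from fitted integrals

      eri_block = np.einsum("pqQ,rsQ->pqrs", B[:5, :5], B[:5, :5])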

  20. Decreasing the temporal complexity for nonlinear, implicit reduced-order models by forecasting

    DOE PAGES

    Carlberg, Kevin; Ray, Jaideep; van Bloemen Waanders, Bart

    2015-02-14

    Implicit numerical integration of nonlinear ODEs requires solving a system of nonlinear algebraic equations at each time step. Each of these systems is often solved by a Newton-like method, which incurs a sequence of linear-system solves. Most model-reduction techniques for nonlinear ODEs exploit knowledge of the system's spatial behavior to reduce the computational complexity of each linear-system solve. However, the number of linear-system solves for the reduced-order simulation often remains roughly the same as for the full-order simulation. We propose exploiting knowledge of the model's temporal behavior to (1) forecast the unknown variable of the reduced-order system of nonlinear equations at future time steps, and (2) use this forecast as an initial guess for the Newton-like solver during the reduced-order-model simulation. To compute the forecast, we propose using the gappy POD technique. The goal is to generate an accurate initial guess so that the Newton solver requires far fewer iterations to converge, thereby decreasing the number of linear-system solves in the reduced-order-model simulation.
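
    The paper's forecast uses gappy POD; as a simplified stand-in for the same idea, the sketch below extrapolates each component of the reduced state polynomially from the last few converged time steps and uses the result to seed Newton.

      import numpy as np

      def forecast_guess(history, degree=2):
          """Extrapolate the reduced state one step ahead from the last few
          converged states (rows of `history`), to seed the Newton solver.
          Requires more than `degree` past states."""
          hist = np.asarray(history)                 # shape (m, n_reduced)
          steps = np.arange(len(hist))
          coeffs = np.polyfit(steps, hist, degree)   # fits each component
          t_next = len(hist)
          return sum(c * t_next ** (degree - i) for i, c in enumerate(coeffs))

      # In the time loop: x_guess = forecast_guess(past_states[-4:]), then run
      # the Newton iteration from x_guess instead of from the previous state.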

  1. Manufacturing Technology Support (MATES). Delivery Order 0002: Business Practices Assessment (BPA) Development

    DTIC Science & Technology

    2008-03-01

    order fulfillment visibility, Kanban deployment, inventory count can be made visually, machines and tool labeling, costs, preventive maintenance... order fulfillment, computer scheduling versus Kanban, pull versus push systems, flow time efficiencies, back room costs of scheduling, MRP costs

  2. Energy Efficiency Maximization of Practical Wireless Communication Systems

    NASA Astrophysics Data System (ADS)

    Eraslan, Eren

    Energy consumption of modern wireless communication systems is rapidly growing due to ever-increasing data demand and the advanced solutions employed to address this demand, such as multiple-input multiple-output (MIMO) and orthogonal frequency division multiplexing (OFDM) techniques. These MIMO systems are power hungry; however, they are capable of changing transmission parameters such as the number of spatial streams, number of transmitter/receiver antennas, modulation, code rate, and transmit power. They can thus choose the best mode out of possibly thousands of modes in order to optimize an objective function. This problem is referred to as the link adaptation problem. In this work, we focus on link adaptation for energy efficiency maximization, which is defined as choosing the optimal transmission mode to maximize the number of successfully transmitted bits per unit energy consumed by the link. We model the energy consumption and throughput performance of a MIMO-OFDM link and develop a practical link adaptation protocol, which senses the channel conditions and changes its transmission mode in real time. It turns out that the brute-force search usually assumed in previous works is prohibitively complex, especially when there are large numbers of transmit power levels to choose from. We analyze the relationship between energy efficiency and transmit power, and prove that the energy efficiency of a link is a single-peaked quasiconcave function of transmit power. This leads us to develop a low-complexity algorithm that finds a near-optimal transmit power and takes this dimension out of the search space. We further prune the search space by analyzing the singular value decomposition of the channel and excluding modes that use a higher number of spatial streams than the channel can support. These algorithms and our novel formulations provide simpler computations and limit the search space to a much smaller set, reducing the computational complexity by orders of magnitude without sacrificing performance. The result of this work is a highly practical link adaptation protocol for maximizing the energy efficiency of modern wireless communication systems. Simulation results show orders-of-magnitude gains in the energy efficiency of the link. We also implemented the link adaptation protocol on real-time MIMO-OFDM radios and report on the experimental results. To the best of our knowledge, this is the first reported testbed capable of performing energy-efficient fast link adaptation using PHY layer information.
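
    Because the energy efficiency is single-peaked (quasiconcave) in transmit power, a derivative-free ternary search suffices to locate a near-optimal power without sweeping every level; the sketch below is illustrative, with a toy rate/power model rather than the dissertation's link model.

      import math

      def max_quasiconcave(ee, lo, hi, tol=1e-3):
          """Ternary search for the maximizer of a single-peaked (quasiconcave)
          energy-efficiency function ee(p) on the interval [lo, hi]."""
          while hi - lo > tol:
              m1 = lo + (hi - lo) / 3
              m2 = hi - (hi - lo) / 3
              if ee(m1) < ee(m2):
                  lo = m1                  # the peak lies to the right of m1
              else:
                  hi = m2                  # the peak lies to the left of m2
          return (lo + hi) / 2

      # Toy bits-per-joule model: rate log2(1 + g p) over total power p0 + p.
      p_star = max_quasiconcave(lambda p: math.log2(1 + 100 * p) / (1.0 + p),
                                0.01, 10.0)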

  3. Computational study of configurational and vibrational contributions to the thermodynamics of substitutional alloys: The case of Ni3Al

    NASA Astrophysics Data System (ADS)

    Michelon, M. F.; Antonelli, A.

    2010-03-01

    We have developed a methodology to study the thermodynamics of order-disorder transformations in n-component substitutional alloys that combines nonequilibrium methods, which can efficiently compute free energies, with Monte Carlo simulations, in which configurational and vibrational degrees of freedom are simultaneously considered on an equal footing. Furthermore, with this methodology one can easily perform simulations in the canonical and isobaric-isothermal ensembles, which allows the investigation of bulk volume effects. We have applied this methodology to calculate configurational and vibrational contributions to the entropy of the Ni3Al alloy as functions of temperature. The simulations show that when the volume of the system is kept constant, the vibrational entropy does not change upon transition, while constant-pressure calculations indicate that the volume increase at the order-disorder transition causes a vibrational entropy increase of 0.08kB/atom. This is significant when compared to the configurational entropy increase of 0.27kB/atom. Our calculations also indicate that the inclusion of vibrations reduces by about 30% the order-disorder transition temperature determined solely from the configurational degrees of freedom.

  4. GPU-accelerated Modeling and Element-free Reverse-time Migration with Gauss Points Partition

    NASA Astrophysics Data System (ADS)

    Zhen, Z.; Jia, X.

    2014-12-01

    The element-free method (EFM) has been applied to seismic modeling and migration. Compared with the finite element method (FEM) and the finite difference method (FDM), it is much cheaper and more flexible because only the information of the nodes and the boundary of the study area is required in computation. In the EFM, the number of Gauss points should be consistent with the number of model nodes; otherwise the accuracy of the intermediate coefficient matrices would be harmed. Thus, when we increase the nodes of the velocity model in order to obtain higher resolution, the size of the computer's memory becomes a bottleneck. The original EFM can deal with at most 81×81 nodes with 2 GB of memory, as tested by Jia and Hu (2006). In order to solve the problem of storage and computational efficiency, we propose a concept of Gauss points partition (GPP) and utilize GPUs to improve the computational efficiency. Considering the characteristics of the Gauss points, the GPP method does not influence the propagation of seismic waves in the velocity model. To overcome the time-consuming computation of the stiffness matrix (K) and the mass matrix (M), we also use GPUs in our computation program. We employ the compressed sparse row (CSR) format to compress the intermediate sparse matrices and simplify the operations by solving the linear equations with the CULA Sparse Conjugate Gradient (CG) solver instead of the linear sparse solver PARDISO. We observe that our strategy can significantly reduce the computational time of K and M compared with the CPU-based algorithm. The model tested is the Marmousi model. The length of the model is 7425 m and the depth is 2990 m. We discretize the model with 595×298 nodes, 300×300 Gauss cells, and 3×3 Gauss points in each cell. In contrast to the computational time of the conventional EFM, the GPUs-GPP approach substantially improves efficiency: the speedup ratio for computing K and M is 120, and the speedup for RTM is 11.5. At the same time, the accuracy of imaging is not harmed. Another advantage of the GPUs-GPP method is its easy application to other numerical methods such as the FEM. Finally, in the GPUs-GPP method, the arrays require quite limited memory storage, which makes the method promising for large-scale 3D problems.
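
    For readers unfamiliar with the storage scheme, the sketch below assembles a small symmetric positive-definite system in CSR form with SciPy and solves it with conjugate gradients, mirroring the CSR + CG strategy described above; a real EFM assembly would fill the matrix from Gauss-point contributions.

      import numpy as np
      from scipy.sparse import diags
      from scipy.sparse.linalg import cg

      # Toy SPD tridiagonal system assembled directly in CSR format.
      n = 1000
      main = 2.0 * np.ones(n)
      off = -1.0 * np.ones(n - 1)
      K = diags([off, main, off], [-1, 0, 1], format="csr")   # stiffness-like
      f = np.ones(n)
      u, info = cg(K, f)        # conjugate gradients; info == 0 on convergence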

  5. Scaled Runge-Kutta algorithms for handling dense output

    NASA Technical Reports Server (NTRS)

    Horn, M. K.

    1981-01-01

    Low order Runge-Kutta algorithms are developed which determine the solution of a system of ordinary differential equations at any point within a given integration step, as well as at the end of each step. The scaled Runge-Kutta methods are designed to be used with existing Runge-Kutta formulas, using the derivative evaluations of these defining algorithms as the core of the system. For a slight increase in computing time, the solution may be generated within the integration step, improving the efficiency of the Runge-Kutta algorithms, since the step length need no longer be severely reduced to coincide with the desired output point. Scaled Runge-Kutta algorithms are presented for orders 3 through 5, along with accuracy comparisons between the defining algorithms and their scaled versions for a test problem.
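
    The simplest form of such dense output is a cubic Hermite interpolant built from the step endpoints and their derivatives, which needs no extra function evaluations; this generic sketch illustrates the idea but is not Horn's scaled Runge-Kutta construction.

      def hermite_dense_output(t0, h, y0, y1, f0, f1, t):
          """Cubic Hermite interpolant across one step [t0, t0 + h], built from
          the endpoint values y0, y1 and derivatives f0, f1; evaluates the
          solution at any interior t without re-evaluating the right-hand side."""
          s = (t - t0) / h                       # normalized position in the step
          h00 = (1 + 2 * s) * (1 - s) ** 2       # Hermite basis polynomials
          h10 = s * (1 - s) ** 2
          h01 = s ** 2 * (3 - 2 * s)
          h11 = s ** 2 * (s - 1)
          return h00 * y0 + h10 * h * f0 + h01 * y1 + h11 * h * f1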

  6. Predicting Statistical Response and Extreme Events in Uncertainty Quantification through Reduced-Order Models

    NASA Astrophysics Data System (ADS)

    Qi, D.; Majda, A.

    2017-12-01

    A low-dimensional reduced-order statistical closure model is developed for quantifying the uncertainty in statistical sensitivity and intermittency in the principal model directions with largest variability in high-dimensional turbulent systems and turbulent transport models. Imperfect model sensitivity is improved through a recent mathematical strategy for calibrating model errors in a training phase, where information theory and linear statistical response theory are combined in a systematic fashion to achieve optimal model performance. The reduced-order method builds on a self-consistent mathematical framework for general systems with quadratic nonlinearity, where crucial high-order statistics are approximated by a systematic model calibration procedure. Model efficiency is improved through additional damping and noise corrections to replace the expensive energy-conserving nonlinear interactions. Model errors due to the imperfect nonlinear approximation are corrected by tuning the model parameters using linear response theory with an information metric in a training phase before prediction. A statistical energy principle is adopted to introduce a global scaling factor in characterizing the higher-order moments in a consistent way to improve model sensitivity. Stringent models of barotropic and baroclinic turbulence are used to demonstrate the feasibility of the reduced-order methods. Principal statistical responses in mean and variance can be captured by the reduced-order models with accuracy and efficiency. In addition, the reduced-order models are used to capture the crucial passive tracer field that is advected by the baroclinic turbulent flow. It is demonstrated that crucial principal statistical quantities, like the tracer spectrum and the fat tails in the tracer probability density functions at the most important large scales, can be captured efficiently and accurately using the reduced-order tracer model in various dynamical regimes of the flow field with distinct statistical structures.

  7. Reduced order modeling of fluid/structure interaction.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barone, Matthew Franklin; Kalashnikova, Irina; Segalman, Daniel Joseph

    2009-11-01

    This report describes work performed from October 2007 through September 2009 under the Sandia Laboratory Directed Research and Development project titled 'Reduced Order Modeling of Fluid/Structure Interaction.' This project addresses fundamental aspects of techniques for construction of predictive Reduced Order Models (ROMs). A ROM is defined as a model, derived from a sequence of high-fidelity simulations, that preserves the essential physics and predictive capability of the original simulations but at a much lower computational cost. Techniques are developed for construction of provably stable linear Galerkin projection ROMs for compressible fluid flow, including a method for enforcing boundary conditions that preserves numerical stability. A convergence proof and error estimates are given for this class of ROM, and the method is demonstrated on a series of model problems. A reduced order method, based on the method of quadratic components, for solving the von Karman nonlinear plate equations is developed and tested. This method is applied to the problem of nonlinear limit cycle oscillations encountered when the plate interacts with an adjacent supersonic flow. A stability-preserving method for coupling the linear fluid ROM with the structural dynamics model for the elastic plate is constructed and tested. Methods for constructing efficient ROMs for nonlinear fluid equations are developed and tested on a one-dimensional convection-diffusion-reaction equation. These methods are combined with a symmetrization approach to construct a ROM technique for application to the compressible Navier-Stokes equations.

  8. An efficient fully-implicit multislope MUSCL method for multiphase flow with gravity in discrete fractured media

    NASA Astrophysics Data System (ADS)

    Jiang, Jiamin; Younis, Rami M.

    2017-06-01

    The first-order methods commonly employed in reservoir simulation for computing the convective fluxes introduce excessive numerical diffusion, leading to severe smoothing of displacement fronts. We present a fully-implicit cell-centered finite-volume (CCFV) framework that can achieve second-order spatial accuracy on smooth solutions while maintaining robustness and nonlinear convergence performance. A novel multislope MUSCL method is proposed to construct the required values at edge centroids in a straightforward and effective way by taking advantage of the triangular mesh geometry. In contrast to monoslope methods, in which a unique limited gradient is used, the multislope concept constructs specific scalar slopes for the interpolations on each edge of a given element. Through the edge centroids, the numerical diffusion caused by mesh skewness is reduced, and optimal second-order accuracy can be achieved. Moreover, an improved smooth flux-limiter is introduced to ensure monotonicity on non-uniform meshes. The flux-limiter provides high accuracy without degrading nonlinear convergence performance. The CCFV framework is adapted to accommodate a lower-dimensional discrete fracture-matrix (DFM) model. Several numerical tests with discrete fractured systems are carried out to demonstrate the efficiency and robustness of the numerical model.
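
    As a one-dimensional illustration of limited MUSCL reconstruction (the paper's multislope method computes a separate scalar slope per edge of each triangle, which this single-slope sketch does not attempt):

      import numpy as np

      def minmod(a, b):
          """Return the smaller-magnitude slope when a and b agree in sign,
          zero otherwise, so no new extrema are created at the faces."""
          return np.where(a * b > 0, np.where(np.abs(a) < np.abs(b), a, b), 0.0)

      def muscl_left_states(u):
          """Limited second-order reconstruction of the left state at each
          interior right face of a 1-D uniform mesh."""
          slope = minmod(u[1:-1] - u[:-2], u[2:] - u[1:-1])   # limited slope
          return u[1:-1] + 0.5 * slope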

  9. Yarn-dyed fabric defect classification based on convolutional neural network

    NASA Astrophysics Data System (ADS)

    Jing, Junfeng; Dong, Amei; Li, Pengfei

    2017-07-01

    Considering that manual inspection of yarn-dyed fabric can be time-consuming and inefficient, a convolutional neural network (CNN) solution based on a modified AlexNet structure is proposed for the classification of yarn-dyed fabric defects. CNNs have a powerful ability for feature extraction and feature fusion, which simulates the learning mechanism of the human brain. In order to enhance computational efficiency and detection accuracy, the local response normalization (LRN) layers in AlexNet are replaced by batch normalization (BN) layers. In the process of network training, the characteristics of the image are extracted step by step through several convolution operations, and the essential features of the image can be obtained from the edge features. Max pooling layers, dropout layers, and fully connected layers are also employed in the classification model to reduce the computation cost and acquire more precise features of fabric defects. Finally, the defect class is predicted by the softmax function. The experimental results show the capability of defect classification via the modified AlexNet model and indicate its robustness.
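
    A minimal PyTorch sketch of the described substitution: an AlexNet-style feature block with the LRN layers replaced by batch normalization. Channel sizes follow AlexNet; the classifier head is omitted.

      import torch.nn as nn

      # AlexNet-style feature extractor with LRN swapped for BN.
      features = nn.Sequential(
          nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=2),
          nn.BatchNorm2d(96),           # replaces nn.LocalResponseNorm
          nn.ReLU(inplace=True),
          nn.MaxPool2d(kernel_size=3, stride=2),
          nn.Conv2d(96, 256, kernel_size=5, padding=2),
          nn.BatchNorm2d(256),          # second LRN -> BN substitution
          nn.ReLU(inplace=True),
          nn.MaxPool2d(kernel_size=3, stride=2),
      )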

  10. Dual Super-Systolic Core for Real-Time Reconstructive Algorithms of High-Resolution Radar/SAR Imaging Systems

    PubMed Central

    Atoche, Alejandro Castillo; Castillo, Javier Vázquez

    2012-01-01

    A high-speed dual super-systolic core for reconstructive signal processing (SP) operations consists of a double parallel systolic array (SA) machine in which each processing element of the array is itself conceptualized as another SA in a bit-level fashion. In this study, we addressed the design of a high-speed dual super-systolic array (SSA) core for the enhancement/reconstruction of remote sensing (RS) imaging from radar/synthetic aperture radar (SAR) sensor systems. The selected reconstructive SP algorithms are efficiently transformed into their parallel representation and then mapped onto an efficient high-performance embedded computing (HPEC) architecture on reconfigurable Xilinx field programmable gate array (FPGA) platforms. As an implementation test case, the proposed approach was aggregated into a HW/SW co-design scheme in order to solve the nonlinear ill-posed inverse problem of nonparametric estimation of the power spatial spectrum pattern (SSP) from a remotely sensed scene. We show how such a dual SSA core drastically reduces the computational load of complex RS regularization techniques, achieving the required real-time operational mode. PMID:22736964

  11. An implicit higher-order spatially accurate scheme for solving time dependent flows on unstructured meshes

    NASA Astrophysics Data System (ADS)

    Tomaro, Robert F.

    1998-07-01

    The present research is aimed at developing a higher-order, spatially accurate scheme for both steady and unsteady flow simulations using unstructured meshes. The resulting scheme must work on a variety of general problems to ensure the creation of a flexible, reliable and accurate aerodynamic analysis tool. To calculate the flow around complex configurations, unstructured grids and the associated flow solvers have been developed. Efficient simulations require minimal use of computer memory and computational time. Unstructured flow solvers typically require more computer memory than a structured flow solver due to the indirect addressing of the cells. The approach taken in the present research was to modify an existing three-dimensional unstructured flow solver, first to decrease the computational time required for a solution and then to increase the spatial accuracy. The terms required to simulate flow involving non-stationary grids were also implemented. First, an implicit solution algorithm was implemented to replace the existing explicit procedure. Several test cases, including internal and external, inviscid and viscous, two-dimensional, three-dimensional and axi-symmetric problems, were simulated for comparison between the explicit and implicit solution procedures. The increased efficiency and robustness of the modified code due to the implicit algorithm were demonstrated. Two unsteady test cases, a plunging airfoil and a wing undergoing bending and torsion, were simulated using the implicit algorithm modified to include the terms required for a moving and/or deforming grid. Second, a higher than second-order spatially accurate scheme was developed and implemented into the baseline code. Third- and fourth-order spatially accurate schemes were implemented and tested. The original dissipation was modified to include higher-order terms and adjusted near shock waves to limit pre- and post-shock oscillations. The unsteady cases were repeated using the higher-order spatially accurate code, and the new solutions were compared with those obtained using the second-order spatially accurate scheme. Finally, the increased efficiency of using an implicit solution algorithm in a production Computational Fluid Dynamics flow solver was demonstrated for steady and unsteady flows. Third- and fourth-order spatially accurate schemes have been implemented, creating a basis for a state-of-the-art aerodynamic analysis tool.

  12. Remote sensing image ship target detection method based on visual attention model

    NASA Astrophysics Data System (ADS)

    Sun, Yuejiao; Lei, Wuhu; Ren, Xiaodong

    2017-11-01

    Traditional methods of detecting ship targets in remote sensing images mostly use a sliding window to search the whole image exhaustively. However, the target usually occupies only a small fraction of the image, so this approach has high computational complexity for large-format visible image data. The bottom-up selective attention mechanism can selectively allocate computing resources according to visual stimuli, thus improving computational efficiency and reducing the difficulty of analysis. With this in mind, a method for ship target detection in remote sensing images based on a visual attention model is proposed in this paper. The experimental results show that the proposed method can reduce computational complexity while improving detection accuracy, improving the detection efficiency of ship targets in remote sensing images.

  13. Efficient Geometric Sound Propagation Using Visibility Culling

    NASA Astrophysics Data System (ADS)

    Chandak, Anish

    2011-07-01

    Simulating propagation of sound can improve the sense of realism in interactive applications such as video games and can lead to better designs in engineering applications such as architectural acoustics. In this thesis, we present geometric sound propagation techniques which are faster than prior methods and map well to upcoming parallel multi-core CPUs. We model specular reflections by using the image-source method and model finite-edge diffraction by using the well-known Biot-Tolstoy-Medwin (BTM) model. We accelerate the computation of specular reflections by applying novel visibility algorithms, FastV and AD-Frustum, which compute visibility from a point. We accelerate finite-edge diffraction modeling by applying a novel visibility algorithm which computes visibility from a region. Our visibility algorithms are based on frustum tracing and exploit recent advances in fast ray-hierarchy intersections, data-parallel computations, and scalable, multi-core algorithms. The AD-Frustum algorithm adapts its computation to the scene complexity and allows small errors in computing specular reflection paths for higher computational efficiency. FastV and our visibility algorithm from a region are general, object-space, conservative visibility algorithms that together significantly reduce the number of image sources compared to other techniques while preserving the same accuracy. Our geometric propagation algorithms are an order of magnitude faster than prior approaches for modeling specular reflections and two to ten times faster for modeling finite-edge diffraction. Our algorithms are interactive, scale almost linearly on multi-core CPUs, and can handle large, complex, and dynamic scenes. We also compare the accuracy of our sound propagation algorithms with other methods. Once sound propagation is performed, it is desirable to listen to the propagated sound in interactive and engineering applications. We can generate smooth, artifact-free output audio signals by applying efficient audio-processing algorithms. We also present the first efficient audio-processing algorithm for scenarios with simultaneously moving source and moving receiver (MS-MR), which incurs less than 25% overhead compared to the static source and moving receiver (SS-MR) or moving source and static receiver (MS-SR) scenarios.

  14. Efficient high-order structure-preserving methods for the generalized Rosenau-type equation with power law nonlinearity

    NASA Astrophysics Data System (ADS)

    Cai, Jiaxiang; Liang, Hua; Zhang, Chun

    2018-06-01

    Based on the multi-symplectic Hamiltonian formulation of the generalized Rosenau-type equation, a multi-symplectic scheme and an energy-preserving scheme are proposed. To improve the accuracy of the solution, we apply the composition technique to the obtained schemes to develop high-order schemes that remain multi-symplectic and energy-preserving, respectively. The discrete fast Fourier transform significantly improves the computational efficiency of the schemes. Numerical results verify that all the proposed schemes perform satisfactorily in providing accurate solutions and preserving the discrete mass and energy invariants. Numerical results also show that although each basic time step is divided into several composition steps, the computational efficiency of the composition schemes is much higher than that of the non-composite schemes.

  15. Constrained Total Generalized p-Variation Minimization for Few-View X-Ray Computed Tomography Image Reconstruction

    PubMed Central

    Zhang, Hanming; Wang, Linyuan; Yan, Bin; Li, Lei; Cai, Ailong; Hu, Guoen

    2016-01-01

    Total generalized variation (TGV)-based computed tomography (CT) image reconstruction, which utilizes high-order image derivatives, is superior to total variation-based methods in terms of the preservation of edge information and the suppression of unfavorable staircase effects. However, conventional TGV regularization employs an l1-based form, which is not the most direct way to enforce the sparsity prior. In this study, we propose a total generalized p-variation (TGpV) regularization model to improve the sparsity exploitation of TGV and offer efficient solutions to few-view CT image reconstruction problems. To solve the nonconvex optimization problem of the TGpV minimization model, we present an efficient iterative algorithm based on alternating minimization of the augmented Lagrangian function. All of the resulting subproblems decoupled by variable splitting admit explicit solutions obtained by applying the alternating minimization method and a generalized p-shrinkage mapping. In addition, approximate solutions that can be easily and quickly computed through the fast Fourier transform are derived using the proximal point method to reduce the cost of the inner subproblems. The accuracy and efficiency of the proposed method are qualitatively and quantitatively evaluated on simulated and real data to validate its efficiency and feasibility. Overall, the proposed method exhibits reasonable performance and outperforms the original TGV-based method when applied to few-view problems. PMID:26901410
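
    The element-wise generalized p-shrinkage mapping can be sketched as follows (a Chartrand-style form; for p = 1 it reduces to ordinary soft thresholding). The small epsilon guarding the fractional power at zero entries is an implementation detail added here.

      import numpy as np

      def p_shrink(x, lam, p):
          """Element-wise generalized p-shrinkage: soft thresholding for p = 1,
          weaker shrinkage of large entries for p < 1 (stronger sparsity)."""
          mag = np.abs(x)
          thresh = lam ** (2 - p) * (mag + 1e-12) ** (p - 1)
          return np.sign(x) * np.maximum(mag - thresh, 0.0)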

  16. Two reduced form air quality modeling techniques for rapidly calculating pollutant mitigation potential across many sources, locations and precursor emission types

    EPA Science Inventory

    Due to the computational cost of running regional-scale numerical air quality models, reduced form models (RFM) have been proposed as computationally efficient simulation tools for characterizing the pollutant response to many different types of emission reductions. The U.S. Envi...

  17. Development of a Reduced-Order Three-Dimensional Flow Model for Thermal Mixing and Stratification Simulation during Reactor Transients

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hu, Rui

    2017-09-03

    Mixing, thermal-stratification, and mass transport phenomena in large pools or enclosures play major roles in the safety of reactor systems. Depending on the fidelity requirement and computational resources, various modeling methods, from the 0-D perfect mixing model to 3-D Computational Fluid Dynamics (CFD) models, are available. Each is associated with its own advantages and shortcomings. It is very desirable to develop an advanced and efficient thermal mixing and stratification modeling capability embedded in a modern system analysis code to improve the accuracy of reactor safety analyses and to reduce modeling uncertainties. An advanced system analysis tool, SAM, is being developed at Argonne National Laboratory for advanced non-LWR reactor safety analysis. While SAM is being developed as a system-level modeling and simulation tool, a reduced-order three-dimensional module is under development to model the multi-dimensional flow and thermal mixing and stratification in large enclosures of reactor systems. This paper provides an overview of the three-dimensional finite element flow model in SAM, including the governing equations, stabilization scheme, and solution methods. Additionally, several verification and validation tests are presented, including lid-driven cavity flow, natural convection inside a cavity, and laminar flow in a channel of parallel plates. Based on comparisons with analytical solutions and experimental results, it is demonstrated that the developed 3-D fluid model performs very well for a wide range of flow problems.

  18. Computationally efficient stochastic optimization using multiple realizations

    NASA Astrophysics Data System (ADS)

    Bayer, P.; Bürger, C. M.; Finkel, M.

    2008-02-01

    The presented study is concerned with computationally efficient methods for solving stochastic optimization problems involving multiple equally probable realizations of uncertain parameters. A new and straightforward technique is introduced that is based on dynamically ordering the stack of realizations during the search procedure. The rationale is that a small number of critical realizations govern the output of a reliability-based objective function. Using a problem typical of water-supply well-field design, several variants of this "stack ordering" approach are tested. The results are statistically assessed in terms of optimality and nominal reliability. This study demonstrates that simply ordering a set of 500 realizations while applying an evolutionary search algorithm can save about half of the model runs without compromising the optimization procedure. More advanced variants of stack ordering can, if properly configured, save more than 97% of the computational effort that would be required if the entire set of realizations were considered. The findings herein are promising for similar problems in water management and reliability-based design in general, and particularly for non-convex problems that require heuristic search techniques.
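
    One simple variant of the stack-ordering idea can be sketched as follows: a realization that rejects a candidate design is promoted to the top of the stack, so later candidates hit critical realizations first and fail fast. The savings accrue on rejected designs, since accepted designs must still be checked against the full stack; simulate(design, r) is a hypothetical feasibility check.

      def passes_all(design, realizations, simulate):
          """Test a candidate design against the stack of realizations in order;
          the realization that rejects the design is promoted to the top."""
          for i, r in enumerate(realizations):
              if not simulate(design, r):
                  realizations.insert(0, realizations.pop(i))   # promote r
                  return False
          return True    # feasible on every realization (full stack evaluated)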

  19. An adaptive grid to improve the efficiency and accuracy of modelling underwater noise from shipping

    NASA Astrophysics Data System (ADS)

    Trigg, Leah; Chen, Feng; Shapiro, Georgy; Ingram, Simon; Embling, Clare

    2017-04-01

    Underwater noise from shipping is becoming a significant concern and has been listed as a pollutant under Descriptor 11 of the Marine Strategy Framework Directive. Underwater noise models are an essential tool to assess and predict noise levels for regulatory procedures such as environmental impact assessments and ship noise monitoring. There are generally two approaches to noise modelling. The first is based on simplified energy flux models, assuming either spherical or cylindrical propagation of sound energy. These models are very quick but they ignore important water column and seabed properties, and produce significant errors in areas subject to temperature stratification (Shapiro et al., 2014). The second type of model (e.g. ray-tracing and parabolic equation) is based on an advanced physical representation of sound propagation. However, these acoustic propagation models are computationally expensive to execute. Shipping noise modelling requires spatial discretization in order to group noise sources together using a grid. A uniform grid size is often selected to achieve either the greatest efficiency (i.e. speed of computations) or the greatest accuracy. In contrast, this work aims to produce efficient and accurate noise level predictions by presenting an adaptive grid where cell size varies with distance from the receiver. The spatial range over which a certain cell size is suitable was determined by calculating the distance from the receiver at which propagation loss becomes uniform across a grid cell. The computational efficiency and accuracy of the resulting adaptive grid were tested by comparing it to uniform 1 km and 5 km grids. These represent an accurate and a computationally efficient grid, respectively. For a case study of the Celtic Sea, an application of the adaptive grid over an area of 160×160 km reduced the number of model executions required from 25600 for the 1 km grid to 5356 in December and to between 5056 and 13132 in August, which represents a 2- to 5-fold increase in efficiency. The 5 km grid reduces the number of model executions further to 1024. However, over the first 25 km the 5 km grid produces errors of up to 13.8 dB when compared to the highly accurate but inefficient 1 km grid. The newly developed adaptive grid generates much smaller errors of less than 0.5 dB while demonstrating high computational efficiency. Our results show that the adaptive grid provides the ability to retain the accuracy of noise level predictions and improve the efficiency of the modelling process. This can help safeguard sensitive marine ecosystems from noise pollution by improving the underwater noise predictions that inform management activities. References Shapiro, G., Chen, F., Thain, R., 2014. The Effect of Ocean Fronts on Acoustic Wave Propagation in a Shallow Sea, Journal of Marine Systems, 139: 217 - 226. http://dx.doi.org/10.1016/j.jmarsys.2014.06.007.

  20. Physics and Computational Methods for X-ray Scatter Estimation and Correction in Cone-Beam Computed Tomography

    NASA Astrophysics Data System (ADS)

    Bootsma, Gregory J.

    X-ray scatter in cone-beam computed tomography (CBCT) is known to reduce image quality by introducing image artifacts, reducing contrast, and limiting computed tomography (CT) number accuracy. The extent of the effect of x-ray scatter on CBCT image quality is determined by the shape and magnitude of the scatter distribution in the projections. A method to allay the effects of scatter is imperative to enable application of CBCT to solve a wider domain of clinical problems. The work contained herein proposes such a method. A characterization of the scatter distribution through the use of a validated Monte Carlo (MC) model is carried out. The effects of imaging parameters and compensators on the scatter distribution are investigated. The spectral frequency components of the scatter distribution in CBCT projection sets are analyzed using Fourier analysis and found to reside predominately in the low frequency domain. The exact frequency extents of the scatter distribution are explored for different imaging configurations and patient geometries. Based on the Fourier analysis it is hypothesized that the scatter distribution can be represented by a finite sum of sine and cosine functions. The fitting of MC scatter distribution estimates enables the reduction of the MC computation time by diminishing the number of photon tracks required by over three orders of magnitude. The fitting method is incorporated into a novel scatter correction method using an algorithm that simultaneously combines multiple MC scatter simulations. Running concurrent MC simulations while simultaneously fitting the results allows for the physical accuracy and flexibility of MC methods to be maintained while enhancing the overall efficiency. CBCT projection set scatter estimates, using the algorithm, are computed on the order of 1-2 minutes instead of hours or days. Resulting scatter-corrected reconstructions show a reduction in artifacts and improvement in tissue contrast and voxel value accuracy.

  1. Reduced-cost linear-response CC2 method based on natural orbitals and natural auxiliary functions

    PubMed Central

    Mester, Dávid

    2017-01-01

    A reduced-cost density fitting (DF) linear-response second-order coupled-cluster (CC2) method has been developed for the evaluation of excitation energies. The method is based on the simultaneous truncation of the molecular orbital (MO) basis and the auxiliary basis set used for the DF approximation. For the reduction of the size of the MO basis, state-specific natural orbitals (NOs) are constructed for each excited state using the average of the second-order Møller–Plesset (MP2) and the corresponding configuration interaction singles with perturbative doubles [CIS(D)] density matrices. After removing the NOs of low occupation number, natural auxiliary functions (NAFs) are constructed [M. Kállay, J. Chem. Phys. 141, 244113 (2014)], and the NAF basis is also truncated. Our results show that, for a triple-zeta basis set, about 60% of the virtual MOs can be dropped, while the size of the fitting basis can be reduced by a factor of five. This results in a dramatic reduction of the computational costs of the solution of the CC2 equations, which are in our approach about as expensive as the evaluation of the MP2 and CIS(D) density matrices. All in all, an average speedup of more than an order of magnitude can be achieved at the expense of a mean absolute error of 0.02 eV in the calculated excitation energies compared to the canonical CC2 results. Our benchmark calculations demonstrate that the new approach enables the efficient computation of CC2 excitation energies for excited states of all types of medium-sized molecules composed of up to 100 atoms with triple-zeta quality basis sets. PMID:28527453

  2. A CW FFAG for Proton Computed Tomography

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Johnstone, C.; Neuffer, D. V.; Snopok, P.

    2012-05-01

    An advantage of the cyclotron in proton therapy is the continuous (CW) beam output which reduces complexity and response time in the dosimetry requirements and beam controls. A CW accelerator requires isochronous particle orbits at all energies through the acceleration cycle, and present compact isochronous cyclotrons for proton therapy reach only 250 MeV (kinetic energy), which is required for patient treatment but low for full Proton Computed Tomography (PCT) capability. PCT specifications need 300-330 MeV in order for protons to transit the human body. Recent innovations in nonscaling FFAG design have achieved isochronous performance in a compact (~3 m radius) design at these higher energies. Preliminary isochronous designs are presented here. Lower energy beams can be efficiently extracted for patient treatment without changes to the acceleration cycle and magnet currents.

  3. Differential Evolution algorithm applied to FSW model calibration

    NASA Astrophysics Data System (ADS)

    Idagawa, H. S.; Santos, T. F. A.; Ramirez, A. J.

    2014-03-01

    Friction Stir Welding (FSW) is a solid state welding process that can be modelled using a Computational Fluid Dynamics (CFD) approach. These models use adjustable parameters to control the heat transfer and the heat input to the weld. These parameters are used to calibrate the model and they are generally determined using the conventional trial and error approach. Since this method is not very efficient, we used the Differential Evolution (DE) algorithm to successfully determine these parameters. In order to improve the success rate and to reduce the computational cost of the method, this work studied different characteristics of the DE algorithm, such as the evolution strategy, the objective function, the mutation scaling factor and the crossover rate. The DE algorithm was tested using a friction stir weld performed on a UNS S32205 Duplex Stainless Steel.
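
    For readers unfamiliar with DE, a minimal DE/rand/1/bin optimizer is sketched below. The mutation scaling factor F and crossover rate CR are the knobs studied in the paper; the quadratic misfit and target values are hypothetical stand-ins for the CFD-versus-experiment calibration error.

```python
import numpy as np

def differential_evolution(objective, bounds, pop_size=20, F=0.8, CR=0.9,
                           generations=200, seed=0):
    """Minimal DE/rand/1/bin minimizer (generic sketch, not the FSW code)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds).T
    dim = len(bounds)
    pop = rng.uniform(lo, hi, (pop_size, dim))
    cost = np.array([objective(p) for p in pop])
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = pop[rng.choice([j for j in range(pop_size) if j != i],
                                     3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)          # mutation
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True                    # keep one gene
            trial = np.where(cross, mutant, pop[i])            # crossover
            f = objective(trial)
            if f < cost[i]:                                    # greedy selection
                pop[i], cost[i] = trial, f
    best = int(np.argmin(cost))
    return pop[best], cost[best]

# Toy usage: calibrate two "heat input" parameters of a mock model.
target = np.array([0.3, 0.7])
mock_misfit = lambda p: float(np.sum((p - target) ** 2))
p_best, f_best = differential_evolution(mock_misfit, [(0, 1), (0, 1)])
print(p_best, f_best)
```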

  4. Operating room integration and telehealth.

    PubMed

    Bucholz, Richard D; Laycock, Keith A; McDurmont, Leslie

    2011-01-01

    The increasing use of advanced automated and computer-controlled systems and devices in surgical procedures has resulted in problems arising from the crowding of the operating room with equipment and the incompatible control and communication standards associated with each system. This lack of compatibility between systems and centralized control means that the surgeon is frequently required to interact with multiple computer interfaces in order to obtain updates and exert control over the various devices at his disposal. To reduce this complexity and provide the surgeon with more complete and precise control of the operating room systems, a unified interface and communication network has been developed. In addition to improving efficiency, this network also allows the surgeon to grant remote access to consultants and observers at other institutions, enabling experts to participate in the procedure without having to travel to the site.

  5. A Parametric Geometry Computational Fluid Dynamics (CFD) Study Utilizing Design of Experiments (DOE)

    NASA Technical Reports Server (NTRS)

    Rhew, Ray D.; Parker, Peter A.

    2007-01-01

    Design of Experiments (DOE) was applied to the LAS geometric parameter study to efficiently identify and rank the primary contributors to integrated drag over the vehicle's ascent trajectory using an order of magnitude fewer CFD configurations, thereby reducing computational resources and solution time. Subject matter experts (SMEs) were able to gain a better understanding of the underlying flow physics of different geometric parameter configurations through the identification of interaction effects. An interaction effect, which describes how the effect of one factor changes with respect to the levels of other factors, is often the key to product optimization. A DOE approach emphasizes sequential learning through successive experimentation, continuously building on previous knowledge. These studies represent a starting point for expanded experimental activities that will eventually cover the entire design space of the vehicle and flight trajectory.
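
    A minimal sketch of how main and two-way interaction effects are estimated from a two-level full factorial design follows; the factor names and mock drag model are hypothetical, not the LAS parameters.

```python
import numpy as np
from itertools import product, combinations

def factorial_effects(response, names):
    """Main and two-way interaction effects from a 2^k full factorial design.

    Levels are coded -1/+1; an effect is twice the mean of the response
    times the contrast column (i.e., the -1 -> +1 change)."""
    k = len(names)
    runs = np.array(list(product([-1, 1], repeat=k)))   # all 2^k coded runs
    y = np.array([response(r) for r in runs])
    effects = {n: 2 * np.mean(y * runs[:, i]) for i, n in enumerate(names)}
    for (i, a), (j, b) in combinations(enumerate(names), 2):
        effects[f"{a}*{b}"] = 2 * np.mean(y * runs[:, i] * runs[:, j])
    return effects

# Toy "drag" model with a strong nose-length/diameter interaction.
def mock_drag(r):
    nose, diam, taper = r
    return 10 + 2 * nose + 1 * diam + 1.5 * nose * diam + 0.1 * taper

print(factorial_effects(mock_drag, ["nose", "diam", "taper"]))
```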

  6. Development of an Output-based Adaptive Method for Multi-Dimensional Euler and Navier-Stokes Simulations

    NASA Technical Reports Server (NTRS)

    Darmofal, David L.

    2003-01-01

    The use of computational simulations in the prediction of complex aerodynamic flows is becoming increasingly prevalent in the design process within the aerospace industry. Continuing advancements in both computing technology and algorithmic development are ultimately leading to attempts at simulating ever-larger, more complex problems. However, by increasing the reliance on computational simulations in the design cycle, we must also increase the accuracy of these simulations in order to maintain or improve the reliability and safety of the resulting aircraft. At the same time, large-scale computational simulations must be made more affordable so that their potential benefits can be fully realized within the design cycle. Thus, a continuing need exists for increasing the accuracy and efficiency of computational algorithms such that computational fluid dynamics can become a viable tool in the design of more reliable, safer aircraft. The objective of this research was the development of an error estimation and grid adaptive strategy for reducing simulation errors in integral outputs (functionals) such as lift or drag from multi-dimensional Euler and Navier-Stokes simulations. In this final report, we summarize our work during this grant.

  7. Fast sweeping methods for hyperbolic systems of conservation laws at steady state II

    NASA Astrophysics Data System (ADS)

    Engquist, Björn; Froese, Brittany D.; Tsai, Yen-Hsi Richard

    2015-04-01

    The idea of using fast sweeping methods for solving stationary systems of conservation laws has previously been proposed for efficiently computing solutions with sharp shocks. We further develop these methods to allow for a more challenging class of problems including problems with sonic points, shocks originating in the interior of the domain, rarefaction waves, and two-dimensional systems. We show that fast sweeping methods can produce higher-order accuracy. Computational results validate the claims of accuracy, sharp shock curves, and optimal computational efficiency.

  8. Structure-guided Protein Transition Modeling with a Probabilistic Roadmap Algorithm.

    PubMed

    Maximova, Tatiana; Plaku, Erion; Shehu, Amarda

    2016-07-07

    Proteins are macromolecules in perpetual motion, switching between structural states to modulate their function. A detailed characterization of the precise yet complex relationship between protein structure, dynamics, and function requires elucidating transitions between functionally-relevant states. Doing so challenges both wet and dry laboratories, as protein dynamics involves disparate temporal scales. In this paper we present a novel, sampling-based algorithm to compute transition paths. The algorithm exploits two main ideas. First, it leverages known structures to initialize its search and define a reduced conformation space for rapid sampling. This is key to addressing the insufficient-sampling issue suffered by sampling-based algorithms. Second, the algorithm embeds samples in a nearest-neighbor graph where transition paths can be efficiently computed via queries. The algorithm adapts the probabilistic roadmap framework that is popular in robot motion planning. In addition to efficiently computing lowest-cost paths between any given structures, the algorithm allows investigating hypotheses regarding the order of experimentally-known structures in a transition event. This novel contribution is likely to open up new avenues of research. Detailed analysis is presented on multiple-basin proteins of relevance to human disease. Multiscaling and the AMBER ff14SB force field are used to obtain energetically-credible paths at atomistic detail.
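
    The roadmap idea can be sketched generically: embed samples in a k-nearest-neighbor graph with edge costs derived from an energy function, then answer path queries with Dijkstra's algorithm. Everything below (toy coordinates, toy energy, edge-cost formula) is an assumption for illustration, not the paper's pipeline.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.csgraph import dijkstra
from scipy.spatial import cKDTree

def roadmap_path(samples, energy, k=8, start=0, goal=1):
    """kNN roadmap over conformation samples; lowest-cost path via Dijkstra.

    Edge cost = Euclidean step + mean endpoint "energy" (a toy choice)."""
    n = len(samples)
    dist, idx = cKDTree(samples).query(samples, k=k + 1)  # col 0: node itself
    w = lil_matrix((n, n))
    for i in range(n):
        for step, j in zip(dist[i, 1:], idx[i, 1:]):
            w[i, j] = step + 0.5 * (energy(samples[i]) + energy(samples[j]))
    costs, pred = dijkstra(w.tocsr(), directed=False, indices=start,
                           return_predecessors=True)
    path, node = [goal], goal
    while node != start and pred[node] >= 0:              # walk predecessors
        node = pred[node]
        path.append(node)
    return path[::-1], costs[goal]

# Toy usage: 200 mock conformations in a 3-D reduced coordinate space.
rng = np.random.default_rng(2)
pts = rng.normal(size=(200, 3))
path, total = roadmap_path(pts, lambda x: float(np.sum(x ** 2)))
print("hops:", len(path) - 1, "cost:", round(total, 3))
```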

  9. Numerical simulation of disperse particle flows on a graphics processing unit

    NASA Astrophysics Data System (ADS)

    Sierakowski, Adam J.

    In both nature and technology, we commonly encounter solid particles being carried within fluid flows, from dust storms to sediment erosion and from food processing to energy generation. The motion of uncountably many particles in highly dynamic flow environments characterizes the tremendous complexity of such phenomena. While methods exist for the full-scale numerical simulation of such systems, current computational capabilities require the simplification of the numerical task with significant approximation using closure models widely recognized as insufficient. There is therefore a fundamental need for the investigation of the underlying physical processes governing these disperse particle flows. In the present work, we develop a new tool based on the Physalis method for the first-principles numerical simulation of thousands of particles (a small fraction of an entire disperse particle flow system) in order to assist in the search for new reduced-order closure models. We discuss numerous enhancements to the efficiency and stability of the Physalis method, which introduces the influence of spherical particles to a fixed-grid incompressible Navier-Stokes flow solver using a local analytic solution to the flow equations. Our first-principles investigation demands the modeling of unresolved length and time scales associated with particle collisions. We introduce a collision model alongside Physalis, incorporating lubrication effects and proposing a new nonlinearly damped Hertzian contact model. By reproducing experimental studies from the literature, we document extensive validation of the methods. We discuss the implementation of our methods for massively parallel computation using a graphics processing unit (GPU). We combine Eulerian grid-based algorithms with Lagrangian particle-based algorithms to achieve computational throughput up to 90 times faster than the legacy implementation of Physalis for a single central processing unit. By avoiding all data communication between the GPU and the host system during the simulation, we utilize with great efficacy the GPU hardware with which many high performance computing systems are currently equipped. We conclude by looking forward to the future of Physalis with multi-GPU parallelization in order to perform resolved disperse flow simulations of more than 100,000 particles and further advance the development of reduced-order closure models.

  10. Toward an optimal solver for time-spectral fluid-dynamic and aeroelastic solutions on unstructured meshes

    NASA Astrophysics Data System (ADS)

    Mundis, Nathan L.; Mavriplis, Dimitri J.

    2017-09-01

    The time-spectral method applied to the Euler and coupled aeroelastic equations theoretically offers significant computational savings for purely periodic problems when compared to standard time-implicit methods. However, attaining superior efficiency with time-spectral methods over traditional time-implicit methods hinges on the ability rapidly to solve the large non-linear system resulting from time-spectral discretizations which become larger and stiffer as more time instances are employed or the period of the flow becomes especially short (i.e. the maximum resolvable wave-number increases). In order to increase the efficiency of these solvers, and to improve robustness, particularly for large numbers of time instances, the Generalized Minimal Residual Method (GMRES) is used to solve the implicit linear system over all coupled time instances. The use of GMRES as the linear solver makes time-spectral methods more robust, allows them to be applied to a far greater subset of time-accurate problems, including those with a broad range of harmonic content, and vastly improves the efficiency of time-spectral methods. In previous work, a wave-number independent preconditioner that mitigates the increased stiffness of the time-spectral method when applied to problems with large resolvable wave numbers has been developed. This preconditioner, however, directly inverts a large matrix whose size increases in proportion to the number of time instances. As a result, the computational time of this method scales as the cube of the number of time instances. In the present work, this preconditioner has been reworked to take advantage of an approximate-factorization approach that effectively decouples the spatial and temporal systems. Once decoupled, the time-spectral matrix can be inverted in frequency space, where it has entries only on the main diagonal and therefore can be inverted quite efficiently. This new GMRES/preconditioner combination is shown to be over an order of magnitude more efficient than the previous wave-number independent preconditioner for problems with large numbers of time instances and/or large reduced frequencies.
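
    The key fact exploited by the reworked preconditioner, that the time-spectral derivative operator is diagonal in frequency space and therefore trivially invertible there, can be checked numerically in a few lines. This is a sketch of the underlying linear algebra, not the paper's solver.

```python
import numpy as np

N = 9                                            # odd number of time instances
k = np.fft.fftfreq(N, d=1.0 / N)                 # wave numbers 0..4, -4..-1
F = np.fft.fft(np.eye(N), axis=0)                # DFT matrix
Finv = np.fft.ifft(np.eye(N), axis=0)            # inverse DFT matrix

# Time-spectral derivative operator in the time domain: D = F^-1 diag(ik) F.
D = (Finv @ np.diag(1j * k) @ F).real            # real-valued for odd N

# The same operator is diagonal in frequency space, so it inverts in O(N):
D_freq = F @ D @ Finv
print("max off-diagonal:", np.abs(D_freq - np.diag(np.diag(D_freq))).max())

# Sanity check: exact differentiation of a resolvable harmonic.
t = 2 * np.pi * np.arange(N) / N
print(np.allclose(D @ np.sin(3 * t), 3 * np.cos(3 * t)))   # True
```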

  11. The Application of Multiobjective Evolutionary Algorithms to an Educational Computational Model of Science Information Processing: A Computational Experiment in Science Education

    ERIC Educational Resources Information Center

    Lamb, Richard L.; Firestone, Jonah B.

    2017-01-01

    Conflicting explanations and unrelated information in science classrooms increase cognitive load and decrease efficiency in learning. This reduced efficiency ultimately limits one's ability to solve reasoning problems in the science. In reasoning, it is the ability of students to sift through and identify critical pieces of information that is of…

  12. A new approach to road accident rescue.

    PubMed

    Morales, Alejandro; González-Aguilera, Diego; López, Alfonso I; Gutiérrez, Miguel A

    2016-01-01

    This article develops and validates a new methodology and tool for rescue assistance in traffic accidents, with the aim of improving efficiency and safety in the evacuation of people and reducing the number of victims of road accidents. Different tests supported by professionals and experts have been designed under different circumstances and with different categories of damaged vehicles coming from real accidents and simulated trapped victims, in order to calibrate and refine the proposed methodology and tool. To validate this new approach, a tool called App_Rescue has been developed. This tool is based on a computer system that allows efficient access to the technical information of the vehicle and the health information of its usual passengers. The time spent during rescue using the standard protocol and the proposed method was compared. This rescue assistance system makes vital information accessible in posttrauma care services, improving the effectiveness of interventions by the emergency services, reducing the rescue time and therefore minimizing the consequences involved and the number of victims. This could often mean saving lives. In the different simulated rescue operations, the rescue time was reduced by an average of 14%.

  13. An 81.6 μW FastICA processor for epileptic seizure detection.

    PubMed

    Yang, Chia-Hsiang; Shih, Yi-Hsin; Chiueh, Herming

    2015-02-01

    To improve the performance of epileptic seizure detection, independent component analysis (ICA) is applied to multi-channel signals to separate artifacts from signals of interest. FastICA is an efficient algorithm for computing ICA. To reduce energy dissipation, eigenvalue decomposition (EVD) is utilized in the preprocessing stage to reduce the convergence time of the iterative calculation of ICA components. EVD is computed efficiently through an array of processing elements running in parallel. An area-efficient EVD architecture is realized by leveraging the approximate Jacobi algorithm, leading to a 77.2% area reduction. By choosing a proper memory element and a reduced wordlength, the power and area of the storage memory are reduced by 95.6% and 51.7%, respectively. The chip area is minimized through fixed-point implementation and architectural transformations. Given a latency constraint of 0.1 s, an 86.5% area reduction is achieved compared to the direct-mapped architecture. Fabricated in 90 nm CMOS, the core area of the chip is 0.40 mm². The FastICA processor, part of an integrated epileptic control SoC, dissipates 81.6 μW at 0.32 V. The computation delay for a frame of 256 samples over 8 channels is 84.2 ms. Compared to prior work, 0.5% power dissipation, 26.7% silicon area, and a 3.4× computation speedup are achieved. The performance of the chip was verified on a human dataset.
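
    A textbook software version of the EVD-whitened FastICA iteration is sketched below for orientation; the chip realizes a fixed-point, hardware-reduced variant of these same two stages, and all names and data below are illustrative assumptions.

```python
import numpy as np

def fastica_one_unit(X, max_iter=200, tol=1e-8, seed=0):
    """One-unit FastICA with EVD whitening (tanh nonlinearity)."""
    X = X - X.mean(axis=1, keepdims=True)          # center channels
    d, E = np.linalg.eigh(np.cov(X))               # EVD of the covariance
    Z = (E @ np.diag(d ** -0.5) @ E.T) @ X         # whitened data
    rng = np.random.default_rng(seed)
    w = rng.normal(size=Z.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(max_iter):                      # fixed-point iteration
        g = np.tanh(w @ Z)
        w_new = (Z * g).mean(axis=1) - (1 - g ** 2).mean() * w
        w_new /= np.linalg.norm(w_new)
        converged = abs(abs(w_new @ w) - 1) < tol  # converged up to sign
        w = w_new
        if converged:
            break
    return w @ Z                                   # one recovered component

# Toy demo: pull one source out of a 2-channel mixture.
rng = np.random.default_rng(3)
t = np.linspace(0, 1, 2000)
S = np.vstack([np.sign(np.sin(2 * np.pi * 13 * t)), rng.laplace(size=t.size)])
component = fastica_one_unit(np.array([[1.0, 0.6], [0.4, 1.0]]) @ S)
print(component[:4])
```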

  14. Higher-order ice-sheet modelling accelerated by multigrid on graphics cards

    NASA Astrophysics Data System (ADS)

    Brædstrup, Christian; Egholm, David

    2013-04-01

    Higher-order ice flow modelling is a computationally intensive process, owing primarily to the nonlinear influence of the horizontal stress coupling. When applied to simulating long-term glacial landscape evolution, ice-sheet models must consider very long time series, while both high temporal and spatial resolution are needed to resolve small effects. Higher-order and full-Stokes models have therefore seen very limited use in this field. However, recent advances in graphics card (GPU) technology for high performance computing have proven extremely efficient in accelerating many large-scale scientific computations. General purpose GPU (GPGPU) technology is cheap, has low power consumption, and fits into a normal desktop computer. It could therefore provide a powerful tool for many glaciologists working on ice flow models. Our current research focuses on utilising the GPU as a tool in ice-sheet and glacier modelling. To this end we have implemented the Integrated Second-Order Shallow Ice Approximation (iSOSIA) equations on the device using the finite difference method. To accelerate the computations, the GPU solver uses a non-linear Red-Black Gauss-Seidel iterator coupled with a Full Approximation Scheme (FAS) multigrid setup to further aid convergence. The GPU finite difference implementation provides the inherent parallelization that scales from hundreds to several thousands of cores on newer cards. We demonstrate the efficiency of the GPU multigrid solver using benchmark experiments.
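
    The smoother at the heart of the solver can be sketched with a linear stand-in: red-black ordering splits the Gauss-Seidel update into two half-sweeps whose cells are mutually independent, which is what maps well onto thousands of GPU threads. The Poisson problem below is an illustrative simplification of the nonlinear iSOSIA equations.

```python
import numpy as np

def red_black_gauss_seidel(u, f, h, sweeps=1):
    """Red-Black Gauss-Seidel smoothing for the 2D Poisson problem.

    Each half-sweep touches only one "color"; those updates are mutually
    independent and therefore parallelize well on a GPU."""
    for _ in range(sweeps):
        for parity in (0, 1):                    # red cells, then black cells
            for i in range(1, u.shape[0] - 1):
                for j in range(1, u.shape[1] - 1):
                    if (i + j) % 2 == parity:
                        u[i, j] = 0.25 * (u[i - 1, j] + u[i + 1, j] +
                                          u[i, j - 1] + u[i, j + 1]
                                          - h * h * f[i, j])
    return u

# Toy usage; in a FAS multigrid cycle this smoother runs on every grid level.
n = 65
u = red_black_gauss_seidel(np.zeros((n, n)), np.ones((n, n)),
                           h=1.0 / (n - 1), sweeps=50)
print(u.max())
```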

  15. A fast and efficient segmentation scheme for cell microscopic image.

    PubMed

    Lebrun, G; Charrier, C; Lezoray, O; Meurie, C; Cardot, H

    2007-04-27

    Microscopic cellular image segmentation schemes must be efficient for reliable analysis and fast enough to process huge quantities of images. Recent studies have focused on improving segmentation quality. Several segmentation schemes achieve good quality, but their processing time is too expensive to deal with a great number of images per day. For segmentation schemes based on pixel classification, the classifier design is crucial, since classification requires most of the processing time needed to segment an image. The main contribution of this work concerns reducing the complexity of decision functions produced by support vector machines (SVM) while preserving the recognition rate. Vector quantization is used to reduce the inherent redundancy present in huge pixel databases (i.e., images with expert pixel segmentation). Hybrid color space design is also used to improve both the data set size reduction rate and the recognition rate. A new decision function quality criterion is defined to select a good trade-off between the recognition rate and the processing time of the pixel decision function. The first results of this study show that fast and efficient pixel classification with SVM is possible; moreover, posterior class pixel probabilities are easy to estimate with Platt's method. A new segmentation scheme using probabilistic pixel classification has then been developed. This scheme has several free parameters whose selection must be automated, but existing criteria for evaluating segmentation quality are not well adapted to cell segmentation, especially when comparison with an expert pixel segmentation is required. Another important contribution of this paper is therefore the definition of a new quality criterion for the evaluation of cell segmentation. The results presented here show that selecting the free parameters of the segmentation scheme by optimisation of this new cell segmentation quality criterion produces efficient cell segmentation.
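
    The data-reduction idea can be sketched with generic tools: vector-quantize the expert-labeled pixel database (k-means here) and train the SVM on the codewords only, with Platt scaling for posterior probabilities. The names, mock pixel data, and parameters below are assumptions, not the paper's tuned setup.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def quantize_and_train(pixels, labels, n_codewords=64, seed=0):
    """Replace a huge labeled pixel set with per-class codewords, then
    train the SVM on the codewords only (far fewer support vectors)."""
    Xq, yq = [], []
    for cls in np.unique(labels):
        X_cls = pixels[labels == cls]
        k = min(n_codewords, len(X_cls))
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X_cls)
        Xq.append(km.cluster_centers_)           # per-class codewords
        yq.append(np.full(k, cls))
    svm = SVC(kernel="rbf", probability=True)    # Platt scaling for posteriors
    return svm.fit(np.vstack(Xq), np.concatenate(yq))

# Toy usage with mock "cell" vs "background" RGB pixels.
rng = np.random.default_rng(4)
fg = rng.normal([0.7, 0.5, 0.4], 0.05, (5000, 3))
bg = rng.normal([0.2, 0.3, 0.6], 0.05, (5000, 3))
X, y = np.vstack([fg, bg]), np.r_[np.ones(5000), np.zeros(5000)]
model = quantize_and_train(X, y)
print(model.predict_proba(X[:2]))
```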

  16. Searching the Force Field Electrostatic Multipole Parameter Space.

    PubMed

    Jakobsen, Sofie; Jensen, Frank

    2016-04-12

    We show by tensor decomposition analyses that the molecular electrostatic potential for amino acid peptide models has an effective rank less than twice the number of atoms. This rank indicates the number of parameters that can be derived from the electrostatic potential in a statistically significant way. Using this as a guideline, we investigate different strategies for deriving a reduced set of atomic charges, dipoles, and quadrupoles capable of reproducing the reference electrostatic potential with a low error. A full combinatorial search of selected parameter subspaces for N-methylacetamide and a cysteine peptide model indicates that there are many different parameter sets capable of providing errors close to that of the global minimum. Among the different reduced multipole parameter sets that have low errors, there is consensus that atoms involved in π-bonding require higher order multipole moments. The possible correlation between multipole parameters is investigated by exhaustive searches of combinations of up to four parameters distributed in all possible ways on all possible atomic sites. These analyses show that there is no advantage in considering combinations of multipoles compared to a simple approach where the importance of each multipole moment is evaluated sequentially. When combined with possible weighting factors related to the computational efficiency of each type of multipole moment, this may provide a systematic strategy for determining a computationally efficient representation of the electrostatic component in force field calculations.

  17. An Efficient Location Verification Scheme for Static Wireless Sensor Networks.

    PubMed

    Kim, In-Hwan; Kim, Bo-Sung; Song, JooSeok

    2017-01-24

    In wireless sensor networks (WSNs), the accuracy of location information is vital to support many interesting applications. Unfortunately, sensors have difficulty in estimating their location when malicious sensors attack the location estimation process. Even though secure localization schemes have been proposed to protect location estimation process from attacks, they are not enough to eliminate the wrong location estimations in some situations. The location verification can be the solution to the situations or be the second-line defense. The problem of most of the location verifications is the explicit involvement of many sensors in the verification process and requirements, such as special hardware, a dedicated verifier and the trusted third party, which causes more communication and computation overhead. In this paper, we propose an efficient location verification scheme for static WSN called mutually-shared region-based location verification (MSRLV), which reduces those overheads by utilizing the implicit involvement of sensors and eliminating several requirements. In order to achieve this, we use the mutually-shared region between location claimant and verifier for the location verification. The analysis shows that MSRLV reduces communication overhead by 77% and computation overhead by 92% on average, when compared with the other location verification schemes, in a single sensor verification. In addition, simulation results for the verification of the whole network show that MSRLV can detect the malicious sensors by over 90% when sensors in the network have five or more neighbors.

  18. An Efficient Location Verification Scheme for Static Wireless Sensor Networks

    PubMed Central

    Kim, In-hwan; Kim, Bo-sung; Song, JooSeok

    2017-01-01

    In wireless sensor networks (WSNs), the accuracy of location information is vital to support many interesting applications. Unfortunately, sensors have difficulty in estimating their location when malicious sensors attack the location estimation process. Even though secure localization schemes have been proposed to protect location estimation process from attacks, they are not enough to eliminate the wrong location estimations in some situations. The location verification can be the solution to the situations or be the second-line defense. The problem of most of the location verifications is the explicit involvement of many sensors in the verification process and requirements, such as special hardware, a dedicated verifier and the trusted third party, which causes more communication and computation overhead. In this paper, we propose an efficient location verification scheme for static WSN called mutually-shared region-based location verification (MSRLV), which reduces those overheads by utilizing the implicit involvement of sensors and eliminating several requirements. In order to achieve this, we use the mutually-shared region between location claimant and verifier for the location verification. The analysis shows that MSRLV reduces communication overhead by 77% and computation overhead by 92% on average, when compared with the other location verification schemes, in a single sensor verification. In addition, simulation results for the verification of the whole network show that MSRLV can detect the malicious sensors by over 90% when sensors in the network have five or more neighbors. PMID:28125007

  19. Real-time solution of linear computational problems using databases of parametric reduced-order models with arbitrary underlying meshes

    NASA Astrophysics Data System (ADS)

    Amsallem, David; Tezaur, Radek; Farhat, Charbel

    2016-12-01

    A comprehensive approach for real-time computations using a database of parametric, linear, projection-based reduced-order models (ROMs) based on arbitrary underlying meshes is proposed. In the offline phase of this approach, the parameter space is sampled and linear ROMs defined by linear reduced operators are pre-computed at the sampled parameter points and stored. Then, these operators and associated ROMs are transformed into counterparts that satisfy a certain notion of consistency. In the online phase of this approach, a linear ROM is constructed in real-time at a queried but unsampled parameter point by interpolating the pre-computed linear reduced operators on matrix manifolds and therefore computing an interpolated linear ROM. The proposed overall model reduction framework is illustrated with two applications: a parametric inverse acoustic scattering problem associated with a mockup submarine, and a parametric flutter prediction problem associated with a wing-tank system. The second application is implemented on a mobile device, illustrating the capability of the proposed computational framework to operate in real-time.
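
    As a highly simplified illustration of interpolating reduced operators on a matrix manifold, the sketch below handles the symmetric positive-definite case over a one-dimensional parameter: map to the tangent space with the matrix logarithm, interpolate linearly, and map back with the exponential. The paper's framework is more general and first transforms the pre-computed ROMs into consistent counterparts; everything here is an assumption for illustration.

```python
import numpy as np
from scipy.linalg import expm, logm

def interp_spd_operator(mus, ops, mu):
    """Interpolate SPD reduced operators at an unsampled parameter point:
    linear interpolation in the tangent space (matrix log), then exp back."""
    i = int(np.clip(np.searchsorted(mus, mu), 1, len(mus) - 1))
    w = (mu - mus[i - 1]) / (mus[i] - mus[i - 1])   # weight between neighbors
    L = (1 - w) * logm(ops[i - 1]) + w * logm(ops[i])
    return expm(L)

# Toy usage: SPD "reduced stiffness" matrices sampled at three parameter points.
mus = np.array([0.0, 0.5, 1.0])
base = np.array([[2.0, 0.3], [0.3, 1.0]])
ops = [base * (1.0 + m) for m in mus]
print(interp_spd_operator(mus, ops, 0.3))
```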

  20. Efficient storage, computation, and exposure of computer-generated holograms by electron-beam lithography.

    PubMed

    Newman, D M; Hawley, R W; Goeckel, D L; Crawford, R D; Abraham, S; Gallagher, N C

    1993-05-10

    An efficient storage format was developed for computer-generated holograms for use in electron-beam lithography. This method employs run-length encoding and Lempel-Ziv-Welch compression and succeeds in exposing holograms that were previously infeasible owing to the hologram's tremendous pattern-data file size. These holograms also require significant computation; thus the algorithm was implemented on a parallel computer, which improved performance by 2 orders of magnitude. The decompression algorithm was integrated into the Cambridge electron-beam machine's front-end processor. Although this provides a much-needed capability, some hardware enhancements will be required in the future to overcome inadequacies in the current front-end processor that result in a lengthy exposure time.
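
    Textbook versions of the two compression stages named above are sketched below; hologram pattern data contains long runs of repeated values, which is why the combination pays off. The toy pattern is an assumption for illustration, not real pattern-data.

```python
def rle_encode(data: bytes):
    """Run-length encode as (count, byte) pairs; counts capped at 255."""
    out, i = [], 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i] and j - i < 255:
            j += 1
        out.append((j - i, data[i]))
        i = j
    return out

def lzw_encode(data: bytes):
    """Textbook LZW: emit dictionary codes for the longest known prefix."""
    table = {bytes([b]): b for b in range(256)}   # seed with all single bytes
    out, prefix = [], b""
    for byte in data:
        candidate = prefix + bytes([byte])
        if candidate in table:
            prefix = candidate
        else:
            out.append(table[prefix])
            table[candidate] = len(table)         # grow the dictionary
            prefix = bytes([byte])
    if prefix:
        out.append(table[prefix])
    return out

# Toy stand-in for run-heavy pattern data.
pattern = bytes([0] * 500 + [255] * 500) * 10
print(len(pattern), "bytes ->", len(rle_encode(pattern)), "RLE pairs,",
      len(lzw_encode(pattern)), "LZW codes")
```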

  1. [Developing a home care nursing information system by utilizing wire-wireless network and mobile computing system].

    PubMed

    Park, Jung-Ho; Park, Sung-Ae; Yoon, Soon-Nyoung; Kang, Sung-Rye

    2004-04-01

    The purpose of this study was to develop a home care nursing network system for operating home care effectively and efficiently, by utilizing a wire-wireless network and mobile computing to record and send patients' data in real time, and by connecting the headquarters office and the local offices with home care nurses over the Internet. It complements the preceding research from 1999 by adding home care nursing standard guidelines and upgrading the PDA program. Method/1 and prototyping were adopted to develop the main network system. The detailed research process is as follows: 1) home care nursing standard guidelines for diabetes, cancer and peritoneal dialysis were added to the 12 domains of nursing problem fields with nursing assessment/intervention algorithms; 2) the PDA program was streamlined by removing and merging unnecessary and duplicated paths in the home care nursing algorithm. The PDA system was also upgraded to hardware that integrates the PDA and the data transmission modem on a CDMA-1X basis, in order to reduce transmission errors and failures.

  2. A simple algorithm to improve the performance of the WENO scheme on non-uniform grids

    NASA Astrophysics Data System (ADS)

    Huang, Wen-Feng; Ren, Yu-Xin; Jiang, Xiong

    2018-02-01

    This paper presents a simple approach for improving the performance of the weighted essentially non-oscillatory (WENO) finite volume scheme on non-uniform grids. This technique relies on the reformulation of the fifth-order WENO-JS (WENO scheme presented by Jiang and Shu in J. Comput. Phys. 126:202-228, 1995) scheme designed on uniform grids in terms of one cell-averaged value and its left and/or right interfacial values of the dependent variable. The effect of grid non-uniformity is taken into consideration by a proper interpolation of the interfacial values. On non-uniform grids, the proposed scheme is much more accurate than the original WENO-JS scheme, which was designed for uniform grids. When the grid is uniform, the resulting scheme reduces to the original WENO-JS scheme. In the meantime, the proposed scheme is computationally much more efficient than the fifth-order WENO scheme designed specifically for the non-uniform grids. A number of numerical test cases are simulated to verify the performance of the present scheme.

  3. When cloud computing meets bioinformatics: a review.

    PubMed

    Zhou, Shuigeng; Liao, Ruiqi; Guan, Jihong

    2013-10-01

    In the past decades, with the rapid development of high-throughput technologies, biology research has generated an unprecedented amount of data. In order to store and process such a great amount of data, cloud computing and MapReduce were applied to many fields of bioinformatics. In this paper, we first introduce the basic concepts of cloud computing and MapReduce, and their applications in bioinformatics. We then highlight some problems challenging the applications of cloud computing and MapReduce to bioinformatics. Finally, we give a brief guideline for using cloud computing in biology research.

  4. Computational Science: A Research Methodology for the 21st Century

    NASA Astrophysics Data System (ADS)

    Orbach, Raymond L.

    2004-03-01

    Computational simulation - a means of scientific discovery that employs computer systems to simulate a physical system according to laws derived from theory and experiment - has attained peer status with theory and experiment. Important advances in basic science are accomplished by a new "sociology" for ultrascale scientific computing capability (USSCC), a fusion of sustained advances in scientific models, mathematical algorithms, computer architecture, and scientific software engineering. Expansion of current capabilities by factors of 100-1000 opens up new vistas for scientific discovery: long-term climatic variability and change, macroscopic material design from correlated behavior at the nanoscale, design and optimization of magnetic confinement fusion reactors, strong interactions on a computational lattice through quantum chromodynamics, and stellar explosions and element production. The "virtual prototype," made possible by this expansion, can markedly reduce time-to-market for industrial applications such as jet engines and safer, more fuel-efficient, cleaner cars. In order to develop USSCC, the National Energy Research Scientific Computing Center (NERSC) announced the competition "Innovative and Novel Computational Impact on Theory and Experiment" (INCITE), with no requirement for current DOE sponsorship. Fifty-nine proposals for grand-challenge scientific problems were submitted for a small number of awards. The successful grants, and their preliminary progress, will be described.

  5. A high-order spatial filter for a cubed-sphere spectral element model

    NASA Astrophysics Data System (ADS)

    Kang, Hyun-Gyu; Cheong, Hyeong-Bin

    2017-04-01

    A high-order spatial filter is developed for the spectral-element-method dynamical core on the cubed-sphere grid, which employs the Gauss-Lobatto Lagrange interpolating polynomials (GLLIP) as orthogonal basis functions. The filter equation is the high-order Helmholtz equation, which corresponds to implicit time-differencing of a diffusion equation employing a high-order Laplacian. The Laplacian operator is discretized within a cell, the building block of the cubed-sphere grid, which consists of the Gauss-Lobatto grid. When discretizing a high-order Laplacian, the requirement of C0 continuity along the cell boundaries means that grid points in neighboring cells must be used for the target cell; the number of neighboring cells involved is nearly quadratically proportional to the filter order. The discrete Helmholtz equation yields a huge, highly sparse N×N matrix equation, with N the total number of grid points on the globe. The number of nonzero entries is also in almost quadratic proportion to the filter order. Filtering is accomplished by solving this huge matrix equation. While requiring significant computing time, the solution of the global matrix provides a filtered field free of discontinuity along the cell boundaries. To achieve computational efficiency and accuracy at the same time, the matrix equation was solved by accounting for only a finite number of adjacent cells; this is called a local-domain filter. It was shown that, to remove numerical noise near the grid scale, including 5×5 cells in the local-domain filter was sufficient, giving the same accuracy as the global-domain solution while reducing the computing time to a considerably lower level. The high-order filter was evaluated using standard test cases including the baroclinic instability of the zonal flow. Results indicated that the filter performs better in removing grid-scale numerical noise than explicit high-order viscosity. It was also shown that the filter can be easily implemented on distributed-memory parallel computers with desirable scalability.

  6. Structural mode significance using INCA. [Interactive Controls Analysis computer program

    NASA Technical Reports Server (NTRS)

    Bauer, Frank H.; Downing, John P.; Thorpe, Christopher J.

    1990-01-01

    Structural finite element models are often too large to be used in the design and analysis of control systems. Model reduction techniques must be applied to reduce the structural model to manageable size. In the past, engineers either performed the model order reduction by hand or used distinct computer programs to retrieve the data, to perform the significance analysis and to reduce the order of the model. To expedite this process, the latest version of INCA has been expanded to include an interactive graphical structural mode significance and model order reduction capability.

  7. A Vision System For A Mars Rover

    NASA Astrophysics Data System (ADS)

    Wilcox, Brian H.; Gennery, Donald B.; Mishkin, Andrew H.; Cooper, Brian K.; Lawton, Teri B.; Lay, N. Keith; Katzmann, Steven P.

    1987-01-01

    A Mars rover must be able to sense its local environment with sufficient resolution and accuracy to avoid local obstacles and hazards while moving a significant distance each day. Power efficiency and reliability are extremely important considerations, making stereo correlation an attractive method of range sensing compared to laser scanning, if the computational load and correspondence errors can be handled. Techniques for treatment of these problems, including the use of more than two cameras to reduce correspondence errors and possibly to limit the computational burden of stereo processing, have been tested at JPL. Once a reliable range map is obtained, it must be transformed to a plan view and compared to a stored terrain database, in order to refine the estimated position of the rover and to improve the database. The slope and roughness of each terrain region are computed, which form the basis for a traversability map allowing local path planning. Ongoing research and field testing of such a system is described.

  8. Efficient electromagnetic source imaging with adaptive standardized LORETA/FOCUSS.

    PubMed

    Schimpf, Paul H; Liu, Hesheng; Ramon, Ceon; Haueisen, Jens

    2005-05-01

    Functional brain imaging and source localization based on the scalp's potential field require a solution to an ill-posed inverse problem with many solutions. This makes it necessary to incorporate a priori knowledge in order to select a particular solution. A computational challenge for some subject-specific head models is that many inverse algorithms require a comprehensive sampling of the candidate source space at the desired resolution. In this study, we present an algorithm that can accurately reconstruct details of localized source activity from a sparse sampling of the candidate source space. Forward computations are minimized through an adaptive procedure that increases source resolution as the spatial extent is reduced. With this algorithm, we were able to compute inverses using only 6% to 11% of the full resolution lead-field, with a localization accuracy that was not significantly different than an exhaustive search through a fully-sampled source space. The technique is, therefore, applicable for use with anatomically-realistic, subject-specific forward models for applications with spatially concentrated source activity.

  9. A vision system for a Mars rover

    NASA Technical Reports Server (NTRS)

    Wilcox, Brian H.; Gennery, Donald B.; Mishkin, Andrew H.; Cooper, Brian K.; Lawton, Teri B.; Lay, N. Keith; Katzmann, Steven P.

    1988-01-01

    A Mars rover must be able to sense its local environment with sufficient resolution and accuracy to avoid local obstacles and hazards while moving a significant distance each day. Power efficiency and reliability are extremely important considerations, making stereo correlation an attractive method of range sensing compared to laser scanning, if the computational load and correspondence errors can be handled. Techniques for treatment of these problems, including the use of more than two cameras to reduce correspondence errors and possibly to limit the computational burden of stereo processing, have been tested at JPL. Once a reliable range map is obtained, it must be transformed to a plan view and compared to a stored terrain database, in order to refine the estimated position of the rover and to improve the database. The slope and roughness of each terrain region are computed, which form the basis for a traversability map allowing local path planning. Ongoing research and field testing of such a system is described.

  10. Fast and Accurate Circuit Design Automation through Hierarchical Model Switching.

    PubMed

    Huynh, Linh; Tagkopoulos, Ilias

    2015-08-21

    In computer-aided biological design, the trifecta of characterized part libraries, accurate models and optimal design parameters is crucial for producing reliable designs. As the number of parts and model complexity increase, however, it becomes exponentially more difficult for any optimization method to search the solution space, hence creating a trade-off that hampers efficient design. To address this issue, we present a hierarchical computer-aided design architecture that uses a two-step approach for biological design. First, a simple model of low computational complexity is used to predict circuit behavior and assess candidate circuit branches through branch-and-bound methods. Then, a complex, nonlinear circuit model is used for a fine-grained search of the reduced solution space, thus achieving more accurate results. Evaluation with a benchmark of 11 circuits and a library of 102 experimental designs with known characterization parameters demonstrates a speed-up of 3 orders of magnitude when compared to other design methods that provide optimality guarantees.

  11. Accelerating image reconstruction in dual-head PET system by GPU and symmetry properties.

    PubMed

    Chou, Cheng-Ying; Dong, Yun; Hung, Yukai; Kao, Yu-Jiun; Wang, Weichung; Kao, Chien-Min; Chen, Chin-Tu

    2012-01-01

    Positron emission tomography (PET) is an important imaging modality in both clinical usage and research studies. We have developed a compact high-sensitivity PET system consisting of two large-area panel PET detector heads, which produce more than 224 million lines of response and thus impose dramatic computational demands. In this work, we employed a state-of-the-art graphics processing unit (GPU), NVIDIA Tesla C2070, to yield an efficient reconstruction process. Our approaches ingeniously integrate the distinguished features of the symmetry properties of the imaging system and GPU architectures, including block/warp/thread assignments and effective memory usage, to accelerate the computations for ordered subset expectation maximization (OSEM) image reconstruction. The OSEM reconstruction algorithms were implemented employing both CPU-based and GPU-based codes, and their computational performance was quantitatively analyzed and compared. The results showed that the GPU-accelerated scheme can drastically reduce the reconstruction time and thus can largely expand the applicability of the dual-head PET system.
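
    For orientation, a dense, matrix-form OSEM update is sketched below; the paper's contribution lies in mapping this update onto GPU blocks/warps/threads and exploiting detector symmetries, none of which is shown here. The random system matrix is an illustrative stand-in.

```python
import numpy as np

def osem(A, y, n_subsets=4, n_iter=5):
    """Ordered-subset EM for emission tomography, dense matrix form.

    A: system matrix (lines of response x voxels); y: measured counts."""
    x = np.ones(A.shape[1])                       # flat initial image
    subsets = np.array_split(np.arange(A.shape[0]), n_subsets)
    for _ in range(n_iter):
        for rows in subsets:                      # one sub-iteration per subset
            As = A[rows]
            ratio = y[rows] / np.maximum(As @ x, 1e-12)
            x *= (As.T @ ratio) / np.maximum(As.T @ np.ones(len(rows)), 1e-12)
    return x

# Toy usage with a random nonnegative system matrix.
rng = np.random.default_rng(5)
A = rng.random((400, 100))
x_true = rng.random(100)
y = rng.poisson(A @ x_true).astype(float)
x_rec = osem(A, y)
print("relative error:", np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true))
```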

  12. Energy Efficiency Challenges of 5G Small Cell Networks.

    PubMed

    Ge, Xiaohu; Yang, Jing; Gharavi, Hamid; Sun, Yang

    2017-05-01

    The deployment of a large number of small cells poses new challenges to energy efficiency, which has often been ignored in fifth generation (5G) cellular networks. While massive multiple-input multiple-output (MIMO) will reduce the transmission power at the expense of higher computational cost, the question remains as to which computation or transmission power is more important in the energy efficiency of 5G small cell networks. Thus, the main objective in this paper is to investigate the computation power based on the Landauer principle. Simulation results reveal that more than 50% of the energy is consumed by the computation power at 5G small cell base stations (BSs). Moreover, the computation power of a 5G small cell BS can approach 800 watts when massive MIMO (e.g., 128 antennas) is deployed to transmit high volume traffic. This clearly indicates that computation power optimization can play a major role in the energy efficiency of small cell networks.

  13. Energy Efficiency Challenges of 5G Small Cell Networks

    PubMed Central

    Ge, Xiaohu; Yang, Jing; Gharavi, Hamid; Sun, Yang

    2017-01-01

    The deployment of a large number of small cells poses new challenges to energy efficiency, which has often been ignored in fifth generation (5G) cellular networks. While massive multiple-input multiple-output (MIMO) will reduce the transmission power at the expense of higher computational cost, the question remains as to which computation or transmission power is more important in the energy efficiency of 5G small cell networks. Thus, the main objective in this paper is to investigate the computation power based on the Landauer principle. Simulation results reveal that more than 50% of the energy is consumed by the computation power at 5G small cell base stations (BSs). Moreover, the computation power of a 5G small cell BS can approach 800 watts when massive MIMO (e.g., 128 antennas) is deployed to transmit high volume traffic. This clearly indicates that computation power optimization can play a major role in the energy efficiency of small cell networks. PMID:28757670

  14. Rapid performance modeling and parameter regression of geodynamic models

    NASA Astrophysics Data System (ADS)

    Brown, J.; Duplyakin, D.

    2016-12-01

    Geodynamic models run in a parallel environment have many parameters with complicated effects on performance and scientifically-relevant functionals. Manually choosing an efficient machine configuration and mapping out the parameter space requires a great deal of expert knowledge and time-consuming experiments. We propose an active learning technique based on Gaussian Process Regression to automatically select experiments to map out the performance landscape with respect to scientific and machine parameters. The resulting performance model is then used to select optimal experiments for improving the accuracy of a reduced order model per unit of computational cost. We present the framework and evaluate its quality and capability using popular lithospheric dynamics models.
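
    A minimal version of such an active-learning loop might look as follows: always run the candidate configuration where the Gaussian process is most uncertain. run_experiment, the candidate grid, and the kernel choice are hypothetical stand-ins for timed geodynamics solves, not the authors' framework.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def active_learn(run_experiment, candidates, n_init=5, budget=20, seed=0):
    """Uncertainty-driven experiment selection with a GP performance model."""
    rng = np.random.default_rng(seed)
    picked = list(rng.choice(len(candidates), n_init, replace=False))
    X = candidates[picked]
    y = np.array([run_experiment(x) for x in X])
    gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(),
                                  normalize_y=True)
    while len(y) < budget:
        gp.fit(X, y)
        _, std = gp.predict(candidates, return_std=True)
        std[picked] = -np.inf                      # never repeat an experiment
        nxt = int(np.argmax(std))                  # highest predictive variance
        picked.append(nxt)
        X = np.vstack([X, candidates[nxt]])
        y = np.append(y, run_experiment(candidates[nxt]))
    return gp.fit(X, y)

# Toy usage: "runtime" as an unknown function of two machine parameters.
cands = np.random.default_rng(6).random((200, 2))
mock_runtime = lambda p: float(np.sin(6 * p[0]) + p[1] ** 2)
model = active_learn(mock_runtime, cands)
```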

  15. The constraint method: A new finite element technique. [applied to static and dynamic loads on plates

    NASA Technical Reports Server (NTRS)

    Tsai, C.; Szabo, B. A.

    1973-01-01

    An approach to the finite element method which utilizes families of conforming finite elements based on complete polynomials is presented. Finite element approximations based on this method converge with respect to progressively reduced element sizes as well as with respect to progressively increasing orders of approximation. Numerical results for static and dynamic applications to plates are presented to demonstrate the efficiency of the method. Comparisons are made with plate elements in NASTRAN and the high-precision plate element developed by Cowper and his co-workers. Some consideration is given to the implementation of the constraint method in general-purpose computer programs such as NASTRAN.

  16. Reduced-order modeling of piezoelectric energy harvesters with nonlinear circuits under complex conditions

    NASA Astrophysics Data System (ADS)

    Xiang, Hong-Jun; Zhang, Zhi-Wei; Shi, Zhi-Fei; Li, Hong

    2018-04-01

    A fully coupled modeling approach for piezoelectric energy harvesters is developed in this work, based on the use of robust finite element packages and efficient reduced-order modeling techniques. First, the harvester is modeled using a finite element package. The dynamic equilibrium equations of the harvester are rebuilt by extracting the system matrices from the finite element model using built-in commands, without any additional tools. A Krylov-subspace-based scheme is then applied to obtain a reduced-order model that improves simulation efficiency while preserving the key features of the harvester. Co-simulation of the reduced-order model with nonlinear energy harvesting circuits is achieved at the system level. Several examples, covering both harmonic response and transient response analysis, are conducted to validate the present approach. The proposed approach improves simulation efficiency by several orders of magnitude. Moreover, the parameters used in the equivalent circuit model can be conveniently obtained by the proposed eigenvector-based model order reduction technique. More importantly, this work establishes a methodology for modeling piezoelectric energy harvesters with complicated mechanical geometries, nonlinear circuits, and complex input loads. The method can be employed by harvester designers to optimize mechanical structures or by circuit designers to develop novel energy harvesting circuits.
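
    A one-sided Krylov (moment-matching) reduction of a linear state-space system, the general technique named above, can be sketched as follows; the random matrices are stand-ins, not matrices extracted from a finite element package, and the real method couples the result to a nonlinear circuit simulation.

```python
import numpy as np

def krylov_rom(A, B, C, r):
    """One-sided Krylov reduction of x' = Ax + Bu, y = Cx (moments at s=0).

    Builds an orthonormal basis V of span{A^-1 B, A^-2 B, ...} and projects;
    the reduced model matches the first r transfer-function moments."""
    v = np.linalg.solve(A, B)
    V = [v / np.linalg.norm(v)]
    for _ in range(r - 1):                       # Arnoldi-style recurrence
        w = np.linalg.solve(A, V[-1])
        for q in V:                              # modified Gram-Schmidt
            w -= (q @ w) * q
        V.append(w / np.linalg.norm(w))
    V = np.array(V).T                            # n x r basis
    return V.T @ A @ V, V.T @ B, C @ V, V

# Toy usage: reduce a random stable system from order 200 to order 10.
rng = np.random.default_rng(7)
n = 200
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n)) / np.sqrt(n)
B, C = rng.standard_normal(n), rng.standard_normal(n)
Ar, Br, Cr, V = krylov_rom(A, B, C, r=10)
# DC gains agree by construction: C A^-1 B vs Cr Ar^-1 Br.
print(C @ np.linalg.solve(A, B), Cr @ np.linalg.solve(Ar, Br))
```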

  17. Fluid dynamic characteristics of the VentrAssist rotary blood pump.

    PubMed

    Tansley, G; Vidakovic, S; Reizes, J

    2000-06-01

    The VentrAssist pump has no shaft or seal, and the device is unique in design because the rotor is suspended passively by hydrodynamic forces, and urging is accomplished by an integrated direct current motor rotor that also acts as the pump impeller. This device has led to many challenges in its fluidic design, namely large flow-blockage from impeller blades, low stiffness of bearings with concomitant impeller displacement under pulsatile load conditions, and very small running clearances. Low specific speed and radial blade off-flow were selected in order to minimize the hemolysis. Pulsatile and steady-flow tests show the impeller is stable under normal operating conditions. Computational fluid dynamics (CFD) has been used to optimize flow paths and reduce net axial force imbalance to acceptably small values. The latest design of the pump achieved a system efficiency of 18% (in 30% hematocrit of red blood cells suspended in phosphate-buffered saline), and efficiency was optimized over the range of operating conditions. Parameters critical to improving pump efficiency were investigated.

  18. Multidisciplinary optimization of aeroservoelastic systems using reduced-size models

    NASA Technical Reports Server (NTRS)

    Karpel, Mordechay

    1992-01-01

    Efficient analytical and computational tools for simultaneous optimal design of the structural and control components of aeroservoelastic systems are presented. The optimization objective is to achieve aircraft performance requirements and sufficient flutter and control stability margins with a minimal weight penalty and without violating the design constraints. Analytical sensitivity derivatives facilitate an efficient optimization process which allows a relatively large number of design variables. Standard finite element and unsteady aerodynamic routines are used to construct a modal data base. Minimum State aerodynamic approximations and dynamic residualization methods are used to construct a high accuracy, low order aeroservoelastic model. Sensitivity derivatives of flutter dynamic pressure, control stability margins and control effectiveness with respect to structural and control design variables are presented. The performance requirements are utilized by equality constraints which affect the sensitivity derivatives. A gradient-based optimization algorithm is used to minimize an overall cost function. A realistic numerical example of a composite wing with four controls is used to demonstrate the modeling technique, the optimization process, and their accuracy and efficiency.

  19. Efficient convolutional sparse coding

    DOEpatents

    Wohlberg, Brendt

    2017-06-20

    Computationally efficient algorithms may be applied for fast dictionary learning solving the convolutional sparse coding problem in the Fourier domain. More specifically, efficient convolutional sparse coding may be derived within an alternating direction method of multipliers (ADMM) framework that utilizes fast Fourier transforms (FFT) to solve the main linear system in the frequency domain. Such algorithms may enable a significant reduction in computational cost over conventional approaches by implementing a linear solver for the most critical and computationally expensive component of the conventional iterative algorithm. The theoretical computational cost of the algorithm may be reduced from O(M³N) to O(MN log N), where N is the dimensionality of the data and M is the number of elements in the dictionary. This significant improvement in efficiency may greatly increase the range of problems that can practically be addressed via convolutional sparse representations.
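
    The frequency-domain trick can be shown in its simplest setting, a single 1D filter: in the ADMM x-update, circular convolution diagonalizes under the FFT, so the linear system reduces to a per-frequency scalar division. This is a sketch of the idea under those simplifying assumptions, not the patented multi-filter solver.

```python
import numpy as np

def csc_admm_1filter(d, s, lam=0.1, rho=1.0, n_iter=100):
    """ADMM for single-filter convolutional sparse coding.

    x-update solved exactly in the Fourier domain; z-update is soft
    thresholding; u is the scaled dual variable."""
    N = len(s)
    Df = np.fft.fft(d, N)                      # zero-padded filter spectrum
    Sf = np.fft.fft(s)
    z, u = np.zeros(N), np.zeros(N)
    for _ in range(n_iter):
        rhs = np.conj(Df) * Sf + rho * np.fft.fft(z - u)
        Xf = rhs / (np.abs(Df) ** 2 + rho)     # per-frequency scalar solve
        x = np.real(np.fft.ifft(Xf))
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0)
        u += x - z
    return z

# Toy usage: recover a sparse activation map from its convolution with d.
rng = np.random.default_rng(8)
d = np.exp(-0.5 * (np.arange(16) - 8) ** 2 / 4)   # small Gaussian filter
x_true = np.zeros(256); x_true[[30, 90, 200]] = [1.0, -0.7, 0.5]
s = np.real(np.fft.ifft(np.fft.fft(d, 256) * np.fft.fft(x_true)))
x_hat = csc_admm_1filter(d, s, lam=0.05)
print(np.argsort(np.abs(x_hat))[-3:])             # largest recovered spikes
```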

  20. An adaptive sparse-grid high-order stochastic collocation method for Bayesian inference in groundwater reactive transport modeling

    NASA Astrophysics Data System (ADS)

    Zhang, Guannan; Lu, Dan; Ye, Ming; Gunzburger, Max; Webster, Clayton

    2013-10-01

    Bayesian analysis has become vital to uncertainty quantification in groundwater modeling, but its application has been hindered by the computational cost associated with the numerous model executions required to explore the posterior probability density function (PPDF) of model parameters. This is particularly the case when the PPDF is estimated using Markov Chain Monte Carlo (MCMC) sampling. In this study, a new approach is developed to improve the computational efficiency of Bayesian inference by constructing a surrogate of the PPDF, using an adaptive sparse-grid high-order stochastic collocation (aSG-hSC) method. Unlike previous works using a first-order hierarchical basis, this paper utilizes a compactly supported higher-order hierarchical basis to construct the surrogate system, resulting in a significant reduction in the number of required model executions. In addition, using the hierarchical surplus as an error indicator allows locally adaptive refinement of sparse grids in the parameter space, which further improves computational efficiency. To efficiently build the surrogate system for a PPDF with multiple significant modes, optimization techniques are used to identify the modes, for which high-probability regions are defined and components of the aSG-hSC approximation are constructed. After the surrogate is determined, the PPDF can be evaluated by sampling the surrogate system directly without model execution, resulting in improved efficiency of the surrogate-based MCMC compared with conventional MCMC. The developed method is evaluated using two synthetic groundwater reactive transport models. The first example involves coupled linear reactions and demonstrates the accuracy of our high-order hierarchical basis approach in approximating a high-dimensional posterior distribution. The second example is highly nonlinear because of the reactions of uranium surface complexation, and demonstrates how the iterative aSG-hSC method is able to capture multimodal and non-Gaussian features of the PPDF caused by model nonlinearity. Both experiments show that aSG-hSC is an effective and efficient tool for Bayesian inference.

  1. Further reduction of minimal first-met bad markings for the computationally efficient synthesis of a maximally permissive controller

    NASA Astrophysics Data System (ADS)

    Liu, GaiYun; Chao, Daniel Yuh

    2015-08-01

    To date, research on the supervisor design for flexible manufacturing systems focuses on speeding up the computation of optimal (maximally permissive) liveness-enforcing controllers. Recent deadlock prevention policies for systems of simple sequential processes with resources (S3PR) reduce the computation burden by considering only the minimal portion of all first-met bad markings (FBMs). Maximal permissiveness is ensured by not forbidding any live state. This paper proposes a method to further reduce the size of minimal set of FBMs to efficiently solve integer linear programming problems while maintaining maximal permissiveness using a vector-covering approach. This paper improves the previous work and achieves the simplest structure with the minimal number of monitors.

  2. Reduced Order Model Implementation in the Risk-Informed Safety Margin Characterization Toolkit

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mandelli, Diego; Smith, Curtis L.; Alfonsi, Andrea

    2015-09-01

    The RISMC project aims to develop new advanced simulation-based tools to perform Probabilistic Risk Analysis (PRA) for the existing fleet of U.S. nuclear power plants (NPPs). These tools numerically model not only the thermo-hydraulic behavior of the reactor primary and secondary systems but also the temporal evolution of external events and component/system ageing. Thus, this is not only a multi-physics problem but also a multi-scale problem (both spatial, µm-mm-m, and temporal, ms-s-minutes-years). As part of the RISMC PRA approach, a large number of computationally expensive simulation runs is required. An important aspect is that, even though computational power is steadily growing, the overall computational cost of a RISMC analysis may not be viable in certain cases. A solution being evaluated is the use of reduced-order modeling techniques. During FY2015, we investigated and applied reduced-order modeling techniques to decrease the RISMC analysis computational cost by decreasing the number of simulation runs to perform and by employing surrogate models instead of the actual simulation codes. This report focuses on reduced-order modeling techniques that can be applied to any RISMC analysis to generate, analyze and visualize data. In particular, we focus on surrogate models that approximate the simulation results in a much shorter time (µs instead of hours/days). We apply reduced-order and surrogate modeling techniques to several RISMC types of analyses using RAVEN and RELAP-7 and show the advantages that can be gained.

  3. Projected role of advanced computational aerodynamic methods at the Lockheed-Georgia company

    NASA Technical Reports Server (NTRS)

    Lores, M. E.

    1978-01-01

    Experience with advanced computational methods being used at the Lockheed-Georgia Company to aid in the evaluation and design of new and modified aircraft indicates that large and specialized computers will be needed to make advanced three-dimensional viscous aerodynamic computations practical. The Numerical Aerodynamic Simulation Facility should provide a tool for designing better aerospace vehicles while at the same time reducing development costs, by performing computations using Navier-Stokes solution algorithms and by permitting less sophisticated but nevertheless complex calculations to be made efficiently. Configuration definition procedures and data output formats can probably best be defined in cooperation with industry; the computer should therefore handle many remote terminals efficiently. The capability of transferring data to and from other computers needs to be provided. Because of the significant amount of input and output associated with 3-D viscous flow calculations, and because of the exceedingly fast computation speed envisioned for the computer, special attention should be paid to providing rapid, diversified, and efficient input and output.

  4. Simulation of a turbofan engine for evaluation of multivariable optimal control concepts [computerized simulation]

    NASA Technical Reports Server (NTRS)

    Seldner, K.

    1976-01-01

    The development of control systems for jet engines requires a real-time computer simulation. The simulation provides an effective tool for evaluating control concepts and problem areas prior to actual engine testing. The development and use of a real-time simulation of the Pratt and Whitney F100-PW100 turbofan engine is described. The simulation was used in a multivariable optimal controls research program based on linear quadratic regulator (LQR) theory. The simulation is used to generate linear engine models at selected operating points and to evaluate the control algorithm. To reduce the complexity of the design, it is desirable to reduce the order of the linear model; a technique for doing so is discussed, and selected results from the high- and low-order models are compared. The LQR control algorithms can be programmed on a digital computer, which will control the engine simulation over the desired flight envelope.
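
    The record does not say which order-reduction technique was used; balanced truncation is one standard choice for linear state-space models of this kind. The sketch below applies it to a random stable system as a stand-in for a linearized engine model:

    ```python
    import numpy as np
    from scipy.linalg import cholesky, solve_continuous_lyapunov, svd

    # Hedged sketch: balanced truncation of a stable LTI model
    # x' = A x + B u, y = C x down to r states. Matrices are random
    # stand-ins, not an actual engine model.

    def balanced_truncation(A, B, C, r):
        # Controllability/observability Gramians: A P + P A' = -B B', etc.
        P = solve_continuous_lyapunov(A, -B @ B.T)
        Q = solve_continuous_lyapunov(A.T, -C.T @ C)
        # Square-root balancing via the SVD of the Gramian factors.
        Lp, Lq = cholesky(P, lower=True), cholesky(Q, lower=True)
        U, s, Vt = svd(Lq.T @ Lp)
        S = np.diag(s[:r] ** -0.5)
        T = Lp @ Vt[:r].T @ S          # reduced state -> full state
        Ti = S @ U[:, :r].T @ Lq.T     # full state -> reduced state
        return Ti @ A @ T, Ti @ B, C @ T

    rng = np.random.default_rng(1)
    A = rng.standard_normal((6, 6)) - 6.0 * np.eye(6)   # shifted to be stable
    B, C = rng.standard_normal((6, 1)), rng.standard_normal((1, 6))
    Ar, Br, Cr = balanced_truncation(A, B, C, r=2)
    print(Ar.shape, Br.shape, Cr.shape)   # (2, 2) (2, 1) (1, 2)
    ```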

  5. Parameterized reduced-order models using hyper-dual numbers.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fike, Jeffrey A.; Brake, Matthew Robert

    2013-10-01

    The goal of most computational simulations is to accurately predict the behavior of a real, physical system. Accurate predictions often require very computationally expensive analyses, and so reduced order models (ROMs) are commonly used. ROMs aim to reduce the computational cost of the simulations while still providing accurate results by including all of the salient physics of the real system in the ROM. However, real, physical systems often deviate from the idealized models used in simulations due to variations in manufacturing or other factors. One approach to this issue is to create a parameterized model in order to characterize the effect of perturbations from the nominal model on the behavior of the system. This report presents a methodology for developing parameterized ROMs, which is based on Craig-Bampton component mode synthesis and the use of hyper-dual numbers to calculate the derivatives necessary for the parameterization.
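
    As a concrete illustration of the derivative mechanism (a toy sketch, not Sandia's implementation), hyper-dual numbers x = a + b*e1 + c*e2 + d*e1*e2 with e1^2 = e2^2 = 0 propagate exact first and second derivatives through ordinary arithmetic:

    ```python
    # Toy hyper-dual arithmetic; only + and * are implemented here.

    class HyperDual:
        def __init__(self, a, b=0.0, c=0.0, d=0.0):
            self.a, self.b, self.c, self.d = a, b, c, d

        def __add__(self, o):
            o = o if isinstance(o, HyperDual) else HyperDual(o)
            return HyperDual(self.a + o.a, self.b + o.b,
                             self.c + o.c, self.d + o.d)

        def __mul__(self, o):
            o = o if isinstance(o, HyperDual) else HyperDual(o)
            return HyperDual(
                self.a * o.a,
                self.a * o.b + self.b * o.a,
                self.a * o.c + self.c * o.a,
                self.a * o.d + self.b * o.c + self.c * o.b + self.d * o.a)

    # Seed b = c = 1 at x = 2 to differentiate f(x) = x^3 exactly.
    x = HyperDual(2.0, 1.0, 1.0)
    f = x * x * x
    print(f.a, f.b, f.d)   # 8.0 (value), 12.0 (f'), 12.0 (f'')
    ```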

  6. Efficient Feature Selection and Classification of Protein Sequence Data in Bioinformatics

    PubMed Central

    Faye, Ibrahima; Samir, Brahim Belhaouari; Md Said, Abas

    2014-01-01

    Bioinformatics has been an emerging area of research for the last three decades. Its ultimate aims are to store and manage biological data and to develop and analyze computational tools that enhance their understanding. The amount of data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for experimental methods. To reduce the gap between newly sequenced proteins and proteins with known functions, many computational techniques involving classification and clustering algorithms have been proposed. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of the large number of newly discovered proteins. Existing classification results are unsatisfactory due to the huge number of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique is proposed in order to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth. PMID:25045727
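
    The paper's specific statistical metric is not reproduced in the abstract; as a generic stand-in, the sketch below scores features with the ANOVA F-statistic and keeps only the top-scoring ones before classification (random data substitute for encoded protein sequences):

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 400))     # 200 sequences, 400 encoded features
    y = rng.integers(0, 4, size=200)        # 4 hypothetical superfamilies
    X[y == 1, :10] += 1.5                   # plant a weak signal in 10 features

    # Keep only the 40 highest-scoring features, then classify.
    X_small = SelectKBest(score_func=f_classif, k=40).fit_transform(X, y)
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    print("accuracy on reduced features:",
          cross_val_score(clf, X_small, y, cv=5).mean())
    ```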

  7. A shifted Jacobi collocation algorithm for wave type equations with non-local conservation conditions

    NASA Astrophysics Data System (ADS)

    Doha, Eid H.; Bhrawy, Ali H.; Abdelkawy, Mohammed A.

    2014-09-01

    In this paper, we propose an efficient spectral collocation algorithm for the numerical solution of wave-type equations subject to initial, boundary and non-local conservation conditions. The shifted Jacobi pseudospectral approximation is investigated for the discretization of the spatial variable of such equations, and possesses spectral accuracy in that variable. The shifted Jacobi-Gauss-Lobatto (SJ-GL) quadrature rule is established for treating the non-local conservation conditions, and the problem with its initial and non-local boundary conditions is then reduced to a system of second-order ordinary differential equations in the temporal variable. This system is solved by a two-stage fourth-order A-stable implicit Runge-Kutta scheme. Five numerical examples with comparisons are given. The computational results demonstrate that the proposed algorithm is more accurate than the finite difference method, the method of lines and the spline collocation approach.
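
    The building block of such schemes is Gauss-Jacobi quadrature; the paper's Gauss-Lobatto variant additionally includes the interval endpoints, which the interior rule sketched here omits:

    ```python
    import numpy as np
    from scipy.special import roots_jacobi

    # Gauss-Jacobi nodes and weights integrate f against the weight
    # (1-x)^alpha (1+x)^beta on [-1, 1]; shifting maps the rule to [0, L].
    alpha, beta, n = 1.0, 0.0, 12
    x, w = roots_jacobi(n, alpha, beta)

    # Check: integral of (1 - x) cos(x) over [-1, 1] equals 2 sin(1).
    print(np.sum(w * np.cos(x)), 2.0 * np.sin(1.0))
    ```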

  8. A fast algorithm for forward-modeling of gravitational fields in spherical coordinates with 3D Gauss-Legendre quadrature

    NASA Astrophysics Data System (ADS)

    Zhao, G.; Liu, J.; Chen, B.; Guo, R.; Chen, L.

    2017-12-01

    Forward modeling of gravitational fields at large scale requires considering the curvature of the Earth and evaluating Newton's volume integral in spherical coordinates. To acquire fast and accurate gravitational effects for subsurface structures, the subsurface mass distribution is usually discretized into small spherical prisms (called tesseroids), whose gravity fields are generally calculated numerically. One of the commonly used numerical methods is 3D Gauss-Legendre quadrature (GLQ). However, traditional GLQ integration suffers from low computational efficiency and relatively poor accuracy when the observation surface is close to the source region. We developed a fast and high-accuracy 3D GLQ integration based on the equivalence of kernel matrices, adaptive discretization, and parallelization using OpenMP. The kernel-matrix equivalence strategy increases efficiency and reduces memory consumption by calculating and storing the repeated elements of each kernel matrix only once. The adaptive discretization strategy is used to improve the accuracy. Numerical investigations show that the execution time of the proposed method is reduced by two orders of magnitude compared with the traditional method without these optimizations, and high accuracy is maintained no matter how close the computation points are to the source region. In addition, the algorithm reduces the memory requirement by a factor of N compared with the traditional method, where N is the number of discretizations of the source region in the longitudinal direction. This makes large-scale gravity forward modeling and inversion with a fine discretization possible.
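
    The core operation is a tensor-product Gauss-Legendre rule over each source cell. The sketch below shows the idea in Cartesian coordinates for a single cube (the paper works with spherical tesseroids, and its kernel-reuse and adaptive-subdivision optimizations are not reproduced):

    ```python
    import numpy as np

    G = 6.674e-11   # gravitational constant

    def glq_potential(obs, lo, hi, n=8):
        # Integrate G/|obs - r'| over the box [lo, hi]^3, n-point GLQ per axis.
        t, w = np.polynomial.legendre.leggauss(n)    # nodes/weights on [-1, 1]
        axes = [0.5 * (h - l) * t + 0.5 * (h + l) for l, h in zip(lo, hi)]
        jac = np.prod([0.5 * (h - l) for l, h in zip(lo, hi)])
        X, Y, Z = np.meshgrid(*axes, indexing="ij")
        W = np.einsum("i,j,k->ijk", w, w, w)
        r = np.sqrt((obs[0] - X) ** 2 + (obs[1] - Y) ** 2 + (obs[2] - Z) ** 2)
        return G * jac * np.sum(W / r)

    # Unit-density unit cube, observed from two cube-widths above its top.
    print(glq_potential(np.array([0.5, 0.5, 3.0]), [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]))
    ```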

  9. Effects of Ordering Strategies and Programming Paradigms on Sparse Matrix Computations

    NASA Technical Reports Server (NTRS)

    Oliker, Leonid; Li, Xiaoye; Husbands, Parry; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

    2002-01-01

    The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique for solving sparse linear systems that are symmetric and positive definite. For systems that are ill-conditioned, it is often necessary to use a preconditioning technique. In this paper, we investigate the effects of various ordering and partitioning strategies on the performance of parallel CG and ILU(0)-preconditioned CG (PCG) using different programming paradigms and architectures. Results show that, for this class of applications: ordering significantly improves overall performance on both distributed and distributed shared-memory systems; cache reuse may be more important than reducing communication; it is possible to achieve message-passing performance using shared-memory constructs through careful data ordering and distribution; and a hybrid MPI+OpenMP paradigm increases programming complexity with little performance gain. An implementation of CG on the Cray MTA does not require special ordering or partitioning to obtain high efficiency and scalability, giving it a distinct advantage for adaptive applications; however, it shows limited scalability for PCG due to a lack of thread-level parallelism.
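
    One widely used ordering of this kind is reverse Cuthill-McKee, shown below shrinking the bandwidth of a random symmetric sparsity pattern (illustrative; the paper's specific orderings and machines are not reproduced):

    ```python
    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.csgraph import reverse_cuthill_mckee

    def bandwidth(M):
        coo = M.tocoo()
        return int(np.max(np.abs(coo.row - coo.col)))

    # Random symmetric pattern standing in for a discretized PDE matrix.
    A = sp.random(500, 500, density=0.01, random_state=0)
    A = ((A + A.T) > 0).astype(float).tocsr()

    perm = reverse_cuthill_mckee(A, symmetric_mode=True)
    B = A[perm][:, perm]          # symmetrically permute rows and columns

    print("bandwidth before:", bandwidth(A), "after:", bandwidth(B))
    ```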

  10. Total variation regularization of the 3-D gravity inverse problem using a randomized generalized singular value decomposition

    NASA Astrophysics Data System (ADS)

    Vatankhah, Saeed; Renaut, Rosemary A.; Ardestani, Vahid E.

    2018-04-01

    We present a fast algorithm for the total variation regularization of the 3-D gravity inverse problem. Through imposition of the total variation regularization, subsurface structures with sharp discontinuities are preserved better than with a conventional minimum-structure inversion. The associated problem formulation for the regularization is nonlinear but can be solved using an iteratively reweighted least-squares algorithm. For small-scale problems, the regularized least-squares problem at each iteration can be solved using the generalized singular value decomposition. This is not feasible for large-scale, or even moderate-scale, problems. Instead we introduce the use of a randomized generalized singular value decomposition in order to reduce the dimensions of the problem and provide an effective and efficient solution technique. For further efficiency, an alternating direction algorithm is used to implement the total variation weighting operator within the iteratively reweighted least-squares algorithm. Presented results for synthetic examples demonstrate that the novel randomized decomposition provides good accuracy at reduced computational and memory demands compared with classical approaches.
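
    The randomized range-finder that underlies such methods can be sketched for the plain SVD (the paper's generalized SVD applies the same idea to a matrix pair):

    ```python
    import numpy as np

    def randomized_svd(A, rank, oversample=10, power_iters=2):
        rng = np.random.default_rng(0)
        # Range finder: sample the column space with a thin random matrix.
        Y = A @ rng.standard_normal((A.shape[1], rank + oversample))
        for _ in range(power_iters):       # power iterations sharpen the basis
            Y = A @ (A.T @ Y)
        Q, _ = np.linalg.qr(Y)
        # Expensive decomposition now happens on the small projected matrix.
        U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
        return (Q @ U_small)[:, :rank], s[:rank], Vt[:rank]

    rng = np.random.default_rng(1)
    A = rng.standard_normal((2000, 20)) @ rng.standard_normal((20, 800))  # rank 20
    U, s, Vt = randomized_svd(A, rank=20)
    print("relative error:", np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))
    ```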

  11. Baseband pulse shaping for pi/4 FQPSK in nonlinearly amplified mobile channels

    NASA Astrophysics Data System (ADS)

    Subasinghe-Dias, Dileeka; Feher, Kamilo

    1994-10-01

    We apply baseband pulse shaping techniques to pi/4 QPSK in order to reduce the spectral regeneration of the bandlimited carrier after nonlinear amplification. These techniques, patented by Feher, namely pi/4 FQPSK (superposed QPSK) and pi/4 CTPSK (controlled transition PSK), may also be demodulated noncoherently. They are intended for fast-fading, power-efficient channels typical of the mobile radio environment. Patents related to FQPSK are described. Computer simulation and experimental studies demonstrate that with these baseband waveshaping techniques, carrier envelope fluctuations are significantly reduced, and the out-of-band power after nonlinear amplification is suppressed by up to 20 dB compared to pi/4 QPSK. In frequency non-interleaved land or satellite mobile radio systems operating in a nonlinear, fading and ACI (adjacent channel interference) environment, these techniques may achieve 20%-50% higher spectral efficiency than pi/4 QPSK. In mobile cellular systems using pi/4 QPSK, such as the new North American and Japanese digital cellular systems, these baseband pulse shapes may allow more convenient and less costly amplifier linearization.

  12. Reducing Earth Topography Resolution for SMAP Mission Ground Tracks Using K-Means Clustering

    NASA Technical Reports Server (NTRS)

    Rizvi, Farheen

    2013-01-01

    The K-means clustering algorithm is used to reduce Earth topography resolution for the SMAP mission ground tracks. As SMAP propagates in orbit, knowledge of the radar antenna footprints on Earth is required for the antenna misalignment calibration. Each antenna footprint contains a latitude and longitude location pair on the Earth surface. There are 400 pairs in one data set for the calibration model. It is computationally expensive to calculate corresponding Earth elevation for these data pairs. Thus, the antenna footprint resolution is reduced. Similar topographical data pairs are grouped together with the K-means clustering algorithm. The resolution is reduced to the mean of each topographical cluster called the cluster centroid. The corresponding Earth elevation for each cluster centroid is assigned to the entire group. Results show that 400 data points are reduced to 60 while still maintaining algorithm performance and computational efficiency. In this work, sensitivity analysis is also performed to show a trade-off between algorithm performance versus computational efficiency as the number of cluster centroids and algorithm iterations are increased.
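
    A minimal sketch of the reduction step (with synthetic coordinates and a made-up elevation function standing in for the SMAP footprints and the terrain lookup):

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    footprints = np.column_stack([rng.uniform(-60, 60, 400),      # latitude
                                  rng.uniform(-180, 180, 400)])   # longitude

    km = KMeans(n_clusters=60, n_init=10, random_state=0).fit(footprints)

    def elevation(lat, lon):
        # Stand-in for the expensive terrain lookup.
        return 100.0 * np.sin(np.radians(lat)) * np.cos(np.radians(lon))

    # One expensive lookup per centroid, shared by every member of its cluster.
    centroid_elev = np.array([elevation(la, lo) for la, lo in km.cluster_centers_])
    footprint_elev = centroid_elev[km.labels_]    # 400 values from 60 lookups
    print(footprint_elev.shape)                   # (400,)
    ```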

  13. Need for speed: An optimized gridding approach for spatially explicit disease simulations.

    PubMed

    Sellman, Stefan; Tsao, Kimberly; Tildesley, Michael J; Brommesson, Peter; Webb, Colleen T; Wennergren, Uno; Keeling, Matt J; Lindström, Tom

    2018-04-01

    Numerical models for simulating outbreaks of infectious diseases are powerful tools for informing surveillance and control strategy decisions. However, large-scale spatially explicit models can be limited by the amount of computational resources they require, which poses a problem when multiple scenarios need to be explored to provide policy recommendations. We introduce an easily implemented method that can reduce computation time in a standard Susceptible-Exposed-Infectious-Removed (SEIR) model without introducing any further approximations or truncations. It is based on a hierarchical infection process that operates on entire groups of spatially related nodes (cells in a grid) in order to efficiently filter out large volumes of susceptible nodes that would otherwise have required expensive calculations. After the filtering of the cells, only a subset of the nodes that were originally at risk are then evaluated for actual infection. The increase in efficiency is sensitive to the exact configuration of the grid, and we describe a simple method to find an estimate of the optimal configuration of a given landscape as well as a method to partition the landscape into a grid configuration. To investigate its efficiency, we compare the introduced methods to other algorithms and evaluate computation time, focusing on simulated outbreaks of foot-and-mouth disease (FMD) on the farm population of the USA, the UK and Sweden, as well as on three randomly generated populations with varying degree of clustering. The introduced method provided up to 500 times faster calculations than pairwise computation, and consistently performed as well or better than other available methods. This enables large scale, spatially explicit simulations such as for the entire continental USA without sacrificing realism or predictive power.
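
    The cell-level filtering idea can be sketched with a thinning scheme in the same spirit (not the paper's exact algorithm): bound every node's infection probability by a cell-wide maximum, skip over non-events with geometric jumps, and confirm each candidate with probability p_i / p_max, which reproduces the exact per-node distribution:

    ```python
    import numpy as np

    def infected_nodes(p, p_max, rng):
        # Indices of nodes infected this step; requires p[i] <= p_max for all i.
        hits, i, n = [], 0, len(p)
        while True:
            # Geometric skip to the next candidate event at rate p_max.
            i += int(np.floor(np.log(1.0 - rng.uniform()) / np.log(1.0 - p_max)))
            if i >= n:
                return hits
            if rng.uniform() < p[i] / p_max:   # thin down to the true probability
                hits.append(i)
            i += 1

    rng = np.random.default_rng(0)
    p = rng.uniform(0.0, 1e-4, size=100_000)   # mostly negligible per-node risks
    print(infected_nodes(p, 1e-4, rng))        # a handful of hits, few evaluations
    ```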

  14. Computationally efficient algorithms for Brownian dynamics simulation of long flexible macromolecules modeled as bead-rod chains

    NASA Astrophysics Data System (ADS)

    Moghani, Mahdy Malekzadeh; Khomami, Bamin

    2017-02-01

    The computational efficiency of Brownian dynamics (BD) simulation of the constrained model of a polymeric chain (bead-rod) with n beads and in the presence of hydrodynamic interaction (HI) is reduced to order n^2 via an efficient algorithm which utilizes the conjugate-gradient (CG) method within a Picard iteration scheme. Moreover, the utility of the Barnes and Hut (BH) multipole method in BD simulation of polymeric solutions in the presence of HI, with regard to computational cost, scaling, and accuracy, is discussed. Overall, it is determined that this approach leads to a scaling of O(n^1.2). Furthermore, a stress algorithm is developed which accurately captures the transient stress growth in the startup of flow for the bead-rod model with HI and excluded volume (EV) interaction. Rheological properties of chains up to n = 350 in the presence of EV and HI are computed via the former algorithm. The results depict qualitative differences in the shear-thinning behavior of the polymeric solutions at intermediate values of the Weissenberg number (10

  15. Efficient integration method for fictitious domain approaches

    NASA Astrophysics Data System (ADS)

    Duczek, Sascha; Gabbert, Ulrich

    2015-10-01

    In the current article, we present an efficient and accurate numerical method for the integration of the system matrices in fictitious domain approaches such as the finite cell method (FCM). In the framework of the FCM, the physical domain is embedded in a geometrically larger domain of simple shape, which is discretized using a regular Cartesian grid of cells. A spacetree-based adaptive quadrature technique is therefore normally deployed to resolve the geometry of the structure. Depending on the complexity of the structure under investigation, this method accounts for most of the computational effort. To reduce the computational cost of computing the system matrices, an efficient quadrature scheme based on the divergence theorem (Gauß-Ostrogradsky theorem) is proposed. Using this theorem, the dimension of the integral is reduced by one, i.e. instead of solving the integral over the whole domain, only its contour needs to be considered. In the current paper, we present the general principles of the integration method and its implementation. Results for several two-dimensional benchmark problems highlight its properties. The efficiency of the proposed method is compared to conventional spacetree-based integration techniques.
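
    The dimension-reduction trick is easiest to see in two dimensions: with F = (x, 0), div F = 1, so the area integral of a polygonal cell collapses to a sum of edge fluxes (a toy sketch, not the FCM implementation):

    ```python
    # Area of a polygonal cell from edge fluxes of F = (x, 0); because F is
    # linear, the midpoint value of x is exact on each straight edge.

    def polygon_area(verts):
        area, n = 0.0, len(verts)
        for i in range(n):
            (x0, y0), (x1, y1) = verts[i], verts[(i + 1) % n]
            area += 0.5 * (x0 + x1) * (y1 - y0)   # flux across this edge
        return area

    print(polygon_area([(0, 0), (2, 0), (2, 2), (0, 2)]))   # 4.0 (CCW square)
    ```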

  16. Improvements to Fidelity, Generation and Implementation of Physics-Based Lithium-Ion Reduced-Order Models

    NASA Astrophysics Data System (ADS)

    Rodriguez Marco, Albert

    Battery management systems (BMS) require computationally simple but highly accurate models of the battery cells they are monitoring and controlling. Historically, empirical equivalent-circuit models have been used, but increasingly researchers are focusing their attention on physics-based models due to their greater predictive capabilities. These models are of high intrinsic computational complexity and so must undergo some kind of order-reduction process to make their use by a BMS feasible: we favor methods based on a transfer-function approach to battery cell dynamics. In prior works, transfer functions have been found from full-order PDE models via two simplifying assumptions: (1) a linearization assumption, which is a fundamental necessity in order to make transfer functions, and (2) an assumption made out of expedience that decouples the electrolyte-potential and electrolyte-concentration PDEs in order to enable solving for the transfer functions from the PDEs. This dissertation improves the fidelity of physics-based models by eliminating the need for the second assumption and by linearizing the nonlinear dynamics around different constant currents. Electrochemical transfer functions are infinite-order and cannot be expressed as a ratio of polynomials in the Laplace variable s. Thus, for practical use, these systems need to be approximated using reduced-order models that capture the most significant dynamics. This dissertation improves the generation of physics-based reduced-order models by introducing different realization algorithms, which produce a low-order model from the infinite-order electrochemical transfer functions. Physics-based reduced-order models are linear and describe cell dynamics if operated near the setpoint at which they have been generated. Hence, multiple physics-based reduced-order models need to be generated at different setpoints (i.e., state-of-charge, temperature and C-rate) in order to extend the cell operating range. This dissertation improves the implementation of physics-based reduced-order models by introducing different blending approaches that combine the pre-computed models generated (offline) at different setpoints in order to produce good electrochemical estimates (online) across the cell's state-of-charge, temperature and C-rate range.

  17. Stochastic reduced order models for inverse problems under uncertainty

    PubMed Central

    Warner, James E.; Aquino, Wilkins; Grigoriu, Mircea D.

    2014-01-01

    This work presents a novel methodology for solving inverse problems under uncertainty using stochastic reduced order models (SROMs). Given statistical information about an observed state variable in a system, unknown parameters are estimated probabilistically through the solution of a model-constrained, stochastic optimization problem. The point of departure and crux of the proposed framework is the representation of a random quantity using a SROM - a low dimensional, discrete approximation to a continuous random element that permits efficient and non-intrusive stochastic computations. Characterizing the uncertainties with SROMs transforms the stochastic optimization problem into a deterministic one. The non-intrusive nature of SROMs facilitates efficient gradient computations for random vector unknowns and relies entirely on calls to existing deterministic solvers. Furthermore, the method is naturally extended to handle multiple sources of uncertainty in cases where state variable data, system parameters, and boundary conditions are all considered random. The new and widely-applicable SROM framework is formulated for a general stochastic optimization problem in terms of an abstract objective function and constraining model. For demonstration purposes, however, we study its performance in the specific case of inverse identification of random material parameters in elastodynamics. We demonstrate the ability to efficiently recover random shear moduli given material displacement statistics as input data. We also show that the approach remains effective for the case where the loading in the problem is random as well. PMID:25558115

  18. Convergence and Efficiency of Adaptive Importance Sampling Techniques with Partial Biasing

    NASA Astrophysics Data System (ADS)

    Fort, G.; Jourdain, B.; Lelièvre, T.; Stoltz, G.

    2018-04-01

    We propose a new Monte Carlo method to efficiently sample a multimodal distribution (known up to a normalization constant). We consider a generalization of the discrete-time Self Healing Umbrella Sampling method, which can also be seen as a generalization of well-tempered metadynamics. The dynamics is based on an adaptive importance sampling technique. The importance function relies on the weights (namely the relative probabilities) of disjoint sets which form a partition of the space. These weights are unknown but are learnt on the fly, yielding an adaptive algorithm. In the context of computational statistical physics, the logarithm of these weights is, up to an additive constant, the free energy, and the discrete-valued function defining the partition is called the collective variable. The algorithm falls into the general class of Wang-Landau type methods, and generalizes the original Self Healing Umbrella Sampling method in two ways: (i) the updating strategy leads to a larger penalization strength of already visited sets in order to escape more quickly from metastable states, and (ii) the target distribution is biased using only a fraction of the free energy, in order to increase the effective sample size and reduce the variance of importance sampling estimators. We prove the convergence of the algorithm and analyze its efficiency numerically on a toy example.

  1. Second-Order Sensitivity Analysis of Uncollided Particle Contributions to Radiation Detector Responses

    DOE PAGES

    Cacuci, Dan G.; Favorite, Jeffrey A.

    2018-04-06

    This work presents an application of Cacuci’s Second-Order Adjoint Sensitivity Analysis Methodology (2nd-ASAM) to the simplified Boltzmann equation that models the transport of uncollided particles through a medium, to compute efficiently and exactly all of the first- and second-order derivatives (sensitivities) of a detector’s response with respect to the system’s isotopic number densities, microscopic cross sections, source emission rates, and detector response function. The off-the-shelf PARTISN multigroup discrete ordinates code is employed to solve the equations underlying the 2nd-ASAM. The accuracy of the results produced using PARTISN is verified using three test configurations: (1) a homogeneous sphere, for which the response is the exactly known total uncollided leakage, (2) a multiregion two-dimensional (r-z) cylinder, and (3) a two-region sphere for which the response is a reaction rate. For the homogeneous sphere, results for the total leakage as well as for the respective first- and second-order sensitivities are in excellent agreement with the exact benchmark values. For the nonanalytic problems, the results obtained by applying the 2nd-ASAM to compute sensitivities are in excellent agreement with central-difference estimates. The efficiency of the 2nd-ASAM is underscored by the fact that, for the cylinder, only 12 adjoint PARTISN computations were required by the 2nd-ASAM to compute all of the benchmark’s 18 first-order sensitivities and 224 second-order sensitivities, in contrast to the 877 PARTISN calculations needed to compute the respective sensitivities using central finite differences; this number does not include the additional calculations required to find appropriate values of the perturbations for the central differences.

  2. Computationally Efficient Multiconfigurational Reactive Molecular Dynamics

    PubMed Central

    Yamashita, Takefumi; Peng, Yuxing; Knight, Chris; Voth, Gregory A.

    2012-01-01

    It is a computationally demanding task to explicitly simulate the electronic degrees of freedom in a system to observe the chemical transformations of interest, while at the same time sampling the time and length scales required to converge statistical properties and thus reduce artifacts due to initial conditions, finite-size effects, and limited sampling. One solution that significantly reduces the computational expense consists of molecular models in which effective interactions between particles govern the dynamics of the system. If the interaction potentials in these models are developed to reproduce calculated properties from electronic structure calculations and/or ab initio molecular dynamics simulations, then one can calculate accurate properties at a fraction of the computational cost. Multiconfigurational algorithms model the system as a linear combination of several chemical bonding topologies to simulate chemical reactions, also sometimes referred to as “multistate”. These algorithms typically utilize energy and force calculations already found in popular molecular dynamics software packages, thus facilitating their implementation without significant changes to the structure of the code. However, the evaluation of energies and forces for several bonding topologies per simulation step can lead to poor computational efficiency if redundancy is not efficiently removed, particularly with respect to the calculation of long-ranged Coulombic interactions. This paper presents accurate approximations (effective long-range interaction and resulting hybrid methods) and multiple-program parallelization strategies for the efficient calculation of electrostatic interactions in reactive molecular simulations. PMID:25100924

  3. Reducing the Time and Cost of Testing Engines

    NASA Technical Reports Server (NTRS)

    2004-01-01

    Producing a new aircraft engine currently costs approximately $1 billion, with 3 years of development time for a commercial engine and 10 years for a military engine. The high development time and cost make it extremely difficult to transition advanced technologies for cleaner, quieter, and more efficient new engines. To reduce this time and cost, NASA created a vision for the future where designers would use high-fidelity computer simulations early in the design process in order to resolve critical design issues before building the expensive engine hardware. To accomplish this vision, NASA's Glenn Research Center initiated a collaborative effort with the aerospace industry and academia to develop its Numerical Propulsion System Simulation (NPSS), an advanced engineering environment for the analysis and design of aerospace propulsion systems and components. Partners estimate that using NPSS has the potential to dramatically reduce the time, effort, and expense necessary to design and test jet engines by generating sophisticated computer simulations of an aerospace object or system. These simulations will permit an engineer to test various design options without having to conduct costly and time-consuming real-life tests. By accelerating and streamlining the engine system design analysis and test phases, NPSS facilitates bringing the final product to market faster. NASA's NPSS Version (V)1.X effort was a task within the Agency's Computational Aerospace Sciences project of the High Performance Computing and Communication program, which had a mission to accelerate the availability of high-performance computing hardware and software to the U.S. aerospace community for its use in design processes. The technology brings value back to NASA by improving methods of analyzing and testing space transportation components.

  4. Key Factors for Determining Risk of Groundwater Impacts Due to Leakage from Geologic Carbon Sequestration Reservoirs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carroll, Susan; Keating, Elizabeth; Mansoor, Kayyum

    2014-01-06

    The National Risk Assessment Partnership (NRAP) is developing a science-based toolset for the analysis of potential impacts to groundwater chemistry from CO2 injection (www.netl.doe.gov/nrap). The toolset adopts a stochastic approach in which predictions address uncertainties in shallow groundwater and leakage scenarios. It is derived from detailed physics and chemistry simulation results that are used to train more computationally efficient models, referred to here as reduced-order models (ROMs), for each component system. In particular, these tools can be used to help regulators and operators understand the expected sizes and longevity of plumes in pH, TDS, and dissolved metals that could result from a leakage of brine and/or CO2 from a storage reservoir into aquifers. This information can inform, for example, decisions on monitoring strategies that are both effective and efficient. We have used this approach to develop predictive reduced-order models for two common types of reservoirs, but the approach could be used to develop a model for a specific aquifer or other common types of aquifers. In this paper we describe potential impacts to groundwater quality due to CO2 and brine leakage, discuss an approach to calculate thresholds under which "no impact" to groundwater occurs, describe the time scale for impact on groundwater, and discuss the probability of detecting a groundwater plume should leakage occur.

  5. Novel scheme to compute chemical potentials of chain molecules on a lattice

    NASA Astrophysics Data System (ADS)

    Mooij, G. C. A. M.; Frenkel, D.

    We present a novel method that allows efficient computation of the total number of allowed conformations of a chain molecule in a dense phase. Using this method, it is possible to estimate the chemical potential of such a chain molecule. We have tested the present method in simulations of a two-dimensional monolayer of chain molecules on a lattice (Whittington-Chapman model) and compared it with existing schemes to compute the chemical potential. We find that the present approach is two to three orders of magnitude faster than the most efficient of the existing methods.
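
    A Rosenbluth-style growth sketch conveys the flavor of such schemes (illustrative only, and simpler than the authors' method): the mean growth weight estimates the number of allowed conformations, from which an excess chemical potential follows.

    ```python
    import numpy as np

    MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    def rosenbluth_weight(n_steps, rng):
        # Grow a self-avoiding walk, weighting by the number of open moves.
        chain, pos, w = {(0, 0)}, (0, 0), 1.0
        for _ in range(n_steps):
            free = [(pos[0] + dx, pos[1] + dy) for dx, dy in MOVES
                    if (pos[0] + dx, pos[1] + dy) not in chain]
            if not free:
                return 0.0               # dead end: zero-weight conformation
            w *= len(free)
            pos = free[rng.integers(len(free))]
            chain.add(pos)
        return w

    rng = np.random.default_rng(0)
    mean_w = np.mean([rosenbluth_weight(20, rng) for _ in range(5000)])
    print("estimated number of conformations:", mean_w)
    # Excess chemical potential (in kT) relative to an ideal 4^n random walk.
    print("excess chemical potential (kT):", -np.log(mean_w / 4.0 ** 20))
    ```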

  6. An efficient parallel algorithm for the solution of a tridiagonal linear system of equations

    NASA Technical Reports Server (NTRS)

    Stone, H. S.

    1971-01-01

    Tridiagonal linear systems of equations are solved on conventional serial machines in a time proportional to N, where N is the number of equations. The conventional algorithms do not lend themselves directly to parallel computation on computers of the ILLIAC IV class, in the sense that they appear to be inherently serial. An efficient parallel algorithm is presented in which computation time grows as log2(N). The algorithm is based on recursive doubling solutions of linear recurrence relations, and can be used to solve recurrence relations of all orders.
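
    The kernel of the method is the parallel solution of a first-order recurrence x[i] = a[i]*x[i-1] + b[i] by repeatedly composing pairs of affine maps; each doubling step halves the remaining dependency chain, giving log2(N) steps. A vectorized sketch (a serial emulation of the parallel steps):

    ```python
    import numpy as np

    def recurrence_doubling(a, b):
        # Invariant: x[i] = A[i] * x[i - shift] + B[i], with x[j] = 0 for j < 0.
        A, B = a.astype(float).copy(), b.astype(float).copy()
        n, shift = len(a), 1
        while shift < n:
            # Compose each map with the one 'shift' places earlier (identity pad).
            A_prev = np.concatenate([np.ones(shift), A[:-shift]])
            B_prev = np.concatenate([np.zeros(shift), B[:-shift]])
            A, B = A * A_prev, A * B_prev + B
            shift *= 2
        return B            # once shift >= n, x[i] = B[i]

    rng = np.random.default_rng(0)
    a, b = rng.standard_normal(8), rng.standard_normal(8)
    x_par = recurrence_doubling(a, b)
    x_ser, prev = np.empty(8), 0.0      # serial reference
    for i in range(8):
        prev = a[i] * prev + b[i]
        x_ser[i] = prev
    print(np.allclose(x_par, x_ser))    # True
    ```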

  7. An extended Kalman filter approach to non-stationary Bayesian estimation of reduced-order vocal fold model parameters.

    PubMed

    Hadwin, Paul J; Peterson, Sean D

    2017-04-01

    The Bayesian framework for parameter inference provides a basis from which subject-specific reduced-order vocal fold models can be generated. Previously, it has been shown that a particle filter technique is capable of producing estimates and associated credibility intervals of time-varying reduced-order vocal fold model parameters. However, the particle filter approach is difficult to implement and has a high computational cost, which can be barriers to clinical adoption. This work presents an alternative estimation strategy based upon Kalman filtering, aimed at reducing the computational cost of subject-specific model development. The robustness of this approach to Gaussian and non-Gaussian noise is discussed. The extended Kalman filter (EKF) approach is found to perform very well in comparison with the particle filter technique at dramatically lower computational cost. Based upon the test cases explored, the EKF is comparable in terms of accuracy to the particle filter technique when more than 6000 particles are employed; if fewer particles are employed, the EKF actually performs better. For comparable levels of accuracy, the solution time is reduced by two orders of magnitude when employing the EKF. By virtue of the approximations used in the EKF, however, the credibility intervals tend to be slightly underpredicted.
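
    A generic EKF predict/update cycle for a random-walk parameter model conveys the structure of such estimators (a toy measurement model, not the reduced-order vocal fold model):

    ```python
    import numpy as np

    def ekf_step(theta, P, z, h, H_jac, Q, R):
        # Predict: parameters follow a random walk, so only P grows.
        P = P + Q
        # Update: linearize the measurement model around the current estimate.
        H = H_jac(theta)
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)                 # Kalman gain
        theta = theta + K @ (z - h(theta))
        P = (np.eye(len(theta)) - K @ H) @ P
        return theta, P

    # Toy measurement model: observe [theta0^2, theta1] plus noise.
    h = lambda th: np.array([th[0] ** 2, th[1]])
    H_jac = lambda th: np.array([[2.0 * th[0], 0.0], [0.0, 1.0]])

    rng = np.random.default_rng(0)
    truth = np.array([1.5, -0.7])
    theta, P = np.array([0.5, 0.5]), np.eye(2)
    for _ in range(200):
        z = h(truth) + 0.05 * rng.standard_normal(2)
        theta, P = ekf_step(theta, P, z, h, H_jac,
                            1e-6 * np.eye(2), 0.05 ** 2 * np.eye(2))
    print(theta)   # approaches [1.5, -0.7]
    ```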

  8. Investigation of probabilistic principal component analysis compared to proper orthogonal decomposition methods for basis extraction and missing data estimation

    NASA Astrophysics Data System (ADS)

    Lee, Kyunghoon

    To evaluate the maximum likelihood estimates (MLEs) of probabilistic principal component analysis (PPCA) parameters such as a factor-loading, PPCA can invoke an expectation-maximization (EM) algorithm, yielding an EM algorithm for PPCA (EM-PCA). In order to examine the benefits of the EM-PCA for aerospace engineering applications, this thesis attempts to qualitatively and quantitatively scrutinize the EM-PCA alongside both POD and gappy POD using high-dimensional simulation data. In pursuing qualitative investigations, the theoretical relationship between POD and PPCA is transparent such that the factor-loading MLE of PPCA, evaluated by the EM-PCA, pertains to an orthogonal basis obtained by POD. By contrast, the analytical connection between gappy POD and the EM-PCA is nebulous because they distinctively approximate missing data due to their antithetical formulation perspectives: gappy POD solves a least-squares problem whereas the EM-PCA relies on the expectation of the observation probability model. To juxtapose both gappy POD and the EM-PCA, this research proposes a unifying least-squares perspective that embraces the two disparate algorithms within a generalized least-squares framework. As a result, the unifying perspective reveals that both methods address similar least-squares problems; however, their formulations contain dissimilar bases and norms. Furthermore, this research delves into the ramifications of the different bases and norms that will eventually characterize the traits of both methods. To this end, two hybrid algorithms of gappy POD and the EM-PCA are devised and compared to the original algorithms for a qualitative illustration of the different basis and norm effects. After all, a norm reflecting a curve-fitting method is found to more significantly affect estimation error reduction than a basis for two example test data sets: one is absent of data only at a single snapshot and the other misses data across all the snapshots. From a numerical performance aspect, the EM-PCA is computationally less efficient than POD for intact data since it suffers from slow convergence inherited from the EM algorithm. For incomplete data, this thesis quantitatively found that the number of data missing snapshots predetermines whether the EM-PCA or gappy POD outperforms the other because of the computational cost of a coefficient evaluation, resulting from a norm selection. For instance, gappy POD demands laborious computational effort in proportion to the number of data-missing snapshots as a consequence of the gappy norm. In contrast, the computational cost of the EM-PCA is invariant to the number of data-missing snapshots thanks to the L2 norm. In general, the higher the number of data-missing snapshots, the wider the gap between the computational cost of gappy POD and the EM-PCA. Based on the numerical experiments reported in this thesis, the following criterion is recommended regarding the selection between gappy POD and the EM-PCA for computational efficiency: gappy POD for an incomplete data set containing a few data-missing snapshots and the EM-PCA for an incomplete data set involving multiple data-missing snapshots. Last, the EM-PCA is applied to two aerospace applications in comparison to gappy POD as a proof of concept: one with an emphasis on basis extraction and the other with a focus on missing data reconstruction for a given incomplete data set with scattered missing data. 
The first application exploits the EM-PCA to efficiently construct reduced-order models of engine deck responses obtained by the numerical propulsion system simulation (NPSS), some of whose results are absent due to failed analyses caused by numerical instability. Model-prediction tests validate that engine performance metrics estimated by the reduced-order NPSS model exhibit considerably good agreement with those directly obtained by NPSS. Similarly, the second application illustrates that the EM-PCA is significantly more cost-effective than gappy POD at repairing spurious PIV measurements obtained from acoustically-excited, bluff-body jet flow experiments. The EM-PCA reduces the computational cost by a factor of 8 to 19 compared to gappy POD while generating the same restoration results as those evaluated by gappy POD. All in all, through comprehensive theoretical and numerical investigation, this research establishes that the EM-PCA is an efficient alternative to gappy POD for an incomplete data set containing missing data over an entire data set. (Abstract shortened by UMI.)

  9. Efficient Bayesian parameter estimation with implicit sampling and surrogate modeling for a vadose zone hydrological problem

    NASA Astrophysics Data System (ADS)

    Liu, Y.; Pau, G. S. H.; Finsterle, S.

    2015-12-01

    Parameter inversion involves inferring the model parameter values based on sparse observations of some observables. To infer the posterior probability distributions of the parameters, Markov chain Monte Carlo (MCMC) methods are typically used. However, the large number of forward simulations needed and limited computational resources limit the complexity of the hydrological model we can use in these methods. In view of this, we studied the implicit sampling (IS) method, an efficient importance sampling technique that generates samples in the high-probability region of the posterior distribution and thus reduces the number of forward simulations that we need to run. For a pilot-point inversion of a heterogeneous permeability field based on a synthetic ponded infiltration experiment simulated with TOUGH2 (a subsurface modeling code), we showed that IS with linear map provides an accurate Bayesian description of the parameterized permeability field at the pilot points with just approximately 500 forward simulations. We further studied the use of surrogate models to improve the computational efficiency of parameter inversion. We implemented two reduced-order models (ROMs) for the TOUGH2 forward model. One is based on polynomial chaos expansion (PCE), of which the coefficients are obtained using the sparse Bayesian learning technique to mitigate the "curse of dimensionality" of the PCE terms. The other model is Gaussian process regression (GPR) for which different covariance, likelihood and inference models are considered. Preliminary results indicate that ROMs constructed based on the prior parameter space perform poorly. It is thus impractical to replace this hydrological model by a ROM directly in a MCMC method. However, the IS method can work with a ROM constructed for parameters in the close vicinity of the maximum a posteriori probability (MAP) estimate. We will discuss the accuracy and computational efficiency of using ROMs in the implicit sampling procedure for the hydrological problem considered. This work was supported, in part, by the U.S. Dept. of Energy under Contract No. DE-AC02-05CH11231.

  10. Influence of computational fluid dynamics on experimental aerospace facilities: A fifteen year projection

    NASA Technical Reports Server (NTRS)

    1983-01-01

    An assessment was made of the impact of developments in computational fluid dynamics (CFD) on the traditional role of aerospace ground test facilities over the next fifteen years. With the improvements in CFD and the more powerful scientific computers projected over this period, it is expected that the flow over a complete aircraft will be computable at a unit cost three orders of magnitude lower than presently possible. Over the same period, improvements in ground test facilities will progress by application of computational techniques, including CFD, to data acquisition, facility operational efficiency, and simulation of the flight envelope; however, no dramatic change in unit cost is expected, as greater efficiency will be countered by higher energy and labor costs.

  11. An Adaptive QSE-reduced Nuclear Reaction Network for Silicon Burning

    NASA Astrophysics Data System (ADS)

    Parete-Koon, Suzanne; Hix, W.; Thielemann, F.

    2008-03-01

    The nuclei of the "iron peak" are formed in massive stars shortly before core collapse and during their supernova outbursts, as well as during thermonuclear supernovae. Complete and incomplete silicon burning during these events are responsible for the production of a wide range of nuclei with atomic mass numbers from 28 to 64. Because of the large number of nuclei involved, accurate modeling of silicon burning is computationally expensive. However, examination of the physics of silicon burning has revealed that the nuclear evolution is dominated by large groups of nuclei in mutual equilibrium. We present an improvement on our hybrid equilibrium-network scheme which takes advantage of this quasi-equilibrium in order to reduce the number of independent variables calculated. Because the size and membership of these groups vary as the temperature, density and electron fraction change, achieving maximal efficiency requires dynamic adjustment of group number and membership. Toward this end, we are implementing a scheme beginning with a single QSE (NSE) group at appropriately high temperature, then progressing through 2, 3 and 4 group stages (with successively more independent variables) as temperature declines. This combination allows accurate prediction of the nuclear abundance evolution, deleptonization and energy generation at a further reduced computational cost when compared to a conventional nuclear reaction network or our previous 3 fixed-group QSE-reduced network. During silicon burning, the resultant QSE-reduced network is up to 20 times faster than the full network it replaces without significant loss of accuracy. These reductions in computational cost and the number of species evolved make QSE-reduced networks well suited for inclusion within hydrodynamic simulations, particularly in multi-dimensional applications. This work has been supported by the National Science Foundation, by the Department of Energy's Scientific Discovery through Advanced Computing Programs, and by the Joint Institute for Heavy Ion Research at ORNL.

  12. Encoding neural and synaptic functionalities in electron spin: A pathway to efficient neuromorphic computing

    NASA Astrophysics Data System (ADS)

    Sengupta, Abhronil; Roy, Kaushik

    2017-12-01

    Present day computers expend orders of magnitude more computational resources to perform various cognitive and perception related tasks that humans routinely perform every day. This has recently resulted in a seismic shift in the field of computation where research efforts are being directed to develop a neurocomputer that attempts to mimic the human brain by nanoelectronic components and thereby harness its efficiency in recognition problems. Bridging the gap between neuroscience and nanoelectronics, this paper attempts to provide a review of the recent developments in the field of spintronic device based neuromorphic computing. Description of various spin-transfer torque mechanisms that can be potentially utilized for realizing device structures mimicking neural and synaptic functionalities is provided. A cross-layer perspective extending from the device to the circuit and system level is presented to envision the design of an All-Spin neuromorphic processor enabled with on-chip learning functionalities. Device-circuit-algorithm co-simulation framework calibrated to experimental results suggest that such All-Spin neuromorphic systems can potentially achieve almost two orders of magnitude energy improvement in comparison to state-of-the-art CMOS implementations.

  13. Expanding Software Productivity and Power while Reducing Costs.

    ERIC Educational Resources Information Center

    Winer, Ellen N.

    1988-01-01

    Microcomputer efficiency and software economy can be achieved through file transfer and data sharing. Costs can be reduced by purchasing computer systems that allow for expansion and portability of data. (MLF)

  14. Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications.

    PubMed

    Christen, Matthias; Del Medico, Luca; Christen, Heinz; Christen, Beat

    2017-01-01

    Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remain a labor-intensive process. Given the complexity, computer-assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult-to-synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready-to-order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner.

  15. Kato expansion in quantum canonical perturbation theory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nikolaev, Andrey, E-mail: Andrey.Nikolaev@rdtex.ru

    2016-06-15

    This work establishes a connection between canonical perturbation series in quantum mechanics and a Kato expansion for the resolvent of the Liouville superoperator. Our approach leads to an explicit expression for a generator of a block-diagonalizing Dyson’s ordered exponential in arbitrary perturbation order. Unitary intertwining of perturbed and unperturbed averaging superprojectors allows for a description of ambiguities in the generator and block-diagonalized Hamiltonian. We compare the efficiency of the corresponding computational algorithm with the efficiencies of the Van Vleck and Magnus methods for high perturbative orders.

  16. Standardized development of computer software. Part 1: Methods

    NASA Technical Reports Server (NTRS)

    Tausworthe, R. C.

    1976-01-01

    This work is a two-volume set on standards for modern software engineering methodology. This volume presents a tutorial and practical guide to the efficient development of reliable computer software, a unified and coordinated discipline for design, coding, testing, documentation, and project organization and management. The aim of the monograph is to provide formal disciplines for increasing the probability of securing software that is characterized by high degrees of initial correctness, readability, and maintainability, and to promote practices which aid in the consistent and orderly development of a total software system within schedule and budgetary constraints. These disciplines are set forth as a set of rules to be applied during software development to drastically reduce the time traditionally spent in debugging, to increase documentation quality, to foster understandability among those who must come in contact with it, and to facilitate operations and alterations of the program as requirements on the program environment change.

  17. Optimal segmentation and packaging process

    DOEpatents

    Kostelnik, Kevin M.; Meservey, Richard H.; Landon, Mark D.

    1999-01-01

    A process for improving packaging efficiency uses three-dimensional, computer-simulated models with various optimization algorithms to determine the optimal segmentation process and packaging configurations based on constraints including container limitations. The present invention is applied to a process for decontaminating, decommissioning (D&D), and remediating a nuclear facility involving the segmentation and packaging of contaminated items in waste containers in order to minimize the number of cuts, maximize packaging density, and reduce worker radiation exposure. A three-dimensional, computer-simulated facility model of the contaminated items is created. The contaminated items are differentiated. The optimal location, orientation and sequence of the segmentation and packaging of the contaminated items is determined using the simulated model, the algorithms, and various constraints including container limitations. The cut locations and orientations are transposed to the simulated model. The contaminated items are then actually segmented and packaged. The segmentation and packaging may be simulated beforehand. In addition, the contaminated items may be cataloged and recorded.

  18. Time Domain Estimation of Arterial Parameters using the Windkessel Model and the Monte Carlo Method

    NASA Astrophysics Data System (ADS)

    Gostuski, Vladimir; Pastore, Ignacio; Rodriguez Palacios, Gaspar; Vaca Diez, Gustavo; Moscoso-Vasquez, H. Marcela; Risk, Marcelo

    2016-04-01

    Numerous parameter estimation techniques exist for characterizing the arterial system using electrical circuit analogs. However, they are often limited by their requirements and usually high computational burden. Therefore, a new method for estimating arterial parameters based on Monte Carlo simulation is proposed. A three-element Windkessel model was used to represent the arterial system. The approach was to reduce the error between the calculated and physiological aortic pressure by randomly generating arterial parameter values, while keeping the arterial resistance constant. This last value was obtained for each subject from the arterial flow, and was a necessary consideration in order to obtain a unique set of values for the arterial compliance and peripheral resistance. The estimation technique was applied to in vivo data containing steady beats in mongrel dogs, and it reliably estimated Windkessel arterial parameters. Further, this method appears to be computationally efficient for on-line time-domain estimation of these parameters.
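
    A minimal sketch of the estimation loop follows; the flow waveform, parameter ranges, and forward-Euler integrator are illustrative assumptions rather than details taken from the paper.

```python
# Hedged sketch of Monte Carlo estimation of 3-element Windkessel parameters.
# State equation: C*dPc/dt = Q(t) - Pc/Rp, with measured pressure
# P = Pc + Rc*Q(t). The flow waveform and parameter ranges are assumptions.

import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 0.8, 800)                    # one cardiac cycle [s]
q = np.maximum(np.sin(np.pi * t / 0.3), 0) * 400  # assumed aortic flow [ml/s]
dt = t[1] - t[0]

def wk3_pressure(rc, rp, c, q, dt):
    """Integrate the 3-element Windkessel with forward Euler."""
    pc = np.empty_like(q)
    pc[0] = 80.0                                  # initial pressure [mmHg]
    for i in range(len(q) - 1):
        pc[i + 1] = pc[i] + dt * (q[i] - pc[i] / rp) / c
    return pc + rc * q

p_meas = wk3_pressure(0.05, 1.0, 1.3, q, dt)      # synthetic "measured" data

best, best_err = None, np.inf
for _ in range(5000):                 # random search over (Rp, C); Rc is held
    rp = rng.uniform(0.5, 2.0)        # fixed, as set by the total resistance
    c = rng.uniform(0.5, 3.0)
    err = np.sqrt(np.mean((wk3_pressure(0.05, rp, c, q, dt) - p_meas) ** 2))
    if err < best_err:
        best, best_err = (rp, c), err
print("estimated (Rp, C):", best, "RMSE [mmHg]:", best_err)
```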

  19. Sampling schemes and parameter estimation for nonlinear Bernoulli-Gaussian sparse models

    NASA Astrophysics Data System (ADS)

    Boudineau, Mégane; Carfantan, Hervé; Bourguignon, Sébastien; Bazot, Michael

    2016-06-01

    We address the sparse approximation problem in the case where the data are approximated by the linear combination of a small number of elementary signals, each of these signals depending non-linearly on additional parameters. Sparsity is explicitly expressed through a Bernoulli-Gaussian hierarchical model in a Bayesian framework. Posterior mean estimates are computed using Markov chain Monte Carlo algorithms. We generalize the partially marginalized Gibbs sampler proposed in the linear case in [1], and build a hybrid Hastings-within-Gibbs algorithm in order to account for the nonlinear parameters. All model parameters are then estimated in an unsupervised procedure. The resulting method is evaluated on a sparse spectral analysis problem. It is shown to converge more efficiently than the classical joint estimation procedure, with only a slight increase of the computational cost per iteration, consequently reducing the global cost of the estimation procedure.

  20. Optimization of Computational Performance and Accuracy in 3-D Transient CFD Model for CFB Hydrodynamics Predictions

    NASA Astrophysics Data System (ADS)

    Rampidis, I.; Nikolopoulos, A.; Koukouzas, N.; Grammelis, P.; Kakaras, E.

    2007-09-01

    This work aims to present a pure 3-D CFD model, accurate and efficient, for simulating the hydrodynamics of a pilot-scale CFB. The accuracy of the model was investigated as a function of the numerical parameters in order to derive an optimum model setup with respect to computational cost. The need for an in-depth examination of hydrodynamics emerges from the trend to scale up CFB combustors; this scale-up brings forward numerous design problems and uncertainties, which can be successfully elucidated by CFD techniques. Deriving guidelines for setting up a computationally efficient model is important as the scale of CFBs grows fast while computational power remains limited. However, the question of optimum efficiency has not been investigated thoroughly in the literature, as authors have been more concerned with their models' accuracy and validity. The objective of this work is to investigate the parameters that influence the efficiency and accuracy of CFB computational fluid dynamics models, to find the optimum set of these parameters, and thus to establish this technique as a competitive method for the simulation and design of industrial, large-scale beds, where the computational cost is otherwise prohibitive. In the tests performed in this work, the influence of the turbulence modeling approach, the temporal and spatial resolution, and the discretization schemes was investigated on a 1.2 MWth CFB test rig. Using Fourier analysis, dominant frequencies were extracted in order to estimate an adequate time period for the averaging of all instantaneous values. The agreement with the experimental measurements was very good. The basic differences between the predictions arising from the various model setups were pointed out and analyzed. The results showed that a model with high-order space discretization schemes, applied on a coarse grid with averaging of the instantaneous scalar values over a 20 s period, adequately described the transient hydrodynamic behaviour of a pilot CFB while keeping the computational cost low. Flow patterns inside the bed, such as the core-annulus flow and the transport of clusters, were at least qualitatively captured.
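
    The Fourier-analysis step can be illustrated briefly: transform a fluctuating monitor signal, locate the dominant frequency, and check that the chosen averaging window spans many dominant periods. The sampling rate and synthetic signal below are assumptions.

```python
# Sketch of the Fourier-analysis step: extract the dominant frequency of a
# fluctuating signal (e.g., a pressure-drop monitor in the CFB riser) to
# choose an adequate time-averaging window. The signal here is synthetic.

import numpy as np

fs = 100.0                                  # assumed sampling rate [Hz]
t = np.arange(0, 60, 1 / fs)                # 60 s of simulated monitoring data
signal = np.sin(2 * np.pi * 0.8 * t) + 0.3 * np.random.randn(t.size)

spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
freqs = np.fft.rfftfreq(signal.size, d=1 / fs)
f_dom = freqs[np.argmax(spectrum)]          # dominant frequency [Hz]

# Average over many dominant periods (cf. the 20 s window used above).
n_periods = 20.0 * f_dom
print(f"dominant frequency: {f_dom:.2f} Hz -> 20 s covers {n_periods:.0f} periods")
```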

  1. GPU and APU computations of Finite Time Lyapunov Exponent fields

    NASA Astrophysics Data System (ADS)

    Conti, Christian; Rossinelli, Diego; Koumoutsakos, Petros

    2012-03-01

    We present GPU and APU accelerated computations of Finite-Time Lyapunov Exponent (FTLE) fields. The calculation of FTLEs is a computationally intensive process, as, in order to obtain the sharp ridges associated with Lagrangian Coherent Structures, an extensive resampling of the flow field is required. The computational performance of this resampling is limited by the memory bandwidth of the underlying computer architecture. The present technique harnesses data-parallel execution on many-core architectures and relies on fast and accurate evaluations of moment-conserving functions for the mesh-to-particle interpolations. We demonstrate how the computation of FTLEs can be efficiently performed on a GPU and on an APU through OpenCL, and we report over one order of magnitude improvement over multi-threaded executions in FTLE computations of bluff body flows.
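
    For reference, a plain-CPU sketch of the FTLE pipeline (advect a tracer grid, form the flow-map gradient, take the largest eigenvalue of the Cauchy-Green tensor) is shown below; the double-gyre test flow and the forward-Euler integrator are illustrative assumptions, and it is exactly these data-parallel loops that GPU/APU kernels accelerate.

```python
# CPU sketch of an FTLE field computation: advect a grid of tracers, form the
# flow-map gradient by finite differences, and take the largest eigenvalue of
# the Cauchy-Green tensor C = F^T F. Test flow: the standard double gyre.

import numpy as np

def velocity(x, y, t):                       # assumed double-gyre test flow
    a = 0.1 * np.sin(2 * np.pi * t / 10)
    f = a * x**2 + (1 - 2 * a) * x
    dfdx = 2 * a * x + (1 - 2 * a)
    u = -np.pi * 0.1 * np.sin(np.pi * f) * np.cos(np.pi * y)
    v = np.pi * 0.1 * np.cos(np.pi * f) * np.sin(np.pi * y) * dfdx
    return u, v

x, y = np.meshgrid(np.linspace(0, 2, 200), np.linspace(0, 1, 100))
X, Y, T, dt = x.copy(), y.copy(), 15.0, 0.05
for k in range(int(T / dt)):                 # forward-Euler particle advection
    u, v = velocity(X, Y, k * dt)
    X, Y = X + dt * u, Y + dt * v

dXdy, dXdx = np.gradient(X, y[:, 0], x[0, :])
dYdy, dYdx = np.gradient(Y, y[:, 0], x[0, :])
c11 = dXdx**2 + dYdx**2                      # entries of the 2x2 tensor C
c12 = dXdx * dXdy + dYdx * dYdy
c22 = dXdy**2 + dYdy**2
lam = 0.5 * (c11 + c22) + np.sqrt(0.25 * (c11 - c22)**2 + c12**2)
ftle = np.log(np.sqrt(lam)) / T              # sharp ridges mark the LCS
print("max FTLE:", float(ftle.max()))
```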

  2. An approximate solution to improve computational efficiency of impedance-type payload load prediction

    NASA Technical Reports Server (NTRS)

    White, C. W.

    1981-01-01

    The computational efficiency of the impedance-type loads prediction method was studied. Three goals were addressed: devise a method to make the impedance method operate more efficiently in the computer; assess the accuracy and convenience of the method for determining the effect of design changes; and investigate the use of the method to identify design changes for the reduction of payload loads. The method is suitable for calculating dynamic response in either the frequency or the time domain. It is concluded that: the choice of an orthogonal coordinate system allows the impedance method to operate more efficiently in the computer; the approximate mode impedance technique is adequate for determining the effect of design changes, and is applicable to both statically determinate and statically indeterminate payload attachments; and beneficial design changes to reduce payload loads can be identified by the combined application of impedance techniques and energy distribution review techniques.

  3. Efficient Maintenance and Update of Nonbonded Lists in Macromolecular Simulations.

    PubMed

    Chowdhury, Rezaul; Beglov, Dmitri; Moghadasi, Mohammad; Paschalidis, Ioannis Ch; Vakili, Pirooz; Vajda, Sandor; Bajaj, Chandrajit; Kozakov, Dima

    2014-10-14

    Molecular mechanics and dynamics simulations use distance-based cutoff approximations for faster computation of pairwise van der Waals and electrostatic energy terms. These approximations traditionally use a precalculated and periodically updated list of interacting atom pairs, known as the "nonbonded neighborhood lists" or nblists, in order to reduce the overhead of finding atom pairs that are within the distance cutoff. The size of nblists grows linearly with the number of atoms in the system and superlinearly with the distance cutoff, and as a result, they require a significant amount of memory for large molecular systems. The high space usage leads to poor cache performance, which slows computation for large distance cutoffs. Also, the high cost of updates means that one cannot afford to keep the data structure always synchronized with the configuration of the molecules when efficiency is at stake. We propose a dynamic octree data structure for implicit maintenance of nblists using space linear in the number of atoms but independent of the distance cutoff. The list can be updated very efficiently as the coordinates of atoms change during the simulation. Unlike explicit nblists, a single octree works for all distance cutoffs. In addition, the octree is a cache-friendly data structure, and hence it is less prone to cache-miss slowdowns on modern memory hierarchies than nblists. Octrees use almost two orders of magnitude less memory, which is crucial for the simulation of large systems, and while they are comparable in performance to nblists when the distance cutoff is small, they outperform nblists for larger systems and large cutoffs. Our tests show that the octree implementation is approximately 1.5 times faster in practical use-case scenarios compared to nblists.
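
    A minimal point-region octree with a cutoff query conveys the idea; this sketch omits the incremental-update machinery that makes the paper's structure efficient during dynamics, and all sizes and coordinates are mock values.

```python
# Minimal point-region octree sketch for cutoff-independent neighbor queries
# (illustrative; the paper's structure additionally supports efficient
# incremental updates as atoms move during the simulation).

import numpy as np

class Octree:
    def __init__(self, pts, idx, center, half, leaf_size=8):
        self.center, self.half = center, half
        if len(idx) <= leaf_size:
            self.idx, self.children = idx, None
            return
        self.idx, self.children = None, []
        side = pts[idx] >= center                 # per-axis octant code
        code = side[:, 0] * 4 + side[:, 1] * 2 + side[:, 2]
        for o in range(8):
            sub = idx[code == o]
            if len(sub):
                sign = np.array([(o >> 2) & 1, (o >> 1) & 1, o & 1]) * 2 - 1
                self.children.append(
                    Octree(pts, sub, center + sign * half / 2, half / 2, leaf_size))

    def query(self, pts, x, cutoff, out):
        """Collect indices of points within `cutoff` of position x."""
        # Prune subtrees whose bounding cube cannot intersect the query ball.
        if np.any(np.abs(x - self.center) > self.half + cutoff):
            return
        if self.children is None:
            d = np.linalg.norm(pts[self.idx] - x, axis=1)
            out.extend(self.idx[d <= cutoff].tolist())
        else:
            for child in self.children:
                child.query(pts, x, cutoff, out)

pts = np.random.rand(10000, 3) * 50.0          # mock atom coordinates [A]
tree = Octree(pts, np.arange(len(pts)), np.full(3, 25.0), 25.0)
nbrs = []
tree.query(pts, pts[0], 9.0, nbrs)             # one tree serves any cutoff
print(len(nbrs), "atoms within 9 A of atom 0")
```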

  4. Classification of breast tissue in mammograms using efficient coding.

    PubMed

    Costa, Daniel D; Campos, Lúcio F; Barros, Allan K

    2011-06-24

    Female breast cancer is the major cause of death by cancer in western countries. Efforts in computer vision have been made to improve the diagnostic accuracy achieved by radiologists. Some methods for lesion diagnosis in mammogram images have been developed based on principal component analysis, which has been used for efficient coding of signals, and on 2D Gabor wavelets, which are used in computer vision applications and in modeling biological vision. In this work, we present a methodology that uses efficient coding along with linear discriminant analysis to distinguish between mass and non-mass in 5090 regions of interest from mammograms. The results show that the best success rates reached with Gabor wavelets and principal component analysis were 85.28% and 87.28%, respectively. In comparison, the efficient coding model presented here reached up to 90.07%. Altogether, the results demonstrate that independent component analysis successfully performed the efficient coding needed to discriminate mass from non-mass tissues. In addition, we observed that LDA with ICA bases showed high predictive performance for some datasets, providing significant support for a more detailed clinical investigation.

  5. Calibration of aero-structural reduced order models using full-field experimental measurements

    NASA Astrophysics Data System (ADS)

    Perez, R.; Bartram, G.; Beberniss, T.; Wiebe, R.; Spottswood, S. M.

    2017-03-01

    The structural response of hypersonic aircraft panels is a multi-disciplinary problem, where the nonlinear structural dynamics, aerodynamics, and heat transfer models are coupled. A clear understanding of the impact of high-speed flow effects on the structural response, and the potential influence of the structure on the local environment, is needed in order to prevent the design of overly conservative structures, a common problem in past hypersonic programs. The current work investigates these challenges from a structures perspective. To this end, the first part of this investigation looks at the modeling of the response of a rectangular panel to an external heating source (thermo-structural coupling), where the temperature effect on the structure is obtained from forward looking infrared (FLIR) measurements and the displacement via 3D digital image correlation (DIC). The second part of the study uses data from a previous series of wind-tunnel experiments, performed to investigate the response of a compliant panel to the effects of high-speed flow, to train a pressure surrogate model. In this case, the panel aero-loading is obtained from fast-response pressure sensitive paint (PSP) measurements, both directly and from the pressure surrogate model. The result of this investigation is the use of full-field experimental measurements to update the structural model and to train a computationally efficient model of the loading environment. The use of reduced order models, informed by these full-field physical measurements, is a significant step toward the development of accurate simulation models of complex structures that are computationally tractable.

  6. Design of modular control system for grain dryers

    NASA Astrophysics Data System (ADS)

    He, Gaoqing; Liu, Yanhua; Zu, Yuan

    In order to effectively control the temperature of the grain drying bin, the grain, and the air outlet, as well as the grain moisture, a control system for the 5HCY-35 dryer was designed based on an MCU, adapting to the drying conditions of all grains with high drying efficiency, long service life, and reduced manual labor. The system includes: a module for constant-temperature and temperature-difference control in the drying bin, constant-temperature control of the heating furnace, on-line moisture testing, circulation speed control for a variety of grains, and a human-computer interaction interface. A spatial-curve simulation, which takes moisture as the control objective, regulates the constant temperature and the temperature difference in the drying bin according to parameters preset by the user or taken from a lookup table, reducing grain cracking and safeguarding the seed germination percentage. The system realizes intelligent control of efficient and varied drying, with good scalability and high quality.

  7. Adaptive mixed finite element methods for Darcy flow in fractured porous media

    NASA Astrophysics Data System (ADS)

    Chen, Huangxin; Salama, Amgad; Sun, Shuyu

    2016-10-01

    In this paper, we propose adaptive mixed finite element methods for simulating the single-phase Darcy flow in two-dimensional fractured porous media. The reduced model that we use for the simulation is a discrete fracture model coupling Darcy flows in the matrix and the fractures, and the fractures are modeled by one-dimensional entities. The Raviart-Thomas mixed finite element methods are utilized for the solution of the coupled Darcy flows in the matrix and the fractures. In order to improve the efficiency of the simulation, we use adaptive mixed finite element methods based on novel residual-based a posteriori error estimators. In addition, we develop an efficient upscaling algorithm to compute the effective permeability of the fractured porous media. Several interesting examples of Darcy flow in the fractured porous media are presented to demonstrate the robustness of the algorithm.

  8. Wavelet compression of multichannel ECG data by enhanced set partitioning in hierarchical trees algorithm.

    PubMed

    Sharifahmadian, Ershad

    2006-01-01

    The set partitioning in hierarchical trees (SPIHT) algorithm is a very effective and computationally simple technique for image and signal compression. Here, the author modified the algorithm to provide even better performance than the original SPIHT algorithm. The enhanced set partitioning in hierarchical trees (ESPIHT) algorithm is faster than the SPIHT algorithm. In addition, the proposed algorithm reduces the number of bits in the bit stream that is stored or transmitted. It was applied to the compression of multichannel ECG data, and a specific procedure based on the modified algorithm is presented for more efficient compression of multichannel ECG data. The method was employed on selected records from the MIT-BIH arrhythmia database. According to the experiments, the proposed method attained significant results in the compression of multichannel ECG data. Furthermore, in order to compress a single signal that is stored for a long time, the proposed multichannel compression method can be utilized efficiently.
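
    The transform-and-threshold idea underlying such coders can be sketched as follows; this is not SPIHT/ESPIHT itself (which additionally bit-plane-encodes the coefficients with set partitioning), and the wavelet choice and mock ECG signal are assumptions.

```python
# Illustrative wavelet compression of one ECG channel: multilevel DWT,
# keep only the largest coefficients, reconstruct, and report distortion.
# Not SPIHT/ESPIHT itself, which encodes the retained coefficients far more
# compactly using set partitioning in hierarchical trees.

import numpy as np
import pywt

fs = 360                                        # MIT-BIH sampling rate [Hz]
t = np.arange(0, 10, 1 / fs)
ecg = np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(t.size)  # mock signal

coeffs = pywt.wavedec(ecg, "bior4.4", level=5)  # multilevel wavelet transform
flat = np.concatenate(coeffs)
thresh = np.quantile(np.abs(flat), 0.90)        # keep the largest 10% of coeffs
kept = [np.where(np.abs(c) >= thresh, c, 0.0) for c in coeffs]

recon = pywt.waverec(kept, "bior4.4")[: ecg.size]
prd = 100 * np.linalg.norm(ecg - recon) / np.linalg.norm(ecg)
print(f"nonzeros kept: {sum(np.count_nonzero(c) for c in kept)} "
      f"of {flat.size}, PRD = {prd:.1f}%")
```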

  9. Compression of computer generated phase-shifting hologram sequence using AVC and HEVC

    NASA Astrophysics Data System (ADS)

    Xing, Yafei; Pesquet-Popescu, Béatrice; Dufaux, Frederic

    2013-09-01

    With the capability of achieving twice the compression ratio of Advanced Video Coding (AVC) with similar reconstruction quality, High Efficiency Video Coding (HEVC) is expected to become the new leading video coding technique. In order to reduce the storage and transmission burden of digital holograms, in this paper we propose to use HEVC for compressing phase-shifting digital hologram sequences (PSDHS). By simulating phase-shifting digital holography (PSDH) interferometry, interference patterns between illuminated three-dimensional (3D) virtual objects and the stepwise phase-changed reference wave are generated as digital holograms. The hologram sequences are obtained from the movement of the virtual objects and compressed with AVC and HEVC. The experimental results show that AVC and HEVC are efficient at compressing PSDHS, with HEVC giving better performance. Good compression rate and reconstruction quality can be obtained at bitrates above 15,000 kbps.

  10. A secure and efficient authentication and key agreement scheme based on ECC for telecare medicine information systems.

    PubMed

    Xu, Xin; Zhu, Ping; Wen, Qiaoyan; Jin, Zhengping; Zhang, Hua; He, Lian

    2014-01-01

    In the field of the Telecare Medicine Information System, recent research has focused on providing more convenient and secure healthcare delivery services for patients. In order to protect sensitive information, various mechanisms such as access control have been proposed to safeguard patients' privacy in this system. However, these schemes suffered from certain security defects and incurred high computational cost, making them unsuitable for the telecare medicine information system. In this paper, based on elliptic curve cryptography, we propose a secure and efficient two-factor mutual authentication and key agreement scheme that reduces the computational cost. The scheme provides patient anonymity by employing dynamic identities. Compared with other related protocols, the security analysis and performance evaluation show that our scheme withstands several well-known attacks and performs better in the telecare medicine information system.
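
    The ECC primitive that such schemes build on can be sketched with an ephemeral elliptic-curve Diffie-Hellman exchange plus key derivation; this is not the paper's authentication protocol (which adds two-factor verification and dynamic identities), and the curve choice and info label are assumptions.

```python
# Illustrative ECC key agreement underlying such schemes: ephemeral ECDH with
# HKDF key derivation using the `cryptography` package. Not the paper's
# protocol; the curve and the `info` label below are assumptions.

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Each side generates an ephemeral key pair on a standard curve.
patient_priv = ec.generate_private_key(ec.SECP256R1())
server_priv = ec.generate_private_key(ec.SECP256R1())

# Exchange public keys, then compute the same shared secret on both sides.
k1 = patient_priv.exchange(ec.ECDH(), server_priv.public_key())
k2 = server_priv.exchange(ec.ECDH(), patient_priv.public_key())
assert k1 == k2

# Derive a 256-bit session key from the shared secret.
session_key = HKDF(algorithm=hashes.SHA256(), length=32,
                   salt=None, info=b"tmis-session").derive(k1)
print("derived session key:", session_key.hex()[:16], "...")
```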

  11. XRootD popularity on hadoop clusters

    NASA Astrophysics Data System (ADS)

    Meoni, Marco; Boccali, Tommaso; Magini, Nicolò; Menichetti, Luca; Giordano, Domenico; CMS Collaboration

    2017-10-01

    Performance data and metadata of the computing operations at the CMS experiment are collected through a distributed monitoring infrastructure, currently relying on a traditional Oracle database system. This paper shows how to harness Big Data architectures in order to improve the throughput and the efficiency of such monitoring. A large set of operational data - user activities, job submissions, resources, file transfers, site efficiencies, software releases, network traffic, machine logs - is being injected into a readily available Hadoop cluster, via several data streamers. The collected metadata is further organized running fast arbitrary queries; this offers the ability to test several Map&Reduce-based frameworks and measure the system speed-up when compared to the original database infrastructure. By leveraging a quality Hadoop data store and enabling an analytics framework on top, it is possible to design a mining platform to predict dataset popularity and discover patterns and correlations.

  12. Parallel high-precision orbit propagation using the modified Picard-Chebyshev method

    NASA Astrophysics Data System (ADS)

    Koblick, Darin C.

    2012-03-01

    The modified Picard-Chebyshev method, when run in parallel, is thought to be more accurate and faster than the most efficient sequential numerical integration techniques when applied to orbit propagation problems. Previous experiments have shown that the modified Picard-Chebyshev method can achieve up to a one-order-of-magnitude speedup over the 12th-order Runge-Kutta-Nystrom method. For this study, the accuracy and computational time of the modified Picard-Chebyshev method, using the Java Astrodynamics Toolkit high-precision force model, are evaluated to assess its runtime performance. Simulation results of the modified Picard-Chebyshev method, implemented in MATLAB and the MATLAB Parallel Computing Toolbox, are compared against the most efficient first- and second-order Ordinary Differential Equation (ODE) solvers. A total of six processors were used to assess the runtime performance of the modified Picard-Chebyshev method. It was found that for all orbit propagation test cases, where the gravity model was simulated at higher degree and order (above 225, to increase computational overhead), the modified Picard-Chebyshev method was faster, by as much as a factor of two, than the other ODE solvers tested.

  13. Reduced Order Modeling of Combustion Instability in a Gas Turbine Model Combustor

    NASA Astrophysics Data System (ADS)

    Arnold-Medabalimi, Nicholas; Huang, Cheng; Duraisamy, Karthik

    2017-11-01

    Hydrocarbon fuel based propulsion systems are expected to remain relevant in aerospace vehicles for the foreseeable future. Design of these devices is complicated by combustion instabilities. The capability to model and predict these effects at reduced computational cost is a requirement for both design and control of these devices. This work focuses on computational studies on a dual swirl model gas turbine combustor in the context of reduced order model development. Full fidelity simulations are performed utilizing URANS and Hybrid RANS-LES with finite rate chemistry. Following this, data decomposition techniques are used to extract a reduced basis representation of the unsteady flow field. These bases are first used to identify sensor locations to guide experimental interrogations and controller feedback. Following this, initial results on developing a control-oriented reduced order model (ROM) will be presented. The capability of the ROM will be further assessed based on different operating conditions and geometric configurations.
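
    The basis-extraction step is typically a snapshot POD, which can be sketched via the SVD; the snapshot matrix below is mock data standing in for the combustor flow fields, and the 99% energy criterion is an assumed truncation rule.

```python
# Sketch of the snapshot-POD step used to extract a reduced basis from
# unsteady CFD data (illustrative; mock snapshots stand in for the combustor
# fields, and the 99% energy cutoff is an assumed truncation rule).

import numpy as np

snapshots = np.random.rand(5000, 400)        # (n_dof, n_snapshots) mock data
mean = snapshots.mean(axis=1, keepdims=True)
U, s, _ = np.linalg.svd(snapshots - mean, full_matrices=False)

energy = np.cumsum(s**2) / np.sum(s**2)      # cumulative modal energy
r = int(np.searchsorted(energy, 0.99)) + 1   # modes capturing 99% of energy
basis = U[:, :r]                             # POD basis spanning the ROM space
print(f"{r} POD modes retain 99% of the fluctuation energy")
# Reduced coordinates of any snapshot q: a = basis.T @ (q - mean[:, 0])
```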

  14. High-order computational fluid dynamics tools for aircraft design

    PubMed Central

    Wang, Z. J.

    2014-01-01

    Most forecasts predict an annual airline traffic growth rate between 4.5 and 5% in the foreseeable future. To sustain that growth, the environmental impact of aircraft cannot be ignored. Future aircraft must have much better fuel economy, dramatically less greenhouse gas emissions and noise, in addition to better performance. Many technical breakthroughs must take place to achieve the aggressive environmental goals set up by governments in North America and Europe. One of these breakthroughs will be physics-based, highly accurate and efficient computational fluid dynamics and aeroacoustics tools capable of predicting complex flows over the entire flight envelope and through an aircraft engine, and computing aircraft noise. Some of these flows are dominated by unsteady vortices of disparate scales, often highly turbulent, and they call for higher-order methods. As these tools will be integral components of a multi-disciplinary optimization environment, they must be efficient to impact design. Ultimately, the accuracy, efficiency, robustness, scalability and geometric flexibility will determine which methods will be adopted in the design process. This article explores these aspects and identifies pacing items. PMID:25024419

  15. Yahoo! Compute Coop (YCC). A Next-Generation Passive Cooling Design for Data Centers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Robison, AD; Page, Christina; Lytle, Bob

    The purpose of the Yahoo! Compute Coop (YCC) project is to research, design, build and implement a greenfield "efficient data factory" and to specifically demonstrate that the YCC concept is feasible for large facilities housing tens of thousands of heat-producing computing servers. The project scope for the Yahoo! Compute Coop technology includes: - Analyzing and implementing ways in which to drastically decrease energy consumption and waste output. - Analyzing the laws of thermodynamics and implementing naturally occurring environmental effects in order to maximize the "free-cooling" for large data center facilities. "Free cooling" is the direct usage of outside air to cool the servers vs. traditional "mechanical cooling" which is supplied by chillers or other Dx units. - Redesigning and simplifying building materials and methods. - Shortening and simplifying build-to-operate schedules while at the same time reducing initial build and operating costs. Selected for its favorable climate, the greenfield project site is located in Lockport, NY. Construction on the 9.0 MW critical load data center facility began in May 2009, with the fully operational facility deployed in September 2010. The relatively low initial build cost, compatibility with current server and network models, and the efficient use of power and water are all key features that make it a highly compatible and globally implementable design innovation for the data center industry. Yahoo! Compute Coop technology is designed to achieve 99.98% uptime availability. This integrated building design allows for free cooling 99% of the year via the building's unique shape and orientation, as well as the server physical configuration.

  16. Computing volume potentials for noninvasive imaging of cardiac excitation.

    PubMed

    van der Graaf, A W Maurits; Bhagirath, Pranav; van Driel, Vincent J H M; Ramanna, Hemanth; de Hooge, Jacques; de Groot, Natasja M S; Götte, Marco J W

    2015-03-01

    In noninvasive imaging of cardiac excitation, the use of body surface potentials (BSP) rather than body volume potentials (BVP) has been favored due to enhanced computational efficiency and reduced modeling effort. Nowadays, increased computational power and the availability of open-source software enable the calculation of BVP for clinical purposes. In order to illustrate the possible advantages of this approach, the explanatory power of BVP is investigated using a rectangular tank filled with an electrolytic conductor and a patient-specific three-dimensional model. MRI images of the tank and of a patient were obtained in three orthogonal directions using a turbo spin echo MRI sequence. The MRI images were segmented in three dimensions using custom-written software. Gmsh software was used for mesh generation. BVP were computed using a transfer matrix and FEniCS software. The solution for 240,000 nodes, corresponding to a resolution of 5 mm throughout the thorax volume, was computed in 3 minutes. The tank experiment revealed that an increased electrode surface renders the position of the 4 V equipotential plane insensitive to mesh cell size and reduces simulated deviations. In the patient-specific model, the impact of assigning a different conductivity to lung tissue on the distribution of volume potentials could be visualized. Generation of high-quality volume meshes and computation of BVP at a resolution of 5 mm is feasible using generally available software and hardware. Estimation of BVP may lead to an improved understanding of the genesis of BSP and of the sources of local inaccuracies.

  17. Hierarchical Parallelization of Gene Differential Association Analysis

    PubMed Central

    2011-01-01

    Background: Microarray gene differential expression analysis is a widely used technique that deals with high dimensional data and is computationally intensive for permutation-based procedures. Microarray gene differential association analysis is even more computationally demanding and must take advantage of multicore computing technology, which is the driving force behind increasing compute power in recent years. In this paper, we present a two-layer hierarchical parallel implementation of gene differential association analysis. It takes advantage of both fine- and coarse-grain (with granularity defined by the frequency of communication) parallelism in order to effectively leverage the non-uniform nature of parallel processing available in the cutting-edge systems of today. Results: Our results show that this hierarchical strategy matches data sharing behavior to the properties of the underlying hardware, thereby reducing the memory and bandwidth needs of the application. The resulting improved efficiency reduces computation time and allows the gene differential association analysis code to scale its execution with the number of processors. The code and biological data used in this study are downloadable from http://www.urmc.rochester.edu/biostat/people/faculty/hu.cfm. Conclusions: The performance sweet spot occurs when using a number of threads per MPI process that allows the working sets of the corresponding MPI processes running on the multicore to fit within the machine cache. Hence, we suggest that practitioners follow this principle in selecting the appropriate number of MPI processes and threads within each MPI process for their cluster configurations. We believe that the principles of this hierarchical approach to parallelization can be utilized in the parallelization of other computationally demanding kernels. PMID:21936916

  18. Hierarchical parallelization of gene differential association analysis.

    PubMed

    Needham, Mark; Hu, Rui; Dwarkadas, Sandhya; Qiu, Xing

    2011-09-21

    Microarray gene differential expression analysis is a widely used technique that deals with high dimensional data and is computationally intensive for permutation-based procedures. Microarray gene differential association analysis is even more computationally demanding and must take advantage of multicore computing technology, which is the driving force behind increasing compute power in recent years. In this paper, we present a two-layer hierarchical parallel implementation of gene differential association analysis. It takes advantage of both fine- and coarse-grain (with granularity defined by the frequency of communication) parallelism in order to effectively leverage the non-uniform nature of parallel processing available in the cutting-edge systems of today. Our results show that this hierarchical strategy matches data sharing behavior to the properties of the underlying hardware, thereby reducing the memory and bandwidth needs of the application. The resulting improved efficiency reduces computation time and allows the gene differential association analysis code to scale its execution with the number of processors. The code and biological data used in this study are downloadable from http://www.urmc.rochester.edu/biostat/people/faculty/hu.cfm. The performance sweet spot occurs when using a number of threads per MPI process that allows the working sets of the corresponding MPI processes running on the multicore to fit within the machine cache. Hence, we suggest that practitioners follow this principle in selecting the appropriate number of MPI processes and threads within each MPI process for their cluster configurations. We believe that the principles of this hierarchical approach to parallelization can be utilized in the parallelization of other computationally demanding kernels.

  19. Uncertainty Aware Structural Topology Optimization Via a Stochastic Reduced Order Model Approach

    NASA Technical Reports Server (NTRS)

    Aguilo, Miguel A.; Warner, James E.

    2017-01-01

    This work presents a stochastic reduced order modeling strategy for the quantification and propagation of uncertainties in topology optimization. Uncertainty aware optimization problems can be computationally complex due to the substantial number of model evaluations that are necessary to accurately quantify and propagate uncertainties. This computational complexity is greatly magnified if a high-fidelity, physics-based numerical model is used for the topology optimization calculations. Stochastic reduced order model (SROM) methods are applied here to effectively 1) alleviate the prohibitive computational cost associated with an uncertainty aware topology optimization problem; and 2) quantify and propagate the inherent uncertainties due to design imperfections. A generic SROM framework that transforms the uncertainty aware, stochastic topology optimization problem into a deterministic optimization problem that relies only on independent calls to a deterministic numerical model is presented. This approach facilitates the use of existing optimization and modeling tools to accurately solve the uncertainty aware topology optimization problems in a fraction of the computational demand required by Monte Carlo methods. Finally, an example in structural topology optimization is presented to demonstrate the effectiveness of the proposed uncertainty aware structural topology optimization approach.

  20. Comparison of different models for non-invasive FFR estimation

    NASA Astrophysics Data System (ADS)

    Mirramezani, Mehran; Shadden, Shawn

    2017-11-01

    Coronary artery disease is a leading cause of death worldwide. Fractional flow reserve (FFR), derived from invasively measuring the pressure drop across a stenosis, is considered the gold standard for diagnosing disease severity and the need for treatment. Non-invasive estimation of FFR has gained recent attention for its potential to reduce patient risk and procedural cost versus invasive FFR measurement. Non-invasive FFR can be obtained by using image-based computational fluid dynamics to simulate blood flow and pressure in a patient-specific coronary model. However, 3D simulations require extensive effort for model construction and numerical computation, which limits their routine use. In this study we compare (ordered by increasing computational cost/complexity): reduced-order algebraic models of the pressure drop across a stenosis; 1D, 2D (multiring), and 3D CFD models; as well as 3D FSI, for the computation of FFR in idealized and patient-specific stenosis geometries. We demonstrate the ability of an appropriate reduced-order algebraic model to closely predict FFR when compared to FFR from a full 3D simulation. This work was supported by the NIH, Grant No. R01-HL103419.
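
    A hedged sketch of one such reduced-order algebraic model is shown below: a viscous (Poiseuille-type) plus expansion-loss pressure-drop law across the lesion, from which FFR follows directly. The geometry, flow rate, and loss terms are illustrative assumptions, not the model or values used in the study.

```python
# Hedged sketch of a reduced-order algebraic FFR estimate: viscous plus
# sudden-expansion pressure losses across a stenosis. All geometry and flow
# values are illustrative assumptions.

import numpy as np

rho, mu = 1060.0, 3.5e-3              # blood density [kg/m^3], viscosity [Pa s]
d0, ds, L = 3.0e-3, 1.8e-3, 1.0e-2    # healthy/stenotic diameter, length [m]
q = 2.0e-6                            # hyperemic flow rate [m^3/s]

a0, a_s = np.pi * d0**2 / 4, np.pi * ds**2 / 4
kv = 32 * mu * L / (a_s * ds**2)              # Poiseuille-type viscous term
kt = rho / 2 * (1 / a_s - 1 / a0) ** 2        # sudden-expansion loss term
dp = kv * q + kt * q**2                       # pressure drop [Pa]

p_aortic = 100 * 133.32                       # 100 mmHg in Pa
ffr = (p_aortic - dp) / p_aortic              # distal-to-aortic pressure ratio
print(f"dP = {dp/133.32:.1f} mmHg, FFR = {ffr:.2f}  (<0.80 suggests treatment)")
```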

  1. Design of a Variational Multiscale Method for Turbulent Compressible Flows

    NASA Technical Reports Server (NTRS)

    Diosady, Laslo Tibor; Murman, Scott M.

    2013-01-01

    A spectral-element framework is presented for the simulation of subsonic compressible high-Reynolds-number flows. The focus of the work is maximizing the efficiency of the computational schemes to enable unsteady simulations with a large number of spatial and temporal degrees of freedom. A collocation scheme is combined with optimized computational kernels to provide a residual evaluation with computational cost independent of order of accuracy up to 16th order. The optimized residual routines are used to develop a low-memory implicit scheme based on a matrix-free Newton-Krylov method. A preconditioner based on the finite-difference diagonalized ADI scheme is developed which maintains the low memory of the matrix-free implicit solver, while providing improved convergence properties. Emphasis on low memory usage throughout the solver development is leveraged to implement a coupled space-time DG solver which may offer further efficiency gains through adaptivity in both space and time.
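
    The matrix-free Newton-Krylov idea generalizes beyond this solver: the Jacobian is never assembled, and the Krylov iteration only needs Jacobian-vector products obtained from residual differences. A small sketch with a stand-in nonlinear residual (not the compressible-flow equations) follows, using SciPy's newton_krylov.

```python
# Sketch of a matrix-free Newton-Krylov solve: GMRES inside Newton's method,
# with Jacobian-vector products approximated by finite differences of the
# residual. A small nonlinear diffusion problem stands in for the flow solver.

import numpy as np
from scipy.optimize import newton_krylov

n, h = 100, 1.0 / 101

def residual(u):
    """Discrete residual of -u'' + u^3 = 1 with homogeneous Dirichlet BCs."""
    up = np.concatenate(([0.0], u, [0.0]))        # pad with boundary values
    lap = (up[:-2] - 2 * up[1:-1] + up[2:]) / h**2
    return -lap + u**3 - 1.0

u = newton_krylov(residual, np.zeros(n), method="gmres")
print("max |residual| at solution:", np.abs(residual(u)).max())
```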

  2. Efficient computation of aerodynamic influence coefficients for aeroelastic analysis on a transputer network

    NASA Technical Reports Server (NTRS)

    Janetzke, David C.; Murthy, Durbha V.

    1991-01-01

    Aeroelastic analysis is multi-disciplinary and computationally expensive. Hence, it can greatly benefit from parallel processing. As part of an effort to develop an aeroelastic capability on a distributed memory transputer network, a parallel algorithm for the computation of aerodynamic influence coefficients is implemented on a network of 32 transputers. The aerodynamic influence coefficients are calculated using a 3-D unsteady aerodynamic model and a parallel discretization. Efficiencies up to 85 percent were demonstrated using 32 processors. The effect of subtask ordering, problem size, and network topology are presented. A comparison to results on a shared memory computer indicates that higher speedup is achieved on the distributed memory system.

  3. 3D Higher Order Modeling in the BEM/FEM Hybrid Formulation

    NASA Technical Reports Server (NTRS)

    Fink, P. W.; Wilton, D. R.

    2000-01-01

    Higher order divergence- and curl-conforming bases have been shown to provide significant benefits, in both convergence rate and accuracy, in the 2D hybrid finite element/boundary element formulation (P. Fink and D. Wilton, National Radio Science Meeting, Boulder, CO, Jan. 2000). A critical issue in achieving the potential accuracy of the approach is the accurate evaluation of all matrix elements. These involve products of high order polynomials and, in some instances, singular Green's functions. In the 2D formulation, the use of a generalized Gaussian quadrature method was found to greatly facilitate the computation and to improve the accuracy of the boundary integral equation self-terms. In this paper, a 3D hybrid electric field formulation employing higher order bases and higher order elements is presented. The improvements in convergence rate and accuracy, compared to those resulting from lower order modeling, are established. Techniques developed to facilitate the computation of the boundary integral self-terms are also shown to improve the accuracy of these terms. Finally, simple preconditioning techniques are used in conjunction with iterative solution procedures to solve the resulting linear system efficiently. In order to handle the boundary integral singularities in the 3D formulation, the parent element (either a triangle or a rectangle) is subdivided into a set of sub-triangles with a common vertex at the singularity. The contribution to the integral from each of the sub-triangles is computed using the Duffy transformation to remove the singularity. This method is shown to greatly facilitate the self-term computation when the bases are of higher order. In addition, the sub-triangles can be further divided to achieve near-arbitrary accuracy in the self-term computation. An efficient method for subdividing the parent element is presented. The accuracy obtained using higher order bases is compared to that obtained using lower order bases when the number of unknowns is approximately equal. Also, convergence rates obtained using higher order bases are compared to those obtained with lower order bases for selected sample problems.
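
    The Duffy transformation itself is compact enough to demonstrate: mapping the unit square onto a triangle whose collapsed edge sits at the singular vertex introduces a Jacobian that cancels a 1/r singularity. The quadrature order and test integrand below are assumptions.

```python
# Sketch of the Duffy transformation for singular self-terms: mapping the
# unit square to a triangle with the singular vertex at the origin yields a
# Jacobian u that cancels the 1/r singularity. Here we integrate 1/r over the
# reference triangle (0,0)-(1,0)-(0,1); the exact value is ln(1 + sqrt(2)).

import numpy as np

gauss, w = np.polynomial.legendre.leggauss(8)    # 8-point Gauss rule on [-1, 1]
u = 0.5 * (gauss + 1)                            # map nodes to [0, 1]
wu = 0.5 * w

total = 0.0
for ui, wi in zip(u, wu):
    for vj, wj in zip(u, wu):
        x, y = ui, ui * vj                       # Duffy map, Jacobian = u
        r = np.hypot(x, y)
        total += wi * wj * (1.0 / r) * ui        # u/r = 1/sqrt(1+v^2): smooth
print(total, "vs exact", np.log(1 + np.sqrt(2)))
```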

  4. I/O-Efficient Scientific Computation Using TPIE

    NASA Technical Reports Server (NTRS)

    Vengroff, Darren Erik; Vitter, Jeffrey Scott

    1996-01-01

    In recent years, input/output (I/O)-efficient algorithms for a wide variety of problems have appeared in the literature. However, systems specifically designed to assist programmers in implementing such algorithms have remained scarce. TPIE is a system designed to support I/O-efficient paradigms for problems from a variety of domains, including computational geometry, graph algorithms, and scientific computation. The TPIE interface frees programmers from having to deal not only with explicit read and write calls, but also the complex memory management that must be performed for I/O-efficient computation. In this paper we discuss applications of TPIE to problems in scientific computation. We discuss algorithmic issues underlying the design and implementation of the relevant components of TPIE and present performance results of programs written to solve a series of benchmark problems using our current TPIE prototype. Some of the benchmarks we present are based on the NAS parallel benchmarks while others are of our own creation. We demonstrate that the central processing unit (CPU) overhead required to manage I/O is small and that even with just a single disk, the I/O overhead of I/O-efficient computation ranges from negligible to the same order of magnitude as CPU time. We conjecture that if we use a number of disks in parallel this overhead can be all but eliminated.

  5. Towards a Versatile Tele-Education Platform for Computer Science Educators Based on the Greek School Network

    ERIC Educational Resources Information Center

    Paraskevas, Michael; Zarouchas, Thomas; Angelopoulos, Panagiotis; Perikos, Isidoros

    2013-01-01

    Nowadays, the growing need for highly qualified computer science educators in modern educational environments is commonplace. This study examines the potential use of the Greek School Network (GSN) to provide a robust and comprehensive e-training course for computer science educators in order to efficiently exploit advanced IT services and establish a…

  6. Damage Detection in Flexible Plates through Reduced-Order Modeling and Hybrid Particle-Kalman Filtering

    PubMed Central

    Capellari, Giovanni; Eftekhar Azam, Saeed; Mariani, Stefano

    2015-01-01

    Health monitoring of lightweight structures, like thin flexible plates, is of interest in several engineering fields. In this paper, a recursive Bayesian procedure is proposed to monitor the health of such structures through data collected by a network of optimally placed inertial sensors. As the main drawback of standard monitoring procedures is linked to their computational costs, two remedies are jointly considered: first, an order reduction of the numerical model used to track the structural dynamics, enforced with proper orthogonal decomposition; and, second, an improved particle filter, which features an extended Kalman update of each evolving particle before the resampling stage. The former remedy can reduce the number of effective degrees-of-freedom of the structural model to a few only (depending on the excitation), whereas the latter allows the evolution of damage to be tracked and its location to be estimated. To assess the effectiveness of the proposed procedure, the case of a plate subject to bending is investigated; it is shown that, when the procedure is appropriately fed by measurements, damage is efficiently and accurately estimated. PMID:26703615

  7. A density matrix-based method for the linear-scaling calculation of dynamic second- and third-order properties at the Hartree-Fock and Kohn-Sham density functional theory levels.

    PubMed

    Kussmann, Jörg; Ochsenfeld, Christian

    2007-11-28

    A density matrix-based time-dependent self-consistent field (D-TDSCF) method for the calculation of dynamic polarizabilities and first hyperpolarizabilities using the Hartree-Fock and Kohn-Sham density functional theory approaches is presented. The D-TDSCF method allows us to reduce the asymptotic scaling behavior of the computational effort from cubic to linear for systems with a nonvanishing band gap. The linear scaling is achieved by combining a density matrix-based reformulation of the TDSCF equations with linear-scaling schemes for the formation of Fock- or Kohn-Sham-type matrices. In our reformulation only potentially linear-scaling matrices enter the formulation and efficient sparse algebra routines can be employed. Furthermore, the corresponding formulas for the first hyperpolarizabilities are given in terms of zeroth- and first-order one-particle reduced density matrices according to Wigner's (2n+1) rule. The scaling behavior of our method is illustrated for first exemplary calculations with systems of up to 1011 atoms and 8899 basis functions.

  8. Multiple multicontrol unitary operations: Implementation and applications

    NASA Astrophysics Data System (ADS)

    Lin, Qing

    2018-04-01

    The efficient implementation of computational tasks is critical to quantum computations. In quantum circuits, multicontrol unitary operations are important components. Here, we present an extremely efficient and direct approach to multiple multicontrol unitary operations without decomposition into CNOT and single-photon gates. With the proposed approach, the number of necessary two-photon operations can be reduced from O(n^3) with the traditional decomposition approach to O(n), which will greatly relax the requirements and make large-scale quantum computation feasible. Moreover, we propose a potential application to the (n-k)-uniform hypergraph state.

  9. Using Volunteer Computing to Study Some Features of Diagonal Latin Squares

    NASA Astrophysics Data System (ADS)

    Vatutin, Eduard; Zaikin, Oleg; Kochemazov, Stepan; Valyaev, Sergey

    2017-12-01

    In this research, the study concerns several features of diagonal Latin squares (DLSs) of small order. The authors suggest an algorithm for computing the minimal and maximal numbers of transversals of DLSs. According to this algorithm, all DLSs of a particular order are generated, and for each square all of its transversals and diagonal transversals are constructed. The algorithm was implemented and applied to DLSs of order at most 7 on a personal computer. The experiment for order 8 was performed in the volunteer computing project Gerasim@home. In addition, the problem of finding pairs of orthogonal DLSs of order 10 was considered and reduced to the Boolean satisfiability problem. The obtained problem turned out to be very hard, therefore it was decomposed into a family of subproblems. In order to solve the problem, the volunteer computing project SAT@home was used. As a result, several dozen pairs of the described kind were found.
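
    For small orders, transversal counting can be illustrated by brute force, as in the sketch below; a transversal selects one cell per row such that columns and symbols are all distinct. The cited projects rely on far more efficient generation and SAT encodings, and the cyclic example square is merely illustrative.

```python
# Sketch of brute-force transversal counting for a Latin square: choose one
# cell per row so that the columns and the symbols are all distinct. This
# naive enumeration is feasible only for the small orders discussed above.

from itertools import permutations

def count_transversals(square):
    n = len(square)
    return sum(
        len({square[r][p[r]] for r in range(n)}) == n
        for p in permutations(range(n))        # p maps each row to a column
    )

# Cyclic Latin square of order 5 (odd-order cyclic squares have transversals).
cyclic5 = [[(i + j) % 5 for j in range(5)] for i in range(5)]
print("transversals of the order-5 cyclic square:", count_transversals(cyclic5))
```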

  10. Application of microarray analysis on computer cluster and cloud platforms.

    PubMed

    Bernau, C; Boulesteix, A-L; Knaus, J

    2013-01-01

    Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.

  11. 77 FR 12528 - Amendments to Commission's Rules of Practice and Procedure-Subparts E and L

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-03-01

    ..., he advocates the use of a cloud computing system in which documents can be filed giving multiple... concerns. Barillo believes that cloud computing would streamline efficiency and reduce staff labor in...

  12. Energy 101: Energy Efficient Data Centers

    ScienceCinema

    None

    2018-04-16

    Data centers provide mission-critical computing functions vital to the daily operation of top U.S. economic, scientific, and technological organizations. These data centers consume large amounts of energy to run and maintain their computer systems, servers, and associated high-performance components—up to 3% of all U.S. electricity powers data centers. And as more information comes online, data centers will consume even more energy. Data centers can become more energy efficient by incorporating features like power-saving "stand-by" modes, energy monitoring software, and efficient cooling systems instead of energy-intensive air conditioners. These and other efficiency improvements to data centers can produce significant energy savings, reduce the load on the electric grid, and help protect the nation by increasing the reliability of critical computer operations.

  13. A high-order strong stability preserving Runge-Kutta method for three-dimensional full waveform modeling and inversion of anelastic models

    NASA Astrophysics Data System (ADS)

    Wang, N.; Shen, Y.; Yang, D.; Bao, X.; Li, J.; Zhang, W.

    2017-12-01

    Accurate and efficient forward modeling methods are important for high-resolution full waveform inversion. Compared with the elastic case, solving the anelastic wave equation requires more computational time because of the need to compute additional material-independent anelastic functions. A numerical scheme with a large Courant-Friedrichs-Lewy (CFL) condition number enables us to use a large time step to simulate wave propagation, which improves computational efficiency. In this work, we apply the fourth-order strong stability preserving Runge-Kutta method with an optimal CFL coefficient to solve the anelastic wave equation. We use a fourth-order DRP/opt MacCormack scheme for the spatial discretization, and we approximate the rheological behavior of the Earth using the generalized Maxwell body model. With a larger CFL condition number, we find that the computational efficiency is significantly improved compared with the traditional fourth-order Runge-Kutta method. Then, we apply the scattering-integral method for calculating travel-time and amplitude sensitivity kernels with respect to velocity and attenuation structures. For each source, we carry out one forward simulation and save the time-dependent strain tensor. For each station, we carry out three `backward' simulations for the three components and save the corresponding strain tensors. The sensitivity kernels at each point in the medium are the convolution of the two sets of strain tensors. Finally, we show several synthetic tests to verify the effectiveness of the strong stability preserving Runge-Kutta method in generating accurate synthetics in full waveform modeling, and in generating accurate strain tensors for calculating sensitivity kernels at regional and global scales.
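
    The convex-combination structure that gives SSP schemes their stability can be seen in the classic third-order Shu-Osher scheme, sketched below on a stand-in advection problem (the paper uses an optimized fourth-order SSP method and a different spatial discretization).

```python
# Sketch of a strong-stability-preserving Runge-Kutta step: the third-order
# Shu-Osher scheme, shown for brevity. Each stage is a convex combination of
# forward-Euler steps, which preserves stability under the Euler CFL limit.

import numpy as np

def ssp_rk3_step(L, u, dt):
    """One SSP-RK3 step for du/dt = L(u)."""
    u1 = u + dt * L(u)
    u2 = 0.75 * u + 0.25 * (u1 + dt * L(u1))
    return u / 3.0 + 2.0 / 3.0 * (u2 + dt * L(u2))

# Stand-in problem: linear advection u_t + u_x = 0, upwind differences,
# periodic boundaries (not the anelastic wave equation of the paper).
n = 200
dx = 1.0 / n
x = np.arange(n) * dx
u = np.exp(-200 * (x - 0.5) ** 2)
L = lambda v: -(v - np.roll(v, 1)) / dx
dt = 0.8 * dx                                   # CFL-limited time step
for _ in range(250):
    u = ssp_rk3_step(L, u, dt)
print("solution bounds preserved:", float(u.min()), float(u.max()))
```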

  14. Efficient Computation of Coherent Synchrotron Radiation Taking into Account 6D Phase Space Distribution of Emitting Electrons

    NASA Astrophysics Data System (ADS)

    Chubar, O.; Couprie, M.-E.

    2007-01-01

    A CPU-efficient method for calculating the frequency-domain electric field of Coherent Synchrotron Radiation (CSR), taking into account the 6D phase space distribution of electrons in a bunch, is proposed. As an application example, calculation results for the CSR emitted by an electron bunch with small longitudinal and large transverse sizes are presented. Such a situation can be realized in storage rings or ERLs by transverse deflection of the electron bunches in special crab-type RF cavities, i.e., using the technique proposed for the generation of femtosecond X-ray pulses (A. Zholents et al., 1999). The computation, performed for the parameters of the SOLEIL storage ring, shows that if the transverse size of the electron bunch is larger than the diffraction limit for single-electron SR at a given wavelength, this affects the angular distribution of the CSR at this wavelength and reduces the coherent flux. Nevertheless, for transverse bunch dimensions up to several millimeters and a longitudinal bunch size smaller than a hundred micrometers, the resulting CSR flux in the far-infrared spectral range is still many orders of magnitude higher than the flux of incoherent SR, and can therefore be considered for practical use.

  15. Improved treatment of exact exchange in Quantum ESPRESSO

    DOE PAGES

    Barnes, Taylor A.; Kurth, Thorsten; Carrier, Pierre; ...

    2017-01-18

    Here, we present an algorithm and implementation for the parallel computation of exact exchange in Quantum ESPRESSO (QE) that exhibits greatly improved strong scaling. QE is an open-source software package for electronic structure calculations using plane wave density functional theory, and supports the use of local, semi-local, and hybrid DFT functionals. Wider application of hybrid functionals is desirable for the improved simulation of electronic band energy alignments and thermodynamic properties, but the computational complexity of evaluating the exact exchange potential limits the practical application of hybrid functionals to large systems and requires efficient implementations. We demonstrate that existing implementations of hybrid DFT that utilize a single data structure for both the local and exact exchange regions of the code are significantly limited in the degree of parallelization achievable. We present a band-pair parallelization approach, in which the calculation of exact exchange is parallelized and evaluated independently from the parallelization of the remainder of the calculation, with the wavefunction data being efficiently transformed on-the-fly into a form that is optimal for each part of the calculation. For a 64 water molecule supercell, our new algorithm reduces the overall time to solution by nearly an order of magnitude.

  16. Improved treatment of exact exchange in Quantum ESPRESSO

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barnes, Taylor A.; Kurth, Thorsten; Carrier, Pierre

    Here, we present an algorithm and implementation for the parallel computation of exact exchange in Quantum ESPRESSO (QE) that exhibits greatly improved strong scaling. QE is an open-source software package for electronic structure calculations using plane wave density functional theory, and supports the use of local, semi-local, and hybrid DFT functionals. Wider application of hybrid functionals is desirable for the improved simulation of electronic band energy alignments and thermodynamic properties, but the computational complexity of evaluating the exact exchange potential limits the practical application of hybrid functionals to large systems and requires efficient implementations. We demonstrate that existing implementations of hybrid DFT that utilize a single data structure for both the local and exact exchange regions of the code are significantly limited in the degree of parallelization achievable. We present a band-pair parallelization approach, in which the calculation of exact exchange is parallelized and evaluated independently from the parallelization of the remainder of the calculation, with the wavefunction data being efficiently transformed on-the-fly into a form that is optimal for each part of the calculation. For a 64 water molecule supercell, our new algorithm reduces the overall time to solution by nearly an order of magnitude.

  17. Real-time haptic cutting of high-resolution soft tissues.

    PubMed

    Wu, Jun; Westermann, Rüdiger; Dick, Christian

    2014-01-01

    We present our systematic efforts in advancing the computational performance of physically accurate soft tissue cutting simulation, which is at the core of surgery simulators in general. We demonstrate a real-time performance of 15 simulation frames per second for haptic soft tissue cutting of a deformable body at an effective resolution of 170,000 finite elements. This is achieved by the following innovative components: (1) a linked octree discretization of the deformable body, which allows for fast and robust topological modifications of the simulation domain, (2) a composite finite element formulation, which thoroughly reduces the number of simulation degrees of freedom and thus enables to carefully balance simulation performance and accuracy, (3) a highly efficient geometric multigrid solver for solving the linear systems of equations arising from implicit time integration, (4) an efficient collision detection algorithm that effectively exploits the composition structure, and (5) a stable haptic rendering algorithm for computing the feedback forces. Considering that our method increases the finite element resolution for physically accurate real-time soft tissue cutting simulation by an order of magnitude, our technique has a high potential to significantly advance the realism of surgery simulators.

  18. Salary Management System for Small and Medium-sized Enterprises

    NASA Astrophysics Data System (ADS)

    Hao, Zhang; Guangli, Xu; Yuhuan, Zhang; Yilong, Lei

    In Small and Medium-sized Enterprises (SMEs), wage entry, calculation, and totalling used to be done manually; the data volume is quite large, processing is slow, and errors are easy to make, all resulting in low efficiency. The main purpose of this paper is to present the basis of a salary management system: to establish a scientific database and a computerized payroll system that uses the computer to replace much of the former manual work, reducing duplicated staff labor and improving working efficiency. The system combines the actual needs of SMEs and was developed through in-depth study and practice of the C/S mode, the PowerBuilder 10.0 development tool, databases, and the SQL language, completing the requirements analysis, database design, and application design and development of a payroll system. Database files for wages, departments, units, and personnel are included, and the system provides data management, department management, personnel management, and other functions; through control and management of the database, query, add, delete, and modify operations can be carried out. Testing shows the system has a reasonable design, fairly complete functionality, and stable operation, meeting the basic needs of the work.

  19. Computationally efficient method for Fourier transform of highly chirped pulses for laser and parametric amplifier modeling.

    PubMed

    Andrianov, Alexey; Szabo, Aron; Sergeev, Alexander; Kim, Arkady; Chvykov, Vladimir; Kalashnikov, Mikhail

    2016-11-14

    We developed an improved approach to calculate the Fourier transform of signals with arbitrarily large quadratic phase which can be efficiently implemented in numerical simulations utilizing the fast Fourier transform. The proposed algorithm significantly reduces the computational cost of the Fourier transform of a highly chirped and stretched pulse by splitting it into two separate transforms of almost transform-limited pulses, thereby reducing the required grid size roughly by the pulse stretching factor. The application of our improved Fourier transform algorithm in the split-step method for numerical modeling of CPA and OPCPA shows excellent agreement with standard algorithms.
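
    A minimal numpy sketch may make the grid-size argument concrete: if the huge quadratic spectral phase is kept analytic, only the compact, near-transform-limited envelope ever occupies a grid. This is an illustration of the general principle rather than the authors' algorithm, and all parameter values are hypothetical.

      import numpy as np

      # The chirped field is E(w) = A(w) * exp(1j*phi2*w**2/2); only the compact
      # envelope A(t) is sampled, so the grid stays small.
      N = 1024
      t = np.linspace(-10e-12, 10e-12, N, endpoint=False)  # 20 ps window
      dt = t[1] - t[0]
      w = 2 * np.pi * np.fft.fftfreq(N, dt)                # angular frequency grid

      a_t = np.exp(-(t / 100e-15) ** 2)       # 100 fs transform-limited envelope
      A_w = np.fft.fft(a_t) * dt              # small FFT of the envelope

      phi2 = 5e-24                            # s^2, stretcher dispersion (illustrative)
      E_w = A_w * np.exp(0.5j * phi2 * w**2)  # chirp attached analytically

      # Sampling the stretched pulse directly in time would need a window of
      # roughly phi2 * (spectral width) ~ 100 ps, i.e. a grid larger by about
      # the stretching factor; the spectral product above never leaves N points.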

  20. Parallel implementation of geometrical shock dynamics for two dimensional converging shock waves

    NASA Astrophysics Data System (ADS)

    Qiu, Shi; Liu, Kuang; Eliasson, Veronica

    2016-10-01

    Geometrical shock dynamics (GSD) theory is an appealing method to predict the shock motion in the sense that it is more computationally efficient than solving the traditional Euler equations, especially for converging shock waves. However, to solve and optimize large scale configurations, the main bottleneck is the computational cost. Among the existing numerical GSD schemes, there is only one that has been implemented on parallel computers, with the purpose of analyzing detonation waves. To extend the computational advantage of the GSD theory to more general applications such as converging shock waves, a numerical implementation using a spatial decomposition method has been coupled with a front tracking approach on parallel computers. In addition, an efficient tridiagonal system solver for massively parallel computers has been applied to resolve the most expensive function in this implementation, resulting in an efficiency of 0.93 while using 32 HPCC cores. Moreover, symmetric boundary conditions have been developed to further reduce the computational cost, achieving a speedup of 19.26 for a 12-sided polygonal converging shock.
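
    The "most expensive function" resolved by the tridiagonal solver can be contrasted with its serial baseline, the Thomas algorithm, sketched below; the massively parallel variant actually needed in this setting (e.g., cyclic reduction) is not shown, and the test values are arbitrary.

      import numpy as np

      def thomas(a, b, c, d):
          """Solve a tridiagonal system with sub-diagonal a (a[0] unused),
          diagonal b, super-diagonal c (c[-1] unused) and right-hand side d.
          Serial O(n) forward elimination followed by back substitution."""
          n = len(d)
          cp, dp = np.empty(n), np.empty(n)
          cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
          for i in range(1, n):
              m = b[i] - a[i] * cp[i - 1]
              cp[i] = c[i] / m if i < n - 1 else 0.0
              dp[i] = (d[i] - a[i] * dp[i - 1]) / m
          x = np.empty(n)
          x[-1] = dp[-1]
          for i in range(n - 2, -1, -1):
              x[i] = dp[i] - cp[i] * x[i + 1]
          return x

      # quick check against a dense solve
      n = 8
      a = np.r_[0.0, -np.ones(n - 1)]
      b = 2.0 * np.ones(n)
      c = np.r_[-np.ones(n - 1), 0.0]
      d = np.arange(1.0, n + 1)
      A = np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)
      assert np.allclose(thomas(a, b, c, d), np.linalg.solve(A, d))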

  1. Time-Accurate Local Time Stepping and High-Order Time CESE Methods for Multi-Dimensional Flows Using Unstructured Meshes

    NASA Technical Reports Server (NTRS)

    Chang, Chau-Lyan; Venkatachari, Balaji Shankar; Cheng, Gary

    2013-01-01

    With the wide availability of affordable multiple-core parallel supercomputers, next generation numerical simulations of flow physics are being focused on unsteady computations for problems involving multiple time scales and multiple physics. These simulations require higher solution accuracy than most algorithms and computational fluid dynamics codes currently available. This paper focuses on the developmental effort for high-fidelity multi-dimensional, unstructured-mesh flow solvers using the space-time conservation element, solution element (CESE) framework. Two approaches have been investigated in this research in order to provide high-accuracy, cross-cutting numerical simulations for a variety of flow regimes: 1) time-accurate local time stepping and 2) high-order CESE method. The first approach utilizes consistent numerical formulations in the space-time flux integration to preserve temporal conservation across the cells with different marching time steps. Such an approach relieves the stringent time step constraint associated with the smallest time step in the computational domain while preserving temporal accuracy for all the cells. For flows involving multiple scales, both numerical accuracy and efficiency can be significantly enhanced. The second approach extends the current CESE solver to higher-order accuracy. Unlike other existing explicit high-order methods for unstructured meshes, the CESE framework maintains a CFL condition of one for arbitrarily high-order formulations while retaining the same compact stencil as its second-order counterpart. For large-scale unsteady computations, this feature substantially enhances numerical efficiency. Numerical formulations and validations using benchmark problems are discussed in this paper along with realistic examples.

  2. Efficient quantum computing using coherent photon conversion.

    PubMed

    Langford, N K; Ramelow, S; Prevedel, R; Munro, W J; Milburn, G J; Zeilinger, A

    2011-10-12

    Single photons are excellent quantum information carriers: they were used in the earliest demonstrations of entanglement and in the production of the highest-quality entanglement reported so far. However, current schemes for preparing, processing and measuring them are inefficient. For example, down-conversion provides heralded, but randomly timed, single photons, and linear optics gates are inherently probabilistic. Here we introduce a deterministic process--coherent photon conversion (CPC)--that provides a new way to generate and process complex, multiquanta states for photonic quantum information applications. The technique uses classically pumped nonlinearities to induce coherent oscillations between orthogonal states of multiple quantum excitations. One example of CPC, based on a pumped four-wave-mixing interaction, is shown to yield a single, versatile process that provides a full set of photonic quantum processing tools. This set satisfies the DiVincenzo criteria for a scalable quantum computing architecture, including deterministic multiqubit entanglement gates (based on a novel form of photon-photon interaction), high-quality heralded single- and multiphoton states free from higher-order imperfections, and robust, high-efficiency detection. It can also be used to produce heralded multiphoton entanglement, create optically switchable quantum circuits and implement an improved form of down-conversion with reduced higher-order effects. Such tools are valuable building blocks for many quantum-enabled technologies. Finally, using photonic crystal fibres we experimentally demonstrate quantum correlations arising from a four-colour nonlinear process suitable for CPC and use these measurements to study the feasibility of reaching the deterministic regime with current technology. Our scheme, which is based on interacting bosonic fields, is not restricted to optical systems but could also be implemented in optomechanical, electromechanical and superconducting systems with extremely strong intrinsic nonlinearities. Furthermore, exploiting higher-order nonlinearities with multiple pump fields yields a mechanism for multiparty mediation of the complex, coherent dynamics.

  3. Efficient Simulation of Wing Modal Response: Application of 2nd Order Shape Sensitivities and Neural Networks

    NASA Technical Reports Server (NTRS)

    Kapania, Rakesh K.; Liu, Youhua

    2000-01-01

    At the preliminary design stage of a wing structure, an efficient simulation, one needing little computation but yielding adequately accurate results for various response quantities, is essential in the search for an optimal design in a vast design space. In the present paper, methods using sensitivities up to 2nd order and the direct application of neural networks are explored. The example problem is how to determine the natural frequencies of a wing given the shape variables of the structure. It is shown that when sensitivities cannot be obtained analytically, the finite difference approach is usually more reliable than a semi-analytical approach provided an appropriate step size is used. The use of second order sensitivities is shown to yield much better results than when only the first order sensitivities are used. When neural networks are trained to relate the wing natural frequencies to the shape variables, a negligible computational effort is needed to accurately determine the natural frequencies of a new design.

  4. Ordered fast fourier transforms on a massively parallel hypercube multiprocessor

    NASA Technical Reports Server (NTRS)

    Tong, Charles; Swarztrauber, Paul N.

    1989-01-01

    Design alternatives for ordered Fast Fourier Transformation (FFT) algorithms were examined on massively parallel hypercube multiprocessors such as the Connection Machine. Particular emphasis is placed on reducing communication which is known to dominate the overall computing time. To this end, the order and computational phases of the FFT were combined, and the sequence to processor maps that reduce communication were used. The class of ordered transforms is expanded to include any FFT in which the order of the transform is the same as that of the input sequence. Two such orderings are examined, namely, standard-order and A-order which can be implemented with equal ease on the Connection Machine where orderings are determined by geometries and priorities. If the sequence has N = 2^r elements and the hypercube has P = 2^d processors, then a standard-order FFT can be implemented with d + r/2 + 1 parallel transmissions. An A-order sequence can be transformed with 2d - r/2 parallel transmissions which is r - d + 1 fewer than the standard order. A parallel method for computing the trigonometric coefficients is presented that does not use trigonometric functions or interprocessor communication. A performance of 0.9 GFLOPS was obtained for an A-order transform on the Connection Machine.
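
    The transmission counts quoted above are mutually consistent; writing them out,

      \[
        T_{\mathrm{std}} = d + \tfrac{r}{2} + 1, \qquad
        T_{A} = 2d - \tfrac{r}{2}, \qquad
        T_{\mathrm{std}} - T_{A} = \Bigl(d + \tfrac{r}{2} + 1\Bigr) - \Bigl(2d - \tfrac{r}{2}\Bigr) = r - d + 1,
      \]

    so whenever r >= d (i.e., at least as many sequence elements as processors), the A-order transform never needs more transmissions than the standard order.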

  5. Discontinuous Galerkin Methods and High-Speed Turbulent Flows

    NASA Astrophysics Data System (ADS)

    Atak, Muhammed; Larsson, Johan; Munz, Claus-Dieter

    2014-11-01

    Discontinuous Galerkin methods gain increasing importance within the CFD community as they combine arbitrary high order of accuracy in complex geometries with parallel efficiency. Particularly the discontinuous Galerkin spectral element method (DGSEM) is a promising candidate for both the direct numerical simulation (DNS) and large eddy simulation (LES) of turbulent flows due to its excellent scaling attributes. In this talk, we present a DNS of a compressible turbulent boundary layer along a flat plate at a free-stream Mach number of M = 2.67 and assess the computational efficiency of the DGSEM at performing high-fidelity simulations of both transitional and turbulent boundary layers. We compare the accuracy of the results as well as the computational performance to results using a high order finite difference method.

  6. Investigation to realize a computationally efficient implementation of the high-order instantaneous-moments-based fringe analysis method

    NASA Astrophysics Data System (ADS)

    Gorthi, Sai Siva; Rajshekhar, Gannavarpu; Rastogi, Pramod

    2010-06-01

    Recently, a high-order instantaneous moments (HIM)-operator-based method was proposed for accurate phase estimation in digital holographic interferometry. The method relies on piece-wise polynomial approximation of phase and subsequent evaluation of the polynomial coefficients from the HIM operator using single-tone frequency estimation. The work presents a comparative analysis of the performance of different single-tone frequency estimation techniques, like Fourier transform followed by optimization, estimation of signal parameters by rotational invariance technique (ESPRIT), multiple signal classification (MUSIC), and iterative frequency estimation by interpolation on Fourier coefficients (IFEIF) in HIM-operator-based methods for phase estimation. Simulation and experimental results demonstrate the potential of the IFEIF technique with respect to computational efficiency and estimation accuracy.
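
    As a concrete point of reference for the IFEIF-style estimator favored above, the following sketch implements iterative single-tone frequency estimation by interpolation on Fourier coefficients (the Aboutanios-Mulgrew update); the exact variant used in the paper may differ in detail.

      import numpy as np

      def ifeif_freq(x, iters=3):
          """Estimate the frequency (cycles/sample) of a single complex tone
          by iteratively interpolating DFT coefficients +-0.5 bin around the
          current estimate."""
          N = len(x)
          n = np.arange(N)
          m = int(np.argmax(np.abs(np.fft.fft(x))))   # coarse peak bin
          delta = 0.0
          for _ in range(iters):
              Xp = np.sum(x * np.exp(-2j * np.pi * (m + delta + 0.5) * n / N))
              Xm = np.sum(x * np.exp(-2j * np.pi * (m + delta - 0.5) * n / N))
              delta += 0.5 * np.real((Xp + Xm) / (Xp - Xm))
          return (m + delta) / N

      # sanity check on a noiseless tone
      f0 = 0.1234
      x = np.exp(2j * np.pi * f0 * np.arange(256))
      assert abs(ifeif_freq(x) - f0) < 1e-3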

  7. Textbook Multigrid Efficiency for the Steady Euler Equations

    NASA Technical Reports Server (NTRS)

    Roberts, Thomas W.; Sidilkover, David; Swanson, R. C.

    2004-01-01

    A fast multigrid solver for the steady incompressible Euler equations is presented. Unlike time-marching schemes, this approach uses relaxation of the steady equations. Application of this method results in a discretization that correctly distinguishes between the advection and elliptic parts of the operator, allowing efficient smoothers to be constructed. Solvers for both unstructured triangular grids and structured quadrilateral grids have been written. Computations for channel flow and flow over a nonlifting airfoil have been computed. Using Gauss-Seidel relaxation ordered in the flow direction, textbook multigrid convergence rates of nearly one order-of-magnitude residual reduction per multigrid cycle are achieved, independent of the grid spacing. This approach also may be applied to the compressible Euler equations and the incompressible Navier-Stokes equations.

  8. Logistics Control Facility: A Normative Model for Total Asset Visibility in the Air Force Logistics System

    DTIC Science & Technology

    1994-09-01

    Computers, information systems, and communication systems are being increasingly used in transportation, warehousing, order processing, materials ... inventory levels, reduced order processing times, reduced order processing costs, and increased customer satisfaction. While purchasing and transportation ... process, the speed at which orders are processed would increase significantly. Lowering the order processing time in turn lowers the lead time, which in ...

  9. Optimized distributed computing environment for mask data preparation

    NASA Astrophysics Data System (ADS)

    Ahn, Byoung-Sup; Bang, Ju-Mi; Ji, Min-Kyu; Kang, Sun; Jang, Sung-Hoon; Choi, Yo-Han; Ki, Won-Tai; Choi, Seong-Woon; Han, Woo-Sung

    2005-11-01

    As the critical dimension (CD) becomes smaller, various resolution enhancement techniques (RET) are widely adopted. In developing sub-100nm devices, the complexity of optical proximity correction (OPC) is severely increased and applied OPC layers are expanded to non-critical layers. The transformation of designed pattern data by the OPC operation causes complexity, which leads to runtime overheads in subsequent steps such as mask data preparation (MDP) and collapses the existing design hierarchy. Therefore, many mask shops exploit the distributed computing method in order to reduce the runtime of mask data preparation rather than exploit the design hierarchy. Distributed computing uses a cluster of computers that are connected to a local network system. However, two things limit the benefit of the distributed computing method in MDP. First, every sequential MDP job, which uses the maximum number of available CPUs, is not efficient compared to parallel MDP job execution due to the input data characteristics. Second, the runtime enhancement over input cost is not sufficient, since the scalability of fracturing tools is limited. In this paper, we will discuss an optimum load balancing environment that is useful in increasing the uptime of a distributed computing system by assigning an appropriate number of CPUs to each input design data set. We will also describe the distributed processing (DP) parameter optimization to obtain maximum throughput in MDP job processing.

  10. A single-loop optimization method for reliability analysis with second order uncertainty

    NASA Astrophysics Data System (ADS)

    Xie, Shaojun; Pan, Baisong; Du, Xiaoping

    2015-08-01

    Reliability analysis may involve random variables and interval variables. In addition, some of the random variables may have interval distribution parameters owing to limited information. This kind of uncertainty is called second order uncertainty. This article develops an efficient reliability method for problems involving the three aforementioned types of uncertain input variables. The analysis produces the maximum and minimum reliability and is computationally demanding because two loops are needed: a reliability analysis loop with respect to random variables and an interval analysis loop for extreme responses with respect to interval variables. The first order reliability method and nonlinear optimization are used for the two loops, respectively. For computational efficiency, the two loops are combined into a single loop by treating the Karush-Kuhn-Tucker (KKT) optimal conditions of the interval analysis as constraints. Three examples are presented to demonstrate the proposed method.

  11. Ordered fast Fourier transforms on a massively parallel hypercube multiprocessor

    NASA Technical Reports Server (NTRS)

    Tong, Charles; Swarztrauber, Paul N.

    1991-01-01

    Design alternatives for ordered radix-2 decimation-in-frequency FFT algorithms on massively parallel hypercube multiprocessors are evaluated, with attention given to reducing the communication that dominates the computation time. A combination of the order and computational phases of the FFT is accordingly employed, in conjunction with sequence-to-processor maps which reduce communication. Two orderings, 'standard' and 'cyclic', in which the order of the transform is the same as that of the input sequence, can be implemented with ease on the Connection Machine (where orderings are determined by geometries and priorities). A parallel method for trigonometric coefficient computation is presented which does not employ trigonometric functions or interprocessor communication.

  12. High-Performance Computing Data Center | Computational Science | NREL

    Science.gov Websites

    The data center uses warm-water liquid cooling to achieve its very low PUE, then captures and reuses waste heat as the primary heating source; a dry cooler that uses refrigerant in a passive cycle to dissipate heat is reducing onsite water use. Related topics: measuring efficiency through PUE; warm-water liquid cooling; re-using waste heat from computing components.

  13. Efficient Computation of Difference Vibrational Spectra in Isothermal-Isobaric Ensemble.

    PubMed

    Joutsuka, Tatsuya; Morita, Akihiro

    2016-11-03

    Difference spectroscopy between two close systems is widely used to enhance selectivity to the different parts of the observed system, though the molecular dynamics calculation of tiny difference spectra by subtraction of two spectra would be computationally extraordinarily demanding. Therefore, we have proposed an efficient computational algorithm for difference spectra that does not resort to subtraction. The present paper reports our extension of the theoretical method to the isothermal-isobaric (NPT) ensemble. The present theory expands our applications of analysis, including pressure dependence of the spectra. We verified that the present theory yields accurate difference spectra in the NPT condition as well, with remarkable computational efficiency over the straightforward subtraction by several orders of magnitude. This method is further applied to vibrational spectra of liquid water with varying pressure and succeeded in reproducing tiny difference spectra induced by pressure change. The anomalous pressure dependence is elucidated in relation to other properties of liquid water.

  14. Protein pharmacophore selection using hydration-site analysis

    PubMed Central

    Hu, Bingjie; Lill, Markus A.

    2012-01-01

    Virtual screening using pharmacophore models is an efficient method to identify potential lead compounds for target proteins. Pharmacophore models based on protein structures are advantageous because a priori knowledge of active ligands is not required and the models are not biased by the chemical space of previously identified actives. However, in order to capture most potential interactions between all potentially binding ligands and the protein, the size of the pharmacophore model, i.e. number of pharmacophore elements, is typically quite large and therefore reduces the efficiency of pharmacophore based screening. We have developed a new method to select important pharmacophore elements using hydration-site information. The basic premise is that ligand functional groups that replace water molecules in the apo protein contribute strongly to the overall binding affinity of the ligand, due to the additional free energy gained from releasing the water molecule into the bulk solvent. We computed the free energy of water released from the binding site for each hydration site using thermodynamic analysis of molecular dynamics (MD) simulations. Pharmacophores which are co-localized with hydration sites with estimated favorable contributions to the free energy of binding are selected to generate a reduced pharmacophore model. We constructed reduced pharmacophore models for three protein systems and demonstrated good enrichment quality combined with high efficiency. The reduction in pharmacophore model size reduces the required screening time by a factor of 200–500 compared to using all protein pharmacophore elements. We also describe a training process using a small set of known actives to reliably select the optimal set of criteria for pharmacophore selection for each protein system. PMID:22397751

  15. Machine learning on-a-chip: a high-performance low-power reusable neuron architecture for artificial neural networks in ECG classifications.

    PubMed

    Sun, Yuwen; Cheng, Allen C

    2012-07-01

    Artificial neural networks (ANNs) are a promising machine learning technique in classifying non-linear electrocardiogram (ECG) signals and recognizing abnormal patterns suggesting risks of cardiovascular diseases (CVDs). In this paper, we propose a new reusable neuron architecture (RNA) enabling a performance-efficient and cost-effective silicon implementation for ANN. The RNA architecture consists of a single layer of physical RNA neurons, each of which is designed to use minimal hardware resource (e.g., a single 2-input multiplier-accumulator is used to compute the dot product of two vectors). By carefully applying the principle of time sharing, RNA can multiplex this single layer of physical neurons to efficiently execute both feed-forward and back-propagation computations of an ANN while conserving the area and reducing the power dissipation of the silicon. A three-layer 51-30-12 ANN is implemented in RNA to perform the ECG classification for CVD detection. This RNA hardware also allows on-chip automatic training update. A quantitative design space exploration in area, power dissipation, and execution speed between RNA and three other implementations representative of different reusable hardware strategies is presented and discussed. Compared with an equivalent software implementation in C executed on an embedded microprocessor, the RNA ASIC achieves three orders of magnitude improvements in both the execution speed and the energy efficiency. Copyright © 2012 Elsevier Ltd. All rights reserved.
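
    The time-sharing idea is easy to state in software terms: one physical multiply-accumulate unit is reused for every weight of every neuron in a layer. The sketch below is a plain-Python stand-in for the hardware scheme (not the authors' RTL), with random weights for the 51-30-12 topology mentioned in the abstract.

      import numpy as np

      def rna_layer(x, W, b, act=np.tanh):
          """Evaluate one ANN layer with a single multiply-accumulate 'unit':
          each neuron slot reuses the same MAC, one product per 'cycle'."""
          y = np.empty(len(W))
          for j, (w_row, bj) in enumerate(zip(W, b)):
              acc = bj
              for wi, xi in zip(w_row, x):   # one MAC operation per step
                  acc += wi * xi
              y[j] = act(acc)
          return y

      rng = np.random.default_rng(0)
      x = rng.normal(size=51)                       # e.g., an ECG feature vector
      h = rna_layer(x, rng.normal(size=(30, 51)), np.zeros(30))
      o = rna_layer(h, rng.normal(size=(12, 30)), np.zeros(12))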

  16. A hydrological emulator for global applications - HE v1.0.0

    NASA Astrophysics Data System (ADS)

    Liu, Yaling; Hejazi, Mohamad; Li, Hongyi; Zhang, Xuesong; Leng, Guoyong

    2018-03-01

    While global hydrological models (GHMs) are very useful in exploring water resources and interactions between the Earth and human systems, their use often requires numerous model inputs, complex model calibration, and high computation costs. To overcome these challenges, we construct an efficient open-source and ready-to-use hydrological emulator (HE) that can mimic complex GHMs at a range of spatial scales (e.g., basin, region, globe). More specifically, we construct both a lumped and a distributed scheme of the HE based on the monthly abcd model to explore the tradeoff between computational cost and model fidelity. Model predictability and computational efficiency are evaluated in simulating global runoff from 1971 to 2010 with both the lumped and distributed schemes. The results are compared against the runoff product from the widely used Variable Infiltration Capacity (VIC) model. Our evaluation indicates that the lumped and distributed schemes present comparable results regarding annual total quantity, spatial pattern, and temporal variation of the major water fluxes (e.g., total runoff, evapotranspiration) across the global 235 basins (e.g., correlation coefficient r between the annual total runoff from either of these two schemes and the VIC is > 0.96), except for several cold (e.g., Arctic, interior Tibet), dry (e.g., North Africa) and mountainous (e.g., Argentina) regions. Compared against the monthly total runoff product from the VIC (aggregated from daily runoff), the global mean Kling-Gupta efficiencies are 0.75 and 0.79 for the lumped and distributed schemes, respectively, with the distributed scheme better capturing spatial heterogeneity. Notably, the computation efficiency of the lumped scheme is 2 orders of magnitude higher than the distributed one and 7 orders more efficient than the VIC model. A case study of uncertainty analysis for the world's 16 basins with top annual streamflow is conducted using 100 000 model simulations, and it demonstrates the lumped scheme's extraordinary advantage in computational efficiency. Our results suggest that the revised lumped abcd model can serve as an efficient and reasonable HE for complex GHMs and is suitable for broad practical use, and the distributed scheme is also an efficient alternative if spatial heterogeneity is of more interest.
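
    For readers unfamiliar with the emulator's core, a compact sketch of the monthly abcd water-balance model (Thomas, 1981) is given below; the formulation is the standard textbook one, and the parameter values are illustrative rather than the paper's calibrated ones.

      import numpy as np

      def abcd(P, PET, a=0.95, b=200.0, c=0.5, d=0.1, S0=50.0, G0=10.0):
          """Lumped monthly abcd model: P and PET in mm/month; returns total
          runoff Q (mm/month) as direct runoff plus baseflow."""
          S, G = S0, G0
          Q = np.empty(len(P))
          for t, (p, pet) in enumerate(zip(P, PET)):
              W = p + S                                  # available water
              u = (W + b) / (2.0 * a)
              Y = u - np.sqrt(u * u - W * b / a)         # ET opportunity
              S = Y * np.exp(-pet / b)                   # soil-moisture carryover
              avail = W - Y                              # water for runoff/recharge
              G = (G + c * avail) / (1.0 + d)            # groundwater storage
              Q[t] = (1.0 - c) * avail + d * G           # direct runoff + baseflow
          return Q

      # twelve months of hypothetical forcing, mm/month
      P = np.array([80, 70, 90, 60, 50, 30, 20, 25, 40, 60, 75, 85], float)
      PET = np.array([10, 15, 30, 50, 80, 110, 120, 100, 70, 40, 20, 10], float)
      Q = abcd(P, PET)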

  17. Hadoop for High-Performance Climate Analytics: Use Cases and Lessons Learned

    NASA Technical Reports Server (NTRS)

    Tamkin, Glenn

    2013-01-01

    Scientific data services are a critical aspect of the NASA Center for Climate Simulation (NCCS) mission. Hadoop, via MapReduce, provides an approach to high-performance analytics that is proving to be useful to data intensive problems in climate research. It offers an analysis paradigm that uses clusters of computers and combines distributed storage of large data sets with parallel computation. The NCCS is particularly interested in the potential of Hadoop to speed up basic operations common to a wide range of analyses. In order to evaluate this potential, we prototyped a series of canonical MapReduce operations over a test suite of observational and climate simulation datasets. The initial focus was on averaging operations over arbitrary spatial and temporal extents within Modern Era Retrospective-Analysis for Research and Applications (MERRA) data. After preliminary results suggested that this approach improves efficiencies within data intensive analytic workflows, we invested in building a cyber infrastructure resource for developing a new generation of climate data analysis capabilities using Hadoop. This resource is focused on reducing the time spent in the preparation of reanalysis data used in data-model inter-comparison, a long-sought goal of the climate community. This paper summarizes the related use cases and lessons learned.
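
    The canonical averaging operation has a simple map/reduce shape, sketched below in plain Python as a stand-in for the Hadoop implementation; the record layout and extents are hypothetical, not the MERRA format.

      from collections import defaultdict

      # (time, lat, lon, value) records, e.g., a reanalysis surface variable
      records = [
          ("2010-01", 40.0, -105.0, 271.3),
          ("2010-01", 40.5, -105.0, 270.8),
          ("2010-02", 40.0, -105.0, 268.9),
      ]

      def map_phase(rec):
          time, lat, lon, value = rec
          if 35.0 <= lat <= 45.0 and -110.0 <= lon <= -100.0:  # spatial extent
              yield time, (value, 1)                           # partial sum, count

      def reduce_phase(pairs):
          acc = defaultdict(lambda: [0.0, 0])
          for key, (s, n) in pairs:
              acc[key][0] += s
              acc[key][1] += n
          return {k: s / n for k, (s, n) in acc.items()}

      monthly_mean = reduce_phase(kv for rec in records for kv in map_phase(rec))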

  18. Dimensional reduction of the Standard Model coupled to a new singlet scalar field

    NASA Astrophysics Data System (ADS)

    Brauner, Tomáš; Tenkanen, Tuomas V. I.; Tranberg, Anders; Vuorinen, Aleksi; Weir, David J.

    2017-03-01

    We derive an effective dimensionally reduced theory for the Standard Model augmented by a real singlet scalar. We treat the singlet as a superheavy field and integrate it out, leaving an effective theory involving only the Higgs and SU(2)_L × U(1)_Y gauge fields, identical to the one studied previously for the Standard Model. This opens up the possibility of efficiently computing the order and strength of the electroweak phase transition, numerically and nonperturbatively, in this extension of the Standard Model. Understanding the phase diagram is crucial for models of electroweak baryogenesis and for studying the production of gravitational waves at thermal phase transitions.

  19. Efficient numerical simulation of an electrothermal de-icer pad

    NASA Technical Reports Server (NTRS)

    Roelke, R. J.; Keith, T. G., Jr.; De Witt, K. J.; Wright, W. B.

    1987-01-01

    In this paper, a new approach to calculate the transient thermal behavior of an iced electrothermal de-icer pad was developed. The method of splines was used to obtain the temperature distribution within the layered pad. Splines were used in order to create a tridiagonal system of equations that could be directly solved by Gauss elimination. The Stefan problem was solved using the enthalpy method along with a recent implicit technique. Only one to three iterations were needed to locate the melt front during any time step. Computational times were shown to be greatly reduced over those of an existing one-dimensional procedure without any reduction in accuracy; the current technique was more than 10 times faster.

  20. Comprehensive automatic assessment of retinal vascular abnormalities for computer-assisted retinopathy grading.

    PubMed

    Joshi, Vinayak; Agurto, Carla; VanNess, Richard; Nemeth, Sheila; Soliz, Peter; Barriga, Simon

    2014-01-01

    One of the most important signs of systemic disease that presents on the retina is vascular abnormalities such as in hypertensive retinopathy. Manual analysis of fundus images by human readers is qualitative and lacks accuracy, consistency, and repeatability. Present semi-automatic methods for vascular evaluation are reported to increase accuracy and reduce reader variability, but require extensive reader interaction, thus limiting the software-aided efficiency. Automation thus holds a twofold promise: first, decreasing variability while increasing accuracy, and second, increasing efficiency. In this paper we propose fully automated software as a second-reader system for comprehensive assessment of retinal vasculature, which aids the readers in the quantitative characterization of vessel abnormalities in fundus images. This system provides the reader with objective measures of vascular morphology such as tortuosity and branching angles, as well as highlights of areas with abnormalities such as artery-venous nicking, copper and silver wiring, and retinal emboli, in order for the reader to make a final screening decision. To test the efficacy of our system, we evaluated the change in performance of a newly certified retinal reader when grading a set of 40 color fundus images with and without the assistance of the software. The results demonstrated an improvement in the reader's performance with the software assistance, in terms of accuracy of detection of vessel abnormalities, determination of retinopathy, and reading time. This system enables the reader to make computer-assisted vasculature assessments with high accuracy and consistency, at a reduced reading time.

  1. Usefulness of a Regional Health Care Information System in primary care: a case study.

    PubMed

    Maass, Marianne C; Asikainen, Paula; Mäenpää, Tiina; Wanne, Olli; Suominen, Tarja

    2008-08-01

    The goal of this paper is to describe some benefits and possible cost consequences of computer based access to specialised health care information. A before-after activity analysis regarding 20 diabetic patients' clinical appointments was performed in a Health Centre in Satakunta region in Finland. Cost data, an interview, time-and-motion studies, and flow charts based on modelling were applied. Access to up-to-date diagnostic information reduced redundant clinical re-appointments, repeated tests, and mail orders for missing data. Timely access to diagnostic information brought about several benefits regarding workflow, patient care, and disease management. These benefits resulted in theoretical net cost savings. The study results indicated that Regional Information Systems may be useful tools to support performance and improve efficiency. However, further studies are required in order to verify how the monetary savings would impact the performance of Health Care Units.

  2. The RKGL method for the numerical solution of initial-value problems

    NASA Astrophysics Data System (ADS)

    Prentice, J. S. C.

    2008-04-01

    We introduce the RKGL method for the numerical solution of initial-value problems of the form y' = f(x, y), y(a) = α. The method is a straightforward modification of a classical explicit Runge-Kutta (RK) method, into which Gauss-Legendre (GL) quadrature has been incorporated. The idea is to enhance the efficiency of the method by reducing the number of times the derivative f(x, y) needs to be computed. The incorporation of GL quadrature serves to enhance the global order of the method by one, relative to the underlying RK method. Indeed, the RKGL method has a global error of the form Ah^(r+1) + Bh^(2m), where r is the order of the RK method and m is the number of nodes used in the GL component. In this paper we derive this error expression and show that RKGL is consistent, convergent and strongly stable.
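
    One plausible reading of the construction is sketched below: within each of K subintervals, an RK4 solution is marched across the m Gauss-Legendre nodes, and the subinterval endpoint is then advanced by GL quadrature of f at those nodes. This is a hedged sketch of the idea under those assumptions, not necessarily the paper's precise scheme.

      import numpy as np

      def rk4_step(f, x, y, h):
          k1 = f(x, y)
          k2 = f(x + h / 2, y + h * k1 / 2)
          k3 = f(x + h / 2, y + h * k2 / 2)
          k4 = f(x + h, y + h * k3)
          return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

      def rkgl(f, a, b, y0, K=10, m=3):
          xi, wi = np.polynomial.legendre.leggauss(m)
          edges = np.linspace(a, b, K + 1)
          y = y0
          for xl, xr in zip(edges[:-1], edges[1:]):
              H = xr - xl
              xs = xl + 0.5 * H * (xi + 1.0)        # GL nodes mapped to [xl, xr]
              yc, xc, quad = y, xl, 0.0
              for xg, wg in zip(xs, wi):
                  yc = rk4_step(f, xc, yc, xg - xc) # RK from node to node
                  xc = xg
                  quad += wg * f(xg, yc)
              y = y + 0.5 * H * quad                # GL update of the endpoint
          return y

      # y' = y, y(0) = 1: compare with e at x = 1
      err = abs(rkgl(lambda x, y: y, 0.0, 1.0, 1.0) - np.e)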

  3. Seismic and acoustic signal identification algorithms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    LADD,MARK D.; ALAM,M. KATHLEEN; SLEEFE,GERARD E.

    2000-04-03

    This paper will describe an algorithm for detecting and classifying seismic and acoustic signals for unattended ground sensors. The algorithm must be computationally efficient and continuously process a data stream in order to establish whether or not a desired signal has changed state (turned on or off). The paper will focus on describing a Fourier based technique that compares the running power spectral density estimate of the data to a predetermined signature in order to determine if the desired signal has changed state. How to establish the signature and the detection thresholds will be discussed as well as the theoretical statistics of the algorithm for the Gaussian noise case with results from simulated data. Actual seismic data results will also be discussed along with techniques used to reduce false alarms due to the inherent nonstationary noise environments found with actual data.
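
    The running-PSD comparison admits a very small sketch, given below with scipy; the signature construction, normalization, and threshold are all illustrative choices, not the fielded algorithm.

      import numpy as np
      from scipy.signal import welch

      def signal_state(stream, signature, fs, threshold):
          """On/off test: compare a running Welch PSD estimate of the stream
          with a stored (unit-norm) spectral signature."""
          f, pxx = welch(stream, fs=fs, nperseg=256)
          score = np.dot(pxx / np.linalg.norm(pxx), signature)
          return score > threshold

      rng = np.random.default_rng(0)
      fs = 1000.0
      t = np.arange(0, 2.0, 1 / fs)
      source = np.sin(2 * np.pi * 60.0 * t)              # e.g., running machinery
      f, template = welch(source, fs=fs, nperseg=256)
      template /= np.linalg.norm(template)

      on = signal_state(source + 0.5 * rng.normal(size=t.size), template, fs, 0.5)
      off = signal_state(0.5 * rng.normal(size=t.size), template, fs, 0.5)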

  4. Evaluation of reinitialization-free nonvolatile computer systems for energy-harvesting Internet of things applications

    NASA Astrophysics Data System (ADS)

    Onizawa, Naoya; Tamakoshi, Akira; Hanyu, Takahiro

    2017-08-01

    In this paper, reinitialization-free nonvolatile computer systems are designed and evaluated for energy-harvesting Internet of things (IoT) applications. In energy-harvesting applications, as power supplies generated from renewable power sources cause frequent power failures, the data being processed need to be backed up when power failures occur. Unless data are safely backed up before power supplies diminish, reinitialization processes are required when power supplies are recovered, which results in low energy efficiencies and slow operations. Using nonvolatile devices in processors and memories can realize a faster backup than a conventional volatile computer system, leading to a higher energy efficiency. To evaluate the energy efficiency upon frequent power failures, typical computer systems including processors and memories are designed using 90 nm CMOS or CMOS/magnetic tunnel junction (MTJ) technologies. Nonvolatile ARM Cortex-M0 processors with 4 kB MRAMs are evaluated using a typical computing benchmark program, Dhrystone, which shows energy reductions of a few orders of magnitude in comparison with a volatile processor with SRAM.

  5. Three-Dimensional Navier-Stokes Method with Two-Equation Turbulence Models for Efficient Numerical Simulation of Hypersonic Flows

    NASA Technical Reports Server (NTRS)

    Bardina, J. E.

    1994-01-01

    A new computationally efficient 3-D compressible Reynolds-averaged implicit Navier-Stokes method with advanced two-equation turbulence models for high speed flows is presented. All convective terms are modeled using an entropy satisfying higher-order Total Variation Diminishing (TVD) scheme based on implicit upwind flux-difference split approximations and an arithmetic averaging procedure of primitive variables. This method combines the best features of data management and computational efficiency of space marching procedures with the generality and stability of time dependent Navier-Stokes procedures to solve flows with mixed supersonic and subsonic zones, including streamwise separated flows. Its robust stability derives from a combination of conservative implicit upwind flux-difference splitting with Roe's property U to provide accurate shock capturing capability that non-conservative schemes do not guarantee, an alternating symmetric Gauss-Seidel 'method of planes' relaxation procedure coupled with a three-dimensional two-factor diagonal-dominant approximate factorization scheme, TVD flux limiters of higher-order flux differences satisfying realizability, and well-posed characteristic-based implicit boundary-point approximations consistent with the local characteristics domain of dependence. The efficiency of the method is highly increased with Newton-Raphson acceleration, which allows convergence in essentially one forward sweep for supersonic flows. The method is verified by comparing with experiment and other Navier-Stokes methods. Here, results of adiabatic and cooled flat plate flows, compression corner flow, and 3-D hypersonic shock-wave/turbulent boundary layer interaction flows are presented. The robust 3-D method achieves a computational efficiency gain of at least one order of magnitude over the CNS Navier-Stokes code. It provides cost-effective aerodynamic predictions in agreement with experiment, and the capability of predicting complex flow structures in complex geometries with good accuracy.

  6. Hybrid reduced order modeling for assembly calculations

    DOE PAGES

    Bang, Youngsuk; Abdel-Khalik, Hany S.; Jessee, Matthew A.; ...

    2015-08-14

    While the accuracy of assembly calculations has greatly improved due to the increase in computer power enabling more refined description of the phase space and use of more sophisticated numerical algorithms, the computational cost continues to increase which limits the full utilization of their effectiveness for routine engineering analysis. Reduced order modeling is a mathematical vehicle that scales down the dimensionality of large-scale numerical problems to enable their repeated executions on small computing environment, often available to end users. This is done by capturing the most dominant underlying relationships between the model's inputs and outputs. Previous works demonstrated the use of the reduced order modeling for a single physics code, such as a radiation transport calculation. This paper extends those works to coupled code systems as currently employed in assembly calculations. Finally, numerical tests are conducted using realistic SCALE assembly models with resonance self-shielding, neutron transport, and nuclides transmutation/depletion models representing the components of the coupled code system.

  7. Constructing Neuronal Network Models in Massively Parallel Environments.

    PubMed

    Ippen, Tammo; Eppler, Jochen M; Plesser, Hans E; Diesmann, Markus

    2017-01-01

    Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers.

  8. Noniterative MAP reconstruction using sparse matrix representations.

    PubMed

    Cao, Guangzhi; Bouman, Charles A; Webb, Kevin J

    2009-09-01

    We present a method for noniterative maximum a posteriori (MAP) tomographic reconstruction which is based on the use of sparse matrix representations. Our approach is to precompute and store the inverse matrix required for MAP reconstruction. This approach has generally not been used in the past because the inverse matrix is typically large and fully populated (i.e., not sparse). In order to overcome this problem, we introduce two new ideas. The first idea is a novel theory for the lossy source coding of matrix transformations which we refer to as matrix source coding. This theory is based on a distortion metric that reflects the distortions produced in the final matrix-vector product, rather than the distortions in the coded matrix itself. The resulting algorithms are shown to require orthonormal transformations of both the measurement data and the matrix rows and columns before quantization and coding. The second idea is a method for efficiently storing and computing the required orthonormal transformations, which we call a sparse-matrix transform (SMT). The SMT is a generalization of the classical FFT in that it uses butterflies to compute an orthonormal transform; but unlike an FFT, the SMT uses the butterflies in an irregular pattern, and is numerically designed to best approximate the desired transforms. We demonstrate the potential of the noniterative MAP reconstruction with examples from optical tomography. The method requires offline computation to encode the inverse transform. However, once these offline computations are completed, the noniterative MAP algorithm is shown to reduce both storage and computation by well over two orders of magnitude, as compared to linear iterative reconstruction methods.
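
    A toy numpy illustration of the storage idea follows: represent the dense inverse operator in an orthonormal transform domain where it compresses well, threshold it to a sparse matrix, and apply it through transforms of the data. An orthonormal DCT stands in here for the paper's numerically designed sparse-matrix transform, and simple thresholding stands in for quantization and coding, so this is only a sketch of the concept.

      import numpy as np
      from scipy.fft import dct, idct

      n = 256
      A = np.exp(-np.abs(np.subtract.outer(np.arange(n), np.arange(n))) / 8.0)
      H = np.linalg.inv(A.T @ A + 0.1 * np.eye(n))     # dense MAP-style inverse

      # transform rows and columns, then threshold to obtain a sparse operator
      Ht = dct(dct(H, axis=0, norm='ortho'), axis=1, norm='ortho')
      Ht[np.abs(Ht) < 1e-3 * np.abs(Ht).max()] = 0.0
      sparsity = np.count_nonzero(Ht) / n ** 2         # fraction of kept entries

      y = np.random.default_rng(0).normal(size=n)
      x_fast = idct(Ht @ dct(y, norm='ortho'), norm='ortho')  # sparse-domain apply
      x_ref = H @ y                                           # dense reference
      rel_err = np.linalg.norm(x_fast - x_ref) / np.linalg.norm(x_ref)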

  9. Constructing Neuronal Network Models in Massively Parallel Environments

    PubMed Central

    Ippen, Tammo; Eppler, Jochen M.; Plesser, Hans E.; Diesmann, Markus

    2017-01-01

    Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers. PMID:28559808

  10. Characterizing Deep Brain Stimulation effects in computationally efficient neural network models.

    PubMed

    Latteri, Alberta; Arena, Paolo; Mazzone, Paolo

    2011-04-15

    Recent studies on the medical treatment of Parkinson's disease (PD) led to the introduction of the so-called Deep Brain Stimulation (DBS) technique. This particular therapy allows to contrast actively the pathological activity of various deep brain structures, responsible for the well known PD symptoms. This technique, frequently joined to dopaminergic drugs administration, replaces the surgical interventions implemented to contrast the activity of specific brain nuclei, called Basal Ganglia (BG). This clinical protocol gave the possibility to analyse and inspect signals measured from the electrodes implanted into the deep brain regions. The analysis of these signals led to the possibility to study PD as a specific case of dynamical synchronization in biological neural networks, with the advantage of applying the theoretical analysis developed in that scientific field to find efficient treatments for this important disease. Experimental results in fact show that the PD neurological diseases are characterized by a pathological signal synchronization in the BG. Parkinsonian tremor, for example, is ascribed to neuron populations of the Thalamic and Striatal structures that undergo an abnormal synchronization. On the contrary, in normal conditions, the activity of the same neuron populations does not appear to be correlated and synchronized. To study in detail the effect of the stimulation signal on a pathological neural medium, efficient models of these neural structures were built, which are able to show, without any external input, the intrinsic properties of a pathological neural tissue, mimicking the BG synchronized dynamics. We start by considering a model already introduced in the literature to investigate the effects of electrical stimulation on pathologically synchronized clusters of neurons. This model used Morris-Lecar-type neurons. This neuron model, although having a high level of biological plausibility, requires a large computational effort to simulate large scale networks. For this reason we considered a reduced order model, the Izhikevich one, which is computationally much lighter. The comparison between neural lattices built using both neuron models provided comparable results, both without traditional stimulation and in the presence of all the stimulation protocols. This was a first result toward the study and simulation of the large scale neural networks involved in pathological dynamics. Using the reduced order model, an inspection of the activity of two neural lattices was also carried out with the aim of analyzing how the stimulation in one area could affect the dynamics in another area, as the usual medical treatment protocols require. The study of population dynamics that was carried out allowed us to investigate, through simulations, the positive effects of the stimulation signals in terms of desynchronization of the neural dynamics. The results obtained constitute a significant added value to the analysis of synchronization and desynchronization effects due to neural stimulation. This work gives the opportunity to more efficiently study the effect of stimulation in large scale yet computationally efficient neural networks. Results were compared both with the other mathematical models, using Morris-Lecar and Izhikevich neurons, and with simulated Local Field Potentials (LFP).
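
    A minimal network of Izhikevich neurons, of the kind the reduced-order lattices above are built from, can be simulated in a few lines; the parameters below are the standard regular-spiking values, while the coupling and drive are illustrative rather than fitted to basal-ganglia data.

      import numpy as np

      rng = np.random.default_rng(1)
      N, T, dt = 100, 1000, 0.5                  # neurons, steps, ms per step
      a, b, c, d = 0.02, 0.2, -65.0, 8.0         # regular-spiking parameters
      v = -65.0 * np.ones(N)
      u = b * v
      W = 0.5 * rng.random((N, N)) / N           # weak all-to-all coupling
      rate = np.zeros(T)                         # population activity trace

      for step in range(T):
          fired = v >= 30.0
          rate[step] = fired.mean()              # inspect later for synchrony
          v[fired] = c
          u[fired] += d
          I = 5.0 + 2.0 * rng.normal(size=N) + 30.0 * (W @ fired.astype(float))
          for _ in range(2):                     # two half-steps for stability
              v += 0.5 * dt * (0.04 * v * v + 5.0 * v + 140.0 - u + I)
          u += dt * a * (b * v - u)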

  11. Modal Substructuring of Geometrically Nonlinear Finite-Element Models

    DOE PAGES

    Kuether, Robert J.; Allen, Matthew S.; Hollkamp, Joseph J.

    2015-12-21

    The efficiency of a modal substructuring method depends on the component modes used to reduce each subcomponent model. Methods such as Craig–Bampton have been used extensively to reduce linear finite-element models with thousands or even millions of degrees of freedom down orders of magnitude while maintaining acceptable accuracy. A novel reduction method is proposed here for geometrically nonlinear finite-element models using the fixed-interface and constraint modes of the linearized system to reduce each subcomponent model. The geometric nonlinearity requires an additional cubic and quadratic polynomial function in the modal equations, and the nonlinear stiffness coefficients are determined by applying a series of static loads and using the finite-element code to compute the response. The geometrically nonlinear, reduced modal equations for each subcomponent are then coupled by satisfying compatibility and force equilibrium. This modal substructuring approach is an extension of the Craig–Bampton method and is readily applied to geometrically nonlinear models built directly within commercial finite-element packages. The efficiency of this new approach is demonstrated on two example problems: one that couples two geometrically nonlinear beams at a shared rotational degree of freedom, and another that couples an axial spring element to the axial degree of freedom of a geometrically nonlinear beam. The nonlinear normal modes of the assembled models are compared with those of a truth model to assess the accuracy of the novel modal substructuring approach.
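
    For reference, reduced equations of this type are commonly written in the following form in the literature, with the quadratic coefficients B_r and cubic coefficients A_r being the nonlinear stiffness terms identified from the static load cases (the notation here is generic, not necessarily the authors'):

      \[
        \ddot{q}_r + 2\zeta_r\omega_r\dot{q}_r + \omega_r^2 q_r
        + \sum_{i=1}^{n}\sum_{j=i}^{n} B_r(i,j)\, q_i q_j
        + \sum_{i=1}^{n}\sum_{j=i}^{n}\sum_{k=j}^{n} A_r(i,j,k)\, q_i q_j q_k
        = f_r(t), \qquad r = 1, \dots, n .
      \]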

  12. Modal Substructuring of Geometrically Nonlinear Finite-Element Models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kuether, Robert J.; Allen, Matthew S.; Hollkamp, Joseph J.

    The efficiency of a modal substructuring method depends on the component modes used to reduce each subcomponent model. Methods such as Craig–Bampton have been used extensively to reduce linear finite-element models with thousands or even millions of degrees of freedom down orders of magnitude while maintaining acceptable accuracy. A novel reduction method is proposed here for geometrically nonlinear finite-element models using the fixed-interface and constraint modes of the linearized system to reduce each subcomponent model. The geometric nonlinearity requires an additional cubic and quadratic polynomial function in the modal equations, and the nonlinear stiffness coefficients are determined by applying a series of static loads and using the finite-element code to compute the response. The geometrically nonlinear, reduced modal equations for each subcomponent are then coupled by satisfying compatibility and force equilibrium. This modal substructuring approach is an extension of the Craig–Bampton method and is readily applied to geometrically nonlinear models built directly within commercial finite-element packages. The efficiency of this new approach is demonstrated on two example problems: one that couples two geometrically nonlinear beams at a shared rotational degree of freedom, and another that couples an axial spring element to the axial degree of freedom of a geometrically nonlinear beam. The nonlinear normal modes of the assembled models are compared with those of a truth model to assess the accuracy of the novel modal substructuring approach.

  13. Multiconfiguration Pair-Density Functional Theory Predicts Spin-State Ordering in Iron Complexes with the Same Accuracy as Complete Active Space Second-Order Perturbation Theory at a Significantly Reduced Computational Cost.

    PubMed

    Wilbraham, Liam; Verma, Pragya; Truhlar, Donald G; Gagliardi, Laura; Ciofini, Ilaria

    2017-05-04

    The spin-state orderings in nine Fe(II) and Fe(III) complexes with ligands of diverse ligand-field strength were investigated with multiconfiguration pair-density functional theory (MC-PDFT). The performance of this method was compared to that of complete active space second-order perturbation theory (CASPT2) and Kohn-Sham density functional theory. We also investigated the dependence of CASPT2 and MC-PDFT results on the size of the active space. MC-PDFT reproduces the CASPT2 spin-state ordering, the dependence on the ligand field strength, and the dependence on active space at a computational cost that is significantly reduced as compared to CASPT2.

  14. Techniques for Computation of Frequency Limited H∞ Norm

    NASA Astrophysics Data System (ADS)

    Haider, Shafiq; Ghafoor, Abdul; Imran, Muhammad; Fahad Mumtaz, Malik

    2018-01-01

    The traditional H∞ norm gives the peak system gain over an infinite frequency range, but many applications, such as filter design, model order reduction, and controller design, require the computation of the peak system gain over a specific frequency interval rather than the infinite range. In the present work, new computationally efficient techniques for the computation of the H∞ norm over a limited frequency interval are proposed. The proposed techniques link the norm computation to the maximum singular value of the system over the limited frequency interval. Numerical examples are included to validate the proposed concept.
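
    The link between the frequency-limited norm and the maximum singular value suggests an obvious brute-force baseline, sketched below: scan the largest singular value of G(jw) = C(jwI - A)^{-1}B + D over a dense grid in [w1, w2]. The paper's techniques are aimed at doing better than this scan; the state-space example is hypothetical.

      import numpy as np

      def hinf_band(A, B, C, D, w1, w2, n=2000):
          """Grid estimate of the frequency-limited H-infinity norm: the peak
          largest singular value of G(jw) over [w1, w2]."""
          I = np.eye(A.shape[0])
          peak = 0.0
          for w in np.linspace(w1, w2, n):
              G = C @ np.linalg.solve(1j * w * I - A, B) + D
              peak = max(peak, np.linalg.svd(G, compute_uv=False)[0])
          return peak

      # lightly damped second-order system: peak near w0 = 1 rad/s
      A = np.array([[0.0, 1.0], [-1.0, -0.1]])
      B = np.array([[0.0], [1.0]])
      C = np.array([[1.0, 0.0]])
      D = np.zeros((1, 1))
      print(hinf_band(A, B, C, D, 0.5, 2.0))   # ~10 for this example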

  15. A Recursive Approach to Compute Normal Forms

    NASA Astrophysics Data System (ADS)

    Hsu, L.; Min, L. J.; Favretto, L.

    2001-06-01

    Normal forms are instrumental in the analysis of dynamical systems described by ordinary differential equations, particularly when singularities close to a bifurcation are to be characterized. However, the computation of a normal form up to an arbitrary order is numerically hard. This paper focuses on the computer programming of some recursive formulas developed earlier to compute higher order normal forms. A computer program to reduce the system to its normal form on a center manifold is developed using the Maple symbolic language. However, it should be stressed that the program relies essentially on recursive numerical computations, while symbolic calculations are used only for minor tasks. Some strategies are proposed to save computation time. Examples are presented to illustrate the application of the program to obtain high order normalization or to handle systems with large dimension.

  16. An efficient implementation of 3D high-resolution imaging for large-scale seismic data with GPU/CPU heterogeneous parallel computing

    NASA Astrophysics Data System (ADS)

    Xu, Jincheng; Liu, Wei; Wang, Jin; Liu, Linong; Zhang, Jianfeng

    2018-02-01

    De-absorption pre-stack time migration (QPSTM) compensates for the absorption and dispersion of seismic waves by introducing an effective Q parameter, thereby making it an effective tool for 3D, high-resolution imaging of seismic data. Although the optimal aperture obtained via stationary-phase migration reduces the computational cost of 3D QPSTM and yields 3D stationary-phase QPSTM, the associated computational efficiency is still the main problem in the processing of 3D, high-resolution images for real large-scale seismic data. In the current paper, we proposed a division method for large-scale, 3D seismic data to optimize the performance of stationary-phase QPSTM on clusters of graphics processing units (GPU). Then, we designed an imaging point parallel strategy to achieve an optimal parallel computing performance. Afterward, we adopted an asynchronous double buffering scheme for multi-stream to perform the GPU/CPU parallel computing. Moreover, several key optimization strategies of computation and storage based on the compute unified device architecture (CUDA) were adopted to accelerate the 3D stationary-phase QPSTM algorithm. Compared with the initial GPU code, the implementation of the key optimization steps, including thread optimization, shared memory optimization, register optimization and special function units (SFU), greatly improved the efficiency. A numerical example employing real large-scale, 3D seismic data showed that our scheme is nearly 80 times faster than the CPU-QPSTM algorithm. Our GPU/CPU heterogeneous parallel computing framework significantly reduces the computational cost and facilitates 3D high-resolution imaging for large-scale seismic data.

  17. Scalable Conjunction Processing using Spatiotemporally Indexed Ephemeris Data

    NASA Astrophysics Data System (ADS)

    Budianto-Ho, I.; Johnson, S.; Sivilli, R.; Alberty, C.; Scarberry, R.

    2014-09-01

    The collision warnings produced by the Joint Space Operations Center (JSpOC) are of critical importance in protecting U.S. and allied spacecraft against destructive collisions and protecting the lives of astronauts during space flight. As the Space Surveillance Network (SSN) improves its sensor capabilities for tracking small and dim space objects, the number of tracked objects increases from thousands to hundreds of thousands, while the number of potential conjunctions increases with the square of the number of tracked objects. Classical filtering techniques such as apogee and perigee filters have proven insufficient; novel, orders-of-magnitude-faster conjunction analysis algorithms are required to find conjunctions in a timely manner. Stellar Science has developed innovative filtering techniques for satellite conjunction processing using spatiotemporally indexed ephemeris data that efficiently and accurately reduce the number of objects requiring high-fidelity, computationally intensive conjunction analysis. Two such algorithms, one based on the k-d tree pioneered in robotics applications and the other based on the spatial hash tables used in computer gaming and animation, use, at worst, an initial O(N log N) preprocessing pass (where N is the number of tracked objects) to build large O(N) spatial data structures that substantially reduce the required number of O(N^2) computations, substituting linear memory usage for quadratic processing time. The filters have been implemented as Open Services Gateway initiative (OSGi) plug-ins for the Continuous Anomalous Orbital Situation Discriminator (CAOS-D) conjunction analysis architecture. We have demonstrated the effectiveness, efficiency, and scalability of the techniques using a catalog of 100,000 objects and an analysis window of one day on a 64-core computer with 1 TB of shared memory. Each algorithm can process the full catalog in 6 minutes or less, almost a twenty-fold performance improvement over the baseline implementation running on the same machine. We will present an overview of the algorithms and results that demonstrate the scalability of our concepts.
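
    A toy version of the spatial-hash broad phase may clarify the idea: hash each object's position into a cell the size of the screening distance, so that only pairs in the same or adjacent cells need the expensive high-fidelity conjunction analysis. This is a single-epoch simplification of the spatiotemporal indexing described above; all names are hypothetical.

    ```python
    from collections import defaultdict
    from itertools import product

    def candidate_pairs(positions, cell_size):
        """Spatial-hash broad phase: any pair closer than cell_size must lie in
        the same or an adjacent cell, so only those pairs are returned for
        fine (high-fidelity) conjunction analysis."""
        grid = defaultdict(list)
        for i, (x, y, z) in enumerate(positions):
            cell = (int(x // cell_size), int(y // cell_size), int(z // cell_size))
            grid[cell].append(i)
        pairs = set()
        for (cx, cy, cz), members in grid.items():
            for dx, dy, dz in product((-1, 0, 1), repeat=3):
                for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
                    for i in members:
                        if i < j:
                            pairs.add((i, j))
        return pairs

    # Hypothetical positions in km, screening at 10 km: only the first two pair up
    print(candidate_pairs([(0, 0, 0), (4, 3, 0), (7000, 0, 0)], cell_size=10.0))
    ```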

  18. CUDA GPU based full-Stokes finite difference modelling of glaciers

    NASA Astrophysics Data System (ADS)

    Brædstrup, C. F.; Egholm, D. L.

    2012-04-01

    Many have stressed the limitations of using the shallow-shelf and shallow-ice approximations when modelling ice streams or surging glaciers. A full-Stokes approach requires large amounts of computing power or time and is therefore seldom an option for most glaciologists. Recent advances in graphics card (GPU) technology for high-performance computing have proven extremely efficient at accelerating many large-scale scientific computations. General-purpose GPU (GPGPU) technology is cheap, has low power consumption, and fits into a normal desktop computer, so it could provide a powerful tool for many glaciologists. Our full-Stokes ice-sheet model implements a red-black Gauss-Seidel iterative linear solver for the full-Stokes equations. This technique has proven very effective when applied to the Stokes equations in geodynamics problems and should therefore also perform well in glaciological flow problems. The Gauss-Seidel iterator is known to be robust, but several other linear solvers converge much faster. To aid convergence, the solver uses a multigrid approach in which values are interpolated and extrapolated between different grid resolutions to damp short-wavelength errors efficiently. This reduces the iteration count by several orders of magnitude. The run time is further reduced by the GPGPU technology, where each card has up to 448 cores. Researchers using the GPGPU technique in other areas have reported speedups of 2 to 11 times over multicore CPU implementations of similar problems. The goal of these initial investigations into the possible use of GPGPU technology in glacial modelling is to apply the enhanced resolution of a full-Stokes solver to ice streams and surging glaciers. This is an area of growing interest because ice streams are the main drainage conduits for large ice sheets, so it is crucial to understand this streaming behaviour and its impact up-ice.
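
    A minimal sketch of the red-black ordering, using the 2D Poisson equation as a simplified stand-in for the full-Stokes system: all points of one colour depend only on points of the other colour, so each half-sweep can be updated in parallel on a GPU. The multigrid acceleration mentioned above is omitted; this shows only the smoother, and all names are hypothetical.

    ```python
    import numpy as np

    def red_black_gauss_seidel(u, f, h, sweeps=100):
        """Red-black Gauss-Seidel for -laplace(u) = f on a uniform grid with
        Dirichlet boundaries.  Points of one colour depend only on the other
        colour, so each half-sweep is embarrassingly parallel on a GPU."""
        for _ in range(sweeps):
            for colour in (0, 1):                   # 0 = red, 1 = black
                for i in range(1, u.shape[0] - 1):
                    for j in range(1, u.shape[1] - 1):
                        if (i + j) % 2 == colour:
                            u[i, j] = 0.25 * (u[i-1, j] + u[i+1, j]
                                              + u[i, j-1] + u[i, j+1]
                                              + h * h * f[i, j])
        return u

    n = 33
    u = np.zeros((n, n))
    f = np.ones((n, n))
    u = red_black_gauss_seidel(u, f, h=1.0 / (n - 1))
    print(u[n // 2, n // 2])   # centre value approaches ~0.074 as it converges
    ```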

  19. Fast Reduction Method in Dominance-Based Information Systems

    NASA Astrophysics Data System (ADS)

    Li, Yan; Zhou, Qinghua; Wen, Yongchuan

    2018-01-01

    In real-world applications, there are often data with continuous values or preference-ordered values. Rough sets based on dominance relations can effectively deal with these kinds of data. Attribute reduction can be done in the framework of the dominance-relation-based approach to better extract decision rules. However, the computational cost of the dominance classes greatly affects the efficiency of attribute reduction and rule extraction. This paper presents an efficient method of computing dominance classes and compares it with the traditional method as the numbers of attributes and samples increase. Experiments on UCI data sets show that the proposed algorithm markedly improves on the efficiency of the traditional method, especially for large-scale data.
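
    The abstract does not detail the accelerated algorithm, so the sketch below shows only the baseline it improves upon: computing each object's dominating set by direct pairwise comparison, the O(n² · m) cost (n objects, m criteria) that motivates the paper. All names are hypothetical.

    ```python
    import numpy as np

    def dominance_classes(X):
        """Dominating sets under the usual dominance relation: y dominates x
        when y >= x on every (gain-type) criterion.  Direct pairwise
        comparison, O(n^2 * m) -- the baseline cost the paper targets."""
        X = np.asarray(X)
        # dominates[i, j] is True when object i >= object j on all criteria
        dominates = np.all(X[:, None, :] >= X[None, :, :], axis=2)
        return [set(np.flatnonzero(dominates[:, j])) for j in range(len(X))]

    # Three objects evaluated on two preference-ordered criteria
    print(dominance_classes([[1, 2], [2, 3], [2, 1]]))
    ```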

  20. NASA Lewis Stirling SPRE testing and analysis with reduced number of cooler tubes

    NASA Technical Reports Server (NTRS)

    Wong, Wayne A.; Cairelli, James E.; Swec, Diane M.; Doeberling, Thomas J.; Lakatos, Thomas F.; Madi, Frank J.

    1992-01-01

    Free-piston Stirling power converters are candidates for high-capacity space power applications. The Space Power Research Engine (SPRE), a free-piston Stirling engine coupled with a linear alternator, is being tested at the NASA Lewis Research Center in support of the Civil Space Technology Initiative. The SPRE is used as a test bed for evaluating converter modifications that have the potential to improve converter performance, and for validating computer code predictions. Reducing the number of cooler tubes on the SPRE has been identified as a modification with the potential to significantly improve power and efficiency. Experimental tests designed to investigate the effects of reducing the number of cooler tubes on converter power, efficiency, and dynamics are described. Test results from the converter operating with a reduced number of cooler tubes are presented, along with comparisons between these data and both baseline test data and computer code predictions.
