Automated Development of Accurate Algorithms and Efficient Codes for Computational Aeroacoustics
NASA Technical Reports Server (NTRS)
Goodrich, John W.; Dyson, Rodger W.
1999-01-01
The simulation of sound generation and propagation in three space dimensions with realistic aircraft components is a very large time dependent computation with fine details. Simulations in open domains with embedded objects require accurate and robust algorithms for propagation, for artificial inflow and outflow boundaries, and for the definition of geometrically complex objects. The development, implementation, and validation of methods for solving these demanding problems is being done to support the NASA pillar goals for reducing aircraft noise levels. Our goal is to provide algorithms which are sufficiently accurate and efficient to produce usable results rapidly enough to allow design engineers to study the effects on sound levels of design changes in propulsion systems, and in the integration of propulsion systems with airframes. There is a lack of design tools for these purposes at this time. Our technical approach to this problem combines the development of new algorithms with the use of Mathematica and Unix utilities to automate the algorithm development, code implementation, and validation. We use explicit methods to ensure effective implementation by domain decomposition for SPMD parallel computing. There are several orders of magnitude difference in the computational efficiencies of the algorithms which we have considered. We currently have new artificial inflow and outflow boundary conditions that are stable, accurate, and unobtrusive, with implementations that match the accuracy and efficiency of the propagation methods. The artificial numerical boundary treatments have been proven to have solutions which converge to the full open domain problems, so that the error from the boundary treatments can be driven as low as is required. The purpose of this paper is to briefly present a method for developing highly accurate algorithms for computational aeroacoustics, the use of computer automation in this process, and a brief survey of the algorithms that
An Accurate and Computationally Efficient Model for Membrane-Type Circular-Symmetric Micro-Hotplates
Khan, Usman; Falconi, Christian
2014-01-01
Ideally, the design of high-performance micro-hotplates would require a large number of simulations because of the existence of many important design parameters as well as the possibly crucial effects of both spread and drift. However, the computational cost of FEM simulations, which are the only available tool for accurately predicting the temperature in micro-hotplates, is very high. As a result, micro-hotplate designers generally have no effective simulation tools for the optimization. In order to circumvent these issues, here, we propose a model for practical circular-symmetric micro-hotplates which takes advantage of modified Bessel functions, a computationally efficient matrix approach for considering the relevant boundary conditions, Taylor linearization for modeling the Joule heating and radiation losses, and an external-region-segmentation strategy in order to accurately take into account radiation losses in the entire micro-hotplate. The proposed model is almost as accurate as FEM simulations and two to three orders of magnitude more computationally efficient (e.g., 45 s versus more than 8 h). The residual errors, which are mainly associated with the undesired heating in the electrical contacts, are small (e.g., a few degrees Celsius for an 800 °C operating temperature) and, for important analyses, almost constant. Therefore, we also introduce a computationally easy single-FEM-compensation strategy in order to reduce the residual errors to about 1 °C. As illustrative examples of the power of our approach, we report the systematic investigation of a spread in the membrane thermal conductivity and of combined variations of both ambient and bulk temperatures. Our model enables a much faster characterization of micro-hotplates and, thus, a much more effective optimization prior to fabrication. PMID:24763214
Kearns, F L; Hudson, P S; Boresch, S; Woodcock, H L
2016-01-01
Enzyme activity is inherently linked to free energies of transition states, ligand binding, protonation/deprotonation, etc.; these free energies, and thus enzyme function, can be affected by residue mutations, allosterically induced conformational changes, and much more. Therefore, being able to predict free energies associated with enzymatic processes is critical to understanding and predicting their function. Free energy simulation (FES) has historically been a computational challenge as it requires both the accurate description of inter- and intramolecular interactions and adequate sampling of all relevant conformational degrees of freedom. The hybrid quantum mechanical molecular mechanical (QM/MM) framework is the current tool of choice when accurate computations of macromolecular systems are essential. Unfortunately, robust and efficient approaches that employ the high levels of computational theory needed to accurately describe many reactive processes (i.e., ab initio, DFT), while also including explicit solvation effects and accounting for extensive conformational sampling, are essentially nonexistent. In this chapter, we will give a brief overview of two recently developed methods that mitigate several major challenges associated with QM/MM FES: the QM non-Boltzmann Bennett's acceptance ratio method and the QM nonequilibrium work method. We will also describe usage of these methods to calculate free energies (1) associated with relative properties and (2) along reaction paths, using simple test cases that are relevant to enzymes.
NASA Technical Reports Server (NTRS)
Chaderjian, Neal M.
1991-01-01
Computations from two Navier-Stokes codes, NSS and F3D, are presented for a tangent-ogive-cylinder body at high angle of attack. Features of this steady flow include a pair of primary vortices on the leeward side of the body as well as secondary vortices. The topological and physical plausibility of this vortical structure is discussed. The accuracy of these codes is assessed by comparison of the numerical solutions with experimental data. The effects of turbulence model, numerical dissipation, and grid refinement are presented. The overall efficiency of these codes is also assessed by examining their convergence rates, computational time per time step, and maximum allowable time step for time-accurate computations. Overall, the numerical results from both codes compared equally well with experimental data; however, the NSS code was found to be significantly more efficient than the F3D code.
NASA Astrophysics Data System (ADS)
Luo, Ye; Esler, Kenneth; Kent, Paul; Shulenburger, Luke
Quantum Monte Carlo (QMC) calculations of giant molecules and of surface and defect properties of solids have recently become feasible due to drastically expanding computational resources. However, with the most computationally efficient basis set, B-splines, these calculations are severely restricted by the memory capacity of compute nodes. The B-spline coefficients are shared on a node but not distributed among nodes, to ensure fast evaluation. A hybrid representation which incorporates atomic orbitals near the ions and B-spline ones in the interstitial regions offers a more accurate and less memory demanding description of the orbitals because they are naturally more atomic like near ions and much smoother in between, thus allowing coarser B-spline grids. We will demonstrate the advantage of the hybrid representation over pure B-spline and Gaussian basis sets and also show significant speed-ups, for example in computing the non-local pseudopotentials with our new scheme. Moreover, we discuss a new algorithm for atomic orbital initialization, which previously required an extra workflow step taking a few days. With this work, the highly efficient hybrid representation paves the way to simulating large and even inhomogeneous systems using QMC. This work was supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences, Computational Materials Sciences Program.
An efficient and accurate 3D displacements tracking strategy for digital volume correlation
NASA Astrophysics Data System (ADS)
Pan, Bing; Wang, Bo; Wu, Dafang; Lubineau, Gilles
2014-07-01
Owing to its inherent computational complexity, practical implementation of digital volume correlation (DVC) for internal displacement and strain mapping faces important challenges in improving its computational efficiency. In this work, an efficient and accurate 3D displacement tracking strategy is proposed for fast DVC calculation. The efficiency advantage is achieved by using three improvements. First, to eliminate the need to update the Hessian matrix in each iteration, an efficient 3D inverse compositional Gauss-Newton (3D IC-GN) algorithm is introduced to replace existing forward additive algorithms for accurate sub-voxel displacement registration. Second, to ensure that the 3D IC-GN algorithm converges accurately and rapidly and to avoid time-consuming integer-voxel displacement searching, a generalized reliability-guided displacement tracking strategy is designed to transfer an accurate and complete initial guess of deformation for each calculation point from its computed neighbors. Third, to avoid the repeated computation of sub-voxel intensity interpolation coefficients, an interpolation coefficient lookup table is established for tricubic interpolation. The computational complexity of the proposed fast DVC and the existing typical DVC algorithms is first analyzed quantitatively according to necessary arithmetic operations. Then, numerical tests are performed to verify the performance of the fast DVC algorithm in terms of measurement accuracy and computational efficiency. The experimental results indicate that, compared with the existing DVC algorithm, the presented fast DVC algorithm produces similar precision and slightly higher accuracy at a substantially reduced computational cost.
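The third improvement above (a lookup table of interpolation coefficients) can be illustrated with a minimal one-dimensional sketch. This is not the authors' tricubic implementation: it assumes a Catmull-Rom cubic as a stand-in interpolant, and the names `catmull_rom_coeffs` and `CachedInterpolator` are hypothetical. The point is only that the polynomial coefficients of a cell depend on the neighbouring samples and not on the sub-voxel position, so they can be computed once per cell and reused.

```python
import math


def catmull_rom_coeffs(p0, p1, p2, p3):
    """Cubic coefficients (a0..a3) on the interval between samples p1 and p2.

    Assumption: Catmull-Rom is used here only as a simple stand-in for the
    tricubic scheme mentioned in the abstract.
    """
    a0 = p1
    a1 = 0.5 * (p2 - p0)
    a2 = 0.5 * (2 * p0 - 5 * p1 + 4 * p2 - p3)
    a3 = 0.5 * (-p0 + 3 * p1 - 3 * p2 + p3)
    return a0, a1, a2, a3


class CachedInterpolator:
    """1D cubic interpolator that stores per-interval coefficients in a table."""

    def __init__(self, samples):
        self.samples = list(samples)
        self.table = {}  # interval index -> (a0, a1, a2, a3)

    def _coeffs(self, i):
        # Compute the coefficients only the first time interval i is visited.
        if i not in self.table:
            s = self.samples
            last = len(s) - 1
            # Clamp neighbour indices at the ends of the signal.
            idx = [min(max(j, 0), last) for j in (i - 1, i, i + 1, i + 2)]
            self.table[i] = catmull_rom_coeffs(*(s[j] for j in idx))
        return self.table[i]

    def __call__(self, x):
        i = max(min(int(math.floor(x)), len(self.samples) - 2), 0)
        t = x - i
        a0, a1, a2, a3 = self._coeffs(i)
        return a0 + t * (a1 + t * (a2 + t * a3))  # Horner evaluation


interp = CachedInterpolator([0.0, 1.0, 4.0, 9.0, 16.0])
values = [interp(x) for x in (1.25, 1.5, 1.75)]  # all three reuse one table entry
```

In an iterative registration loop the same cell is typically queried many times at slightly different sub-voxel offsets, which is where a table of this kind would be expected to pay off.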
NASA Astrophysics Data System (ADS)
Zhuang, Wei; Mountrakis, Giorgos
2014-09-01
Large footprint waveform LiDAR sensors have been widely used for numerous airborne studies. Ground peak identification in a large footprint waveform is a significant bottleneck in exploring full usage of the waveform datasets. In the current study, an accurate and computationally efficient algorithm was developed for ground peak identification, called Filtering and Clustering Algorithm (FICA). The method was evaluated on Land, Vegetation, and Ice Sensor (LVIS) waveform datasets acquired over Central NY. FICA incorporates a set of multi-scale second derivative filters and a k-means clustering algorithm in order to avoid detecting false ground peaks. FICA was tested in five different land cover types (deciduous trees, coniferous trees, shrub, grass and developed area) and showed more accurate results when compared to existing algorithms. More specifically, compared with Gaussian decomposition (GD), the RMSE of ground peak identification by FICA was 2.82 m (5.29 m for GD) in deciduous plots, 3.25 m (4.57 m for GD) in coniferous plots, 2.63 m (2.83 m for GD) in shrub plots, 0.82 m (0.93 m for GD) in grass plots, and 0.70 m (0.51 m for GD) in plots of developed areas. FICA performance was also relatively consistent under various slope and canopy coverage (CC) conditions. In addition, FICA showed better computational efficiency compared to existing methods. FICA's major computational and accuracy advantage is a result of the adopted multi-scale signal processing procedures that concentrate on local portions of the signal, as opposed to Gaussian decomposition, which uses a curve-fitting strategy applied to the entire signal. The FICA algorithm is a good candidate for large-scale implementation on future space-borne waveform LiDAR sensors.
BLESS 2: accurate, memory-efficient and fast error correction method.
Heo, Yun; Ramachandran, Anand; Hwu, Wen-Mei; Ma, Jian; Chen, Deming
2016-08-01
The most important features of error correction tools for sequencing data are accuracy, memory efficiency and fast runtime. The previous version of BLESS was highly memory-efficient and accurate, but it was too slow to handle reads from large genomes. We have developed a new version of BLESS to improve runtime and accuracy while maintaining a small memory usage. The new version, called BLESS 2, has an error correction algorithm that is more accurate than BLESS, and the algorithm has been parallelized using hybrid MPI and OpenMP programming. BLESS 2 was compared with five top-performing tools, and it was found to be the fastest when it was executed on two computing nodes using MPI, with each node containing twelve cores. Also, BLESS 2 showed at least 11% higher gain while retaining the memory efficiency of the previous version for large genomes. Availability: freely available at https://sourceforge.net/projects/bless-ec. Contact: dchen@illinois.edu. Supplementary data are available at Bioinformatics online.
Computationally Efficient Multiconfigurational Reactive Molecular Dynamics
Yamashita, Takefumi; Peng, Yuxing; Knight, Chris; Voth, Gregory A.
2012-01-01
It is a computationally demanding task to explicitly simulate the electronic degrees of freedom in a system to observe the chemical transformations of interest, while at the same time sampling the time and length scales required to converge statistical properties and thus reduce artifacts due to initial conditions, finite-size effects, and limited sampling. One solution that significantly reduces the computational expense consists of molecular models in which effective interactions between particles govern the dynamics of the system. If the interaction potentials in these models are developed to reproduce calculated properties from electronic structure calculations and/or ab initio molecular dynamics simulations, then one can calculate accurate properties at a fraction of the computational cost. Multiconfigurational algorithms, also sometimes referred to as “multistate” algorithms, model the system as a linear combination of several chemical bonding topologies to simulate chemical reactions. These algorithms typically utilize energy and force calculations already found in popular molecular dynamics software packages, thus facilitating their implementation without significant changes to the structure of the code. However, the evaluation of energies and forces for several bonding topologies per simulation step can lead to poor computational efficiency if redundancy is not efficiently removed, particularly with respect to the calculation of long-ranged Coulombic interactions. This paper presents accurate approximations (effective long-range interaction and resulting hybrid methods) and multiple-program parallelization strategies for the efficient calculation of electrostatic interactions in reactive molecular simulations. PMID:25100924
Accurate atom-mapping computation for biochemical reactions.
Latendresse, Mario; Malerich, Jeremiah P; Travers, Mike; Karp, Peter D
2012-11-26
The complete atom mapping of a chemical reaction is a bijection of the reactant atoms to the product atoms that specifies the terminus of each reactant atom. Atom mapping of biochemical reactions is useful for many applications of systems biology, in particular for metabolic engineering, where synthesizing new biochemical pathways has to take into account the number of carbon atoms from a source compound that are conserved in the synthesis of a target compound. Rapid, accurate computation of the atom mapping(s) of a biochemical reaction remains elusive despite significant work on this topic. In particular, past researchers did not validate the accuracy of mapping algorithms. We introduce a new method for computing atom mappings called the minimum weighted edit-distance (MWED) metric. The metric is based on bond propensity to react and computes biochemically valid atom mappings for a large percentage of biochemical reactions. MWED models can be formulated efficiently as Mixed-Integer Linear Programs (MILPs). We have demonstrated this approach on 7501 reactions of the MetaCyc database, for which 87% of the models could be solved in less than 10 s. For 2.1% of the reactions, we found multiple optimal atom mappings. We show that the error rate is 0.9% (22 reactions) by comparing these atom mappings to 2446 atom mappings of the manually curated Kyoto Encyclopedia of Genes and Genomes (KEGG) RPAIR database. To our knowledge, our computational atom-mapping approach is the most accurate and among the fastest published to date. The atom-mapping data will be available in the MetaCyc database later in 2012; the atom-mapping software will be available within the Pathway Tools software later in 2012.
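To make the idea of an atom mapping as a cost-minimizing bijection concrete, here is a toy sketch. It is not the MWED/MILP method of the abstract: it assumes unit weights for every broken or formed bond (the abstract's weights come from bond propensity to react), and it enumerates all element-preserving bijections by brute force, which is only workable for a handful of atoms. The function name `best_atom_mapping` and the example molecules are hypothetical.

```python
from itertools import permutations


def best_atom_mapping(r_elems, r_bonds, p_elems, p_bonds):
    """Return (mapping, cost) with the fewest bond edits.

    mapping[i] is the product atom index assigned to reactant atom i.
    cost counts bonds broken plus bonds formed (unit weights: an assumption).
    """
    n = len(r_elems)
    product_bonds = {frozenset(b) for b in p_bonds}
    best, best_cost = None, None
    for perm in permutations(range(n)):
        # A valid mapping must send each atom to an atom of the same element.
        if any(r_elems[i] != p_elems[perm[i]] for i in range(n)):
            continue
        mapped = {frozenset((perm[a], perm[b])) for a, b in r_bonds}
        # Symmetric difference = bonds broken + bonds formed under this mapping.
        cost = len(mapped ^ product_bonds)
        if best_cost is None or cost < best_cost:
            best, best_cost = perm, cost
    return best, best_cost


# Same C-C-O chain, with the product atoms listed in a different order.
mapping, cost = best_atom_mapping(
    ["C", "C", "O"], [(0, 1), (1, 2)],
    ["O", "C", "C"], [(0, 1), (1, 2)],
)
```

The factorial growth of this enumeration is the reason a formulation such as a MILP is needed for realistic reactions.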
Xie, Xiurui; Qu, Hong; Yi, Zhang; Kurths, Jurgen
2017-06-01
The spiking neural network (SNN) is the third generation of neural networks and performs remarkably well in cognitive tasks, such as pattern recognition. The temporal neural encoding mechanism found in the biological hippocampus enables SNNs to possess more powerful computation capability than networks with other encoding schemes. However, this temporal encoding approach requires neurons to process information serially in time, which reduces learning efficiency significantly. To keep the powerful computation capability of the temporal encoding mechanism and to overcome its low efficiency in the training of SNNs, a new training algorithm, the accurate synaptic-efficiency adjustment method, is proposed in this paper. Inspired by the selective attention mechanism of the primate visual system, our algorithm selects only the target spike times as attention areas and ignores the voltage states of the non-target ones, resulting in a significant reduction of training time. In addition, our algorithm employs a cost function based on the voltage difference between the potential of the output neuron and the firing threshold of the SNN, instead of the traditional precise firing time distance. A normalized spike-timing-dependent-plasticity learning window is applied to assign this error to different synapses to guide their training. Comprehensive simulations are conducted to investigate the learning properties of our algorithm, with input neurons emitting both single spikes and multiple spikes. Simulation results indicate that our algorithm possesses higher learning performance than other existing methods and achieves state-of-the-art efficiency in the training of SNNs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sun, Jianwei; Remsing, Richard C.; Zhang, Yubo
2016-06-13
One atom or molecule binds to another through various types of bond, the strengths of which range from several meV to several eV. Although some computational methods can provide accurate descriptions of all bond types, those methods are not efficient enough for many studies (for example, large systems, ab initio molecular dynamics and high-throughput searches for functional materials). Here, we show that the recently developed non-empirical strongly constrained and appropriately normed (SCAN) meta-generalized gradient approximation (meta-GGA) within the density functional theory framework predicts accurate geometries and energies of diversely bonded molecules and materials (including covalent, metallic, ionic, hydrogen and van der Waals bonds). This represents a significant improvement at comparable efficiency over its predecessors, the GGAs that currently dominate materials computation. Often, SCAN matches or improves on the accuracy of a computationally expensive hybrid functional, at almost-GGA cost. SCAN is therefore expected to have a broad impact on chemistry and materials science.
Sun, Jianwei; Remsing, Richard C; Zhang, Yubo; Sun, Zhaoru; Ruzsinszky, Adrienn; Peng, Haowei; Yang, Zenghui; Paul, Arpita; Waghmare, Umesh; Wu, Xifan; Klein, Michael L; Perdew, John P
2016-09-01
One atom or molecule binds to another through various types of bond, the strengths of which range from several meV to several eV. Although some computational methods can provide accurate descriptions of all bond types, those methods are not efficient enough for many studies (for example, large systems, ab initio molecular dynamics and high-throughput searches for functional materials). Here, we show that the recently developed non-empirical strongly constrained and appropriately normed (SCAN) meta-generalized gradient approximation (meta-GGA) within the density functional theory framework predicts accurate geometries and energies of diversely bonded molecules and materials (including covalent, metallic, ionic, hydrogen and van der Waals bonds). This represents a significant improvement at comparable efficiency over its predecessors, the GGAs that currently dominate materials computation. Often, SCAN matches or improves on the accuracy of a computationally expensive hybrid functional, at almost-GGA cost. SCAN is therefore expected to have a broad impact on chemistry and materials science.
Computational efficiency for the surface renewal method
NASA Astrophysics Data System (ADS)
Kelley, Jason; Higgins, Chad
2018-04-01
Measuring surface fluxes using the surface renewal (SR) method requires programmatic algorithms for tabulation, algebraic calculation, and data quality control. A number of different methods have been published describing automated calibration of SR parameters. Because the SR method utilizes high-frequency (10 Hz+) measurements, some steps in the flux calculation are computationally expensive, especially when automating SR to perform many iterations of these calculations. Several new algorithms were written that perform the required calculations more efficiently and rapidly, and these were tested for sensitivity to length of flux averaging period, ability to measure over a large range of lag timescales, and overall computational efficiency. These algorithms utilize signal processing techniques and algebraic simplifications, demonstrating that simple modifications can dramatically improve computational efficiency. The results here complement efforts by other authors to standardize a robust and accurate computational SR method. Faster computation grants flexibility in implementing the SR method, opening new avenues for SR to be used in research, for applied monitoring, and in novel field deployments.
Efficiency and Accuracy of Time-Accurate Turbulent Navier-Stokes Computations
NASA Technical Reports Server (NTRS)
Rumsey, Christopher L.; Sanetrik, Mark D.; Biedron, Robert T.; Melson, N. Duane; Parlette, Edward B.
1995-01-01
The accuracy and efficiency of two types of subiterations in both explicit and implicit Navier-Stokes codes are explored for unsteady laminar circular-cylinder flow and unsteady turbulent flow over an 18-percent-thick circular-arc (biconvex) airfoil. Grid and time-step studies are used to assess the numerical accuracy of the methods. Nonsubiterative time-stepping schemes and schemes with physical time subiterations are subject to time-step limitations in practice that are removed by pseudo time sub-iterations. Computations for the circular-arc airfoil indicate that a one-equation turbulence model predicts the unsteady separated flow better than an algebraic turbulence model; also, the hysteresis with Mach number of the self-excited unsteadiness due to shock and boundary-layer separation is well predicted.
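The role of pseudo-time subiterations can be sketched on a scalar model problem. This is an illustration only, not the Navier-Stokes codes of the abstract: it assumes a backward-Euler physical time step for du/dt = rhs(u), and the function name and parameter values are hypothetical. Within each physical step, the implicit equation is driven to convergence by marching an auxiliary pseudo-time problem, so the physical step is no longer bound by the explicit stability limit.

```python
def backward_euler_step_pseudo_time(u_n, rhs, dt, dtau, n_sub):
    """Advance one physical step of du/dt = rhs(u) with backward Euler.

    The implicit equation (u - u_n)/dt - rhs(u) = 0 is solved by explicit
    pseudo-time marching: u <- u - dtau * residual(u), repeated n_sub times.
    """
    u = u_n
    for _ in range(n_sub):
        residual = (u - u_n) / dt - rhs(u)
        u = u - dtau * residual
    return u


lam = 10.0   # stiff decay rate (hypothetical test value)
dt = 0.5     # physical step; explicit Euler would need dt < 2/lam = 0.2
u_next = backward_euler_step_pseudo_time(
    1.0, lambda u: -lam * u, dt, dtau=0.05, n_sub=60
)
# For this linear problem the converged value is u_n / (1 + lam * dt).
```

Here the pseudo-time step `dtau` has its own stability limit, but it affects only the convergence of the subiterations, not the time accuracy of the physical step.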
NASA Astrophysics Data System (ADS)
Yu, Jieqing; Wu, Lixin; Hu, Qingsong; Yan, Zhigang; Zhang, Shaoliang
2017-12-01
Visibility computation is of great interest to location optimization, environmental planning, ecology, and tourism. Many algorithms have been developed for visibility computation. In this paper, we propose a novel method of visibility computation, called synthetic visual plane (SVP), to achieve better performance with respect to efficiency, accuracy, or both. The method uses a global horizon, which is a synthesis of line-of-sight information of all nearer points, to determine the visibility of a point, which makes it an accurate visibility method. We used discretization of horizon to gain a good performance in efficiency. After discretization, the accuracy and efficiency of SVP depends on the scale of discretization (i.e., zone width). The method is more accurate at smaller zone widths, but this requires a longer operating time. Users must strike a balance between accuracy and efficiency at their discretion. According to our experiments, SVP is less accurate but more efficient than R2 if the zone width is set to one grid. However, SVP becomes more accurate than R2 when the zone width is set to 1/24 grid, while it continues to perform as fast or faster than R2. Although SVP performs worse than reference plane and depth map with respect to efficiency, it is superior in accuracy to these other two algorithms.
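The horizon idea behind this kind of method can be shown on a single terrain profile. This is a simplified one-dimensional sketch, not the SVP algorithm itself, which synthesizes a global horizon over a 2D grid and discretizes it into zones; the function name and the sample heights are hypothetical. A cell is visible when its elevation slope, seen from the observer, is at least as steep as the steepest slope among all nearer cells.

```python
def visible_along_profile(heights, observer_height=0.0, spacing=1.0):
    """Return one boolean per cell: is it visible from the first cell?

    heights: terrain elevations along a straight line from the observer.
    The running maximum slope plays the role of the horizon.
    """
    eye = heights[0] + observer_height
    horizon = float("-inf")   # steepest slope among nearer cells so far
    result = [True]           # the observer's own cell
    for i in range(1, len(heights)):
        slope = (heights[i] - eye) / (i * spacing)
        result.append(slope >= horizon)
        horizon = max(horizon, slope)
    return result


flags = visible_along_profile([0.0, 1.0, 1.0, 4.0, 2.0])
```

Because the horizon is updated incrementally, each cell costs constant work, which is the kind of saving a horizon-based method relies on compared with tracing a separate line of sight to every target.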
NASA Technical Reports Server (NTRS)
Kory, Carol L.
1999-01-01
The phenomenal growth of commercial communications has created a great demand for traveling-wave tube (TWT) amplifiers. Although the helix slow-wave circuit remains the mainstay of the TWT industry because of its exceptionally wide bandwidth, until recently it has been impossible to accurately analyze a helical TWT using its exact dimensions because of the complexity of its geometrical structure. For the first time, an accurate three-dimensional helical model was developed that allows accurate prediction of TWT cold-test characteristics including operating frequency, interaction impedance, and attenuation. This computational model, which was developed at the NASA Lewis Research Center, allows TWT designers to obtain a more accurate value of interaction impedance than is possible using experimental methods. Obtaining helical slow-wave circuit interaction impedance is an important part of the design process for a TWT because it is related to the gain and efficiency of the tube. This impedance cannot be measured directly; thus, conventional methods involve perturbing a helical circuit with a cylindrical dielectric rod placed on the central axis of the circuit and obtaining the difference in resonant frequency between the perturbed and unperturbed circuits. A mathematical relationship has been derived between this frequency difference and the interaction impedance (ref. 1). However, because of the complex configuration of the helical circuit, deriving this relationship involves several approximations. In addition, this experimental procedure is time-consuming and expensive, but until recently it was widely accepted as the most accurate means of determining interaction impedance. The advent of an accurate three-dimensional helical circuit model (ref. 2) made it possible for Lewis researchers to fully investigate standard approximations made in deriving the relationship between measured perturbation data and interaction impedance. The most prominent approximations made
Accurate and efficient seismic data interpolation in the principal frequency wavenumber domain
NASA Astrophysics Data System (ADS)
Wang, Benfeng; Lu, Wenkai
2017-12-01
Seismic data irregularity caused by economic limitations, acquisition environmental constraints or bad trace elimination can decrease the performance of subsequent multi-channel algorithms, such as surface-related multiple elimination (SRME), though some can overcome the irregularity defects. Therefore, accurate interpolation to provide the necessary complete data is a prerequisite, but its wide application is constrained by its large computational burden for huge data volumes, especially in 3D exploration. For accurate and efficient interpolation, the curvelet transform- (CT) based projection onto convex sets (POCS) method in the principal frequency wavenumber (PFK) domain is introduced. The complex-valued PF components can characterize their original signal with a high accuracy, but are at least half the size, which can help provide a reasonable efficiency improvement. The irregularity of the observed data is transformed into incoherent noise in the PFK domain, and curvelet coefficients may be sparser when CT is performed on the PFK domain data, enhancing the interpolation accuracy. The performance of the POCS-based algorithms using complex-valued CT in the time space (TX), principal frequency space, and PFK domains is compared. Numerical examples on synthetic and field data demonstrate the validity and effectiveness of the proposed method. With less computational burden, the proposed method can achieve a better interpolation result, and it can be easily extended into higher dimensions.
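The POCS iteration itself is simple to sketch. The example below is a toy under stated assumptions, not the method of the abstract: it swaps the curvelet transform for a plain discrete Fourier transform, works on a 1D trace in place of PFK-domain data, and uses a linearly decreasing hard threshold; all names and parameter values are hypothetical. Each pass thresholds the spectrum of the current estimate, transforms back, and then reinserts the observed samples unchanged.

```python
import cmath
import math


def dft(x):
    """Naive forward DFT (O(n^2)); adequate for a short illustrative trace."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def pocs_interpolate(data, observed, n_iter=20, start=0.7, end=0.05):
    """Fill missing samples (observed[t] is False) by POCS with hard thresholding."""
    tau_max = max(abs(c) for c in dft(data))
    x = list(data)
    for it in range(n_iter):
        # Threshold decreases linearly from start*tau_max to end*tau_max.
        frac = start + (end - start) * it / max(n_iter - 1, 1)
        tau = frac * tau_max
        kept = [c if abs(c) >= tau else 0.0 for c in dft(x)]
        rec = [v.real for v in idft(kept)]
        # Projection onto the data constraint: observed samples stay untouched.
        x = [data[t] if observed[t] else rec[t] for t in range(len(data))]
    return x


n = 32
signal = [math.cos(2 * math.pi * 3 * t / n) for t in range(n)]
missing = {5, 11, 18, 26}
observed = [t not in missing for t in range(n)]
data = [signal[t] if observed[t] else 0.0 for t in range(n)]
filled = pocs_interpolate(data, observed)
```

The sketch works because missing samples spread low-amplitude leakage across the spectrum while the true signal stays concentrated in a few coefficients; a sparser transform makes that separation cleaner, which is the motivation the abstract gives for working in the PFK domain.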
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tong, Dudu; Yang, Sichun; Lu, Lanyuan
2016-06-20
Structure modelling via small-angle X-ray scattering (SAXS) data generally requires intensive computations of scattering intensity from any given biomolecular structure, where the accurate evaluation of SAXS profiles using coarse-grained (CG) methods is vital to improve computational efficiency. To date, most CG SAXS computing methods have been based on a single-bead-per-residue approximation but have neglected structural correlations between amino acids. To improve the accuracy of scattering calculations, accurate CG form factors of amino acids are now derived using a rigorous optimization strategy, termed electron-density matching (EDM), to best fit electron-density distributions of protein structures. This EDM method is compared with and tested against other CG SAXS computing methods, and the resulting CG SAXS profiles from EDM agree better with all-atom theoretical SAXS data. By including the protein hydration shell represented by explicit CG water molecules and the correction of protein excluded volume, the developed CG form factors also reproduce the selected experimental SAXS profiles with very small deviations. Taken together, these EDM-derived CG form factors present an accurate and efficient computational approach for SAXS computing, especially when higher molecular details (represented by the q range of the SAXS data) become necessary for effective structure modelling.
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications
NASA Technical Reports Server (NTRS)
Sun, Xian-He
1997-01-01
Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is to 1) develop highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporate newly developed algorithms into actual simulation packages. The work plan has been well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, and (2) using a compact scheme to gain high order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm
Accurate, efficient, and (iso)geometrically flexible collocation methods for phase-field models
NASA Astrophysics Data System (ADS)
Gomez, Hector; Reali, Alessandro; Sangalli, Giancarlo
2014-04-01
We propose new collocation methods for phase-field models. Our algorithms are based on isogeometric analysis, a new technology that makes use of functions from computational geometry, such as Non-Uniform Rational B-Splines (NURBS). NURBS exhibit excellent approximability and controllable global smoothness, and can represent exactly most geometries encapsulated in Computer Aided Design (CAD) models. These attributes permitted us to derive accurate, efficient, and geometrically flexible collocation methods for phase-field models. The performance of our method is demonstrated by several numerical examples of phase separation modeled by the Cahn-Hilliard equation. We feel that our method successfully combines the geometrical flexibility of finite elements with the accuracy and simplicity of pseudo-spectral collocation methods, and is a viable alternative to classical collocation methods.
An accurate and efficient laser-envelope solver for the modeling of laser-plasma accelerators
Benedetti, C.; Schroeder, C. B.; Geddes, C. G. R.; ...
2017-10-17
Detailed and reliable numerical modeling of laser-plasma accelerators (LPAs), where a short and intense laser pulse interacts with an underdense plasma over distances of up to a meter, is a formidably challenging task. This is due to the great disparity among the length scales involved in the modeling, ranging from the micron scale of the laser wavelength to the meter scale of the total laser-plasma interaction length. The use of the time-averaged ponderomotive force approximation, where the laser pulse is described by means of its envelope, enables efficient modeling of LPAs by removing the need to model the details of electron motion at the laser wavelength scale. Furthermore, it allows simulations in cylindrical geometry which captures relevant 3D physics at 2D computational cost. A key element of any code based on the time-averaged ponderomotive force approximation is the laser envelope solver. In this paper we present the accurate and efficient envelope solver used in the code INF&RNO (INtegrated Fluid & paRticle simulatioN cOde). The features of the INF&RNO laser solver enable an accurate description of the laser pulse evolution deep into depletion even at a reasonably low resolution, resulting in significant computational speed-ups.
An accurate and efficient laser-envelope solver for the modeling of laser-plasma accelerators
DOE Office of Scientific and Technical Information (OSTI.GOV)
Benedetti, C.; Schroeder, C. B.; Geddes, C. G. R.
Detailed and reliable numerical modeling of laser-plasma accelerators (LPAs), where a short and intense laser pulse interacts with an underdense plasma over distances of up to a meter, is a formidably challenging task. This is due to the great disparity among the length scales involved in the modeling, ranging from the micron scale of the laser wavelength to the meter scale of the total laser-plasma interaction length. The use of the time-averaged ponderomotive force approximation, where the laser pulse is described by means of its envelope, enables efficient modeling of LPAs by removing the need to model the details of electron motion at the laser wavelength scale. Furthermore, it allows simulations in cylindrical geometry which captures relevant 3D physics at 2D computational cost. A key element of any code based on the time-averaged ponderomotive force approximation is the laser envelope solver. In this paper we present the accurate and efficient envelope solver used in the code INF&RNO (INtegrated Fluid & paRticle simulatioN cOde). The features of the INF&RNO laser solver enable an accurate description of the laser pulse evolution deep into depletion even at a reasonably low resolution, resulting in significant computational speed-ups.
An accurate and efficient laser-envelope solver for the modeling of laser-plasma accelerators
NASA Astrophysics Data System (ADS)
Benedetti, C.; Schroeder, C. B.; Geddes, C. G. R.; Esarey, E.; Leemans, W. P.
2018-01-01
Detailed and reliable numerical modeling of laser-plasma accelerators (LPAs), where a short and intense laser pulse interacts with an underdense plasma over distances of up to a meter, is a formidably challenging task. This is due to the great disparity among the length scales involved in the modeling, ranging from the micron scale of the laser wavelength to the meter scale of the total laser-plasma interaction length. The use of the time-averaged ponderomotive force approximation, where the laser pulse is described by means of its envelope, enables efficient modeling of LPAs by removing the need to model the details of electron motion at the laser wavelength scale. Furthermore, it allows simulations in cylindrical geometry which captures relevant 3D physics at 2D computational cost. A key element of any code based on the time-averaged ponderomotive force approximation is the laser envelope solver. In this paper we present the accurate and efficient envelope solver used in the code INF&RNO (INtegrated Fluid & paRticle simulatioN cOde). The features of the INF&RNO laser solver enable an accurate description of the laser pulse evolution deep into depletion even at a reasonably low resolution, resulting in significant computational speed-ups.
A computationally efficient modelling of laminar separation bubbles
NASA Technical Reports Server (NTRS)
Dini, Paolo; Maughmer, Mark D.
1990-01-01
In predicting the aerodynamic characteristics of airfoils operating at low Reynolds numbers, it is often important to account for the effects of laminar (transitional) separation bubbles. Previous approaches to the modelling of this viscous phenomenon range from fast but sometimes unreliable empirical correlations for the length of the bubble and the associated increase in momentum thickness, to more accurate but significantly slower displacement-thickness iteration methods employing inverse boundary-layer formulations in the separated regions. Since the penalty in computational time associated with the more general methods is unacceptable for airfoil design applications, use of an accurate yet computationally efficient model is highly desirable. To this end, a semi-empirical bubble model was developed and incorporated into the Eppler and Somers airfoil design and analysis program. The generality and the efficiency were achieved by successfully approximating the local viscous/inviscid interaction, the transition location, and the turbulent reattachment process within the framework of an integral boundary-layer method. Comparisons of the predicted aerodynamic characteristics with experimental measurements for several airfoils show excellent and consistent agreement for Reynolds numbers from 2,000,000 down to 100,000.
NASA Astrophysics Data System (ADS)
Meng, Zeng; Yang, Dixiong; Zhou, Huanlin; Yu, Bo
2018-05-01
The first order reliability method has been extensively adopted for reliability-based design optimization (RBDO), but it is inaccurate in calculating the failure probability with highly nonlinear performance functions. Thus, the second order reliability method is required to evaluate the reliability accurately. However, its application to RBDO is quite challenging owing to the expensive computational cost incurred by the repeated reliability evaluation and Hessian calculation of probabilistic constraints. In this article, a new improved stability transformation method is proposed to search for the most probable point efficiently, and the Hessian matrix is calculated by the symmetric rank-one update. The computational capability of the proposed method is illustrated and compared to the existing RBDO approaches through three mathematical and two engineering examples. The comparison results indicate that the proposed method is very efficient and accurate, providing an alternative tool for RBDO of engineering structures.
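As a rough illustration of the symmetric rank-one (SR1) update named in the abstract above, the following Python sketch refines a Hessian approximation from a step and the corresponding change in gradient. The function name `sr1_update`, the skip tolerance `tol`, and the list-of-lists matrix layout are assumptions made for this sketch; it does not reproduce the article's implementation or its coupling to the stability transformation method.

```python
def sr1_update(B, s, y, tol=1e-8):
    """Return an SR1-updated copy of the Hessian approximation B.

    B : current symmetric approximation (list of lists, n x n)
    s : step taken, x_new - x_old (length n)
    y : gradient change, grad(x_new) - grad(x_old) (length n)
    """
    n = len(s)
    # Residual of the secant condition: r = y - B s
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    r = [y[i] - Bs[i] for i in range(n)]
    denom = sum(r[i] * s[i] for i in range(n))
    norm_r = sum(v * v for v in r) ** 0.5
    norm_s = sum(v * v for v in s) ** 0.5
    # Standard safeguard: skip the update when the denominator is tiny,
    # since the rank-one correction would otherwise blow up.
    if abs(denom) <= tol * norm_r * norm_s:
        return [row[:] for row in B]
    # B_new = B + r r^T / (r^T s)
    return [[B[i][j] + r[i] * r[j] / denom for j in range(n)] for i in range(n)]
```

For a quadratic function, applying the update along n linearly independent steps recovers the exact Hessian provided no update is skipped, which is one reason the update is attractive when each Hessian evaluation is expensive.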
Computationally efficient multibody simulations
NASA Technical Reports Server (NTRS)
Ramakrishnan, Jayant; Kumar, Manoj
1994-01-01
Computationally efficient approaches to the solution of the dynamics of multibody systems are presented in this work. The computational efficiency is derived from both the algorithmic and implementational standpoint. Order(n) approaches provide a new formulation of the equations of motion eliminating the assembly and numerical inversion of a system mass matrix as required by conventional algorithms. Computational efficiency is also gained in the implementation phase by the symbolic processing and parallel implementation of these equations. Comparison of this algorithm with existing multibody simulation programs illustrates the increased computational efficiency.
Efficient statistically accurate algorithms for the Fokker-Planck equation in large dimensions
NASA Astrophysics Data System (ADS)
Chen, Nan; Majda, Andrew J.
2018-02-01
Solving the Fokker-Planck equation for high-dimensional complex turbulent dynamical systems is an important and practical issue. However, most traditional methods suffer from the curse of dimensionality and have difficulties in capturing the fat tailed highly intermittent probability density functions (PDFs) of complex systems in turbulence, neuroscience and excitable media. In this article, efficient statistically accurate algorithms are developed for solving both the transient and the equilibrium solutions of Fokker-Planck equations associated with high-dimensional nonlinear turbulent dynamical systems with conditional Gaussian structures. The algorithms involve a hybrid strategy that requires only a small number of ensembles. Here, a conditional Gaussian mixture in a high-dimensional subspace via an extremely efficient parametric method is combined with a judicious non-parametric Gaussian kernel density estimation in the remaining low-dimensional subspace. Particularly, the parametric method provides closed analytical formulae for determining the conditional Gaussian distributions in the high-dimensional subspace and is therefore computationally efficient and accurate. The full non-Gaussian PDF of the system is then given by a Gaussian mixture. Different from traditional particle methods, each conditional Gaussian distribution here covers a significant portion of the high-dimensional PDF. Therefore a small number of ensembles is sufficient to recover the full PDF, which overcomes the curse of dimensionality. Notably, the mixture distribution has significant skill in capturing the transient behavior with fat tails of the high-dimensional non-Gaussian PDFs, and this facilitates the algorithms in accurately describing the intermittency and extreme events in complex turbulent systems. It is shown in a stringent set of test problems that the method only requires an order of O(100) ensembles to successfully recover the highly non-Gaussian transient PDFs in up to 6
Efficient Statistically Accurate Algorithms for the Fokker-Planck Equation in Large Dimensions
NASA Astrophysics Data System (ADS)
Chen, N.; Majda, A.
2017-12-01
Solving the Fokker-Planck equation for high-dimensional complex turbulent dynamical systems is an important and practical issue. However, most traditional methods suffer from the curse of dimensionality and have difficulties in capturing the fat tailed highly intermittent probability density functions (PDFs) of complex systems in turbulence, neuroscience and excitable media. In this article, efficient statistically accurate algorithms are developed for solving both the transient and the equilibrium solutions of Fokker-Planck equations associated with high-dimensional nonlinear turbulent dynamical systems with conditional Gaussian structures. The algorithms involve a hybrid strategy that requires only a small number of ensembles. Here, a conditional Gaussian mixture in a high-dimensional subspace via an extremely efficient parametric method is combined with a judicious non-parametric Gaussian kernel density estimation in the remaining low-dimensional subspace. Particularly, the parametric method, which is based on an effective data assimilation framework, provides closed analytical formulae for determining the conditional Gaussian distributions in the high-dimensional subspace. Therefore, it is computationally efficient and accurate. The full non-Gaussian PDF of the system is then given by a Gaussian mixture. Different from the traditional particle methods, each conditional Gaussian distribution here covers a significant portion of the high-dimensional PDF. Therefore a small number of ensembles is sufficient to recover the full PDF, which overcomes the curse of dimensionality. Notably, the mixture distribution has a significant skill in capturing the transient behavior with fat tails of the high-dimensional non-Gaussian PDFs, and this facilitates the algorithms in accurately describing the intermittency and extreme events in complex turbulent systems. It is shown in a stringent set of test problems that the method only requires an order of O(100) ensembles to
NASA Technical Reports Server (NTRS)
Rausch, Russ D.; Batina, John T.; Yang, Henry T. Y.
1991-01-01
Spatial adaption procedures for the accurate and efficient solution of steady and unsteady inviscid flow problems are described. The adaption procedures were developed and implemented within a two-dimensional unstructured-grid upwind-type Euler code. These procedures involve mesh enrichment and mesh coarsening to either add points in high gradient regions of the flow or remove points where they are not needed, respectively, to produce solutions of high spatial accuracy at minimal computational cost. A detailed description is given of the enrichment and coarsening procedures and comparisons with alternative results and experimental data are presented to provide an assessment of the accuracy and efficiency of the capability. Steady and unsteady transonic results, obtained using spatial adaption for the NACA 0012 airfoil, are shown to be of high spatial accuracy, primarily in that the shock waves are very sharply captured. The results were obtained with a computational savings of a factor of approximately fifty-three for a steady case and as much as twenty-five for the unsteady cases.
NASA Technical Reports Server (NTRS)
Rausch, Russ D.; Yang, Henry T. Y.; Batina, John T.
1991-01-01
Spatial adaption procedures for the accurate and efficient solution of steady and unsteady inviscid flow problems are described. The adaption procedures were developed and implemented within a two-dimensional unstructured-grid upwind-type Euler code. These procedures involve mesh enrichment and mesh coarsening to either add points in high gradient regions of the flow or remove points where they are not needed, respectively, to produce solutions of high spatial accuracy at minimal computational cost. The paper gives a detailed description of the enrichment and coarsening procedures and presents comparisons with alternative results and experimental data to provide an assessment of the accuracy and efficiency of the capability. Steady and unsteady transonic results, obtained using spatial adaption for the NACA 0012 airfoil, are shown to be of high spatial accuracy, primarily in that the shock waves are very sharply captured. The results were obtained with a computational savings of a factor of approximately fifty-three for a steady case and as much as twenty-five for the unsteady cases.
NASA Astrophysics Data System (ADS)
Pineda, M.; Stamatakis, M.
2017-07-01
Modeling the kinetics of surface catalyzed reactions is essential for the design of reactors and chemical processes. The majority of microkinetic models employ mean-field approximations, which lead to an approximate description of catalytic kinetics by assuming spatially uncorrelated adsorbates. On the other hand, kinetic Monte Carlo (KMC) methods provide a discrete-space continuous-time stochastic formulation that enables an accurate treatment of spatial correlations in the adlayer, but at a significant computation cost. In this work, we use the so-called cluster mean-field approach to develop higher order approximations that systematically increase the accuracy of kinetic models by treating spatial correlations at a progressively higher level of detail. We further demonstrate our approach on a reduced model for NO oxidation incorporating first nearest-neighbor lateral interactions and construct a sequence of approximations of increasingly higher accuracy, which we compare with KMC and mean-field. The latter is found to perform rather poorly, overestimating the turnover frequency by several orders of magnitude for this system. On the other hand, our approximations, while more computationally intense than the traditional mean-field treatment, still achieve tremendous computational savings compared to KMC simulations, thereby opening the way for employing them in multiscale modeling frameworks.
Efficient computation of the joint sample frequency spectra for multiple populations.
Kamm, John A; Terhorst, Jonathan; Song, Yun S
2017-01-01
A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity.
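To make the summary statistic in the abstract above concrete, here is a minimal Python sketch that tabulates an observed joint SFS from a small matrix of 0/1 alleles (1 marking the mutant allele). It only illustrates what the joint SFS is; it does not compute the expected SFS under a demographic model, which is what the paper's formulas and the momi package address. The function name `observed_joint_sfs`, the input layout, and the population labels are assumptions made for illustration.

```python
from collections import Counter


def observed_joint_sfs(sites, pop_of_sample):
    """Tabulate the observed joint sample frequency spectrum.

    sites         : list of sites, each a list of 0/1 alleles, one per sample
    pop_of_sample : population label for each sample (same order as alleles)

    Returns a Counter mapping a tuple of per-population mutant counts
    (populations in sorted label order) to the number of sites with that
    configuration. Sites that are not polymorphic in the sample are skipped.
    """
    pops = sorted(set(pop_of_sample))
    n_samples = len(pop_of_sample)
    sfs = Counter()
    for alleles in sites:
        total = sum(alleles)
        if total == 0 or total == n_samples:
            continue  # monomorphic in this sample, not part of the SFS
        per_pop = {p: 0 for p in pops}
        for allele, pop in zip(alleles, pop_of_sample):
            per_pop[pop] += allele
        sfs[tuple(per_pop[p] for p in pops)] += 1
    return sfs
```

With two samples from each of two populations, for instance, a key of `(2, 0)` counts the sites where both samples of the first population carry the mutant allele and neither sample of the second does.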
Efficient computation of the joint sample frequency spectra for multiple populations
Kamm, John A.; Terhorst, Jonathan; Song, Yun S.
2016-01-01
A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity. PMID:28239248
Multiscale Methods for Accurate, Efficient, and Scale-Aware Models of the Earth System
DOE Office of Scientific and Technical Information (OSTI.GOV)
Goldhaber, Steve; Holland, Marika
The major goal of this project was to contribute improvements to the infrastructure of an Earth System Model in order to support research in the Multiscale Methods for Accurate, Efficient, and Scale-Aware models of the Earth System project. In support of this, the NCAR team accomplished two main tasks: improving input/output performance of the model and improving atmospheric model simulation quality. Improvement of the performance and scalability of data input and diagnostic output within the model required a new infrastructure which can efficiently handle the unstructured grids common in multiscale simulations. This allows for a more computationally efficient model, enabling more years of Earth System simulation. The quality of the model simulations was improved by reducing grid-point noise in the spectral element version of the Community Atmosphere Model (CAM-SE). This was achieved by running the physics of the model using grid-cell data on a finite-volume grid.
Data Mining for Efficient and Accurate Large Scale Retrieval of Geophysical Parameters
NASA Astrophysics Data System (ADS)
Obradovic, Z.; Vucetic, S.; Peng, K.; Han, B.
2004-12-01
Our effort is devoted to developing data mining technology for improving the efficiency and accuracy of geophysical parameter retrievals by learning a mapping from observation attributes to the corresponding parameters within the framework of classification and regression. We will describe a method for efficient learning of neural network-based classification and regression models from high-volume data streams. The proposed procedure automatically learns a series of neural networks of different complexities on smaller data stream chunks and then properly combines them into an ensemble predictor through averaging. Based on the idea of progressive sampling, the proposed approach starts with a very simple network trained on a very small chunk and then gradually increases the model complexity and the chunk size until the learning performance no longer improves. Our empirical study on aerosol retrievals from data obtained with the MISR instrument mounted on the Terra satellite suggests that the proposed method is successful in learning complex concepts from large data streams with near-optimal computational effort. We will also report on a method that complements deterministic retrievals by constructing accurate predictive algorithms and applying them on appropriately selected subsets of observed data. The method is based on developing more accurate predictors aimed at capturing global and local properties synthesized in a region. The procedure starts by learning the global properties of data sampled over the entire space, and continues by constructing specialized models on selected localized regions. The global and local models are integrated through an automated procedure that determines the optimal trade-off between the two components with the objective of minimizing the overall mean square errors over a specific region. Our experimental results on MISR data showed that the combined model can increase the retrieval accuracy significantly. The preliminary results on various
NASA Astrophysics Data System (ADS)
Bishay, Peter L.
This study presents a new family of highly accurate and efficient computational methods for modeling the multi-physics of multifunctional materials and composites at the micro-scale, named "Multi-Physics Computational Grains" (MPCGs). Each "mathematical grain" has a random polygonal/polyhedral geometrical shape that resembles the natural shapes of the material grains at the micro-scale, where each grain is surrounded by an arbitrary number of neighboring grains. The physics that are incorporated in this study include: Linear Elasticity, Electrostatics, Magnetostatics, Piezoelectricity, Piezomagnetism and Ferroelectricity. However, the methods proposed here can be extended to include more physics (thermo-elasticity, pyroelectricity, electric conduction, heat conduction, etc.) in their formulation, different analysis types (dynamics, fracture, fatigue, etc.), nonlinearities, different defect shapes, and some of the 2D methods can also be extended to 3D formulation. We present "Multi-Region Trefftz Collocation Grains" (MTCGs) as a simple and efficient method for direct and inverse problems, "Trefftz-Lekhnitskii Computational Grains" (TLCGs) for modeling porous and composite smart materials, "Hybrid Displacement Computational Grains" (HDCGs) as a general method for modeling multifunctional materials and composites, and finally "Radial-Basis-Functions Computational Grains" (RBFCGs) for modeling functionally-graded materials, magneto-electro-elastic (MEE) materials and the switching phenomena in ferroelectric materials. The first three proposed methods are suitable for direct numerical simulation (DNS) of the micromechanics of smart composite/porous materials with non-symmetrical arrangement of voids/inclusions, and provide minimal effort in meshing and minimal time in computations, since each grain can represent the matrix of a composite and can include a pore or an inclusion. The last three methods provide stiffness matrix in their formulation and hence can be readily
NASA Astrophysics Data System (ADS)
Zhang, Shunli; Zhang, Dinghua; Gong, Hao; Ghasemalizadeh, Omid; Wang, Ge; Cao, Guohua
2014-11-01
Iterative algorithms, such as the algebraic reconstruction technique (ART), are popular for image reconstruction. For iterative reconstruction, the area integral model (AIM) is more accurate for better reconstruction quality than the line integral model (LIM). However, the computation of the system matrix for AIM is more complex and time-consuming than that for LIM. Here, we propose a fast and accurate method to compute the system matrix for AIM. First, we calculate the intersection of each boundary line of a narrow fan-beam with pixels in a recursive and efficient manner. Then, by grouping the beam-pixel intersection area into six types according to the slopes of the two boundary lines, we analytically compute the intersection area of the narrow fan-beam with the pixels in a simple algebraic fashion. Overall, experimental results show that our method is about three times faster than the Siddon algorithm and about two times faster than the distance-driven model (DDM) in computation of the system matrix. The reconstruction speed of our AIM-based ART is also faster than the LIM-based ART that uses the Siddon algorithm and DDM-based ART, for one iteration. The fast reconstruction speed of our method was accomplished without compromising the image quality.
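The algebraic reconstruction technique (ART) mentioned above can be sketched as a row-by-row projection (Kaczmarz) iteration on the linear system A x = b, where A is the system matrix and b holds the measured projections. The Python sketch below assumes a tiny dense system matrix with 0/1 entries purely for illustration; in the area integral model described in the abstract, each entry would instead be a beam-pixel intersection area, and the paper's fast computation of those areas is not reproduced here. The names `art_reconstruct`, `n_sweeps`, and `relax` are hypothetical.

```python
def art_reconstruct(A, b, n_sweeps=50, relax=1.0):
    """Reconstruct pixel values x from projections b using ART (Kaczmarz).

    A        : system matrix as a list of rows; A[i][j] weights pixel j in ray i
    b        : measured projection value for each ray
    n_sweeps : number of full passes over all rays
    relax    : relaxation factor, typically in (0, 2)
    """
    n_pixels = len(A[0])
    x = [0.0] * n_pixels
    for _ in range(n_sweeps):
        for row, b_i in zip(A, b):
            norm2 = sum(a * a for a in row)
            if norm2 == 0.0:
                continue  # ray misses every pixel, nothing to correct
            # Mismatch between the measurement and the current forward projection
            residual = b_i - sum(a * xj for a, xj in zip(row, x))
            step = relax * residual / norm2
            # Distribute the correction back along the ray
            x = [xj + step * a for xj, a in zip(x, row)]
    return x
```

Because every sweep touches each row of the system matrix, the cost of generating those rows is a large share of the reconstruction time, which is why a faster system-matrix computation speeds up each iteration.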
NASA Technical Reports Server (NTRS)
Daigle, Matthew John; Goebel, Kai Frank
2010-01-01
Model-based prognostics captures system knowledge in the form of physics-based models of components, and how they fail, in order to obtain accurate predictions of end of life (EOL). EOL is predicted based on the estimated current state distribution of a component and expected profiles of future usage. In general, this requires simulations of the component using the underlying models. In this paper, we develop a simulation-based prediction methodology that achieves computational efficiency by performing only the minimal number of simulations needed in order to accurately approximate the mean and variance of the complete EOL distribution. This is performed through the use of the unscented transform, which predicts the means and covariances of a distribution passed through a nonlinear transformation. In this case, the EOL simulation acts as that nonlinear transformation. In this paper, we review the unscented transform, and describe how this concept is applied to efficient EOL prediction. As a case study, we develop a physics-based model of a solenoid valve, and perform simulation experiments to demonstrate improved computational efficiency without sacrificing prediction accuracy.
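The unscented transform described in the abstract above can be illustrated for a single uncertain scalar input. The Python sketch below propagates a mean and variance through an arbitrary function using three sigma points; the function name `unscented_transform_1d`, the tuning parameter `kappa`, and the idea of passing an end-of-life simulation as `f` are assumptions made for illustration rather than the authors' implementation, which handles multivariate state distributions.

```python
import math


def unscented_transform_1d(mean, var, f, kappa=2.0):
    """Approximate the mean and variance of f(x) for scalar x ~ (mean, var).

    Uses the symmetric sigma-point set for dimension n = 1:
    the mean itself plus two points at +/- sqrt((n + kappa) * var).
    """
    n = 1
    spread = math.sqrt((n + kappa) * var)
    sigma_points = [mean, mean + spread, mean - spread]
    w_center = kappa / (n + kappa)
    w_outer = 1.0 / (2.0 * (n + kappa))
    weights = [w_center, w_outer, w_outer]
    # Each evaluation of f stands in for one (possibly expensive) simulation
    outputs = [f(p) for p in sigma_points]
    out_mean = sum(w * y for w, y in zip(weights, outputs))
    out_var = sum(w * (y - out_mean) ** 2 for w, y in zip(weights, outputs))
    return out_mean, out_var
```

Here only three evaluations of `f` are needed, whereas a Monte Carlo estimate of the same two moments would typically require many more simulation runs.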
Efficient Computation of Difference Vibrational Spectra in Isothermal-Isobaric Ensemble.
Joutsuka, Tatsuya; Morita, Akihiro
2016-11-03
Difference spectroscopy between two close systems is widely used to enhance selectivity to the parts of the observed system that differ, though the molecular dynamics calculation of tiny difference spectra by subtraction of two spectra would be computationally extraordinarily demanding. Therefore, we have proposed an efficient computational algorithm for difference spectra that does not resort to subtraction. The present paper reports our extension of the theoretical method to the isothermal-isobaric (NPT) ensemble. The present theory expands our applications of analysis to include the pressure dependence of the spectra. We verified that the present theory yields accurate difference spectra in the NPT condition as well, with computational efficiency exceeding that of straightforward subtraction by several orders of magnitude. This method is further applied to vibrational spectra of liquid water with varying pressure and succeeds in reproducing tiny difference spectra caused by pressure change. The anomalous pressure dependence is elucidated in relation to other properties of liquid water.
A-VCI: A flexible method to efficiently compute vibrational spectra
NASA Astrophysics Data System (ADS)
Odunlami, Marc; Le Bris, Vincent; Bégué, Didier; Baraille, Isabelle; Coulaud, Olivier
2017-06-01
The adaptive vibrational configuration interaction algorithm has been introduced as a new method to efficiently reduce the dimension of the set of basis functions used in a vibrational configuration interaction process. It is based on the construction of nested bases for the discretization of the Hamiltonian operator according to a theoretical criterion that ensures the convergence of the method. In the present work, the Hamiltonian is written as a sum of products of operators. The purpose of this paper is to study the properties and outline the performance details of the main steps of the algorithm. New parameters have been incorporated to increase flexibility, and their influence has been thoroughly investigated. The robustness and reliability of the method are demonstrated for the computation of the vibrational spectrum up to 3000 cm-1 of a widely studied 6-atom molecule (acetonitrile). Our results are compared to the most accurate up to date computation; we also give a new reference calculation for future work on this system. The algorithm has also been applied to a more challenging 7-atom molecule (ethylene oxide). The computed spectrum up to 3200 cm-1 is the most accurate computation that exists today on such systems.
A-VCI: A flexible method to efficiently compute vibrational spectra.
Odunlami, Marc; Le Bris, Vincent; Bégué, Didier; Baraille, Isabelle; Coulaud, Olivier
2017-06-07
The adaptive vibrational configuration interaction algorithm has been introduced as a new method to efficiently reduce the dimension of the set of basis functions used in a vibrational configuration interaction process. It is based on the construction of nested bases for the discretization of the Hamiltonian operator according to a theoretical criterion that ensures the convergence of the method. In the present work, the Hamiltonian is written as a sum of products of operators. The purpose of this paper is to study the properties and outline the performance details of the main steps of the algorithm. New parameters have been incorporated to increase flexibility, and their influence has been thoroughly investigated. The robustness and reliability of the method are demonstrated for the computation of the vibrational spectrum up to 3000 cm-1 of a widely studied 6-atom molecule (acetonitrile). Our results are compared to the most accurate up to date computation; we also give a new reference calculation for future work on this system. The algorithm has also been applied to a more challenging 7-atom molecule (ethylene oxide). The computed spectrum up to 3200 cm-1 is the most accurate computation that exists today on such systems.
NASA Astrophysics Data System (ADS)
Sizov, Gennadi Y.
In this dissertation, a model-based multi-objective optimal design of permanent magnet ac machines, supplied by sine-wave current regulated drives, is developed and implemented. The design procedure uses an efficient electromagnetic finite element-based solver to accurately model nonlinear material properties and complex geometric shapes associated with magnetic circuit design. Application of an electromagnetic finite element-based solver allows for accurate computation of intricate performance parameters and characteristics. The first contribution of this dissertation is the development of a rapid computational method that allows accurate and efficient exploration of large multi-dimensional design spaces in search of optimum design(s). The computationally efficient finite element-based approach developed in this work provides a framework of tools that allow rapid analysis of synchronous electric machines operating under steady-state conditions. In the developed modeling approach, major steady-state performance parameters, such as winding flux linkages and voltages; average, cogging and ripple torques; stator core flux densities; core losses; efficiencies; and saturated machine winding inductances, are calculated with minimum computational effort. In addition, the method includes means for rapid estimation of distributed stator forces and three-dimensional effects of stator and/or rotor skew on the performance of the machine. The second contribution of this dissertation is the development of the design synthesis and optimization method based on a differential evolution algorithm. The approach relies on the developed finite element-based modeling method for electromagnetic analysis and is able to tackle large-scale multi-objective design problems using modest computational resources. Overall, computational time savings of up to two orders of magnitude are achievable, when compared to current and prevalent state-of-the-art methods. These computational savings allow
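For readers unfamiliar with the differential evolution algorithm named in the abstract above, the following is a minimal single-objective sketch (the classic rand/1/bin variant). It is not the dissertation's multi-objective method and involves no finite element solver; the function name, parameter names, default values, and the toy objective are all assumptions made for illustration.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, generations=200,
                           weight=0.7, crossover=0.9, seed=0):
    """Minimise `objective` over box `bounds` with DE/rand/1/bin (illustrative)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct members, all different from the target i.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < crossover:
                    v = pop[a][d] + weight * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    v = min(max(v, lo), hi)  # clip back into the box
                else:
                    v = pop[i][d]
                trial.append(v)
            trial_cost = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


def sphere(x):
    """Toy objective standing in for an expensive machine-design evaluation."""
    return sum(v * v for v in x)
```

In a design setting, each call to the objective would be one electromagnetic analysis, which is why the cost of a single evaluation dominates the total run time of a population-based search like this one.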
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nakhleh, Luay
I proposed to develop computationally efficient tools for accurate detection and reconstruction of microbes' complex evolutionary mechanisms, thus enabling rapid and accurate annotation, analysis and understanding of their genomes. To achieve this goal, I proposed to address three aspects. (1) Mathematical modeling. A major challenge facing the accurate detection of HGT is that of distinguishing between these two events on the one hand and other events that have similar "effects." I proposed to develop a novel mathematical approach for distinguishing among these events. Further, I proposed to develop a set of novel optimization criteria for the evolutionary analysis of microbial genomes in the presence of these complex evolutionary events. (2) Algorithm design. In this aspect of the project, I proposed to develop an array of efficient and accurate algorithms for analyzing microbial genomes based on the formulated optimization criteria. Further, I proposed to test the viability of the criteria and the accuracy of the algorithms in an experimental setting using both synthetic as well as biological data. (3) Software development. I proposed the final outcome to be a suite of software tools which implements the mathematical models as well as the algorithms developed.
NASA Astrophysics Data System (ADS)
Hong, Pengyu; Sun, Hui; Sha, Long; Pu, Yi; Khatri, Kshitij; Yu, Xiang; Tang, Yang; Lin, Cheng
2017-08-01
A major challenge in glycomics is the characterization of complex glycan structures that are essential for understanding their diverse roles in many biological processes. We present a novel efficient computational approach, named GlycoDeNovo, for accurate elucidation of the glycan topologies from their tandem mass spectra. Given a spectrum, GlycoDeNovo first builds an interpretation-graph specifying how to interpret each peak using preceding interpreted peaks. It then reconstructs the topologies of peaks that contribute to interpreting the precursor ion. We theoretically prove that GlycoDeNovo is highly efficient. A major innovative feature added to GlycoDeNovo is a data-driven IonClassifier which can be used to effectively rank candidate topologies. IonClassifier is automatically learned from experimental spectra of known glycans to distinguish B- and C-type ions from all other ion types. Our results showed that GlycoDeNovo is robust and accurate for topology reconstruction of glycans from their tandem mass spectra.
NASA Astrophysics Data System (ADS)
Ohwada, Taku; Shibata, Yuki; Kato, Takuma; Nakamura, Taichi
2018-06-01
Developed is a high-order accurate shock-capturing scheme for the compressible Euler/Navier-Stokes equations; the formal accuracy is 5th order in space and 4th order in time. The performance and efficiency of the scheme are validated in various numerical tests. The main ingredients of the scheme are nothing special; they are variants of the standard numerical flux, MUSCL, the usual Lagrange's polynomial and the conventional Runge-Kutta method. The scheme can compute a boundary layer accurately with a rational resolution and capture a stationary contact discontinuity sharply without inner points. And yet it is endowed with high resistance against shock anomalies (carbuncle phenomenon, post-shock oscillations, etc.). A good balance between high robustness and low dissipation is achieved by blending three types of numerical fluxes according to physical situation in an intuitively easy-to-understand way. The performance of the scheme is largely comparable to that of WENO5-Rusanov, while its computational cost is 30-40% less than that of the advanced scheme.
NASA Astrophysics Data System (ADS)
Lee, Y. C.; Thompson, H. M.; Gaskell, P. H.
2009-12-01
, industrial and physical applications. However, despite recent modelling advances, the accurate numerical solution of the equations governing such problems is still at a relatively early stage. Indeed, recent studies employing a simplifying long-wave approximation have shown that highly efficient numerical methods are necessary to solve the resulting lubrication equations in order to achieve the level of grid resolution required to accurately capture the effects of micro- and nano-scale topographical features. Solution method: A portable parallel multigrid algorithm has been developed for the above purpose, for the particular case of flow over submerged topographical features. Within the multigrid framework adopted, a W-cycle is used to accelerate convergence in respect of the time dependent nature of the problem, with relaxation sweeps performed using a fixed number of pre- and post-Red-Black Gauss-Seidel Newton iterations. In addition, the algorithm incorporates automatic adaptive time-stepping to avoid the computational expense associated with repeated time-step failure. Running time: 1.31 minutes using 128 processors on BlueGene/P with a problem size of over 16.7 million mesh points.
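The red-black Gauss-Seidel relaxation mentioned in the abstract above can be illustrated on a much simpler model problem. The sketch below applies it to a linear Poisson equation on a square grid with fixed boundary values; it is an assumption-laden stand-in, not the paper's nonlinear lubrication solver, and it omits the multigrid W-cycle, the Newton linearisation, and the adaptive time-stepping. All function names and defaults are hypothetical.

```python
def red_black_sweep(u, f, h):
    """One red-black Gauss-Seidel sweep for -laplace(u) = f (5-point stencil).

    Boundary entries of u are left untouched (Dirichlet conditions).
    """
    n = len(u)
    for colour in (0, 1):  # 0: red points (i + j even), 1: black points
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                if (i + j) % 2 == colour:
                    # Neighbours of a red point are all black and vice versa,
                    # so every point of one colour can be updated independently.
                    u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                      + u[i][j - 1] + u[i][j + 1]
                                      + h * h * f[i][j])
    return u


def solve(n=9, sweeps=200):
    """Laplace problem on the unit square: one edge held at 1, the rest at 0."""
    h = 1.0 / (n - 1)
    u = [[0.0] * n for _ in range(n)]
    for j in range(n):
        u[0][j] = 1.0
    f = [[0.0] * n for _ in range(n)]
    for _ in range(sweeps):
        red_black_sweep(u, f, h)
    return u
```

The two-colour ordering is what makes the smoother attractive for parallel codes: all points of one colour depend only on points of the other colour, so each half-sweep can be distributed across processors without ordering constraints.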
Efficient computation of hashes
NASA Astrophysics Data System (ADS)
Lopes, Raul H. C.; Franqueira, Virginia N. L.; Hobson, Peter R.
2014-06-01
The sequential computation of hashes at the core of many distributed storage systems and found, for example, in grid services can hinder efficiency in service quality and even pose security challenges that can only be addressed by the use of parallel hash tree modes. The main contributions of this paper are, first, the identification of several efficiency and security challenges posed by the use of sequential hash computation based on the Merkle-Damgard engine. In addition, alternatives for the parallel computation of hash trees are discussed, and a prototype for a new parallel implementation of the Keccak function, the SHA-3 winner, is introduced.
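To make the contrast with sequential hashing concrete, here is a minimal sketch of a binary hash tree built on SHA3-256 from Python's standard `hashlib`. It is not the paper's prototype or any standardised tree mode: the chunk size, the one-byte leaf/node prefixes, the rule for promoting an unpaired node, and the function names are assumptions chosen for illustration.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def leaf_hash(chunk: bytes) -> bytes:
    # A 0x00 prefix marks leaves so they cannot be confused with inner nodes.
    return hashlib.sha3_256(b"\x00" + chunk).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # A 0x01 prefix marks inner nodes.
    return hashlib.sha3_256(b"\x01" + left + right).digest()


def merkle_root(data: bytes, chunk_size: int = 1024, workers: int = 4) -> bytes:
    """Return the root of a binary hash tree over fixed-size chunks of `data`."""
    chunks = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)] or [b""]
    # Leaf hashes are mutually independent, so they can be computed
    # concurrently; the thread pool here only illustrates that structure.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        level = list(pool.map(leaf_hash, chunks))
    while len(level) > 1:
        nxt = [node_hash(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # promote the unpaired node unchanged
        level = nxt
    return level[0]
```

In a sequential construction each block must wait for the chaining value of the previous block, whereas here only the short path from the leaves to the root is inherently serial.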
Computer-based personality judgments are more accurate than those made by humans.
Youyou, Wu; Kosinski, Michal; Stillwell, David
2015-01-27
Judging others' personalities is an essential skill in successful social living, as personality is a key driver behind people's interactions, behaviors, and emotions. Although accurate personality judgments stem from social-cognitive skills, developments in machine learning show that computer models can also make valid judgments. This study compares the accuracy of human and computer-based personality judgments, using a sample of 86,220 volunteers who completed a 100-item personality questionnaire. We show that (i) computer predictions based on a generic digital footprint (Facebook Likes) are more accurate (r = 0.56) than those made by the participants' Facebook friends using a personality questionnaire (r = 0.49); (ii) computer models show higher interjudge agreement; and (iii) computer personality judgments have higher external validity when predicting life outcomes such as substance use, political attitudes, and physical health; for some outcomes, they even outperform the self-rated personality scores. Computers outpacing humans in personality judgment presents significant opportunities and challenges in the areas of psychological assessment, marketing, and privacy.
Improving the Efficiency of Abdominal Aortic Aneurysm Wall Stress Computations
Zelaya, Jaime E.; Goenezen, Sevan; Dargon, Phong T.; Azarbal, Amir-Farzin; Rugonyi, Sandra
2014-01-01
An abdominal aortic aneurysm is a pathological dilation of the abdominal aorta, which carries a high mortality rate if ruptured. The most commonly used surrogate marker of rupture risk is the maximal transverse diameter of the aneurysm. More recent studies suggest that wall stress from models of patient-specific aneurysm geometries extracted, for instance, from computed tomography images may be a more accurate predictor of rupture risk and an important factor in AAA size progression. However, quantification of wall stress is typically computationally intensive and time-consuming, mainly due to the nonlinear mechanical behavior of the abdominal aortic aneurysm walls. These difficulties have limited the potential of computational models in clinical practice. To facilitate computation of wall stresses, we propose to use a linear approach that ensures equilibrium of wall stresses in the aneurysms. This proposed linear model approach is easy to implement and eliminates the burden of nonlinear computations. To assess the accuracy of our proposed approach to compute wall stresses, results from idealized and patient-specific model simulations were compared to those obtained using conventional approaches and to those of a hypothetical, reference abdominal aortic aneurysm model. For the reference model, wall mechanical properties and the initial unloaded and unstressed configuration were assumed to be known, and the resulting wall stresses were used as reference for comparison. Our proposed linear approach accurately approximates wall stresses for varying model geometries and wall material properties. Our findings suggest that the proposed linear approach could be used as an effective, efficient, easy-to-use clinical tool to estimate patient-specific wall stresses. PMID:25007052
An accurate computational method for the diffusion regime verification
NASA Astrophysics Data System (ADS)
Zhokh, Alexey A.; Strizhak, Peter E.
2018-04-01
The diffusion regime (sub-diffusive, standard, or super-diffusive) is defined by the order of the derivative in the corresponding transport equation. We develop an accurate computational method for the direct estimation of the diffusion regime. The method is based on the derivative order estimation using the asymptotic analytic solutions of the diffusion equation with the integer order and the time-fractional derivatives. The robustness and the computational cheapness of the proposed method are verified using the experimental methane and methyl alcohol transport kinetics through the catalyst pellet.
Computer-based personality judgments are more accurate than those made by humans
Youyou, Wu; Kosinski, Michal; Stillwell, David
2015-01-01
Judging others’ personalities is an essential skill in successful social living, as personality is a key driver behind people’s interactions, behaviors, and emotions. Although accurate personality judgments stem from social-cognitive skills, developments in machine learning show that computer models can also make valid judgments. This study compares the accuracy of human and computer-based personality judgments, using a sample of 86,220 volunteers who completed a 100-item personality questionnaire. We show that (i) computer predictions based on a generic digital footprint (Facebook Likes) are more accurate (r = 0.56) than those made by the participants’ Facebook friends using a personality questionnaire (r = 0.49); (ii) computer models show higher interjudge agreement; and (iii) computer personality judgments have higher external validity when predicting life outcomes such as substance use, political attitudes, and physical health; for some outcomes, they even outperform the self-rated personality scores. Computers outpacing humans in personality judgment presents significant opportunities and challenges in the areas of psychological assessment, marketing, and privacy. PMID:25583507
Fast and accurate computation of projected two-point functions
NASA Astrophysics Data System (ADS)
Grasshorn Gebhardt, Henry S.; Jeong, Donghui
2018-01-01
We present the two-point function from the fast and accurate spherical Bessel transformation (2-FAST) algorithm
NASA Astrophysics Data System (ADS)
Cary, John R.; Abell, D.; Amundson, J.; Bruhwiler, D. L.; Busby, R.; Carlsson, J. A.; Dimitrov, D. A.; Kashdan, E.; Messmer, P.; Nieter, C.; Smithe, D. N.; Spentzouris, P.; Stoltz, P.; Trines, R. M.; Wang, H.; Werner, G. R.
2006-09-01
As the size and cost of particle accelerators escalate, high-performance computing plays an increasingly important role; optimization through accurate, detailed computer modeling increases performance and reduces costs. But consequently, computer simulations face enormous challenges. Early approximation methods, such as expansions in distance from the design orbit, were unable to supply detailed accurate results, such as in the computation of wake fields in complex cavities. Since the advent of message-passing supercomputers with thousands of processors, earlier approximations are no longer necessary, and it is now possible to compute wake fields, the effects of dampers, and self-consistent dynamics in cavities accurately. In this environment, the focus has shifted towards the development and implementation of algorithms that scale to large numbers of processors. So-called charge-conserving algorithms evolve the electromagnetic fields without the need for any global solves (which are difficult to scale up to many processors). Using cut-cell (or embedded) boundaries, these algorithms can simulate the fields in complex accelerator cavities with curved walls. New implicit algorithms, which are stable for any time-step, conserve charge as well, allowing faster simulation of structures with details small compared to the characteristic wavelength. These algorithmic and computational advances have been implemented in the VORPAL7 Framework, a flexible, object-oriented, massively parallel computational application that allows run-time assembly of algorithms and objects, thus composing an application on the fly.
Development of highly accurate approximate scheme for computing the charge transfer integral
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pershin, Anton; Szalay, Péter G.
The charge transfer integral is a key parameter required by various theoretical models to describe charge transport properties, e.g., in organic semiconductors. The accuracy of this important property depends on several factors, which include the level of electronic structure theory and internal simplifications of the applied formalism. The goal of this paper is to identify the performance of various approximate approaches of the latter category, while using the high level equation-of-motion coupled cluster theory for the electronic structure. The calculations have been performed on the ethylene dimer as one of the simplest model systems. By studying different spatial perturbations, it was shown that while both energy split in dimer and fragment charge difference methods are equivalent with the exact formulation for symmetrical displacements, they are less efficient when describing transfer integral along the asymmetric alteration coordinate. Since the “exact” scheme was found computationally expensive, we examine the possibility to obtain the asymmetric fluctuation of the transfer integral by a Taylor expansion along the coordinate space. By exploring the efficiency of this novel approach, we show that the Taylor expansion scheme represents an attractive alternative to the “exact” calculations due to a substantial reduction of computational costs, when a considerably large region of the potential energy surface is of interest. Moreover, we show that the Taylor expansion scheme, irrespective of the dimer symmetry, is very accurate for the entire range of geometry fluctuations that cover the space the molecule accesses at room temperature.
NASA Astrophysics Data System (ADS)
Pan, Bing; Wang, Bo
2017-10-01
Digital volume correlation (DVC) is a powerful technique for quantifying interior deformation within solid opaque materials and biological tissues. In the last two decades, great efforts have been made to improve the accuracy and efficiency of the DVC algorithm. However, there is still a lack of a flexible, robust and accurate version that can be efficiently implemented in personal computers with limited RAM. This paper proposes an advanced DVC method that can realize accurate full-field internal deformation measurement applicable to high-resolution volume images with up to billions of voxels. Specifically, a novel layer-wise reliability-guided displacement tracking strategy combined with dynamic data management is presented to guide the DVC computation from slice to slice. The displacements at specified calculation points in each layer are computed using the advanced 3D inverse-compositional Gauss-Newton algorithm with the complete initial guess of the deformation vector accurately predicted from the computed calculation points. Since only limited slices of interest in the reference and deformed volume images rather than the whole volume images are required, the DVC calculation can thus be efficiently implemented on personal computers. The flexibility, accuracy and efficiency of the presented DVC approach are demonstrated by analyzing computer-simulated and experimentally obtained high-resolution volume images.
Xu, Jason; Minin, Vladimir N
2015-07-01
Branching processes are a class of continuous-time Markov chains (CTMCs) with ubiquitous applications. A general difficulty in statistical inference under partially observed CTMC models arises in computing transition probabilities when the discrete state space is large or uncountable. Classical methods such as matrix exponentiation are infeasible for large or countably infinite state spaces, and sampling-based alternatives are computationally intensive, requiring integration over all possible hidden events. Recent work has successfully applied generating function techniques to computing transition probabilities for linear multi-type branching processes. While these techniques often require significantly fewer computations than matrix exponentiation, they also become prohibitive in applications with large populations. We propose a compressed sensing framework that significantly accelerates the generating function method, decreasing computational cost up to a logarithmic factor by only assuming the probability mass of transitions is sparse. We demonstrate accurate and efficient transition probability computations in branching process models for blood cell formation and evolution of self-replicating transposable elements in bacterial genomes.
Xu, Jason; Minin, Vladimir N.
2016-01-01
Branching processes are a class of continuous-time Markov chains (CTMCs) with ubiquitous applications. A general difficulty in statistical inference under partially observed CTMC models arises in computing transition probabilities when the discrete state space is large or uncountable. Classical methods such as matrix exponentiation are infeasible for large or countably infinite state spaces, and sampling-based alternatives are computationally intensive, requiring integration over all possible hidden events. Recent work has successfully applied generating function techniques to computing transition probabilities for linear multi-type branching processes. While these techniques often require significantly fewer computations than matrix exponentiation, they also become prohibitive in applications with large populations. We propose a compressed sensing framework that significantly accelerates the generating function method, decreasing computational cost up to a logarithmic factor by only assuming the probability mass of transitions is sparse. We demonstrate accurate and efficient transition probability computations in branching process models for blood cell formation and evolution of self-replicating transposable elements in bacterial genomes. PMID:26949377
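The generating-function technique that the abstract above builds on can be sketched for the simplest branching process, a pure-birth (Yule) process started from one individual, whose probability generating function is known in closed form. The code evaluates that function on the unit circle and inverts it with a plain discrete Fourier sum. It does not implement the compressed sensing acceleration proposed in the paper, and the function names and the choice of example are assumptions made for illustration.

```python
import cmath
import math


def yule_pgf(s: complex, rate: float, t: float) -> complex:
    """PGF of a Yule process at time t, started from a single individual."""
    p = math.exp(-rate * t)
    return p * s / (1.0 - (1.0 - p) * s)


def probabilities_from_pgf(pgf, n_terms: int):
    """Recover P(X = n), n = 0..n_terms-1, by Fourier inversion of the PGF.

    The PGF is sampled at the n_terms-th roots of unity; the result is exact
    up to aliasing from probability mass at counts >= n_terms.
    """
    values = [pgf(cmath.exp(2j * math.pi * k / n_terms))
              for k in range(n_terms)]
    probs = []
    for n in range(n_terms):
        total = sum(v * cmath.exp(-2j * math.pi * k * n / n_terms)
                    for k, v in enumerate(values))
        probs.append((total / n_terms).real)
    return probs
```

The inversion needs one generating-function evaluation per recovered probability, which is the cost that grows with population size and that a sparsity assumption on the transition probabilities is meant to reduce.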
Accurate and efficient calculation of response times for groundwater flow
NASA Astrophysics Data System (ADS)
Carr, Elliot J.; Simpson, Matthew J.
2018-03-01
We study measures of the amount of time required for transient flow in heterogeneous porous media to effectively reach steady state, also known as the response time. Here, we develop a new approach that extends the concept of mean action time. Previous applications of the theory of mean action time to estimate the response time use the first two central moments of the probability density function associated with the transition from the initial condition, at t = 0, to the steady state condition that arises in the long time limit, as t → ∞. This previous approach leads to a computationally convenient estimation of the response time, but the accuracy can be poor. Here, we outline a powerful extension using the first k raw moments, showing how to produce an extremely accurate estimate by making use of asymptotic properties of the cumulative distribution function. Results are validated using an existing laboratory-scale data set describing flow in a homogeneous porous medium. In addition, we demonstrate how the results also apply to flow in heterogeneous porous media. Overall, the new method is: (i) extremely accurate; and (ii) computationally inexpensive. In fact, the computational cost of the new method is orders of magnitude less than the computational effort required to study the response time by solving the transient flow equation. Furthermore, the approach provides a rigorous mathematical connection with the heuristic argument that the response time for flow in a homogeneous porous medium is proportional to L²/D, where L is a relevant length scale, and D is the aquifer diffusivity. Here, we extend such heuristic arguments by providing a clear mathematical definition of the proportionality constant.
Experimental Realization of High-Efficiency Counterfactual Computation.
Kong, Fei; Ju, Chenyong; Huang, Pu; Wang, Pengfei; Kong, Xi; Shi, Fazhan; Jiang, Liang; Du, Jiangfeng
2015-08-21
Counterfactual computation (CFC) exemplifies the fascinating quantum process by which the result of a computation may be learned without actually running the computer. In previous experimental studies, the counterfactual efficiency was limited to below 50%. Here we report an experimental realization of the generalized CFC protocol, in which the counterfactual efficiency can break the 50% limit and even approach unity in principle. The experiment is performed with the spins of a negatively charged nitrogen-vacancy color center in diamond. Taking advantage of the quantum Zeno effect, the computer can remain in the not-running subspace due to the frequent projection by the environment, while the computation result can be revealed by final detection. A counterfactual efficiency of up to 85% has been demonstrated in our experiment, which opens the possibility of many exciting applications of CFC, such as high-efficiency quantum integration and imaging.
Experimental Realization of High-Efficiency Counterfactual Computation
NASA Astrophysics Data System (ADS)
Kong, Fei; Ju, Chenyong; Huang, Pu; Wang, Pengfei; Kong, Xi; Shi, Fazhan; Jiang, Liang; Du, Jiangfeng
2015-08-01
Counterfactual computation (CFC) exemplifies the fascinating quantum process by which the result of a computation may be learned without actually running the computer. In previous experimental studies, the counterfactual efficiency was limited to below 50%. Here we report an experimental realization of the generalized CFC protocol, in which the counterfactual efficiency can break the 50% limit and even approach unity in principle. The experiment is performed with the spins of a negatively charged nitrogen-vacancy color center in diamond. Taking advantage of the quantum Zeno effect, the computer can remain in the not-running subspace due to the frequent projection by the environment, while the computation result can be revealed by final detection. A counterfactual efficiency of up to 85% has been demonstrated in our experiment, which opens the possibility of many exciting applications of CFC, such as high-efficiency quantum integration and imaging.
A new approach to compute accurate velocity of meteors
NASA Astrophysics Data System (ADS)
Egal, Auriane; Gural, Peter; Vaubaillon, Jeremie; Colas, Francois; Thuillot, William
2016-10-01
The CABERNET project was designed to push the limits of meteoroid orbit measurements by improving the determination of the meteors' velocities. Indeed, despite the development of camera networks dedicated to the observation of meteors, there is still an important discrepancy between the measured orbits of meteoroids and the theoretical results. The gap between the observed and theoretical semi-major axes of the orbits is especially significant; an accurate determination of the orbits of meteoroids therefore largely depends on the computation of the pre-atmospheric velocities. It is then imperative to find out how to increase the precision of the velocity measurements. In this work, we perform an analysis of different methods currently used to compute the velocities and trajectories of the meteors. They are based on the intersecting planes method developed by Ceplecha (1987), the least squares method of Borovicka (1990), and the multi-parameter fitting (MPF) method published by Gural (2012). In order to objectively compare the performances of these techniques, we have simulated realistic meteors ('fakeors') reproducing the different measurement errors of many camera networks. Some fakeors are built following the propagation models studied by Gural (2012), and others are created by numerical integrations using the Borovicka et al. (2007) model. Different optimization techniques have also been investigated in order to pick the most suitable one to solve the MPF, and the influence of the geometry of the trajectory on the result is also presented. We will present here the results of an improved implementation of the multi-parameter fitting that allows an accurate orbit computation of meteors with CABERNET. The comparison of different velocity computations seems to show that although the MPF is by far the best method to solve the trajectory and the velocity of a meteor, the ill-conditioning of the cost functions used can lead to large estimate errors for noisy
Creation of Anatomically Accurate Computer-Aided Design (CAD) Solid Models from Medical Images
NASA Technical Reports Server (NTRS)
Stewart, John E.; Graham, R. Scott; Samareh, Jamshid A.; Oberlander, Eric J.; Broaddus, William C.
1999-01-01
Most surgical instrumentation and implants used in the world today are designed with sophisticated Computer-Aided Design (CAD)/Computer-Aided Manufacturing (CAM) software. This software automates the mechanical development of a product from its conceptual design through manufacturing. CAD software also provides a means of manipulating solid models prior to Finite Element Modeling (FEM). Few surgical products are designed in conjunction with accurate CAD models of human anatomy because of the difficulty with which these models are created. We have developed a novel technique that creates anatomically accurate, patient specific CAD solids from medical images in a matter of minutes.
NASA Astrophysics Data System (ADS)
Joost, William J.
2012-09-01
Transportation accounts for approximately 28% of U.S. energy consumption with the majority of transportation energy derived from petroleum sources. Many technologies such as vehicle electrification, advanced combustion, and advanced fuels can reduce transportation energy consumption by improving the efficiency of cars and trucks. Lightweight materials are another important technology that can improve passenger vehicle fuel efficiency by 6-8% for each 10% reduction in weight while also making electric and alternative vehicles more competitive. Despite the opportunities for improved efficiency, widespread deployment of lightweight materials for automotive structures is hampered by technology gaps most often associated with performance, manufacturability, and cost. In this report, the impact of reduced vehicle weight on energy efficiency is discussed with a particular emphasis on quantitative relationships determined by several researchers. The most promising lightweight materials systems are described along with a brief review of the most significant technical barriers to their implementation. For each material system, the development of accurate material models is critical to support simulation-intensive processing and structural design for vehicles; improved models also contribute to an integrated computational materials engineering (ICME) approach for addressing technical barriers and accelerating deployment. The value of computational techniques is described by considering recent ICME and computational materials science success stories with an emphasis on applying problem-specific methods.
Efficient universal blind quantum computation.
Giovannetti, Vittorio; Maccone, Lorenzo; Morimae, Tomoyuki; Rudolph, Terry G
2013-12-06
We give a cheat-sensitive protocol for blind universal quantum computation that is efficient in terms of computational and communication resources: it allows one party to perform an arbitrary computation on a second party's quantum computer without revealing either which computation is performed, or its input and output. The first party's computational capabilities can be extremely limited: she must only be able to create and measure single-qubit superposition states. The second party is not required to use measurement-based quantum computation. The protocol requires the (optimal) exchange of O(J log₂(N)) single-qubit states, where J is the computational depth and N is the number of qubits needed for the computation.
Kang, Dongwan D.; Froula, Jeff; Egan, Rob; ...
2015-01-01
Grouping large genomic fragments assembled from shotgun metagenomic sequences to deconvolute complex microbial communities, or metagenome binning, enables the study of individual organisms and their interactions. Because of the complex nature of these communities, existing metagenome binning methods often miss a large number of microbial species. In addition, most of the tools are not scalable to large datasets. Here we introduce automated software called MetaBAT that integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning. MetaBAT outperforms alternative methods in accuracy and computational efficiency on both synthetic and real metagenome datasets. Lastly, it automatically forms hundreds of high quality genome bins on a very large assembly consisting of millions of contigs in a matter of hours on a single node. MetaBAT is open source software and available at https://bitbucket.org/berkeleylab/metabat.
Improved patient size estimates for accurate dose calculations in abdomen computed tomography
NASA Astrophysics Data System (ADS)
Lee, Chang-Lae
2017-07-01
The radiation dose of CT (computed tomography) is generally represented by the CTDI (CT dose index). CTDI, however, does not accurately predict the actual patient doses for different human body sizes because it relies on a cylinder-shaped head (diameter: 16 cm) and body (diameter: 32 cm) phantom. The purpose of this study was to eliminate the drawbacks of the conventional CTDI and to provide more accurate radiation dose information. Projection radiographs were obtained from water cylinder phantoms of various sizes, and the sizes of the water cylinder phantoms were calculated and verified using attenuation profiles. The effective diameter was also calculated using the attenuation of the abdominal projection radiographs of 10 patients. When the results of the attenuation-based method and the geometry-based method were compared with the results of the reconstructed-axial-CT-image-based method, the effective diameter of the attenuation-based method was found to be similar to the effective diameter of the reconstructed-axial-CT-image-based method, with a difference of less than 3.8%, but the geometry-based method showed a difference of less than 11.4%. This paper proposes a new method of accurately computing the radiation dose of CT based on the patient sizes. This method computes and provides the exact patient dose before the CT scan, and can therefore be effectively used for imaging and dose control.
Exact and efficient simulation of concordant computation
NASA Astrophysics Data System (ADS)
Cable, Hugo; Browne, Daniel E.
2015-11-01
Concordant computation is a circuit-based model of quantum computation for mixed states, that assumes that all correlations within the register are discord-free (i.e. the correlations are essentially classical) at every step of the computation. The question of whether concordant computation always admits efficient simulation by a classical computer was first considered by Eastin in arXiv:quant-ph/1006.4402v1, where an answer in the affirmative was given for circuits consisting only of one- and two-qubit gates. Building on this work, we develop the theory of classical simulation of concordant computation. We present a new framework for understanding such computations, argue that a larger class of concordant computations admit efficient simulation, and provide alternative proofs for the main results of arXiv:quant-ph/1006.4402v1 with an emphasis on the exactness of simulation which is crucial for this model. We include detailed analysis of the arithmetic complexity for solving equations in the simulation, as well as extensions to larger gates and qudits. We explore the limitations of our approach, and discuss the challenges faced in developing efficient classical simulation algorithms for all concordant computations.
Accurate computation of survival statistics in genome-wide studies.
Vandin, Fabio; Papoutsaki, Alexandra; Raphael, Benjamin J; Upfal, Eli
2015-05-01
A key challenge in genomics is to identify genetic variants that distinguish patients with different survival time following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because: the two populations determined by a genetic variant may have very different sizes; and the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for populations of any size. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens to hundreds of likely false positive associations as more significant than these known associations.
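To make the contrast between an asymptotic and an exact p-value concrete, here is a minimal sketch of an exact permutation p-value for the log-rank statistic on a tiny cohort. It is not the ExaLT algorithm: it simply enumerates every possible group assignment, which is only feasible for very small samples, and the function names are hypothetical.

```python
from itertools import combinations

def logrank_statistic(times, events, group1):
    """Observed minus expected events in group 1, summed over distinct event times."""
    stat = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        deaths = [i for i in at_risk if times[i] == t and events[i]]
        n1 = sum(1 for i in at_risk if i in group1)   # group-1 patients at risk
        d1 = sum(1 for i in deaths if i in group1)    # group-1 events at time t
        stat += d1 - len(deaths) * n1 / len(at_risk)
    return stat

def exact_permutation_pvalue(times, events, group1):
    """Two-sided p-value: share of equally sized groupings at least as extreme."""
    group1 = frozenset(group1)
    observed = abs(logrank_statistic(times, events, group1))
    hits = total = 0
    for combo in combinations(range(len(times)), len(group1)):
        total += 1
        # Small tolerance so the observed grouping always counts itself.
        if abs(logrank_statistic(times, events, frozenset(combo))) >= observed - 1e-12:
            hits += 1
    return hits / total
```

With six uncensored patients and a two-patient group there are only 15 possible groupings, so the smallest achievable p-value is 1/15; an asymptotic normal approximation has no such floor, which is one way the unbalanced-population problem described above shows up.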
Accurate Computation of Survival Statistics in Genome-Wide Studies
Vandin, Fabio; Papoutsaki, Alexandra; Raphael, Benjamin J.; Upfal, Eli
2015-01-01
A key challenge in genomics is to identify genetic variants that distinguish patients with different survival time following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because: the two populations determined by a genetic variant may have very different sizes; and the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for populations of any size. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens to hundreds of likely false positive associations as more significant than these known associations. PMID:25950620
Dendritic nonlinearities are tuned for efficient spike-based computations in cortical circuits.
Ujfalussy, Balázs B; Makara, Judit K; Branco, Tiago; Lengyel, Máté
2015-12-24
Cortical neurons integrate thousands of synaptic inputs in their dendrites in highly nonlinear ways. It is unknown how these dendritic nonlinearities in individual cells contribute to computations at the level of neural circuits. Here, we show that dendritic nonlinearities are critical for the efficient integration of synaptic inputs in circuits performing analog computations with spiking neurons. We developed a theory that formalizes how a neuron's dendritic nonlinearity that is optimal for integrating synaptic inputs depends on the statistics of its presynaptic activity patterns. Based on their in vivo presynaptic population statistics (firing rates, membrane potential fluctuations, and correlations due to ensemble dynamics), our theory accurately predicted the responses of two different types of cortical pyramidal cells to patterned stimulation by two-photon glutamate uncaging. These results reveal a new computational principle underlying dendritic integration in cortical neurons by suggesting a functional link between cellular and systems-level properties of cortical circuits.
NASA Astrophysics Data System (ADS)
Plata, Jose J.; Nath, Pinku; Usanmaz, Demet; Carrete, Jesús; Toher, Cormac; de Jong, Maarten; Asta, Mark; Fornari, Marco; Nardelli, Marco Buongiorno; Curtarolo, Stefano
2017-10-01
One of the most accurate approaches for calculating lattice thermal conductivity is solving the Boltzmann transport equation starting from third-order anharmonic force constants. In addition to the underlying approximations of ab-initio parameterization, two main challenges are associated with this path: high computational costs and lack of automation in the frameworks using this methodology, which affect the discovery rate of novel materials with ad-hoc properties. Here, the Automatic Anharmonic Phonon Library (AAPL) is presented. It efficiently computes interatomic force constants by making effective use of crystal symmetry analysis, it solves the Boltzmann transport equation to obtain the lattice thermal conductivity, and allows a fully integrated operation with minimum user intervention, a rational addition to the current high-throughput accelerated materials development framework AFLOW. An "experiment vs. theory" study of the approach is shown, comparing accuracy and speed with respect to other available packages, and for materials characterized by strong electron localization and correlation. Combining AAPL with the pseudo-hybrid functional ACBN0 makes it possible to improve accuracy without increasing computational requirements.
NASA Astrophysics Data System (ADS)
Schwörer, Magnus; Lorenzen, Konstantin; Mathias, Gerald; Tavan, Paul
2015-03-01
Recently, a novel approach to hybrid quantum mechanics/molecular mechanics (QM/MM) molecular dynamics (MD) simulations has been suggested [Schwörer et al., J. Chem. Phys. 138, 244103 (2013)]. Here, the forces acting on the atoms are calculated by grid-based density functional theory (DFT) for a solute molecule and by a polarizable molecular mechanics (PMM) force field for a large solvent environment composed of several 10^3-10^5 molecules as negative gradients of a DFT/PMM hybrid Hamiltonian. The electrostatic interactions are efficiently described by a hierarchical fast multipole method (FMM). Adopting recent progress of this FMM technique [Lorenzen et al., J. Chem. Theory Comput. 10, 3244 (2014)], which particularly entails a strictly linear scaling of the computational effort with the system size, and adapting this revised FMM approach to the computation of the interactions between the DFT and PMM fragments of a simulation system, here, we show how one can further enhance the efficiency and accuracy of such DFT/PMM-MD simulations. The resulting gain of total performance, as measured for alanine dipeptide (DFT) embedded in water (PMM) by the product of the gains in efficiency and accuracy, amounts to about one order of magnitude. We also demonstrate that the jointly parallelized implementation of the DFT and PMM-MD parts of the computation enables the efficient use of high-performance computing systems. The associated software is available online.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mehmani, Yashar; Oostrom, Martinus; Balhoff, Matthew
2014-03-20
Several approaches have been developed in the literature for solving flow and transport at the pore-scale. Some authors use a direct modeling approach where the fundamental flow and transport equations are solved on the actual pore-space geometry. Such direct modeling, while very accurate, comes at a great computational cost. Network models are computationally more efficient because the pore-space morphology is approximated. Typically, a mixed cell method (MCM) is employed for solving the flow and transport system which assumes pore-level perfect mixing. This assumption is invalid at moderate to high Peclet regimes. In this work, a novel Eulerian perspective on modeling flow and transport at the pore-scale is developed. The new streamline splitting method (SSM) allows for circumventing the pore-level perfect mixing assumption, while maintaining the computational efficiency of pore-network models. SSM was verified with direct simulations and excellent matches were obtained against micromodel experiments across a wide range of pore-structure and fluid-flow parameters. The increase in the computational cost from MCM to SSM is shown to be minimal, while the accuracy of SSM is much higher than that of MCM and comparable to direct modeling approaches. Therefore, SSM can be regarded as an appropriate balance between incorporating detailed physics and controlling computational cost. The truly predictive capability of the model allows for the study of pore-level interactions of fluid flow and transport in different porous materials. In this paper, we apply SSM and MCM to study the effects of pore-level mixing on transverse dispersion in 3D disordered granular media.
Graf, Daniel; Beuerle, Matthias; Schurkus, Henry F; Luenser, Arne; Savasci, Gökcen; Ochsenfeld, Christian
2018-05-08
An efficient algorithm for calculating the random phase approximation (RPA) correlation energy is presented that is as accurate as the canonical molecular orbital resolution-of-the-identity RPA (RI-RPA) with the important advantage of an effective linear-scaling behavior (instead of quartic) for large systems due to a formulation in the local atomic orbital space. The high accuracy is achieved by utilizing optimized minimax integration schemes and the local Coulomb metric attenuated by the complementary error function for the RI approximation. The memory bottleneck of former atomic orbital (AO)-RI-RPA implementations (Schurkus, H. F.; Ochsenfeld, C. J. Chem. Phys. 2016, 144, 031101 and Luenser, A.; Schurkus, H. F.; Ochsenfeld, C. J. Chem. Theory Comput. 2017, 13, 1647-1655) is addressed by precontraction of the large 3-center integral matrix with the Cholesky factors of the ground state density reducing the memory requirements of that matrix by a factor of [Formula: see text]. Furthermore, we present a parallel implementation of our method, which not only leads to faster RPA correlation energy calculations but also to a scalable decrease in memory requirements, opening the door for investigations of large molecules even on small- to medium-sized computing clusters. Although it is known that AO methods are highly efficient for extended systems, where sparsity allows for reaching the linear-scaling regime, we show that our work also extends the applicability when considering highly delocalized systems for which no linear scaling can be achieved. As an example, the interlayer distance of two covalent organic framework pore fragments (comprising 384 atoms in total) is analyzed.
A computable expression of closure to efficient causation.
Mossio, Matteo; Longo, Giuseppe; Stewart, John
2009-04-07
In this paper, we propose a mathematical expression of closure to efficient causation in terms of lambda-calculus; we argue that this opens up the perspective of developing principled computer simulations of systems closed to efficient causation in an appropriate programming language. An important implication of our formulation is that, by exhibiting an expression in lambda-calculus, which is a paradigmatic formalism for computability and programming, we show that there are no conceptual or principled problems in realizing a computer simulation or model of closure to efficient causation. We conclude with a brief discussion of the question whether closure to efficient causation captures all relevant properties of living systems. We suggest that it might not be the case, and that more complex definitions could indeed create some crucial obstacles to computability.
NASA Astrophysics Data System (ADS)
Choi, Chu Hwan
2002-09-01
Ab initio chemistry has shown great promise in reproducing experimental results and in its predictive power. The many complicated computational models and methods seem impenetrable to an inexperienced scientist, and the reliability of the results is not easily interpreted. The application of midbond orbitals is used to determine a general method for use in calculating weak intermolecular interactions, especially those involving electron-deficient systems. Using the criteria of consistency, flexibility, accuracy and efficiency we propose a supermolecular method of calculation using the full counterpoise (CP) method of Boys and Bernardi, coupled with Moller-Plesset (MP) perturbation theory as an efficient electron-correlative method. We also advocate the use of the highly efficient and reliable correlation-consistent polarized valence basis sets of Dunning. To these basis sets, we add a general set of midbond orbitals and demonstrate greatly enhanced efficiency in the calculation. The H2-H2 dimer is taken as a benchmark test case for our method, and details of the computation are elaborated. Our method reproduces with great accuracy the dissociation energies of other previous theoretical studies. The added efficiency of extending the basis sets with conventional means is compared with the performance of our midbond-extended basis sets. The improvement found with midbond functions is notably superior in every case tested. Finally, a novel application of midbond functions to the BH5 complex is presented. The system is an unusual van der Waals complex. The interaction potential curves are presented for several standard basis sets and midbond-enhanced basis sets, as well as for two popular, alternative correlation methods. We report that MP theory appears to be superior to coupled-cluster (CC) in speed, while it is more stable than B3LYP, a widely-used density functional theory (DFT). Application of our general method yields excellent results for the midbond basis sets
Accurate and efficient spin integration for particle accelerators
Abell, Dan T.; Meiser, Dominic; Ranjbar, Vahid H.; ...
2015-02-01
Accurate spin tracking is a valuable tool for understanding spin dynamics in particle accelerators and can help improve the performance of an accelerator. In this paper, we present a detailed discussion of the integrators in the spin tracking code GPUSPINTRACK. We have implemented orbital integrators based on drift-kick, bend-kick, and matrix-kick splits. On top of the orbital integrators, we have implemented various integrators for the spin motion. These integrators use quaternions and Romberg quadratures to accelerate both the computation and the convergence of spin rotations. We evaluate their performance and accuracy in quantitative detail for individual elements as well as for the entire RHIC lattice. We exploit the inherently data-parallel nature of spin tracking to accelerate our algorithms on graphics processing units.
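The appeal of quaternions for spin rotations can be shown with a short sketch: successive rotations compose by quaternion multiplication, and a spin vector is rotated by conjugation. This is generic quaternion algebra with hypothetical function names, not code from GPUSPINTRACK, and it says nothing about how that code computes the rotation angles.

```python
import math

def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation by angle (radians) about axis."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    s = math.sin(angle / 2) / norm
    return (math.cos(angle / 2), x * s, y * s, z * s)

def quat_mul(a, b):
    """Hamilton product a*b; applying b first and then a corresponds to a*b."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q via q * v * conj(q)."""
    conj = (q[0], -q[1], -q[2], -q[3])
    _, x, y, z = quat_mul(quat_mul(q, (0.0, v[0], v[1], v[2])), conj)
    return (x, y, z)
```

For example, composing two 45-degree rotations about the z axis with `quat_mul` gives the same result as a single 90-degree rotation, which takes the spin vector (1, 0, 0) to (0, 1, 0) up to rounding. Accumulating one quaternion per element and applying it once is cheaper than multiplying 3x3 matrices at every step.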
Indexed variation graphs for efficient and accurate resistome profiling.
Rowe, Will P M; Winn, Martyn D
2018-05-14
Antimicrobial resistance remains a major threat to global health. Profiling the collective antimicrobial resistance genes within a metagenome (the "resistome") facilitates greater understanding of antimicrobial resistance gene diversity and dynamics. In turn, this can allow for gene surveillance, individualised treatment of bacterial infections and more sustainable use of antimicrobials. However, resistome profiling can be complicated by high similarity between reference genes, as well as the sheer volume of sequencing data and the complexity of analysis workflows. We have developed an efficient and accurate method for resistome profiling that addresses these complications and improves upon currently available tools. Our method combines a variation graph representation of gene sets with an LSH Forest indexing scheme to allow for fast classification of metagenomic sequence reads using similarity-search queries. Subsequent hierarchical local alignment of classified reads against graph traversals enables accurate reconstruction of full-length gene sequences using a scoring scheme. We provide our implementation, GROOT, and show it to be both faster and more accurate than a current reference-dependent tool for resistome profiling. GROOT runs on a laptop and can process a typical 2 gigabyte metagenome in 2 minutes using a single CPU. Our method is not restricted to resistome profiling and has the potential to improve current metagenomic workflows. GROOT is written in Go and is available at https://github.com/will-rowe/groot (MIT license). Contact: will.rowe@stfc.ac.uk. Supplementary data are available at Bioinformatics online.
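Similarity-search indexes of the LSH family are typically built on MinHash signatures, so a minimal sketch of MinHash may help illustrate the idea of classifying reads by similarity. This is an assumption-laden illustration in Python, not GROOT's Go implementation: the function names, the k-mer size, and the use of salted BLAKE2b hashes are all choices made here for the example, and no LSH Forest index is built.

```python
import hashlib

def kmers(seq, k=7):
    """Set of all k-mers in seq (k=7 is an arbitrary choice for this sketch)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def minhash_signature(items, num_hashes=64):
    """For each salted hash function, keep the minimum hash over a non-empty set."""
    signature = []
    for i in range(num_hashes):
        salt = i.to_bytes(8, "little")  # a different salt gives a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(x.encode(), digest_size=8, salt=salt).digest(), "big")
            for x in items))
    return signature

def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions estimates the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

A read and a reference window that share most of their k-mers will agree in most signature positions, so candidate matches can be found by comparing short signatures rather than full sequences.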
An Accurate and Dynamic Computer Graphics Muscle Model
NASA Technical Reports Server (NTRS)
Levine, David Asher
1997-01-01
A computer based musculo-skeletal model was developed at the University in the departments of Mechanical and Biomedical Engineering. This model accurately represents human shoulder kinematics. The result of this model is the graphical display of bones moving through an appropriate range of motion based on inputs of EMGs and external forces. The need existed to incorporate a geometric muscle model in the larger musculo-skeletal model. Previous muscle models did not accurately represent muscle geometries, nor did they account for the kinematics of tendons. This thesis covers the creation of a new muscle model for use in the above musculo-skeletal model. This muscle model was based on anatomical data from the Visible Human Project (VHP) cadaver study. Two-dimensional digital images from the VHP were analyzed and reconstructed to recreate the three-dimensional muscle geometries. The recreated geometries were smoothed, reduced, and sliced to form data files defining the surfaces of each muscle. The muscle modeling function opened these files during run-time and recreated the muscle surface. The modeling function applied constant volume limitations to the muscle and constant geometry limitations to the tendons.
A Unified Methodology for Computing Accurate Quaternion Color Moments and Moment Invariants.
Karakasis, Evangelos G; Papakostas, George A; Koulouriotis, Dimitrios E; Tourassis, Vassilios D
2014-02-01
In this paper, a general framework for computing accurate quaternion color moments and their corresponding invariants is proposed. The proposed unified scheme arose by studying the characteristics of different orthogonal polynomials. These polynomials are used as kernels in order to form moments, the invariants of which can easily be derived. The resulting scheme permits the usage of any polynomial-like kernel in a unified and consistent way. The resulting moments and moment invariants demonstrate robustness to noisy conditions and high discriminative power. Additionally, in the case of continuous moments, accurate computations take place to avoid approximation errors. Based on this general methodology, the quaternion Tchebichef, Krawtchouk, Dual Hahn, Legendre, orthogonal Fourier-Mellin, pseudo Zernike and Zernike color moments, and their corresponding invariants are introduced. A selected paradigm presents the reconstruction capability of each moment family, whereas proper classification scenarios evaluate the performance of color moment invariants.
Dendritic nonlinearities are tuned for efficient spike-based computations in cortical circuits
Ujfalussy, Balázs B; Makara, Judit K; Branco, Tiago; Lengyel, Máté
2015-01-01
Cortical neurons integrate thousands of synaptic inputs in their dendrites in highly nonlinear ways. It is unknown how these dendritic nonlinearities in individual cells contribute to computations at the level of neural circuits. Here, we show that dendritic nonlinearities are critical for the efficient integration of synaptic inputs in circuits performing analog computations with spiking neurons. We developed a theory that formalizes how a neuron's dendritic nonlinearity that is optimal for integrating synaptic inputs depends on the statistics of its presynaptic activity patterns. Based on their in vivo presynaptic population statistics (firing rates, membrane potential fluctuations, and correlations due to ensemble dynamics), our theory accurately predicted the responses of two different types of cortical pyramidal cells to patterned stimulation by two-photon glutamate uncaging. These results reveal a new computational principle underlying dendritic integration in cortical neurons by suggesting a functional link between cellular and systems-level properties of cortical circuits. DOI: http://dx.doi.org/10.7554/eLife.10056.001 PMID:26705334
Computationally efficient control allocation
NASA Technical Reports Server (NTRS)
Durham, Wayne (Inventor)
2001-01-01
A computationally efficient method for calculating near-optimal solutions to the three-objective, linear control allocation problem is disclosed. The control allocation problem is that of distributing the effort of redundant control effectors to achieve some desired set of objectives. The problem is deemed linear if control effectiveness is affine with respect to the individual control effectors. The optimal solution is that which exploits the collective maximum capability of the effectors within their individual physical limits. Computational efficiency is measured by the number of floating-point operations required for solution. The method presented returned optimal solutions in more than 90% of the cases examined; non-optimal solutions returned by the method were typically much less than 1% different from optimal and the errors tended to become smaller than 0.01% as the number of controls was increased. The magnitude of the errors returned by the present method was much smaller than those that resulted from either pseudo inverse or cascaded generalized inverse solutions. The computational complexity of the method presented varied linearly with increasing numbers of controls; the number of required floating point operations increased from 5.5 to seven times faster than did the minimum-norm solution (the pseudoinverse), and at about the same rate as did the cascaded generalized inverse solution. The computational requirements of the method presented were much better than that of previously described facet-searching methods which increase in proportion to the square of the number of controls.
Accurate computation of gravitational field of a tesseroid
NASA Astrophysics Data System (ADS)
Fukushima, Toshio
2018-02-01
We developed an accurate method to compute the gravitational field of a tesseroid. The method numerically integrates a surface integral representation of the gravitational potential of the tesseroid by conditionally splitting its line integration intervals and by using the double exponential quadrature rule. Then, it evaluates the gravitational acceleration vector and the gravity gradient tensor by numerically differentiating the numerically integrated potential. The numerical differentiation is conducted by appropriately switching the central and the single-sided second-order difference formulas with a suitable choice of the test argument displacement. If necessary, the new method is extended to the case of a general tesseroid with the variable density profile, the variable surface height functions, and/or the variable intervals in longitude or in latitude. The new method is capable of computing the gravitational field of the tesseroid independently of the location of the evaluation point, namely whether outside, near the surface of, on the surface of, or inside the tesseroid. The achievable precision is 14-15 digits for the potential, 9-11 digits for the acceleration vector, and 6-8 digits for the gradient tensor in the double precision environment. The correct digits are roughly doubled if employing the quadruple precision computation. The new method provides a reliable procedure to compute the topographic gravitational field, especially that near, on, and below the surface. Also, it could potentially serve as a sure reference to complement and elaborate the existing approaches using the Gauss-Legendre quadrature or other standard methods of numerical integration.
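The double exponential (tanh-sinh) quadrature rule named in this abstract can be sketched in a few lines. This is a generic fixed-step version with hypothetical parameter defaults (`h`, `t_max`), not the author's implementation; in particular it omits the conditional interval splitting and the tesseroid integrand.

```python
import math

def tanh_sinh(f, a, b, h=0.125, t_max=3.0):
    """Approximate the integral of f over [a, b] with the double exponential rule.

    The substitution x = tanh((pi/2) * sinh(t)) maps the real line onto (-1, 1);
    the trapezoidal rule in t then converges very quickly for smooth integrands.
    """
    center = 0.5 * (a + b)
    half = 0.5 * (b - a)
    total = 0.5 * math.pi * f(center)      # node t = 0 has weight pi/2
    steps = int(t_max / h)                 # truncate where the weights are negligible
    for k in range(1, steps + 1):
        t = k * h
        u = 0.5 * math.pi * math.sinh(t)
        x = math.tanh(u)                   # abscissa in (-1, 1)
        w = 0.5 * math.pi * math.cosh(t) / math.cosh(u) ** 2
        total += w * (f(center + half * x) + f(center - half * x))
    return half * h * total
```

Because the weights decay doubly exponentially, a few dozen nodes are usually enough for a smooth integrand; for instance, `tanh_sinh(math.exp, 0.0, 1.0)` should agree with e - 1 to many digits.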
Efficient calibration for imperfect computer models
Tuo, Rui; Wu, C. F. Jeff
2015-12-01
Many computer models contain unknown parameters which need to be estimated using physical observations. Furthermore, the calibration method based on Gaussian process models may lead to unreasonable estimates for imperfect computer models. In this work, we extend their study to calibration problems with stochastic physical data. We propose a novel method, called the L2 calibration, and show its semiparametric efficiency. The conventional method of the ordinary least squares is also studied. Theoretical analysis shows that it is consistent but not efficient. Here, numerical examples show that the proposed method outperforms the existing ones.
I/O-Efficient Scientific Computation Using TPIE
NASA Technical Reports Server (NTRS)
Vengroff, Darren Erik; Vitter, Jeffrey Scott
1996-01-01
In recent years, input/output (I/O)-efficient algorithms for a wide variety of problems have appeared in the literature. However, systems specifically designed to assist programmers in implementing such algorithms have remained scarce. TPIE is a system designed to support I/O-efficient paradigms for problems from a variety of domains, including computational geometry, graph algorithms, and scientific computation. The TPIE interface frees programmers from having to deal not only with explicit read and write calls, but also the complex memory management that must be performed for I/O-efficient computation. In this paper we discuss applications of TPIE to problems in scientific computation. We discuss algorithmic issues underlying the design and implementation of the relevant components of TPIE and present performance results of programs written to solve a series of benchmark problems using our current TPIE prototype. Some of the benchmarks we present are based on the NAS parallel benchmarks while others are of our own creation. We demonstrate that the central processing unit (CPU) overhead required to manage I/O is small and that even with just a single disk, the I/O overhead of I/O-efficient computation ranges from negligible to the same order of magnitude as CPU time. We conjecture that if we use a number of disks in parallel this overhead can be all but eliminated.
Modeling weakly-ionized plasmas in magnetic field: A new computationally-efficient approach
DOE Office of Scientific and Technical Information (OSTI.GOV)
Parent, Bernard; Macheret, Sergey O.; Shneider, Mikhail N.
2015-11-01
Despite its success at simulating accurately both non-neutral and quasi-neutral weakly-ionized plasmas, the drift-diffusion model has been observed to be a particularly stiff set of equations. Recently, it was demonstrated that the stiffness of the system could be relieved by rewriting the equations such that the potential is obtained from Ohm's law rather than Gauss's law while adding some source terms to the ion transport equation to ensure that Gauss's law is satisfied in non-neutral regions. Although the latter was applicable to multicomponent and multidimensional plasmas, it could not be used for plasmas in which the magnetic field was significant. This paper hence proposes a new computationally-efficient set of electron and ion transport equations that can be used not only for a plasma with multiple types of positive and negative ions, but also for a plasma in a magnetic field. Because the proposed set of equations is obtained from the same physical model as the conventional drift-diffusion equations without introducing new assumptions or simplifications, it results in the same exact solution when the grid is refined sufficiently while being more computationally efficient: not only is the proposed approach considerably less stiff and hence requires fewer iterations to reach convergence but it yields a converged solution that exhibits a significantly higher resolution. The combined faster convergence and higher resolution is shown to result in a hundredfold increase in computational efficiency for some typical steady and unsteady plasma problems including non-neutral cathode and anode sheaths as well as quasi-neutral regions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Candel, A.; Kabel, A.; Lee, L.
Over the past years, SLAC's Advanced Computations Department (ACD), under SciDAC sponsorship, has developed a suite of 3D (2D) parallel higher-order finite element (FE) codes, T3P (T2P) and Pic3P (Pic2P), aimed at accurate, large-scale simulation of wakefields and particle-field interactions in radio-frequency (RF) cavities of complex shape. The codes are built on the FE infrastructure that supports SLAC's frequency domain codes, Omega3P and S3P, to utilize conformal tetrahedral (triangular) meshes, higher-order basis functions and quadratic geometry approximation. For time integration, they adopt an unconditionally stable implicit scheme. Pic3P (Pic2P) extends T3P (T2P) to treat charged-particle dynamics self-consistently using the PIC (particle-in-cell) approach, the first such implementation on a conformal, unstructured grid using Whitney basis functions. Examples from applications to the International Linear Collider (ILC), Positron Electron Project-II (PEP-II), Linac Coherent Light Source (LCLS) and other accelerators will be presented to compare the accuracy and computational efficiency of these codes versus their counterparts using structured grids.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Du; Yang, Weitao
An efficient method for calculating excitation energies based on the particle-particle random phase approximation (ppRPA) is presented. Neglecting the contributions from the high-lying virtual states and the low-lying core states leads to a significantly smaller active-space ppRPA matrix while keeping the error to within 0.05 eV from the corresponding full ppRPA excitation energies. The resulting computational cost is significantly reduced and becomes less than the construction of the non-local Fock exchange potential matrix in the self-consistent-field (SCF) procedure. With only a modest number of active orbitals, the original ppRPA singlet-triplet (ST) gaps as well as the low-lying single and double excitation energies can be accurately reproduced at much reduced computational costs, up to 100 times faster than the iterative Davidson diagonalization of the original full ppRPA matrix. For high-lying Rydberg excitations where the Davidson algorithm fails, the computational savings of active-space ppRPA with respect to the direct diagonalization are even more dramatic. The virtues of the underlying full ppRPA combined with the significantly lower computational cost of the active-space approach will significantly expand the applicability of the ppRPA method to calculate excitation energies at a cost of O(K^{4}), with a prefactor much smaller than a single SCF Hartree-Fock (HF)/hybrid functional calculation, thus opening up new possibilities for the quantum mechanical study of excited state electronic structure of large systems.
Zhang, Du; Yang, Weitao
2016-10-13
An efficient method for calculating excitation energies based on the particle-particle random phase approximation (ppRPA) is presented. Neglecting the contributions from the high-lying virtual states and the low-lying core states leads to a significantly smaller active-space ppRPA matrix while keeping the error to within 0.05 eV from the corresponding full ppRPA excitation energies. The resulting computational cost is significantly reduced and becomes less than the construction of the non-local Fock exchange potential matrix in the self-consistent-field (SCF) procedure. With only a modest number of active orbitals, the original ppRPA singlet-triplet (ST) gaps as well as the low-lying single and double excitation energies can be accurately reproduced at much reduced computational costs, up to 100 times faster than the iterative Davidson diagonalization of the original full ppRPA matrix. For high-lying Rydberg excitations where the Davidson algorithm fails, the computational savings of active-space ppRPA with respect to the direct diagonalization are even more dramatic. The virtues of the underlying full ppRPA combined with the significantly lower computational cost of the active-space approach will significantly expand the applicability of the ppRPA method to calculate excitation energies at a cost of O(K^{4}), with a prefactor much smaller than a single SCF Hartree-Fock (HF)/hybrid functional calculation, thus opening up new possibilities for the quantum mechanical study of excited state electronic structure of large systems.
Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures
2017-10-04
Report: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures The views, opinions and/or findings contained in this...Chapel Hill Title: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures Report Term: 0-Other Email: dm...algorithms for scientific and geometric computing by exploiting the power and performance efficiency of heterogeneous shared memory architectures. These
NASA Astrophysics Data System (ADS)
Singh, Harendra
2018-04-01
The key purpose of this article is to introduce an efficient computational method for the approximate solution of homogeneous as well as non-homogeneous nonlinear Lane-Emden type equations. Using the proposed computational method, the given nonlinear equation is converted into a set of nonlinear algebraic equations whose solution gives the approximate solution to the Lane-Emden type equation. Various nonlinear cases of Lane-Emden type equations, such as the standard Lane-Emden equation, the isothermal gas spheres equation and the white-dwarf equation, are discussed. Results are compared with some well-known numerical methods, and it is observed that our results are more accurate.
NASA Technical Reports Server (NTRS)
Liu, Yi; Anusonti-Inthra, Phuriwat; Diskin, Boris
2011-01-01
A physics-based, systematically coupled, multidisciplinary prediction tool (MUTE) for rotorcraft noise was developed and validated with a wide range of flight configurations and conditions. MUTE is an aggregation of multidisciplinary computational tools that accurately and efficiently model the physics of the source of rotorcraft noise, and predict the noise at far-field observer locations. It uses systematic coupling approaches among multiple disciplines including Computational Fluid Dynamics (CFD), Computational Structural Dynamics (CSD), and high fidelity acoustics. Within MUTE, advanced high-order CFD tools are used around the rotor blade to predict the transonic flow (shock wave) effects, which generate the high-speed impulsive noise. Predictions of the blade-vortex interaction noise in low speed flight are also improved by using the Particle Vortex Transport Method (PVTM), which preserves the wake flow details required for blade/wake and fuselage/wake interactions. The accuracy of the source noise prediction is further improved by utilizing a coupling approach between CFD and CSD, so that the effects of key structural dynamics, elastic blade deformations, and trim solutions are correctly represented in the analysis. The blade loading information and/or the flow field parameters around the rotor blade predicted by the CFD/CSD coupling approach are used to predict the acoustic signatures at far-field observer locations with a high-fidelity noise propagation code (WOPWOP3). The predicted results from the MUTE tool for rotor blade aerodynamic loading and far-field acoustic signatures are compared and validated with a variety of experimental data sets, such as UH60-A data, DNW test data and HART II test data.
Fast and accurate mock catalogue generation for low-mass galaxies
NASA Astrophysics Data System (ADS)
Koda, Jun; Blake, Chris; Beutler, Florian; Kazin, Eyal; Marin, Felipe
2016-06-01
We present an accurate and fast framework for generating mock catalogues including low-mass haloes, based on an implementation of the COmoving Lagrangian Acceleration (COLA) technique. Multiple realisations of mock catalogues are crucial for analyses of large-scale structure, but conventional N-body simulations are too computationally expensive for the production of thousands of realisations. We show that COLA simulations can produce accurate mock catalogues with a moderate computation resource for low- to intermediate-mass galaxies in 10^12 M⊙ haloes, both in real and redshift space. COLA simulations have accurate peculiar velocities, without systematic errors in the velocity power spectra for k ≤ 0.15 h Mpc^-1, and with only 3 per cent error for k ≤ 0.2 h Mpc^-1. We use COLA with 10 time steps and a Halo Occupation Distribution to produce 600 mock galaxy catalogues of the WiggleZ Dark Energy Survey. Our parallelized code for efficient generation of accurate halo catalogues is publicly available at github.com/junkoda/cola_halo.
Multislice Computed Tomography Accurately Detects Stenosis in Coronary Artery Bypass Conduits
Duran, Cihan; Sagbas, Ertan; Caynak, Baris; Sanisoglu, Ilhan; Akpinar, Belhhan; Gulbaran, Murat
2007-01-01
The aim of this study was to evaluate the accuracy of multislice computed tomography in detecting graft stenosis or occlusion after coronary artery bypass grafting, using coronary angiography as the standard. From January 2005 through May 2006, 25 patients (19 men and 6 women; mean age, 54 ± 11.3 years) underwent diagnostic investigation of their bypass grafts by multislice computed tomography within 1 month of coronary angiography. The mean time elapsed after coronary artery bypass grafting was 6.2 years. In these 25 patients, we examined 65 bypass conduits (24 arterial and 41 venous) and 171 graft segments (the shaft, proximal anastomosis, and distal anastomosis). Compared with coronary angiography, the segment-based sensitivity, specificity, and positive and negative predictive values of multislice computed tomography in the evaluation of stenosis were 89%, 100%, 100%, and 99%, respectively. The patency rate for multislice computed tomography was 85% (55/65: 3 arterial and 7 venous grafts were occluded), with 100% sensitivity and specificity. From these data, we conclude that multislice computed tomography can accurately evaluate the patency and stenosis of bypass grafts during outpatient follow-up. PMID:17948078
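The four accuracy figures quoted in this abstract are standard confusion-matrix ratios, and the patency rate is plain arithmetic on the reported 55 of 65 conduits. A minimal Python sketch follows; the function name `diagnostic_metrics` and the counts passed to it are hypothetical illustrations, since the abstract does not report the underlying segment counts.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV as fractions.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives against the reference standard.
    """
    return {
        "sensitivity": tp / (tp + fn),  # diseased segments correctly flagged
        "specificity": tn / (tn + fp),  # healthy segments correctly cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Patency rate as reported in the abstract: 55 patent grafts out of 65.
patency_rate = 55 / 65
print(f"patency rate: {patency_rate:.1%}")

# Hypothetical counts, for illustration only (NOT taken from the study).
example = diagnostic_metrics(tp=8, fp=0, fn=1, tn=91)
print(example)
```

With zero false positives, specificity and positive predictive value are both exactly 1, which is the pattern the abstract reports for the segment-based analysis.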
Toward accurate tooth segmentation from computed tomography images using a hybrid level set model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gan, Yangzhou; Zhao, Qunfei; Xia, Zeyang, E-mail: zy.xia@siat.ac.cn, E-mail: jing.xiong@siat.ac.cn
Purpose: A three-dimensional (3D) model of the teeth provides important information for orthodontic diagnosis and treatment planning. Tooth segmentation is an essential step in generating the 3D digital model from computed tomography (CT) images. The aim of this study is to develop an accurate and efficient tooth segmentation method from CT images. Methods: The 3D dental CT volumetric images are segmented slice by slice in a two-dimensional (2D) transverse plane. The 2D segmentation is composed of a manual initialization step and an automatic slice by slice segmentation step. In the manual initialization step, the user manually picks a starting slice and selects a seed point for each tooth in this slice. In the automatic slice segmentation step, a developed hybrid level set model is applied to segment tooth contours from each slice. Tooth contour propagation strategy is employed to initialize the level set function automatically. Cone beam CT (CBCT) images of two subjects were used to tune the parameters. Images of 16 additional subjects were used to validate the performance of the method. Volume overlap metrics and surface distance metrics were adopted to assess the segmentation accuracy quantitatively. The volume overlap metrics were volume difference (VD, mm^3) and Dice similarity coefficient (DSC, %). The surface distance metrics were average symmetric surface distance (ASSD, mm), RMS (root mean square) symmetric surface distance (RMSSSD, mm), and maximum symmetric surface distance (MSSD, mm). Computation time was recorded to assess the efficiency. The performance of the proposed method has been compared with two state-of-the-art methods. Results: For the tested CBCT images, the VD, DSC, ASSD, RMSSSD, and MSSD for the incisor were 38.16 ± 12.94 mm^3, 88.82 ± 2.14%, 0.29 ± 0.03 mm, 0.32 ± 0.08 mm, and 1.25 ± 0.58 mm, respectively; the VD, DSC, ASSD, RMSSSD, and MSSD for the canine were 49.12 ± 9.33 mm^3, 91.57 ± 0.82%, 0.27 ± 0
Kossert, K; Cassette, Ph; Carles, A Grau; Jörg, G; Gostomski, Christroph Lierse V; Nähle, O; Wolf, Ch
2014-05-01
The triple-to-double coincidence ratio (TDCR) method is frequently used to measure the activity of radionuclides decaying by pure β emission or electron capture (EC). Some radionuclides with more complex decays have also been studied, but accurate calculations of decay branches which are accompanied by many coincident γ transitions have not yet been investigated. This paper describes recent extensions of the model to make efficiency computations for more complex decay schemes possible. In particular, the MICELLE2 program that applies a stochastic approach of the free parameter model was extended. With an improved code, efficiencies for β⁻, β⁺ and EC branches with up to seven coincident γ transitions can be calculated. Moreover, a new parametrization for the computation of electron stopping powers has been implemented to compute the ionization quenching function of 10 commercial scintillation cocktails. In order to demonstrate the capabilities of the TDCR method, the following radionuclides are discussed: ¹⁶⁶ᵐHo (complex β⁻/γ), ⁵⁹Fe (complex β⁻/γ), ⁶⁴Cu (β⁻, β⁺, EC and EC/γ) and ²²⁹Th in equilibrium with its progenies (decay chain with many α, β and complex β⁻/γ transitions). © 2013 Published by Elsevier Ltd.
Zhang, Jun; Dolg, Michael
2013-07-09
An efficient way to obtain accurate CCSD and CCSD(T) energies for large systems, i.e., the third-order incremental dual-basis set zero-buffer approach (inc3-db-B0), has been developed and tested. This approach combines the powerful incremental scheme with the dual-basis set method and, along with the newly proposed K-means clustering (KM) method and zero-buffer (B0) approximation, can obtain very accurate absolute and relative energies efficiently. We tested the approach for 10 systems of different chemical nature, i.e., intermolecular interactions including hydrogen bonding, dispersion interaction, and halogen bonding; an intramolecular rearrangement reaction; aliphatic and conjugated hydrocarbon chains; three compact covalent molecules; and a water cluster. The results show that the errors are <1.94 kJ/mol (or 0.46 kcal/mol) for relative energies and <0.0026 hartree for absolute energies. By parallelization, our approach can be applied to molecules of more than 30 atoms and more than 100 correlated electrons with high-quality basis sets such as cc-pVDZ or cc-pVTZ, saving computational cost by a factor of more than 10-20 compared to the traditional implementation. The physical reasons for the success of the inc3-db-B0 approach are also analyzed.
NASA Astrophysics Data System (ADS)
Tomaro, Robert F.
1998-07-01
The present research is aimed at developing a higher-order, spatially accurate scheme for both steady and unsteady flow simulations using unstructured meshes. The resulting scheme must work on a variety of general problems to ensure the creation of a flexible, reliable and accurate aerodynamic analysis tool. To calculate the flow around complex configurations, unstructured grids and the associated flow solvers have been developed. Efficient simulations require the minimum use of computer memory and computational times. Unstructured flow solvers typically require more computer memory than a structured flow solver due to the indirect addressing of the cells. The approach taken in the present research was to modify an existing three-dimensional unstructured flow solver to first decrease the computational time required for a solution and then to increase the spatial accuracy. The terms required to simulate flow involving non-stationary grids were also implemented. First, an implicit solution algorithm was implemented to replace the existing explicit procedure. Several test cases, including internal and external, inviscid and viscous, two-dimensional, three-dimensional and axi-symmetric problems, were simulated for comparison between the explicit and implicit solution procedures. The increased efficiency and robustness of modified code due to the implicit algorithm was demonstrated. Two unsteady test cases, a plunging airfoil and a wing undergoing bending and torsion, were simulated using the implicit algorithm modified to include the terms required for a moving and/or deforming grid. Secondly, a higher than second-order spatially accurate scheme was developed and implemented into the baseline code. Third- and fourth-order spatially accurate schemes were implemented and tested. The original dissipation was modified to include higher-order terms and modified near shock waves to limit pre- and post-shock oscillations. The unsteady cases were repeated using the higher
NASA Astrophysics Data System (ADS)
Noyes, Ben F.; Mokaberi, Babak; Mandoy, Ram; Pate, Alex; Huijgen, Ralph; McBurney, Mike; Chen, Owen
2017-03-01
Reducing overlay error via an accurate APC feedback system is one of the main challenges in high volume production of the current and future nodes in the semiconductor industry. The overlay feedback system directly affects the number of dies meeting overlay specification and the number of layers requiring dedicated exposure tools through the fabrication flow. Increasing the former number and reducing the latter number is beneficial for the overall efficiency and yield of the fabrication process. An overlay feedback system requires accurate determination of the overlay error, or fingerprint, on exposed wafers in order to determine corrections to be automatically and dynamically applied to the exposure of future wafers. Since current and future nodes require correction per exposure (CPE), the resolution of the overlay fingerprint must be high enough to accommodate CPE in the overlay feedback system, or overlay control module (OCM). Determining a high resolution fingerprint from measured data requires extremely dense overlay sampling that takes a significant amount of measurement time. For static corrections this is acceptable, but in an automated dynamic correction system this method creates extreme bottlenecks for the throughput of said system as new lots have to wait until the previous lot is measured. One solution is using a less dense overlay sampling scheme and employing computationally up-sampled data to a dense fingerprint. That method uses a global fingerprint model over the entire wafer; measured localized overlay errors are therefore not always represented in its up-sampled output. This paper will discuss a hybrid system shown in Fig. 1 that combines a computationally up-sampled fingerprint with the measured data to more accurately capture the actual fingerprint, including local overlay errors. Such a hybrid system is shown to result in reduced modelled residuals while determining the fingerprint, and better on-product overlay performance.
NASA Astrophysics Data System (ADS)
Bauer, Sebastian; Mathias, Gerald; Tavan, Paul
2014-03-01
We present a reaction field (RF) method which accurately solves the Poisson equation for proteins embedded in dielectric solvent continua at a computational effort comparable to that of an electrostatics calculation with polarizable molecular mechanics (MM) force fields. The method combines an approach originally suggested by Egwolf and Tavan [J. Chem. Phys. 118, 2039 (2003)] with concepts generalizing the Born solution [Z. Phys. 1, 45 (1920)] for a solvated ion. First, we derive an exact representation according to which the sources of the RF potential and energy are inducible atomic anti-polarization densities and atomic shielding charge distributions. Modeling these atomic densities by Gaussians leads to an approximate representation. Here, the strengths of the Gaussian shielding charge distributions are directly given in terms of the static partial charges as defined, e.g., by standard MM force fields for the various atom types, whereas the strengths of the Gaussian anti-polarization densities are calculated by a self-consistency iteration. The atomic volumes are also described by Gaussians. To account for covalently overlapping atoms, their effective volumes are calculated by another self-consistency procedure, which guarantees that the dielectric function ε(r) is close to one everywhere inside the protein. The Gaussian widths σ_i of the atoms i are parameters of the RF approximation. The remarkable accuracy of the method is demonstrated by comparison with Kirkwood's analytical solution for a spherical protein [J. Chem. Phys. 2, 351 (1934)] and with computationally expensive grid-based numerical solutions for simple model systems in dielectric continua including a di-peptide (Ac-Ala-NHMe) as modeled by a standard MM force field. The latter example shows how weakly the RF conformational free energy landscape depends on the parameters σ_i. A summarizing discussion highlights the achievements of the new theory and of its approximate solution particularly by
Bauer, Sebastian; Mathias, Gerald; Tavan, Paul
2014-03-14
We present a reaction field (RF) method which accurately solves the Poisson equation for proteins embedded in dielectric solvent continua at a computational effort comparable to that of an electrostatics calculation with polarizable molecular mechanics (MM) force fields. The method combines an approach originally suggested by Egwolf and Tavan [J. Chem. Phys. 118, 2039 (2003)] with concepts generalizing the Born solution [Z. Phys. 1, 45 (1920)] for a solvated ion. First, we derive an exact representation according to which the sources of the RF potential and energy are inducible atomic anti-polarization densities and atomic shielding charge distributions. Modeling these atomic densities by Gaussians leads to an approximate representation. Here, the strengths of the Gaussian shielding charge distributions are directly given in terms of the static partial charges as defined, e.g., by standard MM force fields for the various atom types, whereas the strengths of the Gaussian anti-polarization densities are calculated by a self-consistency iteration. The atomic volumes are also described by Gaussians. To account for covalently overlapping atoms, their effective volumes are calculated by another self-consistency procedure, which guarantees that the dielectric function ε(r) is close to one everywhere inside the protein. The Gaussian widths σ_i of the atoms i are parameters of the RF approximation. The remarkable accuracy of the method is demonstrated by comparison with Kirkwood's analytical solution for a spherical protein [J. Chem. Phys. 2, 351 (1934)] and with computationally expensive grid-based numerical solutions for simple model systems in dielectric continua including a di-peptide (Ac-Ala-NHMe) as modeled by a standard MM force field. The latter example shows how weakly the RF conformational free energy landscape depends on the parameters σ_i. A summarizing discussion highlights the achievements of the new theory and of its approximate solution particularly by
Time accurate application of the MacCormack 2-4 scheme on massively parallel computers
NASA Technical Reports Server (NTRS)
Hudson, Dale A.; Long, Lyle N.
1995-01-01
Many recent computational efforts in turbulence and acoustics research have used higher order numerical algorithms. One popular method has been the explicit MacCormack 2-4 scheme. The MacCormack 2-4 scheme is second order accurate in time and fourth order accurate in space, and is stable for CFL numbers below 2/3. Current research has shown that the method can give accurate results but does exhibit significant Gibbs phenomena at sharp discontinuities. The impact of adding Jameson type second, third, and fourth order artificial viscosity was examined here. Category 2 problems, the nonlinear traveling wave and the Riemann problem, were computed using a CFL number of 0.25. This research has found that dispersion errors can be significantly reduced or nearly eliminated by using a combination of second and third order terms in the damping. Use of second and fourth order terms reduced the magnitude of dispersion errors but not as effectively as the second and third order combination. The program was coded using Thinking Machines' CM Fortran, a variant of Fortran 90/High Performance Fortran, and was executed on a 2K CM-200. Simple extrapolation boundary conditions were used for both problems.
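As a rough illustration of the scheme named in this abstract, the sketch below applies one common form of the MacCormack 2-4 (Gottlieb-Turkel) predictor-corrector to linear advection on a periodic grid. The test problem, the function names, the grid size, and the fixed forward-then-backward ordering are assumptions made here for illustration; the report itself treats nonlinear problems with added artificial viscosity, which this sketch omits.

```python
import math


def maccormack24_step(u, nu):
    """One MacCormack 2-4 step for u_t + a*u_x = 0 on a periodic grid.

    nu is the CFL number a*dt/dx (the abstract quotes stability below 2/3).
    """
    n = len(u)
    # Predictor: one-sided forward difference (-u[j+2] + 8u[j+1] - 7u[j]) / 6.
    us = [
        u[j] - nu / 6.0 * (-u[(j + 2) % n] + 8.0 * u[(j + 1) % n] - 7.0 * u[j])
        for j in range(n)
    ]
    # Corrector: one-sided backward difference on the predicted values,
    # averaged with the old solution (negative indices wrap periodically).
    return [
        0.5 * (u[j] + us[j] - nu / 6.0 * (7.0 * us[j] - 8.0 * us[j - 1] + us[j - 2]))
        for j in range(n)
    ]


def advect_sine(n=64, nu=0.25, periods=1):
    """Advect one sine wavelength around a unit periodic domain (a = 1).

    Returns the maximum pointwise error against the exact solution,
    which coincides with the initial data after whole periods.
    """
    dx = 1.0 / n
    u0 = [math.sin(2.0 * math.pi * j * dx) for j in range(n)]
    steps = round(periods / (nu * dx))  # dt = nu * dx when a = 1
    u = list(u0)
    for _ in range(steps):
        u = maccormack24_step(u, nu)
    return max(abs(a - b) for a, b in zip(u, u0))


if __name__ == "__main__":
    print("max error after one period:", advect_sine())
```

Averaging the forward and backward one-sided stencils recovers the fourth-order central difference, which is where the spatial accuracy comes from; implementations often alternate the predictor direction between steps to avoid a directional bias.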
Efficiently modeling neural networks on massively parallel computers
NASA Technical Reports Server (NTRS)
Farber, Robert M.
1993-01-01
Neural networks are a very useful tool for analyzing and modeling complex real world systems. Applying neural network simulations to real world problems generally involves large amounts of data and massive amounts of computation. To efficiently handle the computational requirements of large problems, we have implemented at Los Alamos a highly efficient neural network compiler for serial computers, vector computers, vector parallel computers, and fine grain SIMD computers such as the CM-2 connection machine. This paper describes the mapping used by the compiler to implement feed-forward backpropagation neural networks for a SIMD (Single Instruction Multiple Data) architecture parallel computer. Thinking Machines Corporation has benchmarked our code at 1.3 billion interconnects per second (approximately 3 gigaflops) on a 64,000 processor CM-2 connection machine (Singer 1990). This mapping is applicable to other SIMD computers and can be implemented on MIMD computers such as the CM-5 connection machine. Our mapping has virtually no communications overhead with the exception of the communications required for a global summation across the processors (which has a sub-linear runtime growth on the order of O(log(number of processors))). We can efficiently model very large neural networks which have many neurons and interconnects, and our mapping can extend to arbitrarily large networks (within memory limitations) by merging the memory space of separate processors with fast adjacent processor interprocessor communications. This paper will consider the simulation of only feed-forward neural networks, although this method is extendable to recurrent networks.
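The logarithmic cost of the global summation mentioned above is what a pairwise (tree) reduction gives. The sketch below simulates such a reduction serially in Python; `tree_sum` is a hypothetical name, each list entry stands in for one processor's partial sum, and nothing here reflects the actual CM-2 implementation.

```python
def tree_sum(values):
    """Sum per-processor partial values by pairwise tree reduction.

    Returns (total, rounds). In each round, every active "processor" i
    adds the value held by processor i + stride, so P values are combined
    in ceil(log2(P)) rounds rather than P - 1 sequential additions.
    """
    vals = list(values)
    if not vals:
        return 0, 0
    n = len(vals)
    rounds = 0
    stride = 1
    while stride < n:
        # All additions inside one round are independent of each other
        # and would run in parallel on real hardware.
        for i in range(0, n - stride, 2 * stride):
            vals[i] += vals[i + stride]
        stride *= 2
        rounds += 1
    return vals[0], rounds


if __name__ == "__main__":
    total, rounds = tree_sum(range(8))
    print(total, rounds)
```

Doubling the number of simulated processors adds only one more round, which is the sub-linear growth the abstract refers to.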
IMPROVING TACONITE PROCESSING PLANT EFFICIENCY BY COMPUTER SIMULATION, Final Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
William M. Bond; Salih Ersayin
2007-03-30
This project involved industrial scale testing of a mineral processing simulator to improve the efficiency of a taconite processing plant, namely the Minorca mine. The Concentrator Modeling Center at the Coleraine Minerals Research Laboratory, University of Minnesota Duluth, enhanced the capabilities of available software, Usim Pac, by developing mathematical models needed for accurate simulation of taconite plants. This project provided funding for this technology to prove itself in the industrial environment. As the first step, data representing existing plant conditions were collected by sampling and sample analysis. Data were then balanced and provided a basis for assessing the efficiency of individual devices and the plant, and also for performing simulations aimed at improving plant efficiency. Performance evaluation served as a guide in developing alternative process strategies for more efficient production. A large number of computer simulations were then performed to quantify the benefits and effects of implementing these alternative schemes. Modification of makeup ball size was selected as the most feasible option for the target performance improvement. This was combined with replacement of existing hydrocyclones with more efficient ones. After plant implementation of these modifications, plant sampling surveys were carried out to validate findings of the simulation-based study. Plant data showed very good agreement with the simulated data, confirming results of simulation. After the implementation of modifications in the plant, several upstream bottlenecks became visible. Despite these bottlenecks limiting full capacity, concentrator energy improvement of 7% was obtained. Further improvements in energy efficiency are expected in the near future. The success of this project demonstrated the feasibility of a simulation-based approach. Currently, the Center provides simulation-based service to all the iron ore mining companies operating in northern
Sengupta, Arkajyoti; Ramabhadran, Raghunath O; Raghavachari, Krishnan
2014-08-14
In this study we have used the connectivity-based hierarchy (CBH) method to derive accurate heats of formation of a range of biomolecules, 18 amino acids and 10 barbituric acid/uracil derivatives. The hierarchy is based on the connectivity of the different atoms in a large molecule. It results in error-cancellation reaction schemes that are automated, general, and can be readily used for a broad range of organic molecules and biomolecules. Herein, we first locate stable conformational and tautomeric forms of these biomolecules using an accurate level of theory (viz. CCSD(T)/6-311++G(3df,2p)). Subsequently, the heats of formation of the amino acids are evaluated using the CBH-1 and CBH-2 schemes and routinely employed density functionals or wave function-based methods. The heats of formation calculated herein using modest levels of theory are in very good agreement with those obtained using more expensive W1-F12 and W2-F12 methods on amino acids and G3 results on barbituric acid derivatives. Overall, the present study (a) highlights the small effect of including multiple conformers in determining the heats of formation of biomolecules and (b) in concurrence with previous CBH studies, proves that use of the more effective error-cancelling isoatomic scheme (CBH-2) results in more accurate heats of formation with modestly sized basis sets along with common density functionals or wave function-based methods.
High Performance Computing Meets Energy Efficiency - Continuum Magazine | NREL
Simulation by Patrick J. Moriarty and Matthew J. Churchfield, NREL. The new High Performance Computing Data Center at the National Renewable Energy Laboratory (NREL) hosts high-speed, high-volume data
NASA Astrophysics Data System (ADS)
Balsara, Dinshaw S.
2017-12-01
As computational astrophysics comes under pressure to become a precision science, there is an increasing need to move to high accuracy schemes for computational astrophysics. The algorithmic needs of computational astrophysics are indeed very special. The methods need to be robust and preserve the positivity of density and pressure. Relativistic flows should remain sub-luminal. These requirements place additional pressures on a computational astrophysics code, which are usually not felt by a traditional fluid dynamics code. Hence the need for a specialized review. The focus here is on weighted essentially non-oscillatory (WENO) schemes, discontinuous Galerkin (DG) schemes and PNPM schemes. WENO schemes are higher order extensions of traditional second order finite volume schemes. At third order, they are most similar to piecewise parabolic method schemes, which are also included. DG schemes evolve all the moments of the solution, with the result that they are more accurate than WENO schemes. PNPM schemes occupy a compromise position between WENO and DG schemes. They evolve an Nth order spatial polynomial, while reconstructing higher order terms up to Mth order. As a result, the timestep can be larger. Time-dependent astrophysical codes need to be accurate in space and time with the result that the spatial and temporal accuracies must be matched. This is realized with the help of strong stability preserving Runge-Kutta schemes and ADER (Arbitrary DERivative in space and time) schemes, both of which are also described. The emphasis of this review is on computer-implementable ideas, not necessarily on the underlying theory.
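For readers unfamiliar with the strong stability preserving Runge-Kutta schemes mentioned in this abstract, the sketch below shows the widely used three-stage, third-order Shu-Osher form, written as convex combinations of forward-Euler steps. It is applied here to a scalar test equation chosen purely for illustration; the function names and the test problem are assumptions, not taken from the review.

```python
import math


def ssprk3_step(u, dt, rhs):
    """One third-order SSP Runge-Kutta (Shu-Osher) step for du/dt = rhs(u).

    Each stage is a convex combination of forward-Euler updates, which is
    what lets the scheme inherit forward-Euler stability properties.
    """
    u1 = u + dt * rhs(u)
    u2 = 0.75 * u + 0.25 * (u1 + dt * rhs(u1))
    return u / 3.0 + 2.0 / 3.0 * (u2 + dt * rhs(u2))


def integrate(u0, dt, steps, rhs):
    """Advance u0 through a fixed number of SSP-RK3 steps."""
    u = u0
    for _ in range(steps):
        u = ssprk3_step(u, dt, rhs)
    return u


if __name__ == "__main__":
    # Illustrative test problem: du/dt = -u, exact solution exp(-t).
    approx = integrate(1.0, 0.01, 100, lambda u: -u)
    print(approx, math.exp(-1.0))
```

In a WENO or DG code, `rhs` would be the spatial discretization operator acting on the full solution array rather than a scalar function.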
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hoang, Tuan L.; Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, CA 94550; Marian, Jaime, E-mail: jmarian@ucla.edu
2015-11-01
An improved version of a recently developed stochastic cluster dynamics (SCD) method (Marian and Bulatov, 2012) [6] is introduced as an alternative to rate theory (RT) methods for solving coupled ordinary differential equation (ODE) systems for irradiation damage simulations. SCD circumvents by design the curse of dimensionality of the variable space that renders traditional ODE-based RT approaches inefficient when handling complex defect population comprised of multiple (more than two) defect species. Several improvements introduced here enable efficient and accurate simulations of irradiated materials up to realistic (high) damage doses characteristic of next-generation nuclear systems. The first improvement is a procedure for efficiently updating the defect reaction-network and event selection in the context of a dynamically expanding reaction-network. Next is a novel implementation of the τ-leaping method that speeds up SCD simulations by advancing the state of the reaction network in large time increments when appropriate. Lastly, a volume rescaling procedure is introduced to control the computational complexity of the expanding reaction-network through occasional reductions of the defect population while maintaining accurate statistics. The enhanced SCD method is then applied to model defect cluster accumulation in iron thin films subjected to triple ion-beam (Fe^3+, He^+ and H^+) irradiations, for which standard RT or spatially-resolved kinetic Monte Carlo simulations are prohibitively expensive.
NASA Astrophysics Data System (ADS)
Hoang, Tuan L.; Marian, Jaime; Bulatov, Vasily V.; Hosemann, Peter
2015-11-01
An improved version of a recently developed stochastic cluster dynamics (SCD) method (Marian and Bulatov, 2012) [6] is introduced as an alternative to rate theory (RT) methods for solving coupled ordinary differential equation (ODE) systems for irradiation damage simulations. SCD circumvents by design the curse of dimensionality of the variable space that renders traditional ODE-based RT approaches inefficient when handling complex defect population comprised of multiple (more than two) defect species. Several improvements introduced here enable efficient and accurate simulations of irradiated materials up to realistic (high) damage doses characteristic of next-generation nuclear systems. The first improvement is a procedure for efficiently updating the defect reaction-network and event selection in the context of a dynamically expanding reaction-network. Next is a novel implementation of the τ-leaping method that speeds up SCD simulations by advancing the state of the reaction network in large time increments when appropriate. Lastly, a volume rescaling procedure is introduced to control the computational complexity of the expanding reaction-network through occasional reductions of the defect population while maintaining accurate statistics. The enhanced SCD method is then applied to model defect cluster accumulation in iron thin films subjected to triple ion-beam (Fe3+, He+ and H+) irradiations, for which standard RT or spatially-resolved kinetic Monte Carlo simulations are prohibitively expensive.
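The τ-leaping idea referred to above (advancing a stochastic reaction network by a fixed time increment and firing each reaction a Poisson-distributed number of times) can be sketched on a toy birth-death process. This is a generic, minimal illustration and not the SCD implementation: the reaction system, the names `poisson` and `tau_leap_birth_death`, the rate constants, and the clamping of deaths to the current population are all assumptions made here.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson variate with mean lam (Knuth's method, suitable for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= threshold:
            return k - 1


def tau_leap_birth_death(k_birth, k_death, x0, tau, n_steps, seed=0):
    """Tau-leaping for the toy network  0 -> X (rate k_birth),  X -> 0 (rate k_death * x).

    Returns the population after every leap, including the initial value.
    """
    rng = random.Random(seed)
    x, history = x0, [x0]
    for _ in range(n_steps):
        # Number of firings of each reaction channel during one leap of length tau.
        births = poisson(k_birth * tau, rng)
        # Clamp deaths so the population cannot become negative (an ad hoc safeguard).
        deaths = min(x, poisson(k_death * x * tau, rng))
        x += births - deaths
        history.append(x)
    return history


if __name__ == "__main__":
    hist = tau_leap_birth_death(10.0, 0.2, 0, 0.05, 4000)
    tail = hist[2000:]
    print("late-time mean population:", sum(tail) / len(tail))
```

For this toy network the stationary mean is k_birth / k_death, so the late-time average should fluctuate around 50 with the parameters used in the usage lines.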
Convolutional networks for fast, energy-efficient neuromorphic computing.
Esser, Steven K; Merolla, Paul A; Arthur, John V; Cassidy, Andrew S; Appuswamy, Rathinakumar; Andreopoulos, Alexander; Berg, David J; McKinstry, Jeffrey L; Melano, Timothy; Barch, Davis R; di Nolfo, Carmelo; Datta, Pallab; Amir, Arnon; Taba, Brian; Flickner, Myron D; Modha, Dharmendra S
2016-10-11
Deep networks are now able to achieve human-level performance on a broad spectrum of recognition tasks. Independently, neuromorphic computing has now demonstrated unprecedented energy-efficiency through a new chip architecture based on spiking neurons, low precision synapses, and a scalable communication network. Here, we demonstrate that neuromorphic computing, despite its novel architectural primitives, can implement deep convolution networks that (i) approach state-of-the-art classification accuracy across eight standard datasets encompassing vision and speech, (ii) perform inference while preserving the hardware's underlying energy-efficiency and high throughput, running on the aforementioned datasets at between 1,200 and 2,600 frames/s and using between 25 and 275 mW (effectively >6,000 frames/s per Watt), and (iii) can be specified and trained using backpropagation with the same ease-of-use as contemporary deep learning. This approach allows the algorithmic power of deep learning to be merged with the efficiency of neuromorphic processors, bringing the promise of embedded, intelligent, brain-inspired computing one step closer.
Fast and Accurate Circuit Design Automation through Hierarchical Model Switching.
Huynh, Linh; Tagkopoulos, Ilias
2015-08-21
In computer-aided biological design, the trifecta of characterized part libraries, accurate models and optimal design parameters is crucial for producing reliable designs. As the number of parts and model complexity increase, however, it becomes exponentially more difficult for any optimization method to search the solution space, hence creating a trade-off that hampers efficient design. To address this issue, we present a hierarchical computer-aided design architecture that uses a two-step approach for biological design. First, a simple model of low computational complexity is used to predict circuit behavior and assess candidate circuit branches through branch-and-bound methods. Then, a complex, nonlinear circuit model is used for a fine-grained search of the reduced solution space, thus achieving more accurate results. Evaluation with a benchmark of 11 circuits and a library of 102 experimental designs with known characterization parameters demonstrates a speed-up of 3 orders of magnitude when compared to other design methods that provide optimality guarantees.
Stable and Spectrally Accurate Schemes for the Navier-Stokes Equations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jia, Jun; Liu, Jie
2011-01-01
In this paper, we present an accurate, efficient and stable numerical method for the incompressible Navier-Stokes equations (NSEs). The method is based on (1) an equivalent pressure Poisson equation formulation of the NSE with proper pressure boundary conditions, which facilitates the design of high-order and stable numerical methods, and (2) the Krylov deferred correction (KDC) accelerated method of lines transpose (MoL^T), which is very stable, efficient, and of arbitrary order in time. Numerical tests with known exact solutions in three dimensions show that the new method is spectrally accurate in time, and a numerical order of convergence 9 was observed. Two-dimensional computational results of flow past a cylinder and flow in a bifurcated tube are also reported.
Accurate and efficient modeling of the detector response in small animal multi-head PET systems.
Cecchetti, Matteo; Moehrs, Sascha; Belcari, Nicola; Del Guerra, Alberto
2013-10-07
In fully three-dimensional PET imaging, iterative image reconstruction techniques usually outperform analytical algorithms in terms of image quality provided that an appropriate system model is used. In this study we concentrate on the calculation of an accurate system model for the YAP-(S)PET II small animal scanner, with the aim to obtain fully resolution- and contrast-recovered images at low levels of image roughness. For this purpose we calculate the system model by decomposing it into a product of five matrices: (1) a detector response component obtained via Monte Carlo simulations, (2) a geometric component which describes the scanner geometry and which is calculated via a multi-ray method, (3) a detector normalization component derived from the acquisition of a planar source, (4) a photon attenuation component calculated from x-ray computed tomography data, and finally, (5) a positron range component is formally included. This system model factorization allows the optimization of each component in terms of computation time, storage requirements and accuracy. The main contribution of this work is a new, efficient way to calculate the detector response component for rotating, planar detectors, that consists of a GEANT4 based simulation of a subset of lines of flight (LOFs) for a single detector head whereas the missing LOFs are obtained by using intrinsic detector symmetries. Additionally, we introduce and analyze a probability threshold for matrix elements of the detector component to optimize the trade-off between the matrix size in terms of non-zero elements and the resulting quality of the reconstructed images. In order to evaluate our proposed system model we reconstructed various images of objects, acquired according to the NEMA NU 4-2008 standard, and we compared them to the images reconstructed with two other system models: a model that does not include any detector response component and a model that approximates analytically the depth of interaction
Accurate and efficient modeling of the detector response in small animal multi-head PET systems
NASA Astrophysics Data System (ADS)
Cecchetti, Matteo; Moehrs, Sascha; Belcari, Nicola; Del Guerra, Alberto
2013-10-01
In fully three-dimensional PET imaging, iterative image reconstruction techniques usually outperform analytical algorithms in terms of image quality provided that an appropriate system model is used. In this study we concentrate on the calculation of an accurate system model for the YAP-(S)PET II small animal scanner, with the aim to obtain fully resolution- and contrast-recovered images at low levels of image roughness. For this purpose we calculate the system model by decomposing it into a product of five matrices: (1) a detector response component obtained via Monte Carlo simulations, (2) a geometric component which describes the scanner geometry and which is calculated via a multi-ray method, (3) a detector normalization component derived from the acquisition of a planar source, (4) a photon attenuation component calculated from x-ray computed tomography data, and finally, (5) a positron range component is formally included. This system model factorization allows the optimization of each component in terms of computation time, storage requirements and accuracy. The main contribution of this work is a new, efficient way to calculate the detector response component for rotating, planar detectors, that consists of a GEANT4 based simulation of a subset of lines of flight (LOFs) for a single detector head whereas the missing LOFs are obtained by using intrinsic detector symmetries. Additionally, we introduce and analyze a probability threshold for matrix elements of the detector component to optimize the trade-off between the matrix size in terms of non-zero elements and the resulting quality of the reconstructed images. In order to evaluate our proposed system model we reconstructed various images of objects, acquired according to the NEMA NU 4-2008 standard, and we compared them to the images reconstructed with two other system models: a model that does not include any detector response component and a model that approximates analytically the depth of interaction
NASA Astrophysics Data System (ADS)
Ghale, Purnima; Johnson, Harley T.
2018-06-01
We present an efficient sparse matrix-vector (SpMV) based method to compute the density matrix P from a given Hamiltonian in electronic structure computations. Our method is a hybrid approach based on Chebyshev-Jackson approximation theory and matrix purification methods like the second order spectral projection purification (SP2). Recent methods to compute the density matrix scale as O(N) in the number of floating point operations but are accompanied by large memory and communication overhead, and they are based on iterative use of the sparse matrix-matrix multiplication kernel (SpGEMM), which is known to be computationally irregular. In addition to irregularity in the sparse Hamiltonian H, the nonzero structure of intermediate estimates of P depends on products of H and evolves over the course of computation. On the other hand, an expansion of the density matrix P in terms of Chebyshev polynomials is straightforward and SpMV based; however, the resulting density matrix may not satisfy the required constraints exactly. In this paper, we analyze the strengths and weaknesses of the Chebyshev-Jackson polynomials and the second order spectral projection purification (SP2) method, and propose to combine them so that the accurate density matrix can be computed using the SpMV computational kernel only, and without having to store the density matrix P. Our method accomplishes these objectives by using the Chebyshev polynomial estimate as the initial guess for SP2, which is followed by using sparse matrix-vector multiplications (SpMVs) to replicate the behavior of the SP2 algorithm for purification. We demonstrate the method on a tight-binding model system of an oxide material containing more than 3 million atoms. In addition, we also present the predicted behavior of our method when applied to near-metallic Hamiltonians with a wide energy spectrum.
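The SP2 purification step mentioned in the abstract above can be illustrated with a small dense example. The sketch below is not the SpMV-based hybrid scheme the abstract describes; it uses dense matrix products on a hypothetical three-site tight-binding chain, Gershgorin estimates for the spectral bounds, and the standard trace-steered choice between X² and 2X − X². All function names and the test Hamiltonian are assumptions.

```python
def matmul(a, b):
    """Dense product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def sp2_density_matrix(h, n_occ, max_iter=100, tol=1e-12):
    """Second-order spectral projection (SP2) purification for a symmetric matrix h.

    Returns an approximate projector onto the n_occ lowest eigenstates of h.
    """
    n = len(h)
    # Gershgorin circles give cheap lower and upper bounds on the spectrum.
    radii = [sum(abs(h[i][j]) for j in range(n) if j != i) for i in range(n)]
    e_min = min(h[i][i] - radii[i] for i in range(n))
    e_max = max(h[i][i] + radii[i] for i in range(n))
    # Map the spectrum into [0, 1] with the lowest energies sent towards 1.
    x = [[((e_max if i == j else 0.0) - h[i][j]) / (e_max - e_min)
          for j in range(n)] for i in range(n)]
    for _ in range(max_iter):
        x2 = matmul(x, x)
        # trace(X) - trace(X^2) vanishes only when X is idempotent.
        if abs(trace(x) - trace(x2)) < tol:
            break
        if trace(x) > n_occ:
            x = x2  # too many occupied states: X^2 lowers the trace
        else:
            # too few occupied states: 2X - X^2 raises the trace
            x = [[2.0 * x[i][j] - x2[i][j] for j in range(n)] for i in range(n)]
    return x


if __name__ == "__main__":
    # Hypothetical three-site chain with hopping -1 and one occupied state.
    chain = [[0.0, -1.0, 0.0], [-1.0, 0.0, -1.0], [0.0, -1.0, 0.0]]
    for row in sp2_density_matrix(chain, 1):
        print(row)
```

For this chain the lowest eigenvector is (1, √2, 1)/2, so the converged matrix should approach the outer product of that vector with itself.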
A simplified approach to characterizing a kilovoltage source spectrum for accurate dose computation.
Poirier, Yannick; Kouznetsov, Alexei; Tambasco, Mauro
2012-06-01
%. The HVL and kVp are sufficient for characterizing a kV x-ray source spectrum for accurate dose computation. As these parameters can be easily and accurately measured, they provide for a clinically feasible approach to characterizing a kV energy spectrum to be used for patient specific x-ray dose computations. Furthermore, these results provide experimental validation of our novel hybrid dose computation algorithm. © 2012 American Association of Physicists in Medicine.
[Economic efficiency of computer monitoring of health].
Il'icheva, N P; Stazhadze, L L
2001-01-01
Presents a method of computer monitoring of health based on the use of modern information technologies in public health. The method helps organize the preventive activities of an outpatient clinic at a high level and substantially reduces losses of time and money. The efficiency of such preventive measures and the increasing number of computer and Internet users suggest that such methods are promising and that further studies in this field are needed.
Convolutional networks for fast, energy-efficient neuromorphic computing
Esser, Steven K.; Merolla, Paul A.; Arthur, John V.; Cassidy, Andrew S.; Appuswamy, Rathinakumar; Andreopoulos, Alexander; Berg, David J.; McKinstry, Jeffrey L.; Melano, Timothy; Barch, Davis R.; di Nolfo, Carmelo; Datta, Pallab; Amir, Arnon; Taba, Brian; Flickner, Myron D.; Modha, Dharmendra S.
2016-01-01
Deep networks are now able to achieve human-level performance on a broad spectrum of recognition tasks. Independently, neuromorphic computing has now demonstrated unprecedented energy-efficiency through a new chip architecture based on spiking neurons, low precision synapses, and a scalable communication network. Here, we demonstrate that neuromorphic computing, despite its novel architectural primitives, can implement deep convolution networks that (i) approach state-of-the-art classification accuracy across eight standard datasets encompassing vision and speech, (ii) perform inference while preserving the hardware’s underlying energy-efficiency and high throughput, running on the aforementioned datasets at between 1,200 and 2,600 frames/s and using between 25 and 275 mW (effectively >6,000 frames/s per Watt), and (iii) can be specified and trained using backpropagation with the same ease-of-use as contemporary deep learning. This approach allows the algorithmic power of deep learning to be merged with the efficiency of neuromorphic processors, bringing the promise of embedded, intelligent, brain-inspired computing one step closer. PMID:27651489
ACCURATE CHEMICAL MASTER EQUATION SOLUTION USING MULTI-FINITE BUFFERS
Cao, Youfang; Terebus, Anna; Liang, Jie
2016-01-01
The discrete chemical master equation (dCME) provides a fundamental framework for studying stochasticity in mesoscopic networks. Because of the multi-scale nature of many networks where reaction rates have large disparity, directly solving dCMEs is intractable due to the exploding size of the state space. It is important to truncate the state space effectively with quantified errors, so accurate solutions can be computed. It is also important to know if all major probabilistic peaks have been computed. Here we introduce the Accurate CME (ACME) algorithm for obtaining direct solutions to dCMEs. With multi-finite buffers for reducing the state space by O(n!), exact steady-state and time-evolving network probability landscapes can be computed. We further describe a theoretical framework of aggregating microstates into a smaller number of macrostates by decomposing a network into independent aggregated birth and death processes, and give an a priori method for rapidly determining steady-state truncation errors. The maximal sizes of the finite buffers for a given error tolerance can also be pre-computed without costly trial solutions of dCMEs. We show exactly computed probability landscapes of three multi-scale networks, namely, a 6-node toggle switch, 11-node phage-lambda epigenetic circuit, and 16-node MAPK cascade network, the latter two with no known solutions. We also show how probabilities of rare events can be computed from first-passage times, another class of unsolved problems challenging for simulation-based techniques due to large separations in time scales. Overall, the ACME method enables accurate and efficient solutions of the dCME for a large class of networks. PMID:27761104
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heydari, M.H., E-mail: heydari@stu.yazd.ac.ir; The Laboratory of Quantum Information Processing, Yazd University, Yazd; Hooshmandasl, M.R., E-mail: hooshmandasl@yazd.ac.ir
Because of the nonlinearity, closed-form solutions of many important stochastic functional equations are virtually impossible to obtain. Thus, numerical solutions are a viable alternative. In this paper, a new computational method based on the generalized hat basis functions together with their stochastic operational matrix of Itô-integration is proposed for solving nonlinear stochastic Itô integral equations in large intervals. In the proposed method, a new technique for computing nonlinear terms in such problems is presented. The main advantage of the proposed method is that it transforms problems under consideration into nonlinear systems of algebraic equations which can be simply solved. Error analysis of the proposed method is investigated and also the efficiency of this method is shown on some concrete examples. The obtained results reveal that the proposed method is very accurate and efficient. As two useful applications, the proposed method is applied to obtain approximate solutions of the stochastic population growth models and stochastic pendulum problem.
Yu, Peng; Shaw, Chad A
2014-06-01
The Dirichlet-multinomial (DMN) distribution is a fundamental model for multicategory count data with overdispersion. This distribution has many uses in bioinformatics including applications to metagenomics data, transcriptomics and alternative splicing. The DMN distribution reduces to the multinomial distribution when the overdispersion parameter ψ is 0. Unfortunately, numerical computation of the DMN log-likelihood function by conventional methods results in instability in the neighborhood of [Formula: see text]. An alternative formulation circumvents this instability, but it leads to long runtimes that make it impractical for large count data common in bioinformatics. We have developed a new method for computation of the DMN log-likelihood to solve the instability problem without incurring long runtimes. The new approach is composed of a novel formula and an algorithm to extend its applicability. Our numerical experiments show that this new method improves both the accuracy of log-likelihood evaluation and the runtime by several orders of magnitude, especially in high-count data situations that are common in deep sequencing data. Using real metagenomic data, our method achieves manyfold runtime improvement. Our method increases the feasibility of using the DMN distribution to model many high-throughput problems in bioinformatics. We have included in our work an R package giving access to this method and a vignette applying this approach to metagenomic data. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
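To make the trade-off described above concrete, the sketch below evaluates the DMN log-likelihood in two textbook ways: with log-gamma differences, and with explicit sums of logarithms whose cost grows with the counts. Neither is the new formula of the paper. The parameterization by concentration parameters `alpha` (one common convention relates them to the overdispersion by sum(alpha) = (1 − ψ)/ψ) and both function names are assumptions made for illustration.

```python
import math


def dmn_loglik_lgamma(counts, alpha):
    """DMN log-likelihood via log-gamma differences.

    Cheap, but the differences lgamma(x + a) - lgamma(a) lose accuracy
    through cancellation when the alpha values become very large.
    """
    n, a = sum(counts), sum(alpha)
    ll = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    ll += math.lgamma(a) - math.lgamma(n + a)
    ll += sum(math.lgamma(x + al) - math.lgamma(al)
              for x, al in zip(counts, alpha))
    return ll


def dmn_loglik_sum(counts, alpha):
    """Same quantity using lgamma(x + a) - lgamma(a) = sum_{k<x} log(a + k).

    Avoids the cancellation but needs one logarithm per observed count.
    """
    n, a = sum(counts), sum(alpha)
    ll = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    ll -= sum(math.log(a + k) for k in range(n))
    ll += sum(math.log(al + k)
              for x, al in zip(counts, alpha) for k in range(x))
    return ll


if __name__ == "__main__":
    counts, alpha = [3, 2, 5], [0.5, 1.5, 2.0]
    print(dmn_loglik_lgamma(counts, alpha), dmn_loglik_sum(counts, alpha))
```

For moderate parameters the two functions agree to rounding error; the interest of the paper lies in the regime where the first becomes inaccurate and the second becomes slow.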
High accurate interpolation of NURBS tool path for CNC machine tools
NASA Astrophysics Data System (ADS)
Liu, Qiang; Liu, Huan; Yuan, Songmei
2016-09-01
Feedrate fluctuation caused by approximation errors of interpolation methods has great effects on machining quality in NURBS interpolation, but at present few methods can efficiently eliminate it or reduce it to a satisfactory level without sacrificing computing efficiency. In order to solve this problem, a highly accurate interpolation method for NURBS tool paths is proposed. The proposed method can efficiently reduce the feedrate fluctuation by forming a quartic equation with respect to the curve parameter increment, which can be efficiently solved by analytic methods in real time. Theoretically, the proposed method can totally eliminate the feedrate fluctuation for any 2nd degree NURBS curve and can interpolate 3rd degree NURBS curves with minimal feedrate fluctuation. Moreover, a smooth feedrate planning algorithm is also proposed to generate smooth tool motion while considering multiple constraints and scheduling errors by an efficient planning strategy. Experiments are conducted to verify the feasibility and applicability of the proposed method. This research presents a novel NURBS interpolation method with not only high accuracy but also satisfactory computing efficiency.
Efficient Computational Prototyping of Mixed Technology Microfluidic Components and Systems
2002-08-01
AFRL-IF-RS-TR-2002-190 Final Technical Report August 2002 EFFICIENT COMPUTATIONAL PROTOTYPING OF MIXED TECHNOLOGY MICROFLUIDIC...SUBTITLE EFFICIENT COMPUTATIONAL PROTOTYPING OF MIXED TECHNOLOGY MICROFLUIDIC COMPONENTS AND SYSTEMS 6. AUTHOR(S) Narayan R. Aluru, Jacob White...Aided Design (CAD) tools for microfluidic components and systems were developed in this effort. Innovative numerical methods and algorithms for mixed
Texture functions in image analysis: A computationally efficient solution
NASA Technical Reports Server (NTRS)
Cox, S. C.; Rose, J. F.
1983-01-01
A computationally efficient means for calculating texture measurements from digital images by use of the co-occurrence technique is presented. The calculation of the statistical descriptors of image texture and a solution that circumvents the need for calculating and storing a co-occurrence matrix are discussed. The results show that existing efficient algorithms for calculating sums, sums of squares, and cross products can be used to compute complex co-occurrence relationships directly from the digital image input.
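One way to see how a co-occurrence statistic can be obtained without storing the co-occurrence matrix is the contrast measure: the matrix-weighted sum of (i − j)² over gray-level pairs equals the mean of (a − b)² over pixel pairs, which expands into sums of squares and a cross product. The sketch below compares both routes on a tiny image; it is an illustration of that identity under assumed names (`contrast_via_matrix`, `contrast_direct`) and is not the authors' algorithm.

```python
def _pixel_pairs(img, dx, dy):
    """Yield (a, b) gray-level pairs for the displacement (dx, dy) inside the image."""
    rows, cols = len(img), len(img[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                yield img[r][c], img[r2][c2]


def contrast_via_matrix(img, dx, dy, levels):
    """Contrast computed the conventional way, from an explicit co-occurrence matrix."""
    glcm = [[0] * levels for _ in range(levels)]
    pairs = 0
    for a, b in _pixel_pairs(img, dx, dy):
        glcm[a][b] += 1
        pairs += 1
    return sum((i - j) ** 2 * glcm[i][j]
               for i in range(levels) for j in range(levels)) / pairs


def contrast_direct(img, dx, dy):
    """Same contrast from running sums of squares and cross products only."""
    s_aa = s_bb = s_ab = pairs = 0
    for a, b in _pixel_pairs(img, dx, dy):
        s_aa += a * a
        s_bb += b * b
        s_ab += a * b
        pairs += 1
    # mean of (a - b)^2 = (sum a^2 + sum b^2 - 2 * sum a*b) / number of pairs
    return (s_aa + s_bb - 2 * s_ab) / pairs


if __name__ == "__main__":
    image = [[0, 0, 1], [1, 2, 3], [3, 3, 0]]
    print(contrast_via_matrix(image, 1, 0, 4), contrast_direct(image, 1, 0))
```

The direct version needs only a few accumulators regardless of the number of gray levels, whereas the matrix version needs storage quadratic in `levels`.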
Drawert, Brian; Lawson, Michael J; Petzold, Linda; Khammash, Mustafa
2010-02-21
We have developed a computational framework for accurate and efficient simulation of stochastic spatially inhomogeneous biochemical systems. The new computational method employs a fractional step hybrid strategy. A novel formulation of the finite state projection (FSP) method, called the diffusive FSP method, is introduced for the efficient and accurate simulation of diffusive transport. Reactions are handled by the stochastic simulation algorithm.
Brandenburg, Jan Gerit; Caldeweyher, Eike; Grimme, Stefan
2016-06-21
We extend the recently introduced PBEh-3c global hybrid density functional [S. Grimme et al., J. Chem. Phys., 2015, 143, 054107] by a screened Fock exchange variant based on the Henderson-Janesko-Scuseria exchange hole model. While the excellent performance of the global hybrid is maintained for small covalently bound molecules, its performance for computed condensed phase mass densities is further improved. Most importantly, a speed up of 30 to 50% can be achieved and especially for small orbital energy gap cases, the method is numerically much more robust. The latter point is important for many applications, e.g., for metal-organic frameworks, organic semiconductors, or protein structures. This enables an accurate density functional based electronic structure calculation of a full DNA helix structure on a single core desktop computer which is presented as an example in addition to comprehensive benchmark results.
Li, Pei-Nan; Li, Hong; Wu, Mo-Li; Wang, Shou-Yu; Kong, Qing-You; Zhang, Zhen; Sun, Yuan; Liu, Jia; Lv, De-Cheng
2012-01-01
Wound measurement is an objective and direct way to trace the course of wound healing and to evaluate therapeutic efficacy. Nevertheless, the accuracy and efficiency of the current measurement methods need to be improved. Taking advantage of the reliability of transparency tracing and the accuracy of computer-aided digital imaging, a transparency-based digital imaging approach is established, by which data from 340 wound tracings were collected from 6 experimental groups (8 rats/group) at 8 experimental time points (Day 1, 3, 5, 7, 10, 12, 14 and 16) and archived in order onto a transparency model sheet. This sheet was scanned and its image was saved in JPG format. Since a set of standard area units from 1 mm² to 1 cm² was integrated into the sheet, the tracing areas in the JPG image were measured directly, using the “Magnetic lasso tool” in the Adobe Photoshop program. The pixel values (PVs) of individual outlined regions were obtained and recorded at an average speed of 27 seconds per region. All PV data were saved in an Excel sheet and their corresponding areas were calculated simultaneously by the formula Y (PV of the outlined region)/X (PV of standard area unit) × Z (area of standard unit). It took a researcher less than 3 hours to finish the area calculation of 340 regions. In contrast, over 3 hours were expended by three skilled researchers to accomplish the above work with the traditional transparency-based method. Moreover, unlike the results obtained traditionally, little variation was found among the data calculated by different persons and with standard area units of different sizes and shapes. Given its accurate, reproducible and efficient properties, this transparency-based digital imaging approach would be of significant value in basic wound healing research and clinical practice. PMID:22666449
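The area formula quoted above, Y / X × Z, and the reported throughput can be checked with a few lines of arithmetic. The pixel values in the usage example are hypothetical and chosen only to illustrate the calculation; the function names are likewise assumptions.

```python
def region_area(pv_region, pv_standard, area_standard):
    """Area of an outlined region: Y / X * Z.

    pv_region      Y, pixel value of the outlined region
    pv_standard    X, pixel value of the standard area unit
    area_standard  Z, physical area of the standard unit (e.g. in mm^2)
    """
    return pv_region / pv_standard * area_standard


def total_hours(n_regions, seconds_per_region):
    """Total measuring time in hours at a fixed average speed per region."""
    return n_regions * seconds_per_region / 3600.0


if __name__ == "__main__":
    # Hypothetical pixel values: a region of 500 pixels against a 1 mm^2 unit of 100 pixels.
    print(region_area(500, 100, 1.0), "mm^2")
    # 340 regions at 27 seconds each, as reported in the abstract.
    print(total_hours(340, 27), "hours")
```

At 27 seconds per region, 340 regions take 9180 seconds, or 2.55 hours, which is consistent with the statement that one researcher needed less than 3 hours.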
Efficient computation of photonic crystal waveguide modes with dispersive material.
Schmidt, Kersten; Kappeler, Roman
2010-03-29
The optimization of PhC waveguides is a key issue for successfully designing PhC devices. Since this design task is computationally expensive, efficient methods are demanded. The available codes for computing photonic bands are also applied to PhC waveguides. They are reliable but not very efficient, which is even more pronounced for dispersive material. We present a method based on higher order finite elements with curved cells, which allows solving for the band structure while directly taking into account the dispersiveness of the materials. This is accomplished by reformulating the wave equations as a linear eigenproblem in the complex wave-vectors k. For this method, we demonstrate the high efficiency for the computation of guided PhC waveguide modes by a convergence analysis.
DeGregorio, Nicole; Iyengar, Srinivasan S
2018-01-09
We present two sampling measures to gauge critical regions of potential energy surfaces. These sampling measures employ (a) the instantaneous quantum wavepacket density, an approximation to the (b) potential surface, its (c) gradients, and (d) a Shannon information theory based expression that estimates the local entropy associated with the quantum wavepacket. These four criteria together enable a directed sampling of potential surfaces that appears to correctly describe the local oscillation frequencies, or the local Nyquist frequency, of a potential surface. The sampling functions are then utilized to derive a tessellation scheme that discretizes the multidimensional space to enable efficient sampling of potential surfaces. The sampled potential surface is then combined with four different interpolation procedures, namely, (a) local Hermite curve interpolation, (b) low-pass filtered Lagrange interpolation, (c) the monomial symmetrization approximation (MSA) developed by Bowman and co-workers, and (d) a modified Shepard algorithm. The sampling procedure and the fitting schemes are used to compute (a) potential surfaces in highly anharmonic hydrogen-bonded systems and (b) study hydrogen-transfer reactions in biogenic volatile organic compounds (isoprene) where the transferring hydrogen atom is found to demonstrate critical quantum nuclear effects. In the case of isoprene, the algorithm discussed here is used to derive multidimensional potential surfaces along a hydrogen-transfer reaction path to gauge the effect of quantum-nuclear degrees of freedom on the hydrogen-transfer process. Based on the decreased computational effort, facilitated by the optimal sampling of the potential surfaces through the use of sampling functions discussed here, and the accuracy of the associated potential surfaces, we believe the method will find great utility in the study of quantum nuclear dynamics problems, of which application to hydrogen-transfer reactions and hydrogen
CCOMP: An efficient algorithm for complex roots computation of determinantal equations
NASA Astrophysics Data System (ADS)
Zouros, Grigorios P.
2018-01-01
In this paper a free Python algorithm, entitled CCOMP (Complex roots COMPutation), is developed for the efficient computation of complex roots of determinantal equations inside a prescribed complex domain. The key to the method presented is the efficient determination of the candidate points inside the domain in whose close neighborhood a complex root may lie. Once these points are detected, the algorithm proceeds to a two-dimensional minimization problem with respect to the minimum modulus eigenvalue of the system matrix. At the core of CCOMP are three sub-algorithms whose tasks are the efficient estimation of the minimum modulus eigenvalues of the system matrix inside the prescribed domain, the efficient computation of candidate points which guarantee the existence of minima, and finally, the computation of minima via bound constrained minimization algorithms. Theoretical results and heuristics support the development and the performance of the algorithm, which is discussed in detail. CCOMP supports general complex matrices, and its efficiency, applicability and validity are demonstrated on a variety of microwave applications.
The Effect of Computer Automation on Institutional Review Board (IRB) Office Efficiency
ERIC Educational Resources Information Center
Oder, Karl; Pittman, Stephanie
2015-01-01
Companies purchase computer systems to make their processes more efficient through automation. Some academic medical centers (AMC) have purchased computer systems for their institutional review boards (IRB) to increase efficiency and compliance with regulations. IRB computer systems are expensive to purchase, deploy, and maintain. An AMC should…
Changing computing paradigms towards power efficiency
Klavík, Pavel; Malossi, A. Cristiano I.; Bekas, Costas; Curioni, Alessandro
2014-01-01
Power awareness is fast becoming immensely important in computing, ranging from the traditional high-performance computing applications to the new generation of data centric workloads. In this work, we describe our efforts towards a power-efficient computing paradigm that combines low- and high-precision arithmetic. We showcase our ideas for the widely used kernel of solving systems of linear equations that finds numerous applications in scientific and engineering disciplines as well as in large-scale data analytics, statistics and machine learning. Towards this goal, we developed tools for the seamless power profiling of applications at a fine-grain level. In addition, we verify here previous work on post-FLOPS/W metrics and show that these can shed much more light in the power/energy profile of important applications. PMID:24842033
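A classic way to combine low- and high-precision arithmetic when solving linear systems is mixed-precision iterative refinement: solve in low precision, compute the residual in high precision, and correct. The abstract does not state which algorithm the authors use, so the sketch below is only a generic illustration of the idea. Single precision is emulated by rounding through `struct`, the elimination has no pivoting (an assumption that suits the small, well-conditioned test matrix), and all names are hypothetical.

```python
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def solve_low_precision(a, b):
    """Gaussian elimination without pivoting, rounding every operation to float32."""
    n = len(b)
    m = [[to_f32(v) for v in row] + [to_f32(rhs)] for row, rhs in zip(a, b)]
    for i in range(n):
        for k in range(i + 1, n):
            factor = to_f32(m[k][i] / m[i][i])
            for j in range(i, n + 1):
                m[k][j] = to_f32(m[k][j] - to_f32(factor * m[i][j]))
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n]
        for j in range(i + 1, n):
            s = to_f32(s - to_f32(m[i][j] * x[j]))
        x[i] = to_f32(s / m[i][i])
    return x


def refine(a, b, n_iter=5):
    """Iterative refinement: double-precision residuals, low-precision corrections."""
    n = len(b)
    x = [0.0] * n
    for _ in range(n_iter):
        # Residual r = b - A x evaluated in double precision.
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Correction from the cheap low-precision solver.
        d = solve_low_precision(a, r)
        x = [xi + di for xi, di in zip(x, d)]
    return x


if __name__ == "__main__":
    a = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
    b = [1.0, 2.0, 3.0]
    print(refine(a, b))
```

The exact solution of the test system is (2/9, 1/9, 13/9); a few refinement sweeps recover it to roughly double-precision accuracy even though every solve is carried out in emulated single precision.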
NASA Technical Reports Server (NTRS)
Kiris, Cetin; Kwak, Dochan
2001-01-01
Two numerical procedures, one based on the artificial compressibility method and the other on the pressure projection method, are outlined for obtaining time-accurate solutions of the incompressible Navier-Stokes equations. The performance of the two methods is compared by obtaining unsteady solutions for the evolution of twin vortices behind a flat plate. Calculated results are compared with experimental and other numerical results. For an unsteady flow which requires a small physical time step, the pressure projection method was found to be computationally efficient since it does not require any subiteration procedure. It was observed that the artificial compressibility method requires a fast convergence scheme at each physical time step in order to satisfy the incompressibility condition. This was obtained by using a GMRES-ILU(0) solver in our computations. When a line-relaxation scheme was used, the time accuracy was degraded and time-accurate computations became very expensive.
Accurate chemical master equation solution using multi-finite buffers
Cao, Youfang; Terebus, Anna; Liang, Jie
2016-06-29
Here, the discrete chemical master equation (dCME) provides a fundamental framework for studying stochasticity in mesoscopic networks. Because of the multiscale nature of many networks where reaction rates have a large disparity, directly solving dCMEs is intractable due to the exploding size of the state space. It is important to truncate the state space effectively with quantified errors, so accurate solutions can be computed. It is also important to know if all major probabilistic peaks have been computed. Here we introduce the accurate CME (ACME) algorithm for obtaining direct solutions to dCMEs. With multifinite buffers for reducing the state space by $O(n!)$, exact steady-state and time-evolving network probability landscapes can be computed. We further describe a theoretical framework of aggregating microstates into a smaller number of macrostates by decomposing a network into independent aggregated birth and death processes and give an a priori method for rapidly determining steady-state truncation errors. The maximal sizes of the finite buffers for a given error tolerance can also be precomputed without costly trial solutions of dCMEs. We show exactly computed probability landscapes of three multiscale networks, namely, a 6-node toggle switch, 11-node phage-lambda epigenetic circuit, and 16-node MAPK cascade network, the latter two with no known solutions. We also show how probabilities of rare events can be computed from first-passage times, another class of unsolved problems challenging for simulation-based techniques due to large separations in time scales. Overall, the ACME method enables accurate and efficient solutions of the dCME for a large class of networks.
Funnel metadynamics as accurate binding free-energy method
Limongelli, Vittorio; Bonomi, Massimiliano; Parrinello, Michele
2013-01-01
A detailed description of the events ruling ligand/protein interaction and an accurate estimation of the drug affinity to its target are of great help in speeding drug discovery strategies. We have developed a metadynamics-based approach, named funnel metadynamics, that allows the ligand to enhance the sampling of the target binding sites and its solvated states. This method leads to an efficient characterization of the binding free-energy surface and an accurate calculation of the absolute protein–ligand binding free energy. We illustrate our protocol in two systems, benzamidine/trypsin and SC-558/cyclooxygenase 2. In both cases, the X-ray conformation has been found as the lowest free-energy pose, and the computed protein–ligand binding free energy is in good agreement with experiments. Furthermore, funnel metadynamics unveils important information about the binding process, such as the presence of alternative binding modes and the role of waters. The results achieved at an affordable computational cost make funnel metadynamics a valuable method for drug discovery and for dealing with a variety of problems in chemistry, physics, and material science. PMID:23553839
Computationally Efficient Clustering of Audio-Visual Meeting Data
NASA Astrophysics Data System (ADS)
Hung, Hayley; Friedland, Gerald; Yeo, Chuohao
This chapter presents novel computationally efficient algorithms to extract semantically meaningful acoustic and visual events related to each of the participants in a group discussion using the example of business meeting recordings. The recording setup involves relatively few audio-visual sensors, comprising a limited number of cameras and microphones. We first demonstrate computationally efficient algorithms that can identify who spoke and when, a problem in speech processing known as speaker diarization. We also extract visual activity features efficiently from MPEG4 video by taking advantage of the processing that was already done for video compression. Then, we present a method of associating the audio-visual data together so that the content of each participant can be managed individually. The methods presented in this article can be used as a principal component that enables many higher-level semantic analysis tasks needed in search, retrieval, and navigation.
NASA Astrophysics Data System (ADS)
Valasek, Lukas; Glasa, Jan
2017-12-01
Current fire simulation systems are able to exploit available high-performance computing (HPC) platforms and to model fires efficiently in parallel. In this paper, the efficiency of a corridor fire simulation on an HPC cluster is discussed. The parallel MPI version of Fire Dynamics Simulator is used to test the efficiency of selected strategies for allocating the cluster's computational resources when a greater number of computational cores is used. Simulation results indicate that if the number of cores used is not equal to a multiple of the total number of cluster node cores, there are allocation strategies which provide more efficient calculations.
NASA Astrophysics Data System (ADS)
Xu, M.; van Overloop, P. J.; van de Giesen, N. C.
2011-02-01
Model predictive control (MPC) of open channel flow is becoming an important tool in water management. The complexity of the prediction model has a large influence on the MPC application in terms of control effectiveness and computational efficiency. The Saint-Venant equations, called the SV model in this paper, are accurate but computationally costly, whereas the Integrator Delay (ID) model is simple but restricted to a limited range of flow changes. In this paper, a reduced Saint-Venant (RSV) model is developed through a model reduction technique, Proper Orthogonal Decomposition (POD), on the SV equations. The RSV model keeps the main flow dynamics and functions over a large flow range but is easier to implement in MPC. In the test case of a modeled canal reach, the numbers of states and disturbances in the RSV model are about 45 and 16 times smaller, respectively, than in the SV model. The computational time of MPC with the RSV model is significantly reduced, while the controller remains effective. Thus, the RSV model is a promising means to balance the control effectiveness and computational efficiency.
A computationally efficient modelling of laminar separation bubbles
NASA Technical Reports Server (NTRS)
Dini, Paolo; Maughmer, Mark D.
1989-01-01
The goal is to accurately predict the characteristics of the laminar separation bubble and its effects on airfoil performance. Toward this end, a computational model of the separation bubble was developed and incorporated into the Eppler and Somers airfoil design and analysis program. Thus far, the focus of the research was limited to the development of a model which can accurately predict situations in which the interaction between the bubble and the inviscid velocity distribution is weak, the so-called short bubble. A summary of the research performed in the past nine months is presented. The bubble model in its present form is then described. Lastly, the performance of this model in predicting bubble characteristics is shown for a few cases.
Changing computing paradigms towards power efficiency.
Klavík, Pavel; Malossi, A Cristiano I; Bekas, Costas; Curioni, Alessandro
2014-06-28
Power awareness is fast becoming immensely important in computing, ranging from the traditional high-performance computing applications to the new generation of data-centric workloads. In this work, we describe our efforts towards a power-efficient computing paradigm that combines low- and high-precision arithmetic. We showcase our ideas for the widely used kernel of solving systems of linear equations that finds numerous applications in scientific and engineering disciplines as well as in large-scale data analytics, statistics and machine learning. Towards this goal, we developed tools for the seamless power profiling of applications at a fine-grain level. In addition, we verify here previous work on post-FLOPS/W metrics and show that these can shed much more light on the power/energy profile of important applications. © 2014 The Author(s) Published by the Royal Society. All rights reserved.
Methods for Computationally Efficient Structured CFD Simulations of Complex Turbomachinery Flows
NASA Technical Reports Server (NTRS)
Herrick, Gregory P.; Chen, Jen-Ping
2012-01-01
This research presents more efficient computational methods by which to perform multi-block structured Computational Fluid Dynamics (CFD) simulations of turbomachinery, thus facilitating higher-fidelity solutions of complicated geometries and their associated flows. This computational framework offers flexibility in allocating resources to balance process count and wall-clock computation time, while facilitating research interests of simulating axial compressor stall inception with more complete gridding of the flow passages and rotor tip clearance regions than is typically practiced with structured codes. The paradigm presented herein facilitates CFD simulation of previously impractical geometries and flows. These methods are validated and demonstrate improved computational efficiency when applied to complicated geometries and flows.
NASA Astrophysics Data System (ADS)
Hou, Peng-Fei; Zhang, Yang
2017-09-01
Because most piezoelectric functional devices, including sensors, actuators and energy harvesters, are in the form of a piezoelectric coated structure, it is valuable to present an accurate and efficient method for obtaining the electro-mechanical coupling fields of this coated structure under mechanical and electrical loads. With this aim, the two-dimensional Green’s function for a normal line force and line charge on the surface of a coated structure, which is a combination of an orthotropic piezoelectric coating and an orthotropic elastic substrate, is presented in the form of elementary functions based on the general solution method. The corresponding electro-mechanical coupling fields of this coated structure under arbitrary mechanical and electrical loads can then be obtained by the superposition principle and Gauss integration. Numerical results show that the presented method has high computational precision, efficiency and stability. It can be used to design the best coating thickness in functional devices, improve the sensitivity of sensors, and improve the efficiency of actuators and energy harvesters. This method could be an efficient tool for engineering applications.
An efficient method for computation of the manipulator inertia matrix
NASA Technical Reports Server (NTRS)
Fijany, Amir; Bejczy, Antal K.
1989-01-01
An efficient method of computation of the manipulator inertia matrix is presented. Using spatial notations, the method leads to the definition of the composite rigid-body spatial inertia, which is a spatial representation of the notion of augmented body. The previously proposed methods, the physical interpretations leading to their derivation, and their redundancies are analyzed. The proposed method achieves a greater efficiency by eliminating the redundancy in the intrinsic equations as well as by a better choice of coordinate frame for their projection. In this case, removing the redundancy leads to greater efficiency of the computation in both serial and parallel senses.
LCC-Demons: a robust and accurate symmetric diffeomorphic registration algorithm.
Lorenzi, M; Ayache, N; Frisoni, G B; Pennec, X
2013-11-01
Non-linear registration is a key instrument for computational anatomy to study the morphology of organs and tissues. However, in order to be an effective instrument for clinical practice, registration algorithms must be computationally efficient, accurate and most importantly robust to the multiple biases affecting medical images. In this work we propose a fast and robust registration framework based on the log-Demons diffeomorphic registration algorithm. The transformation is parameterized by stationary velocity fields (SVFs), and the similarity metric implements a symmetric local correlation coefficient (LCC). Moreover, we show how the SVF setting provides a stable and consistent numerical scheme for the computation of the Jacobian determinant and the flux of the deformation across the boundaries of a given region. Thus, it provides a robust evaluation of spatial changes. We tested the LCC-Demons in the inter-subject registration setting, by comparing with state-of-the-art registration algorithms on publicly available datasets, and in the intra-subject longitudinal registration problem, for the statistically powered measurements of the longitudinal atrophy in Alzheimer's disease. Experimental results show that LCC-Demons is a generic, flexible, efficient and robust algorithm for the accurate non-linear registration of images, which can find several applications in the field of medical imaging. Without any additional optimization, it solves equally well intra- and inter-subject registration problems, and compares favorably to state-of-the-art methods. Copyright © 2013 Elsevier Inc. All rights reserved.
Mezei, Pál D; Csonka, Gábor I; Ruzsinszky, Adrienn; Sun, Jianwei
2015-01-13
A correct description of the anion-π interaction is essential for the design of selective anion receptors and channels and important for advances in the field of supramolecular chemistry. However, it is challenging to do accurate, precise, and efficient calculations of this interaction, which are lacking in the literature. In this article, by testing sets of 20 binary anion-π complexes of fluoride, chloride, bromide, nitrate, or carbonate ions with hexafluorobenzene, 1,3,5-trifluorobenzene, 2,4,6-trifluoro-1,3,5-triazine, or 1,3,5-triazine and 30 ternary π-anion-π' sandwich complexes composed from the same monomers, we suggest domain-based local-pair natural orbital coupled cluster energies extrapolated to the complete basis-set limit as reference values. We give a detailed explanation of the origin of anion-π interactions, using the permanent quadrupole moments, static dipole polarizabilities, and electrostatic potential maps. We use symmetry-adapted perturbation theory (SAPT) to calculate the components of the anion-π interaction energies. We examine the performance of the direct random phase approximation (dRPA), the second-order screened exchange (SOSEX), local-pair natural-orbital (LPNO) coupled electron pair approximation (CEPA), and several dispersion-corrected density functionals (including generalized gradient approximation (GGA), meta-GGA, and double hybrid density functional). The LPNO-CEPA/1 results show the best agreement with the reference results. The dRPA method is only slightly less accurate and precise than the LPNO-CEPA/1, but it is considerably more efficient (6-17 times faster) for the binary complexes studied in this paper. For 30 ternary π-anion-π' sandwich complexes, we give dRPA interaction energies as reference values. The double hybrid functionals are much more efficient but less accurate and precise than dRPA. The dispersion-corrected double hybrid PWPB95-D3(BJ) and B2PLYP-D3(BJ) functionals perform better than the GGA and meta
Efficient computation of kinship and identity coefficients on large pedigrees.
Cheng, En; Elliott, Brendan; Ozsoyoglu, Z Meral
2009-06-01
With the rapidly expanding field of medical genetics and genetic counseling, genealogy information is becoming increasingly abundant. An important computation on pedigree data is the calculation of identity coefficients, which provide a complete description of the degree of relatedness of a pair of individuals. The areas of application of identity coefficients are numerous and diverse, from genetic counseling to disease tracking, and thus, the computation of identity coefficients merits special attention. However, the computation of identity coefficients is not done directly, but rather as the final step after computing a set of generalized kinship coefficients. In this paper, we first propose a novel Path-Counting Formula for calculating generalized kinship coefficients, which is motivated by Wright's path-counting method for computing inbreeding coefficient. We then present an efficient and scalable scheme for calculating generalized kinship coefficients on large pedigrees using NodeCodes, a special encoding scheme for expediting the evaluation of queries on pedigree graph structures. Furthermore, we propose an improved scheme using Family NodeCodes for the computation of generalized kinship coefficients, which is motivated by the significant improvement of using Family NodeCodes for inbreeding coefficient over the use of NodeCodes. We also perform experiments for evaluating the efficiency of our method, and compare it with the performance of the traditional recursive algorithm for three individuals. Experimental results demonstrate that the resulting scheme is more scalable and efficient than the traditional recursive methods for computing generalized kinship coefficients.
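The abstract above compares its NodeCodes schemes against "the traditional recursive algorithm". As a point of reference, the sketch below shows the textbook recursion for the ordinary two-individual kinship coefficient. It is not the generalized kinship computation or the Path-Counting Formula proposed in the paper. The pedigree layout (a dict mapping each individual to its parents, listed parents-first) and the name `make_kinship` are assumptions made for illustration.

```python
from functools import lru_cache


def make_kinship(pedigree):
    """Return a function phi(a, b) giving the kinship coefficient of a and b.

    pedigree: dict mapping name -> (father, mother); founders map to (None, None).
    Assumption: the dict lists every parent before any of its children.
    """
    order = {name: i for i, name in enumerate(pedigree)}

    @lru_cache(maxsize=None)
    def phi(a, b):
        if a is None or b is None:
            return 0.0  # unknown parents are treated as unrelated founders
        if a == b:
            father, mother = pedigree[a]
            # Kinship with oneself is (1 + inbreeding coefficient) / 2.
            return 0.5 * (1.0 + phi(father, mother))
        # Recurse on the later-listed individual, which cannot be an
        # ancestor of the other one.
        if order[a] < order[b]:
            a, b = b, a
        father, mother = pedigree[a]
        return 0.5 * (phi(father, b) + phi(mother, b))

    return phi


# Hypothetical pedigree: two founders, two full siblings, and a child
# of the sibling pair.
pedigree = {
    "dad": (None, None),
    "mom": (None, None),
    "kid1": ("dad", "mom"),
    "kid2": ("dad", "mom"),
    "inbred": ("kid1", "kid2"),
}
phi = make_kinship(pedigree)
sibling_kinship = phi("kid1", "kid2")
parent_child_kinship = phi("dad", "kid1")
self_kinship_inbred = phi("inbred", "inbred")
```

Memoization keeps this small example fast, but the recursion can still revisit many ancestor pairs on deep pedigrees. That cost is the usual motivation for encoding-based schemes such as the one the abstract describes.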
A Computationally Efficient Method for Polyphonic Pitch Estimation
NASA Astrophysics Data System (ADS)
Zhou, Ruohua; Reiss, Joshua D.; Mattavelli, Marco; Zoia, Giorgio
2009-12-01
This paper presents a computationally efficient method for polyphonic pitch estimation. The method employs the Fast Resonator Time-Frequency Image (RTFI) as the basic time-frequency analysis tool. The approach is composed of two main stages. First, a preliminary pitch estimation is obtained by means of a simple peak-picking procedure in the pitch energy spectrum. This spectrum is calculated from the original RTFI energy spectrum according to harmonic grouping principles. Then the incorrect estimations are removed according to spectral irregularity and knowledge of the harmonic structures of the musical notes played on commonly used musical instruments. The new approach is compared with a variety of other frame-based polyphonic pitch estimation methods, and results demonstrate the high performance and computational efficiency of the approach.
Efficient and Flexible Computation of Many-Electron Wave Function Overlaps.
Plasser, Felix; Ruckenbauer, Matthias; Mai, Sebastian; Oppel, Markus; Marquetand, Philipp; González, Leticia
2016-03-08
A new algorithm for the computation of the overlap between many-electron wave functions is described. This algorithm allows for the extensive use of recurring intermediates and thus provides high computational efficiency. Because of the general formalism employed, overlaps can be computed for varying wave function types, molecular orbitals, basis sets, and molecular geometries. This paves the way for efficiently computing nonadiabatic interaction terms for dynamics simulations. In addition, other application areas can be envisaged, such as the comparison of wave functions constructed at different levels of theory. Aside from explaining the algorithm and evaluating the performance, a detailed analysis of the numerical stability of wave function overlaps is carried out, and strategies for overcoming potential severe pitfalls due to displaced atoms and truncated wave functions are presented.
Accurate Determination of Coulombic Efficiency for Lithium Metal Anodes and Lithium Metal Batteries
DOE Office of Scientific and Technical Information (OSTI.GOV)
Adams, Brian D.; Zheng, Jianming; Ren, Xiaodi
Lithium (Li) metal is an ideal anode material for high energy density batteries. However, its low Coulombic efficiency (CE) and formation of dendrites during the plating and stripping processes have hindered its applications in rechargeable Li metal batteries. The accurate measurement of Li CE is a critical factor in predicting the cycle life of Li metal batteries, but the measurement of Li CE is affected by various factors, which often leads to conflicting values reported in the literature. Here, we investigate various factors that affect the measurement of Li CE and propose a more accurate method of determining Li CE. It was also found that the capacity used for cycling greatly affects the stabilization cycles and the average CE. A higher cycling capacity leads to a smaller number of stabilization cycles and a higher average CE. With a proper high-concentration ether-based electrolyte, Li metal can be cycled with a high average CE of 99.5% for over 100 cycles at a high capacity of 6 mAh cm⁻² suitable for practical applications.
Demystifying the GMAT: Computer-Based Testing Terms
ERIC Educational Resources Information Center
Rudner, Lawrence M.
2012-01-01
Computer-based testing can be a powerful means to make all aspects of test administration not only faster and more efficient, but also more accurate and more secure. While the Graduate Management Admission Test (GMAT) exam is a computer adaptive test, there are other approaches. This installment presents a primer of computer-based testing terms.
Neic, Aurel; Campos, Fernando O; Prassl, Anton J; Niederer, Steven A; Bishop, Martin J; Vigmond, Edward J; Plank, Gernot
2017-10-01
Anatomically accurate and biophysically detailed bidomain models of the human heart have proven a powerful tool for gaining quantitative insight into the links between electrical sources in the myocardium and the concomitant current flow in the surrounding medium as they represent their relationship mechanistically based on first principles. Such models are increasingly considered as a clinical research tool with the perspective of being used, ultimately, as a complementary diagnostic modality. An important prerequisite in many clinical modeling applications is the ability of models to faithfully replicate potential maps and electrograms recorded from a given patient. However, while the personalization of electrophysiology models based on the gold standard bidomain formulation is in principle feasible, the associated computational expenses are significant, rendering their use incompatible with clinical time frames. In this study we report on the development of a novel computationally efficient reaction-eikonal (R-E) model for modeling extracellular potential maps and electrograms. Using a biventricular human electrophysiology model, which incorporates a topologically realistic His-Purkinje system (HPS), we demonstrate by comparing against a high-resolution reaction-diffusion (R-D) bidomain model that the R-E model predicts extracellular potential fields, electrograms as well as ECGs at the body surface with high fidelity and offers vast computational savings greater than three orders of magnitude. Due to their efficiency R-E models are ideally suitable for forward simulations in clinical modeling studies which attempt to personalize electrophysiological model features.
Positive Wigner functions render classical simulation of quantum computation efficient.
Mari, A; Eisert, J
2012-12-07
We show that quantum circuits where the initial state and all the following quantum operations can be represented by positive Wigner functions can be classically efficiently simulated. This is true both for continuous-variable as well as discrete variable systems in odd prime dimensions, two cases which will be treated on entirely the same footing. Noting the fact that Clifford and Gaussian operations preserve the positivity of the Wigner function, our result generalizes the Gottesman-Knill theorem. Our algorithm provides a way of sampling from the output distribution of a computation or a simulation, including the efficient sampling from an approximate output distribution in the case of sampling imperfections for initial states, gates, or measurements. In this sense, this work highlights the role of the positive Wigner function as separating classically efficiently simulable systems from those that are potentially universal for quantum computing and simulation, and it emphasizes the role of negativity of the Wigner function as a computational resource.
Spin-neurons: A possible path to energy-efficient neuromorphic computers
NASA Astrophysics Data System (ADS)
Sharad, Mrigank; Fan, Deliang; Roy, Kaushik
2013-12-01
Recent years have witnessed growing interest in the field of brain-inspired computing based on neural-network architectures. In order to translate the related algorithmic models into powerful, yet energy-efficient cognitive-computing hardware, computing devices beyond CMOS may need to be explored. The suitability of such devices to this field of computing would strongly depend upon how closely their physical characteristics match the essential computing primitives employed in such models. In this work, we discuss the rationale of applying emerging spin-torque devices for bio-inspired computing. Recent spin-torque experiments have shown the path to low-current, low-voltage, and high-speed magnetization switching in nano-scale magnetic devices. Such magneto-metallic, current-mode spin-torque switches can mimic the analog summing and "thresholding" operation of an artificial neuron with high energy-efficiency. Comparison with a CMOS-based analog circuit model of a neuron shows that "spin-neurons" (spin-based circuit models of neurons) can achieve more than two orders of magnitude lower energy and beyond three orders of magnitude reduction in energy-delay product. The application of spin-neurons can therefore be an attractive option for neuromorphic computers of the future.
Accurate and Efficient Approximation to the Optimized Effective Potential for Exchange
NASA Astrophysics Data System (ADS)
Ryabinkin, Ilya G.; Kananenka, Alexei A.; Staroverov, Viktor N.
2013-07-01
We devise an efficient practical method for computing the Kohn-Sham exchange-correlation potential corresponding to a Hartree-Fock electron density. This potential is almost indistinguishable from the exact-exchange optimized effective potential (OEP) and, when used as an approximation to the OEP, is vastly better than all existing models. Using our method one can obtain unambiguous, nearly exact OEPs for any reasonable finite one-electron basis set at the same low cost as the Krieger-Li-Iafrate and Becke-Johnson potentials. For all practical purposes, this solves the long-standing problem of black-box construction of OEPs in exact-exchange calculations.
The thermodynamic efficiency of computations made in cells across the range of life
NASA Astrophysics Data System (ADS)
Kempes, Christopher P.; Wolpert, David; Cohen, Zachary; Pérez-Mercader, Juan
2017-11-01
Biological organisms must perform computation as they grow, reproduce and evolve. Moreover, ever since Landauer's bound was proposed, it has been known that all computation has some thermodynamic cost, and that the same computation can be achieved with greater or smaller thermodynamic cost depending on how it is implemented. Accordingly an important issue concerning the evolution of life is assessing the thermodynamic efficiency of the computations performed by organisms. This issue is interesting both from the perspective of how close life has come to maximally efficient computation (presumably under the pressure of natural selection), and from the practical perspective of what efficiencies we might hope that engineered biological computers might achieve, especially in comparison with current computational systems. Here we show that the computational efficiency of translation, defined as free energy expended per amino acid operation, outperforms the best supercomputers by several orders of magnitude, and is only about an order of magnitude worse than the Landauer bound. However, this efficiency depends strongly on the size and architecture of the cell in question. In particular, we show that the useful efficiency of an amino acid operation, defined as the bulk energy per amino acid polymerization, decreases for increasing bacterial size and converges to the polymerization cost of the ribosome. This cost of the largest bacteria does not change in cells as we progress through the major evolutionary shifts to both single- and multicellular eukaryotes. However, the rates of total computation per unit mass are non-monotonic in bacteria with increasing cell size, and also change across different biological architectures, including the shift from unicellular to multicellular eukaryotes. This article is part of the themed issue 'Reconceptualizing the origins of life'.
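For scale, the Landauer bound referred to above is the minimum heat dissipated when one bit is erased, k_B T ln 2. The sketch below evaluates it and expresses a given energy per operation as a number of orders of magnitude above the bound. The temperature of 300 K and the helper names are assumptions chosen for illustration; no measured energy values from the paper are reproduced here.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact value in the 2019 SI)


def landauer_bound(temperature_kelvin):
    """Minimum energy in joules to erase one bit at the given temperature."""
    return K_B * temperature_kelvin * math.log(2)


def orders_above_bound(energy_per_operation, temperature_kelvin):
    """How many orders of magnitude an energy cost lies above the bound."""
    return math.log10(energy_per_operation / landauer_bound(temperature_kelvin))


# Illustrative temperature only (assumption); roughly room temperature.
bound_300k = landauer_bound(300.0)  # about 2.87e-21 J per bit

# A process costing ten times the bound is one order of magnitude above it,
# which is the kind of gap the abstract reports for translation.
one_order = orders_above_bound(10.0 * bound_300k, 300.0)
```

Because the bound scales linearly with temperature, comparisons between cells and hardware should use the operating temperature of each system.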
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chiang, Patrick
2014-01-31
The research goal of this CAREER proposal is to develop energy-efficient VLSI interconnect circuits and systems that will facilitate future massively-parallel, high-performance computing. Extreme-scale computing will exhibit massive parallelism on multiple vertical levels, from thousands of computational units on a single processor to thousands of processors in a single data center. Unfortunately, the energy required to communicate between these units at every level (on chip, off-chip, off-rack) will be the critical limitation to energy efficiency. Therefore, the PI's career goal is to become a leading researcher in the design of energy-efficient VLSI interconnect for future computing systems.
Computer Architecture for Energy Efficient SFQ
2014-08-27
IBM Corporation (T.J. Watson Research Laboratory) 1101 Kitchawan Road Yorktown Heights, NY 10598 -0000 2 ABSTRACT Number of Papers published in peer...accomplished during this ARO-sponsored project at IBM Research to identify and model an energy efficient SFQ-based computer architecture. The... IBM Windsor Blue (WB), illustrated schematically in Figure 2. The basic building block of WB is a "tile" comprised of a 64-bit arithmetic logic unit
Efficient conjugate gradient algorithms for computation of the manipulator forward dynamics
NASA Technical Reports Server (NTRS)
Fijany, Amir; Scheid, Robert E.
1989-01-01
The applicability of conjugate gradient algorithms for computation of the manipulator forward dynamics is investigated. The redundancies in the previously proposed conjugate gradient algorithm are analyzed. A new version is developed which, by avoiding these redundancies, achieves a significantly greater efficiency. A preconditioned conjugate gradient algorithm is also presented. A diagonal matrix whose elements are the diagonal elements of the inertia matrix is proposed as the preconditioner. In order to increase the computational efficiency, an algorithm is developed which exploits the synergism between the computation of the diagonal elements of the inertia matrix and that required by the conjugate gradient algorithm.
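The preconditioning idea described above can be sketched generically: conjugate gradients applied to a symmetric positive-definite system, with the diagonal of the matrix used as the preconditioner. This is a textbook Jacobi-preconditioned CG in plain Python, not the authors' redundancy-free formulation; the function name `jacobi_pcg` and the dense list-of-lists matrix are assumptions made for illustration. In the forward dynamics setting, `A` would play the role of the inertia matrix.

```python
def jacobi_pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A using conjugate
    gradients with the diagonal preconditioner M = diag(A)."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]

    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                # converged
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                   # step length along p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x
```

The only extra work per iteration relative to plain CG is one elementwise scaling by the inverse diagonal, which is why a diagonal preconditioner is cheap when the diagonal entries are already available.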
NASA Technical Reports Server (NTRS)
Marconi, F.; Salas, M.; Yaeger, L.
1976-01-01
A numerical procedure has been developed to compute the inviscid super/hypersonic flow field about complex vehicle geometries accurately and efficiently. A second order accurate finite difference scheme is used to integrate the three dimensional Euler equations in regions of continuous flow, while all shock waves are computed as discontinuities via the Rankine Hugoniot jump conditions. Conformal mappings are used to develop a computational grid. The effects of blunt nose entropy layers are computed in detail. Real gas effects for equilibrium air are included using curve fits of Mollier charts. Typical calculated results for shuttle orbiter, hypersonic transport, and supersonic aircraft configurations are included to demonstrate the usefulness of this tool.
Accurate complex scaling of three dimensional numerical potentials
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cerioni, Alessandro; Genovese, Luigi; Duchemin, Ivan
2013-05-28
The complex scaling method, which consists in continuing spatial coordinates into the complex plane, is a well-established method that allows one to compute resonant eigenfunctions of the time-independent Schroedinger operator. Whenever it is desirable to apply the complex scaling to investigate resonances in physical systems defined on numerical discrete grids, the most direct approach relies on the application of a similarity transformation to the original, unscaled Hamiltonian. We show that such an approach can be conveniently implemented in the Daubechies wavelet basis set, featuring a very promising level of generality, high accuracy, and no need for artificial convergence parameters. Complex scaling of three dimensional numerical potentials can be efficiently and accurately performed. By carrying out an illustrative resonant state computation in the case of a one-dimensional model potential, we then show that our wavelet-based approach may disclose new exciting opportunities in the field of computational non-Hermitian quantum mechanics.
NASA Astrophysics Data System (ADS)
Berger, Lukas; Kleinheinz, Konstantin; Attili, Antonio; Bisetti, Fabrizio; Pitsch, Heinz; Mueller, Michael E.
2018-05-01
Modelling unclosed terms in partial differential equations typically involves two steps: First, a set of known quantities needs to be specified as input parameters for a model, and second, a specific functional form needs to be defined to model the unclosed terms by the input parameters. Both steps involve a certain modelling error, with the former known as the irreducible error and the latter referred to as the functional error. Typically, only the total modelling error, which is the sum of functional and irreducible error, is assessed, but the concept of the optimal estimator enables the separate analysis of the total and the irreducible errors, yielding a systematic modelling error decomposition. In this work, attention is paid to the techniques themselves required for the practical computation of irreducible errors. Typically, histograms are used for optimal estimator analyses, but this technique is found to add a non-negligible spurious contribution to the irreducible error if models with multiple input parameters are assessed. Thus, the error decomposition of an optimal estimator analysis becomes inaccurate, and misleading conclusions concerning modelling errors may be drawn. In this work, numerically accurate techniques for optimal estimator analyses are identified and a suitable evaluation of irreducible errors is presented. Four different computational techniques are considered: a histogram technique, artificial neural networks, multivariate adaptive regression splines, and an additive model based on a kernel method. For multiple input parameter models, only artificial neural networks and multivariate adaptive regression splines are found to yield satisfactorily accurate results. Beyond a certain number of input parameters, the assessment of models in an optimal estimator analysis even becomes practically infeasible if histograms are used. The optimal estimator analysis in this paper is applied to modelling the filtered soot intermittency in large eddy
Spin-neurons: A possible path to energy-efficient neuromorphic computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sharad, Mrigank; Fan, Deliang; Roy, Kaushik
Recent years have witnessed growing interest in the field of brain-inspired computing based on neural-network architectures. In order to translate the related algorithmic models into powerful, yet energy-efficient cognitive-computing hardware, computing-devices beyond CMOS may need to be explored. The suitability of such devices to this field of computing would strongly depend upon how closely their physical characteristics match with the essential computing primitives employed in such models. In this work, we discuss the rationale of applying emerging spin-torque devices for bio-inspired computing. Recent spin-torque experiments have shown the path to low-current, low-voltage, and high-speed magnetization switching in nano-scale magnetic devices. Such magneto-metallic, current-mode spin-torque switches can mimic the analog summing and “thresholding” operation of an artificial neuron with high energy-efficiency. Comparison with CMOS-based analog circuit-model of a neuron shows that “spin-neurons” (spin based circuit model of neurons) can achieve more than two orders of magnitude lower energy and beyond three orders of magnitude reduction in energy-delay product. The application of spin-neurons can therefore be an attractive option for neuromorphic computers of the future.
Efficient quantum circuits for one-way quantum computing.
Tanamoto, Tetsufumi; Liu, Yu-Xi; Hu, Xuedong; Nori, Franco
2009-03-13
While Ising-type interactions are ideal for implementing controlled phase flip gates in one-way quantum computing, natural interactions between solid-state qubits are most often described by either the XY or the Heisenberg models. We show an efficient way of generating cluster states directly using either the imaginary SWAP (iSWAP) gate for the XY model, or the sqrt[SWAP] gate for the Heisenberg model. Our approach thus makes one-way quantum computing more feasible for solid-state devices.
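The two gates named above have standard 4 × 4 matrix forms in the basis |00>, |01>, |10>, |11>. The sketch below only writes those matrices down and checks two basic properties (unitarity, and that sqrt[SWAP] squared gives SWAP); it does not reproduce the cluster-state construction of the paper. The helper names are assumptions made for this illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(u):
    """Conjugate transpose of a square matrix."""
    n = len(u)
    return [[complex(u[j][i]).conjugate() for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    """Elementwise comparison of two square matrices."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))


IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# SWAP exchanges |01> and |10>.
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

# iSWAP exchanges |01> and |10> and multiplies them by i.
ISWAP = [[1, 0, 0, 0], [0, 0, 1j, 0], [0, 1j, 0, 0], [0, 0, 0, 1]]

# sqrt[SWAP]: the matrix whose square is SWAP.
PLUS, MINUS = (1 + 1j) / 2, (1 - 1j) / 2
SQRT_SWAP = [[1, 0, 0, 0], [0, PLUS, MINUS, 0], [0, MINUS, PLUS, 0], [0, 0, 0, 1]]

if __name__ == "__main__":
    print(close(matmul(SQRT_SWAP, SQRT_SWAP), SWAP))
    print(close(matmul(dagger(ISWAP), ISWAP), IDENTITY))
```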
Computational efficiency improvements for image colorization
NASA Astrophysics Data System (ADS)
Yu, Chao; Sharma, Gaurav; Aly, Hussein
2013-03-01
We propose an efficient algorithm for colorization of greyscale images. As in prior work, colorization is posed as an optimization problem: a user specifies the color for a few scribbles drawn on the greyscale image and the color image is obtained by propagating color information from the scribbles to surrounding regions, while maximizing the local smoothness of colors. In this formulation, colorization is obtained by solving a large sparse linear system, which normally requires substantial computation and memory resources. Our algorithm improves the computational performance through three innovations over prior colorization implementations. First, the linear system is solved iteratively without explicitly constructing the sparse matrix, which significantly reduces the required memory. Second, we formulate each iteration in terms of integral images obtained by dynamic programming, reducing repetitive computation. Third, we use a coarse-to-fine framework, where a lower resolution subsampled image is first colorized and this low resolution color image is upsampled to initialize the colorization process for the fine level. The improvements we develop provide significant speedup and memory savings compared to the conventional approach of solving the linear system directly using off-the-shelf sparse solvers, and allow us to colorize images with typical sizes encountered in realistic applications on typical commodity computing platforms.
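The second innovation above relies on integral images (summed-area tables). A minimal sketch of that building block follows: the table is built in one dynamic-programming pass, after which the sum over any rectangular window costs four lookups. How the authors express each solver iteration in terms of such sums is not given in the abstract; the function names here are assumptions.

```python
def integral_image(img):
    """Summed-area table with a zero border.

    S[r][c] holds the sum of all img[i][j] with i < r and j < c.
    """
    rows, cols = len(img), len(img[0])
    S = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            # Dynamic-programming recurrence: reuse the three neighbouring sums.
            S[r + 1][c + 1] = img[r][c] + S[r][c + 1] + S[r + 1][c] - S[r][c]
    return S


def box_sum(S, r0, c0, r1, c1):
    """Sum of img over rows r0..r1-1 and columns c0..c1-1 in constant time."""
    return S[r1][c1] - S[r0][c1] - S[r1][c0] + S[r0][c0]
```

Because each window sum is independent of the window size, local averages over neighbourhoods can be recomputed every iteration without revisiting individual pixels.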
Efficient path-based computations on pedigree graphs with compact encodings
2012-01-01
A pedigree is a diagram of family relationships, and it is often used to determine the mode of inheritance (dominant, recessive, etc.) of genetic diseases. Along with rapidly growing knowledge of genetics and accumulation of genealogy information, pedigree data is becoming increasingly important. In large pedigree graphs, path-based methods for efficiently computing genealogical measurements, such as inbreeding and kinship coefficients of individuals, depend on efficient identification and processing of paths. In this paper, we propose a new compact path encoding scheme on large pedigrees, accompanied by an efficient algorithm for identifying paths. We demonstrate the utilization of our proposed method by applying it to the inbreeding coefficient computation. We present time and space complexity analysis, and also manifest the efficiency of our method for evaluating inbreeding coefficients as compared to previous methods by experimental results using pedigree graphs with real and synthetic data. Both theoretical and experimental results demonstrate that our method is more scalable and efficient than previous methods in terms of time and space requirements. PMID:22536898
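For readers unfamiliar with the quantity being computed, the inbreeding coefficient of an individual equals the kinship coefficient of its parents. The sketch below uses the textbook recursive definition of kinship, not the compact path encoding proposed in the paper. The pedigree layout (a dict mapping each integer id to a `(father, mother)` pair, with `None` for unknown parents and every parent id smaller than its children's ids) is an assumption made for this example.

```python
from functools import lru_cache


def make_kinship(pedigree):
    """Return a memoized kinship function for the given pedigree.

    pedigree maps id -> (father_id, mother_id); founders use (None, None).
    Ids are assumed to be ordered so that parents precede their children.
    """

    @lru_cache(maxsize=None)
    def kinship(a, b):
        if a is None or b is None:
            return 0.0                      # unknown parents are unrelated
        if a == b:
            father, mother = pedigree[a]
            return 0.5 * (1.0 + kinship(father, mother))
        if a < b:
            a, b = b, a                     # always expand the later-born one
        father, mother = pedigree[a]
        return 0.5 * (kinship(father, b) + kinship(mother, b))

    return kinship


def inbreeding(pedigree, individual):
    """Inbreeding coefficient = kinship coefficient of the two parents."""
    father, mother = pedigree[individual]
    return make_kinship(pedigree)(father, mother)
```

The recursion revisits the same ancestor pairs many times on large pedigrees, which is the cost that memoization here, and path-based encodings in the paper, aim to reduce.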
Matrix-vector multiplication using digital partitioning for more accurate optical computing
NASA Technical Reports Server (NTRS)
Gary, C. K.
1992-01-01
Digital partitioning offers a flexible means of increasing the accuracy of an optical matrix-vector processor. This algorithm can be implemented with the same architecture required for a purely analog processor, which gives optical matrix-vector processors the ability to perform high-accuracy calculations at speeds comparable with or greater than electronic computers as well as the ability to perform analog operations at a much greater speed. Digital partitioning is compared with digital multiplication by analog convolution, residue number systems, and redundant number representation in terms of the size and the speed required for an equivalent throughput as well as in terms of the hardware requirements. Digital partitioning and digital multiplication by analog convolution are found to be the most efficient algorithms if coding time and hardware are considered, and the architecture for digital partitioning permits the use of analog computations to provide the greatest throughput for a single processor.
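The abstract does not spell out the partitioning scheme, so the following is only one plausible reading, offered as a sketch: each high-precision operand is split into low-precision digits, every pair of digit planes is multiplied as a small matrix-vector product (the step a low-accuracy analog processor could perform), and the partial results are recombined digitally by shifting and adding. The names `split_digits` and `partitioned_matvec`, and the choice of 4-bit digits, are assumptions.

```python
def split_digits(value, bits, count):
    """Split a non-negative integer into `count` digits of `bits` bits each,
    least significant digit first."""
    mask = (1 << bits) - 1
    return [(value >> (bits * k)) & mask for k in range(count)]


def partitioned_matvec(matrix, vector, bits=4, count=2):
    """Matrix-vector product assembled from low-precision partial products.

    All entries must be non-negative integers below 2 ** (bits * count).
    """
    m_digits = [[split_digits(x, bits, count) for x in row] for row in matrix]
    v_digits = [split_digits(x, bits, count) for x in vector]
    result = [0] * len(matrix)
    for p in range(count):              # digit plane of the matrix
        for q in range(count):          # digit plane of the vector
            # Low-precision product: every factor is below 2 ** bits.
            partial = [sum(row[j][p] * v_digits[j][q] for j in range(len(vector)))
                       for row in m_digits]
            shift = bits * (p + q)      # digital recombination by shift-and-add
            for i, value in enumerate(partial):
                result[i] += value << shift
    return result
```

With `count` digit planes per operand the scheme performs `count ** 2` low-precision products, which illustrates the accuracy-versus-throughput trade that the abstract compares across methods.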
Perspective: Memcomputing: Leveraging memory and physics to compute efficiently
NASA Astrophysics Data System (ADS)
Di Ventra, Massimiliano; Traversa, Fabio L.
2018-05-01
It is well known that physical phenomena may be of great help in computing some difficult problems efficiently. A typical example is prime factorization that may be solved in polynomial time by exploiting quantum entanglement on a quantum computer. There are, however, other types of (non-quantum) physical properties that one may leverage to compute efficiently a wide range of hard problems. In this perspective, we discuss how to employ one such property, memory (time non-locality), in a novel physics-based approach to computation: Memcomputing. In particular, we focus on digital memcomputing machines (DMMs) that are scalable. DMMs can be realized with non-linear dynamical systems with memory. The latter property allows the realization of a new type of Boolean logic, one that is self-organizing. Self-organizing logic gates are "terminal-agnostic," namely, they do not distinguish between the input and output terminals. When appropriately assembled to represent a given combinatorial/optimization problem, the corresponding self-organizing circuit converges to the equilibrium points that express the solutions of the problem at hand. In doing so, DMMs take advantage of the long-range order that develops during the transient dynamics. This collective dynamical behavior, reminiscent of a phase transition, or even the "edge of chaos," is mediated by families of classical trajectories (instantons) that connect critical points of increasing stability in the system's phase space. The topological character of the solution search renders DMMs robust against noise and structural disorder. Since DMMs are non-quantum systems described by ordinary differential equations, not only can they be built in hardware with the available technology, they can also be simulated efficiently on modern classical computers. As an example, we will show the polynomial-time solution of the subset-sum problem for the worst cases, and point to other types of hard problems where simulations of DMMs
Computing Interactions Of Free-Space Radiation With Matter
NASA Technical Reports Server (NTRS)
Wilson, J. W.; Cucinotta, F. A.; Shinn, J. L.; Townsend, L. W.; Badavi, F. F.; Tripathi, R. K.; Silberberg, R.; Tsao, C. H.; Badwar, G. D.
1995-01-01
High Charge and Energy Transport (HZETRN) computer program computationally efficient, user-friendly package of software addressing problem of transport of, and shielding against, radiation in free space. Designed as "black box" for design engineers not concerned with physics of underlying atomic and nuclear radiation processes in free-space environment, but rather primarily interested in obtaining fast and accurate dosimetric information for design and construction of modules and devices for use in free space. Computational efficiency achieved by unique algorithm based on deterministic approach to solution of Boltzmann equation rather than computationally intensive statistical Monte Carlo method. Written in FORTRAN.
Newman, D M; Hawley, R W; Goeckel, D L; Crawford, R D; Abraham, S; Gallagher, N C
1993-05-10
An efficient storage format was developed for computer-generated holograms for use in electron-beam lithography. This method employs run-length encoding and Lempel-Ziv-Welch compression and succeeds in exposing holograms that were previously infeasible owing to the hologram's tremendous pattern-data file size. These holograms also require significant computation; thus the algorithm was implemented on a parallel computer, which improved performance by 2 orders of magnitude. The decompression algorithm was integrated into the Cambridge electron-beam machine's front-end processor. Although this provides much-needed ability, some hardware enhancements will be required in the future to overcome inadequacies in the current front-end processor that result in a lengthy exposure time.
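Of the two compression stages named above, run-length encoding is the simpler to illustrate. The sketch below encodes one row of a binary exposure pattern as (value, run length) pairs and decodes it again; the actual file format used with the electron-beam machine is not described in the abstract, so the pair representation and function names are assumptions. The Lempel-Ziv-Welch stage is not shown.

```python
from itertools import groupby


def rle_encode(row):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(row)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, length in pairs for _ in range(length)]
```

Long uniform runs, which are common in binary pattern data, shrink to a single pair each, and the decoder is simple enough to run on a modest front-end processor.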
NASA Astrophysics Data System (ADS)
Gorthi, Sai Siva; Rajshekhar, Gannavarpu; Rastogi, Pramod
2010-06-01
Recently, a high-order instantaneous moments (HIM)-operator-based method was proposed for accurate phase estimation in digital holographic interferometry. The method relies on piece-wise polynomial approximation of phase and subsequent evaluation of the polynomial coefficients from the HIM operator using single-tone frequency estimation. The work presents a comparative analysis of the performance of different single-tone frequency estimation techniques, like Fourier transform followed by optimization, estimation of signal parameters by rotational invariance technique (ESPRIT), multiple signal classification (MUSIC), and iterative frequency estimation by interpolation on Fourier coefficients (IFEIF) in HIM-operator-based methods for phase estimation. Simulation and experimental results demonstrate the potential of the IFEIF technique with respect to computational efficiency and estimation accuracy.
Accurate Phylogenetic Tree Reconstruction from Quartets: A Heuristic Approach
Reaz, Rezwana; Bayzid, Md. Shamsuzzoha; Rahman, M. Sohel
2014-01-01
Supertree methods construct trees on a set of taxa (species) combining many smaller trees on the overlapping subsets of the entire set of taxa. A ‘quartet’ is an unrooted tree over four taxa, hence the quartet-based supertree methods combine many four-taxon unrooted trees into a single and coherent tree over the complete set of taxa. Quartet-based phylogeny reconstruction methods have been receiving considerable attention in recent years. An accurate and efficient quartet-based method might be competitive with the current best phylogenetic tree reconstruction methods (such as maximum likelihood or Bayesian MCMC analyses), without being as computationally intensive. In this paper, we present a novel and highly accurate quartet-based phylogenetic tree reconstruction method. We performed an extensive experimental study to evaluate the accuracy and scalability of our approach on both simulated and biological datasets. PMID:25117474
An efficient formulation of robot arm dynamics for control and computer simulation
NASA Astrophysics Data System (ADS)
Lee, C. S. G.; Nigam, R.
This paper describes an efficient formulation of the dynamic equations of motion of industrial robots based on the Lagrange formulation of d'Alembert's principle. This formulation, as applied to a PUMA robot arm, results in a set of closed form second order differential equations with cross product terms. They are not as efficient in computation as those formulated by the Newton-Euler method, but provide a better analytical model for control analysis and computer simulation. Computational complexities of this dynamic model together with other models are tabulated for discussion.
He, Jieyue; Li, Chaojun; Ye, Baoliu; Zhong, Wei
2012-06-25
Most computational algorithms mainly focus on detecting highly connected subgraphs in PPI networks as protein complexes but ignore their inherent organization. Furthermore, many of these algorithms are computationally expensive. However, recent analysis indicates that experimentally detected protein complexes generally contain Core/attachment structures. In this paper, a Greedy Search Method based on Core-Attachment structure (GSM-CA) is proposed. The GSM-CA method detects densely connected regions in large protein-protein interaction networks based on the edge weight and two criteria for determining core nodes and attachment nodes. The GSM-CA method improves the prediction accuracy compared to other similar module detection approaches; however, it is computationally expensive. Many module detection approaches are based on the traditional hierarchical methods, which are also computationally inefficient because the hierarchical tree structure produced by these approaches cannot provide adequate information to identify whether a network belongs to a module structure or not. In order to speed up the computational process, the Greedy Search Method based on Fast Clustering (GSM-FC) is proposed in this work. The edge weight based GSM-FC method uses a greedy procedure to traverse all edges just once to separate the network into the suitable set of modules. The proposed methods are applied to the protein interaction network of S. cerevisiae. Experimental results indicate that many significant functional modules are detected, most of which match the known complexes. Results also demonstrate that the GSM-FC algorithm is faster and more accurate as compared to other competing algorithms. Based on the new edge weight definition, the proposed algorithm takes advantage of the greedy search procedure to separate the network into the suitable set of modules. Experimental analysis shows that the identified modules are statistically significant. The algorithm can reduce the
Improving robustness and computational efficiency using modern C++
DOE Office of Scientific and Technical Information (OSTI.GOV)
Paterno, M.; Kowalkowski, J.; Green, C.
2014-01-01
For nearly two decades, the C++ programming language has been the dominant programming language for experimental HEP. The publication of ISO/IEC 14882:2011, the current version of the international standard for the C++ programming language, makes available a variety of language and library facilities for improving the robustness, expressiveness, and computational efficiency of C++ code. However, much of the C++ written by the experimental HEP community does not take advantage of the features of the language to obtain these benefits, either due to lack of familiarity with these features or concern that these features must somehow be computationally inefficient. In this paper, we address some of the features of modern C++, and show how they can be used to make programs that are both robust and computationally efficient. We compare and contrast simple yet realistic examples of some common implementation patterns in C, currently-typical C++, and modern C++, and show (when necessary, down to the level of generated assembly language code) the quality of the executable code produced by recent C++ compilers, with the aim of allowing the HEP community to make informed decisions on the costs and benefits of the use of modern C++.
An efficient sparse matrix multiplication scheme for the CYBER 205 computer
NASA Technical Reports Server (NTRS)
Lambiotte, Jules J., Jr.
1988-01-01
This paper describes the development of an efficient algorithm for computing the product of a matrix and vector on a CYBER 205 vector computer. The desire to provide software which allows the user to choose between the often conflicting goals of minimizing central processing unit (CPU) time or storage requirements has led to a diagonal-based algorithm in which one of four types of storage is selected for each diagonal. The candidate storage types employed were chosen to be efficient on the CYBER 205 for diagonals which have nonzero structure which is dense, moderately sparse, very sparse and short, or very sparse and long; however, for many densities, no diagonal type is most efficient with respect to both resource requirements, and a trade-off must be made. For each diagonal, an initialization subroutine estimates the CPU time and storage required for each storage type based on results from previously performed numerical experimentation. These requirements are adjusted by weights provided by the user which reflect the relative importance the user places on the two resources. The adjusted resource requirements are then compared to select the most efficient storage and computational scheme.
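The core idea of a diagonal-based product can be sketched as follows: the matrix is stored one diagonal at a time, and the product accumulates the contribution of each stored diagonal in turn. This sketch uses a single storage type (a full-length list per nonzero diagonal) rather than the four types the paper selects among, and makes no attempt at CYBER 205 vectorization or at the cost-weighted selection step. The function names and the dict-of-offsets layout are assumptions.

```python
def to_diagonals(dense):
    """Store a square matrix by diagonals.

    Returns {offset: values}, where offset = column - row and values lists the
    entries of that diagonal from top-left to bottom-right.
    """
    n = len(dense)
    diags = {}
    for i in range(n):
        for j in range(n):
            if dense[i][j] != 0:
                k = j - i
                start = max(0, -k)      # first row that diagonal k touches
                diags.setdefault(k, [0] * (n - abs(k)))[i - start] = dense[i][j]
    return diags


def diag_matvec(diags, x):
    """Compute y = A x from the diagonal storage produced by to_diagonals."""
    y = [0] * len(x)
    for k, values in diags.items():
        start = max(0, -k)
        for t, a in enumerate(values):
            i = start + t               # row index of this entry
            y[i] += a * x[i + k]        # column index is i + k
    return y
```

Each diagonal contributes a contiguous elementwise multiply-and-add, which is the access pattern a vector machine handles well; diagonals that are mostly zero waste storage and work in this simple layout, which is the trade-off the paper's per-diagonal storage selection addresses.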
Adly, Amr A.; Abd-El-Hafiz, Salwa K.
2012-01-01
Incorporation of hysteresis models in electromagnetic analysis approaches is indispensable to accurate field computation in complex magnetic media. Throughout those computations, vector nature and computational efficiency of such models become especially crucial when sophisticated geometries requiring massive sub-region discretization are involved. Recently, an efficient vector Preisach-type hysteresis model constructed from only two scalar models having orthogonally coupled elementary operators has been proposed. This paper presents a novel Hopfield neural network approach for the implementation of Stoner–Wohlfarth-like operators that could lead to a significant enhancement in the computational efficiency of the aforementioned model. Advantages of this approach stem from the non-rectangular nature of these operators that substantially minimizes the number of operators needed to achieve an accurate vector hysteresis model. Details of the proposed approach, its identification and experimental testing are presented in the paper. PMID:25685446
NASA Astrophysics Data System (ADS)
Zheng, Chang-Jun; Gao, Hai-Feng; Du, Lei; Chen, Hai-Bo; Zhang, Chuanzeng
2016-01-01
An accurate numerical solver is developed in this paper for eigenproblems governed by the Helmholtz equation and formulated through the boundary element method. A contour integral method is used to convert the nonlinear eigenproblem into an ordinary eigenproblem, so that eigenvalues can be extracted accurately by solving a set of standard boundary element systems of equations. In order to accelerate the solution procedure, the parameters affecting the accuracy and efficiency of the method are studied and two contour paths are compared. Moreover, a wideband fast multipole method is implemented with a block IDR(s) solver to reduce the overall solution cost of the boundary element systems of equations with multiple right-hand sides. The Burton-Miller formulation is employed to identify the fictitious eigenfrequencies of the interior acoustic problems with multiply connected domains. The actual effect of the Burton-Miller formulation on tackling the fictitious eigenfrequency problem is investigated and the optimal choice of the coupling parameter as α = i/k is confirmed through exterior sphere examples. Furthermore, the numerical eigenvalues obtained by the developed method are compared with the results obtained by the finite element method to show the accuracy and efficiency of the developed method.
NASA Astrophysics Data System (ADS)
Montoliu, C.; Ferrando, N.; Gosálvez, M. A.; Cerdá, J.; Colom, R. J.
2013-10-01
The use of atomistic methods, such as the Continuous Cellular Automaton (CCA), is currently regarded as a computationally efficient and experimentally accurate approach for the simulation of anisotropic etching of various substrates in the manufacture of Micro-electro-mechanical Systems (MEMS). However, when the features of the chemical process are modified, a time-consuming calibration process needs to be used to transform the new macroscopic etch rates into a corresponding set of atomistic rates. Furthermore, changing the substrate requires a labor-intensive effort to reclassify most atomistic neighborhoods. In this context, the Level Set (LS) method provides an alternative approach where the macroscopic forces affecting the front evolution are directly applied at the discrete level, thus avoiding the need for reclassification and/or calibration. Correspondingly, we present a fully-operational Sparse Field Method (SFM) implementation of the LS approach, discussing in detail the algorithm and providing a thorough characterization of the computational cost and simulation accuracy, including a comparison to the performance by the most recent CCA model. We conclude that the SFM implementation achieves accuracy similar to that of the CCA method, with fewer fluctuations in the etch front and requiring roughly 4 times less memory. Although SFM can be up to 2 times slower than CCA for the simulation of anisotropic etchants, it can also be up to 10 times faster than CCA for isotropic etchants. In addition, we present a parallel, GPU-based implementation (gSFM) and compare it to an optimized, multicore CPU version (cSFM), demonstrating that the SFM algorithm can be successfully parallelized and the simulation times consequently reduced, while keeping the accuracy of the simulations. Although modern multicore CPUs provide an acceptable option, the massively parallel architecture of modern GPUs is more suitable, as reflected by computational times for gSFM up to 7.4 times faster than
Progress in fast, accurate multi-scale climate simulations
Collins, W. D.; Johansen, H.; Evans, K. J.; ...
2015-06-01
We present a survey of physical and computational techniques that have the potential to contribute to the next generation of high-fidelity, multi-scale climate simulations. Examples of the climate science problems that can be investigated with more depth with these computational improvements include the capture of remote forcings of localized hydrological extreme events, an accurate representation of cloud features over a range of spatial and temporal scales, and parallel, large ensembles of simulations to more effectively explore model sensitivities and uncertainties. Numerical techniques, such as adaptive mesh refinement, implicit time integration, and separate treatment of fast physical time scales are enabling improved accuracy and fidelity in simulation of dynamics and allowing more complete representations of climate features at the global scale. At the same time, partnerships with computer science teams have focused on taking advantage of evolving computer architectures such as many-core processors and GPUs. As a result, approaches which were previously considered prohibitively costly have become both more efficient and scalable. In combination, progress in these three critical areas is poised to transform climate modeling in the coming decades.
Efficient computation of optimal actions.
Todorov, Emanuel
2009-07-14
Optimal choice of actions is a fundamental problem relevant to fields as diverse as neuroscience, psychology, economics, computer science, and control engineering. Despite this broad relevance the abstract setting is similar: we have an agent choosing actions over time, an uncertain dynamical system whose state is affected by those actions, and a performance criterion that the agent seeks to optimize. Solving problems of this kind remains hard, in part, because of overly generic formulations. Here, we propose a more structured formulation that greatly simplifies the construction of optimal control laws in both discrete and continuous domains. An exhaustive search over actions is avoided and the problem becomes linear. This yields algorithms that outperform Dynamic Programming and Reinforcement Learning, and thereby solve traditional problems more efficiently. Our framework also enables computations that were not possible before: composing optimal control laws by mixing primitives, applying deterministic methods to stochastic systems, quantifying the benefits of error tolerance, and inferring goals from behavioral data via convex optimization. Development of a general class of easily solvable problems tends to accelerate progress, as linear systems theory has done, for example. Our framework may have similar impact in fields where optimal choice of actions is relevant.
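The abstract states that the problem becomes linear but gives no equations. The sketch below is based on the commonly cited form of linearly solvable Markov decision problems for a first-exit setting, which is an assumption here rather than something taken from the text: a "desirability" z satisfies z(i) = exp(-q(i)) Σ_j p(j|i) z(j), where q is the state cost and p the passive (uncontrolled) dynamics, and the optimal controlled transition is proportional to p(j|i) z(j). All function and parameter names are hypothetical.

```python
import math


def solve_desirability(passive, state_cost, terminal, iterations=2000):
    """Iterate the linear fixed-point equation z = exp(-q) * (P z).

    passive[i][j] : passive transition probability from state i to state j
    state_cost[i] : non-negative cost q(i) of being in state i
    terminal      : set of absorbing states, where z(i) = exp(-q(i)) is fixed
    """
    n = len(state_cost)
    z = [math.exp(-state_cost[i]) if i in terminal else 1.0 for i in range(n)]
    for _ in range(iterations):
        z = [z[i] if i in terminal else
             math.exp(-state_cost[i]) * sum(passive[i][j] * z[j] for j in range(n))
             for i in range(n)]
    return z


def optimal_transitions(passive, z, i):
    """Optimal next-state distribution from state i: passive dynamics
    reweighted by desirability and renormalized. No search over actions."""
    weights = [passive[i][j] * z[j] for j in range(len(z))]
    total = sum(weights)
    return [w / total for w in weights]
```

As a usage illustration, on a short chain whose passive dynamics step left or right with equal probability, with a costly terminal state at one end and a free one at the other, the reweighted transitions favour the step toward the free end. The value function, if needed, is recovered as v(i) = -log z(i).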
Approximate, computationally efficient online learning in Bayesian spiking neurons.
Kuhlmann, Levin; Hauser-Raspe, Michael; Manton, Jonathan H; Grayden, David B; Tapson, Jonathan; van Schaik, André
2014-03-01
Bayesian spiking neurons (BSNs) provide a probabilistic interpretation of how neurons perform inference and learning. Online learning in BSNs typically involves parameter estimation based on maximum-likelihood expectation-maximization (ML-EM) which is computationally slow and limits the potential of studying networks of BSNs. An online learning algorithm, fast learning (FL), is presented that is more computationally efficient than the benchmark ML-EM for a fixed number of time steps as the number of inputs to a BSN increases (e.g., 16.5 times faster run times for 20 inputs). Although ML-EM appears to converge 2.0 to 3.6 times faster than FL, the computational cost of ML-EM means that ML-EM takes longer to simulate to convergence than FL. FL also provides reasonable convergence performance that is robust to initialization of parameter estimates that are far from the true parameter values. However, parameter estimation depends on the range of true parameter values. Nevertheless, for a physiologically meaningful range of parameter values, FL gives very good average estimation accuracy, despite its approximate nature. The FL algorithm therefore provides an efficient tool, complementary to ML-EM, for exploring BSN networks in more detail in order to better understand their biological relevance. Moreover, the simplicity of the FL algorithm means it can be easily implemented in neuromorphic VLSI such that one can take advantage of the energy-efficient spike coding of BSNs.
An accurate and efficient experimental approach for characterization of the complex oral microbiota.
Zheng, Wei; Tsompana, Maria; Ruscitto, Angela; Sharma, Ashu; Genco, Robert; Sun, Yijun; Buck, Michael J
2015-10-05
Currently, taxonomic interrogation of microbiota is based on amplification of 16S rRNA gene sequences in clinical and scientific settings. Accurate evaluation of the microbiota depends heavily on the primers used, and genus/species resolution bias can arise with amplification of non-representative genomic regions. The latest Illumina MiSeq sequencing chemistry has extended the read length to 300 bp, enabling deep profiling of large number of samples in a single paired-end reaction at a fraction of the cost. An increasingly large number of researchers have adopted this technology for various microbiome studies targeting the 16S rRNA V3-V4 hypervariable region. To expand the applicability of this powerful platform for further descriptive and functional microbiome studies, we standardized and tested an efficient, reliable, and straightforward workflow for the amplification, library construction, and sequencing of the 16S V1-V3 hypervariable region using the new 2 × 300 MiSeq platform. Our analysis involved 11 subgingival plaque samples from diabetic and non-diabetic human subjects suffering from periodontitis. The efficiency and reliability of our experimental protocol was compared to 16S V3-V4 sequencing data from the same samples. Comparisons were based on measures of observed taxonomic richness and species evenness, along with Procrustes analyses using beta(β)-diversity distance metrics. As an experimental control, we also analyzed a total of eight technical replicates for the V1-V3 and V3-V4 regions from a synthetic community with known bacterial species operon counts. We show that our experimental protocol accurately measures true bacterial community composition. Procrustes analyses based on unweighted UniFrac β-diversity metrics depicted significant correlation between oral bacterial composition for the V1-V3 and V3-V4 regions. However, measures of phylotype richness were higher for the V1-V3 region, suggesting that V1-V3 offers a deeper assessment of
Lee, Hui Sun; Im, Wonpil
2016-04-01
Molecular recognition by protein mostly occurs in a local region on the protein surface. Thus, an efficient computational method for accurate characterization of protein local structural conservation is necessary to better understand biology and drug design. We present a novel local structure alignment tool, G-LoSA. G-LoSA aligns protein local structures in a sequence order independent way and provides a GA-score, a chemical feature-based and size-independent structure similarity score. Our benchmark validation shows the robust performance of G-LoSA to the local structures of diverse sizes and characteristics, demonstrating its universal applicability to local structure-centric comparative biology studies. In particular, G-LoSA is highly effective in detecting conserved local regions on the entire surface of a given protein. In addition, the applications of G-LoSA to identifying template ligands and predicting ligand and protein binding sites illustrate its strong potential for computer-aided drug design. We hope that G-LoSA can be a useful computational method for exploring interesting biological problems through large-scale comparison of protein local structures and facilitating drug discovery research and development. G-LoSA is freely available to academic users at http://im.compbio.ku.edu/GLoSA/. © 2016 The Protein Society.
NASA Astrophysics Data System (ADS)
Moghani, Mahdy Malekzadeh; Khomami, Bamin
2017-02-01
The computational cost of Brownian dynamics (BD) simulation of the constrained model of a polymeric chain (bead-rod) with n beads and in the presence of hydrodynamic interaction (HI) is reduced to the order of n^2 via an efficient algorithm which utilizes the conjugate-gradient (CG) method within a Picard iteration scheme. Moreover, the utility of the Barnes and Hut (BH) multipole method in BD simulation of polymeric solutions in the presence of HI, with regard to computational cost, scaling, and accuracy, is discussed. Overall, it is determined that this approach leads to a scaling of O(n^1.2). Furthermore, a stress algorithm is developed which accurately captures the transient stress growth in the startup of flow for the bead-rod model with HI and excluded volume (EV) interaction. Rheological properties of the chains up to n = 350 in the presence of EV and HI are computed via the former algorithm. The result depicts qualitative differences in shear thinning behavior of the polymeric solutions in the intermediate values of the Weissenberg number (10
Fast and Accurate Approximation to Significance Tests in Genome-Wide Association Studies
Zhang, Yu; Liu, Jun S.
2011-01-01
Genome-wide association studies commonly involve simultaneous tests of millions of single nucleotide polymorphisms (SNP) for disease association. The SNPs in nearby genomic regions, however, are often highly correlated due to linkage disequilibrium (LD, a genetic term for correlation). Simple Bonferroni correction for multiple comparisons is therefore too conservative. Permutation tests, which are often employed in practice, are both computationally expensive for genome-wide studies and limited in scope. We present an accurate and computationally efficient method, based on Poisson de-clumping heuristics, for approximating genome-wide significance of SNP associations. Compared with permutation tests and other multiple comparison adjustment approaches, our method computes the most accurate and robust p-value adjustments for millions of correlated comparisons within seconds. We demonstrate analytically that the accuracy and the efficiency of our method are nearly independent of the sample size, the number of SNPs, and the scale of p-values to be adjusted. In addition, our method can be easily adapted to estimate false discovery rate. When applied to genome-wide SNP datasets, we observed highly variable p-value adjustment results evaluated from different genomic regions. The variation in adjustments along the genome, however, is well conserved between the European and the African populations. The p-value adjustments are significantly correlated with LD among SNPs, recombination rates, and SNP densities. Given the large variability of sequence features in the genome, we further discuss a novel approach of using SNP-specific (local) thresholds to detect genome-wide significant associations. This article has supplementary material online. PMID:22140288
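For orientation, the baseline that this abstract calls too conservative can be written in a few lines. This is only a sketch of the plain Bonferroni correction, not of the Poisson de-clumping method itself; the function names and the example p-values are illustrative assumptions.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]

# One million SNP tests at a family-wise level of 0.05.
threshold = bonferroni_threshold(0.05, 1_000_000)

# Three made-up raw p-values, adjusted for three comparisons.
adjusted = bonferroni_adjust([0.001, 0.02, 0.5])
```

Because the correction counts every SNP as an independent test, SNPs in strong LD are penalised as if each carried separate information, which is the source of the conservativeness.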
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ibrahim, Khaled Z.; Epifanovsky, Evgeny; Williams, Samuel; ...
2017-03-08
Coupled-cluster methods provide highly accurate models of molecular structure through explicit numerical calculation of tensors representing the correlation between electrons. These calculations are dominated by a sequence of tensor contractions, motivating the development of numerical libraries for such operations. While based on matrix–matrix multiplication, these libraries are specialized to exploit symmetries in the molecular structure and in electronic interactions, and thus reduce the size of the tensor representation and the complexity of contractions. The resulting algorithms are irregular and their parallelization has been previously achieved via the use of dynamic scheduling or specialized data decompositions. We introduce our efforts to extend the Libtensor framework to work in the distributed memory environment in a scalable and energy-efficient manner. We achieve up to 240× speedup compared with the optimized shared memory implementation of Libtensor. We attain scalability to hundreds of thousands of compute cores on three distributed-memory architectures (Cray XC30 and XC40, and IBM Blue Gene/Q), and on a heterogeneous GPU-CPU system (Cray XK7). As the bottlenecks shift from being compute-bound DGEMMs to communication-bound collectives as the size of the molecular system scales, we adopt two radically different parallelization approaches for handling load imbalance: tasking and bulk synchronous models. Nevertheless, we preserve a unified interface to both programming models to maintain the productivity of computational quantum chemists.
Accurate Arabic Script Language/Dialect Classification
2014-01-01
Army Research Laboratory report ARL-TR-6761 (final report, January 2014): Accurate Arabic Script Language/Dialect Classification, by Stephen C. Tratz, Computational and Information Sciences... Approved for public...
Computationally efficient optimization of radiation drives
NASA Astrophysics Data System (ADS)
Zimmerman, George; Swift, Damian
2017-06-01
For many applications of pulsed radiation, the temporal pulse shape is designed to induce a desired time-history of conditions. This optimization is normally performed using multi-physics simulations of the system, adjusting the shape until the desired response is induced. These simulations may be computationally intensive, and iterative forward optimization is then expensive and slow. In principle, a simulation program could be modified to adjust the radiation drive automatically until the desired instantaneous response is achieved, but this may be impracticable in a complicated multi-physics program. However, the computational time increment is typically much shorter than the time scale of changes in the desired response, so the radiation intensity can be adjusted so that the response tends toward the desired value. This relaxed in-situ optimization method can give an adequate design for a pulse shape in a single forward simulation, giving a typical gain in computational efficiency of tens to thousands. This approach was demonstrated for the design of laser pulse shapes to induce ramp loading to high pressure in target assemblies where different components had significantly different mechanical impedance, requiring careful pulse shaping. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
NASA Astrophysics Data System (ADS)
Wolfs, Vincent; Willems, Patrick
2013-10-01
Many applications in support of water management decisions require hydrodynamic models with limited calculation time, including real time control of river flooding, uncertainty and sensitivity analyses by Monte-Carlo simulations, and long term simulations in support of the statistical analysis of the model simulation results (e.g. flood frequency analysis). Several computationally efficient hydrodynamic models exist, but little attention is given to the modelling of floodplains. This paper presents a methodology that can emulate output from a full hydrodynamic model by predicting one or several levels in a floodplain, together with the flow rate between river and floodplain. The overtopping of the embankment is modelled as an overflow at a weir. Adaptive neuro fuzzy inference systems (ANFIS) are exploited to cope with the varying factors affecting the flow. Different input sets and identification methods are considered in model construction. Because of the dual use of simplified physically based equations and data-driven techniques, the ANFIS consist of very few rules with a low number of input variables. A second calculation scheme can be followed for exceptionally large floods. The obtained nominal emulation model was tested for four floodplains along the river Dender in Belgium. Results show that the obtained models are accurate with low computational cost.
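The river-floodplain exchange described here as "an overflow at a weir" can be sketched with the textbook free-flow weir formula, Q = (2/3) * Cd * L * sqrt(2g) * h^(3/2). The discharge coefficient, the sign convention, and the neglect of submergence are assumptions made for illustration; the abstract does not give the exact equations or the ANFIS corrections used in the paper.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def weir_flow(river_level, floodplain_level, crest_level, crest_length,
              discharge_coeff=0.6):
    """Free-flow weir discharge over an embankment crest, in m^3/s.

    Positive values mean flow from river to floodplain, negative the reverse.
    Submerged (drowned) flow is ignored in this sketch.
    """
    upstream = max(river_level, floodplain_level)
    head = upstream - crest_level
    if head <= 0.0:
        return 0.0  # water stays below the embankment crest
    q = (2.0 / 3.0) * discharge_coeff * crest_length * math.sqrt(2.0 * G) * head ** 1.5
    return q if river_level >= floodplain_level else -q

# Hypothetical levels in metres: crest at 5 m, 10 m long overflow section.
no_overtopping = weir_flow(4.0, 2.0, 5.0, 10.0)
filling = weir_flow(6.0, 2.0, 5.0, 10.0)
draining = weir_flow(2.0, 6.0, 5.0, 10.0)
```

In a conceptual model of this kind, a data-driven component could then adjust the coefficient or the head for the "varying factors affecting the flow"; that step is not shown.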
Enabling fast, stable and accurate peridynamic computations using multi-time-step integration
Lindsay, P.; Parks, M. L.; Prakash, A.
2016-04-13
Peridynamics is a nonlocal extension of classical continuum mechanics that is well-suited for solving problems with discontinuities such as cracks. This paper extends the peridynamic formulation to decompose a problem domain into a number of smaller overlapping subdomains and to enable the use of different time steps in different subdomains. This approach allows regions of interest to be isolated and solved at a small time step for increased accuracy while the rest of the problem domain can be solved at a larger time step for greater computational efficiency. Lastly, performance of the proposed method in terms of stability, accuracy, and computational cost is examined and several numerical examples are presented to corroborate the findings.
NASA Astrophysics Data System (ADS)
Sagui, Celeste; Pedersen, Lee G.; Darden, Thomas A.
2004-01-01
The accurate simulation of biologically active macromolecules faces serious limitations that originate in the treatment of electrostatics in the empirical force fields. The current use of "partial charges" is a significant source of errors, since these vary widely with different conformations. By contrast, the molecular electrostatic potential (MEP) obtained through the use of a distributed multipole moment description, has been shown to converge to the quantum MEP outside the van der Waals surface, when higher order multipoles are used. However, in spite of the considerable improvement to the representation of the electronic cloud, higher order multipoles are not part of current classical biomolecular force fields due to the excessive computational cost. In this paper we present an efficient formalism for the treatment of higher order multipoles in Cartesian tensor formalism. The Ewald "direct sum" is evaluated through a McMurchie-Davidson formalism [L. McMurchie and E. Davidson, J. Comput. Phys. 26, 218 (1978)]. The "reciprocal sum" has been implemented in three different ways: using an Ewald scheme, a particle mesh Ewald (PME) method, and a multigrid-based approach. We find that even though the use of the McMurchie-Davidson formalism considerably reduces the cost of the calculation with respect to the standard matrix implementation of multipole interactions, the calculation in direct space remains expensive. When most of the calculation is moved to reciprocal space via the PME method, the cost of a calculation where all multipolar interactions (up to hexadecapole-hexadecapole) are included is only about 8.5 times more expensive than a regular AMBER 7 [D. A. Pearlman et al., Comput. Phys. Commun. 91, 1 (1995)] implementation with only charge-charge interactions. The multigrid implementation is slower but shows very promising results for parallelization. It provides a natural way to interface with continuous, Gaussian-based electrostatics in the future. It is
LI, ZHILIN; JI, HAIFENG; CHEN, XIAOHONG
2016-01-01
A new augmented method is proposed for elliptic interface problems with a piecewise variable coefficient that has a finite jump across a smooth interface. The main motivation is not only to get a second order accurate solution but also a second order accurate gradient from each side of the interface. The key of the new method is to introduce the jump in the normal derivative of the solution as an augmented variable and re-write the interface problem as a new PDE that consists of a leading Laplacian operator plus lower order derivative terms near the interface. In this way, the leading second order derivatives jump relations are independent of the jump in the coefficient that appears only in the lower order terms after the scaling. An upwind type discretization is used for the finite difference discretization at the irregular grid points near or on the interface so that the resulting coefficient matrix is an M-matrix. A multi-grid solver is used to solve the linear system of equations and the GMRES iterative method is used to solve the augmented variable. Second order convergence for the solution and the gradient from each side of the interface has also been proved in this paper. Numerical examples for general elliptic interface problems have confirmed the theoretical analysis and efficiency of the new method. PMID:28983130
On the accurate estimation of gap fraction during daytime with digital cover photography
NASA Astrophysics Data System (ADS)
Hwang, Y. R.; Ryu, Y.; Kimm, H.; Macfarlane, C.; Lang, M.; Sonnentag, O.
2015-12-01
Digital cover photography (DCP) has emerged as an indirect method to obtain gap fraction accurately. Thus far, however, the intervention of subjectivity, such as determining the camera relative exposure value (REV) and threshold in the histogram, has hindered accurate computation of gap fraction. Here we propose a novel method that enables us to measure gap fraction accurately during daytime under various sky conditions by DCP. The novel method computes gap fraction using a single DCP unsaturated raw image which is corrected for scattering effects by canopies and a reconstructed sky image from the raw format image. To test the sensitivity of the novel method derived gap fraction to diverse REVs, solar zenith angles and canopy structures, we took photos at one-hour intervals from sunrise to midday under dense and sparse canopies with REV 0 to -5. The novel method showed little variation of gap fraction across different REVs in both dense and sparse canopies across a diverse range of solar zenith angles. The perforated panel experiment, which was used to test the accuracy of the estimated gap fraction, confirmed that the novel method resulted in accurate and consistent gap fractions across different hole sizes, gap fractions and solar zenith angles. These findings highlight that the novel method opens new opportunities to estimate gap fraction accurately during daytime from sparse to dense canopies, which will be useful in monitoring LAI precisely and validating satellite remote sensing LAI products efficiently.
Li, Xiangrui; Lu, Zhong-Lin
2012-02-29
Display systems based on conventional computer graphics cards are capable of generating images with 8-bit gray level resolution. However, most experiments in vision research require displays with more than 12 bits of luminance resolution. Several solutions are available. Bit++ (1) and DataPixx (2) use the Digital Visual Interface (DVI) output from graphics cards and high resolution (14 or 16-bit) digital-to-analog converters to drive analog display devices. The VideoSwitcher (3) described here combines analog video signals from the red and blue channels of graphics cards with different weights using a passive resistor network (4) and an active circuit to deliver identical video signals to the three channels of color monitors. The method provides an inexpensive way to enable high-resolution monochromatic displays using conventional graphics cards and analog monitors. It can also provide trigger signals that can be used to mark stimulus onsets, making it easy to synchronize visual displays with physiological recordings or response time measurements. Although computer keyboards and mice are frequently used in measuring response times (RT), the accuracy of these measurements is quite low. The RTbox is a specialized hardware and software solution for accurate RT measurements. Connected to the host computer through a USB connection, the driver of the RTbox is compatible with all conventional operating systems. It uses a microprocessor and high-resolution clock to record the identities and timing of button events, which are buffered until the host computer retrieves them. The recorded button events are not affected by potential timing uncertainties or biases associated with data transmission and processing in the host computer. The asynchronous storage greatly simplifies the design of user programs. Several methods are available to synchronize the clocks of the RTbox and the host computer. The RTbox can also receive external triggers and be used to measure RT with respect
Unified commutation-pruning technique for efficient computation of composite DFTs
NASA Astrophysics Data System (ADS)
Castro-Palazuelos, David E.; Medina-Melendrez, Modesto Gpe.; Torres-Roman, Deni L.; Shkvarko, Yuriy V.
2015-12-01
An efficient computation of a composite length discrete Fourier transform (DFT), as well as a fast Fourier transform (FFT) of both time and space data sequences in uncertain (non-sparse or sparse) computational scenarios, requires specific processing algorithms. Traditional algorithms typically employ some pruning methods without any commutations, which prevents them from attaining the potential computational efficiency. In this paper, we propose an alternative unified approach with automatic commutations between three computational modalities aimed at efficient computations of the pruned DFTs adapted for variable composite lengths of the non-sparse input-output data. The first modality is an implementation of the direct computation of a composite length DFT, the second one employs the second-order recursive filtering method, and the third one performs the new pruned decomposed transform. The pruned decomposed transform algorithm performs the decimation in time or space (DIT) data acquisition domain and, then, decimation in frequency (DIF). The unified combination of these three algorithms is addressed as the DFTCOMM technique. Based on the treatment of the combinational-type hypotheses testing optimization problem of preferable allocations between all feasible commuting-pruning modalities, we have found the global optimal solution to the pruning problem that always requires fewer or, at most, the same number of arithmetic operations as other feasible modalities. The DFTCOMM method outperforms the existing competing pruning techniques in the sense of attainable savings in the number of required arithmetic operations. It requires fewer or at most the same number of arithmetic operations for its execution than any other of the competing pruning methods reported in the literature. Finally, we provide the comparison of the DFTCOMM with the recently developed sparse fast Fourier transform (SFFT) algorithmic family. We feature that, in the sensing scenarios with
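Two of the three modalities named in this abstract have well-known textbook forms, sketched here for a pruned output set: direct evaluation of selected DFT bins, and a second-order recursive filter, assumed here to be the Goertzel algorithm. The switching logic between modalities and the pruned decomposed transform are not reproduced; function names and the test signal are illustrative.

```python
import cmath
import math

def direct_bin(x, k):
    """Direct evaluation of DFT bin k: sum of x[n] * exp(-2j*pi*k*n/N)."""
    n = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))

def goertzel_bin(x, k):
    """DFT bin k via the second-order recursion s[n] = x[n] + 2cos(w)s[n-1] - s[n-2]."""
    n = len(x)
    w = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(w)
    s1 = 0.0  # s[n-1]
    s2 = 0.0  # s[n-2]
    for sample in x:
        s0 = sample + coeff * s1 - s2
        s2, s1 = s1, s0
    # After the last sample: X[k] = exp(jw) * s[N-1] - s[N-2]
    return cmath.exp(1j * w) * s1 - s2

def pruned_dft(x, wanted_bins):
    """Compute only the requested output bins (output pruning)."""
    return {k: goertzel_bin(x, k) for k in wanted_bins}

# Composite length N = 12; a cosine placed exactly on bin 3.
N = 12
signal = [math.cos(2.0 * math.pi * 3 * i / N) for i in range(N)]
bins = pruned_dft(signal, [3, 5])
```

Each Goertzel bin costs roughly one real multiplication per sample, so for a handful of wanted bins it can undercut a full FFT; which modality wins for a given length and pruning pattern is the selection problem the abstract addresses.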
Accurate upwind methods for the Euler equations
NASA Technical Reports Server (NTRS)
Huynh, Hung T.
1993-01-01
A new class of piecewise linear methods for the numerical solution of the one-dimensional Euler equations of gas dynamics is presented. These methods are uniformly second-order accurate, and can be considered as extensions of Godunov's scheme. With an appropriate definition of monotonicity preservation for the case of linear convection, it can be shown that they preserve monotonicity. Similar to Van Leer's MUSCL scheme, they consist of two key steps: a reconstruction step followed by an upwind step. For the reconstruction step, a monotonicity constraint that preserves uniform second-order accuracy is introduced. Computational efficiency is enhanced by devising a criterion that detects the 'smooth' part of the data where the constraint is redundant. The concept and coding of the constraint are simplified by the use of the median function. A slope steepening technique, which has no effect at smooth regions and can resolve a contact discontinuity in four cells, is described. As for the upwind step, existing and new methods are applied in a manner slightly different from those in the literature. These methods are derived by approximating the Euler equations via linearization and diagonalization. At a 'smooth' interface, Harten, Lax, and Van Leer's one intermediate state model is employed. A modification for this model that can resolve contact discontinuities is presented. Near a discontinuity, either this modified model or a more accurate one, namely, Roe's flux-difference splitting, is used. The current presentation of Roe's method, via the conceptually simple flux-vector splitting, not only establishes a connection between the two splittings, but also leads to an admissibility correction with no conditional statement, and an efficient approximation to Osher's approximate Riemann solver. These reconstruction and upwind steps result in schemes that are uniformly second-order accurate and economical at smooth regions, and yield high resolution at discontinuities.
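The remark that the median function simplifies the coding of a monotonicity constraint can be illustrated with the standard identities median(x, y, z) = x + minmod(y - x, z - x) and minmod(a, b) = median(0, a, b). The limiter below is the familiar monotonized-central slope, used as an assumed stand-in; the abstract does not spell out the paper's own constraint.

```python
def minmod(a, b):
    """Return the argument of smaller magnitude if signs agree, else zero."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b

def median(x, y, z):
    """Median of three numbers, written without sorting or nested branches."""
    return x + minmod(y - x, z - x)

def limited_slope(u_left, u_center, u_right):
    """Central-difference slope clipped to the monotone range (unit cell width)."""
    central = 0.5 * (u_right - u_left)
    bound = 2.0 * minmod(u_center - u_left, u_right - u_center)
    # median(0, a, b) equals minmod(a, b): keep the central slope only
    # while it lies between zero and the monotonicity bound.
    return median(0.0, central, bound)

smooth = limited_slope(0.0, 1.0, 2.0)     # linear data: slope is untouched
extremum = limited_slope(0.0, 1.0, 0.0)   # local maximum: slope is zeroed
steep = limited_slope(0.0, 0.1, 2.0)      # near a jump: slope is clipped
```

Expressing the clipping as a median of three candidates keeps the constraint to a single expression, which is the kind of coding simplification the abstract refers to.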
Serang, Oliver; MacCoss, Michael J.; Noble, William Stafford
2010-01-01
The problem of identifying proteins from a shotgun proteomics experiment has not been definitively solved. Identifying the proteins in a sample requires ranking them, ideally with interpretable scores. In particular, “degenerate” peptides, which map to multiple proteins, have made such a ranking difficult to compute. The problem of computing posterior probabilities for the proteins, which can be interpreted as confidence in a protein’s presence, has been especially daunting. Previous approaches have either ignored the peptide degeneracy problem completely, addressed it by computing a heuristic set of proteins or heuristic posterior probabilities, or by estimating the posterior probabilities with sampling methods. We present a probabilistic model for protein identification in tandem mass spectrometry that recognizes peptide degeneracy. We then introduce graph-transforming algorithms that facilitate efficient computation of protein probabilities, even for large data sets. We evaluate our identification procedure on five different well-characterized data sets and demonstrate our ability to efficiently compute high-quality protein posteriors. PMID:20712337
DOE Office of Scientific and Technical Information (OSTI.GOV)
Miliordos, Evangelos; Xantheas, Sotiris S.
2015-06-21
We report MP2 and CCSD(T) binding energies with basis sets up to pentuple zeta quality for the m = 2-6, 8 clusters. Our best CCSD(T)/CBS estimates are -4.99 kcal/mol (dimer), -15.77 kcal/mol (trimer), -27.39 kcal/mol (tetramer), -35.9 ± 0.3 kcal/mol (pentamer), -46.2 ± 0.3 kcal/mol (prism hexamer), -45.9 ± 0.3 kcal/mol (cage hexamer), -45.4 ± 0.3 kcal/mol (book hexamer), -44.3 ± 0.3 kcal/mol (ring hexamer), -73.0 ± 0.5 kcal/mol (D2d octamer) and -72.9 ± 0.5 kcal/mol (S4 octamer). We have found that the percentage of both the uncorrected (D_e) and BSSE-corrected (D_e^CP) binding energies recovered with respect to the CBS limit falls into a narrow range for each basis set for all clusters and in addition this range was found to decrease upon increasing the basis set. Relatively accurate estimates (within 0.5%) of the CBS limits can be obtained when using the “2/3, 1/3” (for the AVDZ set) or the “½, ½” (for the AVTZ, AVQZ and AV5Z sets) mixing ratio between D_e and D_e^CP. Based on those findings we propose an accurate and efficient computational protocol that can be used to estimate accurate binding energies of clusters at the MP2 (for up to 100 molecules) and CCSD(T) (for up to 30 molecules) levels of theory. This work was supported by the US Department of Energy, Office of Science, Office of Basic Energy Sciences, Division of Chemical Sciences, Geosciences and Biosciences. Pacific Northwest National Laboratory (PNNL) is a multiprogram national laboratory operated for DOE by Battelle. This research also used resources of the National Energy Research Scientific Computing Center, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. AC02-05CH11231.
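The mixing rule in this abstract amounts to a weighted average of the uncorrected and counterpoise-corrected binding energies. The sketch below assumes that the first weight in each quoted pair applies to the uncorrected energy, which the abstract's wording suggests but does not state outright; the function name and the input energies are made-up placeholders, not values from the paper.

```python
# Weight placed on the uncorrected binding energy for each basis set.
# Assumption: the first number of each quoted ratio belongs to the
# uncorrected value, the second to the counterpoise-corrected one.
UNCORRECTED_WEIGHT = {
    "AVDZ": 2.0 / 3.0,
    "AVTZ": 0.5,
    "AVQZ": 0.5,
    "AV5Z": 0.5,
}

def cbs_estimate(uncorrected, cp_corrected, basis):
    """Weighted mix of uncorrected and BSSE-corrected binding energies (kcal/mol)."""
    w = UNCORRECTED_WEIGHT[basis]
    return w * uncorrected + (1.0 - w) * cp_corrected

# Placeholder energies, chosen only to make the arithmetic easy to follow.
avdz_mix = cbs_estimate(-5.2, -4.6, "AVDZ")   # 2/3 * (-5.2) + 1/3 * (-4.6)
avtz_mix = cbs_estimate(-5.1, -4.9, "AVTZ")   # plain average
```

The two raw values typically bracket the basis set limit from opposite sides, which is why a fixed mixing ratio can land close to it.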
NASA Astrophysics Data System (ADS)
Meyer, Toni; Körner, Christian; Vandewal, Koen; Leo, Karl
2018-04-01
In two terminal tandem solar cells, the current density - voltage (jV) characteristic of the individual subcells is typically not directly measurable, but often required for a rigorous device characterization. In this work, we reconstruct the jV-characteristic of organic solar cells from measurements of the external quantum efficiency under applied bias voltages and illumination. We show that it is necessary to perform a bias irradiance variation at each voltage and subsequently conduct a mathematical correction of the differential to the absolute external quantum efficiency to obtain an accurate jV-characteristic. Furthermore, we show that measuring the external quantum efficiency as a function of voltage for a single bias irradiance of 0.36 AM1.5g equivalent sun provides a good approximation of the photocurrent density over voltage curve. The method is tested on a selection of efficient, common single-junctions. The obtained conclusions can easily be transferred to multi-junction devices with serially connected subcells.
NASA Astrophysics Data System (ADS)
Pan, Liang; Xu, Kun; Li, Qibing; Li, Jiequan
2016-12-01
For computational fluid dynamics (CFD), the generalized Riemann problem (GRP) solver and the second-order gas-kinetic scheme (GKS) provide a time-accurate flux function starting from a discontinuous piecewise linear flow distributions around a cell interface. With the adoption of time derivative of the flux function, a two-stage Lax-Wendroff-type (L-W for short) time stepping method has been recently proposed in the design of a fourth-order time accurate method for inviscid flow [21]. In this paper, based on the same time-stepping method and the second-order GKS flux function [42], a fourth-order gas-kinetic scheme is constructed for the Euler and Navier-Stokes (NS) equations. In comparison with the formal one-stage time-stepping third-order gas-kinetic solver [24], the current fourth-order method not only reduces the complexity of the flux function, but also improves the accuracy of the scheme. In terms of the computational cost, a two-dimensional third-order GKS flux function takes about six times of the computational time of a second-order GKS flux function. However, a fifth-order WENO reconstruction may take more than ten times of the computational cost of a second-order GKS flux function. Therefore, it is fully legitimate to develop a two-stage fourth order time accurate method (two reconstruction) instead of standard four stage fourth-order Runge-Kutta method (four reconstruction). Most importantly, the robustness of the fourth-order GKS is as good as the second-order one. In the current computational fluid dynamics (CFD) research, it is still a difficult problem to extend the higher-order Euler solver to the NS one due to the change of governing equations from hyperbolic to parabolic type and the initial interface discontinuity. This problem remains distinctively for the hypersonic viscous and heat conducting flow. The GKS is based on the kinetic equation with the hyperbolic transport and the relaxation source term. The time-dependent GKS flux function
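The two-stage fourth-order time stepping referred to in this abstract can be tried on a scalar ODE u' = L(u). The sketch uses the commonly quoted form of the scheme, with an intermediate state at the half step and the time derivative of L supplied analytically. The model problem u' = -u and all names are illustrative assumptions, and no gas-kinetic flux function is involved.

```python
import math

def two_stage_step(u, dt, rhs, rhs_dt):
    """One two-stage, fourth-order Lax-Wendroff-type step for u' = rhs(u).

    rhs_dt(u) must return d/dt rhs(u) along the solution, i.e. rhs'(u) * rhs(u).
    """
    l0 = rhs(u)
    lt0 = rhs_dt(u)
    # Stage 1: second-order Taylor predictor to the half step.
    u_mid = u + 0.5 * dt * l0 + 0.125 * dt * dt * lt0
    lt_mid = rhs_dt(u_mid)
    # Stage 2: full step combining both time derivatives.
    return u + dt * l0 + dt * dt / 6.0 * (lt0 + 2.0 * lt_mid)

def integrate(u0, t_end, n_steps, rhs, rhs_dt):
    u = u0
    dt = t_end / n_steps
    for _ in range(n_steps):
        u = two_stage_step(u, dt, rhs, rhs_dt)
    return u

# Model problem u' = -u, so d/dt(-u) = +u; exact solution exp(-t).
decay = lambda u: -u
decay_dt = lambda u: u

err_coarse = abs(integrate(1.0, 1.0, 10, decay, decay_dt) - math.exp(-1.0))
err_fine = abs(integrate(1.0, 1.0, 20, decay, decay_dt) - math.exp(-1.0))
order = math.log(err_coarse / err_fine, 2)
```

Halving the step should shrink the error by about a factor of sixteen. Only two evaluations of the time derivative are needed per step, which mirrors the "two reconstructions instead of four" argument made in the abstract.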
NASA Technical Reports Server (NTRS)
Alter, Stephen J.; Brauckmann, Gregory J.; Kleb, Bil; Streett, Craig L; Glass, Christopher E.; Schuster, David M.
2015-01-01
Using the Fully Unstructured Three-Dimensional (FUN3D) computational fluid dynamics code, an unsteady, time-accurate flow field about a Space Launch System configuration was simulated at a transonic wind tunnel condition (Mach = 0.9). Delayed detached eddy simulation combined with Reynolds-Averaged Navier-Stokes and a Spalart-Allmaras turbulence model were employed for the simulation. A second-order accurate time evolution scheme was used to simulate the flow field, with a minimum of 0.2 seconds of simulated time to as much as 1.4 seconds. Data were collected at 480 pressure tap locations, 139 of which matched a 3% wind tunnel model tested in the Transonic Dynamic Tunnel (TDT) facility at NASA Langley Research Center. Comparisons between computation and experiment showed agreement within 5% in terms of location for peak RMS levels, and 20% for frequency and magnitude of power spectral densities. Grid resolution and time step sensitivity studies were performed to identify methods for improved accuracy comparisons to wind tunnel data. With limited computational resources, accurate trends for reduced vibratory loads on the vehicle were observed. Exploratory methods such as determining minimized computed errors based on CFL number and sub-iterations, as well as evaluating frequency content of the unsteady pressures and evaluation of oscillatory shock structures were used in this study to enhance computational efficiency and solution accuracy. These techniques enabled development of a set of best practices for the evaluation of future flight vehicle designs in terms of vibratory loads.
Computer Series, 101: Accurate Equations of State in Computational Chemistry Projects.
ERIC Educational Resources Information Center
Albee, David; Jones, Edward
1989-01-01
Discusses the use of computers in chemistry courses at the United States Military Academy. Provides two examples of computer projects: (1) equations of state, and (2) solving for molar volume. Presents BASIC and PASCAL listings for the second project. Lists 10 applications for physical chemistry. (MVL)
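The second project in the abstract above, solving an equation of state for molar volume, can be sketched with a Newton iteration. The abstract does not say which equation of state or which language constructs the course used; the van der Waals form, the roughly CO2-like constants, and the function names below are assumptions for illustration only.

```python
def vdw_pressure(v, t, a, b, r=0.083145):
    """Van der Waals pressure in bar for molar volume v (L/mol) and temperature t (K)."""
    return r * t / (v - b) - a / v**2

def molar_volume(p, t, a, b, r=0.083145, tol=1e-12, max_iter=50):
    """Solve vdw_pressure(v) = p for v by Newton's method, starting from the ideal gas."""
    v = r * t / p  # ideal-gas initial guess
    for _ in range(max_iter):
        f = vdw_pressure(v, t, a, b, r) - p
        df = -r * t / (v - b) ** 2 + 2.0 * a / v**3  # derivative of pressure w.r.t. v
        step = f / df
        v -= step
        if abs(step) < tol * v:
            return v
    raise RuntimeError("Newton iteration did not converge")

# Assumed, roughly CO2-like constants: a in L^2 bar / mol^2, b in L / mol.
v = molar_volume(p=10.0, t=300.0, a=3.640, b=0.04267)
```

Starting from the ideal-gas volume keeps the iteration on the gas-phase root at moderate pressure; near the critical region a bracketing method would be a safer choice.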
Fast and accurate reference-free alignment of subtomograms.
Chen, Yuxiang; Pfeffer, Stefan; Hrabe, Thomas; Schuller, Jan Michael; Förster, Friedrich
2013-06-01
In cryoelectron tomography, alignment and averaging of subtomograms, each depicting the same macromolecule, improves the resolution compared to the individual subtomogram. Major challenges of subtomogram alignment are noise enhancement due to overfitting, the bias of an initial reference in the iterative alignment process, and the computational cost of processing increasingly large amounts of data. Here, we propose an efficient and accurate alignment algorithm via a generalized convolution theorem, which allows computation of a constrained correlation function using spherical harmonics. This formulation increases computational speed of rotational matching dramatically compared to rotation search in Cartesian space without sacrificing accuracy in contrast to other spherical harmonic based approaches. Using this sampling method, a reference-free alignment procedure is proposed to tackle reference bias and overfitting, which also includes contrast transfer function correction by Wiener filtering. Application of the method to simulated data allowed us to obtain resolutions near the ground truth. For two experimental datasets, ribosomes from yeast lysate and purified 20S proteasomes, we achieved reconstructions of approximately 20 Å and 16 Å, respectively. The software is ready-to-use and made public to the community. Copyright © 2013 Elsevier Inc. All rights reserved.
Accurate Time-Dependent Traveling-Wave Tube Model Developed for Computational Bit-Error-Rate Testing
NASA Technical Reports Server (NTRS)
Kory, Carol L.
2001-01-01
The phenomenal growth of the satellite communications industry has created a large demand for traveling-wave tubes (TWT's) operating with unprecedented specifications requiring the design and production of many novel devices in record time. To achieve this, the TWT industry heavily relies on computational modeling. However, the TWT industry's computational modeling capabilities need to be improved because there are often discrepancies between measured TWT data and that predicted by conventional two-dimensional helical TWT interaction codes. This limits the analysis and design of novel devices or TWT's with parameters differing from what is conventionally manufactured. In addition, the inaccuracy of current computational tools limits achievable TWT performance because optimized designs require highly accurate models. To address these concerns, a fully three-dimensional, time-dependent, helical TWT interaction model was developed using the electromagnetic particle-in-cell code MAFIA (Solution of MAxwell's equations by the Finite-Integration-Algorithm). The model includes a short section of helical slow-wave circuit with excitation fed by radiofrequency input/output couplers, and an electron beam contained by periodic permanent magnet focusing. A cutaway view of several turns of the three-dimensional helical slow-wave circuit with input/output couplers is shown. This has been shown to be more accurate than conventionally used two-dimensional models. The growth of the communications industry has also imposed a demand for increased data rates for the transmission of large volumes of data. To achieve increased data rates, complex modulation and multiple access techniques are employed requiring minimum distortion of the signal as it is passed through the TWT. Thus, intersymbol interference (ISI) becomes a major consideration, as well as suspected causes such as reflections within the TWT. To experimentally investigate effects of the physical TWT on ISI would be
Muver, a computational framework for accurately calling accumulated mutations.
Burkholder, Adam B; Lujan, Scott A; Lavender, Christopher A; Grimm, Sara A; Kunkel, Thomas A; Fargo, David C
2018-05-09
Identification of mutations from next-generation sequencing data typically requires a balance between sensitivity and accuracy. This is particularly true of DNA insertions and deletions (indels), that can impart significant phenotypic consequences on cells but are harder to call than substitution mutations from whole genome mutation accumulation experiments. To overcome these difficulties, we present muver, a computational framework that integrates established bioinformatics tools with novel analytical methods to generate mutation calls with the extremely low false positive rates and high sensitivity required for accurate mutation rate determination and comparison. Muver uses statistical comparison of ancestral and descendant allelic frequencies to identify variant loci and assigns genotypes with models that include per-sample assessments of sequencing errors by mutation type and repeat context. Muver identifies maximally parsimonious mutation pathways that connect these genotypes, differentiating potential allelic conversion events and delineating ambiguities in mutation location, type, and size. Benchmarking with a human gold standard father-son pair demonstrates muver's sensitivity and low false positive rates. In DNA mismatch repair (MMR) deficient Saccharomyces cerevisiae, muver detects multi-base deletions in homopolymers longer than the replicative polymerase footprint at rates greater than predicted for sequential single-base deletions, implying a novel multi-repeat-unit slippage mechanism. Benchmarking results demonstrate the high accuracy and sensitivity achieved with muver, particularly for indels, relative to available tools. Applied to an MMR-deficient Saccharomyces cerevisiae system, muver mutation calls facilitate mechanistic insights into DNA replication fidelity.
NASA Astrophysics Data System (ADS)
Lin, Y.; O'Malley, D.; Vesselinov, V. V.
2015-12-01
Inverse modeling seeks model parameters given a set of observed state variables. However, for many practical problems, because the observed data sets are often large and the model parameters are often numerous, conventional methods for solving the inverse problem can be computationally expensive. We have developed a new, computationally efficient Levenberg-Marquardt method for solving large-scale inverse modeling. Levenberg-Marquardt methods require the solution of a dense linear system of equations which can be prohibitively expensive to compute for large-scale inverse problems. Our novel method projects the original large-scale linear problem down to a Krylov subspace, such that the dimensionality of the measurements can be significantly reduced. Furthermore, instead of solving the linear system for every Levenberg-Marquardt damping parameter, we store the Krylov subspace computed when solving the first damping parameter and recycle it for all the following damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved by using these computational techniques. We apply this new inverse modeling method to invert for a random transmissivity field. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) at each computational node in the model domain. The inversion is also aided by the use of regularization techniques. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). Julia is an advanced high-level scientific programming language that allows for efficient memory management and utilization of high-performance computational resources. By comparing with a Levenberg-Marquardt method using standard linear inversion techniques, our Levenberg-Marquardt method yields a speed-up ratio of 15 in a multi-core computational environment and a speed-up ratio of 45 in a single-core computational environment. Therefore, our new inverse modeling method is a
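The subspace recycling idea in the abstract above can be illustrated with a small NumPy sketch. It relies on the fact that the Krylov subspace built from JᵀJ and Jᵀr is unchanged when a multiple of the identity is added, so one Lanczos basis can serve every damping parameter. This is only an illustration under that assumption, not the MADS or Julia implementation; the function names are hypothetical and the sign of the step depends on how the residual r is defined.

```python
import numpy as np

def lanczos_basis(matvec, b, k):
    """Build k Lanczos vectors for a symmetric operator, with full reorthogonalization."""
    n = b.size
    V = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(max(k - 1, 0))
    beta0 = np.linalg.norm(b)
    V[:, 0] = b / beta0
    for j in range(k):
        w = matvec(V[:, j])
        alpha[j] = V[:, j] @ w
        # Remove components along all previous basis vectors.
        w -= V[:, : j + 1] @ (V[:, : j + 1].T @ w)
        if j + 1 < k:
            beta[j] = np.linalg.norm(w)
            V[:, j + 1] = w / beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return V, T, beta0

def lm_steps_recycled(J, r, dampings, k):
    """Approximate solutions of (J^T J + lam I) d = J^T r for every lam, reusing one basis."""
    g = J.T @ r
    V, T, beta0 = lanczos_basis(lambda v: J.T @ (J @ v), g, k)
    rhs = np.zeros(k)
    rhs[0] = beta0  # J^T r expressed in the Lanczos basis
    # Only a small k-by-k system is solved per damping parameter.
    return [V @ np.linalg.solve(T + lam * np.eye(k), rhs) for lam in dampings]
```

The expensive products with J happen once while the basis is built; each additional damping parameter costs only a k-by-k solve, which is where the reported savings would plausibly come from.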
CFD Analysis and Design Optimization Using Parallel Computers
NASA Technical Reports Server (NTRS)
Martinelli, Luigi; Alonso, Juan Jose; Jameson, Antony; Reuther, James
1997-01-01
A versatile and efficient multi-block method is presented for the simulation of both steady and unsteady flow, as well as aerodynamic design optimization of complete aircraft configurations. The compressible Euler and Reynolds Averaged Navier-Stokes (RANS) equations are discretized using a high resolution scheme on body-fitted structured meshes. An efficient multigrid implicit scheme is implemented for time-accurate flow calculations. Optimum aerodynamic shape design is achieved at very low cost using an adjoint formulation. The method is implemented on parallel computing systems using the MPI message passing interface standard to ensure portability. The results demonstrate that, by combining highly efficient algorithms with parallel computing, it is possible to perform detailed steady and unsteady analysis as well as automatic design for complex configurations using the present generation of parallel computers.
Computational methods for efficient structural reliability and reliability sensitivity analysis
NASA Technical Reports Server (NTRS)
Wu, Y.-T.
1993-01-01
This paper presents recent developments in efficient structural reliability analysis methods. The paper proposes an efficient, adaptive importance sampling (AIS) method that can be used to compute reliability and reliability sensitivities. The AIS approach uses a sampling density that is proportional to the joint PDF of the random variables. Starting from an initial approximate failure domain, sampling proceeds adaptively and incrementally with the goal of reaching a sampling domain that is slightly greater than the failure domain to minimize over-sampling in the safe region. Several reliability sensitivity coefficients are proposed that can be computed directly and easily from the above AIS-based failure points. These probability sensitivities can be used for identifying key random variables and for adjusting design to achieve reliability-based objectives. The proposed AIS methodology is demonstrated using a turbine blade reliability analysis problem.
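The core sampling idea in the abstract above, drawing from a density proportional to the original PDF but restricted to a domain slightly larger than the failure domain, can be sketched in one dimension. The adaptive, incremental growth of the sampling domain is not reproduced here; the fixed domain, the standard normal variable, and the function name are assumptions for illustration.

```python
import random
from statistics import NormalDist

def failure_probability(threshold, domain_start, n_samples, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by sampling only where X > domain_start.

    Requires domain_start <= threshold so the sampling domain covers the failure domain.
    """
    std_normal = NormalDist()
    rng = random.Random(seed)
    p_domain = std_normal.cdf(-domain_start)  # P(X > domain_start), by symmetry
    failures = 0
    for _ in range(n_samples):
        # Uniform upper-tail probability in (0, p_domain] maps to a truncated normal draw.
        v = p_domain * (1.0 - rng.random())
        x = -std_normal.inv_cdf(v)
        if x > threshold:
            failures += 1
    # Probability of the sampling domain times the failure fraction inside it.
    return p_domain * failures / n_samples
```

Because every sample already lies near the failure region, a few thousand draws resolve a probability of order 1e-5 that plain Monte Carlo would need millions of samples to see.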
NASA Technical Reports Server (NTRS)
Desmarais, R. N.
1982-01-01
This paper describes an accurate, economical method for generating approximations to the kernel of the integral equation relating unsteady pressure to normalwash in nonplanar flow. The method is capable of generating approximations of arbitrary accuracy. It is based on approximating the algebraic part of the nonelementary integrals in the kernel by exponential approximations and then integrating termwise. The exponent spacing in the approximation is a geometric sequence. The coefficients and exponent multiplier of the exponential approximation are computed by least squares so the method is completely automated. Exponential approximations generated in this manner are two orders of magnitude more accurate than the exponential approximation that is currently most often used for this purpose. Coefficients for 8, 12, 24, and 72 term approximations are tabulated in the report. Also, since the method is automated, it can be used to generate approximations to attain any desired trade-off between accuracy and computing cost.
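A least-squares fit with geometrically spaced exponents, as described in the abstract above, can be sketched as follows. The abstract does not state the function being approximated; the target 1 - u/sqrt(1 + u^2), the fixed exponent multiplier b, the ratio of 2, and the fitting grid are all assumptions here, and unlike the method described, this sketch does not optimize the multiplier.

```python
import numpy as np

def fit_exponentials(u, f, n_terms, b=0.2, ratio=2.0):
    """Fit f(u) ~ sum_n a_n * exp(-b * ratio**n * u) by linear least squares."""
    exponents = b * ratio ** np.arange(n_terms)  # geometric sequence of exponents
    basis = np.exp(-np.outer(u, exponents))      # one column per exponential term
    coeffs, *_ = np.linalg.lstsq(basis, f, rcond=None)
    residual = np.linalg.norm(basis @ coeffs - f)
    return exponents, coeffs, residual

# Assumed algebraic target function and fitting grid.
u = np.linspace(0.0, 10.0, 400)
f = 1.0 - u / np.sqrt(1.0 + u**2)
_, _, res4 = fit_exponentials(u, f, 4)
_, _, res12 = fit_exponentials(u, f, 12)
```

With the exponents fixed, the coefficients enter linearly, so the fit is a single linear solve; adding terms extends the same geometric sequence and cannot increase the least-squares residual on a given grid.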
NASA Technical Reports Server (NTRS)
White, C. W.
1981-01-01
The computational efficiency of the impedance type loads prediction method was studied. Three goals were addressed: devise a method to make the impedance method operate more efficiently in the computer; assess the accuracy and convenience of the method for determining the effect of design changes; and investigate the use of the method to identify design changes for reduction of payload loads. The method is suitable for calculation of dynamic response in either the frequency or time domain. It is concluded that: the choice of an orthogonal coordinate system will allow the impedance method to operate more efficiently in the computer; the approximate mode impedance technique is adequate for determining the effect of design changes, and is applicable for both statically determinate and statically indeterminate payload attachments; and beneficial design changes to reduce payload loads can be identified by the combined application of impedance techniques and energy distribution review techniques.
Deng, Nanjie; Zhang, Bin W.; Levy, Ronald M.
2015-01-01
The ability to accurately model solvent effects on free energy surfaces is important for understanding many biophysical processes including protein folding and misfolding, allosteric transitions and protein-ligand binding. Although all-atom simulations in explicit solvent can provide an accurate model for biomolecules in solution, explicit solvent simulations are hampered by the slow equilibration on rugged landscapes containing multiple basins separated by barriers. In many cases, implicit solvent models can be used to significantly speed up the conformational sampling; however, implicit solvent simulations do not fully capture the effects of a molecular solvent, and this can lead to loss of accuracy in the estimated free energies. Here we introduce a new approach to compute free energy changes in which the molecular details of explicit solvent simulations are retained while also taking advantage of the speed of the implicit solvent simulations. In this approach, the slow equilibration in explicit solvent, due to the long waiting times before barrier crossing, is avoided by using a thermodynamic cycle which connects the free energy basins in implicit solvent and explicit solvent using a localized decoupling scheme. We test this method by computing conformational free energy differences and solvation free energies of the model system alanine dipeptide in water. The free energy changes between basins in explicit solvent calculated using fully explicit solvent paths agree with the corresponding free energy differences obtained using the implicit/explicit thermodynamic cycle to within 0.3 kcal/mol out of ~3 kcal/mol at only ~8 % of the computational cost. We note that WHAM methods can be used to further improve the efficiency and accuracy of the explicit/implicit thermodynamic cycle. PMID:26236174
Deng, Nanjie; Zhang, Bin W; Levy, Ronald M
2015-06-09
The ability to accurately model solvent effects on free energy surfaces is important for understanding many biophysical processes including protein folding and misfolding, allosteric transitions, and protein–ligand binding. Although all-atom simulations in explicit solvent can provide an accurate model for biomolecules in solution, explicit solvent simulations are hampered by the slow equilibration on rugged landscapes containing multiple basins separated by barriers. In many cases, implicit solvent models can be used to significantly speed up the conformational sampling; however, implicit solvent simulations do not fully capture the effects of a molecular solvent, and this can lead to loss of accuracy in the estimated free energies. Here we introduce a new approach to compute free energy changes in which the molecular details of explicit solvent simulations are retained while also taking advantage of the speed of the implicit solvent simulations. In this approach, the slow equilibration in explicit solvent, due to the long waiting times before barrier crossing, is avoided by using a thermodynamic cycle which connects the free energy basins in implicit solvent and explicit solvent using a localized decoupling scheme. We test this method by computing conformational free energy differences and solvation free energies of the model system alanine dipeptide in water. The free energy changes between basins in explicit solvent calculated using fully explicit solvent paths agree with the corresponding free energy differences obtained using the implicit/explicit thermodynamic cycle to within 0.3 kcal/mol out of ∼3 kcal/mol at only ∼8% of the computational cost. We note that WHAM methods can be used to further improve the efficiency and accuracy of the implicit/explicit thermodynamic cycle.
NASA Technical Reports Server (NTRS)
Goodwin, Sabine A.; Raj, P.
1999-01-01
Progress to date towards the development and validation of a fast, accurate and cost-effective aeroelastic method for advanced parallel computing platforms such as the IBM SP2 and the SGI Origin 2000 is presented in this paper. The ENSAERO code, developed at the NASA-Ames Research Center has been selected for this effort. The code allows for the computation of aeroelastic responses by simultaneously integrating the Euler or Navier-Stokes equations and the modal structural equations of motion. To assess the computational performance and accuracy of the ENSAERO code, this paper reports the results of the Navier-Stokes simulations of the transonic flow over a flexible aeroelastic wing body configuration. In addition, a forced harmonic oscillation analysis in the frequency domain and an analysis in the time domain are done on a wing undergoing a rigid pitch and plunge motion. Finally, to demonstrate the ENSAERO flutter-analysis capability, aeroelastic Euler and Navier-Stokes computations on an L-1011 wind tunnel model including pylon, nacelle and empennage are underway. All computational solutions are compared with experimental data to assess the level of accuracy of ENSAERO. As the computations described above are performed, a meticulous log of computational performance in terms of wall clock time, execution speed, memory and disk storage is kept. Code scalability is also demonstrated by studying the impact of varying the number of processors on computational performance on the IBM SP2 and the Origin 2000 systems.
A generic efficient adaptive grid scheme for rocket propulsion modeling
NASA Technical Reports Server (NTRS)
Mo, J. D.; Chow, Alan S.
1993-01-01
The objective of this research is to develop an efficient, time-accurate numerical algorithm to discretize the Navier-Stokes equations for the predictions of internal one-, two-dimensional and axisymmetric flows. A generic, efficient, elliptic adaptive grid generator is implicitly coupled with the Lower-Upper factorization scheme in the development of ALUNS computer code. The calculations of one-dimensional shock tube wave propagation and two-dimensional shock wave capture, wave-wave interactions, shock wave-boundary interactions show that the developed scheme is stable, accurate and extremely robust. The adaptive grid generator produced a very favorable grid network by a grid speed technique. This generic adaptive grid generator is also applied in the PARC and FDNS codes and the computational results for solid rocket nozzle flowfield and crystal growth modeling by those codes will be presented in the conference, too. This research work is being supported by NASA/MSFC.
NASA Technical Reports Server (NTRS)
Guruswamy, G. P.; Goorjian, P. M.
1984-01-01
An efficient coordinate transformation technique is presented for constructing grids for unsteady, transonic aerodynamic computations for delta-type wings. The original shearing transformation yielded computations that were numerically unstable and this paper discusses the sources of those instabilities. The new shearing transformation yields computations that are stable, fast, and accurate. Comparisons of those two methods are shown for the flow over the F5 wing that demonstrate the new stability. Also, comparisons are made with experimental data that demonstrate the accuracy of the new method. The computations were made by using a time-accurate, finite-difference, alternating-direction-implicit (ADI) algorithm for the transonic small-disturbance potential equation.
Efficient Privacy-Aware Record Integration.
Kuzu, Mehmet; Kantarcioglu, Murat; Inan, Ali; Bertino, Elisa; Durham, Elizabeth; Malin, Bradley
2013-01-01
The integration of information dispersed among multiple repositories is a crucial step for accurate data analysis in various domains. In support of this goal, it is critical to devise procedures for identifying similar records across distinct data sources. At the same time, to adhere to privacy regulations and policies, such procedures should protect the confidentiality of the individuals to whom the information corresponds. Various private record linkage (PRL) protocols have been proposed to achieve this goal, involving secure multi-party computation (SMC) and similarity preserving data transformation techniques. SMC methods provide secure and accurate solutions to the PRL problem, but are prohibitively expensive in practice, mainly due to excessive computational requirements. Data transformation techniques offer more practical solutions, but incur the cost of information leakage and false matches. In this paper, we introduce a novel model for practical PRL, which 1) affords controlled and limited information leakage, 2) avoids false matches resulting from data transformation. Initially, we partition the data sources into blocks to eliminate comparisons for records that are unlikely to match. Then, to identify matches, we apply an efficient SMC technique between the candidate record pairs. To enable efficiency and privacy, our model leaks a controlled amount of obfuscated data prior to the secure computations. Applied obfuscation relies on differential privacy which provides strong privacy guarantees against adversaries with arbitrary background knowledge. In addition, we illustrate the practical nature of our approach through an empirical analysis with data derived from public voter records.
Optimized efficiency in InP nanowire solar cells with accurate 1D analysis
NASA Astrophysics Data System (ADS)
Chen, Yang; Kivisaari, Pyry; Pistol, Mats-Erik; Anttu, Nicklas
2018-01-01
Semiconductor nanowire arrays are a promising candidate for next generation solar cells due to enhanced absorption and reduced material consumption. However, to optimize their performance, time-consuming three-dimensional (3D) opto-electronics modeling is usually performed. Here, we develop an accurate one-dimensional (1D) modeling method for the analysis. The 1D modeling is about 400 times faster than 3D modeling and allows direct application of concepts from planar pn-junctions on the analysis of nanowire solar cells. We show that the superposition principle can break down in InP nanowires due to strong surface recombination in the depletion region, giving rise to an IV-behavior similar to that with low shunt resistance. Importantly, we find that the open-circuit voltage of nanowire solar cells is typically limited by contact leakage. Therefore, to increase the efficiency, we have investigated the effect of high-bandgap GaP carrier-selective contact segments at the top and bottom of the InP nanowire and we find that GaP contact segments improve the solar cell efficiency. Next, we discuss the merit of p-i-n and p-n junction concepts in nanowire solar cells. With GaP carrier selective top and bottom contact segments in the InP nanowire array, we find that a p-n junction design is superior to a p-i-n junction design. We predict a best efficiency of 25% for a surface recombination velocity of 4500 cm s⁻¹, corresponding to a non-radiative lifetime of 1 ns in p-n junction cells. The developed 1D model can be used for general modeling of axial p-n and p-i-n junctions in semiconductor nanowires. This includes also LED applications and we expect faster progress in device modeling using our method.
Optimized efficiency in InP nanowire solar cells with accurate 1D analysis.
Chen, Yang; Kivisaari, Pyry; Pistol, Mats-Erik; Anttu, Nicklas
2018-01-26
Semiconductor nanowire arrays are a promising candidate for next generation solar cells due to enhanced absorption and reduced material consumption. However, to optimize their performance, time-consuming three-dimensional (3D) opto-electronics modeling is usually performed. Here, we develop an accurate one-dimensional (1D) modeling method for the analysis. The 1D modeling is about 400 times faster than 3D modeling and allows direct application of concepts from planar pn-junctions on the analysis of nanowire solar cells. We show that the superposition principle can break down in InP nanowires due to strong surface recombination in the depletion region, giving rise to an IV-behavior similar to that with low shunt resistance. Importantly, we find that the open-circuit voltage of nanowire solar cells is typically limited by contact leakage. Therefore, to increase the efficiency, we have investigated the effect of high-bandgap GaP carrier-selective contact segments at the top and bottom of the InP nanowire and we find that GaP contact segments improve the solar cell efficiency. Next, we discuss the merit of p-i-n and p-n junction concepts in nanowire solar cells. With GaP carrier selective top and bottom contact segments in the InP nanowire array, we find that a p-n junction design is superior to a p-i-n junction design. We predict a best efficiency of 25% for a surface recombination velocity of 4500 cm s⁻¹, corresponding to a non-radiative lifetime of 1 ns in p-n junction cells. The developed 1D model can be used for general modeling of axial p-n and p-i-n junctions in semiconductor nanowires. This includes also LED applications and we expect faster progress in device modeling using our method.
Computationally efficient algorithm for Gaussian Process regression in case of structured samples
NASA Astrophysics Data System (ADS)
Belyaev, M.; Burnaev, E.; Kapushev, Y.
2016-04-01
Surrogate modeling is widely used in many engineering problems. Data sets often have Cartesian product structure (for instance factorial design of experiments with missing points). In such cases the size of the data set can be very large. Therefore, one of the most popular algorithms for approximation, Gaussian Process regression, can hardly be applied due to its computational complexity. In this paper a computationally efficient approach for constructing Gaussian Process regression in case of data sets with Cartesian product structure is presented. Efficiency is achieved by using a special structure of the data set and operations with tensors. The proposed algorithm has low computational as well as memory complexity compared to existing algorithms. In this work we also introduce a regularization procedure that takes into account anisotropy of the data set and avoids degeneracy of the regression model.
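How Cartesian product structure reduces the cost of Gaussian Process regression can be sketched for the simplest case: a complete two-factor grid with a product kernel, where the covariance matrix is a Kronecker product and never needs to be formed. The handling of missing points and the regularization procedure from the abstract are not covered; the RBF kernel, the noise level, and the function names are assumptions for illustration.

```python
import numpy as np

def rbf(x, length):
    """Squared-exponential kernel matrix for 1-D inputs x."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def kron_gp_weights(K1, K2, Y, noise):
    """Solve (kron(K1, K2) + noise * I) vec(A) = vec(Y) without forming the Kronecker product.

    Y has shape (n1, n2), one row per level of factor 1; vec() is row-major flattening.
    """
    l1, Q1 = np.linalg.eigh(K1)
    l2, Q2 = np.linalg.eigh(K2)
    Yt = Q1.T @ Y @ Q2                        # rotate data into the joint eigenbasis
    Yt /= l1[:, None] * l2[None, :] + noise   # divide by the eigenvalues of the full matrix
    return Q1 @ Yt @ Q2.T                     # rotate back
```

For an n1 by n2 grid this replaces one factorization of an (n1 n2) by (n1 n2) matrix with two small eigendecompositions and a few matrix products, which is the kind of tensor operation the abstract alludes to.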
Computationally-Efficient Minimum-Time Aircraft Routes in the Presence of Winds
NASA Technical Reports Server (NTRS)
Jardin, Matthew R.
2004-01-01
A computationally efficient algorithm for minimizing the flight time of an aircraft in a variable wind field has been invented. The algorithm, referred to as Neighboring Optimal Wind Routing (NOWR), is based upon neighboring-optimal-control (NOC) concepts and achieves minimum-time paths by adjusting aircraft heading according to wind conditions at an arbitrary number of wind measurement points along the flight route. The NOWR algorithm may either be used in a fast-time mode to compute minimum-time routes prior to flight, or may be used in a feedback mode to adjust aircraft heading in real-time. By traveling minimum-time routes instead of direct great-circle (direct) routes, flights across the United States can save an average of about 7 minutes, and as much as one hour of flight time during periods of strong jet-stream winds. The neighboring optimal routes computed via the NOWR technique have been shown to be within 1.5 percent of the absolute minimum-time routes for flights across the continental United States. On a typical 450-MHz Sun Ultra workstation, the NOWR algorithm produces complete minimum-time routes in less than 40 milliseconds. This corresponds to a rate of 25 optimal routes per second. The closest comparable optimization technique runs approximately 10 times slower. Airlines currently use various trial-and-error search techniques to determine which of a set of commonly traveled routes will minimize flight time. These algorithms are too computationally expensive for use in real-time systems, or in systems where many optimal routes need to be computed in a short amount of time. Instead of operating in real-time, airlines will typically plan a trajectory several hours in advance using wind forecasts. If winds change significantly from forecasts, the resulting flights will no longer be minimum-time. The need for a computationally efficient wind-optimal routing algorithm is even greater in the case of new air-traffic-control automation concepts. For air
An energy-efficient failure detector for vehicular cloud computing.
Liu, Jiaxi; Wu, Zhibo; Dong, Jian; Wu, Jin; Wen, Dongxin
2018-01-01
Failure detectors are one of the fundamental components for maintaining the high availability of vehicular cloud computing. In vehicular cloud computing, many roadside units (RSUs) are deployed along the road to improve connectivity. Many of them are equipped with solar batteries because wired electrical power is unavailable or too expensive. It is therefore important to reduce the battery consumption of RSUs. However, the existing failure detection algorithms are not designed to reduce the battery consumption of RSUs. To solve this problem, a new energy-efficient failure detector, 2E-FD, has been proposed specifically for vehicular cloud computing. 2E-FD not only provides acceptable failure detection service, but also reduces the battery consumption of RSUs. Comparative experiments show that our failure detector has better performance in terms of speed, accuracy and battery consumption.
An energy-efficient failure detector for vehicular cloud computing
Liu, Jiaxi; Wu, Zhibo; Wu, Jin; Wen, Dongxin
2018-01-01
Failure detectors are one of the fundamental components for maintaining the high availability of vehicular cloud computing. In vehicular cloud computing, many roadside units (RSUs) are deployed along the road to improve connectivity. Many of them are equipped with solar batteries because wired electrical power is unavailable or too expensive. It is therefore important to reduce the battery consumption of RSUs. However, the existing failure detection algorithms are not designed to reduce the battery consumption of RSUs. To solve this problem, a new energy-efficient failure detector, 2E-FD, has been proposed specifically for vehicular cloud computing. 2E-FD not only provides acceptable failure detection service, but also reduces the battery consumption of RSUs. Comparative experiments show that our failure detector has better performance in terms of speed, accuracy and battery consumption. PMID:29352282
Efficient alignment-free DNA barcode analytics.
Kuksa, Pavel; Pavlovic, Vladimir
2009-11-10
In this work we consider barcode DNA analysis problems and address them using alternative, alignment-free methods and representations which model sequences as collections of short sequence fragments (features). The methods use fixed-length representations (spectrum) for barcode sequences to measure similarities or dissimilarities between sequences coming from the same or different species. The spectrum-based representation not only allows for accurate and computationally efficient species classification, but also opens the possibility of accurate clustering analysis of putative species barcodes and identification of critical within-barcode loci distinguishing barcodes of different sample groups. New alignment-free methods provide highly accurate and fast DNA barcode-based identification and classification of species with substantial improvements in accuracy and speed over state-of-the-art barcode analysis methods. We evaluate our methods on problems of species classification and identification using barcodes, important and relevant analytical tasks in many practical applications (adverse species movement monitoring, sampling surveys for unknown or pathogenic species identification, biodiversity assessment, etc.). On several benchmark barcode datasets, including ACG, Astraptes, Hesperiidae, Fish larvae, and Birds of North America, the proposed alignment-free methods considerably improve prediction accuracy compared to prior results. We also observe significant running time improvements over the state-of-the-art methods. Our results show that newly developed alignment-free methods for DNA barcoding can efficiently and with high accuracy identify specimens by examining only a few barcode features, resulting in increased scalability and interpretability of current computational approaches to barcoding.
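The fixed-length spectrum representation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the plain k-mer count spectrum and a cosine similarity; the function names, the choice of k, and the similarity measure are illustrative assumptions, not the authors' implementation, which the abstract does not specify.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_spectrum(seq, k=3, alphabet="ACGT"):
    """Return a fixed-length vector of k-mer counts (length len(alphabet)**k).

    Windows containing symbols outside the alphabet (e.g. 'N') are counted
    in the Counter but never read back, so they are effectively ignored.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts["".join(p)] for p in product(alphabet, repeat=k)]


def cosine_similarity(u, v):
    """Cosine of the angle between two spectrum vectors (0.0 if either is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy sequences, not real barcode data.
s1 = kmer_spectrum("ACGTACGTAC", k=2)
s2 = kmer_spectrum("ACGTTCGTAC", k=2)
print(cosine_similarity(s1, s2))
```

Because every sequence maps to a vector of the same length regardless of its own length, no alignment is needed before comparing two barcodes.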
Efficient alignment-free DNA barcode analytics
Kuksa, Pavel; Pavlovic, Vladimir
2009-01-01
Background: In this work we consider barcode DNA analysis problems and address them using alternative, alignment-free methods and representations which model sequences as collections of short sequence fragments (features). The methods use fixed-length representations (spectrum) for barcode sequences to measure similarities or dissimilarities between sequences coming from the same or different species. The spectrum-based representation not only allows for accurate and computationally efficient species classification, but also opens the possibility of accurate clustering analysis of putative species barcodes and identification of critical within-barcode loci distinguishing barcodes of different sample groups. Results: New alignment-free methods provide highly accurate and fast DNA barcode-based identification and classification of species with substantial improvements in accuracy and speed over state-of-the-art barcode analysis methods. We evaluate our methods on problems of species classification and identification using barcodes, important and relevant analytical tasks in many practical applications (adverse species movement monitoring, sampling surveys for unknown or pathogenic species identification, biodiversity assessment, etc.). On several benchmark barcode datasets, including ACG, Astraptes, Hesperiidae, Fish larvae, and Birds of North America, the proposed alignment-free methods considerably improve prediction accuracy compared to prior results. We also observe significant running time improvements over the state-of-the-art methods. Conclusion: Our results show that newly developed alignment-free methods for DNA barcoding can efficiently and with high accuracy identify specimens by examining only a few barcode features, resulting in increased scalability and interpretability of current computational approaches to barcoding. PMID:19900305
A fast and accurate frequency estimation algorithm for sinusoidal signal with harmonic components
NASA Astrophysics Data System (ADS)
Hu, Jinghua; Pan, Mengchun; Zeng, Zhidun; Hu, Jiafei; Chen, Dixiang; Tian, Wugang; Zhao, Jianqiang; Du, Qingfa
2016-10-01
Frequency estimation is a fundamental problem in many applications, such as traditional vibration measurement, power system supervision, and microelectromechanical system sensor control. In this paper, a fast and accurate frequency estimation algorithm is proposed to address the low efficiency of traditional methods. The proposed algorithm consists of coarse and fine frequency estimation steps, and we demonstrate that applying a modified zero-crossing technique achieves coarse frequency estimation (locating the peak of the FFT amplitude) more efficiently than conventional searching methods. Thus, the proposed estimation algorithm requires fewer hardware and software resources and can achieve even higher efficiency as the amount of experimental data increases. Experimental results with a modulated magnetic signal show that the root mean square error of frequency estimation is below 0.032 Hz with the proposed algorithm, which has lower computational complexity and better global performance than conventional frequency estimation methods.
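The coarse estimation step can be sketched with an ordinary zero-crossing counter. The abstract does not describe how the authors modify the technique, so the code below is the textbook version with linear interpolation of each crossing instant, under the assumption that the harmonics are weak enough not to create extra zero crossings. The function name and the test signal are hypothetical.

```python
import math


def zero_crossing_frequency(samples, fs):
    """Estimate the fundamental frequency (Hz) from rising zero crossings.

    samples: sequence of real values sampled uniformly at fs Hz.
    """
    crossings = []
    for n in range(1, len(samples)):
        a, b = samples[n - 1], samples[n]
        if a < 0.0 <= b:
            # Linearly interpolate the instant at which the signal hits zero.
            crossings.append((n - 1 + a / (a - b)) / fs)
    if len(crossings) < 2:
        raise ValueError("need at least two rising zero crossings")
    # Whole periods elapsed between the first and last rising crossing.
    periods = len(crossings) - 1
    return periods / (crossings[-1] - crossings[0])


# Synthetic test signal: fundamental plus a weak third harmonic (assumed values).
fs = 10_000.0
f0 = 50.3
signal = [
    math.sin(2 * math.pi * f0 * n / fs + 0.4)
    + 0.1 * math.sin(2 * math.pi * 3 * f0 * n / fs)
    for n in range(int(fs))
]
print(zero_crossing_frequency(signal, fs))
```

The cost is a single pass over the samples, which is why a zero-crossing step is attractive as a replacement for a peak search over FFT bins; a fine estimation stage would then refine this coarse value.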
Han, Buhm; Kang, Hyun Min; Eskin, Eleazar
2009-01-01
With the development of high-throughput sequencing and genotyping technologies, the number of markers collected in genetic association studies is growing rapidly, increasing the importance of methods for correcting for multiple hypothesis testing. The permutation test is widely considered the gold standard for accurate multiple testing correction, but it is often computationally impractical for these large datasets. Recently, several studies proposed efficient alternative approaches to the permutation test based on the multivariate normal distribution (MVN). However, they cannot accurately correct for multiple testing in genome-wide association studies for two reasons. First, these methods require partitioning of the genome into many disjoint blocks and ignore all correlations between markers from different blocks. Second, the true null distribution of the test statistic often fails to follow the asymptotic distribution at the tails of the distribution. We propose an accurate and efficient method for multiple testing correction in genome-wide association studies—SLIDE. Our method accounts for all correlation within a sliding window and corrects for the departure of the true null distribution of the statistic from the asymptotic distribution. In simulations using the Wellcome Trust Case Control Consortium data, the error rate of SLIDE's corrected p-values is more than 20 times smaller than the error rate of the previous MVN-based methods' corrected p-values, while SLIDE is orders of magnitude faster than the permutation test and other competing methods. We also extend the MVN framework to the problem of estimating the statistical power of an association study with correlated markers and propose an efficient and accurate power estimation method SLIP. SLIP and SLIDE are available at http://slide.cs.ucla.edu. PMID:19381255
Efficient computation of parameter sensitivities of discrete stochastic chemical reaction networks.
Rathinam, Muruhan; Sheppard, Patrick W; Khammash, Mustafa
2010-01-21
Parametric sensitivity of biochemical networks is an indispensable tool for studying system robustness properties, estimating network parameters, and identifying targets for drug therapy. For discrete stochastic representations of biochemical networks where Monte Carlo methods are commonly used, sensitivity analysis can be particularly challenging, as accurate finite difference computations of sensitivity require a large number of simulations for both nominal and perturbed values of the parameters. In this paper we introduce the common random number (CRN) method in conjunction with Gillespie's stochastic simulation algorithm, which exploits positive correlations obtained by using CRNs for nominal and perturbed parameters. We also propose a new method called the common reaction path (CRP) method, which uses CRNs together with the random time change representation of discrete state Markov processes due to Kurtz to estimate the sensitivity via a finite difference approximation applied to coupled reaction paths that emerge naturally in this representation. While both methods reduce the variance of the estimator significantly compared to independent random number finite difference implementations, numerical evidence suggests that the CRP method achieves a greater variance reduction. We also provide some theoretical basis for the superior performance of CRP. The improved accuracy of these methods allows for much more efficient sensitivity estimation. In two example systems reported in this work, speedup factors greater than 300 and 10,000 are demonstrated.
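The variance reduction from common random numbers can be seen in a small experiment. The sketch below assumes a simple birth-death network (production at rate k, degradation at rate gamma per molecule) and a forward finite difference in k; the network, parameter values, and function names are illustrative choices, not taken from the paper. It covers only the CRN idea with Gillespie's algorithm, not the CRP method based on the random time change representation.

```python
import random
import statistics


def ssa_birth_death(k, gamma, t_end, rng):
    """Gillespie SSA for 0 -> X (rate k) and X -> 0 (rate gamma * x).

    Starts from x = 0 and returns the copy number at time t_end.
    """
    t, x = 0.0, 0
    while True:
        a_birth, a_death = k, gamma * x
        a_total = a_birth + a_death
        t += rng.expovariate(a_total)  # time to the next reaction
        if t > t_end:
            return x
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1


def fd_sensitivity_samples(k, gamma, t_end, h, n_runs, common):
    """Per-run finite-difference estimates of d E[X(t_end)] / dk.

    common=True seeds the nominal and perturbed runs identically (CRN);
    common=False uses unrelated seeds (independent random numbers).
    """
    samples = []
    for i in range(n_runs):
        seed_perturbed = i if common else n_runs + i
        x_nominal = ssa_birth_death(k, gamma, t_end, random.Random(i))
        x_perturbed = ssa_birth_death(k + h, gamma, t_end, random.Random(seed_perturbed))
        samples.append((x_perturbed - x_nominal) / h)
    return samples


crn = fd_sensitivity_samples(10.0, 1.0, 2.0, 0.1, 2000, common=True)
ind = fd_sensitivity_samples(10.0, 1.0, 2.0, 0.1, 2000, common=False)
print("CRN:        ", statistics.mean(crn), statistics.variance(crn))
print("independent:", statistics.mean(ind), statistics.variance(ind))
```

With a shared seed the nominal and perturbed trajectories stay identical until the small rate change alters a reaction choice, so most per-run differences are zero and the estimator variance drops accordingly.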
Accurate determination of the charge transfer efficiency of photoanodes for solar water splitting.
Klotz, Dino; Grave, Daniel A; Rothschild, Avner
2017-08-09
The oxygen evolution reaction (OER) at the surface of semiconductor photoanodes is critical for photoelectrochemical water splitting. This reaction involves photo-generated holes that oxidize water via charge transfer at the photoanode/electrolyte interface. However, a certain fraction of the holes that reach the surface recombine with electrons from the conduction band, giving rise to the surface recombination loss. The charge transfer efficiency, η_t, defined as the ratio between the flux of holes that contribute to the water oxidation reaction and the total flux of holes that reach the surface, is an important parameter that helps to distinguish between bulk and surface recombination losses. However, accurate determination of η_t by conventional voltammetry measurements is complicated because only the total current is measured and it is difficult to discern between different contributions to the current. Chopped light measurement (CLM) and hole scavenger measurement (HSM) techniques are widely employed to determine η_t, but they often lead to errors resulting from instrumental as well as fundamental limitations. Intensity modulated photocurrent spectroscopy (IMPS) is better suited for accurate determination of η_t because it provides direct information on both the total photocurrent and the surface recombination current. However, careful analysis of IMPS measurements at different light intensities is required to account for nonlinear effects. This work compares the η_t values obtained by these methods using heteroepitaxial thin-film hematite photoanodes as a case study. We show that a wide spread of η_t values is obtained by different analysis methods, and even within the same method different values may be obtained depending on instrumental and experimental conditions such as the light source and light intensity. Statistical analysis of the results obtained for our model hematite photoanode shows good correlation between different methods for
2016-01-01
Molecular recognition by proteins mostly occurs in a local region on the protein surface. Thus, an efficient computational method for accurate characterization of protein local structural conservation is necessary to better understand biology and drug design. We present a novel local structure alignment tool, G‐LoSA. G‐LoSA aligns protein local structures in a sequence order independent way and provides a GA‐score, a chemical feature‐based and size‐independent structure similarity score. Our benchmark validation shows the robust performance of G‐LoSA on local structures of diverse sizes and characteristics, demonstrating its universal applicability to local structure‐centric comparative biology studies. In particular, G‐LoSA is highly effective in detecting conserved local regions on the entire surface of a given protein. In addition, the applications of G‐LoSA to identifying template ligands and predicting ligand and protein binding sites illustrate its strong potential for computer‐aided drug design. We hope that G‐LoSA can be a useful computational method for exploring interesting biological problems through large‐scale comparison of protein local structures and facilitating drug discovery research and development. G‐LoSA is freely available to academic users at http://im.compbio.ku.edu/GLoSA/. PMID:26813336
Multigrid Computations of 3-D Incompressible Internal and External Viscous Rotating Flows
NASA Technical Reports Server (NTRS)
Sheng, Chunhua; Taylor, Lafayette K.; Chen, Jen-Ping; Jiang, Min-Yee; Whitfield, David L.
1996-01-01
This report presents multigrid methods for solving the 3-D incompressible viscous rotating flows in a NASA low-speed centrifugal compressor and a marine propeller 4119. Numerical formulations are given in both the rotating reference frame and the absolute frame. Comparisons are made for the accuracy, efficiency, and robustness between the steady-state scheme and the time-accurate scheme for simulating viscous rotating flows for complex internal and external flow applications. Prospects for further increase in efficiency and accuracy of unsteady time-accurate computations are discussed.
A computationally efficient scheme for the non-linear diffusion equation
NASA Astrophysics Data System (ADS)
Termonia, P.; Van de Vyver, H.
2009-04-01
This Letter proposes a new numerical scheme for integrating the non-linear diffusion equation. It is shown that it is linearly stable. Some tests are presented comparing this scheme to a popular decentered version of the linearized Crank-Nicolson scheme, showing that, although the new scheme is slightly less accurate in treating the highly resolved waves, it (i) better treats highly non-linear systems, (ii) better handles the short waves, (iii) for a given test bed turns out to be three to four times cheaper computationally, and (iv) is easier to implement.
Oltean, Gabriel; Ivanciu, Laura-Nicoleta
2016-01-01
The design and verification of complex electronic systems, especially the analog and mixed-signal ones, prove to be extremely time-consuming tasks if only circuit-level simulations are involved. A significant amount of time can be saved if a cost-effective solution is used for the extensive analysis of the system under all conceivable conditions. This paper proposes a data-driven method to build fast-to-evaluate yet accurate metamodels capable of generating not-yet-simulated waveforms as a function of different combinations of the parameters of the system. The necessary data are obtained by early-stage simulation of an electronic control system from the automotive industry. The metamodel development is based on three key elements: a wavelet transform for waveform characterization, a genetic algorithm optimization to detect the optimal wavelet transform and to identify the most relevant decomposition coefficients, and an artificial neural network to derive the relevant coefficients of the wavelet transform for any new parameter combination. The resulting metamodels for three different waveform families are fully reliable. They satisfy the required key points: high accuracy (a maximum mean squared error of 7.1×10^-5 for the unity-based normalized waveforms), efficiency (fully affordable computational effort for metamodel build-up: maximum 18 minutes on a general purpose computer), and simplicity (less than 1 second for running the metamodel; the user only provides the parameter combination). The metamodels can be used for very efficient generation of new waveforms, for any possible combination of dependent parameters, offering the possibility to explore the entire design space. A wide range of possibilities becomes achievable for the user, such as: all design corners can be analyzed, possible worst-case situations can be investigated, extreme values of waveforms can be discovered, sensitivity analyses can be performed (the influence of each parameter on the
Zhang, Xiaopu; Lin, Jun; Chen, Zubin; Sun, Feng; Zhu, Xi; Fang, Gengfa
2018-06-05
Microseismic monitoring is one of the most critical technologies for hydraulic fracturing in oil and gas production. To detect events in an accurate and efficient way, there are two major challenges. One challenge is how to achieve high accuracy due to a poor signal-to-noise ratio (SNR). The other one is concerned with real-time data transmission. Taking these challenges into consideration, an edge-computing-based platform, namely Edge-to-Center LearnReduce, is presented in this work. The platform consists of a data center with many edge components. At the data center, a neural network model combined with convolutional neural network (CNN) and long short-term memory (LSTM) is designed and this model is trained by using previously obtained data. Once the model is fully trained, it is sent to edge components for events detection and data reduction. At each edge component, a probabilistic inference is added to the neural network model to improve its accuracy. Finally, the reduced data is delivered to the data center. Based on experiment results, a high detection accuracy (over 96%) with less transmitted data (about 90%) was achieved by using the proposed approach on a microseismic monitoring system. These results show that the platform can simultaneously improve the accuracy and efficiency of microseismic monitoring.
NASA Technical Reports Server (NTRS)
Loh, Ching Y.; Jorgenson, Philip C. E.
2007-01-01
A time-accurate, upwind, finite volume method for computing compressible flows on unstructured grids is presented. The method is second order accurate in space and time and yields high resolution in the presence of discontinuities. For efficiency, the Roe approximate Riemann solver with an entropy correction is employed. In the basic Euler/Navier-Stokes scheme, many concepts of high order upwind schemes are adopted: the surface flux integrals are carefully treated, a Cauchy-Kowalewski time-stepping scheme is used in the time-marching stage, and a multidimensional limiter is applied in the reconstruction stage. However even with these up-to-date improvements, the basic upwind scheme is still plagued by the so-called "pathological behaviors," e.g., the carbuncle phenomenon, the expansion shock, etc. A solution to these limitations is presented which uses a very simple dissipation model while still preserving second order accuracy. This scheme is referred to as the enhanced time-accurate upwind (ETAU) scheme in this paper. The unstructured grid capability renders flexibility for use in complex geometry; and the present ETAU Euler/Navier-Stokes scheme is capable of handling a broad spectrum of flow regimes from high supersonic to subsonic at very low Mach number, appropriate for both CFD (computational fluid dynamics) and CAA (computational aeroacoustics). Numerous examples are included to demonstrate the robustness of the methods.
NASA Astrophysics Data System (ADS)
MacDonald, Christopher L.; Bhattacharya, Nirupama; Sprouse, Brian P.; Silva, Gabriel A.
2015-09-01
Computing numerical solutions to fractional differential equations can be computationally intensive due to the effect of non-local derivatives in which all previous time points contribute to the current iteration. In general, numerical approaches that depend on truncating part of the system history, while efficient, can suffer from high degrees of error and inaccuracy. Here we present an adaptive time step memory method for smooth functions applied to the Grünwald-Letnikov fractional diffusion derivative. This method is computationally efficient and results in smaller errors during numerical simulations. Sampled points along the system's history at progressively longer intervals are assumed to reflect the values of neighboring time points. By including progressively fewer points backward in time, a temporally 'weighted' history is computed that includes contributions from the entire past of the system, maintaining accuracy, but with fewer points actually calculated, greatly improving computational efficiency.
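To show why the full history is expensive, here is a minimal sketch of the standard Grünwald-Letnikov approximation, in which every past sample contributes to the derivative at the current time. The optional `memory` argument implements plain truncation of the history (the kind of shortcut the abstract says can introduce error); the adaptive time step memory method itself is not reproduced here. Function names and the test function are assumptions for illustration.

```python
import math


def gl_weights(alpha, n):
    """Grünwald-Letnikov weights w_k = (-1)^k * binom(alpha, k) for k = 0..n.

    Uses the recurrence w_0 = 1, w_k = w_{k-1} * (1 - (alpha + 1) / k).
    """
    w = [1.0]
    for k in range(1, n + 1):
        w.append(w[-1] * (1.0 - (alpha + 1.0) / k))
    return w


def gl_derivative(values, alpha, h, memory=None):
    """Approximate the order-alpha derivative at the last sample.

    values[j] holds f(j * h). If memory is given, only the most recent
    `memory` steps of history are used (simple truncation).
    """
    n = len(values) - 1
    if memory is not None:
        n = min(n, memory)
    w = gl_weights(alpha, n)
    total = sum(w[k] * values[-1 - k] for k in range(n + 1))
    return total / h ** alpha


# Half-derivative of f(t) = t at t = 1; the exact value is 1 / Gamma(1.5).
h = 0.001
samples = [j * h for j in range(1001)]
print(gl_derivative(samples, 0.5, h), 1.0 / math.gamma(1.5))
print(gl_derivative(samples, 0.5, h, memory=100))  # truncated history
```

Each evaluation with full memory costs work proportional to the number of past steps, so a simulation of N steps costs on the order of N squared operations, which is the cost that sparse sampling of the history aims to reduce.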
Efficient scatter model for simulation of ultrasound images from computed tomography data
NASA Astrophysics Data System (ADS)
D'Amato, J. P.; Lo Vercio, L.; Rubi, P.; Fernandez Vera, E.; Barbuzza, R.; Del Fresno, M.; Larrabide, I.
2015-12-01
Background and motivation: Real-time ultrasound simulation refers to the process of computationally creating fully synthetic ultrasound images instantly. Due to the high value of specialized low-cost training for healthcare professionals, there is a growing interest in the use of this technology and the development of high-fidelity systems that simulate the acquisition of echographic images. The objective is to create an efficient and reproducible simulator that can run either on notebooks or desktops using low-cost devices. Materials and methods: We present an interactive ultrasound simulator based on CT data. This simulator is based on ray-casting and provides real-time interaction capabilities. The simulation of scattering that is coherent with the transducer position in real time is also introduced. Such noise is produced using a simplified model of multiplicative noise and convolution with point spread functions (PSF) tailored for this purpose. Results: The generation of scattering maps was revised to improve its computational performance. This allowed a more efficient simulation of coherent scattering in the synthetic echographic images while providing highly realistic results. We describe some quality and performance metrics to validate these results, where a performance of up to 55 fps was achieved. Conclusion: The proposed technique for real-time scattering modeling provides realistic yet computationally efficient scatter distributions. The error between the original image and the simulated scattering image was compared for the proposed method and the state-of-the-art, showing negligible differences in its distribution.
Energy efficient hybrid computing systems using spin devices
NASA Astrophysics Data System (ADS)
Sharad, Mrigank
Emerging spin-devices like magnetic tunnel junctions (MTJs), spin-valves and domain wall magnets (DWM) have opened new avenues for spin-based logic design. This work explored potential computing applications which can exploit such devices for higher energy-efficiency and performance. The proposed applications involve hybrid design schemes, where charge-based devices supplement the spin-devices, to gain large benefits at the system level. As an example, lateral spin valves (LSV) involve switching of nanomagnets using spin-polarized current injection through a metallic channel such as Cu. Such spin-torque based devices possess several interesting properties that can be exploited for ultra-low power computation. The analog characteristics of spin current facilitate non-Boolean computation like majority evaluation that can be used to model a neuron. The magneto-metallic neurons can operate at an ultra-low terminal voltage of ~20 mV, thereby resulting in small computation power. Moreover, since nano-magnets inherently act as memory elements, these devices can facilitate integration of logic and memory in interesting ways. The spin-based neurons can be integrated with CMOS and other emerging devices, leading to different classes of neuromorphic/non-Von-Neumann architectures. The spin-based designs involve 'mixed-mode' processing and hence can provide very compact and ultra-low energy solutions for complex computation blocks, both digital and analog. Such low-power, hybrid designs can be suitable for various data processing applications like cognitive computing, associative memory, and current-mode on-chip global interconnects. Simulation results for these applications, based on a device-circuit co-simulation framework, predict more than ~100x improvement in computation energy as compared to state-of-the-art CMOS design, for optimal spin-device parameters.
Adding computationally efficient realism to Monte Carlo turbulence simulation
NASA Technical Reports Server (NTRS)
Campbell, C. W.
1985-01-01
Frequently in aerospace vehicle flight simulation, random turbulence is generated using the assumption that the craft is small compared to the length scales of turbulence. The turbulence is presumed to vary only along the flight path of the vehicle but not across the vehicle span. The addition of the realism of three-dimensionality is a worthy goal, but any such attempt will not gain acceptance in the simulator community unless it is computationally efficient. A concept for adding three-dimensional realism with a minimum of computational complexity is presented. The concept involves the use of close rational approximations to irrational spectra and cross-spectra so that systems of stable, explicit difference equations can be used to generate the turbulence.
Efficient High-Pressure State Equations
NASA Technical Reports Server (NTRS)
Harstad, Kenneth G.; Miller, Richard S.; Bellan, Josette
1997-01-01
A method is presented for a relatively accurate, noniterative, computationally efficient calculation of high-pressure fluid-mixture equations of state, especially targeted to gas turbines and rocket engines. Pressures above 1 bar and temperatures above 100 K are addressed. The method is based on curve fitting an effective reference state relative to departure functions formed using the Peng-Robinson cubic state equation. Fit parameters for H2, O2, N2, propane, methane, n-heptane, and methanol are given.
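For orientation, the Peng-Robinson cubic equation of state that the method builds on can be evaluated explicitly for pressure, as in the sketch below. This shows only the standard pure-fluid form of the equation; the curve-fitted reference state, the departure functions, and the mixture treatment of the report are not reproduced. The nitrogen critical constants are approximate textbook values used as an assumption, and the function name is hypothetical.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol K)


def peng_robinson_pressure(T, v, Tc, Pc, omega):
    """Pressure (Pa) from the Peng-Robinson equation of state.

    T: temperature (K), v: molar volume (m^3/mol),
    Tc, Pc: critical temperature (K) and pressure (Pa), omega: acentric factor.
    """
    a = 0.45724 * R**2 * Tc**2 / Pc
    b = 0.07780 * R * Tc / Pc
    kappa = 0.37464 + 1.54226 * omega - 0.26992 * omega**2
    alpha = (1.0 + kappa * (1.0 - math.sqrt(T / Tc))) ** 2
    return R * T / (v - b) - a * alpha / (v**2 + 2.0 * b * v - b**2)


# Approximate critical constants for N2 (assumed values for illustration).
N2 = {"Tc": 126.2, "Pc": 3.40e6, "omega": 0.037}

v = 1.0e-3  # molar volume in m^3/mol
p = peng_robinson_pressure(300.0, v, **N2)
print(p, p * v / (R * 300.0))  # pressure and compressibility factor
```

Given temperature and molar volume the pressure follows without iteration; solving for volume at a given pressure instead requires the roots of a cubic, which is one reason a noniterative formulation is attractive inside a flow solver.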
Computational Methods for Configurational Entropy Using Internal and Cartesian Coordinates.
Hikiri, Simon; Yoshidome, Takashi; Ikeguchi, Mitsunori
2016-12-13
The configurational entropy of solute molecules is a crucially important quantity to study various biophysical processes. Consequently, it is necessary to establish an efficient quantitative computational method to calculate configurational entropy as accurately as possible. In the present paper, we investigate the quantitative performance of the quasi-harmonic and related computational methods, including widely used methods implemented in popular molecular dynamics (MD) software packages, compared with the Clausius method, which is capable of accurately computing the change of the configurational entropy upon temperature change. Notably, we focused on the choice of the coordinate systems (i.e., internal or Cartesian coordinates). The Boltzmann-quasi-harmonic (BQH) method using internal coordinates outperformed all the six methods examined here. The introduction of improper torsions in the BQH method improves its performance, and anharmonicity of proper torsions in proteins is identified to be the origin of the superior performance of the BQH method. In contrast, widely used methods implemented in MD packages show rather poor performance. In addition, the enhanced sampling of replica-exchange MD simulations was found to be efficient for the convergent behavior of entropy calculations. Also in folding/unfolding transitions of a small protein, Chignolin, the BQH method was reasonably accurate. However, the independent term without the correlation term in the BQH method was most accurate for the folding entropy among the methods considered in this study, because the QH approximation of the correlation term in the BQH method was no longer valid for the divergent unfolded structures.
A Computational Framework for Efficient Low Temperature Plasma Simulations
NASA Astrophysics Data System (ADS)
Verma, Abhishek Kumar; Venkattraman, Ayyaswamy
2016-10-01
Over the past years, scientific computing has emerged as an essential tool for the investigation and prediction of low temperature plasma (LTP) applications, which include electronics, nanomaterial synthesis, metamaterials, etc. To further explore LTP behavior with greater fidelity, we present a computational toolbox developed to perform LTP simulations. This framework will allow us to enhance our understanding of multiscale plasma phenomena using high performance computing tools mainly based on the OpenFOAM FVM distribution. Although aimed at microplasma simulations, the modular framework is able to perform multiscale, multiphysics simulations of physical systems comprising LTP. A salient introductory feature is the capability to perform parallel, 3D simulations of LTP applications on unstructured meshes. Performance of the solver is tested based on numerical results assessing accuracy and efficiency of benchmarks for problems in microdischarge devices. Numerical simulation of a microplasma reactor at atmospheric pressure with hemispherical dielectric coated electrodes will be discussed, providing an overview of the applicability and future scope of this framework.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Giner, Emmanuel, E-mail: gnrmnl@unife.it; Angeli, Celestino, E-mail: anc@unife.it
2016-03-14
The present work describes a new method to compute accurate spin densities for open shell systems. The proposed approach follows two steps: first, it provides molecular orbitals which correctly take into account the spin delocalization; second, a proper CI treatment allows one to account for the spin polarization effect while keeping a restricted formalism and avoiding spin contamination. The main idea of the optimization procedure is based on the orbital relaxation of the various charge transfer determinants responsible for the spin delocalization. The algorithm is tested and compared to other existing methods on a series of organic and inorganic open shell systems. The results reported here show that the new approach (almost black-box) provides accurate spin densities at a reasonable computational cost making it suitable for a systematic study of open shell systems.
Efficient computation of PDF-based characteristics from diffusion MR signal.
Assemlal, Haz-Edine; Tschumperlé, David; Brun, Luc
2008-01-01
We present a general method for the computation of PDF-based characteristics of the tissue micro-architecture in MR imaging. The approach relies on the approximation of the MR signal by a series expansion based on Spherical Harmonics and Laguerre-Gaussian functions, followed by a simple projection step that is efficiently done in a finite dimensional space. The resulting algorithm is generic and flexible, and is able to compute a large set of useful characteristics of the local tissue structure. We illustrate the effectiveness of this approach by showing results on synthetic and real MR datasets acquired in a clinical time-frame.
Stone, John E; Hallock, Michael J; Phillips, James C; Peterson, Joseph R; Luthey-Schulten, Zaida; Schulten, Klaus
2016-05-01
Many of the continuing scientific advances achieved through computational biology are predicated on the availability of ongoing increases in computational power required for detailed simulation and analysis of cellular processes on biologically-relevant timescales. A critical challenge facing the development of future exascale supercomputer systems is the development of new computing hardware and associated scientific applications that dramatically improve upon the energy efficiency of existing solutions, while providing increased simulation, analysis, and visualization performance. Mobile computing platforms have recently become powerful enough to support interactive molecular visualization tasks that were previously only possible on laptops and workstations, creating future opportunities for their convenient use for meetings, remote collaboration, and as head mounted displays for immersive stereoscopic viewing. We describe early experiences adapting several biomolecular simulation and analysis applications for emerging heterogeneous computing platforms that combine power-efficient system-on-chip multi-core CPUs with high-performance massively parallel GPUs. We present low-cost power monitoring instrumentation that provides sufficient temporal resolution to evaluate the power consumption of individual CPU algorithms and GPU kernels. We compare the performance and energy efficiency of scientific applications running on emerging platforms with results obtained on traditional platforms, identify hardware and algorithmic performance bottlenecks that affect the usability of these platforms, and describe avenues for improving both the hardware and applications in pursuit of the needs of molecular modeling tasks on mobile devices and future exascale computers. PMID:27516922
NREL's Building-Integrated Supercomputer Provides Heating and Efficient Computing (Fact Sheet)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
2014-09-01
NREL's Energy Systems Integration Facility (ESIF) is meant to investigate new ways to integrate energy sources so they work together efficiently, and one of the key tools to that investigation, a new supercomputer, is itself a prime example of energy systems integration. NREL teamed with Hewlett-Packard (HP) and Intel to develop the innovative warm-water, liquid-cooled Peregrine supercomputer, which not only operates efficiently but also serves as the primary source of building heat for ESIF offices and laboratories. This innovative high-performance computer (HPC) can perform more than a quadrillion calculations per second as part of the world's most energy-efficient HPC data center.
Highly efficient computer algorithm for identifying layer thickness of atomically thin 2D materials
NASA Astrophysics Data System (ADS)
Lee, Jekwan; Cho, Seungwan; Park, Soohyun; Bae, Hyemin; Noh, Minji; Kim, Beom; In, Chihun; Yang, Seunghoon; Lee, Sooun; Seo, Seung Young; Kim, Jehyun; Lee, Chul-Ho; Shim, Woo-Young; Jo, Moon-Ho; Kim, Dohun; Choi, Hyunyong
2018-03-01
The fields of layered material research, such as transition-metal dichalcogenides (TMDs), have demonstrated that the optical, electrical and mechanical properties strongly depend on the layer number N. Thus, efficient and accurate determination of N is the most crucial step before the associated device fabrication. An existing experimental technique using an optical microscope is the most widely used one to identify N. However, a critical drawback of this approach is that it relies on extensive laboratory experiences to estimate N; it requires a very time-consuming image-searching task assisted by human eyes and secondary measurements such as atomic force microscopy and Raman spectroscopy, which are necessary to ensure N. In this work, we introduce a computer algorithm based on the image analysis of a quantized optical contrast. We show that our algorithm can apply to a wide variety of layered materials, including graphene, MoS2, and WS2 regardless of substrates. The algorithm largely consists of two parts. First, it sets up an appropriate boundary between target flakes and substrate. Second, to compute N, it automatically calculates the optical contrast using an adaptive RGB estimation process between each target, which results in a matrix with different integer Ns and returns a matrix map of Ns onto the target flake position. Using a conventional desktop computational power, the time taken to display the final N matrix was 1.8 s on average for the image size of 1280 pixels by 960 pixels and obtained a high accuracy of 90% (six estimation errors among 62 samples) when compared to the other methods. To show the effectiveness of our algorithm, we also apply it to TMD flakes transferred on optically transparent c-axis sapphire substrates and obtain a similar result of the accuracy of 94% (two estimation errors among 34 samples).
Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation.
Kamnitsas, Konstantinos; Ledig, Christian; Newcombe, Virginia F J; Simpson, Joanna P; Kane, Andrew D; Menon, David K; Rueckert, Daniel; Glocker, Ben
2017-02-01
We propose a dual pathway, 11-layers deep, three-dimensional Convolutional Neural Network for the challenging task of brain lesion segmentation. The devised architecture is the result of an in-depth analysis of the limitations of current networks proposed for similar applications. To overcome the computational burden of processing 3D medical scans, we have devised an efficient and effective dense training scheme which joins the processing of adjacent image patches into one pass through the network while automatically adapting to the inherent class imbalance present in the data. Further, we analyze the development of deeper, thus more discriminative 3D CNNs. In order to incorporate both local and larger contextual information, we employ a dual pathway architecture that processes the input images at multiple scales simultaneously. For post-processing of the network's soft segmentation, we use a 3D fully connected Conditional Random Field which effectively removes false positives. Our pipeline is extensively evaluated on three challenging tasks of lesion segmentation in multi-channel MRI patient data with traumatic brain injuries, brain tumours, and ischemic stroke. We improve on the state-of-the-art for all three applications, with top ranking performance on the public benchmarks BRATS 2015 and ISLES 2015. Our method is computationally efficient, which allows its adoption in a variety of research and clinical settings. The source code of our implementation is made publicly available. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
2009-01-01
Background Marginal posterior genotype probabilities need to be computed for genetic analyses such as genetic counseling in humans and selective breeding in animal and plant species. Methods In this paper, we describe a peeling based, deterministic, exact algorithm to compute efficiently genotype probabilities for every member of a pedigree with loops without recourse to junction-tree methods from graph theory. The efficiency in computing the likelihood by peeling comes from storing intermediate results in multidimensional tables called cutsets. Computing marginal genotype probabilities for individual i requires recomputing the likelihood for each of the possible genotypes of individual i. This can be done efficiently by storing intermediate results in two types of cutsets called anterior and posterior cutsets and reusing these intermediate results to compute the likelihood. Examples A small example is used to illustrate the theoretical concepts discussed in this paper, and marginal genotype probabilities are computed at a monogenic disease locus for every member in a real cattle pedigree. PMID:19958551
Efficient shortest-path-tree computation in network routing based on pulse-coupled neural networks.
Qu, Hong; Yi, Zhang; Yang, Simon X
2013-06-01
Shortest path tree (SPT) computation is a critical issue for routers using link-state routing protocols, such as the most commonly used open shortest path first and intermediate system to intermediate system. Each router needs to recompute a new SPT rooted from itself whenever a change happens in the link state. Most commercial routers do this computation by deleting the current SPT and building a new one using static algorithms such as the Dijkstra algorithm at the beginning. Such recomputation of an entire SPT is inefficient, which may consume a considerable amount of CPU time and result in a time delay in the network. Some dynamic updating methods using the information in the updated SPT have been proposed in recent years. However, there are still many limitations in those dynamic algorithms. In this paper, a new modified model of pulse-coupled neural networks (M-PCNNs) is proposed for the SPT computation. It is rigorously proved that the proposed model is capable of solving some optimization problems, such as the SPT. A static algorithm is proposed based on the M-PCNNs to compute the SPT efficiently for large-scale problems. In addition, a dynamic algorithm that makes use of the structure of the previously computed SPT is proposed, which significantly improves the efficiency of the algorithm. Simulation results demonstrate the effective and efficient performance of the proposed approach.
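For orientation, the static baseline mentioned in this abstract (rebuilding the whole SPT with Dijkstra's algorithm) can be sketched as follows. This is a minimal illustration and not the M-PCNN method proposed by the authors; the function name `shortest_path_tree` and the four-router `topology` are hypothetical and chosen only for the example.

```python
import heapq


def shortest_path_tree(graph, root):
    """Return (dist, parent) describing the shortest path tree rooted at `root`.

    `graph` maps each node to a dict {neighbor: non-negative link cost}.
    `parent[v]` is the predecessor of v on its shortest path from the root.
    """
    dist = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry left over from an earlier relaxation
        done.add(u)
        for v, cost in graph.get(u, {}).items():
            new_dist = d + cost
            if v not in dist or new_dist < dist[v]:
                dist[v] = new_dist
                parent[v] = u
                heapq.heappush(heap, (new_dist, v))
    return dist, parent


# Hypothetical topology with symmetric link costs (an assumption for the demo).
topology = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
dist, parent = shortest_path_tree(topology, "A")
```

A router that follows the static strategy would call such a routine again from scratch after every link-state change, which is the cost the dynamic algorithms aim to avoid.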
Framework for computationally efficient optimal irrigation scheduling using ant colony optimization
USDA-ARS's Scientific Manuscript database
A general optimization framework is introduced with the overall goal of reducing search space size and increasing the computational efficiency of evolutionary algorithm application for optimal irrigation scheduling. The framework achieves this goal by representing the problem in the form of a decisi...
On the Use of Electrooculogram for Efficient Human Computer Interfaces
Usakli, A. B.; Gurkan, S.; Aloise, F.; Vecchiato, G.; Babiloni, F.
2010-01-01
The aim of this study is to present electrooculogram (EOG) signals that can be used efficiently for human computer interfaces. Establishing an efficient alternative channel for communication without overt speech and hand movements is important to increase the quality of life for patients suffering from Amyotrophic Lateral Sclerosis or other illnesses that prevent correct limb and facial muscular responses. We conducted several experiments to compare the P300-based BCI speller and the new EOG-based system. A five-letter word can be written on average in 25 seconds with the new system and in 105 seconds with the EEG-based device. Giving a message such as “clean-up” could be performed in 3 seconds with the new system. The new system is more efficient than the P300-based BCI system in terms of accuracy, speed, applicability, and cost efficiency. Using EOG signals, it is possible to improve the communication abilities of those patients who can move their eyes. PMID:19841687
Multiresolution molecular mechanics: Implementation and efficiency
NASA Astrophysics Data System (ADS)
Biyikli, Emre; To, Albert C.
2017-01-01
Atomistic/continuum coupling methods combine accurate atomistic methods and efficient continuum methods to simulate the behavior of highly ordered crystalline systems. Coupled methods utilize the advantages of both approaches to simulate systems at a lower computational cost, while retaining the accuracy associated with atomistic methods. Many concurrent atomistic/continuum coupling methods have been proposed in the past; however, their true computational efficiency has not been demonstrated. The present work presents an efficient implementation of a concurrent coupling method called the Multiresolution Molecular Mechanics (MMM) for serial, parallel, and adaptive analysis. First, we present the features of the software implemented along with the associated technologies. The scalability of the software implementation is demonstrated, and the competing effects of multiscale modeling and parallelization are discussed. Then, the algorithms contributing to the efficiency of the software are presented. These include algorithms for eliminating latent ghost atoms from calculations and measurement-based dynamic balancing of parallel workload. The efficiency improvements made by these algorithms are demonstrated by benchmark tests. The efficiency of the software is found to be on par with LAMMPS, a state-of-the-art Molecular Dynamics (MD) simulation code, when performing full atomistic simulations. Speed-up of the MMM method is shown to be directly proportional to the reduction of the number of the atoms visited in force computation. Finally, an adaptive MMM analysis on a nanoindentation problem, containing over a million atoms, is performed, yielding an improvement of 6.3-8.5 times in efficiency, over the full atomistic MD method. For the first time, the efficiency of a concurrent atomistic/continuum coupling method is comprehensively investigated and demonstrated.
A Simple and Resource-efficient Setup for the Computer-aided Drug Design Laboratory.
Moretti, Loris; Sartori, Luca
2016-10-01
Undertaking modelling investigations for Computer-Aided Drug Design (CADD) requires a proper environment. In principle, this could be done on a single computer, but the reality of a drug discovery program requires robustness and high-throughput computing (HTC) to efficiently support the research. Therefore, a more capable alternative is needed, but its implementation has no widespread solution. Here, the realization of such a computing facility is discussed; all aspects are covered, from general layout to technical details. © 2016 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Computing NLTE Opacities -- Node Level Parallel Calculation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holladay, Daniel
Presentation. The goal: to produce a robust library capable of computing reasonably accurate opacities inline with the assumption of LTE relaxed (non-LTE). Near term: demonstrate acceleration of non-LTE opacity computation. Far term (if funded): connect to application codes with in-line capability and compute opacities. Study science problems. Use efficient algorithms that expose many levels of parallelism and utilize good memory access patterns for use on advanced architectures. Portability to multiple types of hardware including multicore processors, manycore processors such as KNL, GPUs, etc. Easily coupled to radiation hydrodynamics and thermal radiative transfer codes.
NASA Technical Reports Server (NTRS)
Tranter, W. H.; Ziemer, R. E.; Fashano, M. J.
1975-01-01
This paper reviews the SYSTID technique for performance evaluation of communication systems using time-domain computer simulation. An example program illustrates the language. The inclusion of both Gaussian and impulse noise models makes accurate simulation possible in a wide variety of environments. A very flexible postprocessor makes possible accurate and efficient performance evaluation.
Efficient calculation of general Voigt profiles
NASA Astrophysics Data System (ADS)
Cope, D.; Khoury, R.; Lovett, R. J.
1988-02-01
An accurate and efficient program is presented for the computation of OIL profiles, generalizations of the Voigt profile resulting from the one-interacting-level model of Ward et al. (1974). These profiles have speed dependent shift and width functions and have asymmetric shapes. The program contains an adjustable error control parameter and includes the Voigt profile as a special case, although the general nature of this program renders it slower than a specialized Voigt profile method. Results on accuracy and computation time are presented for a broad set of test parameters, and a comparison is made with previous work on the asymptotic behavior of general Voigt profiles.
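As a point of reference for the special case named in this abstract, the ordinary Voigt profile is the convolution of a Gaussian with a Lorentzian. The sketch below evaluates that convolution by brute-force trapezoidal quadrature. It is a slow reference implementation under stated assumptions, not the authors' OIL program: the function name `voigt`, the parameterization (Gaussian standard deviation `sigma`, Lorentzian half width at half maximum `gamma`), and the grid settings `n` and `span` are all choices made for this example.

```python
import math


def voigt(x, sigma, gamma, n=4001, span=8.0):
    """Voigt profile at x: Gaussian (std `sigma`) convolved with Lorentzian (HWHM `gamma`).

    The convolution integral over t is truncated to [-span*sigma, span*sigma]
    and evaluated with the trapezoidal rule on n points. The grid step must be
    small compared with gamma, so very narrow Lorentzians need a larger n.
    """
    h = 2.0 * span * sigma / (n - 1)
    norm = sigma * math.sqrt(2.0 * math.pi)
    total = 0.0
    for i in range(n):
        t = -span * sigma + i * h
        gauss = math.exp(-0.5 * (t / sigma) ** 2) / norm
        lorentz = gamma / (math.pi * ((x - t) ** 2 + gamma ** 2))
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * gauss * lorentz
    return total * h


# Line-centre value for sigma = gamma = 1.
peak = voigt(0.0, 1.0, 1.0)

# Closed form at the line centre: exp(a^2) * erfc(a) / (sigma * sqrt(2*pi)),
# with a = gamma / (sigma * sqrt(2)).
a = 1.0 / math.sqrt(2.0)
peak_exact = math.exp(a * a) * math.erfc(a) / math.sqrt(2.0 * math.pi)
```

Comparing `peak` with `peak_exact` gives a quick accuracy check of the kind an adjustable error control parameter would automate.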
Efficient integration method for fictitious domain approaches
NASA Astrophysics Data System (ADS)
Duczek, Sascha; Gabbert, Ulrich
2015-10-01
In the current article, we present an efficient and accurate numerical method for the integration of the system matrices in fictitious domain approaches such as the finite cell method (FCM). In the framework of the FCM, the physical domain is embedded in a geometrically larger domain of simple shape which is discretized using a regular Cartesian grid of cells. Therefore, a spacetree-based adaptive quadrature technique is normally deployed to resolve the geometry of the structure. Depending on the complexity of the structure under investigation this method accounts for most of the computational effort. To reduce the computational costs for computing the system matrices an efficient quadrature scheme based on the divergence theorem (Gauß-Ostrogradsky theorem) is proposed. Using this theorem the dimension of the integral is reduced by one, i.e. instead of solving the integral for the whole domain only its contour needs to be considered. In the current paper, we present the general principles of the integration method and its implementation. The results to several two-dimensional benchmark problems highlight its properties. The efficiency of the proposed method is compared to conventional spacetree-based integration techniques.
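The dimension reduction described in this abstract can be illustrated on the simplest two-dimensional case. With the vector field F = (x, 0), whose divergence is 1, the area of a polygon becomes a contour integral of x dy; with F = (x^2/2, 0), whose divergence is x, the first moment becomes a contour integral of (x^2/2) dy. The sketch below assumes a polygonal boundary given counter-clockwise and polynomial integrands, so each edge integral has a closed form. The function names are hypothetical, and this is not the authors' FCM quadrature scheme.

```python
def polygon_area(vertices):
    """Integral of 1 over the polygon, computed as the contour integral of x dy."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        # On a straight edge x varies linearly, so the integral of x dy is exact.
        total += 0.5 * (x0 + x1) * (y1 - y0)
    return total


def polygon_x_moment(vertices):
    """Integral of x over the polygon, computed as the contour integral of (x^2 / 2) dy."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        # Exact edge integral of x^2 / 2 for linearly varying x.
        total += (x0 * x0 + x0 * x1 + x1 * x1) / 6.0 * (y1 - y0)
    return total


# Vertices listed counter-clockwise (an assumption of both functions).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]

square_area = polygon_area(square)
square_moment = polygon_x_moment(square)
triangle_area = polygon_area(triangle)
triangle_moment = polygon_x_moment(triangle)
```

Only the boundary edges are visited, so the cost scales with the length of the contour rather than with the number of sub-cells a spacetree would need to resolve the same domain.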
Quantum Monte Carlo: Faster, More Reliable, And More Accurate
NASA Astrophysics Data System (ADS)
Anderson, Amos Gerald
2010-06-01
The Schrödinger Equation has been available for about 83 years, but today, we still strain to apply it accurately to molecules of interest. The difficulty is not theoretical in nature, but practical, since we're held back by lack of sufficient computing power. Consequently, effort is applied to find acceptable approximations to facilitate real time solutions. In the meantime, computer technology has begun rapidly advancing and changing the way we think about efficient algorithms. For those who can reorganize their formulas to take advantage of these changes and thereby lift some approximations, incredible new opportunities await. Over the last decade, we've seen the emergence of a new kind of computer processor, the graphics card. Designed to accelerate computer games by optimizing quantity instead of quality in processors, they have become of sufficient quality to be useful to some scientists. In this thesis, we explore the first known use of a graphics card in computational chemistry by rewriting our Quantum Monte Carlo software into the requisite "data parallel" formalism. We find that notwithstanding precision considerations, we are able to speed up our software by about a factor of 6. The success of a Quantum Monte Carlo calculation depends on more than just processing power. It also requires the scientist to carefully design the trial wavefunction used to guide simulated electrons. We have studied the use of Generalized Valence Bond wavefunctions to simply, and yet effectively, capture the essential static correlation in atoms and molecules. Furthermore, we have developed significantly improved two particle correlation functions, designed with both flexibility and simplicity considerations, representing an effective and reliable way to add the necessary dynamic correlation. Lastly, we present our method for stabilizing the statistical nature of the calculation, by manipulating configuration weights, thus facilitating efficient and robust calculations. Our
The Improvement of Efficiency in the Numerical Computation of Orbit Trajectories
NASA Technical Reports Server (NTRS)
Dyer, J.; Danchick, R.; Pierce, S.; Haney, R.
1972-01-01
An analysis, system design, programming, and evaluation of results are described for numerical computation of orbit trajectories. Evaluation of generalized methods, interaction of different formulations for satellite motion, transformation of equations of motion and integrator loads, and development of efficient integrators are also considered.
BINGO: a code for the efficient computation of the scalar bi-spectrum
NASA Astrophysics Data System (ADS)
Hazra, Dhiraj Kumar; Sriramkumar, L.; Martin, Jérôme
2013-05-01
We present a new and accurate Fortran code, the BI-spectra and Non-Gaussianity Operator (BINGO), for the efficient numerical computation of the scalar bi-spectrum and the non-Gaussianity parameter fNL in single field inflationary models involving the canonical scalar field. The code can calculate all the different contributions to the bi-spectrum and the parameter fNL for an arbitrary triangular configuration of the wavevectors. Focusing firstly on the equilateral limit, we illustrate the accuracy of BINGO by comparing the results from the code with the spectral dependence of the bi-spectrum expected in power law inflation. Then, considering an arbitrary triangular configuration, we contrast the numerical results with the analytical expression available in the slow roll limit, for, say, the case of the conventional quadratic potential. Considering a non-trivial scenario involving deviations from slow roll, we compare the results from the code with the analytical results that have recently been obtained in the case of the Starobinsky model in the equilateral limit. As an immediate application, we utilize BINGO to examine the power of the non-Gaussianity parameter fNL to discriminate between various inflationary models that admit departures from slow roll and lead to similar features in the scalar power spectrum. We close with a summary and discussion on the implications of the results we obtain.
Yao, Y. X.; Liu, J.; Liu, C.; ...
2015-08-28
We present an efficient method for calculating the electronic structure and total energy of strongly correlated electron systems. The method extends the traditional Gutzwiller approximation for one-particle operators to the evaluation of the expectation values of two particle operators in the many-electron Hamiltonian. The method is free of adjustable Coulomb parameters, has no double counting issues in the calculation of total energy, and has the correct atomic limit. We demonstrate that the method describes well the bonding and dissociation behaviors of the hydrogen and nitrogen clusters, as well as the ammonia composed of hydrogen and nitrogen atoms. We also show that the method can satisfactorily tackle highly challenging problems faced by density functional theory that were recently discussed in the literature. The computational workload of our method is similar to the Hartree-Fock approach while the results are comparable to high-level quantum chemistry calculations.
Computationally efficient stochastic optimization using multiple realizations
NASA Astrophysics Data System (ADS)
Bayer, P.; Bürger, C. M.; Finkel, M.
2008-02-01
The presented study is concerned with computationally efficient methods for solving stochastic optimization problems involving multiple equally probable realizations of uncertain parameters. A new and straightforward technique is introduced that is based on dynamically ordering the stack of realizations during the search procedure. The rationale is that a small number of critical realizations govern the output of a reliability-based objective function. By utilizing a problem, which is typical to designing a water supply well field, several variants of this "stack ordering" approach are tested. The results are statistically assessed, in terms of optimality and nominal reliability. This study demonstrates that the simple ordering of a given number of 500 realizations while applying an evolutionary search algorithm can save about half of the model runs without compromising the optimization procedure. More advanced variants of stack ordering can, if properly configured, save up to more than 97% of the computational effort that would be required if the entire number of realizations were considered. The findings herein are promising for similar problems of water management and reliability-based design in general, and particularly for non-convex problems that require heuristic search techniques.
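The abstract does not spell out the ordering rule, so the following is only one plausible reading of the "stack ordering" idea: a candidate is tested against the realizations in stack order, testing stops as soon as the candidate has failed more often than the reliability target allows, and the realizations that caused failures are promoted to the top of the stack so that later candidates meet them first. All names (`evaluate_with_stack`, `model`, `max_failures`) and the toy pass/fail model are hypothetical.

```python
def evaluate_with_stack(candidate, stack, model, max_failures):
    """Test `candidate` against realizations in stack order with early termination.

    `model(candidate, realization)` returns True when the candidate performs
    acceptably for that realization. Returns (is_reliable, model_runs) and
    reorders `stack` in place so that failed realizations come first.
    """
    failed_idx = []
    runs = 0
    for idx, realization in enumerate(stack):
        runs += 1
        if not model(candidate, realization):
            failed_idx.append(idx)
            if len(failed_idx) > max_failures:
                break  # reliability target already missed; skip remaining runs
    failed_set = set(failed_idx)
    promoted = [stack[i] for i in failed_idx]
    rest = [r for i, r in enumerate(stack) if i not in failed_set]
    stack[:] = promoted + rest
    return len(failed_idx) <= max_failures, runs


# Toy setting: realization r is satisfied when the candidate value is at least r.
def toy_model(candidate, realization):
    return candidate >= realization


stack = list(range(1, 11))  # ten equally probable realizations
first = evaluate_with_stack(5, stack, toy_model, max_failures=0)
order_after_first = list(stack)
second = evaluate_with_stack(4, stack, toy_model, max_failures=0)
third = evaluate_with_stack(10, stack, toy_model, max_failures=0)
```

In this toy run the second candidate is rejected after a single model run because the critical realization has already moved to the top, which is the kind of saving the abstract attributes to a small number of critical realizations.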
Accurate traveltime computation in complex anisotropic media with discontinuous Galerkin method
NASA Astrophysics Data System (ADS)
Le Bouteiller, P.; Benjemaa, M.; Métivier, L.; Virieux, J.
2017-12-01
Travel time computation is of major interest for a large range of geophysical applications, among which are source localization and characterization, phase identification, data windowing and tomography, from decametric scale up to global Earth scale. Ray-tracing tools, being essentially 1D Lagrangian integration along a path, have been used for their efficiency but present some drawbacks, such as a rather difficult control of the medium sampling. Moreover, they do not provide answers in shadow zones. Eikonal solvers, based on an Eulerian approach, have attracted attention in seismology with the pioneering work of Vidale (1988), while such an approach had been proposed earlier by Riznichenko (1946). They are now used for first-arrival travel-time tomography at various scales (Podvin & Lecomte, 1991). The framework for solving this non-linear partial differential equation is now well understood and various finite-difference approaches have been proposed, essentially for smooth media. We propose a novel finite element approach which builds a precise solution for strongly heterogeneous anisotropic media (still in the limit of Eikonal validity). The discontinuous Galerkin method we have developed allows local refinement of the mesh and local high orders of interpolation inside elements. High precision of the travel times and their spatial derivatives is obtained through this formulation. This finite element method also honors boundary conditions, such as complex topographies and absorbing boundaries for mimicking an infinite medium. Applications to travel-time tomography and slope tomography are expected, but also to migration and take-off angle estimation, thanks to the accuracy obtained when computing first-arrival times. References: Podvin, P. and Lecomte, I., 1991. Finite difference computation of traveltimes in very contrasted velocity model: a massively parallel approach and its associated tools, Geophys. J. Int., 105, 271-284. Riznichenko, Y., 1946. Geometrical
Reverse radiance: a fast accurate method for determining luminance
NASA Astrophysics Data System (ADS)
Moore, Kenneth E.; Rykowski, Ronald F.; Gangadhara, Sanjay
2012-10-01
Reverse ray tracing from a region of interest backward to the source has long been proposed as an efficient method of determining luminous flux. The idea is to trace rays only from where the final flux needs to be known back to the source, rather than tracing in the forward direction from the source outward to see where the light goes. Once the reverse ray reaches the source, the radiance the equivalent forward ray would have represented is determined and the resulting flux computed. Although reverse ray tracing is conceptually simple, the method critically depends upon an accurate source model in both the near and far field. An overly simplified source model, such as an ideal Lambertian surface substantially detracts from the accuracy and thus benefit of the method. This paper will introduce an improved method of reverse ray tracing that we call Reverse Radiance that avoids assumptions about the source properties. The new method uses measured data from a Source Imaging Goniometer (SIG) that simultaneously measures near and far field luminous data. Incorporating this data into a fast reverse ray tracing integration method yields fast, accurate data for a wide variety of illumination problems.
Wan, Shixiang; Zou, Quan
2017-01-01
Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. The extreme increase in next-generation sequencing has resulted in a shortage of efficient ultra-large biological sequence alignment approaches that can cope with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files larger than 1 GB) sequence analyses. Based on HAlign and the Spark distributed computing system, we implement a highly cost-efficient and time-efficient tool, HAlign-II, to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. Experiments on large-scale DNA and protein data sets, with files larger than 1 GB, showed that HAlign-II could save time and space. It outperformed the current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resources. HAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II with open-source codes and datasets was established at http://lab.malab.cn/soft/halign.
NASA Astrophysics Data System (ADS)
Kochmann, Julian; Wulfinghoff, Stephan; Ehle, Lisa; Mayer, Joachim; Svendsen, Bob; Reese, Stefanie
2018-06-01
Recently, two-scale FE-FFT-based methods (e.g., Spahn et al. in Comput Methods Appl Mech Eng 268:871-883, 2014; Kochmann et al. in Comput Methods Appl Mech Eng 305:89-110, 2016) have been proposed to predict the microscopic and overall mechanical behavior of heterogeneous materials. The purpose of this work is the extension to elasto-viscoplastic polycrystals, efficient and robust Fourier solvers and the prediction of micromechanical fields during macroscopic deformation processes. Assuming scale separation, the macroscopic problem is solved using the finite element method. The solution of the microscopic problem, which is embedded as a periodic unit cell (UC) in each macroscopic integration point, is found by employing fast Fourier transforms, fixed-point and Newton-Krylov methods. The overall material behavior is defined by the mean UC response. In order to ensure spatially converged micromechanical fields as well as feasible overall CPU times, an efficient but simple solution strategy for two-scale simulations is proposed. As an example, the constitutive behavior of 42CrMo4 steel is predicted during macroscopic three-point bending tests.
NASA Astrophysics Data System (ADS)
Kochmann, Julian; Wulfinghoff, Stephan; Ehle, Lisa; Mayer, Joachim; Svendsen, Bob; Reese, Stefanie
2017-09-01
Recently, two-scale FE-FFT-based methods (e.g., Spahn et al. in Comput Methods Appl Mech Eng 268:871-883, 2014; Kochmann et al. in Comput Methods Appl Mech Eng 305:89-110, 2016) have been proposed to predict the microscopic and overall mechanical behavior of heterogeneous materials. The purpose of this work is the extension to elasto-viscoplastic polycrystals, efficient and robust Fourier solvers and the prediction of micromechanical fields during macroscopic deformation processes. Assuming scale separation, the macroscopic problem is solved using the finite element method. The solution of the microscopic problem, which is embedded as a periodic unit cell (UC) in each macroscopic integration point, is found by employing fast Fourier transforms, fixed-point and Newton-Krylov methods. The overall material behavior is defined by the mean UC response. In order to ensure spatially converged micromechanical fields as well as feasible overall CPU times, an efficient but simple solution strategy for two-scale simulations is proposed. As an example, the constitutive behavior of 42CrMo4 steel is predicted during macroscopic three-point bending tests.
Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads.
Song, Li; Florea, Liliana
2015-01-01
Next-generation sequencing of cellular RNA (RNA-seq) is rapidly becoming the cornerstone of transcriptomic analysis. However, sequencing errors in the already short RNA-seq reads complicate bioinformatics analyses, in particular alignment and assembly. Error correction methods have been highly effective for whole-genome sequencing (WGS) reads, but are unsuitable for RNA-seq reads, owing to the variation in gene expression levels and alternative splicing. We developed a k-mer based method, Rcorrector, to correct random sequencing errors in Illumina RNA-seq reads. Rcorrector uses a De Bruijn graph to compactly represent all trusted k-mers in the input reads. Unlike WGS read correctors, which use a global threshold to determine trusted k-mers, Rcorrector computes a local threshold at every position in a read. Rcorrector has an accuracy higher than or comparable to existing methods, including the only other method (SEECER) designed for RNA-seq reads, and is more time and memory efficient. With a 5 GB memory footprint for 100 million reads, it can be run on virtually any desktop or server. The software is available free of charge under the GNU General Public License from https://github.com/mourisl/Rcorrector/.
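The k-mer idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not Rcorrector's implementation: it counts k-mers across reads and marks as trusted those at or above one global count threshold, which is the WGS-style rule the abstract contrasts against. The function names, the value of k, and the threshold are assumptions chosen for the example, and the per-position local threshold that Rcorrector computes is not reproduced here.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def trusted_kmers(counts, threshold):
    """Global-threshold rule: a k-mer is trusted if seen at least `threshold` times."""
    return {kmer for kmer, c in counts.items() if c >= threshold}


# Toy input (hypothetical): the third read differs from the others by one base.
reads = ["ACGTAC", "ACGTAC", "ACGTTC"]
counts = count_kmers(reads, k=4)
trusted = trusted_kmers(counts, threshold=2)
```

In this toy input the k-mers that occur only in the third read fall below the threshold, so they would be candidates for correction. With uneven expression levels a single global cutoff like this is exactly what becomes unreliable, which is the motivation the abstract gives for a local threshold.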
Solving the Coupled System Improves Computational Efficiency of the Bidomain Equations
Southern, James A.; Plank, Gernot; Vigmond, Edward J.; Whiteley, Jonathan P.
2017-01-01
The bidomain equations are frequently used to model the propagation of cardiac action potentials across cardiac tissue. At the whole organ level the size of the computational mesh required makes their solution a significant computational challenge. As the accuracy of the numerical solution cannot be compromised, efficiency of the solution technique is important to ensure that the results of the simulation can be obtained in a reasonable time whilst still encapsulating the complexities of the system. In an attempt to increase efficiency of the solver, the bidomain equations are often decoupled into one parabolic equation that is computationally very cheap to solve and an elliptic equation that is much more expensive to solve. In this study the performance of this uncoupled solution method is compared with an alternative strategy in which the bidomain equations are solved as a coupled system. This seems counter-intuitive as the alternative method requires the solution of a much larger linear system at each time step. However, in tests on two 3-D rabbit ventricle benchmarks it is shown that the coupled method is up to 80% faster than the conventional uncoupled method — and that parallel performance is better for the larger coupled problem. PMID:19457741
A strategy for improved computational efficiency of the method of anchored distributions
NASA Astrophysics Data System (ADS)
Over, Matthew William; Yang, Yarong; Chen, Xingyuan; Rubin, Yoram
2013-06-01
This paper proposes a strategy for improving the computational efficiency of model inversion using the method of anchored distributions (MAD) by "bundling" similar model parametrizations in the likelihood function. Inferring the likelihood function typically requires a large number of forward model (FM) simulations for each possible model parametrization; as a result, the process is quite expensive. To ease this prohibitive cost, we present an approximation for the likelihood function called bundling that relaxes the requirement for high quantities of FM simulations. This approximation redefines the conditional statement of the likelihood function as the probability of a set of similar model parametrizations "bundle" replicating field measurements, which we show is neither a model reduction nor a sampling approach to improving the computational efficiency of model inversion. To evaluate the effectiveness of these modifications, we compare the quality of predictions and computational cost of bundling relative to a baseline MAD inversion of 3-D flow and transport model parameters. Additionally, to aid understanding of the implementation we provide a tutorial for bundling in the form of a sample data set and script for the R statistical computing language. For our synthetic experiment, bundling achieved a 35% reduction in overall computational cost and had a limited negative impact on predicted probability distributions of the model parameters. Strategies for minimizing error in the bundling approximation, for enforcing similarity among the sets of model parametrizations, and for identifying convergence of the likelihood function are also presented.
Minimum Requirements for Accurate and Efficient Real-Time On-Chip Spike Sorting
Navajas, Joaquin; Barsakcioglu, Deren Y.; Eftekhar, Amir; Jackson, Andrew; Constandinou, Timothy G.; Quiroga, Rodrigo Quian
2014-01-01
Background Extracellular recordings are performed by inserting electrodes in the brain, relaying the signals to external power-demanding devices, where spikes are detected and sorted in order to identify the firing activity of different putative neurons. A main caveat of these recordings is the necessity of wires passing through the scalp and skin in order to connect intracortical electrodes to external amplifiers. The aim of this paper is to evaluate the feasibility of an implantable platform (i.e. a chip) with the capability to wirelessly transmit the neural signals and perform real-time on-site spike sorting. New Method We computationally modelled a two-stage implementation for online, robust, and efficient spike sorting. In the first stage, spikes are detected on-chip and streamed to an external computer where mean templates are created and sent back to the chip. In the second stage, spikes are sorted in real-time through template matching. Results We evaluated this procedure using realistic simulations of extracellular recordings and describe a set of specifications that optimise performance while keeping to a minimum the signal requirements and the complexity of the calculations. Comparison with Existing Methods A key bottleneck for the development of long-term BMIs is to find an inexpensive method for real-time spike sorting. Here, we simulated a solution to this problem that uses both offline and online processing of the data. Conclusions Hardware implementations of this method therefore enable low-power long-term wireless transmission of multiple site extracellular recordings, with application to wireless BMIs or closed-loop stimulation designs. PMID:24769170
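The two-stage scheme described in this abstract (mean templates built off-chip, then on-chip sorting by template matching) can be sketched as below. This is only an illustration under stated assumptions: the Euclidean distance metric, the function names, the unit labels, and the toy waveforms are all hypothetical, and the abstract does not say which distance measure the authors used.

```python
import math


def mean_template(spikes):
    """Stage 1 (offline): average aligned spike waveforms into one template."""
    n = len(spikes)
    return [sum(s[i] for s in spikes) / n for i in range(len(spikes[0]))]


def match_template(spike, templates):
    """Stage 2 (online): assign a spike to the nearest template (Euclidean distance)."""
    best_label, best_dist = None, math.inf
    for label, template in templates.items():
        dist = math.dist(spike, template)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy three-sample waveforms for two putative units (made-up numbers).
templates = {
    "unit_a": mean_template([[0.0, 1.0, 0.0], [0.0, 3.0, 0.0]]),
    "unit_b": mean_template([[0.0, -2.0, 1.0], [0.0, -2.0, 3.0]]),
}
label = match_template([0.1, 1.8, 0.2], templates)
```

The online stage needs only one distance computation per template per spike, which is the kind of low-complexity operation the abstract argues is suitable for an implantable chip.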
Computational Modeling in Concert with Laboratory Studies: Application to B Cell Differentiation
Remediation is expensive, so accurate prediction of dose-response is important to help control costs. Dose response is a function of biological mechanisms. Computational models of these mechanisms improve the efficiency of research and provide the capability for prediction.
Redundancy management for efficient fault recovery in NASA's distributed computing system
NASA Technical Reports Server (NTRS)
Malek, Miroslaw; Pandya, Mihir; Yau, Kitty
1991-01-01
The management of redundancy in computer systems was studied and guidelines were provided for the development of NASA's fault-tolerant distributed systems. Fault recovery and reconfiguration mechanisms were examined. A theoretical foundation was laid for redundancy management by efficient reconfiguration methods and algorithmic diversity. Algorithms were developed to optimize the resources for embedding of computational graphs of tasks in the system architecture and reconfiguration of these tasks after a failure has occurred. The computational structure represented by a path and the complete binary tree was considered and the mesh and hypercube architectures were targeted for their embeddings. The innovative concept of Hybrid Algorithm Technique was introduced. This new technique provides a mechanism for obtaining fault tolerance while exhibiting improved performance.
The accurate particle tracer code
NASA Astrophysics Data System (ADS)
Wang, Yulei; Liu, Jian; Qin, Hong; Yu, Zhi; Yao, Yicun
2017-11-01
The Accurate Particle Tracer (APT) code is designed for systematic large-scale applications of geometric algorithms for particle dynamical simulations. Based on a large variety of advanced geometric algorithms, APT possesses long-term numerical accuracy and stability, which are critical for solving multi-scale and nonlinear problems. To provide a flexible and convenient I/O interface, the libraries of Lua and Hdf5 are used. Following a three-step procedure, users can efficiently extend the libraries of electromagnetic configurations, external non-electromagnetic forces, particle pushers, and initialization approaches by use of the extendible module. APT has been used in simulations of key physical problems, such as runaway electrons in tokamaks and energetic particles in the Van Allen belt. As an important realization, the APT-SW version has been successfully distributed on the world's fastest computer, the Sunway TaihuLight supercomputer, by supporting the master-slave architecture of Sunway many-core processors. Based on large-scale simulations of a runaway beam under parameters of the ITER tokamak, it is revealed that the magnetic ripple field can disperse the pitch-angle distribution significantly and improve the confinement of the energetic runaway beam at the same time.
The accurate particle tracer code
Wang, Yulei; Liu, Jian; Qin, Hong; ...
2017-07-20
The Accurate Particle Tracer (APT) code is designed for systematic large-scale applications of geometric algorithms for particle dynamical simulations. Based on a large variety of advanced geometric algorithms, APT possesses long-term numerical accuracy and stability, which are critical for solving multi-scale and nonlinear problems. To provide a flexible and convenient I/O interface, the libraries of Lua and Hdf5 are used. Following a three-step procedure, users can efficiently extend the libraries of electromagnetic configurations, external non-electromagnetic forces, particle pushers, and initialization approaches by use of the extendible module. APT has been used in simulations of key physical problems, such as runaway electrons in tokamaks and energetic particles in the Van Allen belt. As an important realization, the APT-SW version has been successfully distributed on the world’s fastest computer, the Sunway TaihuLight supercomputer, by supporting the master–slave architecture of Sunway many-core processors. Here, based on large-scale simulations of a runaway beam under parameters of the ITER tokamak, it is revealed that the magnetic ripple field can disperse the pitch-angle distribution significantly and improve the confinement of the energetic runaway beam at the same time.
The accurate particle tracer code
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Yulei; Liu, Jian; Qin, Hong
The Accurate Particle Tracer (APT) code is designed for systematic large-scale applications of geometric algorithms for particle dynamical simulations. Based on a large variety of advanced geometric algorithms, APT possesses long-term numerical accuracy and stability, which are critical for solving multi-scale and nonlinear problems. To provide a flexible and convenient I/O interface, the libraries of Lua and Hdf5 are used. Following a three-step procedure, users can efficiently extend the libraries of electromagnetic configurations, external non-electromagnetic forces, particle pushers, and initialization approaches by use of the extendible module. APT has been used in simulations of key physical problems, such as runaway electrons in tokamaks and energetic particles in the Van Allen belt. As an important realization, the APT-SW version has been successfully distributed on the world’s fastest computer, the Sunway TaihuLight supercomputer, by supporting the master–slave architecture of Sunway many-core processors. Here, based on large-scale simulations of a runaway beam under parameters of the ITER tokamak, it is revealed that the magnetic ripple field can disperse the pitch-angle distribution significantly and improve the confinement of the energetic runaway beam at the same time.
Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY
2012-01-10
The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
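The sequence of steps in this abstract (a one-dimensional transform along one dimension, a redistribution, then a one-dimensional transform along the other dimension) can be sketched on a single machine as below. This is a minimal sketch under several assumptions: a plain transpose stands in for the networked "all-to-all" redistribution, a naive O(n^2) DFT stands in for a real FFT to keep the example dependency-free, and all function names are hypothetical. The random-order communication that the patent claims is not modelled.

```python
import cmath


def dft(x):
    """Naive 1-D discrete Fourier transform (stand-in for a 1-D FFT)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def transpose(a):
    """Swap rows and columns; stands in for the all-to-all redistribution."""
    return [list(col) for col in zip(*a)]


def fft2_row_column(a):
    """2-D transform built from two passes of 1-D transforms."""
    first_pass = [dft(row) for row in a]            # 1-D transform in the first dimension
    redistributed = transpose(first_pass)           # data now laid out along the second dimension
    second_pass = [dft(row) for row in redistributed]
    return transpose(second_pass)                   # restore the original layout


def dft2_direct(a):
    """Direct 2-D DFT from the definition, used only as a reference."""
    rows, cols = len(a), len(a[0])
    return [
        [
            sum(
                a[r][c] * cmath.exp(-2j * cmath.pi * (u * r / rows + v * c / cols))
                for r in range(rows)
                for c in range(cols)
            )
            for v in range(cols)
        ]
        for u in range(rows)
    ]


a = [[1, 2, 3], [4, 5, 6]]
result = fft2_row_column(a)
reference = dft2_direct(a)
```

Because each 1-D pass touches only one row at a time, rows can be processed independently on different nodes; the only step that requires communication is the redistribution between the two passes, which is why the patent concentrates on making that step efficient.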
Aguilar, I; Misztal, I; Legarra, A; Tsuruta, S
2011-12-01
Genomic evaluations can be calculated using a unified procedure that combines phenotypic, pedigree and genomic information. Implementation of such a procedure requires the inverse of the relationship matrix based on pedigree and genomic relationships. The objective of this study was to investigate efficient computing options to create relationship matrices based on genomic markers and pedigree information as well as their inverses. SNP marker information was simulated for a panel of 40 K SNPs, with the number of genotyped animals up to 30 000. Matrix multiplication in the computation of the genomic relationship was by a simple 'do' loop, by two optimized versions of the loop, and by a specific matrix multiplication subroutine. Inversion was by a generalized inverse algorithm and by a LAPACK subroutine. With the most efficient choices and parallel processing, creation of matrices for 30 000 animals would take a few hours. Matrices required to implement a unified approach can be computed efficiently. Optimizations can be either by modifications of existing code or by the use of efficient automatic optimizations provided by open source or third-party libraries. © 2011 Blackwell Verlag GmbH.
An Efficient Statistical Method to Compute Molecular Collisional Rate Coefficients
NASA Astrophysics Data System (ADS)
Loreau, Jérôme; Lique, François; Faure, Alexandre
2018-01-01
Our knowledge about the “cold” universe often relies on molecular spectra. A general property of such spectra is that the energy level populations are rarely at local thermodynamic equilibrium. Solving the radiative transfer thus requires the availability of collisional rate coefficients with the main colliding partners over the temperature range ∼10–1000 K. These rate coefficients are notoriously difficult to measure and expensive to compute. In particular, very few reliable collisional data exist for inelastic collisions involving reactive radicals or ions. In this Letter, we explore the use of a fast quantum statistical method to determine molecular collisional excitation rate coefficients. The method is benchmarked against accurate (but costly) rigid-rotor close-coupling calculations. For collisions proceeding through the formation of a strongly bound complex, the method is found to be highly satisfactory up to room temperature. Its accuracy decreases with decreasing potential well depth and with increasing temperature, as expected. This new method opens the way to the determination of accurate inelastic collisional data involving key reactive species such as H3+, H2O+, and H3O+ for which exact quantum calculations are currently not feasible.
Computationally Efficient Adaptive Beamformer for Ultrasound Imaging Based on QR Decomposition.
Park, Jongin; Wi, Seok-Min; Lee, Jin S
2016-02-01
Adaptive beamforming methods for ultrasound imaging have been studied to improve image resolution and contrast. The most common approach is the minimum variance (MV) beamformer, which minimizes the power of the beamformed output while keeping the response from the direction of interest constant. The method achieves higher resolution and better contrast than the delay-and-sum (DAS) beamformer, but it suffers from high computational cost. This cost is mainly due to the computation of the spatial covariance matrix and its inverse, which requires O(L^3) computations, where L denotes the subarray size. In this study, we propose a computationally efficient MV beamformer based on QR decomposition. The idea behind our approach is to transform the spatial covariance matrix into a scalar matrix σI, from which we subsequently obtain the apodization weights and the beamformed output without computing the matrix inverse. To do that, a QR decomposition algorithm is used, which can itself be executed at low cost, so the computational complexity is reduced to O(L^2). In addition, our approach is mathematically equivalent to the conventional MV beamformer and therefore shows equivalent performance. The simulation and experimental results support the validity of our approach.
NASA Technical Reports Server (NTRS)
Kaiser, Mary K.; Proffitt, Dennis R.
1992-01-01
Recent developments in microelectronics have encouraged the use of 3D data bases to create compelling volumetric renderings of graphical objects. However, even with the computational capabilities of current-generation graphical systems, real-time displays of such objects are difficult, particularly when dynamic spatial transformations are involved. In this paper we discuss a type of visual stimulus (the stereokinetic effect display) that is computationally far less complex than a true three-dimensional transformation but yields an equally compelling depth impression, often perceptually indiscriminable from the true spatial transformation. Several possible applications for this technique are discussed (e.g., animating contour maps and air traffic control displays so as to evoke accurate depth percepts).
Multiscale Methods, Parallel Computation, and Neural Networks for Real-Time Computer Vision.
NASA Astrophysics Data System (ADS)
Battiti, Roberto
1990-01-01
This thesis presents new algorithms for low and intermediate level computer vision. The guiding ideas in the presented approach are those of hierarchical and adaptive processing, concurrent computation, and supervised learning. Processing of the visual data at different resolutions is used not only to reduce the amount of computation necessary to reach the fixed point, but also to produce a more accurate estimation of the desired parameters. The presented adaptive multiple scale technique is applied to the problem of motion field estimation. Different parts of the image are analyzed at a resolution that is chosen in order to minimize the error in the coefficients of the differential equations to be solved. Tests with video-acquired images show that velocity estimation is more accurate over a wide range of motion with respect to the homogeneous scheme. In some cases introduction of explicit discontinuities coupled to the continuous variables can be used to avoid propagation of visual information from areas corresponding to objects with different physical and/or kinematic properties. The human visual system uses concurrent computation in order to process the vast amount of visual data in "real-time." Although with different technological constraints, parallel computation can be used efficiently for computer vision. All the presented algorithms have been implemented on medium grain distributed memory multicomputers with a speed-up approximately proportional to the number of processors used. A simple two-dimensional domain decomposition assigns regions of the multiresolution pyramid to the different processors. The inter-processor communication needed during the solution process is proportional to the linear dimension of the assigned domain, so that efficiency is close to 100% if a large region is assigned to each processor. Finally, learning algorithms are shown to be a viable technique to engineer computer vision systems for different applications starting from
Workflow computing. Improving management and efficiency of pathology diagnostic services.
Buffone, G J; Moreau, D; Beck, J R
1996-04-01
Traditionally, information technology in health care has helped practitioners to collect, store, and present information and also to add a degree of automation to simple tasks (instrument interfaces supporting result entry, for example). Thus commercially available information systems do little to support the need to model, execute, monitor, coordinate, and revise the various complex clinical processes required to support health-care delivery. Workflow computing, which is already implemented and improving the efficiency of operations in several nonmedical industries, can address the need to manage complex clinical processes. Workflow computing not only provides a means to define and manage the events, roles, and information integral to health-care delivery but also supports the explicit implementation of policy or rules appropriate to the process. This article explains how workflow computing may be applied to health-care and the inherent advantages of the technology, and it defines workflow system requirements for use in health-care delivery with special reference to diagnostic pathology.
An efficient hybrid technique in RCS predictions of complex targets at high frequencies
NASA Astrophysics Data System (ADS)
Algar, María-Jesús; Lozano, Lorena; Moreno, Javier; González, Iván; Cátedra, Felipe
2017-09-01
Most computer codes in Radar Cross Section (RCS) prediction use Physical Optics (PO) and Physical Theory of Diffraction (PTD) combined with Geometrical Optics (GO) and Geometrical Theory of Diffraction (GTD). The latter approaches are computationally cheaper and much more accurate for curved surfaces, but not applicable for the computation of the RCS of all surfaces of a complex object due to the presence of caustic problems in the analysis of concave surfaces or flat surfaces in the far field. The main contribution of this paper is the development of a hybrid method based on a new combination of two asymptotic techniques, GTD and PO, that retains the advantages and avoids the disadvantages of each of them. A very efficient and accurate method to analyze the RCS of complex structures at high frequencies is obtained with the new combination. The proposed method has been validated by comparing RCS results obtained for some simple cases using the proposed approach with RCS results from the rigorous Method of Moments (MoM). Some complex cases have been examined at high frequencies, contrasting the results with PO. This study shows the accuracy and the efficiency of the hybrid method and its suitability for the computation of the RCS of very large and complex targets at high frequencies.
Enhancing battery efficiency for pervasive health-monitoring systems based on electronic textiles.
Zheng, Nenggan; Wu, Zhaohui; Lin, Man; Yang, Laurence Tianruo
2010-03-01
Electronic textiles are regarded as one of the most important computation platforms for future computer-assisted health-monitoring applications. In these novel systems, multiple batteries are used in order to prolong their operational lifetime, which is a significant metric for system usability. However, due to the nonlinear features of batteries, computing systems with multiple batteries cannot achieve the same battery efficiency as those powered by a monolithic battery of equal capacity. In this paper, we propose an algorithm aiming to maximize battery efficiency globally for computer-assisted health-care systems with multiple batteries. Based on an accurate analytical battery model, the concept of weighted battery fatigue degree is introduced and a novel battery-scheduling algorithm called predicted weighted fatigue degree least first (PWFDLF) is developed. We also discuss the alternatives we tried while searching for PWFDLF: a weighted round-robin (WRR) algorithm and a greedy algorithm achieving the highest local battery efficiency, which reduces to the sequential discharging policy. Evaluation results show that a considerable improvement in battery efficiency can be obtained by PWFDLF under various battery configurations and current profiles compared to conventional sequential and WRR discharging policies.
Williams, Ian; Constandinou, Timothy G.
2014-01-01
Accurate models of proprioceptive neural patterns could one day play an important role in the creation of an intuitive proprioceptive neural prosthesis for amputees. This paper looks at combining efficient implementations of biomechanical and proprioceptor models in order to generate signals that mimic human muscular proprioceptive patterns for future experimental work in prosthesis feedback. A neuro-musculoskeletal model of the upper limb with 7 degrees of freedom and 17 muscles is presented and generates real time estimates of muscle spindle and Golgi Tendon Organ neural firing patterns. Unlike previous neuro-musculoskeletal models, muscle activation and excitation levels are unknowns in this application and an inverse dynamics tool (static optimization) is integrated to estimate these variables. A proprioceptive prosthesis will need to be portable and this is incompatible with the computationally demanding nature of standard biomechanical and proprioceptor modeling. This paper uses and proposes a number of approximations and optimizations to make real time operation on portable hardware feasible. Finally, technical obstacles to mimicking natural feedback for an intuitive proprioceptive prosthesis, as well as issues and limitations with existing models, are identified and discussed. PMID:25009463
Convergence Acceleration of a Navier-Stokes Solver for Efficient Static Aeroelastic Computations
NASA Technical Reports Server (NTRS)
Obayashi, Shigeru; Guruswamy, Guru P.
1995-01-01
New capabilities have been developed for a Navier-Stokes solver to perform steady-state simulations more efficiently. The flow solver for solving the Navier-Stokes equations is based on a combination of the lower-upper factored symmetric Gauss-Seidel implicit method and the modified Harten-Lax-van Leer-Einfeldt upwind scheme. A numerically stable and efficient pseudo-time-marching method is also developed for computing steady flows over flexible wings. Results are demonstrated for transonic flows over rigid and flexible wings.
An efficient and general numerical method to compute steady uniform vortices
NASA Astrophysics Data System (ADS)
Luzzatto-Fegiz, Paolo; Williamson, Charles H. K.
2011-07-01
Steady uniform vortices are widely used to represent high Reynolds number flows, yet their efficient computation still presents some challenges. Existing Newton iteration methods become inefficient as the vortices develop fine-scale features; in addition, these methods cannot, in general, find solutions with specified Casimir invariants. On the other hand, available relaxation approaches are computationally inexpensive, but can fail to converge to a solution. In this paper, we overcome these limitations by introducing a new discretization, based on an inverse-velocity map, which radically increases the efficiency of Newton iteration methods. In addition, we introduce a procedure to prescribe Casimirs and remove the degeneracies in the steady vorticity equation, thus ensuring convergence for general vortex configurations. We illustrate our methodology by considering several unbounded flows involving one or two vortices. Our method enables the computation, for the first time, of steady vortices that do not exhibit any geometric symmetry. In addition, we discover that, as the limiting vortex state for each flow is approached, each family of solutions traces a clockwise spiral in a bifurcation plot consisting of a velocity-impulse diagram. By the recently introduced "IVI diagram" stability approach [Phys. Rev. Lett. 104 (2010) 044504], each turn of this spiral is associated with a loss of stability for the steady flows. Such spiral structure is suggested to be a universal feature of steady, uniform-vorticity flows.
An efficient and accurate molecular alignment and docking technique using ab initio quality scoring
Füsti-Molnár, László; Merz, Kenneth M.
2008-01-01
An accurate and efficient molecular alignment technique is presented based on first principle electronic structure calculations. This new scheme maximizes quantum similarity matrices in the relative orientation of the molecules and uses Fourier transform techniques for two purposes. First, building up the numerical representation of true ab initio electronic densities and their Coulomb potentials is accelerated by the previously described Fourier transform Coulomb method. Second, the Fourier convolution technique is applied for accelerating optimizations in the translational coordinates. In order to avoid any interpolation error, the necessary analytical formulas are derived for the transformation of the ab initio wavefunctions in rotational coordinates. The results of our first implementation for a small test set are analyzed in detail and compared with published results of the literature. A new way of refinement of existing shape based alignments is also proposed by using Fourier convolutions of ab initio or other approximate electron densities. This new alignment technique is generally applicable for overlap, Coulomb, kinetic energy, etc., quantum similarity measures and can be extended to a genuine docking solution with ab initio scoring. PMID:18624561
Efficient quantum algorithm for computing n-time correlation functions.
Pedernales, J S; Di Candia, R; Egusquiza, I L; Casanova, J; Solano, E
2014-07-11
We propose a method for computing n-time correlation functions of arbitrary spinorial, fermionic, and bosonic operators, consisting of an efficient quantum algorithm that encodes these correlations in an initially added ancillary qubit for probe and control tasks. For spinorial and fermionic systems, the reconstruction of arbitrary n-time correlation functions requires the measurement of two ancilla observables, while for bosonic variables time derivatives of the same observables are needed. Finally, we provide examples applicable to different quantum platforms in the frame of the linear response theory.
A computationally efficient modelling of laminar separation bubbles
NASA Technical Reports Server (NTRS)
Maughmer, Mark D.
1988-01-01
The goal of this research is to accurately predict the characteristics of the laminar separation bubble and its effects on airfoil performance. To this end, a model of the bubble is under development and will be incorporated in the analysis section of the Eppler and Somers program. As a first step in this direction, an existing bubble model was inserted into the program. It was decided to address the problem of the short bubble before attempting the prediction of the long bubble. In the second place, an integral boundary-layer method is believed more desirable than a finite difference approach. While these two methods achieve similar prediction accuracy, finite-difference methods tend to involve significantly longer computer run times than the integral methods. Finally, as the boundary-layer analysis in the Eppler and Somers program employs the momentum and kinetic energy integral equations, a short-bubble model compatible with these equations is most preferable.
Procedure and computer program to calculate machine contribution to sawmill recovery
Philip H. Steele; Hiram Hallock; Stanford Lunstrum
1981-01-01
The importance of considering individual machine contribution to total mill efficiency is discussed. A method for accurately calculating machine contribution is introduced, and an example is given using this method. A FORTRAN computer program to make the necessary complex calculations automatically is also presented with user instructions.
A Computationally-Efficient Inverse Approach to Probabilistic Strain-Based Damage Diagnosis
NASA Technical Reports Server (NTRS)
Warner, James E.; Hochhalter, Jacob D.; Leser, William P.; Leser, Patrick E.; Newman, John A
2016-01-01
This work presents a computationally-efficient inverse approach to probabilistic damage diagnosis. Given strain data at a limited number of measurement locations, Bayesian inference and Markov Chain Monte Carlo (MCMC) sampling are used to estimate probability distributions of the unknown location, size, and orientation of damage. Substantial computational speedup is obtained by replacing a three-dimensional finite element (FE) model with an efficient surrogate model. The approach is experimentally validated on cracked test specimens where full field strains are determined using digital image correlation (DIC). Access to full field DIC data allows for testing of different hypothetical sensor arrangements, facilitating the study of strain-based diagnosis effectiveness as the distance between damage and measurement locations increases. The ability of the framework to effectively perform both probabilistic damage localization and characterization in cracked plates is demonstrated and the impact of measurement location on uncertainty in the predictions is shown. Furthermore, the analysis time to produce these predictions is orders of magnitude less than a baseline Bayesian approach with the FE method by utilizing surrogate modeling and effective numerical sampling approaches.
Automatic temperature computation for realistic IR simulation
NASA Astrophysics Data System (ADS)
Le Goff, Alain; Kersaudy, Philippe; Latger, Jean; Cathala, Thierry; Stolte, Nilo; Barillot, Philippe
2000-07-01
Polygon temperature computation in 3D virtual scenes is fundamental for IR image simulation. This article describes in detail the temperature calculation software and its current extensions, briefly presented in [1]. This software, called MURET, is used by the simulation workshop CHORALE of the French DGA. MURET is a one-dimensional thermal software package that accurately takes into account the thermal attributes of the materials in a three-dimensional scene and the variation of the environment characteristics (atmosphere) as a function of time. Concerning the environment, absorbed incident fluxes are computed wavelength by wavelength, every half hour, during the 24 hours before the time of the simulation. For each polygon, incident fluxes are composed of direct solar fluxes and sky illumination (including diffuse solar fluxes). Concerning the materials, classical thermal attributes are associated with several layers: conductivity, absorption, spectral emissivity, density, specific heat, thickness, and convection coefficients are taken into account. In the future, MURET will be able to simulate permeable natural materials (water influence) and vegetation natural materials (woods). This model of thermal attributes yields a very accurate polygon temperature computation for the complex 3D databases often found in CHORALE simulations. The kernel of MURET consists of an efficient ray tracer, which computes the history (over 24 hours) of the shadowed parts of the 3D scene, and a library responsible for the thermal computations. The main originality lies in the way the heating fluxes are computed. Using ray tracing, the flux received at each 3D point of the scene accurately takes into account the masking (hidden surfaces) between objects. This library also supplies other thermal modules, such as a thermal shadows computation tool.
NASA Astrophysics Data System (ADS)
Gleason, M. J.; Pitlick, J.; Buttenfield, B. P.
2011-12-01
Terrestrial laser scanning (TLS) represents a new and particularly effective remote sensing technique for investigating geomorphologic processes. Unfortunately, TLS data are commonly characterized by extremely large volume, heterogeneous point distribution, and erroneous measurements, raising challenges for applied researchers. To facilitate efficient and accurate use of TLS in geomorphology, and to improve accessibility for TLS processing in commercial software environments, we are developing a filtering method for raw TLS data to: eliminate data redundancy; produce a more uniformly spaced dataset; remove erroneous measurements; and maintain the ability of the TLS dataset to accurately model terrain. Our method conducts local aggregation of raw TLS data using a 3-D search algorithm based on the geometrical expression of expected random errors in the data. This approach accounts for the estimated accuracy and precision limitations of the instruments and procedures used in data collection, thereby allowing for identification and removal of potential erroneous measurements prior to data aggregation. Initial tests of the proposed technique on a sample TLS point cloud required a modest processing time of approximately 100 minutes to reduce dataset volume over 90 percent (from 12,380,074 to 1,145,705 points). Preliminary analysis of the filtered point cloud revealed substantial improvement in homogeneity of point distribution and minimal degradation of derived terrain models. We will test the method on two independent TLS datasets collected in consecutive years along a non-vegetated reach of the North Fork Toutle River in Washington. We will evaluate the tool using various quantitative, qualitative, and statistical methods. The crux of this evaluation will include a bootstrapping analysis to test the ability of the filtered datasets to model the terrain at roughly the same accuracy as the raw datasets.
Comparison of a degree-day computer and a recording thermograph in a forest environment.
Boyd E. Wickman
1985-01-01
A field test showed that degree-days accumulated by a miniature computer and a recording thermograph in early spring and summer were comparable. For phenological studies the biophenometer was as accurate as, and more efficient than, the hygrothermograph.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.
Inverse modeling seeks model parameters given a set of observations. However, for practical problems, because the number of measurements is often large and the model parameters are also numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally-efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace, such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2D and a random hydraulic conductivity field in 3D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multi-core computational environment. Furthermore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate- to large-scale problems.
NASA Astrophysics Data System (ADS)
Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.
2016-09-01
Inverse modeling seeks model parameters given a set of observations. However, for practical problems, because the number of measurements is often large and the model parameters are also numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2-D and a random hydraulic conductivity field in 3-D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multicore computational environment. Therefore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate to large-scale problems.
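The subspace-recycling idea can be sketched as follows. This is an illustrative Python reduction, not the authors' Julia/MADS implementation. It assumes the damped normal equations (JᵀJ + λI)δ = −Jᵀr and relies on the fact that the Krylov subspace generated by JᵀJ and −Jᵀr does not depend on λ, so one basis serves every damping parameter. The names `krylov_basis` and `lm_steps_recycled` are hypothetical.

```python
import numpy as np

def krylov_basis(A, b, k):
    """Orthonormal basis of span{b, Ab, ..., A^(k-1) b} (Gram-Schmidt, two passes)."""
    V = np.zeros((b.shape[0], k))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(1, k):
        w = A @ V[:, j - 1]
        for _ in range(2):  # repeat the projection for numerical stability
            w -= V[:, :j] @ (V[:, :j].T @ w)
        V[:, j] = w / np.linalg.norm(w)
    return V

def lm_steps_recycled(J, r, lambdas, k):
    """Levenberg-Marquardt steps for several damping values from one Krylov basis."""
    A = J.T @ J
    g = -J.T @ r
    V = krylov_basis(A, g, k)   # built once, then reused for every lambda
    T = V.T @ A @ V             # small k-by-k projected matrix
    rhs = V.T @ g
    steps = {}
    for lam in lambdas:
        y = np.linalg.solve(T + lam * np.eye(k), rhs)  # cheap projected solve
        steps[lam] = V @ y
    return steps

# Illustrative random problem (assumed sizes, not from the paper).
rng = np.random.default_rng(0)
J = rng.normal(size=(20, 6))
r = rng.normal(size=20)
lambdas = [0.01, 0.1, 1.0]
steps = lm_steps_recycled(J, r, lambdas, k=6)
```

With `k` equal to the number of parameters the projected solve reproduces the dense solve; choosing `k` much smaller than the parameter count gives an approximate step at a fraction of the cost, which is where the reported speed-up would come from.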
Textbook Multigrid Efficiency for Computational Fluid Dynamics Simulations
NASA Technical Reports Server (NTRS)
Brandt, Achi; Thomas, James L.; Diskin, Boris
2001-01-01
Considerable progress over the past thirty years has been made in the development of large-scale computational fluid dynamics (CFD) solvers for the Euler and Navier-Stokes equations. Computations are used routinely to design the cruise shapes of transport aircraft through complex-geometry simulations involving the solution of 25-100 million equations; in this arena the number of wind-tunnel tests for a new design has been substantially reduced. However, simulations of the entire flight envelope of the vehicle, including maximum lift, buffet onset, flutter, and control effectiveness, have not been as successful in eliminating the reliance on wind-tunnel testing. These simulations involve unsteady flows with more separation and stronger shock waves than at cruise. The main reasons limiting further inroads of CFD into the design process are: (1) the reliability of turbulence models; and (2) the time and expense of the numerical simulation. Because of the prohibitive resolution requirements of direct simulations at high Reynolds numbers, transition and turbulence modeling is expected to remain an issue for the near term. This paper addresses the latter problem by attempting to attain optimal efficiencies in solving the governing equations. Typically, current CFD codes based on the use of multigrid acceleration techniques and multistage Runge-Kutta time-stepping schemes are able to converge lift and drag values for cruise configurations within approximately 1000 residual evaluations. An optimally convergent method is defined as having textbook multigrid efficiency (TME), meaning the solutions to the governing system of equations are attained in a computational work which is a small (less than 10) multiple of the operation count in the discretized system of equations (residual equations). In this paper, a distributed relaxation approach to achieving TME for Reynolds-averaged Navier-Stokes (RANS) equations is discussed along with the foundations that form the
Higher-order methods for simulations on quantum computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sornborger, A.T.; Stewart, E.D.
1999-09-01
To implement many-qubit gates for use in quantum simulations on quantum computers efficiently, we develop and present methods reexpressing $\exp[-i(H_1 + H_2 + \cdots)\Delta t]$ as a product of factors $\exp[-iH_1\Delta t]$, $\exp[-iH_2\Delta t]$, $\ldots$, which is accurate to third or fourth order in $\Delta t$. The methods we derive are an extended form of the symplectic method, and can also be used for an integration of classical Hamiltonians on classical computers. We derive both integral and irrational methods, and find the most efficient methods in both cases. © 1999 The American Physical Society
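As an illustration of what a higher-order product formula looks like, the sketch below builds a fourth-order approximation by composing three second-order (Strang) steps with the well-known triple-jump coefficient s = 1/(2 − 2^{1/3}). This is a standard construction shown for orientation and is not necessarily one of the methods derived in the paper. The two-term Hamiltonian, the random test matrices, and all function names are assumptions.

```python
import numpy as np

def expm_herm(H, t):
    """exp(-i H t) for a Hermitian matrix H via its eigendecomposition."""
    w, U = np.linalg.eigh(H)
    return (U * np.exp(-1j * w * t)) @ U.conj().T

def strang(H1, H2, t):
    """Second-order symmetric splitting of exp(-i (H1 + H2) t)."""
    half = expm_herm(H1, t / 2)
    return half @ expm_herm(H2, t) @ half

def fourth_order(H1, H2, t):
    """Triple-jump composition of three Strang steps (fourth order in t)."""
    s = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
    outer = strang(H1, H2, s * t)
    return outer @ strang(H1, H2, (1.0 - 2.0 * s) * t) @ outer

def random_hermitian(n, rng):
    """Random Hermitian test matrix scaled to unit spectral norm."""
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    H = (M + M.conj().T) / 2
    return H / np.linalg.norm(H, 2)

def splitting_error(H1, H2, t, method):
    """Spectral-norm distance between a product formula and the exact propagator."""
    return np.linalg.norm(method(H1, H2, t) - expm_herm(H1 + H2, t), 2)

rng = np.random.default_rng(0)
H1, H2 = random_hermitian(4, rng), random_hermitian(4, rng)
err_dt = splitting_error(H1, H2, 0.1, fourth_order)
err_half = splitting_error(H1, H2, 0.05, fourth_order)
err_strang = splitting_error(H1, H2, 0.1, strang)
```

For a fourth-order formula the single-step error scales as Δt^5, so halving the step should shrink `err_dt` by a factor of roughly 32, compared with roughly 8 for the Strang step alone.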
Node fingerprinting: an efficient heuristic for aligning biological networks.
Radu, Alex; Charleston, Michael
2014-10-01
With the continuing increase in availability of biological data and improvements to biological models, biological network analysis has become a promising area of research. An emerging technique for the analysis of biological networks is through network alignment. Network alignment has been used to calculate genetic distance, similarities between regulatory structures, and the effect of external forces on gene expression, and to depict conditional activity of expression modules in cancer. Network alignment is algorithmically complex, and therefore we must rely on heuristics, ideally as efficient and accurate as possible. The majority of current techniques for network alignment rely on precomputed information, such as with protein sequence alignment, or on tunable network alignment parameters, which may introduce an increased computational overhead. Our presented algorithm, which we call Node Fingerprinting (NF), is appropriate for performing global pairwise network alignment without precomputation or tuning, can be fully parallelized, and is able to quickly compute an accurate alignment between two biological networks. It has performed as well as or better than existing algorithms on biological and simulated data, and with fewer computational resources. The algorithmic validation performed demonstrates the low computational resource requirements of NF.
NASA Astrophysics Data System (ADS)
Wang, Yuan; Chen, Zhidong; Sang, Xinzhu; Li, Hui; Zhao, Linmin
2018-03-01
Holographic displays can provide the complete optical wave field of a three-dimensional (3D) scene, including depth perception. However, producing traditional computer-generated holograms (CGHs) often takes a long computation time, even without more complex and photorealistic rendering. The backward ray-tracing technique can render photorealistic, high-quality images, and its high degree of parallelism noticeably reduces the computation time. Here, a high-efficiency photorealistic computer-generated hologram method is presented based on the ray-tracing technique. Rays are launched and traced in parallel under different illuminations and circumstances. Experimental results demonstrate the effectiveness of the proposed method. Compared with the traditional point cloud CGH, the computation time is decreased to 24 s to reconstruct a 3D object of 100 × 100 rays with continuous depth change.
Step-by-step magic state encoding for efficient fault-tolerant quantum computation.
Goto, Hayato
2014-12-16
Quantum error correction allows one to make quantum computers fault-tolerant against unavoidable errors due to decoherence and imperfect physical gate operations. However, the fault-tolerant quantum computation requires impractically large computational resources for useful applications. This is a current major obstacle to the realization of a quantum computer. In particular, magic state distillation, which is a standard approach to universality, consumes the most resources in fault-tolerant quantum computation. For the resource problem, here we propose step-by-step magic state encoding for concatenated quantum codes, where magic states are encoded step by step from the physical level to the logical one. To manage errors during the encoding, we carefully use error detection. Since the sizes of intermediate codes are small, it is expected that the resource overheads will become lower than previous approaches based on the distillation at the logical level. Our simulation results suggest that the resource requirements for a logical magic state will become comparable to those for a single logical controlled-NOT gate. Thus, the present method opens a new possibility for efficient fault-tolerant quantum computation.
An accurate method for computer-generating tungsten anode x-ray spectra from 30 to 140 kV.
Boone, J M; Seibert, J A
1997-11-01
A tungsten anode spectral model using interpolating polynomials (TASMIP) was used to compute x-ray spectra at 1 keV intervals over the range from 30 kV to 140 kV. The TASMIP is not semi-empirical and uses no physical assumptions regarding x-ray production, but rather interpolates measured constant potential x-ray spectra published by Fewell et al. [Handbook of Computed Tomography X-ray Spectra (U.S. Government Printing Office, Washington, D.C., 1981)]. X-ray output measurements (mR/mAs measured at 1 m) were made on a calibrated constant potential generator in our laboratory from 50 kV to 124 kV, and with 0-5 mm added aluminum filtration. The Fewell spectra were slightly modified (numerically hardened) and normalized based on the attenuation and output characteristics of a constant potential generator and metal-insert x-ray tube in our laboratory. Then, using the modified Fewell spectra of different kVs, the photon fluence $\phi$ at each 1 keV energy bin (E) over energies from 10 keV to 140 keV was characterized using polynomial functions of the form $\phi(E) = a_0[E] + a_1[E]\,\mathrm{kV} + a_2[E]\,\mathrm{kV}^2 + \cdots + a_n[E]\,\mathrm{kV}^n$. A total of 131 polynomial functions were used to calculate accurate x-ray spectra, each function requiring between two and four terms. The resulting TASMIP algorithm produced x-ray spectra that match both the quality and quantity characteristics of the x-ray system in our laboratory. For photon fluences above 10% of the peak fluence in the spectrum, the average percent difference (and standard deviation) between the modified Fewell spectra and the TASMIP photon fluence was -1.43% (3.8%) for the 50 kV spectrum, -0.89% (1.37%) for the 70 kV spectrum, and for the 80, 90, 100, 110, 120, 130 and 140 kV spectra, the mean differences between spectra were all less than 0.20% and the standard deviations were less than approximately 1.1%. The model was also extended to include the effects of generator-induced kV ripple. Finally, the x-ray photon fluence in the units of
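The per-energy-bin polynomial form can be evaluated with a few lines of code. The sketch below only shows the mechanics: the coefficient values are invented placeholders, not the published TASMIP coefficients, and the function name `tasmip_like_spectrum` is hypothetical.

```python
def tasmip_like_spectrum(coeffs, kv):
    """Evaluate phi(E) = a0[E] + a1[E]*kV + ... + an[E]*kV**n for every energy bin.

    coeffs maps an energy bin E (keV) to its coefficient list [a0, a1, ..., an].
    """
    spectrum = {}
    for energy, a in coeffs.items():
        phi = 0.0
        for c in reversed(a):  # Horner's rule, highest-order coefficient first
            phi = phi * kv + c
        spectrum[energy] = phi
    return spectrum

# Placeholder coefficients for two bins (illustrative values only).
example_coeffs = {
    30: [1.0, 2.0],        # two-term polynomial
    31: [0.5, 0.0, 0.01],  # three-term polynomial
}
spectrum_100kv = tasmip_like_spectrum(example_coeffs, kv=100.0)
```

Because each bin carries its own short coefficient list (two to four terms in the abstract's description), a full spectrum for any tube potential costs only a handful of multiplications per bin.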
PVT: An Efficient Computational Procedure to Speed up Next-generation Sequence Analysis
2014-01-01
Background: High-throughput Next-Generation Sequencing (NGS) techniques are advancing genomics and molecular biology research. This technology generates very large volumes of data, which poses a major challenge for scientists seeking an efficient, cost- and time-effective solution to analyse such data. Further, for the different types of NGS data, there are certain common challenging steps involved in analysing those data. Spliced alignment is one such fundamental step in NGS data analysis, and it is extremely computationally intensive as well as time-consuming. Serious problems exist even with the most widely used spliced alignment tools. TopHat is one such widely used spliced alignment tool which, although it supports multithreading, does not efficiently utilize computational resources in terms of CPU utilization and memory. Here we introduce PVT (Pipelined Version of TopHat), where we take a modular approach by breaking TopHat’s serial execution into a pipeline of multiple stages, thereby increasing the degree of parallelization and computational resource utilization. Thus we address the discrepancies in TopHat so as to analyze large NGS data efficiently. Results: We analysed the SRA dataset (SRX026839 and SRX026838) consisting of single end reads and SRA data SRR1027730 consisting of paired-end reads. We used TopHat v2.0.8 to analyse these datasets and noted the CPU usage, memory footprint and execution time during spliced alignment. With this basic information, we designed PVT, a pipelined version of TopHat that removes the redundant computational steps during ‘spliced alignment’ and breaks the job into a pipeline of multiple stages (each comprising different step(s)) to improve its resource utilization, thus reducing the execution time. Conclusions: PVT provides an improvement over TopHat for spliced alignment of NGS data analysis. PVT thus resulted in the reduction of the execution time to ~23% for the single end read dataset. Further, PVT designed
PVT: an efficient computational procedure to speed up next-generation sequence analysis.
Maji, Ranjan Kumar; Sarkar, Arijita; Khatua, Sunirmal; Dasgupta, Subhasis; Ghosh, Zhumur
2014-06-04
High-throughput Next-Generation Sequencing (NGS) techniques are advancing genomics and molecular biology research. This technology generates very large volumes of data, which poses a major challenge for scientists seeking an efficient, cost- and time-effective solution to analyse such data. Further, for the different types of NGS data, there are certain common challenging steps involved in analysing those data. Spliced alignment is one such fundamental step in NGS data analysis, and it is extremely computationally intensive as well as time-consuming. Serious problems exist even with the most widely used spliced alignment tools. TopHat is one such widely used spliced alignment tool which, although it supports multithreading, does not efficiently utilize computational resources in terms of CPU utilization and memory. Here we introduce PVT (Pipelined Version of TopHat), where we take a modular approach by breaking TopHat's serial execution into a pipeline of multiple stages, thereby increasing the degree of parallelization and computational resource utilization. Thus we address the discrepancies in TopHat so as to analyze large NGS data efficiently. We analysed the SRA dataset (SRX026839 and SRX026838) consisting of single end reads and SRA data SRR1027730 consisting of paired-end reads. We used TopHat v2.0.8 to analyse these datasets and noted the CPU usage, memory footprint and execution time during spliced alignment. With this basic information, we designed PVT, a pipelined version of TopHat that removes the redundant computational steps during 'spliced alignment' and breaks the job into a pipeline of multiple stages (each comprising different step(s)) to improve its resource utilization, thus reducing the execution time. PVT provides an improvement over TopHat for spliced alignment of NGS data analysis. PVT thus resulted in the reduction of the execution time to ~23% for the single end read dataset. Further, PVT designed for paired end reads showed an
Chalmers, Eric; Luczak, Artur; Gruber, Aaron J.
2016-01-01
The mammalian brain is thought to use a version of Model-based Reinforcement Learning (MBRL) to guide “goal-directed” behavior, wherein animals consider goals and make plans to acquire desired outcomes. However, conventional MBRL algorithms do not fully explain animals' ability to rapidly adapt to environmental changes, or learn multiple complex tasks. They also require extensive computation, suggesting that goal-directed behavior is cognitively expensive. We propose here that key features of processing in the hippocampus support a flexible MBRL mechanism for spatial navigation that is computationally efficient and can adapt quickly to change. We investigate this idea by implementing a computational MBRL framework that incorporates features inspired by computational properties of the hippocampus: a hierarchical representation of space, “forward sweeps” through future spatial trajectories, and context-driven remapping of place cells. We find that a hierarchical abstraction of space greatly reduces the computational load (mental effort) required for adaptation to changing environmental conditions, and allows efficient scaling to large problems. It also allows abstract knowledge gained at high levels to guide adaptation to new obstacles. Moreover, a context-driven remapping mechanism allows learning and memory of multiple tasks. Simulating dorsal or ventral hippocampal lesions in our computational framework qualitatively reproduces behavioral deficits observed in rodents with analogous lesions. The framework may thus embody key features of how the brain organizes model-based RL to efficiently solve navigation and other difficult tasks. PMID:28018203
Computational Flow Modeling of Human Upper Airway Breathing
NASA Astrophysics Data System (ADS)
Mylavarapu, Goutham
Computational modeling of biological systems has gained a lot of interest in biomedical research in the recent past. This thesis focuses on the application of computational simulations to study airflow dynamics in the human upper respiratory tract. With advancements in medical imaging, patient-specific geometries of anatomically accurate respiratory tracts can now be reconstructed from Magnetic Resonance Images (MRI) or Computed Tomography (CT) scans, with better and more accurate details than traditional cadaver cast models. Computational studies using these individualized geometrical models have the advantages of non-invasiveness, ease, minimum patient interaction, and improved accuracy over experimental and clinical studies. Numerical simulations can provide detailed flow fields including velocities, flow rates, airway wall pressure, shear stresses, and turbulence in an airway. Interpretation of these physical quantities will enable the development of efficient treatment procedures, medical devices, targeted drug delivery, etc. The hypothesis for this research is that computational modeling can predict the outcomes of a surgical intervention or a treatment plan prior to its application and will guide the physician in providing better treatment to the patients. In the current work, three different computational approaches, Computational Fluid Dynamics (CFD), Flow-Structure Interaction (FSI), and Particle Flow simulations, were used to investigate flow in airway geometries. The CFD approach assumes the airway wall is rigid and is relatively easy to simulate compared to the more challenging FSI approach, where interactions of airway wall deformations with flow are also accounted for. The CFD methodology using different turbulence models is validated against experimental measurements in an airway phantom. Two case studies using CFD are demonstrated: one to quantify a pre- and post-operative airway, and another to perform virtual surgery to determine the best possible surgery in a constricted airway. The unsteady
NASA Astrophysics Data System (ADS)
Zhao, Xiao-mei; Xie, Dong-fan; Li, Qi
2015-02-01
With the development of intelligent transport systems, advanced information feedback strategies have been developed to reduce traffic congestion and enhance capacity. However, previous strategies provide accurate information to travelers, and our simulation results show that accurate information brings negative effects, especially when the information is delayed. Travelers given accurate information prefer the route reported to be in the best condition, but delayed information reflects past rather than current traffic conditions. Travelers then make wrong routing decisions, which decreases capacity, increases oscillations, and drives the system away from equilibrium. To avoid this negative effect, bounded rationality is taken into account by introducing a boundedly rational threshold BR. When the difference between two routes is less than BR, the routes have equal probability of being chosen. Bounded rationality helps to improve efficiency in terms of capacity, oscillation, and the deviation from the system equilibrium.
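The threshold rule described in the abstract can be written down directly. The sketch below assumes a two-route network and a scalar reported cost per route (for example, travel time); the function name `choose_route` and the cost values are hypothetical.

```python
import random

def choose_route(cost_a, cost_b, br, rng=random):
    """Pick a route under a bounded-rationality threshold.

    If the reported costs differ by less than br, the traveler treats the
    routes as equivalent and picks one at random; otherwise the cheaper
    route is chosen.
    """
    if abs(cost_a - cost_b) < br:
        return rng.choice(("A", "B"))
    return "A" if cost_a < cost_b else "B"

# Illustrative use: with BR = 5, a 2-unit gap is ignored, a 10-unit gap is not.
rng = random.Random(0)
picks = [choose_route(10.0, 12.0, br=5.0, rng=rng) for _ in range(1000)]
share_a = picks.count("A") / len(picks)
```

Setting `br=0` recovers the fully rational traveler who always follows the (possibly delayed) feedback, which is the case the abstract reports as producing oscillations.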
On numerically accurate finite element
NASA Technical Reports Server (NTRS)
Nagtegaal, J. C.; Parks, D. M.; Rice, J. R.
1974-01-01
A general criterion for testing a mesh with topologically similar repeat units is given, and the analysis shows that only a few conventional element types and arrangements are, or can be made, suitable for computations in the fully plastic range. Further, a new variational principle, which can easily and simply be incorporated into an existing finite element program, is presented. This allows accurate computations to be made even for element designs that would not normally be suitable. Numerical results are given for three plane strain problems, namely pure bending of a beam, a thick-walled tube under pressure, and a deep double edge cracked tensile specimen. The effects of various element designs and of the new variational procedure are illustrated. Elastic-plastic computations at finite strain are discussed.
Shaw, Calvin B; Prakash, Jaya; Pramanik, Manojit; Yalavarthy, Phaneendra K
2013-08-01
A computationally efficient approach that computes the optimal regularization parameter for the Tikhonov-minimization scheme is developed for photoacoustic imaging. This approach is based on the least squares-QR decomposition which is a well-known dimensionality reduction technique for a large system of equations. It is shown that the proposed framework is effective in terms of quantitative and qualitative reconstructions of initial pressure distribution enabled via finding an optimal regularization parameter. The computational efficiency and performance of the proposed method are shown using a test case of numerical blood vessel phantom, where the initial pressure is exactly known for quantitative comparison.
Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY
2008-01-01
The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
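The transform, redistribute, transform pattern described above can be sketched on a single machine. This is only an illustration under stated assumptions: a naive O(n^2) DFT stands in for the one-dimensional FFT, a local transpose stands in for the networked "all-to-all" step, and the random ordering of the redistribution is not modeled. All function names are hypothetical.

```python
import cmath


def dft_1d(x):
    """Naive one-dimensional DFT, used here in place of a real FFT."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def transpose(rows):
    """Swap the two dimensions; a stand-in for the all-to-all exchange."""
    return [list(col) for col in zip(*rows)]


def dft_2d(a):
    """Two-dimensional DFT built from two passes of one-dimensional DFTs."""
    first_pass = [dft_1d(row) for row in a]        # transform along dimension 2
    redistributed = transpose(first_pass)          # make dimension 1 contiguous
    second_pass = [dft_1d(row) for row in redistributed]
    return transpose(second_pass)                  # restore the original layout
```

In a distributed setting each node would hold a subset of the rows, so only the transpose step requires communication; that is the step the abstract targets.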
Acceleration of FDTD mode solver by high-performance computing techniques.
Han, Lin; Xi, Yanping; Huang, Wei-Ping
2010-06-21
A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for calculation of both real guided and complex leaky modes of typical optical waveguides against the benchmark finite-difference (FD) eigen mode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver with more than 30 times improvement in computational efficiency in comparison with the conventional FDTD mode solver running on the CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is of the same order of magnitude as that of the standard finite-difference eigen mode solver and yet requires much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when the conventional eigen value mode solvers are no longer applicable due to memory limitation.
Duan, Lili; Liu, Xiao; Zhang, John Z H
2016-05-04
Efficient and reliable calculation of protein-ligand binding free energy is a grand challenge in computational biology and is of critical importance in drug design and many other molecular recognition problems. The main challenge lies in the calculation of entropic contribution to protein-ligand binding or interaction systems. In this report, we present a new interaction entropy method which is theoretically rigorous, computationally efficient, and numerically reliable for calculating entropic contribution to free energy in protein-ligand binding and other interaction processes. Drastically different from the widely employed but extremely expensive normal mode method for calculating entropy change in protein-ligand binding, the new method calculates the entropic component (interaction entropy or -TΔS) of the binding free energy directly from molecular dynamics simulation without any extra computational cost. Extensive study of over a dozen randomly selected protein-ligand binding systems demonstrated that this interaction entropy method is both computationally efficient and numerically reliable and is vastly superior to the standard normal mode approach. This interaction entropy paradigm introduces a novel and intuitive conceptual understanding of the entropic effect in protein-ligand binding and other general interaction systems as well as a practical method for highly efficient calculation of this effect.
An accurate model for the computation of the dose of protons in water.
Embriaco, A; Bellinzona, V E; Fontana, A; Rotondi, A
2017-06-01
The accurate and fast calculation of the dose in proton radiation therapy is an essential ingredient for successful treatments. We propose a novel approach with a minimal number of parameters. The approach is based on the exact calculation of the electromagnetic part of the interaction, namely the Molière theory of the multiple Coulomb scattering for the transversal 1D projection and the Bethe-Bloch formula for the longitudinal stopping power profile, including a gaussian energy straggling. To this e.m. contribution the nuclear proton-nucleus interaction is added with a simple two-parameter model. Then, the non gaussian lateral profile is used to calculate the radial dose distribution with a method that assumes the cylindrical symmetry of the distribution. The results, obtained with a fast C++ based computational code called MONET (MOdel of ioN dosE for Therapy), are in very good agreement with the FLUKA MC code, within a few percent in the worst case. This study provides a new tool for fast dose calculation or verification, possibly for clinical use. Copyright © 2017 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
Efficient volumetric estimation from plenoptic data
NASA Astrophysics Data System (ADS)
Anglin, Paul; Reeves, Stanley J.; Thurow, Brian S.
2013-03-01
The commercial release of the Lytro camera, and greater availability of plenoptic imaging systems in general, have given the image processing community cost-effective tools for light-field imaging. While this data is most commonly used to generate planar images at arbitrary focal depths, reconstruction of volumetric fields is also possible. Similarly, deconvolution is a technique that is conventionally used in planar image reconstruction, or deblurring, algorithms. However, when leveraged with the ability of a light-field camera to quickly reproduce multiple focal planes within an imaged volume, deconvolution offers a computationally efficient method of volumetric reconstruction. Related research has shown that light-field imaging systems in conjunction with tomographic reconstruction techniques are also capable of estimating the imaged volume and have been successfully applied to particle image velocimetry (PIV). However, while tomographic volumetric estimation through algorithms such as multiplicative algebraic reconstruction techniques (MART) has proven to be highly accurate, such algorithms are computationally intensive. In this paper, the reconstruction problem is shown to be solvable by deconvolution. Deconvolution offers significant improvement in computational efficiency through the use of fast Fourier transforms (FFTs) when compared to other tomographic methods. This work describes a deconvolution algorithm designed to reconstruct a 3-D particle field from simulated plenoptic data. A 3-D extension of existing 2-D FFT-based refocusing techniques is presented to further improve efficiency when computing object focal stacks and system point spread functions (PSF). Reconstruction artifacts are identified; their underlying source and methods of mitigation are explored where possible, and reconstructions of simulated particle fields are provided.
A Power Efficient Exaflop Computer Design for Global Cloud System Resolving Climate Models.
NASA Astrophysics Data System (ADS)
Wehner, M. F.; Oliker, L.; Shalf, J.
2008-12-01
Exascale computers would allow routine ensemble modeling of the global climate system at the cloud system resolving scale. Power and cost requirements of traditional architecture systems are likely to delay such capability for many years. We present an alternative route to the exascale using embedded processor technology to design a system optimized for ultra high resolution climate modeling. These power efficient processors, used in consumer electronic devices such as mobile phones, portable music players, cameras, etc., can be tailored to the specific needs of scientific computing. We project that a system capable of integrating a kilometer scale climate model a thousand times faster than real time could be designed and built in a five year time scale for US$75M with a power consumption of 3MW. This is cheaper, more power efficient and sooner than any other existing technology.
Yalavarthy, Phaneendra K; Lynch, Daniel R; Pogue, Brian W; Dehghani, Hamid; Paulsen, Keith D
2008-05-01
Three-dimensional (3D) diffuse optical tomography is known to be a nonlinear, ill-posed and sometimes under-determined problem, where regularization is added to the minimization to allow convergence to a unique solution. In this work, a generalized least-squares (GLS) minimization method was implemented, which employs weight matrices for both data-model misfit and optical properties to include their variances and covariances, using a computationally efficient scheme. This allows inversion of a matrix that is of a dimension dictated by the number of measurements, instead of by the number of imaging parameters. This increases the computation speed up to four times per iteration in most of the under-determined 3D imaging problems. An analytic derivation, using the Sherman-Morrison-Woodbury identity, is shown for this efficient alternative form and it is proven to be equivalent, not only analytically, but also numerically. Equivalent alternative forms for other minimization methods, like Levenberg-Marquardt (LM) and Tikhonov, are also derived. Three-dimensional reconstruction results indicate that the poor recovery of quantitatively accurate values in 3D optical images can also be a characteristic of the reconstruction algorithm, along with the target size. Interestingly, usage of GLS reconstruction methods reduces error in the periphery of the image, as expected, and improves by 20% the ability to quantify local interior regions in terms of the recovered optical contrast, as compared to LM methods. Characterization of detector photo-multiplier tube noise has enabled the use of the GLS method for reconstructing experimental data and showed promise for better quantification of targets in 3D optical imaging. Use of these new alternative forms becomes effective when the number of imaging property parameters exceeds the number of measurements by a factor greater than 2.
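The matrix-size argument above can be made concrete with the standard form of the Sherman-Morrison-Woodbury (push-through) identity. The notation is assumed for illustration and is not taken from the paper: J is the M × N Jacobian (M measurements, N imaging parameters), W_d is the M × M data weight matrix, and W_m is the N × N parameter weight matrix, both assumed invertible.

```latex
% Left-hand side inverts an N x N matrix (number of parameters);
% right-hand side inverts an M x M matrix (number of measurements).
\[
  \left( J^{T} W_d J + W_m \right)^{-1} J^{T} W_d
  \;=\;
  W_m^{-1} J^{T} \left( J W_m^{-1} J^{T} + W_d^{-1} \right)^{-1}
\]
% Tikhonov special case, W_d = I and W_m = \lambda I:
\[
  \left( J^{T} J + \lambda I \right)^{-1} J^{T}
  \;=\;
  J^{T} \left( J J^{T} + \lambda I \right)^{-1}
\]
```

When N is much larger than M, as in under-determined problems, the right-hand form is the cheaper one to evaluate.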
NASA Technical Reports Server (NTRS)
Marconi, F.; Yaeger, L.
1976-01-01
A numerical procedure was developed to compute the inviscid super/hypersonic flow field about complex vehicle geometries accurately and efficiently. A second-order accurate finite difference scheme is used to integrate the three-dimensional Euler equations in regions of continuous flow, while all shock waves are computed as discontinuities via the Rankine-Hugoniot jump conditions. Conformal mappings are used to develop a computational grid. The effects of blunt nose entropy layers are computed in detail. Real gas effects for equilibrium air are included using curve fits of Mollier charts. Typical calculated results for shuttle orbiter, hypersonic transport, and supersonic aircraft configurations are included to demonstrate the usefulness of this tool.
VirusDetect: An automated pipeline for efficient virus discovery using deep sequencing of small RNAs
USDA-ARS's Scientific Manuscript database
Accurate detection of viruses in plants and animals is critical for agriculture production and human health. Deep sequencing and assembly of virus-derived siRNAs has proven to be a highly efficient approach for virus discovery. However, to date no computational tools specifically designed for both k...
Memory conformity affects inaccurate memories more than accurate memories.
Wright, Daniel B; Villalba, Daniella K
2012-01-01
After controlling for initial confidence, inaccurate memories were shown to be more easily distorted than accurate memories. In two experiments groups of participants viewed 50 stimuli and were then presented with these stimuli plus 50 fillers. During this test phase participants reported their confidence that each stimulus was originally shown. This was followed by computer-generated responses from a bogus participant. After being exposed to this response participants again rated the confidence of their memory. The computer-generated responses systematically distorted participants' responses. Memory distortion depended on initial memory confidence, with uncertain memories being more malleable than confident memories. This effect was moderated by whether the participant's memory was initially accurate or inaccurate. Inaccurate memories were more malleable than accurate memories. The data were consistent with a model describing two types of memory (i.e., recollective and non-recollective memories), which differ in how susceptible these memories are to memory distortion.
Security Applications Of Computer Motion Detection
NASA Astrophysics Data System (ADS)
Bernat, Andrew P.; Nelan, Joseph; Riter, Stephen; Frankel, Harry
1987-05-01
An important area of application of computer vision is the detection of human motion in security systems. This paper describes the development of a computer vision system which can detect and track human movement across the international border between the United States and Mexico. Because of the wide range of environmental conditions, this application represents a stringent test of computer vision algorithms for motion detection and object identification. The desired output of this vision system is accurate, real-time locations for individual aliens and accurate statistical data as to the frequency of illegal border crossings. Because most detection and tracking routines assume rigid body motion, which is not characteristic of humans, new algorithms capable of reliable operation in our application are required. Furthermore, most current detection and tracking algorithms assume a uniform background against which motion is viewed - the urban environment along the US-Mexican border is anything but uniform. The system works in three stages: motion detection, object tracking and object identification. We have implemented motion detection using simple frame differencing, maximum likelihood estimation, mean and median tests and are evaluating them for accuracy and computational efficiency. Due to the complex nature of the urban environment (background and foreground objects consisting of buildings, vegetation, vehicles, wind-blown debris, animals, etc.), motion detection alone is not sufficiently accurate. Object tracking and identification are handled by an expert system which takes shape, location and trajectory information as input and determines if the moving object is indeed representative of an illegal border crossing.
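Of the motion-detection methods listed above, simple frame differencing is the easiest to illustrate. The sketch below is a generic version, not the system described in the paper: frames are assumed to be equal-sized 2D lists of grayscale intensities, and the threshold value is an arbitrary illustrative choice.

```python
def motion_mask(prev_frame, curr_frame, threshold):
    """Mark pixels whose intensity changed by more than `threshold`."""
    return [
        [abs(curr - prev) > threshold for prev, curr in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def moving_pixel_count(mask):
    """Count the pixels flagged as moving."""
    return sum(sum(row) for row in mask)
```

A non-uniform, changing background (wind-blown debris, vegetation) produces many flagged pixels under this rule, which is consistent with the abstract's remark that motion detection alone is not sufficiently accurate.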
Guo, Zhi-Jun; Lin, Qiang; Liu, Hai-Tao; Lu, Jun-Ying; Zeng, Yan-Hong; Meng, Fan-Jie; Cao, Bin; Zi, Xue-Rong; Han, Shu-Ming; Zhang, Yu-Huan
2013-09-01
Using computed tomography (CT) to rapidly and accurately quantify pleural effusion volume benefits medical and scientific research. However, precise measurement of pleural effusion volume still involves many challenges, and there is currently no recognized accurate method of measurement. This study aimed to explore the feasibility of using 64-slice CT volume-rendering technology to accurately measure pleural fluid volume and to then analyze the correlation between the volume of the free pleural effusion and the different diameters of the pleural effusion. The 64-slice CT volume-rendering technique was used to measure and analyze three parts. First, the fluid volume of a self-made thoracic model was measured and compared with the actual injected volume. Second, the pleural effusion volume was measured before and after pleural fluid drainage in 25 patients, and the volume reduction was compared with the actual volume of the liquid extract. Finally, the free pleural effusion volume was measured in 26 patients to analyze the correlation between it and the diameter of the effusion, which was then used to calculate the regression equation. After using the 64-slice CT volume-rendering technique to measure the fluid volume of the self-made thoracic model, the results were compared with the actual injection volume. No significant differences were found, P = 0.836. For the 25 patients with drained pleural effusions, the comparison of the reduction volume with the actual volume of the liquid extract revealed no significant differences, P = 0.989. The following linear regression equation was used to compare the pleural effusion volume (V) (measured by the CT volume-rendering technique) with the pleural effusion greatest depth (d): V = 158.16 × d - 116.01 (r = 0.91, P = 0.000). The following linear regression was used to compare the volume with the product of the pleural effusion diameters (l × h × d): V = 0.56 × (l × h × d) + 39.44 (r = 0.92, P = 0.000). The 64-slice CT volume-rendering technique can
NASA Technical Reports Server (NTRS)
Janetzke, David C.; Murthy, Durbha V.
1991-01-01
Aeroelastic analysis is multi-disciplinary and computationally expensive. Hence, it can greatly benefit from parallel processing. As part of an effort to develop an aeroelastic capability on a distributed memory transputer network, a parallel algorithm for the computation of aerodynamic influence coefficients is implemented on a network of 32 transputers. The aerodynamic influence coefficients are calculated using a 3-D unsteady aerodynamic model and a parallel discretization. Efficiencies up to 85 percent were demonstrated using 32 processors. The effects of subtask ordering, problem size, and network topology are presented. A comparison to results on a shared memory computer indicates that higher speedup is achieved on the distributed memory system.
TU-AB-303-08: GPU-Based Software Platform for Efficient Image-Guided Adaptive Radiation Therapy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Park, S; Robinson, A; McNutt, T
2015-06-15
Purpose: In this study, we develop an integrated software platform for adaptive radiation therapy (ART) that combines fast and accurate image registration, segmentation, and dose computation/accumulation methods. Methods: The proposed system consists of three key components: 1) deformable image registration (DIR), 2) automatic segmentation, and 3) dose computation/accumulation. The computationally intensive modules including DIR and dose computation have been implemented on a graphics processing unit (GPU). All required patient-specific data including the planning CT (pCT) with contours, daily cone-beam CTs, and treatment plan are automatically queried and retrieved from their own databases. To improve the accuracy of DIR between pCT and CBCTs, we use the double force demons DIR algorithm in combination with iterative CBCT intensity correction by local intensity histogram matching. Segmentation of daily CBCT is then obtained by propagating contours from the pCT. Daily dose delivered to the patient is computed on the registered pCT by a GPU-accelerated superposition/convolution algorithm. Finally, computed daily doses are accumulated to show the total delivered dose to date. Results: Since the accuracy of DIR critically affects the quality of the other processes, we first evaluated our DIR method on eight head-and-neck cancer cases and compared its performance. Normalized mutual-information (NMI) and normalized cross-correlation (NCC) were computed as similarity measures, and our method produced overall NMI of 0.663 and NCC of 0.987, outperforming conventional methods by 3.8% and 1.9%, respectively. Experimental results show that our registration method is more consistent and robust than existing algorithms, and also computationally efficient. Computation time at each fraction took around one minute (30–50 seconds for registration and 15–25 seconds for dose computation). Conclusion: We developed an integrated GPU-accelerated software platform that enables
Efficient simulation of incompressible viscous flow over multi-element airfoils
NASA Technical Reports Server (NTRS)
Rogers, Stuart E.; Wiltberger, N. Lyn; Kwak, Dochan
1992-01-01
The incompressible, viscous, turbulent flow over single and multi-element airfoils is numerically simulated in an efficient manner by solving the incompressible Navier-Stokes equations. The computer code uses the method of pseudo-compressibility with an upwind-differencing scheme for the convective fluxes and an implicit line-relaxation solution algorithm. The motivation for this work includes interest in studying the high-lift take-off and landing configurations of various aircraft. In particular, accurate computation of lift and drag at various angles of attack, up to stall, is desired. Two different turbulence models are tested in computing the flow over an NACA 4412 airfoil; an accurate prediction of stall is obtained. The approach used for multi-element airfoils involves the use of multiple zones of structured grids fitted to each element. Two different approaches are compared: a patched system of grids, and an overlaid Chimera system of grids. Computational results are presented for two-element, three-element, and four-element airfoil configurations. Excellent agreement with experimental surface pressure coefficients is seen. The code converges in less than 200 iterations, requiring on the order of one minute of CPU time (on a CRAY YMP) per element in the airfoil configuration.
Step-by-step magic state encoding for efficient fault-tolerant quantum computation
Goto, Hayato
2014-01-01
Quantum error correction allows one to make quantum computers fault-tolerant against unavoidable errors due to decoherence and imperfect physical gate operations. However, the fault-tolerant quantum computation requires impractically large computational resources for useful applications. This is a current major obstacle to the realization of a quantum computer. In particular, magic state distillation, which is a standard approach to universality, consumes the most resources in fault-tolerant quantum computation. For the resource problem, here we propose step-by-step magic state encoding for concatenated quantum codes, where magic states are encoded step by step from the physical level to the logical one. To manage errors during the encoding, we carefully use error detection. Since the sizes of intermediate codes are small, it is expected that the resource overheads will become lower than previous approaches based on the distillation at the logical level. Our simulation results suggest that the resource requirements for a logical magic state will become comparable to those for a single logical controlled-NOT gate. Thus, the present method opens a new possibility for efficient fault-tolerant quantum computation. PMID:25511387
Autonomic Closure for Turbulent Flows Using Approximate Bayesian Computation
NASA Astrophysics Data System (ADS)
Doronina, Olga; Christopher, Jason; Hamlington, Peter; Dahm, Werner
2017-11-01
Autonomic closure is a new technique for achieving fully adaptive and physically accurate closure of coarse-grained turbulent flow governing equations, such as those solved in large eddy simulations (LES). Although autonomic closure has been shown in recent a priori tests to more accurately represent unclosed terms than do dynamic versions of traditional LES models, the computational cost of the approach makes it challenging to implement for simulations of practical turbulent flows at realistically high Reynolds numbers. The optimization step used in the approach introduces large matrices that must be inverted and is highly memory intensive. In order to reduce memory requirements, here we propose to use approximate Bayesian computation (ABC) in place of the optimization step, thereby yielding a computationally-efficient implementation of autonomic closure that trades memory-intensive for processor-intensive computations. The latter challenge can be overcome as co-processors such as general purpose graphical processing units become increasingly available on current generation petascale and exascale supercomputers. In this work, we outline the formulation of ABC-enabled autonomic closure and present initial results demonstrating the accuracy and computational cost of the approach.
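The trade of memory-intensive optimization for processor-intensive sampling can be illustrated with the simplest form of approximate Bayesian computation, rejection sampling. This is a generic sketch and not the authors' formulation of ABC-enabled autonomic closure: the toy model (a single coefficient c in y = c·x), the uniform prior, the distance measure, and all names are assumptions chosen for the example.

```python
import random


def abc_rejection(observed, simulate, sample_prior, distance, tolerance,
                  n_draws, rng):
    """Keep prior draws whose simulated output lies close to the observation.

    No matrix is built or inverted; the cost is n_draws model evaluations.
    """
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        if distance(simulate(theta), observed) <= tolerance:
            accepted.append(theta)
    return accepted


# Toy problem (hypothetical): recover c in y = c * x from reference data.
XS = [1.0, 2.0, 3.0]
OBSERVED = [2.0 * x for x in XS]


def toy_simulate(c):
    return [c * x for x in XS]


def toy_prior(rng):
    return rng.uniform(0.0, 5.0)


def toy_distance(a, b):
    return max(abs(p - q) for p, q in zip(a, b))


accepted = abc_rejection(OBSERVED, toy_simulate, toy_prior, toy_distance,
                         tolerance=0.3, n_draws=2000, rng=random.Random(1))
```

The accepted values approximate the posterior for c. Each draw is independent of the others, which is why this kind of sampling maps naturally onto co-processors.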
Computational methods for structural load and resistance modeling
NASA Technical Reports Server (NTRS)
Thacker, B. H.; Millwater, H. R.; Harren, S. V.
1991-01-01
An automated capability for computing structural reliability considering uncertainties in both load and resistance variables is presented. The computations are carried out using an automated Advanced Mean Value iteration algorithm (AMV+) with performance functions involving load and resistance variables obtained by both explicit and implicit methods. A complete description of the procedures used is given as well as several illustrative examples, verified by Monte Carlo Analysis. In particular, the computational methods described in the paper are shown to be quite accurate and efficient for a material nonlinear structure considering material damage as a function of several primitive random variables. The results show clearly the effectiveness of the algorithms for computing the reliability of large-scale structural systems with a maximum number of resolutions.
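The Monte Carlo verification mentioned above can be sketched for the simplest load and resistance problem. This is not the AMV+ algorithm; it is a brute-force check under assumptions chosen for illustration: the performance function is g = R - L, failure is g < 0, and R and L are independent normal variables, for which a closed-form answer exists for comparison.

```python
import math
import random


def failure_probability_mc(mean_r, std_r, mean_l, std_l, n_samples, seed=0):
    """Estimate P(R - L < 0) by direct sampling of resistance R and load L."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        resistance = rng.gauss(mean_r, std_r)
        load = rng.gauss(mean_l, std_l)
        if resistance - load < 0.0:
            failures += 1
    return failures / n_samples


def failure_probability_exact(mean_r, std_r, mean_l, std_l):
    """Closed form for independent normals: Phi(-beta), beta = reliability index."""
    beta = (mean_r - mean_l) / math.sqrt(std_r ** 2 + std_l ** 2)
    return 0.5 * math.erfc(beta / math.sqrt(2.0))
```

Direct sampling needs many evaluations of the performance function to resolve small failure probabilities, which is the cost that approximate methods such as mean-value iterations aim to avoid.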
Supernovae Discovery Efficiency
NASA Astrophysics Data System (ADS)
John, Colin
2018-01-01
We present supernovae (SN) search efficiency measurements for recent Hubble Space Telescope (HST) surveys. Efficiency is a key component of any search, and is an important parameter as a correction factor for SN rates. To achieve an accurate value for efficiency, many supernovae need to be discoverable in surveys. This cannot be achieved from real SN only, due to their scarcity, so fake SN are planted. These fake supernovae, planted with a goal of realism in mind, yield an understanding of efficiency based on position relative to other celestial objects, and on brightness. To improve realism, we built a more accurate model of supernovae using a point-spread function. The next improvement to realism is planting these objects close to galaxies and with various values of brightness, magnitude, local galactic brightness and redshift. Once these are planted, a very accurate SN is visible and discoverable by the searcher. It is very important to find factors that affect this discovery efficiency. Exploring the factors that affect detection yields a more accurate correction factor. Further inquiries into efficiency give us a better understanding of image processing, searching techniques and survey strategies, and result in an overall higher likelihood of finding these events in future surveys with the Hubble, James Webb, and WFIRST telescopes. After efficiency is discovered and refined with many unique surveys, it factors into measurements of SN rates versus redshift. By comparing SN rates versus redshift against the star formation rate we can test models to determine how long star systems take from the point of inception to explosion (delay time distribution). This delay time distribution is compared to SN progenitor models to get an accurate idea of what these stars were like before their deaths.
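The efficiency measurement described above amounts to a recovered fraction of planted fakes. The sketch below bins that fraction by magnitude; the input format (pairs of magnitude and a recovered flag), the bin width, and the function name are assumptions for illustration and do not reflect the survey's actual pipeline.

```python
import math
from collections import defaultdict


def efficiency_by_bin(fakes, bin_width):
    """Return {bin lower edge: recovered fraction} for planted fake SN.

    `fakes` is an iterable of (magnitude, recovered) pairs, where
    `recovered` is True if the searcher found the planted object.
    """
    planted = defaultdict(int)
    found = defaultdict(int)
    for magnitude, recovered in fakes:
        edge = math.floor(magnitude / bin_width) * bin_width
        planted[edge] += 1
        if recovered:
            found[edge] += 1
    return {edge: found[edge] / planted[edge] for edge in sorted(planted)}
```

The same binning could be applied to other planted parameters named in the abstract, such as local galactic brightness or redshift.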
Accurate integration over atomic regions bounded by zero-flux surfaces.
Polestshuk, Pavel M
2013-01-30
An approach for integration over a region bounded by a zero-flux surface is described. This approach, based on the surface triangulation technique, is efficiently realized in a newly developed program, TWOE. The elaborated method is tested on several atomic properties including the source function. TWOE results are compared with those produced by well-known existing programs. Absolute errors in computed atomic properties are shown to range usually from 10^-6 to 10^-5 au. The demonstrative examples prove that the present realization has perfect convergence of atomic properties with increasing size of the angular grid and allows highly accurate data to be obtained even in the most difficult cases. It is believed that the developed program can serve as a bridgehead for implementing atomic partitioning of any desired molecular property with high accuracy. Copyright © 2012 Wiley Periodicals, Inc.
The feasibility of an efficient drug design method with high-performance computers.
Yamashita, Takefumi; Ueda, Akihiko; Mitsui, Takashi; Tomonaga, Atsushi; Matsumoto, Shunji; Kodama, Tatsuhiko; Fujitani, Hideaki
2015-01-01
In this study, we propose a supercomputer-assisted drug design approach involving all-atom molecular dynamics (MD)-based binding free energy prediction after the traditional design/selection step. Because this prediction is more accurate than the empirical binding affinity scoring of the traditional approach, the compounds selected by the MD-based prediction should be better drug candidates. In this study, we discuss the applicability of the new approach using two examples. Although the MD-based binding free energy prediction has a huge computational cost, it is feasible with the latest 10 petaflop-scale computer. The supercomputer-assisted drug design approach also involves two important feedback procedures: The first feedback is generated from the MD-based binding free energy prediction step to the drug design step. While the experimental feedback usually provides binding affinities of tens of compounds at one time, the supercomputer allows us to simultaneously obtain the binding free energies of hundreds of compounds. Because the number of calculated binding free energies is sufficiently large, the compounds can be classified into different categories whose properties will aid in the design of the next generation of drug candidates. The second feedback, which occurs from the experiments to the MD simulations, is important to validate the simulation parameters. To demonstrate this, we compare the binding free energies calculated with various force fields to the experimental ones. The results indicate that the prediction will not be very successful, if we use an inaccurate force field. By improving/validating such simulation parameters, the next prediction can be made more accurate.
Calculations of 3D compressible flows using an efficient low diffusion upwind scheme
NASA Astrophysics Data System (ADS)
Hu, Zongjun; Zha, Gecheng
2005-01-01
A newly suggested E-CUSP upwind scheme is employed for the first time to calculate 3D flows of propulsion systems. The E-CUSP scheme contains the total energy in the convective vector and is fully consistent with the characteristic directions. The scheme is proved to have low diffusion and high CPU efficiency. The computed cases in this paper include a transonic nozzle with circular-to-rectangular cross-section, a transonic duct with shock wave/turbulent boundary layer interaction, and a subsonic 3D compressor cascade. The computed results agree well with the experiments. The new scheme is proved to be accurate, efficient and robust for the 3D calculations of the flows in this paper.
Accurate Finite Difference Algorithms
NASA Technical Reports Server (NTRS)
Goodrich, John W.
1996-01-01
Two families of finite difference algorithms for computational aeroacoustics are presented and compared. All of the algorithms are single-step explicit methods; they have the same order of accuracy in both space and time, with examples up to eleventh order, and they have multidimensional extensions. One of the algorithm families has spectral-like high resolution. Propagation with high-order, high-resolution algorithms can produce accurate results after O(10^6) periods of propagation with eight grid points per wavelength.
Efficient Parallel Algorithm For Direct Numerical Simulation of Turbulent Flows
NASA Technical Reports Server (NTRS)
Moitra, Stuti; Gatski, Thomas B.
1997-01-01
A distributed algorithm for a high-order-accurate finite-difference approach to the direct numerical simulation (DNS) of transition and turbulence in compressible flows is described. This work has two major objectives. The first objective is to demonstrate that parallel and distributed-memory machines can be successfully and efficiently used to solve computationally intensive and input/output intensive algorithms of the DNS class. The second objective is to show that the computational complexity involved in solving the tridiagonal systems inherent in the DNS algorithm can be reduced by algorithm innovations that obviate the need to use a parallelized tridiagonal solver.
Merced-Grafals, Emmanuelle J; Dávila, Noraica; Ge, Ning; Williams, R Stanley; Strachan, John Paul
2016-09-09
Beyond use as high density non-volatile memories, memristors have potential as synaptic components of neuromorphic systems. We investigated the suitability of tantalum oxide (TaOx) transistor-memristor (1T1R) arrays for such applications, particularly the ability to accurately, repeatedly, and rapidly reach arbitrary conductance states. Programming is performed by applying an adaptive pulsed algorithm that utilizes the transistor gate voltage to control the SET switching operation and increase programming speed of the 1T1R cells. We show the capability of programming 64 conductance levels with <0.5% average accuracy using 100 ns pulses and studied the trade-offs between programming speed and programming error. The algorithm is also utilized to program 16 conductance levels on a population of cells in the 1T1R array showing robustness to cell-to-cell variability. In general, the proposed algorithm results in approximately 10× improvement in programming speed over standard algorithms that do not use the transistor gate to control memristor switching. In addition, after only two programming pulses (an initialization pulse followed by a programming pulse), the resulting conductance values are within 12% of the target values in all cases. Finally, endurance of more than 10^6 cycles is shown through open-loop (single pulses) programming across multiple conductance levels using the optimized gate voltage of the transistor. These results are relevant for applications that require high speed, accurate, and repeatable programming of the cells such as in neural networks and analog data processing.
Accurate estimation of human body orientation from RGB-D sensors.
Liu, Wu; Zhang, Yongdong; Tang, Sheng; Tang, Jinhui; Hong, Richang; Li, Jintao
2013-10-01
Accurate estimation of human body orientation can significantly enhance the analysis of human behavior, which is a fundamental task in the field of computer vision. However, existing orientation estimation methods cannot handle the various body poses and appearances. In this paper, we propose an innovative RGB-D-based orientation estimation method to address these challenges. By utilizing the RGB-D information, which can be acquired in real time by RGB-D sensors, our method is robust to cluttered environments, illumination changes and partial occlusions. Specifically, efficient static and motion cue extraction methods are proposed based on the RGB-D superpixels to reduce the noise of depth data. Since it is hard to discriminate all 360° of orientation using static cues or motion cues independently, we propose to utilize a dynamic Bayesian network system (DBNS) to effectively employ the complementary nature of both static and motion cues. In order to verify our proposed method, we build an RGB-D-based human body orientation dataset that covers a wide diversity of poses and appearances. Our intensive experimental evaluations on this dataset demonstrate the effectiveness and efficiency of the proposed method.
Computational Pollutant Environment Assessment from Propulsion-System Testing
NASA Technical Reports Server (NTRS)
Wang, Ten-See; McConnaughey, Paul; Chen, Yen-Sen; Warsi, Saif
1996-01-01
An asymptotic plume growth method based on a time-accurate three-dimensional computational fluid dynamics formulation has been developed to assess the exhaust-plume pollutant environment from a simulated RD-170 engine hot-fire test on the F1 Test Stand at Marshall Space Flight Center. Researchers have long known that rocket-engine hot firing has the potential for forming thermal nitric oxides, as well as producing carbon monoxide when hydrocarbon fuels are used. Because of the complex physics involved, most attempts to predict the pollutant emissions from ground-based engine testing have used simplified methods, which may grossly underpredict and/or overpredict the pollutant formations in a test environment. The objective of this work has been to develop a computational fluid dynamics-based methodology that replicates the underlying test-stand flow physics to accurately and efficiently assess pollutant emissions from ground-based rocket-engine testing. A nominal RD-170 engine hot-fire test was computed, and pertinent test-stand flow physics was captured. The predicted total emission rates compared reasonably well with those of the existing hydrocarbon engine hot-firing test data.
Efficient forced vibration reanalysis method for rotating electric machines
NASA Astrophysics Data System (ADS)
Saito, Akira; Suzuki, Hiromitsu; Kuroishi, Masakatsu; Nakai, Hideo
2015-01-01
Rotating electric machines are subject to forced vibration by magnetic force excitation with a wide-band frequency spectrum that depends on the operating conditions. Therefore, when designing electric machines, it is necessary to compute the vibration response of the machines at various operating conditions efficiently and accurately. This paper presents an efficient frequency-domain vibration analysis method for electric machines. The method enables the efficient re-analysis of the vibration response of electric machines at various operating conditions without the necessity to re-compute the harmonic response by finite element analyses. The theoretical background of the proposed method is provided, which is based on the modal reduction of the magnetic force excitation by a set of amplitude-modulated standing waves. The method is applied to the forced response vibration of an interior permanent magnet motor at a fixed operating condition. The results computed by the proposed method agree very well with those computed by the conventional harmonic response analysis by FEA. The proposed method is then applied to the spin-up test condition to demonstrate its applicability to various operating conditions. It is observed that the proposed method can successfully be applied to the spin-up test conditions, and the measured dominant frequency peaks in the frequency response can be well captured by the proposed approach.
Seismic imaging using finite-differences and parallel computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ober, C.C.
1997-12-31
A key to reducing the risks and costs associated with oil and gas exploration is the fast, accurate imaging of complex geologies, such as salt domes in the Gulf of Mexico and overthrust regions in US onshore regions. Prestack depth migration generally yields the most accurate images, and one approach to this is to solve the scalar wave equation using finite differences. As part of an ongoing ACTI project funded by the US Department of Energy, a finite difference, 3-D prestack, depth migration code has been developed. The goal of this work is to demonstrate that massively parallel computers can be used efficiently for seismic imaging, and that sufficient computing power exists (or soon will exist) to make finite difference, prestack, depth migration practical for oil and gas exploration. Several problems had to be addressed to get an efficient code for the Intel Paragon. These include efficient I/O, efficient parallel tridiagonal solves, and high single-node performance. Furthermore, to provide portable code the author has been restricted to the use of high-level programming languages (C and Fortran) and interprocessor communications using MPI. He has been using the SUNMOS operating system, which has affected many of his programming decisions. He will present images created from two verification datasets (the Marmousi Model and the SEG/EAEG 3D Salt Model). Also, he will show recent images from real datasets, and point out locations of improved imaging. Finally, he will discuss areas of current research which will hopefully improve the image quality and reduce computational costs.
Gray, Alan; Harlen, Oliver G; Harris, Sarah A; Khalid, Syma; Leung, Yuk Ming; Lonsdale, Richard; Mulholland, Adrian J; Pearson, Arwen R; Read, Daniel J; Richardson, Robin A
2015-01-01
Despite huge advances in the computational techniques available for simulating biomolecules at the quantum-mechanical, atomistic and coarse-grained levels, there is still a widespread perception amongst the experimental community that these calculations are highly specialist and are not generally applicable by researchers outside the theoretical community. In this article, the successes and limitations of biomolecular simulation and the further developments that are likely in the near future are discussed. A brief overview is also provided of the experimental biophysical methods that are commonly used to probe biomolecular structure and dynamics, and the accuracy of the information that can be obtained from each is compared with that from modelling. It is concluded that progress towards an accurate spatial and temporal model of biomacromolecules requires a combination of all of these biophysical techniques, both experimental and computational.
Multiphysics Computational Analysis of a Solid-Core Nuclear Thermal Engine Thrust Chamber
NASA Technical Reports Server (NTRS)
Wang, Ten-See; Canabal, Francisco; Cheng, Gary; Chen, Yen-Sen
2007-01-01
The objective of this effort is to develop an efficient and accurate computational heat transfer methodology to predict thermal, fluid, and hydrogen environments for a hypothetical solid-core, nuclear thermal engine - the Small Engine. In addition, the effects of power profile and hydrogen conversion on heat transfer efficiency and thrust performance were also investigated. The computational methodology is based on an unstructured-grid, pressure-based, all speeds, chemically reacting, computational fluid dynamics platform, while formulations of conjugate heat transfer were implemented to describe the heat transfer from solid to hydrogen inside the solid-core reactor. The computational domain covers the entire thrust chamber so that the afore-mentioned heat transfer effects impact the thrust performance directly. The result shows that the computed core-exit gas temperature, specific impulse, and core pressure drop agree well with those of design data for the Small Engine. Finite-rate chemistry is very important in predicting the proper energy balance as naturally occurring hydrogen decomposition is endothermic. Locally strong hydrogen conversion associated with centralized power profile gives poor heat transfer efficiency and lower thrust performance. On the other hand, uniform hydrogen conversion associated with a more uniform radial power profile achieves higher heat transfer efficiency, and higher thrust performance.
Toward an efficient Photometric Supernova Classifier
NASA Astrophysics Data System (ADS)
McClain, Bradley
2018-01-01
The Sloan Digital Sky Survey Supernova Survey (SDSS) discovered more than 1,000 Type Ia Supernovae, yet less than half of these have spectroscopic measurements. As wide-field imaging telescopes such as The Dark Energy Survey (DES) and the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS) discover more supernovae, the need for accurate and computationally cheap photometric classifiers increases. My goal is to use a photometric classification algorithm based on Sncosmo, a python library for supernova cosmology analysis, to reclassify previously identified Hubble SN and other non-spectroscopically confirmed surveys. My results will be compared to other photometric classifiers such as PSNID and STARDUST. In the near future, I expect to have the algorithm validated with simulated data, optimized for efficiency, and applied with high performance computing to real data.
Efficient and accurate time-stepping schemes for integrate-and-fire neuronal networks.
Shelley, M J; Tao, L
2001-01-01
To avoid the numerical errors associated with resetting the potential following a spike in simulations of integrate-and-fire neuronal networks, Hansel et al. and Shelley independently developed a modified time-stepping method. Their particular scheme consists of second-order Runge-Kutta time-stepping, a linear interpolant to find spike times, and a recalibration of postspike potential using the spike times. Here we show analytically that such a scheme is second order, discuss the conditions under which efficient, higher-order algorithms can be constructed to treat resets, and develop a modified fourth-order scheme. To support our analysis, we simulate a system of integrate-and-fire conductance-based point neurons with all-to-all coupling. For six-digit accuracy, our modified Runge-Kutta fourth-order scheme needs a time-step of Δt = 0.5 × 10^-3 seconds, whereas to achieve comparable accuracy using a recalibrated second-order or a first-order algorithm requires time-steps of 10^-5 seconds or 10^-9 seconds, respectively. Furthermore, since the cortico-cortical conductances in standard integrate-and-fire neuronal networks do not depend on the value of the membrane potential, we can attain fourth-order accuracy with computational costs normally associated with second-order schemes.
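The three ingredients named in the abstract (second-order Runge-Kutta stepping, a linear interpolant for the spike time, and recalibration of the post-spike potential) can be illustrated on a single leaky integrate-and-fire neuron. This is only a minimal sketch under stated assumptions: the model dv/dt = -v + I, the parameter values, and the names `rk2_step` and `simulate_lif` are illustrative and are not taken from the paper, which treats conductance-based networks.

```python
import math

def rk2_step(v, dt, current):
    """One Heun (second-order Runge-Kutta) step for dv/dt = -v + current."""
    k1 = -v + current
    k2 = -(v + dt * k1) + current
    return v + 0.5 * dt * (k1 + k2)

def simulate_lif(current=2.0, dt=1e-2, t_end=20.0, v_thresh=1.0, v_reset=0.0):
    """Return spike times of a single leaky integrate-and-fire neuron."""
    v, spikes = v_reset, []
    for n in range(int(round(t_end / dt))):
        t = n * dt
        v_new = rk2_step(v, dt, current)
        if v_new >= v_thresh:
            # Linear interpolant between (t, v) and (t + dt, v_new)
            # locates the threshold crossing inside the step.
            t_spike = t + dt * (v_thresh - v) / (v_new - v)
            spikes.append(t_spike)
            # Recalibration: restart from the reset value at the spike time
            # and integrate only over the remainder of the step.
            v_new = rk2_step(v_reset, t + dt - t_spike, current)
        v = v_new
    return spikes
```

For this toy model the exact interspike interval is ln(I / (I - 1)), so the mean interval returned by `simulate_lif` can be compared against `math.log(2.0)` for the default current.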
Chaudhry, Jehanzeb Hameed; Estep, Don; Tavener, Simon; Carey, Varis; Sandelin, Jeff
2016-01-01
We consider numerical methods for initial value problems that employ a two stage approach consisting of solution on a relatively coarse discretization followed by solution on a relatively fine discretization. Examples include adaptive error control, parallel-in-time solution schemes, and efficient solution of adjoint problems for computing a posteriori error estimates. We describe a general formulation of two stage computations then perform a general a posteriori error analysis based on computable residuals and solution of an adjoint problem. The analysis accommodates various variations in the two stage computation and in formulation of the adjoint problems. We apply the analysis to compute "dual-weighted" a posteriori error estimates, to develop novel algorithms for efficient solution that take into account cancellation of error, and to the Parareal Algorithm. We test the various results using several numerical examples.
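As an illustration of a two-stage (coarse then fine) computation, the following is a minimal sketch of the standard Parareal iteration for the scalar test problem y' = λy. The choice of forward Euler for both propagators, the test problem, and the function name `parareal` are assumptions made for illustration; the error analysis described in the abstract is not reproduced.

```python
def parareal(y0, lam, t_end, n_slices, n_iters, fine_substeps=100):
    """Parareal iteration for y' = lam * y on [0, t_end] with n_slices time slices."""
    dt = t_end / n_slices

    def coarse(y):
        # Coarse propagator G: one forward Euler step over a whole slice.
        return y * (1.0 + lam * dt)

    def fine(y):
        # Fine propagator F: many small forward Euler steps over the same slice.
        h = dt / fine_substeps
        for _ in range(fine_substeps):
            y = y * (1.0 + lam * h)
        return y

    # Stage 1: serial coarse sweep gives the initial guess at slice boundaries.
    u = [y0]
    for _ in range(n_slices):
        u.append(coarse(u[-1]))

    for _ in range(n_iters):
        # Stage 2: fine solves on each slice (independent, so parallelizable).
        f_vals = [fine(u[n]) for n in range(n_slices)]
        g_old = [coarse(u[n]) for n in range(n_slices)]
        # Serial correction: U_{n+1} = G(U_n_new) + F(U_n_old) - G(U_n_old).
        new = [y0]
        for n in range(n_slices):
            new.append(coarse(new[n]) + f_vals[n] - g_old[n])
        u = new
    return u
```

After as many iterations as there are slices, the iteration reproduces the serial fine solution, which gives a simple consistency check.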
Crowd Sensing-Enabling Security Service Recommendation for Social Fog Computing Systems.
Wu, Jun; Su, Zhou; Wang, Shen; Li, Jianhua
2017-07-30
Fog computing, shifting intelligence and resources from the remote cloud to edge networks, has the potential of providing low latency for communication from sensing data sources to users. For the objects from the Internet of Things (IoT) to the cloud, it is a new trend that the objects establish social-like relationships with each other, which efficiently brings the benefits of developed sociality to a complex environment. As fog services become more sophisticated, it will become more convenient for fog users to share their own services, resources, and data via social networks. Meanwhile, efficient social organization can enable more flexible, secure, and collaborative networking. The aforementioned advantages make the social network a potential architecture for fog computing systems. In this paper, we design an architecture for social fog computing, in which the services of fog are provisioned based on "friend" relationships. To the best of our knowledge, this is the first attempt at an organized fog computing system-based social model. Meanwhile, social networking increases the complexity and security risks of fog computing services, creating difficulties for security service recommendation in social fog computing. To address this, we propose a novel crowd sensing-enabling security service provisioning method to recommend security services accurately in social fog computing systems. Simulation results show the feasibility and efficiency of the crowd sensing-enabling security service recommendation method for social fog computing systems.
NASA Astrophysics Data System (ADS)
Lashkin, S. V.; Kozelkov, A. S.; Yalozo, A. V.; Gerasimov, V. Yu.; Zelensky, D. K.
2017-12-01
This paper describes the details of the parallel implementation of the SIMPLE algorithm for numerical solution of the Navier-Stokes system of equations on arbitrary unstructured grids. The iteration schemes for the serial and parallel versions of the SIMPLE algorithm are implemented. In the description of the parallel implementation, special attention is paid to computational data exchange among processors under the condition of the grid model decomposition using fictitious cells. We discuss the specific features for the storage of distributed matrices and implementation of vector-matrix operations in parallel mode. It is shown that the proposed way of matrix storage reduces the number of interprocessor exchanges. A series of numerical experiments illustrates the effect of the multigrid SLAE solver tuning on the general efficiency of the algorithm; the tuning involves the types of the cycles used (V, W, and F), the number of iterations of a smoothing operator, and the number of cells for coarsening. Two ways (direct and indirect) of efficiency evaluation for parallelization of the numerical algorithm are demonstrated. The paper presents the results of solving some internal and external flow problems with the evaluation of parallelization efficiency by two algorithms. It is shown that the proposed parallel implementation enables efficient computations for the problems on a thousand processors. Based on the results obtained, some general recommendations are made for the optimal tuning of the multigrid solver, as well as for selecting the optimal number of cells per processor.
Computational Challenges of Viscous Incompressible Flows
NASA Technical Reports Server (NTRS)
Kwak, Dochan; Kiris, Cetin; Kim, Chang Sung
2004-01-01
Over the past thirty years, numerical methods and simulation tools for incompressible flows have been advanced as a subset of the computational fluid dynamics (CFD) discipline. Although incompressible flows are encountered in many areas of engineering, simulation of compressible flow has been the major driver for developing computational algorithms and tools. This is probably due to the rather stringent requirements for predicting aerodynamic performance characteristics of flight vehicles, while flow devices involving low-speed or incompressible flow could be reasonably well designed without resorting to accurate numerical simulations. As flow devices are required to be more sophisticated and highly efficient, CFD tools become increasingly important in fluid engineering for incompressible and low-speed flow. This paper reviews some of the successes made possible by advances in computational technologies during the same period, and discusses some of the current challenges faced in computing incompressible flows.
Gaussian Radial Basis Function for Efficient Computation of Forest Indirect Illumination
NASA Astrophysics Data System (ADS)
Abbas, Fayçal; Babahenini, Mohamed Chaouki
2018-06-01
Real-time global illumination of natural scenes such as forests is one of the most complex problems to solve because of the multiple inter-reflections between the light and the materials of the objects composing the scene. The major problem that arises is visibility computation. Visibility is computed for the whole set of leaves visible from the center of a given leaf; given the enormous number of leaves in a tree, performing this computation for every leaf reduces performance. We describe a new approach that approximates visibility queries and proceeds in two steps. The first step generates a point cloud representing the foliage. We assume that the point cloud is composed of two classes (visible, not visible) that are not linearly separable. The second step classifies the point cloud by applying the Gaussian radial basis function, which measures similarity in terms of the distance between each leaf and a landmark leaf. This approximates the visibility queries and extracts the leaves that will be used to calculate the amount of indirect illumination exchanged between neighboring leaves. Our approach treats the light exchanges in a forest scene efficiently: it allows fast computation and produces images of good visual quality, taking advantage of the computational power of the GPU.
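The similarity measure mentioned in the abstract can be sketched as a Gaussian radial basis function of the distance between a leaf position and a landmark leaf. The thresholding rule, the parameter values, and the names `gaussian_rbf` and `classify_visible` below are assumptions for illustration only; the abstract does not specify how the classifier is built or how it runs on the GPU.

```python
import math

def gaussian_rbf(point, landmark, sigma):
    """Gaussian RBF similarity exp(-||p - c||^2 / (2 sigma^2)), in (0, 1]."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(point, landmark))
    return math.exp(-dist_sq / (2.0 * sigma ** 2))

def classify_visible(points, landmark, sigma, threshold=0.5):
    """Label each point as visible (True) when its similarity to the landmark
    reaches the threshold. The threshold rule is a hypothetical stand-in for
    the paper's classifier."""
    return [gaussian_rbf(p, landmark, sigma) >= threshold for p in points]
```

A leaf that coincides with the landmark has similarity 1, and the similarity decays smoothly toward 0 with distance, so `sigma` sets the radius within which leaves are treated as candidates for light exchange.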
NASA Astrophysics Data System (ADS)
Natraj, Vijay; Li, King-Fai; Yung, Yuk L.
2009-02-01
Tables that have been used as a reference for nearly 50 years for the intensity and polarization of reflected and transmitted light in Rayleigh scattering atmospheres have been found to be inaccurate, even to four decimal places. We convert the integral equations describing the X and Y functions into a pair of coupled integro-differential equations that can be efficiently solved numerically. Special care has been taken in evaluating Cauchy principal value integrals and their derivatives that appear in the solution of the Rayleigh scattering problem. The new approach gives results accurate to eight decimal places for the entire range of tabulation (optical thicknesses 0.02-1.0, surface reflectances 0-0.8, solar and viewing zenith angles 0°-88.85°, and relative azimuth angles 0°-180°), including the most difficult case of direct transmission in the direction of the sun. Revised tables have been created and stored electronically for easy reference by the planetary science and astrophysics community.
The importance of accurate interaction potentials in the melting of argon nanoclusters
NASA Astrophysics Data System (ADS)
Pahl, E.; Calvo, F.; Schwerdtfeger, P.
The melting temperatures of argon clusters Ar_N (N = 13, 55, 147, 309, 561, and 923) and of bulk argon have been obtained from exchange Monte Carlo simulations and are compared using different two-body interaction potentials, namely the standard Lennard-Jones (LJ), Aziz and extended Lennard-Jones (ELJ) potentials. The latter potential has many advantages: while maintaining the computational efficiency of the commonly used LJ potential, it is as accurate as the Aziz potential but the computer time scales more favorably with increasing cluster size. By applying the ELJ form and extrapolating the cluster data to the infinite system, we are able to extract the melting point of argon already in good agreement with experimental measurements. By considering the additional Axilrod-Teller three-body contribution as well, we calculate a melting temperature of T_melt^ELJ = 84.7 K compared to the experimental value of T_melt^exp = 83.85 K, whereas the LJ potential underestimates the melting point by more than 7 K. Thus melting temperatures within 1 K accuracy are now feasible.
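For reference, the standard Lennard-Jones pair potential used as the baseline in this comparison is V(r) = 4ε[(σ/r)^12 - (σ/r)^6]. The sketch below evaluates it in reduced units (ε = σ = 1) and sums it over all pairs of a cluster; the function names are illustrative, and neither the ELJ and Aziz forms nor the Axilrod-Teller three-body term are reproduced because the abstract does not give them.

```python
import itertools
import math

def lj_pair(r, epsilon=1.0, sigma=1.0):
    """Standard Lennard-Jones pair energy 4*eps*((sigma/r)^12 - (sigma/r)^6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def cluster_energy(positions, epsilon=1.0, sigma=1.0):
    """Total two-body energy of a cluster: sum of lj_pair over all distinct pairs."""
    return sum(
        lj_pair(math.dist(p, q), epsilon, sigma)
        for p, q in itertools.combinations(positions, 2)
    )
```

The pair energy vanishes at r = σ and reaches its minimum of -ε at r = 2^(1/6) σ, so an equilateral triangle with that side length has energy -3ε.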
T.Z. Ye; K.J.S. Jayawickrama; G.R. Johnson
2006-01-01
Using computer simulation, we evaluated the impact of using first-generation information to increase selection efficiency in a second-generation breeding program. Selection efficiency was compared in terms of increase in rank correlation between estimated and true breeding values (i.e., ranking accuracy), reduction in coefficient of variation of correlation...
Stamatakis, Alexandros; Ott, Michael
2008-12-27
The continuous accumulation of sequence data, for example, due to novel wet-laboratory techniques such as pyrosequencing, coupled with the increasing popularity of multi-gene phylogenies and emerging multi-core processor architectures that face problems of cache congestion, poses new challenges with respect to the efficient computation of the phylogenetic maximum-likelihood (ML) function. Here, we propose two approaches that can significantly speed up likelihood computations that typically represent over 95 per cent of the computational effort conducted by current ML or Bayesian inference programs. Initially, we present a method and an appropriate data structure to efficiently compute the likelihood score on 'gappy' multi-gene alignments. By 'gappy' we denote sampling-induced gaps owing to missing sequences in individual genes (partitions), i.e. not real alignment gaps. A first proof-of-concept implementation in RAXML indicates that this approach can accelerate inferences on large and gappy alignments by approximately one order of magnitude. Moreover, we present insights and initial performance results on multi-core architectures obtained during the transition from an OpenMP-based to a Pthreads-based fine-grained parallelization of the ML function.
GPU and APU computations of Finite Time Lyapunov Exponent fields
NASA Astrophysics Data System (ADS)
Conti, Christian; Rossinelli, Diego; Koumoutsakos, Petros
2012-03-01
We present GPU and APU accelerated computations of Finite-Time Lyapunov Exponent (FTLE) fields. The calculation of FTLEs is a computationally intensive process, as in order to obtain the sharp ridges associated with the Lagrangian Coherent Structures an extensive resampling of the flow field is required. The computational performance of this resampling is limited by the memory bandwidth of the underlying computer architecture. The present technique harnesses data-parallel execution of many-core architectures and relies on fast and accurate evaluations of moment conserving functions for the mesh to particle interpolations. We demonstrate how the computation of FTLEs can be efficiently performed on a GPU and on an APU through OpenCL and we report over one order of magnitude improvements over multi-threaded executions in FTLE computations of bluff body flows.
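The quantity being computed can be sketched on the CPU for a single point: the FTLE is ln(sqrt(λ_max(C))) / |T|, where C = FᵀF is the Cauchy-Green tensor built from the flow-map gradient F over the integration horizon T. The sketch below assumes a user-supplied two-dimensional flow map and uses central differences for F; the name `ftle_2d` and the step size are illustrative, and the mesh-to-particle interpolation and OpenCL parallelization described in the abstract are not shown.

```python
import math

def ftle_2d(flow_map, x, y, horizon, h=1e-4):
    """FTLE at (x, y) for a flow map (x, y) -> (X, Y) over time `horizon`."""
    # Flow-map gradient F = [[a, b], [c, d]] by central differences.
    xr, yr = flow_map(x + h, y)
    xl, yl = flow_map(x - h, y)
    xu, yu = flow_map(x, y + h)
    xd, yd = flow_map(x, y - h)
    a, b = (xr - xl) / (2 * h), (xu - xd) / (2 * h)
    c, d = (yr - yl) / (2 * h), (yu - yd) / (2 * h)
    # Cauchy-Green tensor C = F^T F (symmetric 2x2).
    c11, c12, c22 = a * a + c * c, a * b + c * d, b * b + d * d
    # Largest eigenvalue of a symmetric 2x2 matrix in closed form.
    half_tr = 0.5 * (c11 + c22)
    det = c11 * c22 - c12 * c12
    lam_max = half_tr + math.sqrt(max(half_tr * half_tr - det, 0.0))
    return math.log(math.sqrt(lam_max)) / abs(horizon)
```

For the linear saddle flow x' = x, y' = -y the flow map is (x e^T, y e^-T), and the FTLE equals 1 everywhere, which gives a quick sanity check.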
Computationally efficient simulation of unsteady aerodynamics using POD on the fly
NASA Astrophysics Data System (ADS)
Moreno-Ramos, Ruben; Vega, José M.; Varas, Fernando
2016-12-01
Modern industrial aircraft design requires a large number of sufficiently accurate aerodynamic and aeroelastic simulations. Current computational fluid dynamics (CFD) solvers with aeroelastic capabilities, such as the NASA URANS unstructured solver FUN3D, require very large computational resources. Since a very large number of simulations is necessary, the CFD cost is unaffordable in an industrial production environment and must be significantly reduced. Thus, a more inexpensive, yet sufficiently precise solver is strongly needed. An opportunity to approach this goal could follow some recent results (Terragni and Vega 2014 SIAM J. Appl. Dyn. Syst. 13 330-65; Rapun et al 2015 Int. J. Numer. Meth. Eng. 104 844-68) on an adaptive reduced order model that combines ‘on the fly’ a standard numerical solver (to compute some representative snapshots), proper orthogonal decomposition (POD) (to extract modes from the snapshots), Galerkin projection (onto the set of POD modes), and several additional ingredients such as projecting the equations using a limited number of points and fairly generic mode libraries. When applied to the complex Ginzburg-Landau equation, the method produces acceleration factors (compared with standard numerical solvers) of the order of 20 and 300 in one and two space dimensions, respectively. Unfortunately, the extension of the method to unsteady, compressible flows around deformable geometries requires new approaches to deal with deformable meshes, high Reynolds numbers, and compressibility. A first step in this direction is presented considering the unsteady compressible, two-dimensional flow around an oscillating airfoil using a CFD solver in a rigidly moving mesh. POD on the Fly gives results whose accuracy is comparable to that of the CFD solver used to compute the snapshots.
Efficient Computation of Closed-loop Frequency Response for Large Order Flexible Systems
NASA Technical Reports Server (NTRS)
Maghami, Peiman G.; Giesy, Daniel P.
1997-01-01
An efficient and robust computational scheme is given for the calculation of the frequency response function of a large order, flexible system implemented with a linear, time invariant control system. Advantage is taken of the highly structured sparsity of the system matrix of the plant based on a model of the structure using normal mode coordinates. The computational time per frequency point of the new computational scheme is a linear function of system size, a significant improvement over traditional, full-matrix techniques whose computational times per frequency point range from quadratic to cubic functions of system size. This permits the practical frequency domain analysis of systems of much larger order than by traditional, full-matrix techniques. Formulations are given for both open and closed loop systems. Numerical examples are presented showing the advantages of the present formulation over traditional approaches, both in speed and in accuracy. Using a model with 703 structural modes, a speed-up of almost two orders of magnitude was observed while accuracy improved by up to 5 decimal places.
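The linear cost per frequency point follows from the decoupling that normal mode coordinates provide. As a minimal sketch of the open-loop case only, the textbook modal superposition for a single input and output is H(ω) = Σ_k c_k b_k / (ω_k² - ω² + 2iζ_k ω_k ω), one term per mode. The function name `modal_frf`, the argument names, and the viscous modal damping model are assumptions; the closed-loop formulation of the paper is not reproduced here.

```python
def modal_frf(omega, nat_freqs, dampings, phi_in, phi_out):
    """Open-loop frequency response at angular frequency `omega` by modal
    superposition. Each mode contributes one term, so the cost per frequency
    point grows linearly with the number of modes."""
    response = 0j
    for w_k, zeta_k, b_k, c_k in zip(nat_freqs, dampings, phi_in, phi_out):
        # Modal denominator: w_k^2 - omega^2 + 2j * zeta_k * w_k * omega.
        denom = complex(w_k * w_k - omega * omega, 2.0 * zeta_k * w_k * omega)
        response += c_k * b_k / denom
    return response
```

For a single mode the static response is 1/ω_k², and at resonance the magnitude is 1/(2ζ_k ω_k²), which are convenient spot checks.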
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, S.; Park, S.; Makowski, L.
Small angle X-ray scattering (SAXS) is an increasingly powerful technique to characterize the structure of biomolecules in solution. We present a computational method for accurately and efficiently computing the solution scattering curve from a protein with dynamical fluctuations. The method is built upon a coarse-grained (CG) representation of the protein. This CG approach takes advantage of the low-resolution character of solution scattering. It allows rapid determination of the scattering pattern from conformations extracted from CG simulations to obtain scattering characterization of the protein conformational landscapes. Important elements incorporated in the method include an effective residue-based structure factor for each amino acid, an explicit treatment of the hydration layer at the surface of the protein, and an ensemble average of scattering from all accessible conformations to account for macromolecular flexibility. The CG model is calibrated and illustrated to accurately reproduce the experimental scattering curve of Hen egg white lysozyme. We then illustrate the computational method by calculating the solution scattering pattern of several representative protein folds and multiple conformational states. The results suggest that solution scattering data, when combined with a reliable computational method, have great potential for a better structural description of multi-domain complexes in different functional states, and for recognizing structural folds when sequence similarity to a protein of known structure is low. Possible applications of the method are discussed.
LeDell, Erin; Petersen, Maya; van der Laan, Mark
2015-01-01
In binary classification problems, the area under the ROC curve (AUC) is commonly used to evaluate the performance of a prediction model. Often, it is combined with cross-validation in order to assess how the results will generalize to an independent data set. In order to evaluate the quality of an estimate for cross-validated AUC, we obtain an estimate of its variance. For massive data sets, the process of generating a single performance estimate can be computationally expensive. Additionally, when using a complex prediction method, the process of cross-validating a predictive model on even a relatively small data set can still require a large amount of computation time. Thus, in many practical settings, the bootstrap is a computationally intractable approach to variance estimation. As an alternative to the bootstrap, we demonstrate a computationally efficient influence curve based approach to obtaining a variance estimate for cross-validated AUC. PMID:26279737
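As a point of reference for the quantity being estimated, the sketch below computes AUC as the fraction of positive/negative pairs ranked correctly (ties count one half) and averages it over validation folds. It assumes the scores are already out-of-fold predictions; the function names and the `fold_ids` argument are hypothetical, and the influence-curve variance estimator that the abstract proposes is not implemented here.

```python
import numpy as np

def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size)

def cross_validated_auc(scores, labels, fold_ids):
    """Mean of the per-fold AUCs; fold_ids marks each sample's validation fold."""
    scores, labels, fold_ids = map(np.asarray, (scores, labels, fold_ids))
    per_fold = [auc(scores[fold_ids == f], labels[fold_ids == f])
                for f in np.unique(fold_ids)]
    return float(np.mean(per_fold))
```

The pairwise comparison costs memory proportional to the number of positives times the number of negatives, so a rank-based formulation would be preferable for the massive data sets the abstract mentions.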
An adaptive grid to improve the efficiency and accuracy of modelling underwater noise from shipping
NASA Astrophysics Data System (ADS)
Trigg, Leah; Chen, Feng; Shapiro, Georgy; Ingram, Simon; Embling, Clare
2017-04-01
Underwater noise from shipping is becoming a significant concern and has been listed as a pollutant under Descriptor 11 of the Marine Strategy Framework Directive. Underwater noise models are an essential tool to assess and predict noise levels for regulatory procedures such as environmental impact assessments and ship noise monitoring. There are generally two approaches to noise modelling. The first is based on simplified energy flux models, assuming either spherical or cylindrical propagation of sound energy. These models are very quick but they ignore important water column and seabed properties, and produce significant errors in the areas subject to temperature stratification (Shapiro et al., 2014). The second type of model (e.g. ray-tracing and parabolic equation) is based on an advanced physical representation of sound propagation. However, these acoustic propagation models are computationally expensive to execute. Shipping noise modelling requires spatial discretization in order to group noise sources together using a grid. A uniform grid size is often selected to achieve either the greatest efficiency (i.e. speed of computations) or the greatest accuracy. In contrast, this work aims to produce efficient and accurate noise level predictions by presenting an adaptive grid where cell size varies with distance from the receiver. The spatial range over which a certain cell size is suitable was determined by calculating the distance from the receiver at which propagation loss becomes uniform across a grid cell. The computational efficiency and accuracy of the resulting adaptive grid was tested by comparing it to uniform 1 km and 5 km grids. These represent an accurate and computationally efficient grid respectively. For a case study of the Celtic Sea, an application of the adaptive grid over an area of 160×160 km reduced the number of model executions required from 25600 for a 1 km grid to 5356 in December and to between 5056 and 13132 in August, which
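The two ingredients named above, simple geometric spreading and the cell count of a uniform grid, can be sketched as follows. The formulas are the standard 20 log10(r) and 10 log10(r) spreading laws referenced to 1 m; the function names are hypothetical, and neither the adaptive grid nor the ray-tracing or parabolic equation models of the study are reproduced.

```python
import math

def spreading_loss_db(r_m, mode="spherical"):
    """Geometric propagation loss in dB at range r_m metres (re 1 m)."""
    factor = {"spherical": 20.0, "cylindrical": 10.0}[mode]
    return factor * math.log10(r_m)

def uniform_cell_count(side_km, cell_km):
    """Number of cells in a square domain covered by a uniform grid."""
    return round(side_km / cell_km) ** 2

# A 160 km x 160 km domain needs 25600 cells at 1 km but only 1024 at 5 km,
# which is the efficiency/accuracy trade-off an adaptive grid tries to balance.
fine = uniform_cell_count(160, 1)
coarse = uniform_cell_count(160, 5)
```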
Superpixel-based graph cuts for accurate stereo matching
NASA Astrophysics Data System (ADS)
Feng, Liting; Qin, Kaihuai
2017-06-01
Estimating the surface normal vector and disparity of a pixel simultaneously, also known as the three-dimensional label method, has been widely used in recent continuous stereo matching problems to achieve sub-pixel accuracy. However, due to the infinite label space, it is extremely hard to assign each pixel an appropriate label. In this paper, we present an accurate and efficient algorithm, integrating patchmatch with graph cuts, to approach this critical computational problem. In addition, to obtain a robust and precise matching cost, we use a convolutional neural network to learn a similarity measure on small image patches. Compared with other MRF related methods, our method has several advantages: its sub-modular property ensures a sub-problem optimality which is easy to perform in parallel; graph cuts can simultaneously update multiple pixels, avoiding local minima caused by sequential optimizers like belief propagation; it uses segmentation results for better local expansion move; local propagation and randomization can easily generate the initial solution without using external methods. Middlebury experiments show that our method can get higher accuracy than other MRF-based algorithms.
An efficient method to compute spurious end point contributions in PO solutions. [Physical Optics]
NASA Technical Reports Server (NTRS)
Gupta, Inder J.; Burnside, Walter D.; Pistorius, Carl W. I.
1987-01-01
A method is given to compute the spurious endpoint contributions in the physical optics solution for electromagnetic scattering from conducting bodies. The method is applicable to general three-dimensional structures. The only information required to use the method is the radius of curvature of the body at the shadow boundary. Thus, the method is very efficient for numerical computations. As an illustration, the method is applied to several bodies of revolution to compute the endpoint contributions for backscattering in the case of axial incidence. It is shown that in high-frequency situations, the endpoint contributions obtained using the method are equal to the true endpoint contributions.
Huang, Qinlong; Yang, Yixian; Shi, Yuxiang
2018-02-24
With the growing number of vehicles and popularity of various services in vehicular cloud computing (VCC), message exchanging among vehicles under traffic conditions and in emergency situations is one of the most pressing demands, and has attracted significant attention. However, it is an important challenge to authenticate the legitimate sources of broadcast messages and achieve fine-grained message access control. In this work, we propose SmartVeh, a secure and efficient message access control and authentication scheme in VCC. A hierarchical, attribute-based encryption technique is utilized to achieve fine-grained and flexible message sharing, which ensures that vehicles whose persistent or dynamic attributes satisfy the access policies can access the broadcast message with equipped on-board units (OBUs). Message authentication is enforced by integrating an attribute-based signature, which achieves message authentication and maintains the anonymity of the vehicles. In order to reduce the computations of the OBUs in the vehicles, we outsource the heavy computations of encryption, decryption and signing to a cloud server and road-side units. The theoretical analysis and simulation results reveal that our secure and efficient scheme is suitable for VCC. PMID:29495269
Fast and Accurate Prediction of Stratified Steel Temperature During Holding Period of Ladle
NASA Astrophysics Data System (ADS)
Deodhar, Anirudh; Singh, Umesh; Shukla, Rishabh; Gautham, B. P.; Singh, Amarendra K.
2017-04-01
Thermal stratification of liquid steel in a ladle during the holding period and the teeming operation has a direct bearing on the superheat available at the caster and hence on the caster set points such as casting speed and cooling rates. The changes in the caster set points are typically carried out based on temperature measurements at the end of tundish outlet. Thermal prediction models provide advance knowledge of the influence of process and design parameters on the steel temperature at various stages. Therefore, they can be used in making accurate decisions about the caster set points in real time. However, this requires both fast and accurate thermal prediction models. In this work, we develop a surrogate model for the prediction of thermal stratification using data extracted from a set of computational fluid dynamics (CFD) simulations, pre-determined using a design-of-experiments technique. A regression method is used for training the predictor. The model predicts the stratified temperature profile instantaneously, for a given set of process parameters such as initial steel temperature, refractory heat content, slag thickness, and holding time. More than 96 pct of the predicted values are within an error range of ±5 K (±5 °C), when compared against corresponding CFD results. Considering its accuracy and computational efficiency, the model can be extended for thermal control of casting operations. This work also sets a benchmark for developing similar thermal models for downstream processes such as tundish and caster.
Low-cost, high-performance and efficiency computational photometer design
NASA Astrophysics Data System (ADS)
Siewert, Sam B.; Shihadeh, Jeries; Myers, Randall; Khandhar, Jay; Ivanov, Vitaly
2014-05-01
Researchers at the University of Alaska Anchorage and University of Colorado Boulder have built a low cost high performance and efficiency drop-in-place Computational Photometer (CP) to test in field applications ranging from port security and safety monitoring to environmental compliance monitoring and surveying. The CP integrates off-the-shelf visible spectrum cameras with near to long wavelength infrared detectors and high resolution digital snapshots in a single device. The proof of concept combines three or more detectors into a single multichannel imaging system that can time correlate read-out, capture, and image process all of the channels concurrently with high performance and energy efficiency. The dual-channel continuous read-out is combined with a third high definition digital snapshot capability and has been designed using an FPGA (Field Programmable Gate Array) to capture, decimate, down-convert, re-encode, and transform images from two standard definition CCD (Charge Coupled Device) cameras at 30 Hz. The continuous stereo vision can be time correlated to megapixel high definition snapshots. This proof of concept has been fabricated as a four-layer PCB (Printed Circuit Board) suitable for use in education and research for low cost high efficiency field monitoring applications that need multispectral and three dimensional imaging capabilities. Initial testing is in progress and includes field testing in ports, potential test flights in unmanned aerial systems, and future planned missions to image harsh environments in the arctic including volcanic plumes, ice formation, and arctic marine life.
NASA Technical Reports Server (NTRS)
Muellerschoen, R. J.
1988-01-01
A unified method to permute vector stored Upper triangular Diagonal factorized covariance and vector stored upper triangular Square Root Information arrays is presented. The method involves cyclic permutation of the rows and columns of the arrays and retriangularization with fast (slow) Givens rotations (reflections). Minimal computation is performed, and a one dimensional scratch array is required. To make the method efficient for large arrays on a virtual memory machine, computations are arranged so as to avoid expensive paging faults. This method is potentially important for processing large volumes of radio metric data in the Deep Space Network.
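The permute-then-retriangularize idea can be sketched with ordinary Givens rotations on a dense two-dimensional array. This is only an illustration under simplifying assumptions: the report works on vector-stored U-D and square root information arrays and uses fast Givens rotations arranged to avoid paging faults, none of which is reproduced here, and the function name is hypothetical.

```python
import numpy as np

def givens_retriangularize(M):
    """Return an upper triangular R with R.T @ R == M.T @ M.

    Column by column, entries below the diagonal are zeroed from the
    bottom up by rotating adjacent row pairs.
    """
    R = np.array(M, dtype=float)
    n_rows, n_cols = R.shape
    for j in range(n_cols):
        for i in range(n_rows - 1, j, -1):
            a, b = R[i - 1, j], R[i, j]
            if b == 0.0:
                continue  # already zero, no rotation needed
            r = np.hypot(a, b)
            c, s = a / r, b / r
            G = np.array([[c, s], [-s, c]])  # maps (a, b) to (r, 0)
            R[i - 1:i + 1, :] = G @ R[i - 1:i + 1, :]
    return R

# Cyclically permute the columns of a triangular factor, then restore the form.
R0 = np.triu(np.arange(1.0, 17.0).reshape(4, 4))
permuted = R0[:, [1, 2, 3, 0]]
R1 = givens_retriangularize(permuted)
```

Because each rotation is orthogonal, the information matrix `R.T @ R` of the permuted array is preserved while the triangular structure is recovered.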
Computationally efficient real-time interpolation algorithm for non-uniform sampled biosignals
Guven, Onur; Eftekhar, Amir; Kindt, Wilko; Constandinou, Timothy G.
2016-01-01
This Letter presents a novel, computationally efficient interpolation method that has been optimised for use in electrocardiogram baseline drift removal. In the authors’ previous Letter three isoelectric baseline points per heartbeat are detected, and here utilised as interpolation points. As an extension from linear interpolation, their algorithm segments the interpolation interval and utilises different piecewise linear equations. Thus, the algorithm produces a linear curvature that is computationally efficient while interpolating non-uniform samples. The proposed algorithm is tested using sinusoids with different fundamental frequencies from 0.05 to 0.7 Hz and also validated with real baseline wander data acquired from the Massachusetts Institute of Technology University and Boston's Beth Israel Hospital (MIT-BIH) Noise Stress Database. The synthetic data results show a root mean square (RMS) error of 0.9 μV (mean), 0.63 μV (median) and 0.6 μV (standard deviation) per heartbeat on a 1 mVp–p 0.1 Hz sinusoid. On real data, they obtain an RMS error of 10.9 μV (mean), 8.5 μV (median) and 9.0 μV (standard deviation) per heartbeat. Cubic spline interpolation and linear interpolation, on the other hand, show 10.7 μV, 11.6 μV (mean), 7.8 μV, 8.9 μV (median) and 9.8 μV, 9.3 μV (standard deviation) per heartbeat. PMID:27382478
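For orientation, the plain linear interpolation that the Letter takes as its starting point can be sketched with NumPy. The sketch assumes the isoelectric points have already been detected and are supplied as non-uniformly spaced times `t_iso` with values `v_iso`; these names and the function are hypothetical, and the segmented piecewise scheme proposed by the authors is not implemented.

```python
import numpy as np

def remove_baseline_linear(t, ecg, t_iso, v_iso):
    """Subtract a baseline linearly interpolated between isoelectric points.

    t_iso must be increasing; samples outside its range take the end values.
    """
    baseline = np.interp(t, t_iso, v_iso)
    return ecg - baseline, baseline

# Synthetic check: a linear drift is recovered exactly by linear interpolation.
t = np.linspace(0.0, 10.0, 1001)
drift = 0.5 + 0.1 * t
t_iso = np.array([0.0, 0.9, 2.1, 2.8, 4.0, 5.3, 6.1, 7.4, 8.2, 10.0])
corrected, baseline = remove_baseline_linear(t, drift, t_iso, 0.5 + 0.1 * t_iso)
```

With a curved drift, such as the 0.1 Hz sinusoid used in the Letter, this baseline is only approximate between the interpolation points, which is the error the reported RMS figures quantify.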
Efficient Exact Inference With Loss Augmented Objective in Structured Learning.
Bauer, Alexander; Nakajima, Shinichi; Muller, Klaus-Robert
2016-08-19
Structural support vector machine (SVM) is an elegant approach for building complex and accurate models with structured outputs. However, its applicability relies on the availability of efficient inference algorithms--the state-of-the-art training algorithms repeatedly perform inference to compute a subgradient or to find the most violating configuration. In this paper, we propose an exact inference algorithm for maximizing nondecomposable objectives due to a special type of high-order potential having a decomposable internal structure. As an important application, our method covers the loss augmented inference, which enables the slack and margin scaling formulations of structural SVM with a variety of dissimilarity measures, e.g., Hamming loss, precision and recall, Fβ-loss, intersection over union, and many other functions that can be efficiently computed from the contingency table. We demonstrate the advantages of our approach in natural language parsing and sequence segmentation applications.
Efficient simulation of incompressible viscous flow over multi-element airfoils
NASA Technical Reports Server (NTRS)
Rogers, Stuart E.; Wiltberger, N. Lyn; Kwak, Dochan
1993-01-01
The incompressible, viscous, turbulent flow over single and multi-element airfoils is numerically simulated in an efficient manner by solving the incompressible Navier-Stokes equations. The solution algorithm employs the method of pseudo compressibility and utilizes an upwind differencing scheme for the convective fluxes, and an implicit line-relaxation scheme. The motivation for this work includes interest in studying high-lift take-off and landing configurations of various aircraft. In particular, accurate computation of lift and drag at various angles of attack up to stall is desired. Two different turbulence models are tested in computing the flow over an NACA 4412 airfoil; an accurate prediction of stall is obtained. The approach used for multi-element airfoils involves the use of multiple zones of structured grids fitted to each element. Two different approaches are compared; a patched system of grids, and an overlaid Chimera system of grids. Computational results are presented for two-element, three-element, and four-element airfoil configurations. Excellent agreement with experimental surface pressure coefficients is seen. The code converges in less than 200 iterations, requiring on the order of one minute of CPU time on a CRAY YMP per element in the airfoil configuration.
ERIC Educational Resources Information Center
Lee, Young-Jin
2012-01-01
This paper presents a computational method that can efficiently estimate the ability of students from the log files of a Web-based learning environment capturing their problem solving processes. The computational method developed in this study approximates the posterior distribution of the student's ability obtained from the conventional Bayes…
Power Efficient Hardware Architecture of SHA-1 Algorithm for Trusted Mobile Computing
NASA Astrophysics Data System (ADS)
Kim, Mooseop; Ryou, Jaecheol
The Trusted Mobile Platform (TMP) is developed and promoted by the Trusted Computing Group (TCG), an industry standards body that aims to enhance the security of the mobile computing environment. The built-in SHA-1 engine in TMP is one of the most important circuit blocks and contributes to the performance of the whole platform because it is used as a key primitive supporting platform integrity and command authentication. Mobile platforms have very stringent limitations with respect to available power, physical circuit area, and cost. Therefore special architecture and design methods for low power SHA-1 circuit are required. In this paper, we present a novel and efficient hardware architecture of low power SHA-1 design for TMP. Our low power SHA-1 hardware can compute a 512-bit data block using less than 7,000 gates and has a power consumption of about 1.1 mA on a 0.25 μm CMOS process.
Accurate quantum chemical calculations
NASA Technical Reports Server (NTRS)
Bauschlicher, Charles W., Jr.; Langhoff, Stephen R.; Taylor, Peter R.
1989-01-01
An important goal of quantum chemical calculations is to provide an understanding of chemical bonding and molecular electronic structure. A second goal, the prediction of energy differences to chemical accuracy, has been much harder to attain. First, the computational resources required to achieve such accuracy are very large, and second, it is not straightforward to demonstrate that an apparently accurate result, in terms of agreement with experiment, does not result from a cancellation of errors. Recent advances in electronic structure methodology, coupled with the power of vector supercomputers, have made it possible to solve a number of electronic structure problems exactly using the full configuration interaction (FCI) method within a subspace of the complete Hilbert space. These exact results can be used to benchmark approximate techniques that are applicable to a wider range of chemical and physical problems. The methodology of many-electron quantum chemistry is reviewed. Methods are considered in detail for performing FCI calculations. The application of FCI methods to several three-electron problems in molecular physics is discussed. A number of benchmark applications of FCI wave functions are described. Atomic basis sets and the development of improved methods for handling very large basis sets are discussed: these are then applied to a number of chemical and spectroscopic problems; to transition metals; and to problems involving potential energy surfaces. Although the experiences described give considerable grounds for optimism about the general ability to perform accurate calculations, there are several problems that have proved less tractable, at least with current computer resources, and these and possible solutions are discussed.
Crowd Sensing-Enabling Security Service Recommendation for Social Fog Computing Systems
Wu, Jun; Su, Zhou; Li, Jianhua
2017-01-01
Fog computing, shifting intelligence and resources from the remote cloud to edge networks, has the potential of providing low latency for the communication from sensing data sources to users. For the objects from the Internet of Things (IoT) to the cloud, it is a new trend that the objects establish social-like relationships with each other, which efficiently brings the benefits of developed sociality to a complex environment. As fog services become more sophisticated, it will become more convenient for fog users to share their own services, resources, and data via social networks. Meanwhile, the efficient social organization can enable more flexible, secure, and collaborative networking. The aforementioned advantages make the social network a potential architecture for fog computing systems. In this paper, we design an architecture for social fog computing, in which the services of fog are provisioned based on “friend” relationships. To the best of our knowledge, this is the first attempt at an organized fog computing system-based social model. Meanwhile, social networking enhances the complexity and security risks of fog computing services, creating difficulties of security service recommendations in social fog computing. To address this, we propose a novel crowd sensing-enabling security service provisioning method to recommend security services accurately in social fog computing systems. Simulation results show the feasibility and efficiency of the crowd sensing-enabling security service recommendation method for social fog computing systems. PMID:28758943
Efficient Smoothed Concomitant Lasso Estimation for High Dimensional Regression
NASA Astrophysics Data System (ADS)
Ndiaye, Eugene; Fercoq, Olivier; Gramfort, Alexandre; Leclère, Vincent; Salmon, Joseph
2017-10-01
In high dimensional settings, sparse structures are crucial for efficiency in terms of memory, computation and performance. It is customary to consider an ℓ1 penalty to enforce sparsity in such scenarios. Sparsity enforcing methods, the Lasso being a canonical example, are popular candidates to address high dimension. For efficiency, they rely on tuning a parameter trading data fitting versus sparsity. For the Lasso theory to hold this tuning parameter should be proportional to the noise level, yet the latter is often unknown in practice. A possible remedy is to jointly optimize over the regression parameter as well as over the noise level. This has been considered under several names in the literature: Scaled-Lasso, Square-root Lasso, Concomitant Lasso estimation for instance, and could be of interest for uncertainty quantification. In this work, after illustrating numerical difficulties for the Concomitant Lasso formulation, we propose a modification we coined Smoothed Concomitant Lasso, aimed at increasing numerical stability. We propose an efficient and accurate solver leading to a computational cost no more expensive than the one for the Lasso. We leverage the standard ingredients behind the success of fast Lasso solvers: a coordinate descent algorithm, combined with safe screening rules to achieve speed efficiency, by eliminating irrelevant features early.
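The coordinate descent ingredient mentioned above can be sketched for the ordinary Lasso, assuming the objective (1/(2n))||y - Xb||^2 + lam*||b||_1. This is not the Smoothed Concomitant Lasso solver of the paper: the noise level is not optimized jointly and no safe screening rules are applied. The function names are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*|.|: shrink z toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/(2n))||y - X b||^2 + lam*||b||_1."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n        # per-feature curvature
    resid = np.asarray(y, dtype=float).copy()  # residual y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero feature never enters the model
            # Correlation of feature j with the residual that excludes feature j.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)  # keep the residual in sync
            beta[j] = new
    return beta
```

The dependence of a good `lam` on the unknown noise level is exactly the tuning difficulty that the concomitant formulations in the abstract aim to remove.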
An efficient numerical model for multicomponent compressible flow in fractured porous media
NASA Astrophysics Data System (ADS)
Zidane, Ali; Firoozabadi, Abbas
2014-12-01
An efficient and accurate numerical model for multicomponent compressible single-phase flow in fractured media is presented. The discrete-fracture approach is used to model the fractures where the fracture entities are described explicitly in the computational domain. We use the concept of cross flow equilibrium in the fractures. This will allow large matrix elements in the neighborhood of the fractures and considerable speed up of the algorithm. We use an implicit finite volume (FV) scheme to solve the species mass balance equation in the fractures. This step avoids the use of the Courant-Friedrichs-Lewy (CFL) condition and contributes to significant speed up of the code. The hybrid mixed finite element method (MFE) is used to solve for the velocity in both the matrix and the fractures coupled with the discontinuous Galerkin (DG) method to solve the species transport equations in the matrix. Four numerical examples are presented to demonstrate the robustness and efficiency of the proposed model. We show that the combination of the fracture cross-flow equilibrium and the implicit composition calculation in the fractures increases the computational speed 20-130 times in 2D. In 3D, one may expect even a higher computational efficiency.
Mostafavi, Kamal; Tutunea-Fatan, O Remus; Bordatchev, Evgueni V; Johnson, James A
2014-12-01
The strong advent of computer-assisted technologies in modern orthopedic surgery calls for the expansion of computationally efficient techniques built on the broad base of computer-aided engineering tools that are readily available. However, one of the common challenges faced during the current developmental phase remains the lack of reliable frameworks to allow a fast and precise conversion of the anatomical information acquired through computed tomography to a format that is acceptable to computer-aided engineering software. To address this, this study proposes an integrated and automatic framework capable of extracting and then postprocessing the original imaging data into a common planar and closed B-Spline representation. The core of the developed platform relies on the approximation of the discrete computed tomography data by means of an original two-step B-Spline fitting technique based on successive deformations of the control polygon. In addition to its rapidity and robustness, the developed fitting technique was validated to produce accurate representations that do not deviate by more than 0.2 mm with respect to alternate representations of the bone geometry that were obtained through different contact-based data acquisition or data processing methods. © IMechE 2014.
NASA Astrophysics Data System (ADS)
Dimitrakopoulos, Panagiotis
2018-03-01
The calculation of polytropic efficiencies is a very important task, especially during the development of new compression units, like compressor impellers, stages and stage groups. Such calculations are also crucial for the determination of the performance of a whole compressor. As processors and computational capacities have improved substantially in recent years, the need has emerged for a new, rigorous, robust, accurate and at the same time standardized method for computing polytropic efficiencies, especially based on the thermodynamics of real gases. The proposed method is based on the rigorous definition of the polytropic efficiency. The input consists of pressure and temperature values at the end points of the compression path (suction and discharge), for a given working fluid. The average relative error for the studied cases was 0.536 %. Thus, this high-accuracy method is proposed for efficiency calculations related to turbocompressors and their compression units, especially when they are operating at high power levels, for example in jet engines and high-power plants.
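To show how the same inputs (suction and discharge pressure and temperature) map to an efficiency in the simplest case, the sketch below uses the textbook perfect-gas relation eta_p = ((gamma - 1)/gamma) * ln(p2/p1) / ln(T2/T1). This is an assumption made for illustration only: the abstract concerns real-gas thermodynamics, where this closed form does not hold, and the function name and default `gamma` are hypothetical.

```python
import math

def polytropic_efficiency_ideal_gas(p1, T1, p2, T2, gamma=1.4):
    """Compression polytropic efficiency for a perfect gas with constant gamma.

    p1, p2: suction and discharge pressure (same units).
    T1, T2: suction and discharge absolute temperature (same units).
    """
    return (gamma - 1.0) / gamma * math.log(p2 / p1) / math.log(T2 / T1)

# Sanity check: an isentropic compression has an efficiency of exactly one.
p1, T1, p2 = 1.0e5, 300.0, 4.0e5
T2_isentropic = T1 * (p2 / p1) ** ((1.4 - 1.0) / 1.4)
eta_ideal = polytropic_efficiency_ideal_gas(p1, T1, p2, T2_isentropic)
```

A discharge temperature above the isentropic value gives an efficiency below one, as expected for a real compression process.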
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schaefer, Bastian; Goedecker, Stefan, E-mail: stefan.goedecker@unibas.ch
2016-07-21
An analysis of the network defined by the potential energy minima of multi-atomic systems and their connectivity via reaction pathways that go through transition states allows us to understand important characteristics like thermodynamic, dynamic, and structural properties. Unfortunately, computing the transition states and reaction pathways in addition to the significant energetically low-lying local minima is a computationally demanding task. We here introduce a computationally efficient method that is based on a combination of the minima hopping global optimization method and the insight that uphill barriers tend to increase with increasing structural distances of the educt and product states. This method allows us to replace the exact connectivity information and transition state energies with alternative and approximate concepts. Without adding any significant additional cost to the minima hopping global optimization approach, this method allows us to generate an approximate network of the minima, their connectivity, and a rough measure for the energy needed for their interconversion. This can be used to obtain a first qualitative idea on important physical and chemical properties by means of a disconnectivity graph analysis. Besides the physical insight obtained by such an analysis, the gained knowledge can be used to decide whether it is worthwhile to invest computational resources for an exact computation of the transition states and the reaction pathways. Furthermore, it is demonstrated that the method presented here can be used for finding physically reasonable interconversion pathways that are promising input pathways for methods like transition path sampling or discrete path sampling.
NASA Technical Reports Server (NTRS)
Van Dalsem, W. R.; Steger, J. L.
1985-01-01
A simple and computationally efficient algorithm for solving the unsteady three-dimensional boundary-layer equations in the time-accurate or relaxation mode is presented. Results of the new algorithm are shown to be in quantitative agreement with detailed experimental data for flow over a swept infinite wing. The separated flow over a 6:1 ellipsoid at angle of attack, and the transonic flow over a finite-wing with shock-induced 'mushroom' separation are also computed and compared with available experimental data. It is concluded that complex, separated, three-dimensional viscous layers can be economically and routinely computed using a time-relaxation boundary-layer algorithm.
NASA Technical Reports Server (NTRS)
Hopkins, Dale A.
1992-01-01
The presentation gives a partial overview of research and development underway in the Structures Division of LeRC, which collectively is referred to as the Computational Structures Technology Program. The activities in the program are diverse and encompass four major categories: (1) composite materials and structures; (2) probabilistic analysis and reliability; (3) design optimization and expert systems; and (4) computational methods and simulation. The approach of the program is comprehensive and entails exploration of fundamental theories of structural mechanics to accurately represent the complex physics governing engine structural performance, formulation, and implementation of computational techniques and integrated simulation strategies to provide accurate and efficient solutions of the governing theoretical models by exploiting the emerging advances in computer technology, and validation and verification through numerical and experimental tests to establish confidence and define the qualities and limitations of the resulting theoretical models and computational solutions. The program comprises both in-house and sponsored research activities. The remainder of the presentation provides a sample of activities to illustrate the breadth and depth of the program and to demonstrate the accomplishments and benefits that have resulted.
Increasing the computational efficient of digital cross correlation by a vectorization method
NASA Astrophysics Data System (ADS)
Chang, Ching-Yuan; Ma, Chien-Ching
2017-08-01
This study presents a vectorization method for use in MATLAB programming aimed at increasing the computational efficiency of digital cross correlation in sound and images, resulting in a speedup of 6.387 and 36.044 times compared with performance values obtained from looped expression. This work bridges the gap between matrix operations and loop iteration, preserving flexibility and efficiency in program testing. This paper uses numerical simulation to verify the speedup of the proposed vectorization method as well as experiments to measure the quantitative transient displacement response subjected to dynamic impact loading. The experiment involved the use of a high speed camera as well as a fiber optic system to measure the transient displacement in a cantilever beam under impact from a steel ball. Experimental measurement data obtained from the two methods are in excellent agreement in both the time and frequency domain, with discrepancies of only 0.68%. Numerical and experiment results demonstrate the efficacy of the proposed vectorization method with regard to computational speed in signal processing and high precision in the correlation algorithm. We also present the source code with which to build MATLAB-executable functions on Windows as well as Linux platforms, and provide a series of examples to demonstrate the application of the proposed vectorization method.
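The contrast between looped and vectorized cross correlation described in this abstract can be illustrated with a minimal sketch. It is written in Python with NumPy rather than MATLAB, and it is not the authors' code: the function names `xcorr_loop` and `xcorr_vectorized` are hypothetical, and the vectorized path simply delegates to `numpy.correlate`.

```python
import numpy as np

def xcorr_loop(x, y):
    """Full cross correlation with explicit loops (reference version)."""
    n, m = len(x), len(y)
    out = np.zeros(n + m - 1)
    # Lags run from -(m - 1) to n - 1, matching numpy.correlate(mode="full").
    for lag in range(-(m - 1), n):
        acc = 0.0
        for j in range(m):
            i = j + lag
            if 0 <= i < n:
                acc += x[i] * y[j]
        out[lag + m - 1] = acc
    return out

def xcorr_vectorized(x, y):
    """Same quantity, computed by a single array-level library call."""
    return np.correlate(x, y, mode="full")

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, 0.5])
print(xcorr_loop(x, y))
print(xcorr_vectorized(x, y))
```

Both functions return the same array; any speedup depends on signal length and hardware, so no timing figures are claimed here.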
Evaluation of Preduster in Cement Industry Based on Computational Fluid Dynamic
NASA Astrophysics Data System (ADS)
Septiani, E. L.; Widiyastuti, W.; Djafaar, A.; Ghozali, I.; Pribadi, H. M.
2017-10-01
Ash-laden hot air from clinker in the cement industry is used to reduce the water content of coal; however, it may contain a large amount of ash even after treatment by a preduster. This study investigated preduster performance as a cyclone separator in the cement industry by the Computational Fluid Dynamics (CFD) method. In general, a cyclone performs best when it combines relatively high efficiency with a low pressure drop. The most accurate and simple turbulence model, Reynolds-Averaged Navier-Stokes (RANS) with standard k-ε, combined with a Lagrangian model for particle tracking, was used to solve the problem. The quantities measured in the simulation results are the flow pattern in the cyclone, the outlet pressure, and the collection efficiency of the preduster. The applied model predicted these well when compared with the most accurate empirical model and with the outlet pressure from experimental measurement.
Extracting Time-Accurate Acceleration Vectors From Nontrivial Accelerometer Arrangements.
Franck, Jennifer A; Blume, Janet; Crisco, Joseph J; Franck, Christian
2015-09-01
Sports-related concussions are of significant concern in many impact sports, and their detection relies on accurate measurements of the head kinematics during impact. Among the most prevalent recording technologies are videography, and more recently, the use of single-axis accelerometers mounted in a helmet, such as the HIT system. Successful extraction of the linear and angular impact accelerations depends on an accurate analysis methodology governed by the equations of motion. Current algorithms are able to estimate the magnitude of acceleration and hit location, but make assumptions about the hit orientation and are often limited in the position and/or orientation of the accelerometers. The newly formulated algorithm presented in this manuscript accurately extracts the full linear and rotational acceleration vectors from a broad arrangement of six single-axis accelerometers directly from the governing set of kinematic equations. The new formulation linearizes the nonlinear centripetal acceleration term with a finite-difference approximation and provides a fast and accurate solution for all six components of acceleration over long time periods (>250 ms). The approximation of the nonlinear centripetal acceleration term provides an accurate computation of the rotational velocity as a function of time and allows for reconstruction of a multiple-impact signal. Furthermore, the algorithm determines the impact location and orientation and can distinguish between glancing, high rotational velocity impacts, or direct impacts through the center of mass. Results are shown for ten simulated impact locations on a headform geometry computed with three different accelerometer configurations in varying degrees of signal noise. Since the algorithm does not require simplifications of the actual impacted geometry, the impact vector, or a specific arrangement of accelerometer orientations, it can be easily applied to many impact investigations in which accurate kinematics need
A time accurate prediction of the viscous flow in a turbine stage including a rotor in motion
NASA Astrophysics Data System (ADS)
Shavalikul, Akamol
in the relative frame of reference; the boundary conditions for the computations were obtained from inlet flow measurements performed in the AFTRF. A complete turbine stage, including an NGV and a rotor row was simulated using the RANS solver with the SST k-ω turbulence model, with two different computational models for the interface between the rotating component and the stationary component. The first interface model, the circumferentially averaged mixing plane model, was solved for a fixed position of the rotor blades relative to the NGV in the stationary frame of reference. The information transferred between the NGV and rotor domains is obtained by averaging across the entire interface. The quasi-steady state flow characteristics of the AFTRF can be obtained from this interface model. After the model was validated with the existing experimental data, this model was not only used to investigate the flow characteristics in the turbine stage but also the effects of using pressure side rotor tip extensions. The tip leakage flow fields simulated from this model and from the linear cascade model show similar trends. More detailed understanding of unsteady characteristics of a turbine flow field can be obtained using the second type of interface model, the time accurate sliding mesh model. The potential flow interactions, wake characteristics, their effects on secondary flow formation, and the wake mixing process in a rotor passage were examined using this model. Furthermore, turbine stage efficiency and effects of tip clearance height on the turbine stage efficiency were also investigated. A comparison between the results from the circumferential average model and the time accurate flow model results is presented. It was found that the circumferential average model cannot accurately simulate flow interaction characteristics on the interface plane between the NGV trailing edge and the rotor leading edge. However, the circumferential average model does give
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Dejun, E-mail: dejun.lin@gmail.com
2015-09-21
Accurate representation of intermolecular forces has been the central task of classical atomic simulations, known as molecular mechanics. Recent advancements in molecular mechanics models have put forward the explicit representation of permanent and/or induced electric multipole (EMP) moments. The formulas developed so far to calculate EMP interactions tend to have complicated expressions, especially in Cartesian coordinates, which can only be applied to a specific kernel potential function. For example, one needs to develop a new formula each time a new kernel function is encountered. The complication of these formalisms arises from an intriguing and yet obscured mathematical relation between the kernel functions and the gradient operators. Here, I uncover this relation via rigorous derivation and find that the formula to calculate EMP interactions is basically invariant to the potential kernel functions as long as they are of the form f(r), i.e., any Green’s function that depends on inter-particle distance. I provide an algorithm for efficient evaluation of EMP interaction energies, forces, and torques for any kernel f(r) up to any arbitrary rank of EMP moments in Cartesian coordinates. The working equations of this algorithm are essentially the same for any kernel f(r). Recently, a few recursive algorithms were proposed to calculate EMP interactions. Depending on the kernel functions, the algorithm here is about 4–16 times faster than these algorithms in terms of the required number of floating point operations and is much more memory efficient. I show that it is even faster than a theoretically ideal recursion scheme, i.e., one that requires 1 floating point multiplication and 1 addition per recursion step. This algorithm has a compact vector-based expression that is optimal for computer programming. The Cartesian nature of this algorithm makes it fit easily into modern molecular simulation packages as compared with spherical coordinate
TU-AB-BRA-02: An Efficient Atlas-Based Synthetic CT Generation Method
DOE Office of Scientific and Technical Information (OSTI.GOV)
Han, X
2016-06-15
Purpose: A major obstacle for MR-only radiotherapy is the need to generate an accurate synthetic CT (sCT) from MR image(s) of a patient for the purposes of dose calculation and DRR generation. We propose here an accurate and efficient atlas-based sCT generation method, which has a computation speed largely independent of the number of atlases used. Methods: Atlas-based sCT generation requires a set of atlases with co-registered CT and MR images. Unlike existing methods that align each atlas to the new patient independently, we first create an average atlas and pre-align every atlas to the average atlas space. When a new patient arrives, we compute only one deformable image registration to align the patient MR image to the average atlas, which indirectly aligns the patient to all pre-aligned atlases. A patch-based non-local weighted fusion is performed in the average atlas space to generate the sCT for the patient, which is then warped back to the original patient space. We further adapt a PatchMatch algorithm that can quickly find top matches between patches of the patient image and all atlas images, which makes the patch fusion step also independent of the number of atlases used. Results: Nineteen brain tumour patients with both CT and T1-weighted MR images are used as testing data and a leave-one-out validation is performed. Each sCT generated is compared against the original CT image of the same patient on a voxel-by-voxel basis. The proposed method produces a mean absolute error (MAE) of 98.6±26.9 HU overall. The accuracy is comparable with a conventional implementation scheme, but the computation time is reduced from over an hour to four minutes. Conclusion: An average atlas space patch fusion approach can produce highly accurate sCT estimations very efficiently. Further validation on dose computation accuracy and using a larger patient cohort is warranted. The author is a full time employee of Elekta, Inc.
Efficient and Robust Optimization for Building Energy Simulation.
Pourarian, Shokouh; Kearsley, Anthony; Wen, Jin; Pertzborn, Amanda
2016-06-15
Efficiently, robustly and accurately solving large sets of structured, non-linear algebraic and differential equations is one of the most computationally expensive steps in the dynamic simulation of building energy systems. Here, the efficiency, robustness and accuracy of two commonly employed solution methods are compared. The comparison is conducted using the HVACSIM+ software package, a component based building system simulation tool. The HVACSIM+ software presently employs Powell's Hybrid method to solve systems of nonlinear algebraic equations that model the dynamics of energy states and interactions within buildings. It is shown here that the Powell's method does not always converge to a solution. Since a myriad of other numerical methods are available, the question arises as to which method is most appropriate for building energy simulation. This paper finds considerable computational benefits result from replacing the Powell's Hybrid method solver in HVACSIM+ with a solver more appropriate for the challenges particular to numerical simulations of buildings. Evidence is provided that a variant of the Levenberg-Marquardt solver has superior accuracy and robustness compared to the Powell's Hybrid method presently used in HVACSIM+.
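To make the idea of a damped nonlinear solver concrete, here is a minimal Levenberg-Marquardt sketch in plain Python for a two-equation system. It is a generic textbook form, not the variant used in the paper and not HVACSIM+ code; the test system, the starting point, the damping schedule, and the names `residual`, `jacobian`, and `levenberg_marquardt` are all assumptions chosen for illustration.

```python
import math

def residual(v):
    """Hypothetical test system: x^2 + y^2 = 4 and x*y = 1."""
    x, y = v
    return [x * x + y * y - 4.0, x * y - 1.0]

def jacobian(v):
    x, y = v
    return [[2.0 * x, 2.0 * y], [y, x]]

def levenberg_marquardt(v, mu=1e-3, tol=1e-12, max_iter=100):
    """Solve residual(v) = 0 by damped Gauss-Newton steps."""
    v = list(v)
    for _ in range(max_iter):
        F = residual(v)
        cost = F[0] ** 2 + F[1] ** 2
        if math.sqrt(cost) < tol:
            break
        J = jacobian(v)
        # Damped normal equations: (J^T J + mu I) s = -J^T F.
        a = J[0][0] ** 2 + J[1][0] ** 2 + mu
        b = J[0][0] * J[0][1] + J[1][0] * J[1][1]
        d = J[0][1] ** 2 + J[1][1] ** 2 + mu
        g0 = J[0][0] * F[0] + J[1][0] * F[1]
        g1 = J[0][1] * F[0] + J[1][1] * F[1]
        det = a * d - b * b          # positive because mu > 0
        s0 = (-g0 * d + b * g1) / det
        s1 = (-a * g1 + b * g0) / det
        trial = [v[0] + s0, v[1] + s1]
        Ft = residual(trial)
        if Ft[0] ** 2 + Ft[1] ** 2 < cost:
            v, mu = trial, mu / 10.0   # accept step, trust the model more
        else:
            mu *= 10.0                 # reject step, increase damping
    return v

root = levenberg_marquardt([2.0, 1.0])
print(root, residual(root))
```

The damping term `mu` keeps the linear system solvable even where the Jacobian is singular, which is one reason such methods are often described as robust.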
Computer controlled fluorometer device and method of operating same
Kolber, Z.; Falkowski, P.
1990-07-17
A computer controlled fluorometer device and method of operating same, said device being made to include a pump flash source and a probe flash source and one or more sample chambers in combination with a light condenser lens system and associated filters and reflectors and collimators, as well as signal conditioning and monitoring means and a programmable computer means and a software programmable source of background irradiance that is operable according to the method of the invention to rapidly, efficiently and accurately measure photosynthetic activity by precisely monitoring and recording changes in fluorescence yield produced by a controlled series of predetermined cycles of probe and pump flashes from the respective probe and pump sources that are controlled by the computer means. 13 figs.
Computer controlled fluorometer device and method of operating same
Kolber, Zbigniew; Falkowski, Paul
1990-01-01
A computer controlled fluorometer device and method of operating same, said device being made to include a pump flash source and a probe flash source and one or more sample chambers in combination with a light condenser lens system and associated filters and reflectors and collimators, as well as signal conditioning and monitoring means and a programmable computer means and a software programmable source of background irradiance that is operable according to the method of the invention to rapidly, efficiently and accurately measure photosynthetic activity by precisely monitoring and recording changes in fluorescence yield produced by a controlled series of predetermined cycles of probe and pump flashes from the respective probe and pump sources that are controlled by the computer means.
Kaya, Mine; Hajimirza, Shima
2018-05-25
This paper uses surrogate modeling for very fast design of thin film solar cells with improved solar-to-electricity conversion efficiency. We demonstrate that the wavelength-specific optical absorptivity of a thin film multi-layered amorphous-silicon-based solar cell can be modeled accurately with Neural Networks and can be efficiently approximated as a function of cell geometry and wavelength. Consequently, the external quantum efficiency can be computed by averaging surrogate absorption and carrier recombination contributions over the entire irradiance spectrum in an efficient way. Using this framework, we optimize a multi-layer structure consisting of ITO front coating, metallic back-reflector and oxide layers for achieving maximum efficiency. Our required computation time for an entire model fitting and optimization is 5 to 20 times less than the best previous optimization results based on direct Finite Difference Time Domain (FDTD) simulations, therefore proving the value of surrogate modeling. The resulting optimization solution suggests at least 50% improvement in the external quantum efficiency compared to bare silicon, and 25% improvement compared to a random design.
2013-01-01
Background Sentinel node biopsy often results in the identification and removal of multiple nodes as sentinel nodes, although most of these nodes could be non-sentinel nodes. This study investigated whether computed tomography-lymphography (CT-LG) can distinguish sentinel nodes from non-sentinel nodes and whether sentinel nodes identified by CT-LG can accurately stage the axilla in patients with breast cancer. Methods This study included 184 patients with breast cancer and clinically negative nodes. Contrast agent was injected interstitially. The location of sentinel nodes was marked on the skin surface using a CT laser light navigator system. Lymph nodes located just under the marks were first removed as sentinel nodes. Then, all dyed nodes or all hot nodes were removed. Results The mean number of sentinel nodes identified by CT-LG was significantly lower than that of dyed and/or hot nodes removed (1.1 vs 1.8, p <0.0001). Twenty-three (12.5%) patients had ≥2 sentinel nodes identified by CT-LG removed, whereas 94 (51.1%) of patients had ≥2 dyed and/or hot nodes removed (p <0.0001). Pathological evaluation demonstrated that 47 (25.5%) of 184 patients had metastasis to at least one node. All 47 patients demonstrated metastases to at least one of the sentinel nodes identified by CT-LG. Conclusions CT-LG can distinguish sentinel nodes from non-sentinel nodes, and sentinel nodes identified by CT-LG can accurately stage the axilla in patients with breast cancer. Successful identification of sentinel nodes using CT-LG may facilitate image-based diagnosis of metastasis, possibly leading to the omission of sentinel node biopsy. PMID:24321242
Computational simulation and aerodynamic sensitivity analysis of film-cooled turbines
NASA Astrophysics Data System (ADS)
Massa, Luca
A computational tool is developed for the time accurate sensitivity analysis of the stage performance of hot gas, unsteady turbine components. An existing turbomachinery internal flow solver is adapted to the high temperature environment typical of the hot section of jet engines. A real gas model and film cooling capabilities are successfully incorporated in the software. The modifications to the existing algorithm are described; both the theoretical model and the numerical implementation are validated. The accuracy of the code in evaluating turbine stage performance is tested using a turbine geometry typical of the last stage of aeronautical jet engines. The results of the performance analysis show that the predictions differ from the experimental data by less than 3%. A reliable grid generator, applicable to the domain discretization of the internal flow field of axial flow turbine is developed. A sensitivity analysis capability is added to the flow solver, by rendering it able to accurately evaluate the derivatives of the time varying output functions. The complex Taylor's series expansion (CTSE) technique is reviewed. Two of them are used to demonstrate the accuracy and time dependency of the differentiation process. The results are compared with finite differences (FD) approximations. The CTSE is more accurate than the FD, but less efficient. A "black box" differentiation of the source code, resulting from the automated application of the CTSE, generates high fidelity sensitivity algorithms, but with low computational efficiency and high memory requirements. New formulations of the CTSE are proposed and applied. Selective differentiation of the method for solving the non-linear implicit residual equation leads to sensitivity algorithms with the same accuracy but improved run time. The time dependent sensitivity derivatives are computed in run times comparable to the ones required by the FD approach.
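The comparison between the complex Taylor series expansion and finite differences can be sketched on a scalar function. Truncating the expansion of f(x + ih) at first order gives f'(x) ≈ Im[f(x + ih)]/h, which involves no subtraction and so tolerates a very small step. The test function, step sizes, and function names below are assumptions for illustration only and are unrelated to the flow solver in the abstract.

```python
import cmath
import math

def f(z):
    """Hypothetical smooth test function, valid for real or complex z."""
    return cmath.exp(z) * cmath.sin(z)

def exact_derivative(x):
    return math.exp(x) * (math.sin(x) + math.cos(x))

def complex_step(func, x, h=1e-20):
    """First-order complex Taylor series (complex-step) derivative."""
    return func(complex(x, h)).imag / h

def forward_difference(func, x, h=1e-8):
    """One-sided finite difference, subject to cancellation error."""
    return (func(x + h) - func(x)).real / h

x0 = 1.0
print("complex step error:", abs(complex_step(f, x0) - exact_derivative(x0)))
print("finite diff. error:", abs(forward_difference(f, x0) - exact_derivative(x0)))
```

The extra cost noted in the abstract arises because every operation in the differentiated code must be carried out in complex arithmetic.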
Hierarchy of Efficiently Computable and Faithful Lower Bounds to Quantum Discord
NASA Astrophysics Data System (ADS)
Piani, Marco
2016-08-01
Quantum discord expresses a fundamental nonclassicality of correlations that is more general than entanglement, but that, in its standard definition, is not easily evaluated. We derive a hierarchy of computationally efficient lower bounds to the standard quantum discord. Every nontrivial element of the hierarchy constitutes by itself a valid discordlike measure, based on a fundamental feature of quantum correlations: their lack of shareability. Our approach emphasizes how the difference between entanglement and discord depends on whether shareability is intended as a static property or as a dynamical process.
NASA Technical Reports Server (NTRS)
Walston, W. H., Jr.
1986-01-01
The comparative computational efficiencies of the finite element (FEM), boundary element (BEM), and hybrid boundary element-finite element (HVFEM) analysis techniques are evaluated for representative bounded domain interior and unbounded domain exterior problems in elastostatics. Computational efficiency is carefully defined in this study as the computer time required to attain a specified level of solution accuracy. The study found the FEM superior to the BEM for the interior problem, while the reverse was true for the exterior problem. The hybrid analysis technique was found to be comparable or superior to both the FEM and BEM for both the interior and exterior problems.
Efficient Computation of Info-Gap Robustness for Finite Element Models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stull, Christopher J.; Hemez, Francois M.; Williams, Brian J.
2012-07-05
A recent research effort at LANL proposed info-gap decision theory as a framework by which to measure the predictive maturity of numerical models. Info-gap theory explores the trade-offs between accuracy, that is, the extent to which predictions reproduce the physical measurements, and robustness, that is, the extent to which predictions are insensitive to modeling assumptions. Both accuracy and robustness are necessary to demonstrate predictive maturity. However, conducting an info-gap analysis can present a formidable challenge, from the standpoint of the required computational resources. This is because a robustness function requires the resolution of multiple optimization problems. This report offers an alternative, adjoint methodology to assess the info-gap robustness of Ax = b-like numerical models solved for a solution x. Two situations that can arise in structural analysis and design are briefly described and contextualized within the info-gap decision theory framework. The treatments of the info-gap problems, using the adjoint methodology are outlined in detail, and the latter problem is solved for four separate finite element models. As compared to statistical sampling, the proposed methodology offers highly accurate approximations of info-gap robustness functions for the finite element models considered in the report, at a small fraction of the computational cost. It is noted that this report considers only linear systems; a natural follow-on study would extend the methodologies described herein to include nonlinear systems.
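As a hedged illustration of why an adjoint approach is cheap for Ax = b-like models, the sketch below computes the sensitivity of a scalar output q = cᵀx to a parameter p entering A, using dq/dp = -λᵀ(∂A/∂p)x with Aᵀλ = c. This is the generic adjoint identity for linear systems, not the report's info-gap formulation; the 2x2 matrix, the vectors, and all function names are hypothetical.

```python
def solve2(A, b):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    x0 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    x1 = (A[0][0] * b[1] - b[0] * A[1][0]) / det
    return [x0, x1]

def build_A(p):
    """Hypothetical stiffness-like matrix; p perturbs one diagonal entry."""
    return [[2.0 + p, 1.0], [1.0, 3.0]]

dA_dp = [[1.0, 0.0], [0.0, 0.0]]   # derivative of build_A with respect to p
b = [1.0, 2.0]
c = [1.0, 1.0]                     # output functional q = c^T x

def output(p):
    x = solve2(build_A(p), b)
    return c[0] * x[0] + c[1] * x[1]

def adjoint_sensitivity(p):
    """dq/dp from one forward solve and one adjoint solve."""
    A = build_A(p)
    x = solve2(A, b)
    At = [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]
    lam = solve2(At, c)            # adjoint system A^T lam = c
    dAx = [dA_dp[0][0] * x[0] + dA_dp[0][1] * x[1],
           dA_dp[1][0] * x[0] + dA_dp[1][1] * x[1]]
    return -(lam[0] * dAx[0] + lam[1] * dAx[1])

print(adjoint_sensitivity(0.0))
```

One adjoint solve yields the sensitivity to every parameter in A at once, whereas sampling or finite differences need at least one extra solve per parameter.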
Time-Accurate Computations of Isolated Circular Synthetic Jets in Crossflow
NASA Technical Reports Server (NTRS)
Rumsey, C. L.; Schaeffler, N. W.; Milanovic, I. M.; Zaman, K. B. M. Q.
2007-01-01
Results from unsteady Reynolds-averaged Navier-Stokes computations are described for two different synthetic jet flows issuing into a turbulent boundary layer crossflow through a circular orifice. In one case the jet effect is mostly contained within the boundary layer, while in the other case the jet effect extends beyond the boundary layer edge. Both cases have momentum flux ratios less than 2. Several numerical parameters are investigated, and some lessons learned regarding the CFD methods for computing these types of flow fields are summarized. Results in both cases are compared to experiment.
An Efficient and Accurate Genetic Algorithm for Backcalculation of Flexible Pavement Layer Moduli
DOT National Transportation Integrated Search
2012-12-01
The importance of a backcalculation method in the analysis of elastic modulus in pavement engineering has been known for decades. Despite many backcalculation programs employing different backcalculation procedures and algorithms, accurate invers...
Hypersonic Shock Wave Computations Using the Generalized Boltzmann Equation
NASA Astrophysics Data System (ADS)
Agarwal, Ramesh; Chen, Rui; Cheremisin, Felix G.
2006-11-01
Hypersonic shock structure in diatomic gases is computed by solving the Generalized Boltzmann Equation (GBE), where the internal and translational degrees of freedom are considered in the framework of quantum and classical mechanics respectively [1]. The computational framework available for the standard Boltzmann equation [2] is extended by including both the rotational and vibrational degrees of freedom in the GBE. There are two main difficulties encountered in computation of high Mach number flows of diatomic gases with internal degrees of freedom: (1) a large velocity domain is needed for accurate numerical description of the distribution function resulting in enormous computational effort in calculation of the collision integral, and (2) about 50 energy levels are needed for accurate representation of the rotational spectrum of the gas. Our methodology addresses these problems, and as a result the efficiency of calculations has increased by several orders of magnitude. The code has been validated by computing the shock structure in Nitrogen for Mach numbers up to 25 including the translational and rotational degrees of freedom. [1] Beylich, A., ``An Interlaced System for Nitrogen Gas,'' Proc. of CECAM Workshop, ENS de Lyon, France, 2000. [2] Cheremisin, F., ``Solution of the Boltzmann Kinetic Equation for High Speed Flows of a Rarefied Gas,'' Proc. of the 24th Int. Symp. on Rarefied Gas Dynamics, Bari, Italy, 2004.
Chen, Nan; Majda, Andrew J
2017-12-05
Solving the Fokker-Planck equation for high-dimensional complex dynamical systems is an important issue. Recently, the authors developed efficient statistically accurate algorithms for solving the Fokker-Planck equations associated with high-dimensional nonlinear turbulent dynamical systems with conditional Gaussian structures, which contain many strong non-Gaussian features such as intermittency and fat-tailed probability density functions (PDFs). The algorithms involve a hybrid strategy with a small number of samples [Formula: see text], where a conditional Gaussian mixture in a high-dimensional subspace via an extremely efficient parametric method is combined with a judicious Gaussian kernel density estimation in the remaining low-dimensional subspace. In this article, two effective strategies are developed and incorporated into these algorithms. The first strategy involves a judicious block decomposition of the conditional covariance matrix such that the evolutions of different blocks have no interactions, which allows an extremely efficient parallel computation due to the small size of each individual block. The second strategy exploits statistical symmetry for a further reduction of [Formula: see text] The resulting algorithms can efficiently solve the Fokker-Planck equation with strongly non-Gaussian PDFs in much higher dimensions even with orders in the millions and thus beat the curse of dimension. The algorithms are applied to a [Formula: see text]-dimensional stochastic coupled FitzHugh-Nagumo model for excitable media. An accurate recovery of both the transient and equilibrium non-Gaussian PDFs requires only [Formula: see text] samples! In addition, the block decomposition facilitates the algorithms to efficiently capture the distinct non-Gaussian features at different locations in a [Formula: see text]-dimensional two-layer inhomogeneous Lorenz 96 model, using only [Formula: see text] samples. Copyright © 2017 the Author(s). Published by PNAS.
Assessment of computational prediction of tail buffeting
NASA Technical Reports Server (NTRS)
Edwards, John W.
1990-01-01
Assessments of the viability of computational methods and the computer resource requirements for the prediction of tail buffeting are made. Issues involved in the use of Euler and Navier-Stokes equations in modeling vortex-dominated and buffet flows are discussed and the requirement for sufficient grid density to allow accurate, converged calculations is stressed. Areas in need of basic fluid dynamics research are highlighted: vorticity convection, vortex breakdown, dynamic turbulence modeling for free shear layers, unsteady flow separation for moderately swept, rounded leading-edge wings, vortex flows about wings at high subsonic speeds. An estimate of the computer run time for a buffeting response calculation for a full span F-15 aircraft indicates that an improvement in computer and/or algorithm efficiency of three orders of magnitude is needed to enable routine use of such methods. Attention is also drawn to significant uncertainties in the estimates, in particular with regard to nonlinearities contained within the modeling and the question of the repeatability or randomness of buffeting response.
Accurate Evaluation Method of Molecular Binding Affinity from Fluctuation Frequency
NASA Astrophysics Data System (ADS)
Hoshino, Tyuji; Iwamoto, Koji; Ode, Hirotaka; Ohdomari, Iwao
2008-05-01
Exact estimation of the molecular binding affinity is significantly important for drug discovery. The energy calculation is a direct method to compute the strength of the interaction between two molecules. This energetic approach is, however, not accurate enough to evaluate a slight difference in binding affinity when distinguishing a prospective substance from dozens of candidates for medicine. Hence more accurate estimation of drug efficacy in a computer is currently demanded. Previously we proposed a concept of estimating molecular binding affinity, focusing on the fluctuation at an interface between two molecules. The aim of this paper is to demonstrate the compatibility between the proposed computational technique and experimental measurements, through several examples for computer simulations of an association of human immunodeficiency virus type-1 (HIV-1) protease and its inhibitor (an example for a drug-enzyme binding), a complexation of an antigen and its antibody (an example for a protein-protein binding), and a combination of estrogen receptor and its ligand chemicals (an example for a ligand-receptor binding). The proposed affinity estimation has proven to be a promising technique in the advanced stage of the discovery and the design of drugs.
High accurate time system of the Low Latitude Meridian Circle.
NASA Astrophysics Data System (ADS)
Yang, Jing; Wang, Feng; Li, Zhiming
In order to obtain a highly accurate time signal for the Low Latitude Meridian Circle (LLMC), a new GPS accurate time system has been developed, which includes GPS, a 1 MC frequency source, and a self-made clock system. The second signal of GPS is used synchronously in the clock system, and information can be collected by a computer automatically. The difficulty of the cancellation of the time keeper can be overcome by using this system.
Computationally efficient method for optical simulation of solar cells and their applications
NASA Astrophysics Data System (ADS)
Semenikhin, I.; Zanuccoli, M.; Fiegna, C.; Vyurkov, V.; Sangiorgi, E.
2013-01-01
This paper presents two novel implementations of the Differential method to solve the Maxwell equations in nanostructured optoelectronic solid state devices. The first proposed implementation is based on an improved and computationally efficient T-matrix formulation that adopts multiple-precision arithmetic to tackle the numerical instability problem which arises due to evanescent modes. The second implementation adopts an iterative approach that achieves a low computational complexity of O(N log N) or better. The proposed algorithms may work with structures with arbitrary spatial variation of the permittivity. The developed two-dimensional numerical simulator is applied to analyze the dependence of the absorption characteristics of a thin silicon slab on the morphology of the front interface and on the angle of incidence of the radiation with respect to the device surface.
On Efficient Multigrid Methods for Materials Processing Flows with Small Particles
NASA Technical Reports Server (NTRS)
Thomas, James (Technical Monitor); Diskin, Boris; Harik, VasylMichael
2004-01-01
Multiscale modeling of materials requires simulations of multiple levels of structural hierarchy. The computational efficiency of numerical methods becomes a critical factor for simulating large physical systems with highly disparate length scales. Multigrid methods are known for their superior efficiency in representing/resolving different levels of physical details. The efficiency is achieved by employing interactively different discretizations on different scales (grids). To assist optimization of manufacturing conditions for materials processing with numerous particles (e.g., dispersion of particles, controlling flow viscosity and clusters), a new multigrid algorithm has been developed for a case of multiscale modeling of flows with small particles that have various length scales. The optimal efficiency of the algorithm is crucial for accurate predictions of the effect of processing conditions (e.g., pressure and velocity gradients) on the local flow fields that control the formation of various microstructures or clusters.
NASA Astrophysics Data System (ADS)
Jones, Adam; Utyuzhnikov, Sergey
2017-08-01
Turbulent flow in a ribbed channel is studied using an efficient near-wall domain decomposition (NDD) method. The NDD approach is formulated by splitting the computational domain into an inner and outer region, with an interface boundary between the two. The computational mesh covers the outer region, and the flow in this region is solved using the open-source CFD code Code_Saturne with special boundary conditions on the interface boundary, called interface boundary conditions (IBCs). The IBCs are of Robin type and incorporate the effect of the inner region on the flow in the outer region. IBCs are formulated in terms of the distance from the interface boundary to the wall in the inner region. It is demonstrated that up to 90% of the region between the ribs in the ribbed passage can be removed from the computational mesh with an error on the friction factor within 2.5%. In addition, computations with NDD are faster than computations based on low Reynolds number (LRN) models by a factor of five. Different rib heights can be studied with the same mesh in the outer region without affecting the accuracy of the friction factor. This is tested with six different rib heights in an example of a design optimisation study. It is found that the friction factors computed with NDD are almost identical to the fully-resolved results. When used for inverse problems, NDD is considerably more efficient than LRN computations because only one computation needs to be performed and only one mesh needs to be generated.
High-order computational fluid dynamics tools for aircraft design
Wang, Z. J.
2014-01-01
Most forecasts predict an annual airline traffic growth rate between 4.5 and 5% in the foreseeable future. To sustain that growth, the environmental impact of aircraft cannot be ignored. Future aircraft must have much better fuel economy, dramatically less greenhouse gas emissions and noise, in addition to better performance. Many technical breakthroughs must take place to achieve the aggressive environmental goals set up by governments in North America and Europe. One of these breakthroughs will be physics-based, highly accurate and efficient computational fluid dynamics and aeroacoustics tools capable of predicting complex flows over the entire flight envelope and through an aircraft engine, and computing aircraft noise. Some of these flows are dominated by unsteady vortices of disparate scales, often highly turbulent, and they call for higher-order methods. As these tools will be integral components of a multi-disciplinary optimization environment, they must be efficient to impact design. Ultimately, the accuracy, efficiency, robustness, scalability and geometric flexibility will determine which methods will be adopted in the design process. This article explores these aspects and identifies pacing items. PMID:25024419
Fast Ss-Ilm a Computationally Efficient Algorithm to Discover Socially Important Locations
NASA Astrophysics Data System (ADS)
Dokuz, A. S.; Celik, M.
2017-11-01
Socially important locations are places which are frequently visited by social media users in their social media lifetime. Discovering socially important locations provides valuable information about user behaviours on social media networking sites. However, discovering socially important locations is challenging due to data volume and dimensions, spatial and temporal calculations, location sparseness in social media datasets, and the inefficiency of current algorithms. In the literature, several studies have been conducted to discover important locations; however, the proposed approaches are not computationally efficient. In this study, we propose the Fast SS-ILM algorithm, a modification of the SS-ILM algorithm, to mine socially important locations efficiently. Experimental results show that the proposed Fast SS-ILM algorithm decreases the execution time of the socially important locations discovery process by up to 20%.
Algorithms for Efficient Computation of Transfer Functions for Large Order Flexible Systems
NASA Technical Reports Server (NTRS)
Maghami, Peiman G.; Giesy, Daniel P.
1998-01-01
An efficient and robust computational scheme is given for the calculation of the frequency response function of a large order, flexible system implemented with a linear, time invariant control system. Advantage is taken of the highly structured sparsity of the system matrix of the plant based on a model of the structure using normal mode coordinates. The computational time per frequency point of the new computational scheme is a linear function of system size, a significant improvement over traditional, full-matrix techniques whose computational times per frequency point range from quadratic to cubic functions of system size. This permits the practical frequency domain analysis of systems of much larger order than by traditional, full-matrix techniques. Formulations are given for both open- and closed-loop systems. Numerical examples are presented showing the advantages of the present formulation over traditional approaches, both in speed and in accuracy. Using a model with 703 structural modes, the present method was up to two orders of magnitude faster than a traditional method. The present method generally showed good to excellent accuracy throughout the range of test frequencies, while traditional methods gave adequate accuracy for lower frequencies, but generally deteriorated in performance at higher frequencies, with worst case errors being many orders of magnitude times the correct values.
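The linear cost per frequency point can be illustrated with a minimal sketch. It assumes an open-loop, single-input single-output structure already expressed in normal mode coordinates, so that each mode contributes one independent second-order term. The function name `modal_frf` and its parameters are hypothetical, and the sketch does not reproduce the closed-loop formulation of the report.

```python
def modal_frf(omega, freqs, dampings, b, c):
    """Displacement frequency response at angular frequency omega (rad/s).

    Assumes decoupled normal modes: mode i has natural frequency freqs[i],
    damping ratio dampings[i], input gain b[i], and output gain c[i].
    The loop visits each mode once, so the cost is linear in the mode count.
    """
    s = 1j * omega
    total = 0j
    for w, z, bi, ci in zip(freqs, dampings, b, c):
        # Second-order modal denominator: s^2 + 2*zeta*w*s + w^2
        total += ci * bi / (s * s + 2.0 * z * w * s + w * w)
    return total
```

A full-matrix approach would instead solve a dense linear system at every frequency point, which is where the quadratic to cubic cost quoted in the abstract comes from.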
An efficient unstructured WENO method for supersonic reactive flows
NASA Astrophysics Data System (ADS)
Zhao, Wen-Geng; Zheng, Hong-Wei; Liu, Feng-Jun; Shi, Xiao-Tian; Gao, Jun; Hu, Ning; Lv, Meng; Chen, Si-Cong; Zhao, Hong-Da
2018-03-01
An efficient high-order numerical method for supersonic reactive flows is proposed in this article. The reactive source term and convection term are solved separately by a splitting scheme. In the reaction step, an adaptive time-step method is presented, which can improve the efficiency greatly. In the convection step, a third-order accurate weighted essentially non-oscillatory (WENO) method is adopted to reconstruct the solution on the unstructured grids. Numerical results show that our new method can capture the correct propagation speed of the detonation wave exactly even on coarse grids, while high order accuracy can be achieved in the smooth region. In addition, the proposed adaptive splitting method can reduce the computational cost greatly compared with the traditional splitting method.
NASA Technical Reports Server (NTRS)
Lindner, Bernhard Lee; Ackerman, Thomas P.; Pollack, James B.
1990-01-01
CO2 comprises 95 pct. of the composition of the Martian atmosphere. However, the Martian atmosphere also has a high aerosol content. Dust particles vary from less than 0.2 to greater than 3.0. CO2 is an active absorber and emitter in near IR and IR wavelengths; the near IR absorption bands of CO2 provide significant heating of the atmosphere, and the 15 micron band provides rapid cooling. Including both CO2 and aerosol radiative transfer simultaneously in a model is difficult. Aerosol radiative transfer requires a multiple scattering code, while CO2 radiative transfer must deal with complex wavelength structure. As an alternative to the pure atmosphere treatment in most models which causes inaccuracies, a treatment was developed called the exponential sum or k distribution approximation. The chief advantage of the exponential sum approach is that the integration over k space of f(k) can be computed more quickly than the integration of k sub upsilon over frequency. The exponential sum approach is superior to the photon path distribution and emissivity techniques for dusty conditions. This study was the first application of the exponential sum approach to Martian conditions.
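The exponential sum idea can be sketched as follows: the band-averaged transmission along an absorber path u is approximated by a short weighted sum of exponentials, T(u) ≈ Σ w_i exp(-k_i u), with weights that sum to one. The coefficients below are made-up placeholders for illustration only, not fitted CO2 values from the study, and the function name is hypothetical.

```python
import math

# Hypothetical fit coefficients (weights sum to 1); not real CO2 band data.
WEIGHTS = [0.5, 0.3, 0.2]
K_VALUES = [0.01, 1.0, 100.0]

def band_transmission(u, weights, k_values):
    """Approximate band-averaged transmission for absorber amount u.

    Each term behaves like a monochromatic (grey) absorber with coefficient
    k_i, so each can be passed separately to a multiple scattering code.
    """
    return sum(w * math.exp(-k * u) for w, k in zip(weights, k_values))
```

Because every term is a pure exponential, a scattering calculation can be run once per term and the results combined with the same weights, which is what makes the approach convenient when aerosols are present.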
Computer-assisted map projection research
Snyder, John Parr
1985-01-01
Computers have opened up areas of map projection research which were previously too complicated to utilize, for example, using a least-squares fit to a very large number of points. One application has been in the efficient transfer of data between maps on different projections. While the transfer of moderate amounts of data is satisfactorily accomplished using the analytical map projection formulas, polynomials are more efficient for massive transfers. Suitable coefficients for the polynomials may be determined more easily for general cases using least squares instead of Taylor series. A second area of research is in the determination of a map projection fitting an unlabeled map, so that accurate data transfer can take place. The computer can test one projection after another, and include iteration where required. A third area is in the use of least squares to fit a map projection with optimum parameters to the region being mapped, so that distortion is minimized. This can be accomplished for standard conformal, equalarea, or other types of projections. Even less distortion can result if complex transformations of conformal projections are utilized. This bulletin describes several recent applications of these principles, as well as historical usage and background.
NASA Astrophysics Data System (ADS)
Jimenez, Edward S.; Goodman, Eric L.; Park, Ryeojin; Orr, Laurel J.; Thompson, Kyle R.
2014-09-01
This paper will investigate energy-efficiency for various real-world industrial computed-tomography reconstruction algorithms, both CPU- and GPU-based implementations. This work shows that the energy required for a given reconstruction is based on performance and problem size. There are many ways to describe performance and energy efficiency, thus this work will investigate multiple metrics including performance-per-watt, energy-delay product, and energy consumption. This work found that irregular GPU-based approaches realized tremendous savings in energy consumption when compared to CPU implementations while also significantly improving the performance-per-watt and energy-delay product metrics. Additional energy savings and other metric improvements were realized on the GPU-based reconstructions by improving storage I/O by implementing a parallel MIMD-like modularization of the compute and I/O tasks.
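The three metrics named above have simple textbook definitions, sketched here under the assumption that a run is summarized by its average power draw and wall-clock time. The function names and the notion of "work units" are illustrative choices, not taken from the paper.

```python
def energy_joules(avg_power_w, runtime_s):
    """Energy consumption: average power (W) times runtime (s)."""
    return avg_power_w * runtime_s

def performance_per_watt(work_units, runtime_s, avg_power_w):
    """Throughput (work units per second) divided by average power (W)."""
    return (work_units / runtime_s) / avg_power_w

def energy_delay_product(avg_power_w, runtime_s):
    """Energy (J) times runtime (s); penalizes slow runs more than energy alone."""
    return energy_joules(avg_power_w, runtime_s) * runtime_s
```

For example, a hypothetical run drawing 200 W for 10 s uses 2000 J and has an energy-delay product of 20000 J·s. Halving the runtime at the same power halves the energy but quarters the energy-delay product.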
Energy-efficient quantum computing
NASA Astrophysics Data System (ADS)
Ikonen, Joni; Salmilehto, Juha; Möttönen, Mikko
2017-04-01
In the near future, one of the major challenges in the realization of large-scale quantum computers operating at low temperatures is the management of harmful heat loads owing to thermal conduction of cabling and dissipation at cryogenic components. This naturally raises the question of what the fundamental limitations of energy consumption in scalable quantum computing are. In this work, we derive the greatest lower bound for the gate error induced by a single application of a bosonic drive mode of given energy. Previously, such an error type has been considered to be inversely proportional to the total driving power, but we show that this limitation can be circumvented by introducing a qubit driving scheme which reuses and corrects drive pulses. Specifically, our method serves to reduce the average energy consumption per gate operation without increasing the average gate error. Thus our work shows that precise, scalable control of quantum systems can, in principle, be implemented without the introduction of excessive heat or decoherence.
NASA Astrophysics Data System (ADS)
Cai, Jiaxiang; Liang, Hua; Zhang, Chun
2018-06-01
Based on the multi-symplectic Hamiltonian formula of the generalized Rosenau-type equation, a multi-symplectic scheme and an energy-preserving scheme are proposed. To improve the accuracy of the solution, we apply the composition technique to the obtained schemes to develop high-order schemes which are also multi-symplectic and energy-preserving respectively. Discrete fast Fourier transform makes a significant improvement to the computational efficiency of schemes. Numerical results verify that all the proposed schemes have satisfactory performance in providing accurate solution and preserving the discrete mass and energy invariants. Numerical results also show that although each basic time step is divided into several composition steps, the computational efficiency of the composition schemes is much higher than that of the non-composite schemes.
Characterizing Deep Brain Stimulation effects in computationally efficient neural network models.
Latteri, Alberta; Arena, Paolo; Mazzone, Paolo
2011-04-15
Recent studies on the medical treatment of Parkinson's disease (PD) led to the introduction of the so-called Deep Brain Stimulation (DBS) technique. This therapy makes it possible to actively counteract the pathological activity of various deep brain structures responsible for the well-known PD symptoms. This technique, frequently combined with the administration of dopaminergic drugs, replaces the surgical interventions performed to counteract the activity of specific brain nuclei, called Basal Ganglia (BG). This clinical protocol made it possible to analyse and inspect signals measured from the electrodes implanted into the deep brain regions. The analysis of these signals made it possible to study PD as a specific case of dynamical synchronization in biological neural networks, with the advantage of applying the theoretical analysis developed in that scientific field to find efficient treatments for this important disease. Experimental results in fact show that PD neurological diseases are characterized by a pathological signal synchronization in the BG. Parkinsonian tremor, for example, is ascribed to neuron populations of the Thalamic and Striatal structures that undergo an abnormal synchronization. On the contrary, in normal conditions, the activity of the same neuron populations does not appear to be correlated and synchronized. To study in detail the effect of the stimulation signal on a pathological neural medium, efficient models of these neural structures were built, which are able to show, without any external input, the intrinsic properties of a pathological neural tissue, mimicking the BG synchronized dynamics. We start by considering a model already introduced in the literature to investigate the effects of electrical stimulation on pathologically synchronized clusters of neurons. This model used Morris-Lecar type neurons. This neuron model, although having a high level of biological plausibility, requires a large computational effort
Second-order accurate nonoscillatory schemes for scalar conservation laws
NASA Technical Reports Server (NTRS)
Huynh, Hung T.
1989-01-01
Explicit finite difference schemes for the computation of weak solutions of nonlinear scalar conservation laws are presented and analyzed. These schemes are uniformly second-order accurate and nonoscillatory in the sense that the number of extrema of the discrete solution is not increasing in time.
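The nonoscillatory property can be illustrated with a classic minmod-limited upwind scheme for linear advection on a periodic grid. This is a generic textbook sketch, not the schemes of the report: minmod limiting drops to first order at smooth extrema, whereas the report's schemes are described as uniformly second-order. All names are illustrative.

```python
def minmod(a, b):
    """Return the argument of smaller magnitude if signs agree, else 0."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b

def step(u, nu):
    """Advance u_t + a*u_x = 0 (a > 0) by one step on a periodic grid.

    nu = a*dt/dx is the Courant number and should satisfy 0 <= nu <= 1.
    """
    n = len(u)
    # Limited cell slopes (already multiplied by dx); u[-1] wraps periodically.
    s = [minmod(u[j] - u[j - 1], u[(j + 1) % n] - u[j]) for j in range(n)]
    # Time-centred value at the right face of each cell (upwind side).
    f = [u[j] + 0.5 * (1.0 - nu) * s[j] for j in range(n)]
    # Conservative update using the face-value difference across each cell.
    return [u[j] - nu * (f[j] - f[j - 1]) for j in range(n)]

def total_variation(u):
    """Sum of absolute jumps between neighbouring cells (periodic)."""
    return sum(abs(u[j] - u[j - 1]) for j in range(len(u)))
```

Advecting a square pulse with `step` at any Courant number between 0 and 1 leaves `total_variation` non-increasing, which rules out the creation of new extrema.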
Efficient free energy calculations of quantum systems through computer simulations
NASA Astrophysics Data System (ADS)
Antonelli, Alex; Ramirez, Rafael; Herrero, Carlos; Hernandez, Eduardo
2009-03-01
In general, the classical limit is assumed in computer simulation calculations of free energy. This approximation, however, is not justifiable for a class of systems in which quantum contributions to the free energy cannot be neglected. The inclusion of quantum effects is important for the determination of reliable phase diagrams of these systems. In this work, we present a new methodology to compute the free energy of many-body quantum systems [1]. This methodology results from the combination of the path integral formulation of statistical mechanics and efficient non-equilibrium methods to estimate free energy, namely, the adiabatic switching and reversible scaling methods. A quantum Einstein crystal is used as a model to show the accuracy and reliability of the methodology. This new method is applied to the calculation of solid-liquid coexistence properties of neon. Our findings indicate that quantum contributions to properties such as melting point, latent heat of fusion, entropy of fusion, and slope of melting line can be up to 10% of the calculated values using the classical approximation. [1] R. M. Ramirez, C. P. Herrero, A. Antonelli, and E. R. Hernández, Journal of Chemical Physics 129, 064110 (2008)
A Very High Order, Adaptable MESA Implementation for Aeroacoustic Computations
NASA Technical Reports Server (NTRS)
Dyson, Rodger W.; Goodrich, John W.
2000-01-01
Since computational efficiency and wave resolution scale with accuracy, the ideal would be infinitely high accuracy for problems with widely varying wavelength scales. Currently, many of the computational aeroacoustics methods are limited to 4th order accurate Runge-Kutta methods in time, which limits their resolution and efficiency. However, a new procedure for implementing the Modified Expansion Solution Approximation (MESA) schemes, based upon Hermitian divided differences, is presented which extends the effective accuracy of the MESA schemes to 57th order in space and time when using 128 bit floating point precision. This new approach has the advantages of reducing round-off error, being easy to program, and being more computationally efficient than previous approaches. Its accuracy is limited only by the floating point hardware. The advantages of this new approach are demonstrated by solving the linearized Euler equations in an open bi-periodic domain. A 500th order MESA scheme can now be created in seconds, making these schemes ideally suited for the next generation of high performance 256-bit (double quadruple) or higher precision computers. This ease of creation makes it possible to adapt the algorithm to the mesh in time instead of its converse: this is ideal for resolving varying wavelength scales which occur in noise generation simulations. And finally, the sources of round-off error which affect the very high order methods are examined and remedies provided that effectively increase the accuracy of the MESA schemes while using current computer technology.
Versino, Daniele; Bronkhorst, Curt Allan
2018-01-31
The computational formulation of a micro-mechanical material model for the dynamic failure of ductile metals is presented in this paper. The statistical nature of porosity initiation is accounted for by introducing an arbitrary probability density function which describes the pores nucleation pressures. Each micropore within the representative volume element is modeled as a thick spherical shell made of plastically incompressible material. The treatment of porosity by a distribution of thick-walled spheres also allows for the inclusion of micro-inertia effects under conditions of shock and dynamic loading. The second order ordinary differential equation governing the microscopic porosity evolution is solved with a robust implicit procedure. A new Chebyshev collocation method is employed to approximate the porosity distribution and remapping is used to optimize memory usage. The adaptive approximation of the porosity distribution leads to a reduction of computational time and memory usage of up to two orders of magnitude. Moreover, the proposed model affords consistent performance: changing the nucleation pressure probability density function and/or the applied strain rate does not reduce accuracy or computational efficiency of the material model. The numerical performance of the model and algorithms presented is tested against three problems for high density tantalum: single void, one-dimensional uniaxial strain, and two-dimensional plate impact. Here, the results using the integration and algorithmic advances suggest a significant improvement in computational efficiency and accuracy over previous treatments for dynamic loading conditions.
Efficient frequent pattern mining algorithm based on node sets in cloud computing environment
NASA Astrophysics Data System (ADS)
Billa, V. N. Vinay Kumar; Lakshmanna, K.; Rajesh, K.; Reddy, M. Praveen Kumar; Nagaraja, G.; Sudheer, K.
2017-11-01
The ultimate goal of data mining is to uncover hidden information that is useful for decision making from the large databases collected by an organization. Data mining involves many tasks. Mining frequent itemsets is one of the most important tasks for transactional databases. These transactional databases hold data on a very large scale, and mining them consumes physical memory and time in proportion to the size of the database. A frequent pattern mining algorithm is said to be efficient only if it consumes less memory and time to mine the frequent itemsets from a given large database. With these points in mind, in this thesis we propose a system which mines frequent itemsets in a way that is optimized for memory and time, using cloud computing to make the process parallel and providing the application as a service. The complete framework uses a proven efficient algorithm called the FIN algorithm, which works on Nodesets and a POC (pre-order coding) tree. To evaluate the performance of the system, we conduct experiments comparing the efficiency of the same algorithm applied in a standalone manner and in a cloud computing environment on a real time data set, namely a traffic accidents data set. The results show that the memory consumption and execution time of the proposed system are much lower than those of the standalone system.
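To make the task concrete, the following sketch counts itemset support by brute-force enumeration. It is deliberately not the FIN algorithm (no Nodesets or POC tree are built) and is only meant to show what "frequent itemset" means on a toy transactional database. The function name and the `max_size` cap are assumptions introduced for illustration.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {frozenset(itemset): count} for itemsets of up to max_size
    items that occur in at least min_support transactions.

    Brute force: every subset of every transaction is enumerated, so the
    cost grows quickly with transaction length. Efficient miners exist
    precisely to avoid this blow-up.
    """
    counts = {}
    for t in transactions:
        items = sorted(set(t))  # deduplicate items within a transaction
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s: c for s, c in counts.items() if c >= min_support}
```

With the four transactions `{a,b,c}`, `{a,b}`, `{a,c}`, `{b,c}` and a minimum support of 3, only the single items qualify. Lowering the threshold to 2 also admits the three pairs.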
ERIC Educational Resources Information Center
McKinley, Kenneth H.; Self, Burl E., Jr.
A study was conducted to determine the feasibility of using the computer-based Synagraphic Mapping Program (SYMAP) and the Statistical Package for the Social Sciences (SPSS) in formulating an efficient and accurate information system which Creek Nation tribal staff could implement and use in planning for more effective and precise delivery of…
Accurate computational design of multipass transmembrane proteins.
Lu, Peilong; Min, Duyoung; DiMaio, Frank; Wei, Kathy Y; Vahey, Michael D; Boyken, Scott E; Chen, Zibo; Fallas, Jorge A; Ueda, George; Sheffler, William; Mulligan, Vikram Khipple; Xu, Wenqing; Bowie, James U; Baker, David
2018-03-02
The computational design of transmembrane proteins with more than one membrane-spanning region remains a major challenge. We report the design of transmembrane monomers, homodimers, trimers, and tetramers with 76 to 215 residue subunits containing two to four membrane-spanning regions and up to 860 total residues that adopt the target oligomerization state in detergent solution. The designed proteins localize to the plasma membrane in bacteria and in mammalian cells, and magnetic tweezer unfolding experiments in the membrane indicate that they are very stable. Crystal structures of the designed dimer and tetramer (a rocket-shaped structure with a wide cytoplasmic base that funnels into eight transmembrane helices) are very close to the design models. Our results pave the way for the design of multispan membrane proteins with new functions. Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.
Using Neural Net Technology To Enhance the Efficiency of a Computer Adaptive Testing Application.
ERIC Educational Resources Information Center
Van Nelson, C.; Henriksen, Larry W.
The potential for computer adaptive testing (CAT) has been well documented. In order to improve the efficiency of this process, it may be possible to utilize a neural network, or more specifically, a back propagation neural network. The paper asserts that in order to accomplish this end, it must be shown that grouping examinees by ability as…
On the implementation of an accurate and efficient solver for convection-diffusion equations
NASA Astrophysics Data System (ADS)
Wu, Chin-Tien
In this dissertation, we examine several different aspects of computing the numerical solution of the convection-diffusion equation. The solution of this equation often exhibits sharp gradients due to Dirichlet outflow boundaries or discontinuities in boundary conditions. Because of the singularly perturbed nature of the equation, numerical solutions often have severe oscillations when grid sizes are not small enough to resolve sharp gradients. To overcome such difficulties, the streamline diffusion discretization method can be used to obtain an accurate approximate solution in regions where the solution is smooth. To increase accuracy of the solution in the regions containing layers, adaptive mesh refinement and mesh movement based on a posteriori error estimations can be employed. An error-adapted mesh refinement strategy based on a posteriori error estimations is also proposed to resolve layers. For solving the sparse linear systems that arise from discretization, geometric multigrid (MG) and algebraic multigrid (AMG) are compared. In addition, both methods are also used as preconditioners for Krylov subspace methods. We derive some convergence results for MG with line Gauss-Seidel smoothers and bilinear interpolation. Finally, while considering adaptive mesh refinement as an integral part of the solution process, it is natural to set a stopping tolerance for the iterative linear solvers on each mesh stage so that the difference between the approximate solution obtained from iterative methods and the finite element solution is bounded by an a posteriori error bound. Here, we present two stopping criteria. The first is based on a residual-type a posteriori error estimator developed by Verfurth. The second is based on an a posteriori error estimator, using local solutions, developed by Kay and Silvester. Our numerical results show the refined mesh obtained from the iterative solution which satisfies the second criterion is similar to the refined mesh obtained from
Computational modelling of oxygenation processes in enzymes and biomimetic model complexes.
de Visser, Sam P; Quesne, Matthew G; Martin, Bodo; Comba, Peter; Ryde, Ulf
2014-01-11
With computational resources becoming more efficient and more powerful and at the same time cheaper, computational methods have become more and more popular for studies on biochemical and biomimetic systems. Although large efforts from the scientific community have gone into exploring the possibilities of computational methods for studies on large biochemical systems, such studies are not without pitfalls and often cannot be routinely done but require expert execution. In this review we summarize and highlight advances in computational methodology and its application to enzymatic and biomimetic model complexes. In particular, we emphasize topical and state-of-the-art methodologies that are able to either reproduce experimental findings, e.g., spectroscopic parameters and rate constants, accurately or make predictions of short-lived intermediates and fast reaction processes in nature. Moreover, we give examples of processes where certain computational methods dramatically fail.
Application of kernel functions for accurate similarity search in large chemical databases.
Wang, Xiaohong; Huan, Jun; Smalter, Aaron; Lushington, Gerald H
2010-04-29
Similarity search in chemical structure databases is an important problem with many applications in chemical genomics, drug design, and efficient chemical probe screening, among others. It is widely believed that structure based methods provide an efficient way to do the query. Recently various graph kernel functions have been designed to capture the intrinsic similarity of graphs. Though successful in constructing accurate predictive and classification models, graph kernel functions cannot be applied to large chemical compound databases due to the high computational complexity and the difficulties in indexing similarity search for large databases. To bridge graph kernel functions and similarity search in chemical databases, we applied a novel kernel-based similarity measurement, developed in our team, to measure the similarity of graph-represented chemicals. In our method, we utilize a hash table to support new graph kernel function definition, efficient storage and fast search. We have applied our method, named G-hash, to large chemical databases. Our results show that the G-hash method achieves state-of-the-art performance for k-nearest neighbor (k-NN) classification. Moreover, the similarity measurement and the index structure are scalable to large chemical databases, with smaller indexing size and faster query processing time compared to state-of-the-art indexing methods such as Daylight fingerprints, C-tree and GraphGrep. Designing an efficient similarity query processing method for large chemical databases is challenging since we need to balance running time efficiency and similarity search accuracy. Our previous similarity search method, G-hash, provides a new way to perform similarity search in chemical databases. Experimental study validates the utility of G-hash in chemical databases.
Analysis of a Multiprocessor Guidance Computer. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Maltach, E. G.
1969-01-01
The design of the next generation of spaceborne digital computers is described. It analyzes a possible multiprocessor computer configuration. For the analysis, a set of representative space computing tasks was abstracted from the Lunar Module Guidance Computer programs as executed during the lunar landing, from the Apollo program. This computer performs at this time about 24 concurrent functions, with iteration rates from 10 times per second to once every two seconds. These jobs were tabulated in a machine-independent form, and statistics of the overall job set were obtained. It was concluded, based on a comparison of simulation and Markov results, that the Markov process analysis is accurate in predicting overall trends and in configuration comparisons, but does not provide useful detailed information in specific situations. Using both types of analysis, it was determined that the job scheduling function is a critical one for efficiency of the multiprocessor. It is recommended that research into the area of automatic job scheduling be performed.
NASA Technical Reports Server (NTRS)
Lee, C. S. G.; Chen, C. L.
1989-01-01
Two efficient mapping algorithms for scheduling the robot inverse dynamics computation consisting of m computational modules with precedence relationship to be executed on a multiprocessor system consisting of p identical homogeneous processors with processor and communication costs to achieve minimum computation time are presented. An objective function is defined in terms of the sum of the processor finishing time and the interprocessor communication time. The minimax optimization is performed on the objective function to obtain the best mapping. This mapping problem can be formulated as a combination of the graph partitioning and the scheduling problems; both have been known to be NP-complete. Thus, to speed up the searching for a solution, two heuristic algorithms were proposed to obtain fast but suboptimal mapping solutions. The first algorithm utilizes the level and the communication intensity of the task modules to construct an ordered priority list of ready modules and the module assignment is performed by a weighted bipartite matching algorithm. For a near-optimal mapping solution, the problem can be solved by the heuristic algorithm with simulated annealing. These proposed optimization algorithms can solve various large-scale problems within a reasonable time. Computer simulations were performed to evaluate and verify the performance and the validity of the proposed mapping algorithms. Finally, experiments for computing the inverse dynamics of a six-jointed PUMA-like manipulator based on the Newton-Euler dynamic equations were implemented on an NCUBE/ten hypercube computer to verify the proposed mapping algorithms. Computer simulation and experimental results are compared and discussed.
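The priority-list idea in the abstract above can be sketched in Python as a simplified list scheduler. This is not the authors' algorithm: it ranks ready modules by level only (longest remaining path to an exit module), ignores communication costs, and replaces the weighted bipartite matching step with a greedy choice of the earliest available processor. The function names and the four-module task graph are hypothetical.

```python
def task_levels(duration, successors):
    """Level of a task: length of the longest path from it to an exit task."""
    levels = {}

    def level(task):
        if task not in levels:
            levels[task] = duration[task] + max(
                (level(s) for s in successors.get(task, [])), default=0
            )
        return levels[task]

    for task in duration:
        level(task)
    return levels


def list_schedule(duration, successors, num_processors):
    """Greedy list scheduling on identical processors.

    Returns (makespan, {task: (processor, start_time)}).
    Assumes the precedence graph is acyclic.
    """
    levels = task_levels(duration, successors)
    predecessors = {t: [] for t in duration}
    for task, succs in successors.items():
        for s in succs:
            predecessors[s].append(task)

    finish = {}
    free_at = [0.0] * num_processors
    placement = {}
    remaining = set(duration)
    while remaining:
        # A task is ready once all of its predecessors have been scheduled.
        ready = [t for t in remaining if all(p in finish for p in predecessors[t])]
        task = max(ready, key=lambda t: (levels[t], t))  # highest level first
        earliest = max((finish[p] for p in predecessors[task]), default=0.0)
        proc = min(range(num_processors), key=lambda i: max(free_at[i], earliest))
        start = max(free_at[proc], earliest)
        finish[task] = start + duration[task]
        free_at[proc] = finish[task]
        placement[task] = (proc, start)
        remaining.remove(task)
    return max(finish.values()), placement


# Hypothetical module graph: a and b feed c, which feeds d.
duration = {"a": 2, "b": 3, "c": 1, "d": 2}
successors = {"a": ["c"], "b": ["c"], "c": ["d"]}
makespan, placement = list_schedule(duration, successors, num_processors=2)
print(makespan, placement)
```

Scheduling by level favors modules on the critical path, which is why the two-processor makespan in this toy case equals the critical path length.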
Sampling efficiency of modified 37-mm sampling cassettes using computational fluid dynamics.
Anthony, T Renée; Sleeth, Darrah; Volckens, John
2016-01-01
In the U.S., most industrial hygiene practitioners continue to rely on the closed-face cassette (CFC) to assess worker exposures to hazardous dusts, primarily because of ease of use, cost, and familiarity. However, mass concentrations measured with this classic sampler underestimate exposures to larger particles throughout the inhalable particulate mass (IPM) size range (up to aerodynamic diameters of 100 μm). To investigate whether the current 37-mm inlet cap can be redesigned to better meet the IPM sampling criterion, computational fluid dynamics (CFD) models were developed, and particle sampling efficiencies associated with various modifications to the CFC inlet cap were determined. Simulations of fluid flow (standard k-epsilon turbulent model) and particle transport (laminar trajectories, 1-116 μm) were conducted using sampling flow rates of 10 L min⁻¹ in slow moving air (0.2 m s⁻¹) in the facing-the-wind orientation. Combinations of seven inlet shapes and three inlet diameters were evaluated as candidates to replace the current 37-mm inlet cap. For a given inlet geometry, differences in sampler efficiency between inlet diameters averaged less than 1% for particles through 100 μm, but the largest opening was found to increase the efficiency for the 116 μm particles by 14% for the flat inlet cap. A substantial reduction in sampler efficiency was identified for sampler inlets with side walls extending beyond the dimension of the external lip of the current 37-mm CFC. The inlet cap based on the 37-mm CFC dimensions with an expanded 15-mm entry provided the best agreement with facing-the-wind human aspiration efficiency. The sampler efficiency was increased with a flat entry or with a thin central lip adjacent to the new enlarged entry. This work provides a substantial body of sampling efficiency estimates as a function of particle size and inlet geometry for personal aerosol samplers.
Sankaran, Sethuraman; Humphrey, Jay D.; Marsden, Alison L.
2013-01-01
Computational models for vascular growth and remodeling (G&R) are used to predict the long-term response of vessels to changes in pressure, flow, and other mechanical loading conditions. Accurate predictions of these responses are essential for understanding numerous disease processes. Such models require reliable inputs of numerous parameters, including material properties and growth rates, which are often experimentally derived, and inherently uncertain. While earlier methods have used a brute force approach, systematic uncertainty quantification in G&R models promises to provide much better information. In this work, we introduce an efficient framework for uncertainty quantification and optimal parameter selection, and illustrate it via several examples. First, an adaptive sparse grid stochastic collocation scheme is implemented in an established G&R solver to quantify parameter sensitivities, and near-linear scaling with the number of parameters is demonstrated. This non-intrusive and parallelizable algorithm is compared with standard sampling algorithms such as Monte-Carlo. Second, we determine optimal arterial wall material properties by applying robust optimization. We couple the G&R simulator with an adaptive sparse grid collocation approach and a derivative-free optimization algorithm. We show that an artery can achieve optimal homeostatic conditions over a range of alterations in pressure and flow; robustness of the solution is enforced by including uncertainty in loading conditions in the objective function. We then show that homeostatic intramural and wall shear stress is maintained for a wide range of material properties, though the time it takes to achieve this state varies. We also show that the intramural stress is robust and lies within 5% of its mean value for realistic variability of the material parameters. We observe that prestretch of elastin and collagen are most critical to maintaining homeostasis, while values of the material properties are
Towards future high performance computing: What will change? How can we be efficient?
NASA Astrophysics Data System (ADS)
Düben, Peter
2017-04-01
How can we make the most of the "exascale" supercomputers that will soon be available and will enable us to perform an amazing 1,000,000,000,000,000,000 real-number operations within a single second? How do we need to design applications to use these machines efficiently? What are the limits? We will discuss opportunities and limits of the use of future high performance computers from the perspective of Earth System Modelling. We will provide an overview of future challenges and outline how numerical applications will need to be changed to run efficiently on supercomputers in the future. We will also discuss how different disciplines can support each other and talk about data handling and numerical precision of data.
Machine Learning of Accurate Energy-Conserving Molecular Force Fields
NASA Astrophysics Data System (ADS)
Chmiela, Stefan; Tkatchenko, Alexandre; Sauceda, Huziel; Poltavsky, Igor; Schütt, Kristof; Müller, Klaus-Robert; GDML Collaboration
Efficient and accurate access to the Born-Oppenheimer potential energy surface (PES) is essential for long time scale molecular dynamics (MD) simulations. Using conservation of energy - a fundamental property of closed classical and quantum mechanical systems - we develop an efficient gradient-domain machine learning (GDML) approach to construct accurate molecular force fields using a restricted number of samples from ab initio MD trajectories (AIMD). The GDML implementation is able to reproduce global potential-energy surfaces of intermediate-size molecules with an accuracy of 0.3 kcal/mol for energies and 1 kcal/mol/Å for atomic forces using only 1000 conformational geometries for training. We demonstrate this accuracy for AIMD trajectories of molecules, including benzene, toluene, naphthalene, malonaldehyde, ethanol, uracil, and aspirin. The challenge of constructing conservative force fields is accomplished in our work by learning in a Hilbert space of vector-valued functions that obey the law of energy conservation. The GDML approach enables quantitative MD simulations for molecules at a fraction of the cost of explicit AIMD calculations, thereby allowing the construction of efficient force fields with the accuracy and transferability of high-level ab initio methods.
Realistic and efficient 2D crack simulation
NASA Astrophysics Data System (ADS)
Yadegar, Jacob; Liu, Xiaoqing; Singh, Abhishek
2010-04-01
Although numerical algorithms for 2D crack simulation have been studied in Modeling and Simulation (M&S) and computer graphics for decades, realism and computational efficiency are still major challenges. In this paper, we introduce a high-fidelity, scalable, adaptive and efficient/runtime 2D crack/fracture simulation system by applying the mathematically elegant Peano-Cesaro triangular meshing/remeshing technique to model the generation of shards/fragments. The recursive fractal sweep associated with the Peano-Cesaro triangulation provides efficient local multi-resolution refinement to any level-of-detail. The generated binary decomposition tree also provides an efficient neighbor retrieval mechanism used for mesh element splitting and merging with minimal memory requirements, essential for realistic 2D fragment formation. Upon load impact/contact/penetration, a number of factors including impact angle, impact energy, and material properties are all taken into account to produce the criteria of crack initialization, propagation, and termination leading to realistic fractal-like rubble/fragments formation. The aforementioned parameters are used as variables of probabilistic models of cracks/shards formation, making the proposed solution highly adaptive by allowing machine learning mechanisms to learn the optimal values for the variables/parameters based on prior benchmark data generated by off-line physics-based simulation solutions that produce accurate fractures/shards, though at a highly non-real-time pace. Crack/fracture simulation has been conducted on various load impacts with different initial locations at various impulse scales. The simulation results demonstrate that the proposed system has the capability to realistically and efficiently simulate 2D crack phenomena (such as window shattering and shards generation) with diverse potentials in military and civil M&S applications such as training and mission planning.
Jin, Miaomiao; Cheng, Long; Li, Yi; Hu, Siyu; Lu, Ke; Chen, Jia; Duan, Nian; Wang, Zhuorui; Zhou, Yaxiong; Chang, Ting-Chang; Miao, Xiangshui
2018-06-27
Owing to the capability of integrating information storage and computing in the same physical location, in-memory computing with memristors has become a research hotspot as a promising route for non-von Neumann architecture. However, it is still a challenge to develop high performance devices as well as optimized logic methodologies to realize energy-efficient computing. Herein, a filamentary Cu/GeTe/TiN memristor is reported to show satisfactory properties with nanosecond switching speed (< 60 ns), low voltage operation (< 2 V), high endurance (>10⁴ cycles) and good retention (>10⁴ s at 85 °C). It is revealed that the charge carrier conduction mechanisms in high resistance and low resistance states are Schottky emission and hopping transport between the adjacent Cu clusters, respectively, based on the analysis of current-voltage behaviors and resistance-temperature characteristics. An intuitive picture is given to describe the dynamic processes of resistive switching. Moreover, based on the basic material implication (IMP) logic circuit, we proposed a reconfigurable logic method and experimentally implemented IMP, NOT, OR, and COPY logic functions. Design of a one-bit full adder with reduction in computational sequences and its validation in simulation further demonstrate the potential practical application. The results provide important progress towards understanding of resistive switching mechanism and realization of energy-efficient in-memory computing architecture. © 2018 IOP Publishing Ltd.
NASA Technical Reports Server (NTRS)
Chang, Chau-Lyan; Venkatachari, Balaji Shankar; Cheng, Gary
2013-01-01
With the wide availability of affordable multiple-core parallel supercomputers, next generation numerical simulations of flow physics are being focused on unsteady computations for problems involving multiple time scales and multiple physics. These simulations require higher solution accuracy than most algorithms and computational fluid dynamics codes currently available. This paper focuses on the developmental effort for high-fidelity multi-dimensional, unstructured-mesh flow solvers using the space-time conservation element, solution element (CESE) framework. Two approaches have been investigated in this research in order to provide high-accuracy, cross-cutting numerical simulations for a variety of flow regimes: 1) time-accurate local time stepping and 2) high-order CESE method. The first approach utilizes consistent numerical formulations in the space-time flux integration to preserve temporal conservation across the cells with different marching time steps. Such an approach relieves the stringent time step constraint associated with the smallest time step in the computational domain while preserving temporal accuracy for all the cells. For flows involving multiple scales, both numerical accuracy and efficiency can be significantly enhanced. The second approach extends the current CESE solver to higher-order accuracy. Unlike other existing explicit high-order methods for unstructured meshes, the CESE framework maintains a CFL condition of one for arbitrarily high-order formulations while retaining the same compact stencil as its second-order counterpart. For large-scale unsteady computations, this feature substantially enhances numerical efficiency. Numerical formulations and validations using benchmark problems are discussed in this paper along with realistic examples.
Kang, Kyoung-Tak; Kim, Sung-Hwan; Son, Juhyun; Lee, Young Han; Koh, Yong-Gon
2017-01-01
Computational models have been identified as efficient techniques in the clinical decision-making process. However, in most previous studies, computational models were validated using published data, and the kinematic validation of such models still remains a challenge. Recently, studies using medical imaging have provided a more accurate visualization of knee joint kinematics. The purpose of the present study was to perform kinematic validation for the subject-specific computational knee joint model by comparison with the subject's medical imaging under identical laxity conditions. The laxity test was applied to the anterior-posterior drawer under 90° flexion and the varus-valgus under 20° flexion with a series of stress radiographs, a Telos device, and computed tomography. The loading condition in the computational subject-specific knee joint model was identical to the laxity test condition in the medical image. Our computational model showed knee laxity kinematic trends that were consistent with the computed tomography images, except for negligible differences because of the indirect application of the subject's in vivo material properties. Medical imaging based on computed tomography with the laxity test allowed us to measure not only the precise translation but also the rotation of the knee joint. This methodology will be beneficial in the validation of laxity tests for subject- or patient-specific computational models.
NASA Astrophysics Data System (ADS)
Zhu, Jun; Chen, Lijun; Ma, Lantao; Li, Dejian; Jiang, Wei; Pan, Lihong; Shen, Huiting; Jia, Hongmin; Hsiang, Chingyun; Cheng, Guojie; Ling, Li; Chen, Shijie; Wang, Jun; Liao, Wenkui; Zhang, Gary
2014-04-01
Defect review is a time-consuming job, and human error makes the results inconsistent. Defects located in don't-care areas, such as defects in dark areas, do not hurt yield and need not be reviewed. However, defects in critical areas, such as defects in clear areas, can impact yield dramatically and need more attention during review. As integrated circuit dimensions decrease, thousands of mask defects, or even more, are routinely detected during inspection. Traditional manual or simple classification approaches are unable to meet efficiency and accuracy requirements. This paper focuses on an automatic defect management and classification solution using the image output of Lasertec inspection equipment and Anchor pattern-centric image processing technology. The system can handle a large number of defects and produces quick and accurate defect classification results. Our experiments include Die to Die and Single Die modes. The classification accuracy can reach 87.4% and 93.3%. No critical or printable defects were missed in our test cases. The missed classification defects are 0.25% and 0.24% in Die to Die mode and Single Die mode, respectively. This kind of miss rate is encouraging and acceptable for application on a production line. The results can be output and reloaded back into the inspection machine for further review. This step helps users to validate uncertain defects with clear, magnified images when the captured images cannot provide enough information to make a judgment. This system effectively reduces expensive inline defect review time. As a fully inline automated defect management solution, the system could be compatible with the current inspection approach, be integrated with optical simulation and even a scoring function, and guide wafer-level defect inspection.
A cloud computing based 12-lead ECG telemedicine service.
Hsieh, Jui-Chien; Hsu, Meng-Wei
2012-07-28
Due to the great variability of 12-lead ECG instruments and medical specialists' interpretation skills, it remains a challenge to deliver rapid and accurate 12-lead ECG reports with senior cardiologists' decision making support in emergency telecardiology. We create a new cloud and pervasive computing based 12-lead Electrocardiography (ECG) service to realize ubiquitous 12-lead ECG tele-diagnosis. This service enables ECGs to be transmitted and interpreted via mobile phones. That is, tele-consultation can take place while the patient is in the ambulance, between the onsite clinicians and the off-site senior cardiologists, or among hospitals. Most importantly, the service is convenient, efficient, and inexpensive. This cloud computing based ECG tele-consultation service extends traditional 12-lead ECG applications to collaboration among clinicians at different locations or among hospitals. In short, this service can greatly improve medical service quality and efficiency, especially for patients in rural areas. This service has been evaluated and proved to be useful by cardiologists in Taiwan.
NASA Astrophysics Data System (ADS)
Schwörer, Magnus; Breitenfeld, Benedikt; Tröster, Philipp; Bauer, Sebastian; Lorenzen, Konstantin; Tavan, Paul; Mathias, Gerald
2013-06-01
Hybrid molecular dynamics (MD) simulations, in which the forces acting on the atoms are calculated by grid-based density functional theory (DFT) for a solute molecule and by a polarizable molecular mechanics (PMM) force field for a large solvent environment composed of several 10³-10⁵ molecules, pose a challenge. A corresponding computational approach should guarantee energy conservation, exclude artificial distortions of the electron density at the interface between the DFT and PMM fragments, and should treat the long-range electrostatic interactions within the hybrid simulation system in a linearly scaling fashion. Here we describe a corresponding Hamiltonian DFT/(P)MM implementation, which accounts for inducible atomic dipoles of a PMM environment in a joint DFT/PMM self-consistency iteration. The long-range parts of the electrostatics are treated by hierarchically nested fast multipole expansions up to a maximum distance dictated by the minimum image convention of toroidal boundary conditions and, beyond that distance, by a reaction field approach such that the computation scales linearly with the number of PMM atoms. Short-range over-polarization artifacts are excluded by using Gaussian inducible dipoles throughout the system and Gaussian partial charges in the PMM region close to the DFT fragment. The Hamiltonian character, the stability, and efficiency of the implementation are investigated by hybrid DFT/PMM-MD simulations treating one molecule of the water dimer and of bulk water by DFT and the respective remainder by PMM.
Tensor scale: An analytic approach with efficient computation and applications
Xu, Ziyue; Saha, Punam K.; Dasgupta, Soura
2015-01-01
Scale is a widely used notion in computer vision and image understanding that evolved in the form of scale-space theory where the key idea is to represent and analyze an image at various resolutions. Recently, we introduced a notion of local morphometric scale referred to as “tensor scale” using an ellipsoidal model that yields a unified representation of structure size, orientation and anisotropy. In the previous work, tensor scale was described using a 2-D algorithmic approach and a precise analytic definition was missing. Also, the application of tensor scale in 3-D using the previous framework is not practical due to high computational complexity. In this paper, an analytic definition of tensor scale is formulated for n-dimensional (n-D) images that captures local structure size, orientation and anisotropy. Also, an efficient computational solution in 2- and 3-D using several novel differential geometric approaches is presented and the accuracy of results is experimentally examined. Also, a matrix representation of tensor scale is derived facilitating several operations including tensor field smoothing to capture larger contextual knowledge. Finally, the applications of tensor scale in image filtering and n-linear interpolation are presented and the performance of their results is examined in comparison with respective state-of-art methods. Specifically, the performance of tensor scale based image filtering is compared with gradient and Weickert’s structure tensor based diffusive filtering algorithms. Also, the performance of tensor scale based n-linear interpolation is evaluated in comparison with standard n-linear and windowed-sinc interpolation methods. PMID:26236148
Saglam, Ali S; Chong, Lillian T
2016-01-14
An essential baseline for determining the extent to which electrostatic interactions enhance the kinetics of protein-protein association is the "basal" kon, which is the rate constant for association in the absence of electrostatic interactions. However, since such association events are beyond the milliseconds time scale, it has not been practical to compute the basal kon by directly simulating the association with flexible models. Here, we computed the basal kon for barnase and barstar, two of the most rapidly associating proteins, using highly efficient, flexible molecular simulations. These simulations involved (a) pseudoatomic protein models that reproduce the molecular shapes, electrostatic, and diffusion properties of all-atom models, and (b) application of the weighted ensemble path sampling strategy, which enhanced the efficiency of generating association events by >130-fold. We also examined the extent to which the computed basal kon is affected by inclusion of intermolecular hydrodynamic interactions in the simulations.
NASA Astrophysics Data System (ADS)
Papasotiriou, P. J.; Geroyannis, V. S.
We apply Hartle's perturbation method to the computation of relativistic rigidly rotating neutron star models. The program has been written in SCILAB (© INRIA ENPC), a matrix-oriented high-level programming language. The numerical method is described in great detail and is applied to many models in slow or fast rotation. We show that, although the method is perturbative, it gives accurate results for all practical purposes and it should prove an efficient tool for computing rapidly rotating pulsars.
Tresley, Jonathan; Jose, Jean
2015-04-01
Osteoarthritis of the knee can be a debilitating and extremely painful condition. In patients who desire to postpone knee arthroplasty or in those who are not surgical candidates, percutaneous knee injection therapies have the potential to reduce pain and swelling, maintain joint mobility, and minimize disability. Published studies cite poor accuracy of intra-articular knee joint injections without imaging guidance. We present a sonographically guided posteromedial approach to intra-articular knee joint injections with 100% accuracy and no complications in a consecutive series of 67 patients undergoing subsequent computed tomographic or magnetic resonance arthrography. Although many other standard approaches are available, a posteromedial intra-articular technique is particularly useful in patients with a large body habitus and theoretically allows for simultaneous aspiration of Baker cysts with a single sterile preparation and without changing the patient's position. The posteromedial technique described in this paper is not compared or deemed superior to other standard approaches but, rather, is presented as a potentially safe and efficient alternative. © 2015 by the American Institute of Ultrasound in Medicine.
Tahir, Muhammad; Jan, Bismillah; Hayat, Maqsood; Shah, Shakir Ullah; Amin, Muhammad
2018-04-01
Discriminative and informative feature extraction is the core requirement for accurate and efficient classification of protein subcellular localization images so that drug development could be more effective. The objective of this paper is to propose a novel modification in the Threshold Adjacency Statistics technique and enhance its discriminative power. In this work, we utilized Threshold Adjacency Statistics from a novel perspective to enhance its discrimination power and efficiency. In this connection, we utilized seven threshold ranges to produce seven distinct feature spaces, which are then used to train seven SVMs. The final prediction is obtained through the majority voting scheme. The proposed ETAS-SubLoc system is tested on two benchmark datasets using 5-fold cross-validation technique. We observed that our proposed novel utilization of TAS technique has improved the discriminative power of the classifier. The ETAS-SubLoc system has achieved 99.2% accuracy, 99.3% sensitivity and 99.1% specificity for Endogenous dataset outperforming the classical Threshold Adjacency Statistics technique. Similarly, 91.8% accuracy, 96.3% sensitivity and 91.6% specificity values are achieved for Transfected dataset. Simulation results validated the effectiveness of ETAS-SubLoc that provides superior prediction performance compared to the existing technique. The proposed methodology aims at providing support to pharmaceutical industry as well as research community towards better drug designing and innovation in the fields of bioinformatics and computational biology. The implementation code for replicating the experiments presented in this paper is available at: https://drive.google.com/file/d/0B7IyGPObWbSqRTRMcXI2bG5CZWs/view?usp=sharing. Copyright © 2018 Elsevier B.V. All rights reserved.
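The majority voting step described in the abstract above can be sketched in a few lines of Python. This shows only the vote-combining rule, under the assumption that each of the seven classifiers has already produced one label per image; the function name `majority_vote`, the class labels, and the predictions are made up for illustration and are not taken from ETAS-SubLoc.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    predictions: list of equal-length lists, one list per classifier.
    On a tie, the label that appeared first in classifier order wins.
    """
    combined = []
    for votes in zip(*predictions):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of seven classifiers for three images.
predictions = [
    ["nucleus", "golgi", "er"],
    ["nucleus", "golgi", "golgi"],
    ["nucleus", "er", "er"],
    ["er", "golgi", "er"],
    ["nucleus", "golgi", "golgi"],
    ["golgi", "golgi", "er"],
    ["nucleus", "er", "golgi"],
]
print(majority_vote(predictions))
```

An odd number of voters rules out ties for two-class problems, but with more than two classes ties remain possible, so the tie-breaking rule above is a design choice that should be made explicit.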
An approach for accurate simulation of liquid mixing in a T-shaped micromixer.
Matsunaga, Takuya; Lee, Ho-Joon; Nishino, Koichi
2013-04-21
In this paper, we propose a new computational method for efficient evaluation of the fluid mixing behaviour in a T-shaped micromixer with a rectangular cross section at high Schmidt number under steady state conditions. Our approach enables a low-cost high-quality simulation based on tracking of fluid particles for convective fluid mixing and posterior solving of a model of the species equation for molecular diffusion. The examined parameter range is Re = 1.33 × 10⁻² to 240 at Sc = 3600. The proposed method is shown to simulate well the mixing quality even in the engulfment regime, where the ordinary grid-based simulation is not able to obtain accurate solutions with affordable mesh sizes due to the numerical diffusion at high Sc. The obtained results agree well with a backward random-walk Monte Carlo simulation, by which the accuracy of the proposed method is verified. For further investigation of the characteristics of the proposed method, the Sc dependency is examined in a wide range of Sc from 10 to 3600 at Re = 200. The study reveals that the model discrepancy error emerges more significantly in the concentration distribution at lower Sc, while the resulting mixing quality is accurate over the entire range.
NASA Technical Reports Server (NTRS)
Tam, Christopher K. W.; Aganin, Alexei
2000-01-01
The transonic nozzle transmission problem and the open rotor noise radiation problem are solved computationally. Both are multiple length scales problems. For efficient and accurate numerical simulation, the multiple-size-mesh multiple-time-step Dispersion-Relation-Preserving scheme is used to calculate the time periodic solution. To ensure an accurate solution, high quality numerical boundary conditions are also needed. For the nozzle problem, a set of nonhomogeneous, outflow boundary conditions are required. The nonhomogeneous boundary conditions not only generate the incoming sound waves but also, at the same time, allow the reflected acoustic waves and entropy waves, if present, to exit the computation domain without reflection. For the open rotor problem, there is an apparent singularity at the axis of rotation. An analytic extension approach is developed to provide a high quality axis boundary treatment.
Improving the Efficiency of Free Energy Calculations in the Amber Molecular Dynamics Package.
Kaus, Joseph W; Pierce, Levi T; Walker, Ross C; McCammon, J Andrew
2013-09-10
Alchemical transformations are widely used methods to calculate free energies. Amber has traditionally included support for alchemical transformations as part of the sander molecular dynamics (MD) engine. Here we describe the implementation of a more efficient approach to alchemical transformations in the Amber MD package. Specifically, we have implemented this new approach within the more computationally efficient and scalable pmemd MD engine that is included with the Amber MD package. The majority of the gain in efficiency comes from the improved design of the calculation, which includes better parallel scaling and reduction in the calculation of redundant terms. This new implementation is able to reproduce results from equivalent simulations run with the existing functionality, but at 2.5 times greater computational efficiency. This new implementation is also able to run softcore simulations at the λ end states, making direct calculation of free energies more accurate compared to the extrapolation required in the existing implementation. The updated alchemical transformation functionality will be included in the next major release of Amber (scheduled for release in Q1 2014) and will be available at http://ambermd.org, under the Amber license. PMID:24185531
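To illustrate why simulating at the λ end states matters, the following Python sketch applies thermodynamic integration, one common estimator for alchemical free energies. The abstract does not state which estimator is used, so this choice is an assumption. The free energy difference is approximated as the integral of the ensemble average ⟨dU/dλ⟩ over λ from 0 to 1, here with the trapezoidal rule. The function name and the sample values are invented for illustration, and nothing here calls Amber.

```python
def thermodynamic_integration(lambdas, dudl_means):
    """Trapezoidal estimate of the integral of <dU/dlambda> over lambda.

    lambdas: increasing lambda values at which simulations were run.
    dudl_means: ensemble-averaged dU/dlambda at each lambda (same length).
    """
    if len(lambdas) != len(dudl_means) or len(lambdas) < 2:
        raise ValueError("need at least two matching (lambda, <dU/dlambda>) points")
    total = 0.0
    for i in range(1, len(lambdas)):
        width = lambdas[i] - lambdas[i - 1]
        total += 0.5 * width * (dudl_means[i] + dudl_means[i - 1])
    return total


# Made-up averages (kcal/mol) on a grid that includes both end states.
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
dudl_means = [8.0, 6.0, 4.0, 2.0, 0.0]
print(thermodynamic_integration(lambdas, dudl_means))
```

If samples at λ = 0 and λ = 1 were unavailable, the integrand would have to be extrapolated to the end points before integrating, which is the source of error the abstract refers to.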
Efficient implementation of the many-body Reactive Bond Order (REBO) potential on GPU
NASA Astrophysics Data System (ADS)
Trędak, Przemysław; Rudnicki, Witold R.; Majewski, Jacek A.
2016-09-01
The second generation Reactive Bond Order (REBO) empirical potential is commonly used to accurately model a wide range of hydrocarbon materials. It is also extensible to other atom types and interactions. The REBO potential assumes a complex multi-body interaction model that is difficult to represent efficiently in the SIMD or SIMT programming model. Hence, despite its importance, no efficient GPGPU implementation has been developed for this potential. Here we present a detailed description of a highly efficient GPGPU implementation of a molecular dynamics algorithm using the REBO potential. The presented algorithm takes advantage of rarely used properties of the SIMT architecture of a modern GPU to solve difficult synchronization issues that arise in computations of multi-body potentials. Techniques developed for this problem may also be used to achieve efficient solutions of different problems. The performance of the proposed algorithm is assessed using a range of model systems. It is compared to a highly optimized CPU implementation (both single core and OpenMP) available in the LAMMPS package. These experiments show up to a 6x improvement in force computation time using a single processor of the NVIDIA Tesla K80 compared to a high end 16-core Intel Xeon processor.
Computationally Efficient Nonlinear Bell Inequalities for Quantum Networks
NASA Astrophysics Data System (ADS)
Luo, Ming-Xing
2018-04-01
The correlations in quantum networks have attracted strong interest with new types of violations of the locality. The standard Bell inequalities cannot characterize the multipartite correlations that are generated by multiple sources. The main problem is that no computationally efficient method is available for constructing useful Bell inequalities for general quantum networks. In this work, we show a significant improvement by presenting new, explicit Bell-type inequalities for general networks including cyclic networks. These nonlinear inequalities are related to the matching problem of an equivalent unweighted bipartite graph that allows constructing a polynomial-time algorithm. For the quantum resources consisting of bipartite entangled pure states and generalized Greenberger-Horne-Zeilinger (GHZ) states, we prove the generic nonmultilocality of quantum networks with multiple independent observers using new Bell inequalities. The violations are maximal with respect to the presented Tsirelson's bound for Einstein-Podolsky-Rosen states and GHZ states. Moreover, these violations hold for Werner states or some general noisy states. Our results suggest that the presented Bell inequalities can be used to characterize experimental quantum networks.
NASA Astrophysics Data System (ADS)
Sengupta, Abhronil; Roy, Kaushik
2017-12-01
Present day computers expend orders of magnitude more computational resources to perform various cognitive and perception related tasks that humans routinely perform every day. This has recently resulted in a seismic shift in the field of computation, where research efforts are being directed to develop a neurocomputer that attempts to mimic the human brain with nanoelectronic components and thereby harness its efficiency in recognition problems. Bridging the gap between neuroscience and nanoelectronics, this paper attempts to provide a review of the recent developments in the field of spintronic device based neuromorphic computing. A description of various spin-transfer torque mechanisms that can potentially be utilized for realizing device structures mimicking neural and synaptic functionalities is provided. A cross-layer perspective extending from the device to the circuit and system level is presented to envision the design of an All-Spin neuromorphic processor enabled with on-chip learning functionalities. A device-circuit-algorithm co-simulation framework calibrated to experimental results suggests that such All-Spin neuromorphic systems can potentially achieve almost two orders of magnitude energy improvement in comparison to state-of-the-art CMOS implementations.
Kruskal-Wallis-based computationally efficient feature selection for face recognition.
Ali Khan, Sajid; Hussain, Ayyaz; Basit, Abdul; Akram, Sheeraz
2014-01-01
Face recognition applications have gained considerable importance in today's technological world. Most existing work uses frontal face images for classification, but these techniques fail when applied to real-world face images. The proposed technique effectively extracts the prominent facial features. Most of the features are redundant and do not contribute to representing the face. To eliminate those redundant features, a computationally efficient algorithm is used to select the more discriminative face features. The extracted features are then passed to the classification step, in which different classifiers are combined in an ensemble to enhance the recognition accuracy, since a single classifier is unable to achieve high accuracy. Experiments are performed on standard face database images, and the results are compared with existing techniques.
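The abstract above does not give the selection rule, so the following is only a minimal sketch of how a Kruskal-Wallis statistic could be used to rank features by class separation. The function names (`average_ranks`, `kruskal_wallis_h`, `rank_features`) and the toy data are assumptions made for illustration, and the H statistic is computed without a tie correction.

```python
def average_ranks(values):
    """Return 1-based ranks of `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


def rank_features(samples, labels):
    """Score each feature column by H across classes; return (order, scores)."""
    classes = sorted(set(labels))
    scores = []
    for f in range(len(samples[0])):
        groups = [[s[f] for s, y in zip(samples, labels) if y == c]
                  for c in classes]
        scores.append(kruskal_wallis_h(groups))
    order = sorted(range(len(scores)), key=lambda f: scores[f], reverse=True)
    return order, scores


if __name__ == "__main__":
    # Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
    samples = [[1, 5], [2, 1], [3, 4], [7, 2], [8, 6], [9, 3]]
    labels = [0, 0, 0, 1, 1, 1]
    print(rank_features(samples, labels))
```

A larger H means the feature's values differ more between classes, so keeping only the top-ranked features is one plausible way to discard redundant ones.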
A fourth order accurate finite difference scheme for the computation of elastic waves
NASA Technical Reports Server (NTRS)
Bayliss, A.; Jordan, K. E.; Lemesurier, B. J.; Turkel, E.
1986-01-01
A finite difference scheme for elastic waves is introduced. The model is based on the first order system of equations for the velocities and stresses. The differencing is fourth order accurate in the spatial derivatives and second order accurate in time. The model is tested on a series of examples including the Lamb problem, scattering from plane interfaces, and scattering from a fluid-elastic interface. The scheme is shown to be effective for these problems. The accuracy and stability are insensitive to the Poisson ratio. For the class of problems considered here, it is found that the fourth order scheme requires two-thirds to one-half the resolution of a typical second order scheme to give comparable accuracy.
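To illustrate why a fourth order spatial stencil can reach a given accuracy on a coarser grid, the sketch below compares the standard centered second order and fourth order approximations of a first derivative. These are the textbook centered stencils, not necessarily the differencing used in the paper, and the test function and evaluation point are arbitrary choices.

```python
import math


def d1_second_order(f, x, h):
    """Centered second order approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def d1_fourth_order(f, x, h):
    """Centered fourth order approximation of f'(x)."""
    return (-f(x + 2 * h) + 8 * f(x + h)
            - 8 * f(x - h) + f(x - 2 * h)) / (12.0 * h)


def errors(h, x=0.3):
    """Absolute errors of both stencils for f = sin, whose derivative is cos."""
    exact = math.cos(x)
    e2 = abs(d1_second_order(math.sin, x, h) - exact)
    e4 = abs(d1_fourth_order(math.sin, x, h) - exact)
    return e2, e4


if __name__ == "__main__":
    for h in (0.2, 0.1, 0.05):
        e2, e4 = errors(h)
        print(f"h={h:<5} second order error={e2:.2e}  fourth order error={e4:.2e}")
```

Halving the spacing reduces the second order error by about a factor of 4 and the fourth order error by about a factor of 16, which is the mechanism behind the reduced resolution requirement reported above.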
Computer Controlled Portable Greenhouse Climate Control System for Enhanced Energy Efficiency
NASA Astrophysics Data System (ADS)
Datsenko, Anthony; Myer, Steve; Petties, Albert; Hustek, Ryan; Thompson, Mark
2010-04-01
This paper discusses a student project at Kettering University focusing on the design and construction of an energy efficient greenhouse climate control system. In order to maintain acceptable temperatures and stabilize temperature fluctuations in a portable plastic greenhouse economically, a computer controlled climate control system was developed to capture and store thermal energy incident on the structure during daylight periods and release the stored thermal energy during dark periods. The thermal storage mass for the greenhouse system consisted of a water filled base unit. The heat exchanger consisted of a system of PVC tubing. The control system used a programmable LabView computer interface to meet functional specifications that minimized temperature fluctuations and recorded data during operation. The greenhouse was a portable sized unit with a 5' x 5' footprint. Control input sensors were temperature, water level, and humidity sensors and output control devices were fan actuating relays and water fill solenoid valves. A Graphical User Interface was developed to monitor the system, set control parameters, and to provide programmable data recording times and intervals.
Computational Aerothermodynamics in Aeroassist Applications
NASA Technical Reports Server (NTRS)
Gnoffo, Peter A.
2001-01-01
Aeroassisted planetary entry uses atmospheric drag to decelerate spacecraft from super-orbital to orbital or suborbital velocities. Numerical simulation of flow fields surrounding these spacecraft during hypersonic atmospheric entry is required to define aerothermal loads. The severe compression in the shock layer in front of the vehicle and subsequent, rapid expansion into the wake are characterized by high temperature, thermo-chemical nonequilibrium processes. Implicit algorithms required for efficient, stable computation of the governing equations involving disparate time scales of convection, diffusion, chemical reactions, and thermal relaxation are discussed. Robust point-implicit strategies are utilized in the initialization phase; less robust but more efficient line-implicit strategies are applied in the endgame. Applications to ballutes (balloon-like decelerators) in the atmospheres of Venus, Mars, Titan, Saturn, and Neptune and a Mars Sample Return Orbiter (MSRO) are featured. Examples are discussed where time-accurate simulation is required to achieve a steady-state solution.
Development of an Efficient CFD Model for Nuclear Thermal Thrust Chamber Assembly Design
NASA Technical Reports Server (NTRS)
Cheng, Gary; Ito, Yasushi; Ross, Doug; Chen, Yen-Sen; Wang, Ten-See
2007-01-01
The objective of this effort is to develop an efficient and accurate computational methodology to predict both detailed thermo-fluid environments and global characteristics of the internal ballistics for a hypothetical solid-core nuclear thermal thrust chamber assembly (NTTCA). Several numerical and multi-physics thermo-fluid models, such as real fluid, chemically reacting, turbulence, conjugate heat transfer, porosity, and power generation, were incorporated into an unstructured-grid, pressure-based computational fluid dynamics solver as the underlying computational methodology. The numerical simulations of the detailed thermo-fluid environment of a single flow element provide a mechanism to estimate the thermal stress and possible occurrence of mid-section corrosion of the solid core. In addition, the numerical results of the detailed simulation were employed to fine tune the porosity model to mimic the pressure drop and thermal load of the coolant flow through a single flow element. The use of the tuned porosity model enables an efficient simulation of the entire NTTCA system and the evaluation of its performance during the design cycle.
Validation of an Accurate Three-Dimensional Helical Slow-Wave Circuit Model
NASA Technical Reports Server (NTRS)
Kory, Carol L.
1997-01-01
The helical slow-wave circuit embodies a helical coil of rectangular tape supported in a metal barrel by dielectric support rods. Although the helix slow-wave circuit remains the mainstay of the traveling-wave tube (TWT) industry because of its exceptionally wide bandwidth, a full helical circuit, without significant dimensional approximations, has not been successfully modeled until now. Numerous attempts have been made to analyze the helical slow-wave circuit so that the performance could be accurately predicted without actually building it, but because of its complex geometry, many geometrical approximations became necessary rendering the previous models inaccurate. In the course of this research it has been demonstrated that using the simulation code, MAFIA, the helical structure can be modeled with actual tape width and thickness, dielectric support rod geometry and materials. To demonstrate the accuracy of the MAFIA model, the cold-test parameters including dispersion, on-axis interaction impedance and attenuation have been calculated for several helical TWT slow-wave circuits with a variety of support rod geometries including rectangular and T-shaped rods, as well as various support rod materials including isotropic, anisotropic and partially metal coated dielectrics. Compared with experimentally measured results, the agreement is excellent. With the accuracy of the MAFIA helical model validated, the code was used to investigate several conventional geometric approximations in an attempt to obtain the most computationally efficient model. Several simplifications were made to a standard model including replacing the helical tape with filaments, and replacing rectangular support rods with shapes conforming to the cylindrical coordinate system with effective permittivity. The approximate models are compared with the standard model in terms of cold-test characteristics and computational time. The model was also used to determine the sensitivity of various
Efficient and Robust Optimization for Building Energy Simulation
Pourarian, Shokouh; Kearsley, Anthony; Wen, Jin; Pertzborn, Amanda
2016-01-01
Efficiently, robustly and accurately solving large sets of structured, non-linear algebraic and differential equations is one of the most computationally expensive steps in the dynamic simulation of building energy systems. Here, the efficiency, robustness and accuracy of two commonly employed solution methods are compared. The comparison is conducted using the HVACSIM+ software package, a component based building system simulation tool. The HVACSIM+ software presently employs Powell’s Hybrid method to solve systems of nonlinear algebraic equations that model the dynamics of energy states and interactions within buildings. It is shown here that Powell’s method does not always converge to a solution. Since a myriad of other numerical methods are available, the question arises as to which method is most appropriate for building energy simulation. This paper finds that considerable computational benefits result from replacing the Powell’s Hybrid method solver in HVACSIM+ with a solver more appropriate for the challenges particular to numerical simulations of buildings. Evidence is provided that a variant of the Levenberg-Marquardt solver has superior accuracy and robustness compared to the Powell’s Hybrid method presently used in HVACSIM+. PMID:27325907
2016-01-01
An important challenge in the simulation of biomolecular systems is a quantitative description of the protonation and deprotonation process of amino acid residues. Despite the seeming simplicity of adding or removing a positively charged hydrogen nucleus, simulating the actual protonation/deprotonation process is inherently difficult. It requires both the explicit treatment of the excess proton, including its charge defect delocalization and Grotthuss shuttling through inhomogeneous moieties (water and amino residues), and extensive sampling of coupled condensed phase motions. In a recent paper (J. Chem. Theory Comput. 2014, 10, 2729−2737), a multiscale approach was developed to map high-level quantum mechanics/molecular mechanics (QM/MM) data into a multiscale reactive molecular dynamics (MS-RMD) model in order to describe amino acid deprotonation in bulk water. In this article, we extend the fitting approach (called FitRMD) to create MS-RMD models for ionizable amino acids within proteins. The resulting models are shown to faithfully reproduce the free energy profiles of the reference QM/MM Hamiltonian for PT inside an example protein, the ClC-ec1 H+/Cl– antiporter. Moreover, we show that the resulting MS-RMD models are computationally efficient enough to then characterize more complex 2-dimensional free energy surfaces due to slow degrees of freedom such as water hydration of internal protein cavities that can be inherently coupled to the excess proton charge translocation. The FitRMD method is thus shown to be an effective way to map ab initio level accuracy into a much more computationally efficient reactive MD method in order to explicitly simulate and quantitatively describe amino acid protonation/deprotonation in proteins. PMID:26734942
Lee, Sangyun; Liang, Ruibin; Voth, Gregory A; Swanson, Jessica M J
2016-02-09
An important challenge in the simulation of biomolecular systems is a quantitative description of the protonation and deprotonation process of amino acid residues. Despite the seeming simplicity of adding or removing a positively charged hydrogen nucleus, simulating the actual protonation/deprotonation process is inherently difficult. It requires both the explicit treatment of the excess proton, including its charge defect delocalization and Grotthuss shuttling through inhomogeneous moieties (water and amino residues), and extensive sampling of coupled condensed phase motions. In a recent paper (J. Chem. Theory Comput. 2014, 10, 2729-2737), a multiscale approach was developed to map high-level quantum mechanics/molecular mechanics (QM/MM) data into a multiscale reactive molecular dynamics (MS-RMD) model in order to describe amino acid deprotonation in bulk water. In this article, we extend the fitting approach (called FitRMD) to create MS-RMD models for ionizable amino acids within proteins. The resulting models are shown to faithfully reproduce the free energy profiles of the reference QM/MM Hamiltonian for PT inside an example protein, the ClC-ec1 H(+)/Cl(-) antiporter. Moreover, we show that the resulting MS-RMD models are computationally efficient enough to then characterize more complex 2-dimensional free energy surfaces due to slow degrees of freedom such as water hydration of internal protein cavities that can be inherently coupled to the excess proton charge translocation. The FitRMD method is thus shown to be an effective way to map ab initio level accuracy into a much more computationally efficient reactive MD method in order to explicitly simulate and quantitatively describe amino acid protonation/deprotonation in proteins.
Efficient Integrative Multi-SNP Association Analysis via Deterministic Approximation of Posteriors.
Wen, Xiaoquan; Lee, Yeji; Luca, Francesca; Pique-Regi, Roger
2016-06-02
With the increasing availability of functional genomic data, incorporating genomic annotations into genetic association analysis has become a standard procedure. However, the existing methods often lack rigor and/or computational efficiency and consequently do not maximize the utility of functional annotations. In this paper, we propose a rigorous inference procedure to perform integrative association analysis incorporating genomic annotations for both traditional GWASs and emerging molecular QTL mapping studies. In particular, we propose an algorithm, named deterministic approximation of posteriors (DAP), which enables highly efficient and accurate joint enrichment analysis and identification of multiple causal variants. We use a series of simulation studies to highlight the power and computational efficiency of our proposed approach and further demonstrate it by analyzing the cross-population eQTL data from the GEUVADIS project and the multi-tissue eQTL data from the GTEx project. In particular, we find that genetic variants predicted to disrupt transcription factor binding sites are enriched in cis-eQTLs across all tissues. Moreover, the enrichment estimates obtained across the tissues are correlated with the cell types for which the annotations are derived. Copyright © 2016 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Tanaka, T.; Tachikawa, Y.; Ichikawa, Y.; Yorozu, K.
2017-12-01
Floods are among the most hazardous disasters and cause serious damage to people and property around the world. To prevent or mitigate flood damage through early warning systems and/or river management planning, numerical modelling of flood-inundation processes is essential. In the literature, flood-inundation models have been extensively developed and improved to achieve flood flow simulation with complex topography at high resolution. With increasing demands on flood-inundation modelling, its computational burden is now one of the key issues. Improvements in the computational efficiency of the full shallow water equations have been made from various perspectives, such as approximations of the momentum equations, parallelization techniques, and coarsening approaches. To complement these techniques and further improve the computational efficiency of flood-inundation simulations, this study proposes an Automatic Domain Updating (ADU) method for 2-D flood-inundation simulation. The ADU method traces the wet and dry interface and automatically updates the simulation domain in response to the progress and recession of flood propagation. The updating algorithm is as follows: first, the simulation cells potentially flooded at the initial stage (such as floodplains near river channels) are registered; then, whenever a registered cell is flooded, its surrounding cells are registered. The time for this additional process is kept small by checking only cells at the wet and dry interface. The computation time is reduced by skipping the processing of non-flooded areas. This algorithm is easily applied to any type of 2-D flood-inundation model. The proposed ADU method is implemented in 2-D local inertial equations for the Yodo River basin, Japan. Case studies for two flood events show that the simulation finishes 2 to 10 times faster while giving the same result as that without the ADU method.
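The registration rule described above can be sketched on a toy grid. The code below is an illustration under stated assumptions, not the authors' implementation: the flow model is a simple nearest-neighbour diffusion of water depth (a stand-in for the local inertial equations), only domain expansion is handled (not recession), and the names `update_domain` and `simulate` are hypothetical.

```python
def neighbors(cell, nrows, ncols):
    """Yield the 4-connected neighbours of `cell` that lie inside the grid."""
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < nrows and 0 <= cc < ncols:
            yield (rr, cc)


def update_domain(registered, frontier, depth, nrows, ncols):
    """Register the neighbours of interface cells that have become wet.

    Only cells in `frontier` (registered but not yet seen wet) are checked,
    so the bookkeeping cost scales with the wet/dry interface length.
    """
    new_frontier = set()
    for cell in frontier:
        if depth.get(cell, 0.0) > 0.0:
            for nb in neighbors(cell, nrows, ncols):
                if nb not in registered:
                    registered.add(nb)
                    new_frontier.add(nb)
        else:
            new_frontier.add(cell)  # still dry: keep watching it
    return new_frontier


def simulate(nrows, ncols, source, steps, use_adu=True, k=0.2, inflow=1.0):
    """Toy inundation run; returns (depth dict, number of cell updates)."""
    if use_adu:
        registered = {source} | set(neighbors(source, nrows, ncols))
    else:
        registered = {(r, c) for r in range(nrows) for c in range(ncols)}
    frontier = set(registered)
    depth = {}
    cell_updates = 0
    for _ in range(steps):
        depth[source] = depth.get(source, 0.0) + inflow  # river inflow
        new_depth = {}
        for cell in registered:  # unregistered (dry) cells are skipped
            d = depth.get(cell, 0.0)
            flux = sum(depth.get(nb, 0.0) - d
                       for nb in neighbors(cell, nrows, ncols))
            new_depth[cell] = d + k * flux
        depth = new_depth
        cell_updates += len(registered)
        if use_adu:
            frontier = update_domain(registered, frontier, depth, nrows, ncols)
    return depth, cell_updates


if __name__ == "__main__":
    _, n_adu = simulate(41, 41, (20, 20), 10, use_adu=True)
    _, n_full = simulate(41, 41, (20, 20), 10, use_adu=False)
    print("cell updates with ADU:", n_adu, " without:", n_full)
```

Because water moves at most one cell per step in this toy model, every wet cell always has all of its neighbours registered, so the restricted run reproduces the full-grid result while updating far fewer cells.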
Advanced Computational Methods in Bio-Mechanics.
Al Qahtani, Waleed M S; El-Anwar, Mohamed I
2018-04-15
A novel partnership between surgeons and machines, made possible by advances in computing and engineering technology, could overcome many of the limitations of traditional surgery. By extending surgeons' ability to plan and carry out surgical interventions more accurately and with less trauma, computer-integrated surgery (CIS) systems could help to improve clinical outcomes and the efficiency of healthcare delivery. CIS systems could have a similar impact on surgery to that long since realised in computer-integrated manufacturing. Mathematical modelling and computer simulation have proved tremendously successful in engineering. Computational mechanics has enabled technological developments in virtually every area of our lives. One of the greatest challenges for mechanists is to extend the success of computational mechanics to fields outside traditional engineering, in particular to biology, the biomedical sciences, and medicine. Biomechanics has significant potential for applications in the orthopaedic industry and the performance arts, since the skills needed for these activities are visibly related to the human musculoskeletal and nervous systems. Although biomechanics is now widely used in the orthopaedic industry to design orthopaedic implants for human joints, dental parts, external fixations and other medical purposes, numerous research projects funded by billions of dollars are still under way to build a new future for sports and human healthcare in what is called the biomechanics era.
Computational efficiency and Amdahl’s law for the adaptive resolution simulation technique
DOE Office of Scientific and Technical Information (OSTI.GOV)
Junghans, Christoph; Agarwal, Animesh; Delle Site, Luigi
Here, we discuss the computational performance of the adaptive resolution technique in molecular simulation when it is compared with equivalent full coarse-grained and full atomistic simulations. We show that an estimate of its efficiency, within 10%–15% accuracy, is given by Amdahl’s Law adapted to the specific quantities involved in the problem. The derivation of the predictive formula is general enough that it may be applied to the general case of molecular dynamics approaches where a reduction of degrees of freedom in a multiscale fashion occurs.
Computational efficiency and Amdahl’s law for the adaptive resolution simulation technique
Junghans, Christoph; Agarwal, Animesh; Delle Site, Luigi
2017-06-01
Here, we discuss the computational performance of the adaptive resolution technique in molecular simulation when it is compared with equivalent full coarse-grained and full atomistic simulations. We show that an estimate of its efficiency, within 10%–15% accuracy, is given by Amdahl’s Law adapted to the specific quantities involved in the problem. The derivation of the predictive formula is general enough that it may be applied to the general case of molecular dynamics approaches where a reduction of degrees of freedom in a multiscale fashion occurs.
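The adapted formula itself is not given in the abstract, so the sketch below shows only the classical form of Amdahl's Law that it builds on. Reading `p` as the fraction of the all-atom runtime spent on the region that is switched to a cheaper coarse-grained description, and `s` as the per-region speedup, is an assumption made for illustration.

```python
def amdahl_speedup(p, s):
    """Classical Amdahl's Law: overall speedup when a fraction `p` of the
    original runtime is accelerated by a factor `s` and the rest is unchanged.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if s <= 0.0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


if __name__ == "__main__":
    # Hypothetical numbers: 80% of the cost is in the coarse-grained region,
    # which runs 10 times faster than its atomistic counterpart.
    print(f"overall speedup: {amdahl_speedup(0.8, 10.0):.2f}")
```

The part of the system that stays at full resolution bounds the achievable gain: as `s` grows, the speedup approaches `1 / (1 - p)`.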
ERIC Educational Resources Information Center
Anglin, Linda; Anglin, Kenneth; Schumann, Paul L.; Kaliski, John A.
2008-01-01
This study tests the use of computer-assisted grading rubrics compared to other grading methods with respect to the efficiency and effectiveness of different grading processes for subjective assignments. The test was performed on a large Introduction to Business course. The students in this course were randomly assigned to four treatment groups…
Cross hole GPR traveltime inversion using a fast and accurate neural network as a forward model
NASA Astrophysics Data System (ADS)
Mejer Hansen, Thomas
2017-04-01
Probabilistically formulated inverse problems can be solved using Monte Carlo based sampling methods. In principle, both advanced prior information, such as that based on geostatistics, and complex non-linear forward physical models can be considered. However, these methods can be associated with huge computational costs that in practice limit their application. This is not least due to the computational requirements related to solving the forward problem, where the physical response of some earth model has to be evaluated. Here, it is suggested to replace a numerically complex evaluation of the forward problem with a trained neural network that can be evaluated very fast. This will introduce a modeling error, which is quantified probabilistically such that it can be accounted for during inversion. This allows a very fast and efficient Monte Carlo sampling of the solution to an inverse problem. We demonstrate the methodology for first arrival travel time inversion of cross hole ground-penetrating radar (GPR) data. An accurate forward model, based on 2D full-waveform modeling followed by automatic travel time picking, is replaced by a fast neural network. This provides a sampling algorithm three orders of magnitude faster than using the full forward model, and considerably faster, and more accurate, than commonly used approximate forward models. The methodology has the potential to dramatically change the complexity of the types of inverse problems that can be solved using non-linear Monte Carlo sampling techniques.
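The workflow described above (fast surrogate, probabilistically quantified modeling error, Monte Carlo sampling) can be sketched in one dimension. Everything concrete here is an assumption for illustration: `accurate_forward` and `surrogate_forward` are toy functions standing in for the full-waveform model and the trained network, the modeling error is summarized as a Gaussian bias and standard deviation estimated from prior samples, and the sampler is a plain random-walk Metropolis algorithm with a uniform prior.

```python
import math
import random


def accurate_forward(m):
    """Toy stand-in for an expensive, accurate forward model."""
    return m + 0.1 * math.sin(m)


def surrogate_forward(m):
    """Toy stand-in for a fast approximation of the forward model."""
    return m


def modeling_error_stats(rng, n=2000, lo=0.0, hi=4.0):
    """Estimate mean and std of (accurate - surrogate) over prior samples."""
    diffs = []
    for _ in range(n):
        m = rng.uniform(lo, hi)
        diffs.append(accurate_forward(m) - surrogate_forward(m))
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean, math.sqrt(var)


def log_likelihood(m, d_obs, sigma_d, bias, sigma_t):
    """Gaussian log-likelihood with data noise and modeling error combined."""
    var = sigma_d ** 2 + sigma_t ** 2
    resid = d_obs - (surrogate_forward(m) + bias)
    return -0.5 * resid * resid / var


def run_metropolis(d_obs, sigma_d, n_iter=20000, burn_in=2000,
                   step=0.2, lo=0.0, hi=4.0, seed=0):
    """Random-walk Metropolis using only the surrogate inside the loop."""
    rng = random.Random(seed)
    bias, sigma_t = modeling_error_stats(rng, lo=lo, hi=hi)
    m = rng.uniform(lo, hi)
    logl = log_likelihood(m, d_obs, sigma_d, bias, sigma_t)
    samples = []
    for it in range(n_iter):
        prop = m + rng.gauss(0.0, step)
        if lo <= prop <= hi:  # uniform prior: reject proposals outside it
            logl_prop = log_likelihood(prop, d_obs, sigma_d, bias, sigma_t)
            if rng.random() < math.exp(min(0.0, logl_prop - logl)):
                m, logl = prop, logl_prop
        if it >= burn_in:
            samples.append(m)
    return samples, bias, sigma_t


if __name__ == "__main__":
    d_obs = accurate_forward(2.0)  # synthetic noise-free observation
    samples, bias, sigma_t = run_metropolis(d_obs, sigma_d=0.05)
    mean = sum(samples) / len(samples)
    print(f"posterior mean {mean:.3f}, modeling error std {sigma_t:.3f}")
```

The accurate model is called only when estimating the modeling error statistics, while every sampling iteration uses the cheap surrogate, which is where the speedup in such a scheme would come from.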
High-Performance Computing Data Center Efficiency Dashboard | Computational
recovery water (ERW) loop Heat exchanger for energy recovery Thermosyphon Heat exchanger between ERW loop and cooling tower loop Evaporative cooling towers Learn more about our energy-efficient facility
NASA Astrophysics Data System (ADS)
Beck, Jeffrey; Bos, Jeremy P.
2017-05-01
We compare several modifications to the open-source wave optics package, WavePy, intended to improve execution time. Specifically, we compare the relative performance of the Intel MKL, a CPU-based OpenCV distribution, and a GPU-based version. Performance is compared between distributions both on the same compute platform and between a fully-featured computing workstation and the NVIDIA Jetson TX1 platform. Comparisons are drawn in terms of both execution time and power consumption. We have found that substituting the Fast Fourier Transform operation from OpenCV provides a marked improvement on all platforms. In addition, we show that embedded platforms offer some possibility for extensive improvement in terms of efficiency compared to a fully featured workstation.
NASA Astrophysics Data System (ADS)
Allphin, Devin
Computational fluid dynamics (CFD) solution approximations for complex fluid flow problems have become a common and powerful engineering analysis technique. These tools, though qualitatively useful, remain limited in practice by their underlying inverse relationship between simulation accuracy and overall computational expense. While a great volume of research has focused on remedying these issues inherent to CFD, one traditionally overlooked area of resource reduction for engineering analysis concerns the basic definition and determination of functional relationships for the studied fluid flow variables. This artificial relationship-building technique, called meta-modeling or surrogate/offline approximation, uses design of experiments (DOE) theory to efficiently approximate non-physical coupling between the variables of interest in a fluid flow analysis problem. By mathematically approximating these variables, DOE methods can effectively reduce the required quantity of CFD simulations, freeing computational resources for other analytical focuses. An idealized interpretation of a fluid flow problem can also be employed to create suitably accurate approximations of fluid flow variables for the purposes of engineering analysis. When used in parallel with a meta-modeling approximation, a closed-form approximation can provide useful feedback concerning proper construction, suitability, or even necessity of an offline approximation tool. It also provides a short-circuit pathway for further reducing the overall computational demands of a fluid flow analysis, again freeing resources for otherwise unsuitable resource expenditures. To validate these inferences, a design optimization problem was presented requiring the inexpensive estimation of aerodynamic forces applied to a valve operating on a simulated piston-cylinder heat engine. The determination of these forces was to be found using parallel surrogate and exact approximation methods, thus evidencing the comparative
Accurate Bit Error Rate Calculation for Asynchronous Chaos-Based DS-CDMA over Multipath Channel
NASA Astrophysics Data System (ADS)
Kaddoum, Georges; Roviras, Daniel; Chargé, Pascal; Fournier-Prunaret, Daniele
2009-12-01
An accurate approach to compute the bit error rate expression for a multiuser chaos-based DS-CDMA system is presented in this paper. For a more realistic communication system, a slow fading multipath channel is considered. A simple RAKE receiver structure is considered. Based on the bit energy distribution, this approach gives accurate results at low computational cost compared to other computation methods in the literature. Perfect estimation of the channel coefficients with the associated delays and chaos synchronization is assumed. The bit error rate is derived in terms of the bit energy distribution, the number of paths, the noise variance, and the number of users. Results are illustrated by theoretical calculations and numerical simulations which point out the accuracy of our approach.
Accurate Monotonicity-Preserving Schemes With Runge-Kutta Time Stepping
NASA Technical Reports Server (NTRS)
Suresh, A.; Huynh, H. T.
1997-01-01
A new class of high-order monotonicity-preserving schemes for the numerical solution of conservation laws is presented. The interface value in these schemes is obtained by limiting a higher-order polynomial reconstruction. The limiting is designed to preserve accuracy near extrema and to work well with Runge-Kutta time stepping. Computational efficiency is enhanced by a simple test that determines whether the limiting procedure is needed. For linear advection in one dimension, these schemes are shown to be monotonicity-preserving and uniformly high-order accurate. Numerical experiments for advection as well as the Euler equations also confirm their high accuracy, good shock resolution, and computational efficiency.
NASA Astrophysics Data System (ADS)
Wu, Zedong; Alkhalifah, Tariq
2018-07-01
Numerical simulation of the acoustic wave equation in either isotropic or anisotropic media is crucial to seismic modeling, imaging and inversion. Actually, it represents the core computation cost of these highly advanced seismic processing methods. However, the conventional finite-difference method suffers from severe numerical dispersion errors and S-wave artifacts when solving the acoustic wave equation for anisotropic media. We propose a method to obtain the finite-difference coefficients by comparing its numerical dispersion with the exact form. We find the optimal finite difference coefficients that share the dispersion characteristics of the exact equation with minimal dispersion error. The method is extended to solve the acoustic wave equation in transversely isotropic (TI) media without S-wave artifacts. Numerical examples show that the method is highly accurate and efficient.
Multiscale solvers and systematic upscaling in computational physics
NASA Astrophysics Data System (ADS)
Brandt, A.
2005-07-01
Multiscale algorithms can overcome the scale-born bottlenecks that plague most computations in physics. These algorithms employ separate processing at each scale of the physical space, combined with interscale iterative interactions, in ways which use finer scales very sparingly. Having been developed first and well known as multigrid solvers for partial differential equations, highly efficient multiscale techniques have more recently been developed for many other types of computational tasks, including: inverse PDE problems; highly indefinite (e.g., standing wave) equations; Dirac equations in disordered gauge fields; fast computation and updating of large determinants (as needed in QCD); fast integral transforms; integral equations; astrophysics; molecular dynamics of macromolecules and fluids; many-atom electronic structures; global and discrete-state optimization; practical graph problems; image segmentation and recognition; tomography (medical imaging); fast Monte-Carlo sampling in statistical physics; and general, systematic methods of upscaling (accurate numerical derivation of large-scale equations from microscopic laws).
Dual computer monitors to increase efficiency of conducting systematic reviews.
Wang, Zhen; Asi, Noor; Elraiyah, Tarig A; Abu Dabrh, Abd Moain; Undavalli, Chaitanya; Glasziou, Paul; Montori, Victor; Murad, Mohammad Hassan
2014-12-01
Systematic reviews (SRs) are the cornerstone of evidence-based medicine. In this study, we evaluated the effectiveness of using two computer screens on the efficiency of conducting SRs. A cohort of reviewers before and after using dual monitors were compared with a control group that did not use dual monitors. The outcomes were time spent for abstract screening, full-text screening and data extraction, and inter-rater agreement. We adopted multivariate difference-in-differences linear regression models. A total of 60 SRs conducted by 54 reviewers were included in this analysis. We found a significant reduction of 23.81 minutes per article in data extraction in the intervention group relative to the control group (95% confidence interval: -46.03, -1.58, P = 0.04), which was a 36.85% reduction in time. There was no significant difference in time spent on abstract screening, full-text screening, or inter-rater agreement between the two groups. Using dual monitors when conducting SRs is associated with significant reduction of time spent on data extraction. No significant difference was observed on time spent on abstract screening or full-text screening. Using dual monitors is one strategy that may improve the efficiency of conducting SRs. Copyright © 2014 Elsevier Inc. All rights reserved.
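As a rough illustration of the difference-in-differences idea used in this study, the sketch below computes the unadjusted two-group, two-period estimate. The study itself used multivariate regression models, so this is a simplification, and the minute values are hypothetical, not data from the paper. The last lines show the baseline implied by the reported figures, assuming the 36.85% is taken relative to that baseline.

```python
def mean(values):
    return sum(values) / len(values)

def did_estimate(treated_before, treated_after, control_before, control_after):
    """Unadjusted difference-in-differences: change in the treated group
    minus change in the control group."""
    return ((mean(treated_after) - mean(treated_before))
            - (mean(control_after) - mean(control_before)))

# Hypothetical data-extraction times in minutes per article.
estimate = did_estimate(
    treated_before=[60, 70], treated_after=[40, 46],
    control_before=[62, 66], control_after=[60, 64],
)
print(estimate)

# Baseline implied by the abstract's figures, if the 23.81-minute
# reduction corresponds to 36.85% of the baseline time.
implied_baseline = 23.81 / 0.3685
print(round(implied_baseline, 2))
```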
Peppas, Kostas P; Lazarakis, Fotis; Alexandridis, Antonis; Dangakis, Kostas
2012-08-01
In this Letter we investigate the error performance of multiple-input multiple-output free-space optical communication systems employing intensity modulation/direct detection and operating over strong atmospheric turbulence channels. Atmospheric-induced strong turbulence fading is modeled using the negative exponential distribution. For the considered system, an approximate yet accurate analytical expression for the average bit error probability is derived and an efficient method for its numerical evaluation is proposed. Numerically evaluated and computer simulation results are further provided to demonstrate the validity of the proposed mathematical analysis.
Lippert, Christoph; Xiang, Jing; Horta, Danilo; Widmer, Christian; Kadie, Carl; Heckerman, David; Listgarten, Jennifer
2014-01-01
Motivation: Set-based variance component tests have been identified as a way to increase power in association studies by aggregating weak individual effects. However, the choice of test statistic has been largely ignored even though it may play an important role in obtaining optimal power. We compared a standard statistical test (a score test) with a recently developed likelihood ratio (LR) test. Further, when correction for hidden structure is needed, or gene–gene interactions are sought, state-of-the-art algorithms for both the score and LR tests can be computationally impractical. Thus we develop new computationally efficient methods. Results: After reviewing theoretical differences in performance between the score and LR tests, we find empirically on real data that the LR test generally has more power. In particular, on 15 of 17 real datasets, the LR test yielded at least as many associations as the score test (up to 23 more associations), whereas the score test yielded at most one more association than the LR test in the two remaining datasets. On synthetic data, we find that the LR test yielded up to 12% more associations, consistent with our results on real data, but also observe a regime of extremely small signal where the score test yielded up to 25% more associations than the LR test, consistent with theory. Finally, our computational speedups now enable (i) efficient LR testing when the background kernel is full rank, and (ii) efficient score testing when the background kernel changes with each test, as for gene–gene interaction tests. The latter yielded a factor of 2000 speedup on a cohort of size 13 500. Availability: Software available at http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/Fastlmm/. Contact: heckerma@microsoft.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25075117
Lippert, Christoph; Xiang, Jing; Horta, Danilo; Widmer, Christian; Kadie, Carl; Heckerman, David; Listgarten, Jennifer
2014-11-15
Set-based variance component tests have been identified as a way to increase power in association studies by aggregating weak individual effects. However, the choice of test statistic has been largely ignored even though it may play an important role in obtaining optimal power. We compared a standard statistical test (a score test) with a recently developed likelihood ratio (LR) test. Further, when correction for hidden structure is needed, or gene-gene interactions are sought, state-of-the-art algorithms for both the score and LR tests can be computationally impractical. Thus we develop new computationally efficient methods. After reviewing theoretical differences in performance between the score and LR tests, we find empirically on real data that the LR test generally has more power. In particular, on 15 of 17 real datasets, the LR test yielded at least as many associations as the score test (up to 23 more associations), whereas the score test yielded at most one more association than the LR test in the two remaining datasets. On synthetic data, we find that the LR test yielded up to 12% more associations, consistent with our results on real data, but also observe a regime of extremely small signal where the score test yielded up to 25% more associations than the LR test, consistent with theory. Finally, our computational speedups now enable (i) efficient LR testing when the background kernel is full rank, and (ii) efficient score testing when the background kernel changes with each test, as for gene-gene interaction tests. The latter yielded a factor of 2000 speedup on a cohort of size 13 500. Software available at http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/Fastlmm/. Contact: heckerma@microsoft.com. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
Prakash, Jaya; Yalavarthy, Phaneendra K
2013-03-01
This work develops a computationally efficient automated method for the optimal choice of regularization parameter in diffuse optical tomography. The least-squares QR (LSQR)-type method that uses Lanczos bidiagonalization is known to be computationally efficient in performing the reconstruction procedure in diffuse optical tomography. The same is effectively deployed via an optimization procedure that uses the simplex method to find the optimal regularization parameter. The proposed LSQR-type method is compared with traditional methods such as the L-curve, generalized cross-validation (GCV), and the recently proposed minimal residual method (MRM)-based choice of regularization parameter, using numerical and experimental phantom data. The results indicate that the performance of the proposed LSQR-type and MRM-based methods in terms of reconstructed image quality is similar, and superior to that of the L-curve and GCV-based methods. The computational complexity of the proposed method is at least five times lower than that of the MRM-based method, making it an optimal technique. The LSQR-type method was able to overcome the inherent limitation of the computationally expensive MRM-based automated way of finding the optimal regularization parameter in diffuse optical tomographic imaging, making this method more suitable for deployment in real time.
A space-efficient quantum computer simulator suitable for high-speed FPGA implementation
NASA Astrophysics Data System (ADS)
Frank, Michael P.; Oniciuc, Liviu; Meyer-Baese, Uwe H.; Chiorescu, Irinel
2009-05-01
Conventional vector-based simulators for quantum computers are quite limited in the size of the quantum circuits they can handle, due to the worst-case exponential growth of even sparse representations of the full quantum state vector as a function of the number of quantum operations applied. However, this exponential-space requirement can be avoided by using general space-time tradeoffs long known to complexity theorists, which can be appropriately optimized for this particular problem in a way that also illustrates some interesting reformulations of quantum mechanics. In this paper, we describe the design and empirical space/time complexity measurements of a working software prototype of a quantum computer simulator that avoids excessive space requirements. Due to its space-efficiency, this design is well-suited to embedding in single-chip environments, permitting especially fast execution that avoids access latencies to main memory. We plan to prototype our design on a standard FPGA development board.
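The abstract does not say which space-time tradeoff the simulator uses. One well-known example is the Feynman sum over paths, sketched below as an assumption rather than as the authors' design. It computes a single amplitude by recursion over the gates, so memory grows with circuit depth instead of with 2**n, at the price of time that is exponential in the number of branching gates. The gate encoding and function names are hypothetical.

```python
import math

SQRT_HALF = 1.0 / math.sqrt(2.0)

def predecessors(gate, y):
    """Yield (z, <y|G|z>) for every basis state z with a nonzero matrix element."""
    kind = gate[0]
    if kind == "H":                      # Hadamard on qubit q
        q = gate[1]
        bit_y = (y >> q) & 1
        for bit_z in (0, 1):
            z = (y & ~(1 << q)) | (bit_z << q)
            sign = -1.0 if (bit_y and bit_z) else 1.0
            yield z, sign * SQRT_HALF
    elif kind == "CNOT":                 # control c, target t; self-inverse
        c, t = gate[1], gate[2]
        z = y ^ (1 << t) if (y >> c) & 1 else y
        yield z, 1.0
    else:
        raise ValueError("unsupported gate: %r" % (kind,))

def amplitude(circuit, x, y, depth=None):
    """Return <y| G_T ... G_1 |x> without ever storing a full state vector."""
    if depth is None:
        depth = len(circuit)
    if depth == 0:
        return 1.0 if x == y else 0.0
    return sum(w * amplitude(circuit, x, z, depth - 1)
               for z, w in predecessors(circuit[depth - 1], y))

# Bell-state circuit on two qubits, starting from |00>.
bell = [("H", 0), ("CNOT", 0, 1)]
for y in range(4):
    print(format(y, "02b"), amplitude(bell, 0, y))
```

Only the recursion stack is kept in memory, which is the kind of property that makes a design attractive for on-chip implementation; the cost is that amplitudes are recomputed rather than cached.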
Developments toward more accurate molecular modeling of liquids
NASA Astrophysics Data System (ADS)
Evans, Tom J.
2000-12-01
The general goal of this research has been to improve upon existing combined quantum mechanics/molecular mechanics (QM/MM) methodologies. Error weighting functions have been introduced into the perturbative Monte Carlo (PMC) method for use with QM/MM. The PMC approach, introduced earlier, provides a means to reduce the number of full self-consistent field (SCF) calculations in simulations using the QM/MM potential by invoking perturbation theory to calculate energy changes due to displacements of an MM molecule. This will allow the ab initio QM/MM approach to be applied to systems that require more advanced, computationally demanding treatments of the QM and/or MM regions. Efforts have also been made to improve the accuracy of the representation of the solvent molecules usually represented by MM force fields. Results from an investigation of the applicability of the embedded density functional theory (EDFT) for studying physical properties of solutions will be presented. In this approach, the solute wavefunction is solved self-consistently in the field of individually frozen electron-density solvent molecules. To test its accuracy, the potential curves for interactions between Li+, Cl- and H2O with a single frozen-density H2O molecule in different orientations have been calculated. With the development of the more sophisticated effective fragment potential (EFP) representation of solvent molecules, a QM/EFP technique was created. This hybrid QM/EFP approach was used to investigate the solvation of Li+ by small clusters of water, as a test case for larger ionic clusters. The EFP appears to provide an accurate representation of the strong interactions that exist between Li+ and H2O. With the QM/EFP methodology comes an increased computational expense, resulting in an even greater need to rely on the PMC approach. However, while including the PMC into the hybrid QM/EFP technique, it was discovered that the previous implementation of the PMC was done incorrectly
Thierbach, Adrian; Neiss, Christian; Gallandi, Lukas; Marom, Noa; Körzdörfer, Thomas; Görling, Andreas
2017-10-10
An accurate yet computationally very efficient and formally well justified approach to calculate molecular ionization potentials is presented and tested. The first as well as higher ionization potentials are obtained as the negatives of the Kohn-Sham eigenvalues of the neutral molecule after adjusting the eigenvalues by a recently [Görling, Phys. Rev. B 2015, 91, 245120] introduced potential adjustor for exchange-correlation potentials. Technically the method is very simple. Besides a Kohn-Sham calculation of the neutral molecule, only a second Kohn-Sham calculation of the cation is required. The eigenvalue spectrum of the neutral molecule is shifted such that the negative of the eigenvalue of the highest occupied molecular orbital equals the energy difference of the total electronic energies of the cation minus the neutral molecule. For the first ionization potential this simply amounts to a ΔSCF calculation. Then, the higher ionization potentials are obtained as the negatives of the correspondingly shifted Kohn-Sham eigenvalues. Importantly, this shift of the Kohn-Sham eigenvalue spectrum is not just ad hoc. In fact, it is formally necessary for the physically correct energetic adjustment of the eigenvalue spectrum as it results from ensemble density-functional theory. An analogous approach for electron affinities is equally well obtained and justified. To illustrate the practical benefits of the approach, we calculate the valence ionization energies of test sets of small- and medium-sized molecules and photoelectron spectra of medium-sized electron acceptor molecules using a typical semilocal (PBE) and two typical global hybrid functionals (B3LYP and PBE0). The potential adjusted B3LYP and PBE0 eigenvalues yield valence ionization potentials that are in very good agreement with experimental values, reaching an accuracy that is as good as the best G0W0 methods, however, at much lower computational costs. The potential adjusted PBE eigenvalues result in
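The eigenvalue shift described in this abstract reduces to a few lines of arithmetic, sketched below. The function name and all numerical values are hypothetical and chosen only for illustration; they are not results from the paper. The sketch assumes the eigenvalues and total energies are given in the same unit.

```python
def adjusted_ionization_potentials(eigenvalues, e_neutral, e_cation):
    """Shift the occupied Kohn-Sham eigenvalues of the neutral molecule so
    that the negative of the HOMO eigenvalue equals E(cation) - E(neutral),
    then return the negatives of the shifted eigenvalues, HOMO first."""
    homo = max(eigenvalues)
    first_ip = e_cation - e_neutral       # the Delta-SCF first ionization potential
    shift = -first_ip - homo              # ensures -(homo + shift) == first_ip
    return [-(eps + shift) for eps in sorted(eigenvalues, reverse=True)]

# Hypothetical occupied eigenvalues and total energies (hartree).
ips = adjusted_ionization_potentials(
    eigenvalues=[-0.60, -0.45, -0.30],
    e_neutral=-76.40,
    e_cation=-75.95,
)
print(ips)
```

The first entry is the ΔSCF value by construction; the higher ionization potentials inherit the same rigid shift.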
Computational toxicology (CompTox) leverages the significant gains in computing power and computational techniques (e.g., numerical approaches, structure-activity relationships, bioinformatics) realized over the last few years, thereby reducing costs and increasing efficiency i...
NASA Astrophysics Data System (ADS)
Shaat, Musbah; Bader, Faouzi
2010-12-01
Cognitive Radio (CR) systems have been proposed to increase spectrum utilization by opportunistically accessing the unused spectrum. Multicarrier communication systems are promising candidates for CR systems. Due to its high spectral efficiency, filter bank multicarrier (FBMC) can be considered as an alternative to conventional orthogonal frequency division multiplexing (OFDM) for transmission over CR networks. This paper addresses the problem of resource allocation in multicarrier-based CR networks. The objective is to maximize the downlink capacity of the network under constraints on both the total power and the interference introduced to the primary users (PUs). The optimal solution has high computational complexity, which makes it unsuitable for practical applications, and hence a low-complexity suboptimal solution is proposed. The proposed algorithm utilizes the spectrum holes in PU bands as well as active PU bands. The performance of the proposed algorithm is investigated for OFDM and FBMC based CR systems. Simulation results illustrate that the proposed resource allocation algorithm achieves near-optimal performance with low computational complexity and demonstrate the efficiency of using FBMC in the CR context.
Model-Invariant Hybrid Computations of Separated Flows for RCA Standard Test Cases
NASA Technical Reports Server (NTRS)
Woodruff, Stephen
2016-01-01
NASA's Revolutionary Computational Aerosciences (RCA) subproject has identified several smooth-body separated flows as standard test cases to emphasize the challenge these flows present for computational methods and their importance to the aerospace community. Results of computations of two of these test cases, the NASA hump and the FAITH experiment, are presented. The computations were performed with the model-invariant hybrid LES-RANS formulation, implemented in the NASA code VULCAN-CFD. The model-invariant formulation employs gradual LES-RANS transitions and compensation for model variation to provide more accurate and efficient hybrid computations. Comparisons revealed that the LES-RANS transitions employed in these computations were sufficiently gradual that the compensating terms were unnecessary. Agreement with experiment was achieved only after reducing the turbulent viscosity to mitigate the effect of numerical dissipation. The stream-wise evolution of peak Reynolds shear stress was employed as a measure of turbulence dynamics in separated flows useful for evaluating computations.
Efficient Monte Carlo Estimation of the Expected Value of Sample Information Using Moment Matching.
Heath, Anna; Manolopoulou, Ioanna; Baio, Gianluca
2018-02-01
The Expected Value of Sample Information (EVSI) is used to calculate the economic value of a new research strategy. Although this value would be important to both researchers and funders, there are very few practical applications of the EVSI. This is due to computational difficulties associated with calculating the EVSI in practical health economic models using nested simulations. We present an approximation method for the EVSI that is framed in a Bayesian setting and is based on estimating the distribution of the posterior mean of the incremental net benefit across all possible future samples, known as the distribution of the preposterior mean. Specifically, this distribution is estimated using moment matching coupled with simulations that are available for probabilistic sensitivity analysis, which is typically mandatory in health economic evaluations. This novel approximation method is applied to a health economic model that has previously been used to assess the performance of other EVSI estimators and accurately estimates the EVSI. The computational time for this method is competitive with other methods. We have developed a new calculation method for the EVSI which is computationally efficient and accurate. This novel method relies on some additional simulation so can be expensive in models with a large computational cost.
Cupola Furnace Computer Process Model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seymour Katz
2004-12-31
The cupola furnace generates more than 50% of the liquid iron used to produce the 9+ million tons of castings annually. The cupola converts iron and steel into cast iron. The main advantages of the cupola furnace are lower energy costs than those of competing furnaces (electric) and the ability to melt less expensive metallic scrap than the competing furnaces. However, the chemical and physical processes that take place in the cupola furnace are highly complex, making it difficult to operate the furnace in optimal fashion. The results are low energy efficiency and poor recovery of important and expensive alloy elements due to oxidation. Between 1990 and 2004, under the auspices of the Department of Energy, the American Foundry Society and General Motors Corp., a computer simulation of the cupola furnace was developed that accurately describes the complex behavior of the furnace. When provided with the furnace input conditions the model provides accurate values of the output conditions in a matter of seconds. It also provides key diagnostics. Using clues from the diagnostics a trained specialist can infer changes in the operation that will move the system toward higher efficiency. Repeating the process in an iterative fashion leads to near optimum operating conditions with just a few iterations. More advanced uses of the program have been examined. The program is currently being combined with an "Expert System" to permit optimization in real time. The program has been combined with "neural network" programs to effect very easy scanning of a wide range of furnace operation. Rudimentary efforts were successfully made to operate the furnace using a computer. References to these more advanced systems will be found in the "Cupola Handbook", Chapter 27, American Foundry Society, Des Plaines, IL (1999).
Snore related signals processing in a private cloud computing system.
Qian, Kun; Guo, Jian; Xu, Huijie; Zhu, Zhaomeng; Zhang, Gongxuan
2014-09-01
Snore related signals (SRS) have been demonstrated in recent years to carry important information about the obstruction site and degree in the upper airway of Obstructive Sleep Apnea-Hypopnea Syndrome (OSAHS) patients. To make this acoustic signal analysis method more accurate and robust, processing big SRS data is inevitable. As an emerging concept and technology, cloud computing has motivated numerous researchers and engineers to develop applications in both academia and industry, and it could help realize a large blueprint in biomedical engineering. Considering the security and transfer requirements of biomedical data, we designed a system based on private cloud computing to process SRS. We then ran comparative experiments, processing a 5-hour audio recording of an OSAHS patient on a personal computer, a server and a private cloud computing system, to demonstrate the efficiency of the proposed infrastructure.
Computational Study of the CC3 Impeller and Vaneless Diffuser Experiment
NASA Technical Reports Server (NTRS)
Kulkarni, Sameer; Beach, Timothy A.; Skoch, Gary J.
2013-01-01
Centrifugal compressors are compatible with the low exit corrected flows found in the high pressure compressor of turboshaft engines and may play an increasing role in turbofan engines as engine overall pressure ratios increase. Centrifugal compressor stages are difficult to model accurately with RANS CFD solvers. A computational study of the CC3 centrifugal impeller in its vaneless diffuser configuration was undertaken as part of an effort to understand potential causes of RANS CFD mis-prediction in these types of geometries. Three steady, periodic cases of the impeller and diffuser were modeled using the TURBO Parallel Version 4 code: 1) a k-epsilon turbulence model computation on a 6.8 million point grid using wall functions, 2) a k-epsilon turbulence model computation on a 14 million point grid integrating to the wall, and 3) a k-omega turbulence model computation on the 14 million point grid integrating to the wall. It was found that all three cases compared favorably to data from inlet to impeller trailing edge, but the k-epsilon and k-omega computations had disparate results beyond the trailing edge and into the vaneless diffuser. A large region of reversed flow was observed in the k-epsilon computations which extended from 70% to 100% span at the exit rating plane, whereas the k-omega computation had reversed flow from 95% to 100% span. Compared to experimental data at near-peak-efficiency, the reversed flow region in the k-epsilon case resulted in an under-prediction in adiabatic efficiency of 8.3 points, whereas the k-omega case was 1.2 points lower in efficiency.
Computational Study of the CC3 Impeller and Vaneless Diffuser Experiment
NASA Technical Reports Server (NTRS)
Kulkarni, Sameer; Beach, Timothy A.; Skoch, Gary J.
2013-01-01
Centrifugal compressors are compatible with the low exit corrected flows found in the high pressure compressor of turboshaft engines and may play an increasing role in turbofan engines as engine overall pressure ratios increase. Centrifugal compressor stages are difficult to model accurately with RANS CFD solvers. A computational study of the CC3 centrifugal impeller in its vaneless diffuser configuration was undertaken as part of an effort to understand potential causes of RANS CFD mis-prediction in these types of geometries. Three steady, periodic cases of the impeller and diffuser were modeled using the TURBO Parallel Version 4 code: (1) a k-ε turbulence model computation on a 6.8 million point grid using wall functions, (2) a k-ε turbulence model computation on a 14 million point grid integrating to the wall, and (3) a k-ω turbulence model computation on the 14 million point grid integrating to the wall. It was found that all three cases compared favorably to data from inlet to impeller trailing edge, but the k-ε and k-ω computations had disparate results beyond the trailing edge and into the vaneless diffuser. A large region of reversed flow was observed in the k-ε computations which extended from 70 to 100 percent span at the exit rating plane, whereas the k-ω computation had reversed flow from 95 to 100 percent span. Compared to experimental data at near-peak-efficiency, the reversed flow region in the k-ε case resulted in an underprediction in adiabatic efficiency of 8.3 points, whereas the k-ω case was 1.2 points lower in efficiency.
Efficiently computing exact geodesic loops within finite steps.
Xin, Shi-Qing; He, Ying; Fu, Chi-Wing
2012-06-01
Closed geodesics, or geodesic loops, are crucial to the study of differential topology and differential geometry. Although the existence and properties of closed geodesics on smooth surfaces have been widely studied in the mathematics community, relatively little progress has been made on how to compute them on polygonal surfaces. Most existing algorithms simply treat the mesh as a graph, so the resultant loops are restricted to mesh edges and are far from the actual geodesics. This paper is the first to prove the existence and uniqueness of a geodesic loop restricted to a closed face sequence; it also contributes an efficient algorithm that iteratively evolves an initial closed path on a given mesh into an exact geodesic loop within finite steps. Our proposed algorithm has O(k) space complexity and (experimentally) O(mk) time complexity, where m is the number of vertices in the region bounded by the initial loop and the resultant geodesic loop, and k is the average number of edges in the edge sequences that the evolving loop passes through. In contrast to the existing geodesic curvature flow methods, which compute an approximate geodesic loop within a predefined threshold, our method is exact and can be applied directly to triangular meshes without solving any differential equation with a numerical solver; it can run at interactive speed, e.g., on the order of milliseconds, for a mesh with around 50K vertices, and hence significantly outperforms existing algorithms. Our algorithm can run at interactive speed even for larger meshes. Besides the complexity of the input mesh, the geometric shape can also affect the number of evolving steps, i.e., the performance. We motivate our algorithm with an interactive shape segmentation example shown later in the paper.