NASA Astrophysics Data System (ADS)
MacDonald, Christopher L.; Bhattacharya, Nirupama; Sprouse, Brian P.; Silva, Gabriel A.
2015-09-01
Computing numerical solutions to fractional differential equations can be computationally intensive due to the effect of non-local derivatives in which all previous time points contribute to the current iteration. In general, numerical approaches that depend on truncating part of the system history, while efficient, can suffer from high degrees of error and inaccuracy. Here we present an adaptive time step memory method for smooth functions applied to the Grünwald-Letnikov fractional diffusion derivative. This method is computationally efficient and results in smaller errors during numerical simulations. Sampled points along the system's history at progressively longer intervals are assumed to reflect the values of neighboring time points. By including progressively fewer points backward in time, a temporally 'weighted' history is computed that includes contributions from the entire past of the system, maintaining accuracy, but with fewer points actually calculated, greatly improving computational efficiency.
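To make the cost of the non-local history concrete, the following is a minimal sketch of the standard full-memory Grünwald-Letnikov approximation, in which every past sample contributes to the current estimate. It is only the baseline that the abstract contrasts against, not the authors' adaptive scheme; the function names `gl_weights` and `gl_derivative` and the uniform step `h` are assumptions made for illustration.

```python
def gl_weights(alpha, n):
    """Return Grünwald-Letnikov weights w_0..w_n, where w_k = (-1)^k * C(alpha, k).

    Uses the recurrence w_k = w_{k-1} * (1 - (alpha + 1) / k), with w_0 = 1.
    """
    weights = [1.0]
    for k in range(1, n + 1):
        weights.append(weights[-1] * (1.0 - (alpha + 1.0) / k))
    return weights


def gl_derivative(samples, alpha, h):
    """Full-memory GL estimate of the order-alpha derivative at the last sample.

    samples: function values on a uniform grid with step h (hypothetical input).
    Every earlier sample enters the sum, so the cost per time step grows
    linearly with the length of the history.
    """
    n = len(samples) - 1
    weights = gl_weights(alpha, n)
    total = sum(weights[k] * samples[n - k] for k in range(n + 1))
    return total / h ** alpha
```

For alpha = 1 the weights reduce to 1, -1, 0, 0, ..., so the sum collapses to the ordinary backward difference. An adaptive memory scheme of the kind described above would replace the full sum with a sum over sparsely sampled history points.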
CCOMP: An efficient algorithm for complex roots computation of determinantal equations
NASA Astrophysics Data System (ADS)
Zouros, Grigorios P.
2018-01-01
In this paper a free Python algorithm, entitled CCOMP (Complex roots COMPutation), is developed for the efficient computation of complex roots of determinantal equations inside a prescribed complex domain. The key to the method presented is the efficient determination of the candidate points inside the domain in whose close neighborhood a complex root may lie. Once these points are detected, the algorithm proceeds to a two-dimensional minimization problem with respect to the minimum modulus eigenvalue of the system matrix. At the core of CCOMP are three sub-algorithms whose tasks are the efficient estimation of the minimum modulus eigenvalues of the system matrix inside the prescribed domain, the efficient computation of candidate points which guarantee the existence of minima, and finally, the computation of minima via bound constrained minimization algorithms. Theoretical results and heuristics support the development and the performance of the algorithm, which is discussed in detail. CCOMP supports general complex matrices, and its efficiency, applicability and validity are demonstrated on a variety of microwave applications.
NASA Astrophysics Data System (ADS)
Sarangapani, R.; Jose, M. T.; Srinivasan, T. K.; Venkatraman, B.
2017-07-01
Methods for the determination of efficiency of an aged high purity germanium (HPGe) detector for gaseous sources have been presented in the paper. X-ray radiography of the detector has been performed to get detector dimensions for computational purposes. The dead layer thickness of the HPGe detector has been ascertained from experiments and Monte Carlo computations. Experimental work with standard point and liquid sources in several cylindrical geometries has been undertaken for obtaining energy dependent efficiency. Monte Carlo simulations have been performed for computing efficiencies for point, liquid and gaseous sources. Self-absorption correction factors have been obtained using mathematical equations for volume sources and MCNP simulations. Self-absorption correction and point source methods have been used to estimate the efficiency for gaseous sources. The efficiencies determined from the present work have been used to estimate activity of cover gas sample of a fast reactor.
Fast Semantic Segmentation of 3d Point Clouds with Strongly Varying Density
NASA Astrophysics Data System (ADS)
Hackel, Timo; Wegner, Jan D.; Schindler, Konrad
2016-06-01
We describe an effective and efficient method for point-wise semantic classification of 3D point clouds. The method can handle unstructured and inhomogeneous point clouds such as those derived from static terrestrial LiDAR or photogrammetric reconstruction; and it is computationally efficient, making it possible to process point clouds with many millions of points in a matter of minutes. The key issue, both to cope with strong variations in point density and to bring down computation time, turns out to be careful handling of neighborhood relations. By choosing appropriate definitions of a point's (multi-scale) neighborhood, we obtain a feature set that is both expressive and fast to compute. We evaluate our classification method both on benchmark data from a mobile mapping platform and on a variety of large, terrestrial laser scans with greatly varying point density. The proposed feature set outperforms the state of the art with respect to per-point classification accuracy, while at the same time being much faster to compute.
Robust and efficient overset grid assembly for partitioned unstructured meshes
NASA Astrophysics Data System (ADS)
Roget, Beatrice; Sitaraman, Jayanarayanan
2014-03-01
This paper presents a method to perform efficient and automated Overset Grid Assembly (OGA) on a system of overlapping unstructured meshes in a parallel computing environment where all meshes are partitioned into multiple mesh-blocks and processed on multiple cores. The main task of the overset grid assembler is to identify, in parallel, among all points in the overlapping mesh system, at which points the flow solution should be computed (field points), interpolated (receptor points), or ignored (hole points). Point containment search or donor search, an algorithm to efficiently determine the cell that contains a given point, is the core procedure necessary for accomplishing this task. Donor search is particularly challenging for partitioned unstructured meshes because of the complex irregular boundaries that are often created during partitioning.
Image detection and compression for memory efficient system analysis
NASA Astrophysics Data System (ADS)
Bayraktar, Mustafa
2015-02-01
The advances in digital signal processing have been progressing towards efficient use of memory and processing. Both of these resources can be used efficiently by storing only the minimum information of an image, which will speed up computation in later processes. Scale Invariant Feature Transform (SIFT) can be utilized to estimate and retrieve an image. In computer vision, SIFT can be implemented to recognize an image by comparing its key features with saved SIFT key point descriptors. The main advantage of SIFT is that it not only removes the redundant information from an image but also reduces the key points by matching their orientation and adding them together in different windows of the image [1]. Another key property of this approach is that it works more efficiently on high-contrast images because its design is based on collecting key points from the contrast shades of the image.
Computational Study of the CC3 Impeller and Vaneless Diffuser Experiment
NASA Technical Reports Server (NTRS)
Kulkarni, Sameer; Beach, Timothy A.; Skoch, Gary J.
2013-01-01
Centrifugal compressors are compatible with the low exit corrected flows found in the high pressure compressor of turboshaft engines and may play an increasing role in turbofan engines as engine overall pressure ratios increase. Centrifugal compressor stages are difficult to model accurately with RANS CFD solvers. A computational study of the CC3 centrifugal impeller in its vaneless diffuser configuration was undertaken as part of an effort to understand potential causes of RANS CFD mis-prediction in these types of geometries. Three steady, periodic cases of the impeller and diffuser were modeled using the TURBO Parallel Version 4 code: 1) a k-epsilon turbulence model computation on a 6.8 million point grid using wall functions, 2) a k-epsilon turbulence model computation on a 14 million point grid integrating to the wall, and 3) a k-omega turbulence model computation on the 14 million point grid integrating to the wall. It was found that all three cases compared favorably to data from inlet to impeller trailing edge, but the k-epsilon and k-omega computations had disparate results beyond the trailing edge and into the vaneless diffuser. A large region of reversed flow was observed in the k-epsilon computations which extended from 70% to 100% span at the exit rating plane, whereas the k-omega computation had reversed flow from 95% to 100% span. Compared to experimental data at near-peak-efficiency, the reversed flow region in the k-epsilon case resulted in an under-prediction in adiabatic efficiency of 8.3 points, whereas the k-omega case was 1.2 points lower in efficiency.
NASA Astrophysics Data System (ADS)
Kang, Zhizhong
2013-10-01
This paper presents a new approach to automatic registration of terrestrial laser scanning (TLS) point clouds utilizing a novel robust estimation method, an efficient BaySAC (BAYes SAmpling Consensus). The proposed method directly generates reflectance images from 3D point clouds, and then uses the SIFT algorithm to extract keypoints and identify corresponding image points. The 3D corresponding points, from which transformation parameters between point clouds are computed, are acquired by mapping the 2D ones onto the point cloud. To remove falsely accepted correspondences, we implement a conditional sampling method to select the n data points with the highest inlier probabilities as a hypothesis set and update the inlier probabilities of each data point using a simplified Bayes' rule to improve computational efficiency. The prior probability is estimated by verifying the distance invariance between correspondences. The proposed approach is tested on four data sets acquired by three different scanners. The results show that, compared with RANSAC, BaySAC requires fewer iterations and lower computation cost when the hypothesis set is contaminated with more outliers. The registration results also indicate that the proposed algorithm can achieve high registration accuracy on all experimental datasets.
Efficient matrix approach to optical wave propagation and Linear Canonical Transforms.
Shakir, Sami A; Fried, David L; Pease, Edwin A; Brennan, Terry J; Dolash, Thomas M
2015-10-05
The Fresnel diffraction integral form of optical wave propagation and the more general Linear Canonical Transforms (LCT) are cast into a matrix transformation form. Taking advantage of recent efficient matrix multiply algorithms, this approach promises an efficient computational and analytical tool that is competitive with FFT based methods but offers better behavior in terms of aliasing, transparent boundary condition, and flexibility in number of sampling points and computational window sizes of the input and output planes being independent. This flexibility makes the method significantly faster than FFT based propagators when only a single point, as in Strehl metrics, or a limited number of points, as in power-in-the-bucket metrics, are needed in the output observation plane.
Implementation of Steiner point of fuzzy set.
Liang, Jiuzhen; Wang, Dejiang
2014-01-01
This paper deals with the implementation of Steiner point of fuzzy set. Some definitions and properties of Steiner point are investigated and extended to fuzzy set. This paper focuses on establishing efficient methods to compute Steiner point of fuzzy set. Two strategies of computing Steiner point of fuzzy set are proposed. One is called linear combination of Steiner points computed by a series of crisp α-cut sets of the fuzzy set. The other is an approximate method, which is trying to find the optimal α-cut set approaching the fuzzy set. Stability analysis of Steiner point of fuzzy set is also studied. Some experiments on image processing are given, in which the two methods are applied for implementing Steiner point of fuzzy image, and both strategies show their own advantages in computing Steiner point of fuzzy set.
Delineating landscape view areas...a computer approach
Elliot L. Amidon; Gary H. Elsner
1968-01-01
The terrain visible from a given point can be determined quickly and efficiently by a computer. A FORTRAN subprogram--called VIEWIT--has been developed for this purpose. Input consists of data on elevations, by coordinates, which can be obtained from maps or aerial photos. The computer will produce an overlay that shows the maximum area visible from an observation point...
ERIC Educational Resources Information Center
Shih, Ching-Hsiang
2011-01-01
This study combines multi-mice technology (people with disabilities can use standard mice, instead of specialized alternative computer input devices, to achieve complete mouse operation) with an assistive pointing function (i.e. cursor-capturing, which enables the user to move the cursor to the target center automatically), to assess whether two…
Patwary, Nurmohammed; Preza, Chrysanthe
2015-01-01
A depth-variant (DV) image restoration algorithm for wide field fluorescence microscopy, using an orthonormal basis decomposition of DV point-spread functions (PSFs), is investigated in this study. The efficient PSF representation is based on a previously developed principal component analysis (PCA), which is computationally intensive. We present an approach developed to reduce the number of DV PSFs required for the PCA computation, thereby making the PCA-based approach computationally tractable for thick samples. Restoration results from both synthetic and experimental images show consistency and that the proposed algorithm addresses efficiently depth-induced aberration using a small number of principal components. Comparison of the PCA-based algorithm with a previously-developed strata-based DV restoration algorithm demonstrates that the proposed method improves performance by 50% in terms of accuracy and simultaneously reduces the processing time by 64% using comparable computational resources. PMID:26504634
Efficient LIDAR Point Cloud Data Managing and Processing in a Hadoop-Based Distributed Framework
NASA Astrophysics Data System (ADS)
Wang, C.; Hu, F.; Sha, D.; Han, X.
2017-10-01
Light Detection and Ranging (LiDAR) is one of the most promising technologies in surveying and mapping, city management, forestry, object recognition, computer vision engineering and other fields. However, it is challenging to efficiently store, query and analyze high-resolution 3D LiDAR data due to its volume and complexity. In order to improve the productivity of LiDAR data processing, this study proposes a Hadoop-based framework to efficiently manage and process LiDAR data in a distributed and parallel manner, which takes advantage of Hadoop's storage and computing ability. At the same time, the Point Cloud Library (PCL), an open-source project for 2D/3D image and point cloud processing, is integrated with HDFS and MapReduce to run the LiDAR data analysis algorithms provided by PCL in a parallel fashion. The experiment results show that the proposed framework can efficiently manage and process big LiDAR data.
SAGE: The Self-Adaptive Grid Code. 3
NASA Technical Reports Server (NTRS)
Davies, Carol B.; Venkatapathy, Ethiraj
1999-01-01
The multi-dimensional self-adaptive grid code, SAGE, is an important tool in the field of computational fluid dynamics (CFD). It provides an efficient method to improve the accuracy of flow solutions while simultaneously reducing computer processing time. Briefly, SAGE enhances an initial computational grid by redistributing the mesh points into more appropriate locations. The movement of these points is driven by an equal-error-distribution algorithm that utilizes the relationship between high flow gradients and excessive solution errors. The method also provides a balance between clustering points in the high gradient regions and maintaining the smoothness and continuity of the adapted grid. The latest version, Version 3, includes the ability to change the boundaries of a given grid to more efficiently enclose flow structures and provides alternative redistribution algorithms.
Use of parallel computing in mass processing of laser data
NASA Astrophysics Data System (ADS)
Będkowski, J.; Bratuś, R.; Prochaska, M.; Rzonca, A.
2015-12-01
The first part of the paper includes a description of the rules used to generate the algorithm needed for the purpose of parallel computing and also discusses the origins of the idea of research on the use of graphics processors in large scale processing of laser scanning data. The next part of the paper includes the results of an efficiency assessment performed for an array of different processing options, all of which were substantially accelerated with parallel computing. The processing options were divided into the generation of orthophotos using point clouds, coloring of point clouds, transformations, and the generation of a regular grid, as well as advanced processes such as the detection of planes and edges, point cloud classification, and the analysis of data for the purpose of quality control. Most algorithms had to be formulated from scratch in the context of the requirements of parallel computing. A few of the algorithms were based on existing technology developed by the Dephos Software Company and then adapted to parallel computing in the course of this research study. Processing time was determined for each process employed for a typical quantity of data processed, which helped confirm the high efficiency of the solutions proposed and the applicability of parallel computing to the processing of laser scanning data. The high efficiency of parallel computing yields new opportunities in the creation and organization of processing methods for laser scanning data.
NASA Astrophysics Data System (ADS)
Yu, Jieqing; Wu, Lixin; Hu, Qingsong; Yan, Zhigang; Zhang, Shaoliang
2017-12-01
Visibility computation is of great interest to location optimization, environmental planning, ecology, and tourism. Many algorithms have been developed for visibility computation. In this paper, we propose a novel method of visibility computation, called synthetic visual plane (SVP), to achieve better performance with respect to efficiency, accuracy, or both. The method uses a global horizon, which is a synthesis of line-of-sight information of all nearer points, to determine the visibility of a point, which makes it an accurate visibility method. We used discretization of the horizon to achieve good efficiency. After discretization, the accuracy and efficiency of SVP depend on the scale of discretization (i.e., zone width). The method is more accurate at smaller zone widths, but this requires a longer operating time. Users must strike a balance between accuracy and efficiency at their discretion. According to our experiments, SVP is less accurate but more efficient than R2 if the zone width is set to one grid. However, SVP becomes more accurate than R2 when the zone width is set to 1/24 grid, while it continues to perform as fast as or faster than R2. Although SVP performs worse than reference plane and depth map with respect to efficiency, it is superior in accuracy to these other two algorithms.
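The idea of testing a point against a horizon built from all nearer points can be illustrated in one dimension. The sketch below is not the SVP algorithm itself; it is a simplified line-of-sight check along a single elevation profile, with a hypothetical function name and a uniform sample spacing assumed for illustration.

```python
def visible_along_profile(elevations, observer_height=0.0, spacing=1.0):
    """Mark which samples of a 1D elevation profile are visible from sample 0.

    A sample is visible when the slope of its sight line from the observer's
    eye is at least the steepest slope seen among all nearer samples (the
    running horizon). All parameter names here are illustrative assumptions.
    """
    eye = elevations[0] + observer_height
    horizon = float("-inf")   # steepest sight-line slope seen so far
    visible = [True]          # the observer's own cell is trivially visible
    for i in range(1, len(elevations)):
        slope = (elevations[i] - eye) / (i * spacing)
        visible.append(slope >= horizon)
        horizon = max(horizon, slope)
    return visible
```

Because the horizon is carried forward as a single running maximum, each sample costs constant work, which is the kind of saving a synthesized horizon is meant to provide over re-tracing every sight line.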
Cost-Benefit Analysis of Computer Resources for Machine Learning
Champion, Richard A.
2007-01-01
Machine learning describes pattern-recognition algorithms - in this case, probabilistic neural networks (PNNs). These can be computationally intensive, in part because of the nonlinear optimizer, a numerical process that calibrates the PNN by minimizing a sum of squared errors. This report suggests efficiencies that are expressed as cost and benefit. The cost is computer time needed to calibrate the PNN, and the benefit is goodness-of-fit, how well the PNN learns the pattern in the data. There may be a point of diminishing returns where a further expenditure of computer resources does not produce additional benefits. Sampling is suggested as a cost-reduction strategy. One consideration is how many points to select for calibration and another is the geometric distribution of the points. The data points may be nonuniformly distributed across space, so that sampling at some locations provides additional benefit while sampling at other locations does not. A stratified sampling strategy can be designed to select more points in regions where they reduce the calibration error and fewer points in regions where they do not. Goodness-of-fit tests ensure that the sampling does not introduce bias. This approach is illustrated by statistical experiments for computing correlations between measures of roadless area and population density for the San Francisco Bay Area. The alternative to training efficiencies is to rely on high-performance computer systems. These may require specialized programming and algorithms that are optimized for parallel performance.
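A stratified draw of calibration points, as suggested above, might look like the following sketch. The stratum labels, the per-stratum quotas, and the function name `stratified_sample` are all hypothetical; the report does not specify how strata or quotas are chosen.

```python
import random


def stratified_sample(points, stratum_of, quota, seed=0):
    """Draw a fixed number of points from each stratum without replacement.

    points:     iterable of data points (any type).
    stratum_of: function mapping a point to a sortable stratum label.
    quota:      dict mapping stratum label to the number of points wanted;
                strata missing from the dict contribute no points.
    """
    rng = random.Random(seed)  # seeded so the draw is reproducible
    groups = {}
    for p in points:
        groups.setdefault(stratum_of(p), []).append(p)
    sample = []
    for label in sorted(groups):
        members = groups[label]
        k = min(quota.get(label, 0), len(members))
        sample.extend(rng.sample(members, k))
    return sample
```

Giving a larger quota to strata where extra points reduce the calibration error, and a smaller one elsewhere, is one way to express the cost-reduction strategy in code.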
NASA Astrophysics Data System (ADS)
Pan, Bing; Wang, Bo
2017-10-01
Digital volume correlation (DVC) is a powerful technique for quantifying interior deformation within solid opaque materials and biological tissues. In the last two decades, great efforts have been made to improve the accuracy and efficiency of the DVC algorithm. However, there is still a lack of a flexible, robust and accurate version that can be efficiently implemented in personal computers with limited RAM. This paper proposes an advanced DVC method that can realize accurate full-field internal deformation measurement applicable to high-resolution volume images with up to billions of voxels. Specifically, a novel layer-wise reliability-guided displacement tracking strategy combined with dynamic data management is presented to guide the DVC computation from slice to slice. The displacements at specified calculation points in each layer are computed using the advanced 3D inverse-compositional Gauss-Newton algorithm with the complete initial guess of the deformation vector accurately predicted from the computed calculation points. Since only limited slices of interest in the reference and deformed volume images rather than the whole volume images are required, the DVC calculation can thus be efficiently implemented on personal computers. The flexibility, accuracy and efficiency of the presented DVC approach are demonstrated by analyzing computer-simulated and experimentally obtained high-resolution volume images.
GPU-accelerated element-free reverse-time migration with Gauss points partition
NASA Astrophysics Data System (ADS)
Zhou, Zhen; Jia, Xiaofeng; Qiang, Xiaodong
2018-06-01
An element-free method (EFM) has been demonstrated successfully in elasticity, heat conduction and fatigue crack growth problems. We present the theory of EFM and its numerical applications in seismic modelling and reverse time migration (RTM). Compared with the finite difference method and the finite element method, the EFM has unique advantages: (1) independence of grids in computation and (2) lower expense and more flexibility (because only the information of the nodes and the boundary of the concerned area is required). However, in EFM, due to improper computation and storage of some large sparse matrices, such as the mass matrix and the stiffness matrix, the method is difficult to apply to seismic modelling and RTM for a large velocity model. To solve the problem of storage and computation efficiency, we propose a concept of Gauss points partition and utilise the graphics processing unit to improve the computational efficiency. We employ the compressed sparse row format to compress the intermediate large sparse matrices and attempt to simplify the operations by solving the linear equations with CULA solver. To improve the computation efficiency further, we introduce the concept of the lumped mass matrix. Numerical experiments indicate that the proposed method is accurate and more efficient than the regular EFM.
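The two storage ideas mentioned above, compressed sparse row (CSR) format and a lumped mass matrix, can be sketched in a few lines of plain Python. This is an illustration only: the paper runs on a GPU with the CULA solver, and the abstract does not state which lumping rule it uses, so the row-sum lumping shown here is an assumption, as are all function names.

```python
def to_csr(dense):
    """Convert a dense matrix (list of rows) to CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row in `values`
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR matrix by a vector, touching only stored nonzeros."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for p in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[p] * x[col_idx[p]]
        y.append(acc)
    return y


def lump_rows(values, row_ptr):
    """Row-sum lumping (assumed rule): diagonal entry i is the sum of row i."""
    return [sum(values[row_ptr[i]:row_ptr[i + 1]])
            for i in range(len(row_ptr) - 1)]
```

A lumped (diagonal) mass matrix can be inverted entry by entry, which is why lumping removes a linear solve from each time step.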
Design of a reversible single precision floating point subtractor.
Anantha Lakshmi, Av; Sudha, Gf
2014-01-04
In recent years, reversible logic has emerged as a major area of research due to its ability to reduce power dissipation, which is the main requirement in low power digital circuit design. It has wide applications such as low power CMOS design, nanotechnology, digital signal processing, communication, DNA computing and optical computing. Floating-point operations are needed very frequently in nearly all computing disciplines, and studies have shown floating-point addition/subtraction to be the most used floating-point operation. A few designs exist for efficient reversible BCD subtractors, but there is no work on a reversible floating-point subtractor. This paper presents an efficient reversible single precision floating-point subtractor. The proposed design requires reversible designs of an 8-bit and a 24-bit comparator unit, an 8-bit and a 24-bit subtractor, and a normalization unit. For normalization, a 24-bit reversible leading zero detector and a 24-bit reversible shift register are implemented to shift the mantissas. To realize a reversible 1-bit comparator, two new 3x3 reversible gates are proposed. The proposed reversible 1-bit comparator is better and optimized in terms of the number of reversible gates used, the transistor count and the number of garbage outputs. The proposed work is analysed in terms of number of reversible gates, garbage outputs, constant inputs and quantum costs. Using these modules, an efficient design of a reversible single precision floating-point subtractor is proposed. Proposed circuits have been simulated using Modelsim and synthesized using Xilinx Virtex5vlx30tff665-3. The total on-chip power consumed by the proposed 32-bit reversible floating-point subtractor is 0.410 W.
Small target detection using objectness and saliency
NASA Astrophysics Data System (ADS)
Zhang, Naiwen; Xiao, Yang; Fang, Zhiwen; Yang, Jian; Wang, Li; Li, Tao
2017-10-01
We are motivated by the need for a generic object detection algorithm which achieves high recall for small targets in complex scenes with acceptable computational efficiency. We propose a novel object detection algorithm, which has high localization quality with acceptable computational cost. Firstly, we obtain the objectness map as in BING[1] and use NMS to get the top N points. Then, the k-means algorithm is used to cluster them into K classes according to their location. We set the center points of the K classes as seed points. For each seed point, an object potential region is extracted. Finally, a fast salient object detection algorithm[2] is applied to the object potential regions to highlight object-like pixels, and a series of efficient post-processing operations are proposed to locate the targets. Our method runs at 5 FPS on 1000*1000 images, and significantly outperforms previous methods on small targets in cluttered background.
Perspective: Memcomputing: Leveraging memory and physics to compute efficiently
NASA Astrophysics Data System (ADS)
Di Ventra, Massimiliano; Traversa, Fabio L.
2018-05-01
It is well known that physical phenomena may be of great help in computing some difficult problems efficiently. A typical example is prime factorization that may be solved in polynomial time by exploiting quantum entanglement on a quantum computer. There are, however, other types of (non-quantum) physical properties that one may leverage to compute efficiently a wide range of hard problems. In this perspective, we discuss how to employ one such property, memory (time non-locality), in a novel physics-based approach to computation: Memcomputing. In particular, we focus on digital memcomputing machines (DMMs) that are scalable. DMMs can be realized with non-linear dynamical systems with memory. The latter property allows the realization of a new type of Boolean logic, one that is self-organizing. Self-organizing logic gates are "terminal-agnostic," namely, they do not distinguish between the input and output terminals. When appropriately assembled to represent a given combinatorial/optimization problem, the corresponding self-organizing circuit converges to the equilibrium points that express the solutions of the problem at hand. In doing so, DMMs take advantage of the long-range order that develops during the transient dynamics. This collective dynamical behavior, reminiscent of a phase transition, or even the "edge of chaos," is mediated by families of classical trajectories (instantons) that connect critical points of increasing stability in the system's phase space. The topological character of the solution search renders DMMs robust against noise and structural disorder. Since DMMs are non-quantum systems described by ordinary differential equations, not only can they be built in hardware with the available technology, they can also be simulated efficiently on modern classical computers. 
As an example, we will show the polynomial-time solution of the subset-sum problem for the worst cases, and point to other types of hard problems where simulations of DMMs' equations of motion on classical computers have already demonstrated substantial advantages over traditional approaches. We conclude this article by outlining further directions of study.
A time-efficient algorithm for implementing the Catmull-Clark subdivision method
NASA Astrophysics Data System (ADS)
Ioannou, G.; Savva, A.; Stylianou, V.
2015-10-01
Splines are the most popular methods in Figure Modeling and CAGD (Computer Aided Geometric Design) for generating smooth surfaces from a number of control points. The control points define the shape of a figure, and splines calculate the required number of points which, when displayed on a computer screen, form a smooth surface. However, spline methods are based on a rectangular topological structure of points, i.e., a two-dimensional table of vertices, and thus cannot generate complex figures, such as human and animal bodies, whose complex structure does not allow them to be defined by a regular rectangular grid. On the other hand, surface subdivision methods, which are derived from splines, generate surfaces that are defined by an arbitrary topology of control points. This is the reason that during the last fifteen years subdivision methods have taken the lead over regular spline methods in all areas of modeling in both industry and research. The cost of executing computer software developed to read control points and calculate the surface is run-time, because the surface structure required for handling arbitrary topological grids is very complicated. Many software programs have been developed that implement subdivision surfaces; however, not many algorithms are documented in the literature to support developers in writing efficient code. This paper aims to assist programmers by presenting a time-efficient algorithm for implementing subdivision splines. The Catmull-Clark method, which is the most popular of the subdivision methods, has been employed to illustrate the algorithm.
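For readers unfamiliar with Catmull-Clark subdivision, the sketch below shows two of its standard refinement rules: the face point (centroid of a face) and the interior edge point (average of the edge's two endpoints and the two adjacent face points). It illustrates the textbook rules only, not the time-efficient data structure the paper proposes, and the function names and the index-list mesh layout are assumptions.

```python
def face_point(vertices, face):
    """Centroid of a face given as a list of vertex indices."""
    n = len(face)
    return tuple(sum(vertices[i][c] for i in face) / n for c in range(3))


def edge_point(vertices, edge, face_a, face_b):
    """Edge point for an interior edge shared by face_a and face_b.

    Averages the two edge endpoints with the two adjacent face points.
    """
    a, b = edge
    fa = face_point(vertices, face_a)
    fb = face_point(vertices, face_b)
    return tuple(
        (vertices[a][c] + vertices[b][c] + fa[c] + fb[c]) / 4.0
        for c in range(3)
    )
```

Finding the two faces adjacent to each edge is where an arbitrary-topology mesh needs an adjacency structure, which is the run-time cost the abstract refers to.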
Estimating Function Approaches for Spatial Point Processes
NASA Astrophysics Data System (ADS)
Deng, Chong
Spatial point pattern data consist of locations of events that are often of interest in biological and ecological studies. Such data are commonly viewed as a realization from a stochastic process called a spatial point process. To fit a parametric spatial point process model to such data, likelihood-based methods have been widely studied. However, while maximum likelihood estimation is often too computationally intensive for Cox and cluster processes, pairwise likelihood methods such as composite likelihood and Palm likelihood usually suffer from a loss of information because they ignore the correlation among pairs. For many types of correlated data other than spatial point processes, when likelihood-based approaches are not desirable, estimating functions have been widely used for model fitting. In this dissertation, we explore estimating function approaches for fitting spatial point process models. These approaches, which are based on asymptotically optimal estimating function theories, can be used to incorporate the correlation among data and yield more efficient estimators. We conducted a series of studies to demonstrate that these estimating function approaches are good alternatives for balancing the trade-off between computational complexity and estimating efficiency. First, we propose a new estimating procedure that improves the efficiency of the pairwise composite likelihood method in estimating clustering parameters. Our approach combines estimating functions derived from pairwise composite likelihood estimation and estimating functions that account for correlations among the pairwise contributions. Our method can be used to fit a variety of parametric spatial point process models and can yield more efficient estimators for the clustering parameters than pairwise composite likelihood estimation. We demonstrate its efficacy through a simulation study and an application to the longleaf pine data.
Second, we further explore the quasi-likelihood approach for fitting the second-order intensity function of spatial point processes. However, the original second-order quasi-likelihood is barely feasible due to the intense computation and high memory requirement needed to solve a large linear system. Motivated by the existence of geometric regular patterns in stationary point processes, we find a lower-dimensional representation of the optimal weight function and propose a reduced second-order quasi-likelihood approach. Through a simulation study, we show that the proposed method not only demonstrates superior performance in fitting the clustering parameter but also relaxes the constraint on the tuning parameter, H. Third, we studied the quasi-likelihood type estimating function that is optimal in a certain class of first-order estimating functions for estimating the regression parameter in spatial point process models. Then, by using a novel spectral representation, we construct an implementation that is computationally much more efficient and can be applied to more general setups than the original quasi-likelihood method.
The Efficiency and the Scalability of an Explicit Operator on an IBM POWER4 System
NASA Technical Reports Server (NTRS)
Frumkin, Michael; Biegel, Bryan A. (Technical Monitor)
2002-01-01
We present an evaluation of the efficiency and the scalability of an explicit CFD operator on an IBM POWER4 system. The POWER4 architecture exhibits a common trend in HPC architectures: boosting CPU processing power by increasing the number of functional units, while hiding the latency of memory access by increasing the depth of the memory hierarchy. The overall machine performance depends on the ability of the caches-buses-fabric-memory to feed the functional units with the data to be processed. In this study we evaluate the efficiency and scalability of one explicit CFD operator on an IBM POWER4. This operator performs computations at the points of a Cartesian grid and involves a few dozen floating point numbers and on the order of 100 floating point operations per grid point. The computations in all grid points are independent. Specifically, we estimate the efficiency of the RHS operator (SP of NPB) on a single processor as the observed/peak performance ratio. Then we estimate the scalability of the operator on a single chip (2 CPUs), a single MCM (8 CPUs), 16 CPUs, and the whole machine (32 CPUs). Then we perform the same measurements for a cache-optimized version of the RHS operator. For our measurements we use the HPM (Hardware Performance Monitor) counters available on the POWER4. These counters allow us to analyze the obtained performance results.
NASA Technical Reports Server (NTRS)
Mittra, R.; Rushdi, A.
1979-01-01
An approach for computing the geometrical optic fields reflected from a numerically specified surface is presented. The approach includes the step of deriving a specular point and begins with computing the reflected rays off the surface at the points where their coordinates, as well as the partial derivatives (or equivalently, the direction of the normal), are numerically specified. Then, a cluster of three adjacent rays is chosen to define a 'mean ray' and the divergence factor associated with this mean ray. Finally, the amplitude, phase, and vector direction of the reflected field at a given observation point are derived by associating this point with the nearest mean ray and determining its position relative to such a ray.
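The first step described above, computing a reflected ray where the surface normal is known, can be illustrated with the standard specular reflection law r = d - 2(d·n)n. This is only a minimal sketch of that textbook formula, not the authors' program; the function name `reflect` and the tuple representation of vectors are assumptions made for illustration.

```python
def reflect(d, n):
    """Reflect incident direction d about the unit surface normal n.

    Uses the standard specular law r = d - 2 (d . n) n.
    Both arguments are 3-tuples; n is assumed to be normalized.
    """
    dot = sum(a * b for a, b in zip(d, n))
    return tuple(a - 2 * dot * b for a, b in zip(d, n))


# A ray travelling straight down onto a horizontal surface bounces straight up.
print(reflect((0, 0, -1), (0, 0, 1)))
```

The divergence factor and the mean-ray construction mentioned in the abstract are not reproduced here, since the abstract does not give their formulas.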
Fuel Injector Design Optimization for an Annular Scramjet Geometry
NASA Technical Reports Server (NTRS)
Steffen, Christopher J., Jr.
2003-01-01
A four-parameter, three-level, central composite experiment design has been used to optimize the configuration of an annular scramjet injector geometry using computational fluid dynamics. The computational fluid dynamic solutions played the role of computer experiments, and response surface methodology was used to capture the simulation results for mixing efficiency and total pressure recovery within the scramjet flowpath. An optimization procedure, based upon the response surface results of mixing efficiency, was used to compare the optimal design configuration against the target efficiency value of 92.5%. The results of three different optimization procedures are presented and all point to the need to look outside the current design space for different injector geometries that can meet or exceed the stated mixing efficiency target.
NASA Technical Reports Server (NTRS)
Galvas, M. R.
1972-01-01
A computer program for predicting design point specific speed - efficiency characteristics of centrifugal compressors is presented with instructions for its use. The method permits rapid selection of compressor geometry that yields maximum total efficiency for a particular application. A numerical example is included to demonstrate the selection procedure.
Fast sweeping methods for hyperbolic systems of conservation laws at steady state II
NASA Astrophysics Data System (ADS)
Engquist, Björn; Froese, Brittany D.; Tsai, Yen-Hsi Richard
2015-04-01
The idea of using fast sweeping methods for solving stationary systems of conservation laws has previously been proposed for efficiently computing solutions with sharp shocks. We further develop these methods to allow for a more challenging class of problems including problems with sonic points, shocks originating in the interior of the domain, rarefaction waves, and two-dimensional systems. We show that fast sweeping methods can produce higher-order accuracy. Computational results validate the claims of accuracy, sharp shock curves, and optimal computational efficiency.
Ge, Hong-You; Vangsgaard, Steffen; Omland, Øyvind; Madeleine, Pascal; Arendt-Nielsen, Lars
2014-12-06
Musculoskeletal pain from the upper extremity and shoulder region is commonly reported by computer users. However, the functional status of central pain mechanisms, i.e., central sensitization and conditioned pain modulation (CPM), has not been investigated in this population. The aim was to evaluate sensitization and CPM in computer users with and without chronic musculoskeletal pain. Pressure pain threshold (PPT) mapping in the neck-shoulder (15 points) and the elbow (12 points) was assessed together with PPT measurement at mid-point in the tibialis anterior (TA) muscle among 47 computer users with chronic pain in the upper extremity and/or neck-shoulder pain (pain group) and 17 pain-free computer users (control group). Induced pain intensities and profiles over time were recorded using a 0-10 cm electronic visual analogue scale (VAS) in response to different levels of pressure stimuli on the forearm with a new technique of dynamic pressure algometry. The efficiency of CPM was assessed using cuff-induced pain as conditioning pain stimulus and PPT at TA as test stimulus. The demographics, job seniority and number of working hours/week using a computer were similar between groups. The PPTs measured at all 15 points in the neck-shoulder region were not significantly different between groups. There were no significant differences between groups in either PPTs or pain intensity induced by dynamic pressure algometry. No significant difference in PPT was observed in TA between groups. During CPM, a significant increase in PPT at TA was observed in both groups (P < 0.05) without significant differences between groups. For the chronic pain group, higher clinical pain intensity, lower PPT values from the neck-shoulder and higher pain intensity evoked by the roller were all correlated with less efficient descending pain modulation (P < 0.05).
This suggests that the excitability of the central pain system is normal in a large group of computer users with low pain intensity chronic upper extremity and/or neck-shoulder pain and that increased excitability of the pain system cannot explain the reported pain. However, computer users with higher pain intensity and lower PPTs were found to have decreased efficiency in descending pain modulation.
Visualizing Matrix Multiplication
ERIC Educational Resources Information Center
Daugulis, Peteris; Sondore, Anita
2018-01-01
Efficient visualizations of computational algorithms are important tools for students, educators, and researchers. In this article, we point out an innovative visualization technique for matrix multiplication. This method differs from the standard, formal approach by using block matrices to make computations more visual. We find this method a…
Efficient volume computation for three-dimensional hexahedral cells
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dukowicz, J.K.
1988-02-01
Currently, algorithms for computing the volume of hexahedral cells with "ruled" surfaces require a minimum of 122 FLOPs (floating point operations) per cell. A new algorithm is described which reduces the operation count to 57 FLOPs per cell. Copyright 1988 Academic Press, Inc.
ERIC Educational Resources Information Center
Shih, Ching-Hsiang; Shih, Ching-Tien; Wu, Hsiao-Ling
2010-01-01
The latest research adopted software technology to redesign the mouse driver, and turned a mouse into a useful pointing assistive device for people with multiple disabilities who cannot easily or possibly use a standard mouse, to improve their pointing performance through a new operation method, Extended Dynamic Pointing Assistive Program (EDPAP),…
Nonlinear Detection, Estimation, and Control for Free-Space Optical Communication
2008-08-17
original message. The promising features of this communication scheme such as high-bandwidth, power efficiency, and security, render it a viable means...bandwidth, power efficiency, and security, render it a viable means for high data rate point-to-point communication. In this dissertation, we adopt a...Department of Electrical and Computer Engineering In free-space optical communication, the intensity of a laser beam is modulated by a message, the beam
Gaussian Radial Basis Function for Efficient Computation of Forest Indirect Illumination
NASA Astrophysics Data System (ADS)
Abbas, Fayçal; Babahenini, Mohamed Chaouki
2018-06-01
Global illumination of natural scenes such as forests in real time is one of the most complex problems to solve, because of the multiple inter-reflections between the light and the materials of the objects composing the scene. The major problem that arises is visibility computation. The visibility computation is carried out for the whole set of leaves visible from the center of a given leaf; given the enormous number of leaves in a tree, repeating this computation for each leaf reduces performance. We describe a new approach that approximates visibility queries and proceeds in two steps. The first step is to generate a point cloud representing the foliage. We assume that the point cloud is composed of two classes (visible, not-visible) that are non-linearly separable. The second step is to classify the point cloud by applying the Gaussian radial basis function, which measures the similarity in terms of distance between each leaf and a landmark leaf. This makes it possible to approximate the visibility queries and extract the leaves that will be used to calculate the amount of indirect illumination exchanged between neighboring leaves. Our approach treats the light exchanges in a forest scene efficiently, allows fast computation, and produces images of good visual quality, while taking advantage of the computational power of the GPU.
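The Gaussian radial basis function mentioned above has the standard form exp(-||x - c||^2 / (2σ^2)). It can be sketched as follows, under assumptions: the function name `gaussian_rbf`, the default width `sigma`, and the use of 3D leaf centers as inputs are all hypothetical, and the abstract does not say how the authors choose landmarks or thresholds.

```python
import math


def gaussian_rbf(leaf, landmark, sigma=1.0):
    """Similarity in (0, 1] between a leaf center and a landmark leaf center.

    sigma is a hypothetical kernel width; equal points give 1.0 and the
    value decays toward 0 as the Euclidean distance grows.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(leaf, landmark))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Nearby leaves score higher than distant ones.
print(gaussian_rbf((0.0, 0.0, 0.0), (0.1, 0.0, 0.0)))
print(gaussian_rbf((0.0, 0.0, 0.0), (3.0, 0.0, 0.0)))
```

A classifier would typically feed such similarities into a decision rule; that step is omitted because the abstract does not specify it.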
Implementation of control point form of algebraic grid-generation technique
NASA Technical Reports Server (NTRS)
Choo, Yung K.; Miller, David P.; Reno, Charles J.
1991-01-01
The control point form (CPF) provides explicit control of physical grid shape and grid spacing through the movement of the control points. The control point array, called a control net, is a space grid type arrangement of locations in physical space with an index for each direction. As an algebraic method CPF is efficient and works well with interactive computer graphics. A family of menu-driven, interactive grid-generation computer codes (TURBO) is being developed by using CPF. Key features of TurboI (a TURBO member) are discussed and typical results are presented. TurboI runs on any IRIS 4D series workstation.
Anatomy-driven multiple trajectory planning (ADMTP) of intracranial electrodes for epilepsy surgery.
Sparks, Rachel; Vakharia, Vejay; Rodionov, Roman; Vos, Sjoerd B; Diehl, Beate; Wehner, Tim; Miserocchi, Anna; McEvoy, Andrew W; Duncan, John S; Ourselin, Sebastien
2017-08-01
Epilepsy is potentially curable with resective surgery if the epileptogenic zone (EZ) can be identified. If non-invasive imaging is unable to elucidate the EZ, intracranial electrodes may be implanted to identify the EZ as well as map cortical function. In current clinical practice, each electrode trajectory is determined by time-consuming manual inspection of preoperative imaging to find a path that avoids blood vessels while traversing appropriate deep and superficial regions of interest (ROIs). We present anatomy-driven multiple trajectory planning (ADMTP) to find safe trajectories from a list of user-defined ROIs within minutes rather than the hours required for manual planning. Electrode trajectories are automatically computed in three steps: (1) Target Point Selection to identify appropriate target points within each ROI; (2) Trajectory Risk Scoring to quantify the cumulative distance to critical structures (blood vessels) along each trajectory, defined as the skull entry point to target point; (3) Implantation Plan Computation to determine a feasible combination of low-risk trajectories for all electrodes. ADMTP was evaluated on 20 patients (190 electrodes). ADMTP lowered the quantitative risk score in 83% of electrodes. Qualitative results show ADMTP found suitable trajectories for 70% of electrodes; a similar portion of manual trajectories were considered suitable. Trajectory suitability for ADMTP was 95% if traversing sulci was not included in the safety criteria. ADMTP is computationally efficient, computing between 7 and 12 trajectories in 54.5 (17.3-191.9) s. ADMTP efficiently computes safe and surgically feasible electrode trajectories.
Distributed Computation of the knn Graph for Large High-Dimensional Point Sets
Plaku, Erion; Kavraki, Lydia E.
2009-01-01
High-dimensional problems arising from robot motion planning, biology, data mining, and geographic information systems often require the computation of k nearest neighbor (knn) graphs. The knn graph of a data set is obtained by connecting each point to its k closest points. As the research in the above-mentioned fields progressively addresses problems of unprecedented complexity, the demand for computing knn graphs based on arbitrary distance metrics and large high-dimensional data sets increases, exceeding resources available to a single machine. In this work we efficiently distribute the computation of knn graphs for clusters of processors with message passing. Extensions to our distributed framework include the computation of graphs based on other proximity queries, such as approximate knn or range queries. Our experiments show nearly linear speedup with over one hundred processors and indicate that similar speedup can be obtained with several hundred processors. PMID:19847318
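The definition above (connect each point to its k closest points) can be made concrete with a single-machine, brute-force sketch. This is not the distributed message-passing scheme of the paper; the function name `knn_graph` and the adjacency-dictionary output are assumptions, and the `dist` parameter stands in for the "arbitrary distance metrics" the abstract mentions.

```python
import math


def knn_graph(points, k, dist=math.dist):
    """Brute-force knn graph: maps each point index to its k nearest indices.

    Runs in O(n^2 log n) time; dist can be any callable taking two points.
    """
    graph = {}
    for i, p in enumerate(points):
        # Distance from p to every other point, paired with that point's index.
        others = sorted((dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph[i] = [j for _, j in others[:k]]
    return graph


points = [(0, 0), (1, 0), (5, 0), (6, 0)]
print(knn_graph(points, k=1))
```

The quadratic cost of this baseline is what motivates distributing the computation for large, high-dimensional data sets.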
Krityakierne, Tipaluck; Akhtar, Taimoor; Shoemaker, Christine A.
2016-02-02
This paper presents a parallel surrogate-based global optimization method for computationally expensive objective functions that is more effective for larger numbers of processors. To reach this goal, we integrated concepts from multi-objective optimization and tabu search into single-objective surrogate optimization. Our proposed derivative-free algorithm, called SOP, uses non-dominated sorting of points for which the expensive function has been previously evaluated. The two objectives are the expensive function value of the point and the minimum distance of the point to previously evaluated points. Based on the results of non-dominated sorting, P points from the sorted fronts are selected as centers from which many candidate points are generated by random perturbations. Based on surrogate approximation, the best candidate point is subsequently selected for expensive evaluation for each of the P centers, with simultaneous computation on P processors. Centers that previously did not generate good solutions are tabu with a given tenure. We show almost sure convergence of this algorithm under some conditions. The performance of SOP is compared with two RBF based methods. The test results show that SOP is an efficient method that can reduce the time required to find a good near-optimal solution. In a number of cases the efficiency of SOP is so good that SOP with 8 processors found an accurate answer in less wall-clock time than the other algorithms did with 32 processors.
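The non-dominated sorting step can be illustrated by extracting the first front for the two objectives named in the abstract. This sketch assumes that a lower function value and a larger minimum distance are both preferred; the abstract does not state the direction of the second objective explicitly, so that choice, like the function name `first_front`, is an assumption rather than the SOP implementation.

```python
def first_front(candidates):
    """Return the non-dominated (f_value, min_dist) pairs.

    Assumed preferences: smaller f_value is better, larger min_dist is better.
    """
    def dominates(a, b):
        # a is at least as good as b in both objectives and strictly better in one.
        no_worse = a[0] <= b[0] and a[1] >= b[1]
        strictly_better = a[0] < b[0] or a[1] > b[1]
        return no_worse and strictly_better

    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


evaluated = [(1.0, 0.5), (2.0, 2.0), (3.0, 1.0), (1.5, 0.4)]
print(first_front(evaluated))
```

Repeating the extraction on the remaining points would yield the subsequent fronts from which the P centers are drawn.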
Efficient Computation of Closed-loop Frequency Response for Large Order Flexible Systems
NASA Technical Reports Server (NTRS)
Maghami, Peiman G.; Giesy, Daniel P.
1997-01-01
An efficient and robust computational scheme is given for the calculation of the frequency response function of a large order, flexible system implemented with a linear, time invariant control system. Advantage is taken of the highly structured sparsity of the system matrix of the plant based on a model of the structure using normal mode coordinates. The computational time per frequency point of the new computational scheme is a linear function of system size, a significant improvement over traditional, full-matrix techniques whose computational times per frequency point range from quadratic to cubic functions of system size. This permits the practical frequency domain analysis of systems of much larger order than by traditional, full-matrix techniques. Formulations are given for both open and closed loop systems. Numerical examples are presented showing the advantages of the present formulation over traditional approaches, both in speed and in accuracy. Using a model with 703 structural modes, a speed-up of almost two orders of magnitude was observed while accuracy improved by up to 5 decimal places.
A stochastic method for computing hadronic matrix elements
Alexandrou, Constantia; Constantinou, Martha; Dinter, Simon; ...
2014-01-24
In this study, we present a stochastic method for the calculation of baryon 3-point functions which is an alternative to the typically used sequential method offering more versatility. We analyze the scaling of the error of the stochastically evaluated 3-point function with the lattice volume and find a favorable signal to noise ratio suggesting that the stochastic method can be extended to large volumes providing an efficient approach to compute hadronic matrix elements and form factors.
Collaborative Indoor Access Point Localization Using Autonomous Mobile Robot Swarm.
Awad, Fahed; Naserllah, Muhammad; Omar, Ammar; Abu-Hantash, Alaa; Al-Taj, Abrar
2018-01-31
Localization of access points has become an important research problem due to the wide range of applications it addresses such as dismantling critical security threats caused by rogue access points or optimizing wireless coverage of access points within a service area. Existing proposed solutions have mostly relied on theoretical hypotheses or computer simulation to demonstrate the efficiency of their methods. The techniques that rely on estimating the distance using samples of the received signal strength usually assume prior knowledge of the signal propagation characteristics of the indoor environment at hand and tend to take a relatively large number of uniformly distributed random samples. This paper presents an efficient and practical collaborative approach to detect the location of an access point in an indoor environment without any prior knowledge of the environment. The proposed approach comprises a swarm of wirelessly connected mobile robots that collaboratively and autonomously collect a relatively small number of non-uniformly distributed random samples of the access point's received signal strength. These samples are used to efficiently and accurately estimate the location of the access point. The experimental testing verified that the proposed approach can identify the location of the access point in an accurate and efficient manner.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huang, He; Luo, Li -Shi; Li, Rui; ...
2018-05-17
To compute the non-oscillating mutual interaction for a system with N points, the fast multipole method (FMM) has an efficiency that scales linearly with the number of points. Specifically, for Coulomb interaction, FMM can be constructed using either the spherical harmonic functions or the totally symmetric Cartesian tensors. In this paper, we show that the efficiency of the Cartesian tensor-based FMM for the Coulomb interaction can be significantly improved by implementing the traces of the Cartesian tensors in calculation to reduce the independent elements of the n-th rank totally symmetric Cartesian tensor from (n + 1)(n + 2)/2 to 2n + 1. The computational complexity of the operations in FMM is analyzed and expressed as polynomials of the highest rank of the Cartesian tensors. For most operations, the complexity is reduced by one order. Numerical examples regarding the convergence and the efficiency of the new algorithm are demonstrated. As a result, a reduction of computation time up to 50% has been observed for a moderate number of points and rank of tensors.
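The two element counts quoted above can be checked by direct enumeration. A totally symmetric rank-n tensor in three dimensions has one independent component per multiset of n indices drawn from {x, y, z}. Its trace is itself a totally symmetric tensor of rank n - 2, so removing the traces subtracts that many components. The sketch below is a counting check only and not part of the FMM algorithm; the function names are hypothetical.

```python
from itertools import combinations_with_replacement


def symmetric_count(n):
    """Independent components of a rank-n totally symmetric tensor in 3D."""
    return sum(1 for _ in combinations_with_replacement("xyz", n))


def traceless_count(n):
    """Components left after removing the trace, a symmetric rank n-2 tensor."""
    trace_constraints = symmetric_count(n - 2) if n >= 2 else 0
    return symmetric_count(n) - trace_constraints


for n in range(6):
    # Columns: rank, (n + 1)(n + 2)/2, and 2n + 1.
    print(n, symmetric_count(n), traceless_count(n))
```

The gap between the two counts grows quadratically with n, which is consistent with the abstract's statement that the complexity of most operations drops by one order.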
NASA Astrophysics Data System (ADS)
Kochmann, Julian; Wulfinghoff, Stephan; Ehle, Lisa; Mayer, Joachim; Svendsen, Bob; Reese, Stefanie
2018-06-01
Recently, two-scale FE-FFT-based methods (e.g., Spahn et al. in Comput Methods Appl Mech Eng 268:871-883, 2014; Kochmann et al. in Comput Methods Appl Mech Eng 305:89-110, 2016) have been proposed to predict the microscopic and overall mechanical behavior of heterogeneous materials. The purpose of this work is the extension to elasto-viscoplastic polycrystals, efficient and robust Fourier solvers and the prediction of micromechanical fields during macroscopic deformation processes. Assuming scale separation, the macroscopic problem is solved using the finite element method. The solution of the microscopic problem, which is embedded as a periodic unit cell (UC) in each macroscopic integration point, is found by employing fast Fourier transforms, fixed-point and Newton-Krylov methods. The overall material behavior is defined by the mean UC response. In order to ensure spatially converged micromechanical fields as well as feasible overall CPU times, an efficient but simple solution strategy for two-scale simulations is proposed. As an example, the constitutive behavior of 42CrMo4 steel is predicted during macroscopic three-point bending tests.
NASA Astrophysics Data System (ADS)
Kochmann, Julian; Wulfinghoff, Stephan; Ehle, Lisa; Mayer, Joachim; Svendsen, Bob; Reese, Stefanie
2017-09-01
Recently, two-scale FE-FFT-based methods (e.g., Spahn et al. in Comput Methods Appl Mech Eng 268:871-883, 2014; Kochmann et al. in Comput Methods Appl Mech Eng 305:89-110, 2016) have been proposed to predict the microscopic and overall mechanical behavior of heterogeneous materials. The purpose of this work is the extension to elasto-viscoplastic polycrystals, efficient and robust Fourier solvers and the prediction of micromechanical fields during macroscopic deformation processes. Assuming scale separation, the macroscopic problem is solved using the finite element method. The solution of the microscopic problem, which is embedded as a periodic unit cell (UC) in each macroscopic integration point, is found by employing fast Fourier transforms, fixed-point and Newton-Krylov methods. The overall material behavior is defined by the mean UC response. In order to ensure spatially converged micromechanical fields as well as feasible overall CPU times, an efficient but simple solution strategy for two-scale simulations is proposed. As an example, the constitutive behavior of 42CrMo4 steel is predicted during macroscopic three-point bending tests.
Joint classification and contour extraction of large 3D point clouds
NASA Astrophysics Data System (ADS)
Hackel, Timo; Wegner, Jan D.; Schindler, Konrad
2017-08-01
We present an effective and efficient method for point-wise semantic classification and extraction of object contours of large-scale 3D point clouds. What makes point cloud interpretation challenging is the sheer size of several millions of points per scan and the non-grid, sparse, and uneven distribution of points. Standard image processing tools like texture filters, for example, cannot handle such data efficiently, which calls for dedicated point cloud labeling methods. It turns out that one of the major drivers for efficient computation and handling of strong variations in point density is a careful formulation of per-point neighborhoods at multiple scales. This allows us both to define an expressive feature set and to extract topologically meaningful object contours. Semantic classification and contour extraction are interlaced problems. Point-wise semantic classification enables extracting a meaningful candidate set of contour points, while contours help generate a rich feature representation that benefits point-wise classification. These methods are tailored to have fast run time and small memory footprint for processing large-scale, unstructured, and inhomogeneous point clouds, while still achieving high classification accuracy. We evaluate our methods on the semantic3d.net benchmark for terrestrial laser scans with >10^9 points.
GPU-accelerated Modeling and Element-free Reverse-time Migration with Gauss Points Partition
NASA Astrophysics Data System (ADS)
Zhen, Z.; Jia, X.
2014-12-01
The element-free method (EFM) has been applied to seismic modeling and migration. Compared with the finite element method (FEM) and the finite difference method (FDM), it is much cheaper and more flexible because only the information of the nodes and the boundary of the study area is required in the computation. In the EFM, the number of Gauss points should be consistent with the number of model nodes; otherwise the accuracy of the intermediate coefficient matrices would be harmed. Thus, when we increase the number of nodes of the velocity model in order to obtain higher resolution, the size of the computer's memory becomes a bottleneck. The original EFM can deal with at most 81×81 nodes in the case of 2G memory, as tested by Jia and Hu (2006). In order to solve the problem of storage and computation efficiency, we propose the concept of Gauss points partition (GPP) and utilize GPUs to improve the computation efficiency. Considering the characteristics of the Gaussian points, the GPP method doesn't influence the propagation of the seismic wave in the velocity model. To overcome the time-consuming computation of the stiffness matrix (K) and the mass matrix (M), we also use GPUs in our computation program. We employ the compressed sparse row (CSR) format to compress the intermediate sparse matrices and try to simplify the operations by solving the linear equations with the CULA Sparse Conjugate Gradient (CG) solver instead of the linear sparse solver 'PARDISO'. It is observed that our strategy can significantly reduce the computational time of K and M compared with the CPU-based algorithm. The model tested is the Marmousi model. The length of the model is 7425 m and the depth is 2990 m. We discretize the model with 595x298 nodes, 300x300 Gauss cells and 3x3 Gauss points in each cell. In contrast to the computational time of the conventional EFM, the GPUs-GPP approach can substantially improve the efficiency.
The speedup ratio for computing K and M is 120, and the speedup ratio for RTM is 11.5. At the same time, the accuracy of imaging is not harmed. Another advantage of the GPUs-GPP method is that it can easily be applied to other numerical methods such as the FEM. Finally, in the GPUs-GPP method, the arrays require quite limited memory storage, which makes the method promising for dealing with large-scale 3D problems.
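The compressed sparse row (CSR) format used for the intermediate matrices stores only the non-zero values, their column indices, and one offset per row. The following pure-Python sketch shows the layout and the matrix-vector product that a conjugate gradient iteration relies on. It is a CPU illustration with hypothetical function names, not the GPU code or the CULA Sparse interface referred to in the abstract.

```python
def to_csr(dense):
    """Convert a dense list-of-lists matrix to (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends in the values array.
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A x while touching only the stored non-zeros."""
    return [
        sum(values[k] * x[col_idx[k]] for k in range(row_ptr[i], row_ptr[i + 1]))
        for i in range(len(row_ptr) - 1)
    ]


A = [[4, 0, 0], [0, 0, 2], [1, 0, 3]]
values, col_idx, row_ptr = to_csr(A)
print(values, col_idx, row_ptr)
print(csr_matvec(values, col_idx, row_ptr, [1, 2, 3]))
```

Storage grows with the number of non-zeros rather than with the square of the node count, which is why such a format helps with the memory bottleneck described above.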
Framework to trade optimality for local processing in large-scale wavefront reconstruction problems.
Haber, Aleksandar; Verhaegen, Michel
2016-11-15
We show that the minimum variance wavefront estimation problems permit localized approximate solutions, in the sense that the wavefront value at a point (excluding unobservable modes, such as the piston mode) can be approximated by a linear combination of the wavefront slope measurements in the point's neighborhood. This enables us to efficiently compute a wavefront estimate by performing a single sparse matrix-vector multiplication. Moreover, our results open the possibility for the development of wavefront estimators that can be easily implemented in a decentralized/distributed manner, and in which the estimate optimality can be easily traded for computational efficiency. We numerically validate our approach on Hudgin wavefront sensor geometries, and the results can be easily generalized to Fried geometries.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rebay, S.
This work is devoted to the description of an efficient unstructured mesh generation method entirely based on the Delaunay triangulation. The distinctive characteristic of the proposed method is that point positions and connections are computed simultaneously. This result is achieved by taking advantage of the sequential way in which the Bowyer-Watson algorithm computes the Delaunay triangulation. Two methods are proposed which have great geometrical flexibility, in that they allow us to treat domains of arbitrary shape and topology and to generate arbitrarily nonuniform meshes. The methods are computationally efficient and are applicable both in two and three dimensions. 11 refs., 20 figs., 1 tab.
Unsteady three-dimensional thermal field prediction in turbine blades using nonlinear BEM
NASA Technical Reports Server (NTRS)
Martin, Thomas J.; Dulikravich, George S.
1993-01-01
A time-and-space accurate and computationally efficient fully three dimensional unsteady temperature field analysis computer code has been developed for truly arbitrary configurations. It uses boundary element method (BEM) formulation based on an unsteady Green's function approach, multi-point Gaussian quadrature spatial integration on each panel, and a highly clustered time-step integration. The code accepts either temperatures or heat fluxes as boundary conditions that can vary in time on a point-by-point basis. Comparisons of the BEM numerical results and known analytical unsteady results for simple shapes demonstrate very high accuracy and reliability of the algorithm. An example of computed three dimensional temperature and heat flux fields in a realistically shaped internally cooled turbine blade is also discussed.
A Lightweight Protocol for Secure Video Streaming
Morkevicius, Nerijus; Bagdonas, Kazimieras
2018-01-01
The Internet of Things (IoT) introduces many new challenges which cannot be solved using traditional cloud and host computing models. A new architecture known as fog computing is emerging to address these technological and security gaps. Traditional security paradigms focused on providing perimeter-based protections and client/server point to point protocols (e.g., Transport Layer Security (TLS)) are no longer the best choices for addressing new security challenges in fog computing end devices, where energy and computational resources are limited. In this paper, we present a lightweight secure streaming protocol for the fog computing “Fog Node-End Device” layer. This protocol is lightweight, connectionless, supports broadcast and multicast operations, and is able to provide data source authentication, data integrity, and confidentiality. The protocol is based on simple and energy efficient cryptographic methods, such as Hash Message Authentication Codes (HMAC) and symmetrical ciphers, and uses modified User Datagram Protocol (UDP) packets to embed authentication data into streaming data. Data redundancy could be added to improve reliability in lossy networks. The experimental results summarized in this paper confirm that the proposed method efficiently uses energy and computational resources and at the same time provides security properties on par with the Datagram TLS (DTLS) standard. PMID:29757988
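The idea of embedding authentication data into each UDP payload can be sketched with the Python standard library. This is a minimal illustration, not the protocol from the paper: the packet layout (4-byte sequence number, 2-byte length, payload, 32-byte HMAC-SHA256 tag), the key, and the function names are all assumptions, and the sketch covers only source authentication and integrity, not confidentiality.

```python
import hashlib
import hmac
import struct

TAG_LEN = 32      # SHA-256 digest size in bytes
HEADER_LEN = 6    # 4-byte sequence number + 2-byte payload length


def pack_datagram(key, seq, payload):
    """Return header + payload + HMAC tag, ready to send as one UDP payload."""
    header = struct.pack("!IH", seq, len(payload))
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


def unpack_datagram(key, datagram):
    """Return (seq, payload) if the tag verifies, otherwise None."""
    if len(datagram) < HEADER_LEN + TAG_LEN:
        return None
    body, tag = datagram[:-TAG_LEN], datagram[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        return None
    seq, length = struct.unpack("!IH", body[:HEADER_LEN])
    payload = body[HEADER_LEN:]
    if length != len(payload):
        return None
    return seq, payload


key = b"hypothetical-shared-key"
packet = pack_datagram(key, 7, b"video-chunk")
```

Because every datagram carries its own tag, a receiver can verify packets independently of one another, which fits a connectionless, multicast-friendly design; the sequence number is included under the tag so that reordering or replay can be detected by the application.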
A Lightweight Protocol for Secure Video Streaming.
Venčkauskas, Algimantas; Morkevicius, Nerijus; Bagdonas, Kazimieras; Damaševičius, Robertas; Maskeliūnas, Rytis
2018-05-14
The Internet of Things (IoT) introduces many new challenges which cannot be solved using traditional cloud and host computing models. A new architecture known as fog computing is emerging to address these technological and security gaps. Traditional security paradigms focused on providing perimeter-based protections and client/server point to point protocols (e.g., Transport Layer Security (TLS)) are no longer the best choices for addressing new security challenges in fog computing end devices, where energy and computational resources are limited. In this paper, we present a lightweight secure streaming protocol for the fog computing "Fog Node-End Device" layer. This protocol is lightweight, connectionless, supports broadcast and multicast operations, and is able to provide data source authentication, data integrity, and confidentiality. The protocol is based on simple and energy efficient cryptographic methods, such as Hash Message Authentication Codes (HMAC) and symmetrical ciphers, and uses modified User Datagram Protocol (UDP) packets to embed authentication data into streaming data. Data redundancy could be added to improve reliability in lossy networks. The experimental results summarized in this paper confirm that the proposed method efficiently uses energy and computational resources and at the same time provides security properties on par with the Datagram TLS (DTLS) standard.
Computationally efficient algorithm for Gaussian Process regression in case of structured samples
NASA Astrophysics Data System (ADS)
Belyaev, M.; Burnaev, E.; Kapushev, Y.
2016-04-01
Surrogate modeling is widely used in many engineering problems. Data sets often have Cartesian product structure (for instance factorial design of experiments with missing points). In such cases the size of the data set can be very large. Therefore, one of the most popular algorithms for approximation, Gaussian Process regression, can hardly be applied due to its computational complexity. In this paper a computationally efficient approach for constructing Gaussian Process regression in the case of data sets with Cartesian product structure is presented. Efficiency is achieved by using the special structure of the data set and operations with tensors. The proposed algorithm has low computational as well as memory complexity compared to existing algorithms. In this work we also introduce a regularization procedure that takes into account anisotropy of the data set and avoids degeneracy of the regression model.
Collaborative Indoor Access Point Localization Using Autonomous Mobile Robot Swarm
Awad, Fahed; Naserllah, Muhammad; Omar, Ammar; Abu-Hantash, Alaa; Al-Taj, Abrar
2018-01-01
Localization of access points has become an important research problem due to the wide range of applications it addresses such as dismantling critical security threats caused by rogue access points or optimizing wireless coverage of access points within a service area. Existing proposed solutions have mostly relied on theoretical hypotheses or computer simulation to demonstrate the efficiency of their methods. The techniques that rely on estimating the distance using samples of the received signal strength usually assume prior knowledge of the signal propagation characteristics of the indoor environment in hand and tend to take a relatively large number of uniformly distributed random samples. This paper presents an efficient and practical collaborative approach to detect the location of an access point in an indoor environment without any prior knowledge of the environment. The proposed approach comprises a swarm of wirelessly connected mobile robots that collaboratively and autonomously collect a relatively small number of non-uniformly distributed random samples of the access point’s received signal strength. These samples are used to efficiently and accurately estimate the location of the access point. The experimental testing verified that the proposed approach can identify the location of the access point in an accurate and efficient manner. PMID:29385042
An efficient method to compute spurious end point contributions in PO solutions. [Physical Optics
NASA Technical Reports Server (NTRS)
Gupta, Inder J.; Burnside, Walter D.; Pistorius, Carl W. I.
1987-01-01
A method is given to compute the spurious endpoint contributions in the physical optics solution for electromagnetic scattering from conducting bodies. The method is applicable to general three-dimensional structures. The only information required to use the method is the radius of curvature of the body at the shadow boundary. Thus, the method is very efficient for numerical computations. As an illustration, the method is applied to several bodies of revolution to compute the endpoint contributions for backscattering in the case of axial incidence. It is shown that in high-frequency situations, the endpoint contributions obtained using the method are equal to the true endpoint contributions.
International Conference on Stiff Computation Held at Park City, Utah on April 12, 13 and 14, 1982.
1983-05-31
algorithm should be designed which can analyse a system description and find out for the user to which class of problems his system belongs... Dove...processors designed to implement a specific solution process. yrne: IEE floating point chip design used by INE and others is an example (Xahan)...the...hardware specialist has designed his computer such that the parallel features can be addressed conveniently and efficiently, and 4) the software
On the tip of the tongue: learning typing and pointing with an intra-oral computer interface.
Caltenco, Héctor A; Breidegard, Björn; Struijk, Lotte N S Andreasen
2014-07-01
To evaluate typing and pointing performance and improvement over time of four able-bodied participants using an intra-oral tongue-computer interface for computer control. A physically disabled individual may lack the ability to efficiently control standard computer input devices. There have been several efforts to produce and evaluate interfaces that provide individuals with physical disabilities the possibility to control personal computers. Training with the intra-oral tongue-computer interface was performed by playing games over 18 sessions. Skill improvement was measured through typing and pointing exercises at the end of each training session. Typing throughput improved from averages of 2.36 to 5.43 correct words per minute. Pointing throughput improved from averages of 0.47 to 0.85 bits/s. Target tracking performance, measured as relative time on target, improved from averages of 36% to 47%. Path following throughput improved from averages of 0.31 to 0.83 bits/s and decreased to 0.53 bits/s with more difficult tasks. Learning curves support the notion that the tongue can rapidly learn novel motor tasks. Typing and pointing performance of the tongue-computer interface is comparable to performances of other proficient assistive devices, which makes the tongue a feasible input organ for computer control. Intra-oral computer interfaces could provide individuals with severe upper-limb mobility impairments the opportunity to control computers and automatic equipment. Typing and pointing performance of the tongue-computer interface is comparable to performances of other proficient assistive devices, but does not cause fatigue easily and might be invisible to other people, which is highly prioritized by assistive device users. Combination of visual and auditory feedback is vital for a good performance of an intra-oral computer interface and helps to reduce involuntary or erroneous activations.
A Very High Order, Adaptable MESA Implementation for Aeroacoustic Computations
NASA Technical Reports Server (NTRS)
Dydson, Roger W.; Goodrich, John W.
2000-01-01
Since computational efficiency and wave resolution scale with accuracy, the ideal would be infinitely high accuracy for problems with widely varying wavelength scales. Currently, many of the computational aeroacoustics methods are limited to 4th order accurate Runge-Kutta methods in time, which limits their resolution and efficiency. However, a new procedure for implementing the Modified Expansion Solution Approximation (MESA) schemes, based upon Hermitian divided differences, is presented which extends the effective accuracy of the MESA schemes to 57th order in space and time when using 128 bit floating point precision. This new approach has the advantages of reducing round-off error, being easy to program, and being more computationally efficient than previous approaches. Its accuracy is limited only by the floating point hardware. The advantages of this new approach are demonstrated by solving the linearized Euler equations in an open bi-periodic domain. A 500th order MESA scheme can now be created in seconds, making these schemes ideally suited for the next generation of high performance 256-bit (double quadruple) or higher precision computers. This ease of creation makes it possible to adapt the algorithm to the mesh in time instead of its converse: this is ideal for resolving varying wavelength scales which occur in noise generation simulations. Finally, the sources of round-off error that affect the very high order methods are examined, and remedies are provided that effectively increase the accuracy of the MESA schemes while using current computer technology.
Phase quality map based on local multi-unwrapped results for two-dimensional phase unwrapping.
Zhong, Heping; Tang, Jinsong; Zhang, Sen
2015-02-01
The efficiency of a phase unwrapping algorithm and the reliability of the corresponding unwrapped result are two key problems in reconstructing the digital elevation model of a scene from its interferometric synthetic aperture radar (InSAR) or interferometric synthetic aperture sonar (InSAS) data. In this paper, a new phase quality map is designed and implemented in a graphic processing unit (GPU) environment, which greatly accelerates the unwrapping process of the quality-guided algorithm and enhances the correctness of the unwrapped result. In a local wrapped phase window, the center point is selected as the reference point, and then two unwrapped results are computed by integrating in two different simple ways. After the two local unwrapped results are computed, the total difference of the two unwrapped results is regarded as the phase quality value of the center point. In order to accelerate the computing process of the new proposed quality map, we have implemented it in a GPU environment. The wrapped phase data are first uploaded to the memory of a device, and then the kernel function is called in the device to compute the phase quality in parallel by blocks of threads. Unwrapping tests performed on the simulated and real InSAS data confirm the accuracy and efficiency of the proposed method.
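The quality measure described above can be sketched on a single 3×3 window. The abstract does not say which two integration paths are used, so the choice below (rows first, then columns, versus columns first, then rows) is an assumption, as are all function names; the sketch runs on the CPU and does not reproduce the GPU kernel.

```python
import math


def wrap(phase):
    """Wrap a phase value into the interval [-pi, pi)."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi


def l_path(center, target, rows_first):
    """Build an L-shaped pixel path from center to target."""
    r, c = center
    tr, tc = target
    steps = [(r, c)]
    for axis in ((0, 1) if rows_first else (1, 0)):
        if axis == 0:
            while r != tr:
                r += 1 if tr > r else -1
                steps.append((r, c))
        else:
            while c != tc:
                c += 1 if tc > c else -1
                steps.append((r, c))
    return steps


def integrate(window, steps):
    """Sum wrapped phase differences along a path (local unwrapping)."""
    total = 0.0
    for (r0, c0), (r1, c1) in zip(steps, steps[1:]):
        total += wrap(window[r1][c1] - window[r0][c0])
    return total


def phase_quality(window):
    """Total disagreement between the two unwrapped results over the window."""
    n = len(window)
    center = (n // 2, n // 2)
    total = 0.0
    for i in range(n):
        for j in range(n):
            a = integrate(window, l_path(center, (i, j), rows_first=True))
            b = integrate(window, l_path(center, (i, j), rows_first=False))
            total += abs(a - b)
    return total


# Smooth ramp: both paths agree, so the quality value is (numerically) zero.
smooth = [[0.1 * (i + j) for j in range(3)] for i in range(3)]

# Hypothetical window with a phase residue in the upper-left 2x2 loop.
cycles = [[0.6, 0.3, 0.0], [0.9, 0.0, 0.0], [0.0, 0.0, 0.0]]
residue = [[wrap(2.0 * math.pi * v) for v in row] for row in cycles]
```

In this sketch a low value marks a window where unwrapping is path-independent and therefore reliable, while a residue makes the two paths disagree by a multiple of 2π. Each window is evaluated independently of the others, which is what makes the computation easy to distribute over GPU threads.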
ERIC Educational Resources Information Center
Shih, Ching-Hsiang; Chang, Man-Ling; Shih, Ching-Tien
2009-01-01
This study evaluated whether two people with multiple disabilities and minimal motor behavior would be able to improve their pointing performance using finger poke ability with a mouse wheel through a Dynamic Pointing Assistive Program (DPAP) and a newly developed mouse driver (i.e., a new mouse driver replaces standard mouse driver, changes a…
ERIC Educational Resources Information Center
Shih, Ching-Hsiang; Chiu, Sheng-Kai; Chu, Chiung-Ling; Shih, Ching-Tien; Liao, Yung-Kun; Lin, Chia-Chen
2010-01-01
This study evaluated whether two people with multiple disabilities would be able to improve their pointing performance using hand swing with a standard mouse through an Extended Dynamic Pointing Assistive Program (EDPAP) and a newly developed mouse driver (i.e., a new mouse driver replaces standard mouse driver, and changes a mouse into a precise…
Numerical Study of Boundary Layer Interaction with Shocks: Method Improvement and Test Computation
NASA Technical Reports Server (NTRS)
Adams, N. A.
1995-01-01
The objective is the development of a high-order and high-resolution method for the direct numerical simulation of shock turbulent-boundary-layer interaction. Details concerning the spatial discretization of the convective terms can be found in Adams and Shariff (1995). The computer code based on this method as introduced in Adams (1994) was formulated in Cartesian coordinates and thus has been limited to simple rectangular domains. For more general two-dimensional geometries, as a compression corner, an extension to generalized coordinates is necessary. To keep the requirements or limitations for grid generation low, the extended formulation should allow for non-orthogonal grids. Still, for simplicity and cost efficiency, periodicity can be assumed in one cross-flow direction. For easy vectorization, the compact-ENO coupling algorithm as used in Adams (1994) treated whole planes normal to the derivative direction with the ENO scheme whenever at least one point of this plane satisfied the detection criterion. This is apparently too restrictive for more general geometries and more complex shock patterns. Here we introduce a localized compact-ENO coupling algorithm, which is efficient as long as the overall number of grid points treated by the ENO scheme is small compared to the total number of grid points. Validation and test computations with the final code are performed to assess the efficiency and suitability of the computer code for the problems of interest. We define a set of parameters where a direct numerical simulation of a turbulent boundary layer along a compression corner with reasonably fine resolution is affordable.
Komorkiewicz, Mateusz; Kryjak, Tomasz; Gorgon, Marek
2014-01-01
This article presents an efficient hardware implementation of the Horn-Schunck algorithm that can be used in an embedded optical flow sensor. An architecture is proposed that realises the iterative Horn-Schunck algorithm in a pipelined manner. This modification achieves a data throughput of 175 MPixels/s and makes processing of a Full HD video stream (1,920 × 1,080 @ 60 fps) possible. The structure of the optical flow module as well as pre- and post-filtering blocks and a flow reliability computation unit is described in detail. Three versions of optical flow modules, with different numerical precision, working frequency and accuracy of the obtained results, are proposed. The errors caused by switching from floating- to fixed-point computations are also evaluated. The described architecture was tested on popular sequences from an optical flow dataset of Middlebury University. It achieves state-of-the-art results among hardware implementations of single scale methods. The designed fixed-point architecture achieves performance of 418 GOPS with power efficiency of 34 GOPS/W. The proposed floating-point module achieves 103 GFLOPS, with power efficiency of 24 GFLOPS/W. Moreover, a 100 times speedup compared to a modern CPU with SIMD support is reported. A complete, working vision system realized on a Xilinx VC707 evaluation board is also presented. It is able to compute optical flow for a Full HD video stream received from an HDMI camera in real-time. The obtained results prove that FPGA devices are an ideal platform for embedded vision systems. PMID:24526303
Komorkiewicz, Mateusz; Kryjak, Tomasz; Gorgon, Marek
2014-02-12
This article presents an efficient hardware implementation of the Horn-Schunck algorithm that can be used in an embedded optical flow sensor. An architecture is proposed that realises the iterative Horn-Schunck algorithm in a pipelined manner. This modification achieves a data throughput of 175 MPixels/s and makes processing of a Full HD video stream (1,920 × 1,080 @ 60 fps) possible. The structure of the optical flow module as well as pre- and post-filtering blocks and a flow reliability computation unit is described in detail. Three versions of optical flow modules, with different numerical precision, working frequency and accuracy of the obtained results, are proposed. The errors caused by switching from floating- to fixed-point computations are also evaluated. The described architecture was tested on popular sequences from an optical flow dataset of Middlebury University. It achieves state-of-the-art results among hardware implementations of single scale methods. The designed fixed-point architecture achieves performance of 418 GOPS with power efficiency of 34 GOPS/W. The proposed floating-point module achieves 103 GFLOPS, with power efficiency of 24 GFLOPS/W. Moreover, a 100 times speedup compared to a modern CPU with SIMD support is reported. A complete, working vision system realized on a Xilinx VC707 evaluation board is also presented. It is able to compute optical flow for a Full HD video stream received from an HDMI camera in real-time. The obtained results prove that FPGA devices are an ideal platform for embedded vision systems.
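The claim that 175 MPixels/s is enough for Full HD at 60 fps can be checked with a short calculation. The sketch below counts active pixels only and ignores blanking intervals, which is an assumption; the variable names are illustrative.

```python
def required_pixel_rate(width, height, fps):
    """Active pixels per second for a video stream (blanking ignored)."""
    return width * height * fps


full_hd_rate = required_pixel_rate(1920, 1080, 60)   # pixels per second
pipeline_rate = 175_000_000                          # 175 MPixels/s from the abstract
headroom = pipeline_rate / full_hd_rate              # > 1 means real-time is feasible
```

Full HD at 60 fps needs about 124.4 MPixels/s of active video, so the reported throughput leaves a margin of roughly 1.4 times under these assumptions.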
Indirect addressing and load balancing for faster solution to Mandelbrot Set on SIMD architectures
NASA Technical Reports Server (NTRS)
Tomboulian, Sherryl
1989-01-01
SIMD computers with local indirect addressing allow programs to have queues and buffers, making certain kinds of problems much more efficient. Examined here are a class of problems characterized by computations on data points where the computation is identical, but the convergence rate is data dependent. Normally, in this situation, the algorithm time is governed by the maximum number of iterations required by each point. Using indirect addressing allows a processor to proceed to the next data point when it is done, reducing the overall number of iterations required to approach the mean convergence rate when a sufficiently large problem set is solved. Load balancing techniques can be applied for additional performance improvement. Simulations of this technique applied to solving Mandelbrot Sets indicate significant performance gains.
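The gain described here can be estimated with a small simulation that counts lockstep iteration steps. This is a sketch under stated assumptions rather than the report's simulator: it ignores the cost of fetching a new point, models the queue as greedy dispatch to the first idle processor, and uses a hypothetical row of sample points.

```python
import heapq


def escape_iterations(c, max_iter=100):
    """Iterations of z -> z*z + c until |z| exceeds 2 (capped at max_iter)."""
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return n + 1
    return max_iter


def lockstep_steps(costs, processors):
    """Without indirect addressing: each batch waits for its slowest point."""
    return sum(max(costs[i:i + processors])
               for i in range(0, len(costs), processors))


def queued_steps(costs, processors):
    """With a per-processor queue: an idle processor takes the next point."""
    finish = [0] * processors          # step at which each processor is free
    heapq.heapify(finish)
    for cost in costs:
        earliest = heapq.heappop(finish)
        heapq.heappush(finish, earliest + cost)
    return max(finish)


# Hypothetical workload: one row of points crossing the set boundary.
points = [complex(-2.0 + 0.05 * i, 0.1) for i in range(64)]
costs = [escape_iterations(c) for c in points]
lockstep = lockstep_steps(costs, processors=8)
queued = queued_steps(costs, processors=8)
```

For the toy cost list `[1, 10, 1, 1, 1, 10, 1, 1]` on four processors, the batch scheme needs 20 steps while the queued scheme needs 11, which is closer to the mean-cost bound of 26 / 4 = 6.5 steps.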
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bryan, Frank; Dennis, John; MacCready, Parker
This project aimed to improve long term global climate simulations by resolving and enhancing the representation of the processes involved in the cycling of freshwater through estuaries and coastal regions. This was a collaborative multi-institution project consisting of physical oceanographers, climate model developers, and computational scientists. It specifically targeted the DOE objectives of advancing simulation and predictive capability of climate models through improvements in resolution and physical process representation. The main computational objectives were: 1. To develop computationally efficient, but physically based, parameterizations of estuary and continental shelf mixing processes for use in an Earth System Model (CESM). 2. To develop a two-way nested regional modeling framework in order to dynamically downscale the climate response of particular coastal ocean regions and to upscale the impact of the regional coastal processes to the global climate in an Earth System Model (CESM). 3. To develop computational infrastructure to enhance the efficiency of data transfer between specific sources and destinations, i.e., a point-to-point communication capability, (used in objective 1) within POP, the ocean component of CESM.
NASA Astrophysics Data System (ADS)
Wang, Yuan; Chen, Zhidong; Sang, Xinzhu; Li, Hui; Zhao, Linmin
2018-03-01
Holographic displays can provide the complete optical wave field of a three-dimensional (3D) scene, including depth perception. However, producing traditional computer-generated holograms (CGHs) often takes a long computation time, even without more complex and photorealistic rendering. The backward ray-tracing technique can render photorealistic, high-quality images, and its high degree of parallelism noticeably reduces the computation time. Here, a high-efficiency photorealistic computer-generated hologram method is presented based on the ray-tracing technique. Rays are launched and traced in parallel under different illuminations and circumstances. Experimental results demonstrate the effectiveness of the proposed method. Compared with the traditional point cloud CGH, the computation time is decreased to 24 s to reconstruct a 3D object of 100 × 100 rays with continuous depth change.
Analytical determination of propeller performance degradation due to ice accretion
NASA Technical Reports Server (NTRS)
Miller, T. L.
1986-01-01
A computer code has been developed which is capable of computing propeller performance for clean, glaze, or rime iced propeller configurations, thereby providing a mechanism for determining the degree of performance degradation which results from a given icing encounter. The inviscid, incompressible flow field at each specified propeller radial location is first computed using the Theodorsen transformation method of conformal mapping. A droplet trajectory computation then calculates droplet impingement points and airfoil collection efficiency for each radial location, at which point several user-selectable empirical correlations are available for determining the aerodynamic penalties which arise due to the ice accretion. Propeller performance is finally computed using strip analysis for either the clean or iced propeller. In the iced mode, the differential thrust and torque coefficient equations are modified by the drag and lift coefficient increments due to ice to obtain the appropriate iced values. Comparison with available experimental propeller icing data shows good agreement in several cases. The code's capability to properly predict iced thrust coefficient, power coefficient, and propeller efficiency is shown to be dependent on the choice of empirical correlation employed as well as proper specification of radial icing extent.
A Voxel-Based Filtering Algorithm for Mobile LIDAR Data
NASA Astrophysics Data System (ADS)
Qin, H.; Guan, G.; Yu, Y.; Zhong, L.
2018-04-01
This paper presents a stepwise voxel-based filtering algorithm for mobile LiDAR data. In the first step, to improve computational efficiency, mobile LiDAR points, in xy-plane, are first partitioned into a set of two-dimensional (2-D) blocks with a given block size, in each of which all laser points are further organized into an octree partition structure with a set of three-dimensional (3-D) voxels. Then, a voxel-based upward growing processing is performed to roughly separate terrain from non-terrain points with global and local terrain thresholds. In the second step, the extracted terrain points are refined by computing voxel curvatures. This voxel-based filtering algorithm is comprehensively discussed in the analyses of parameter sensitivity and overall performance. An experimental study performed on multiple point cloud samples, collected by different commercial mobile LiDAR systems, showed that the proposed algorithm provides a promising solution to terrain point extraction from mobile point clouds.
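The first step, partitioning points into 2-D blocks and then into 3-D voxels, can be sketched as follows. Note the substitution: the paper organizes each block as an octree, whereas this sketch uses a uniform voxel grid keyed by integer indices, which is simpler but not the same structure. The block size, voxel size, sample points, and the "lowest voxel per column" seed rule are all illustrative assumptions; the upward growing with terrain thresholds and the curvature refinement are not shown.

```python
from collections import defaultdict


def voxelize(points, block_size, voxel_size):
    """Group (x, y, z) points by 2-D block, then by 3-D voxel inside the block."""
    blocks = defaultdict(lambda: defaultdict(list))
    for x, y, z in points:
        block = (int(x // block_size), int(y // block_size))
        voxel = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        blocks[block][voxel].append((x, y, z))
    return blocks


def lowest_voxels(blocks):
    """For each vertical voxel column, record the lowest occupied z index.

    These lowest voxels are plausible starting seeds for an upward-growing
    terrain search (an assumption of this sketch, not the paper's rule).
    """
    seeds = {}
    for voxels in blocks.values():
        for vx, vy, vz in voxels:
            column = (vx, vy)
            if column not in seeds or vz < seeds[column]:
                seeds[column] = vz
    return seeds


# Hypothetical points: ground return, overhanging object, distant ground return.
points = [(0.2, 0.3, 0.1), (0.25, 0.35, 2.4), (5.1, 0.2, 0.0)]
blocks = voxelize(points, block_size=4.0, voxel_size=0.5)
seeds = lowest_voxels(blocks)
```

Working per block keeps each neighbourhood search local, which is where the efficiency gain mentioned in the abstract comes from.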
NASA Technical Reports Server (NTRS)
Dayton, J. A., Jr.; Kosmahl, H. G.; Ramins, P.; Stankiewicz, N.
1979-01-01
Experimental and analytical results are compared for two high performance, octave bandwidth TWT's that use multistage depressed collectors (MDC's) to improve the efficiency. The computations were carried out with advanced, multidimensional computer programs that are described here in detail. These programs model the electron beam as a series of either disks or rings of charge and follow their multidimensional trajectories from the RF input of the ideal TWT, through the slow wave structure, through the magnetic refocusing system, to their points of impact in the depressed collector. Traveling wave tube performance, collector efficiency, and collector current distribution were computed and the results compared with measurements for a number of TWT-MDC systems. Power conservation and correct accounting of TWT and collector losses were observed. For the TWT's operating at saturation, very good agreement was obtained between the computed and measured collector efficiencies. For a TWT operating 3 and 6 dB below saturation, excellent agreement between computed and measured collector efficiencies was obtained in some cases but only fair agreement in others. However, deviations can largely be explained by small differences in the computed and actual spent beam energy distributions. The analytical tools used here appear to be sufficiently refined to design efficient collectors for this class of TWT. However, for maximum efficiency, some experimental optimization (e.g., collector voltages and aperture sizes) will most likely be required.
An efficient and accurate 3D displacements tracking strategy for digital volume correlation
NASA Astrophysics Data System (ADS)
Pan, Bing; Wang, Bo; Wu, Dafang; Lubineau, Gilles
2014-07-01
Owing to its inherent computational complexity, practical implementation of digital volume correlation (DVC) for internal displacement and strain mapping faces important challenges in improving its computational efficiency. In this work, an efficient and accurate 3D displacement tracking strategy is proposed for fast DVC calculation. The efficiency advantage is achieved by using three improvements. First, to eliminate the need to update the Hessian matrix in each iteration, an efficient 3D inverse compositional Gauss-Newton (3D IC-GN) algorithm is introduced to replace existing forward additive algorithms for accurate sub-voxel displacement registration. Second, to ensure that the 3D IC-GN algorithm converges accurately and rapidly and to avoid time-consuming integer-voxel displacement searching, a generalized reliability-guided displacement tracking strategy is designed to transfer an accurate and complete initial guess of deformation for each calculation point from its computed neighbors. Third, to avoid the repeated computation of sub-voxel intensity interpolation coefficients, an interpolation coefficient lookup table is established for tricubic interpolation. The computational complexity of the proposed fast DVC and the existing typical DVC algorithms is first analyzed quantitatively according to necessary arithmetic operations. Then, numerical tests are performed to verify the performance of the fast DVC algorithm in terms of measurement accuracy and computational efficiency. The experimental results indicate that, compared with the existing DVC algorithm, the presented fast DVC algorithm produces similar precision and slightly higher accuracy at a substantially reduced computational cost.
NASA Astrophysics Data System (ADS)
Nguyen, Van-Dung; Wu, Ling; Noels, Ludovic
2017-03-01
This work provides a unified treatment of arbitrary kinds of microscopic boundary conditions usually considered in the multi-scale computational homogenization method for nonlinear multi-physics problems. An efficient procedure is developed to enforce the multi-point linear constraints arising from the microscopic boundary condition either by the direct constraint elimination or by the Lagrange multiplier elimination methods. The macroscopic tangent operators are computed in an efficient way from a multiple right hand sides linear system whose left hand side matrix is the stiffness matrix of the microscopic linearized system at the converged solution. The number of vectors at the right hand side is equal to the number of the macroscopic kinematic variables used to formulate the microscopic boundary condition. As the resolution of the microscopic linearized system often follows a direct factorization procedure, the computation of the macroscopic tangent operators is then performed using this factorized matrix at a reduced computational time.
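The saving from reusing a factorized matrix for several right-hand sides can be illustrated with a small LU routine. This is a generic pure-Python sketch, not the authors' implementation: the 3×3 matrix stands in for the microscopic stiffness matrix, and the unit right-hand-side vectors are hypothetical placeholders, one per macroscopic kinematic variable.

```python
def lu_factor(a):
    """LU factorization with partial pivoting; returns (lu, perm)."""
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        pivot = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        lu[k], lu[pivot] = lu[pivot], lu[k]
        perm[k], perm[pivot] = perm[pivot], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]              # multiplier stored in the L part
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(factors, b):
    """Solve A x = b using an existing factorization (O(n^2) per vector)."""
    lu, perm = factors
    n = len(lu)
    y = [b[p] for p in perm]                  # apply the row permutation
    for i in range(n):                        # forward substitution (unit L)
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    for i in reversed(range(n)):              # back substitution (U)
        for j in range(i + 1, n):
            y[i] -= lu[i][j] * y[j]
        y[i] /= lu[i][i]
    return y


# Hypothetical stiffness matrix at the converged solution.
K = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
factors = lu_factor(K)                        # O(n^3), done once

# One right-hand side per (hypothetical) macroscopic kinematic variable.
rhs_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
solutions = [lu_solve(factors, b) for b in rhs_vectors]
```

The factorization dominates the cost, so each additional right-hand side adds only a pair of triangular solves, which is the reduced computational time the abstract refers to.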
STAR adaptation for two algorithms used on serial computers
NASA Technical Reports Server (NTRS)
Howser, L. M.; Lambiotte, J. J., Jr.
1974-01-01
Two representative algorithms used on a serial computer and presently executed on the Control Data Corporation 6000 computer were adapted to execute efficiently on the Control Data STAR-100 computer. Gaussian elimination for the solution of simultaneous linear equations and the Gauss-Legendre quadrature formula for the approximation of an integral are the two algorithms discussed. A description is given of how the programs were adapted for STAR and why these adaptations were necessary to obtain an efficient STAR program. Some points to consider when adapting an algorithm for STAR are discussed. Program listings of the 6000 version coded in 6000 FORTRAN, the adapted STAR version coded in 6000 FORTRAN, and the STAR version coded in STAR FORTRAN are presented in the appendices.
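For readers unfamiliar with the second algorithm, here is a minimal Gauss-Legendre quadrature in Python using the standard 3-point rule. It is a serial illustration only; it does not reproduce the report's FORTRAN listings or its STAR-specific vectorization, and the function name is an assumption.

```python
import math

# 3-point Gauss-Legendre rule on [-1, 1]; exact for polynomials up to degree 5.
NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def gauss_legendre(f, a, b):
    """Approximate the integral of f over [a, b] with the 3-point rule."""
    half = 0.5 * (b - a)      # scale factor from [-1, 1] to [a, b]
    mid = 0.5 * (a + b)       # midpoint shift
    return half * sum(w * f(mid + half * x) for x, w in zip(NODES, WEIGHTS))


quartic = gauss_legendre(lambda x: x ** 4, 0.0, 1.0)   # exact value is 1/5
sine = gauss_legendre(math.sin, 0.0, math.pi)          # exact value is 2
```

The rule reduces to a weighted sum of function values, that is, a dot product of a weight vector with a vector of samples, which is the kind of operation a vector machine can execute efficiently once the function evaluations themselves are expressed on whole vectors.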
NASA Technical Reports Server (NTRS)
Rausch, Russ D.; Batina, John T.; Yang, Henry T. Y.
1991-01-01
Spatial adaption procedures for the accurate and efficient solution of steady and unsteady inviscid flow problems are described. The adaption procedures were developed and implemented within a two-dimensional unstructured-grid upwind-type Euler code. These procedures involve mesh enrichment and mesh coarsening to either add points in a high gradient region of the flow or remove points where they are not needed, respectively, to produce solutions of high spatial accuracy at minimal computational costs. A detailed description is given of the enrichment and coarsening procedures and comparisons with alternative results and experimental data are presented to provide an assessment of the accuracy and efficiency of the capability. Steady and unsteady transonic results, obtained using spatial adaption for the NACA 0012 airfoil, are shown to be of high spatial accuracy, primarily in that the shock waves are very sharply captured. The results were obtained with a computational savings of a factor of approximately fifty-three for a steady case and as much as twenty-five for the unsteady cases.
NASA Technical Reports Server (NTRS)
Rausch, Russ D.; Yang, Henry T. Y.; Batina, John T.
1991-01-01
Spatial adaption procedures for the accurate and efficient solution of steady and unsteady inviscid flow problems are described. The adaption procedures were developed and implemented within a two-dimensional unstructured-grid upwind-type Euler code. These procedures involve mesh enrichment and mesh coarsening to either add points in high gradient regions of the flow or remove points where they are not needed, respectively, to produce solutions of high spatial accuracy at minimal computational cost. The paper gives a detailed description of the enrichment and coarsening procedures and presents comparisons with alternative results and experimental data to provide an assessment of the accuracy and efficiency of the capability. Steady and unsteady transonic results, obtained using spatial adaption for the NACA 0012 airfoil, are shown to be of high spatial accuracy, primarily in that the shock waves are very sharply captured. The results were obtained with a computational savings of a factor of approximately fifty-three for a steady case and as much as twenty-five for the unsteady cases.
A NEW METHOD FOR FINDING POINT SOURCES IN HIGH-ENERGY NEUTRINO DATA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fang, Ke; Miller, M. Coleman
The IceCube collaboration has reported the first detection of high-energy astrophysical neutrinos, including ∼50 high-energy starting events, but no individual sources have been identified. It is therefore important to develop the most sensitive and efficient possible algorithms to identify the point sources of these neutrinos. The most popular current method works by exploring a dense grid of possible directions to individual sources, and identifying the single direction with the maximum probability of having produced multiple detected neutrinos. This method has numerous strengths, but it is computationally intensive and because it focuses on the single best location for a point source, additional point sources are not included in the evidence. We propose a new maximum likelihood method that uses the angular separations between all pairs of neutrinos in the data. Unlike existing autocorrelation methods for this type of analysis, which also use angular separations between neutrino pairs, our method incorporates information about the point-spread function and can identify individual point sources. We find that if the angular resolution is a few degrees or better, then this approach reduces both false positive and false negative errors compared to the current method, and is also more computationally efficient up to, potentially, hundreds of thousands of detected neutrinos.
Y-MP floating point and Cholesky factorization
NASA Technical Reports Server (NTRS)
Carter, Russell
1991-01-01
The floating point arithmetics implemented in the Cray 2 and Cray Y-MP computer systems are nearly identical, but large scale computations performed on the two systems have exhibited significant differences in accuracy. The difference in accuracy is analyzed for the Cholesky factorization algorithm, and it is found that the source of the difference is the subtract magnitude operation of the Cray Y-MP. The results from numerical experiments for a range of problem sizes are presented, and an efficient method for improving the accuracy of the factorization obtained on the Y-MP is presented.
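As background for the abstract above, here is a plain textbook Cholesky factorization in Python. It is a sketch for orientation only: it does not reproduce the Cray arithmetic or the report's accuracy fix, and the function name `cholesky` is an assumption. The comments mark the subtractions, which are the kind of operation the abstract singles out as the source of the accuracy difference.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T = a, for a symmetric positive definite a."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: subtract the accumulated squares, then take the square root.
        s = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(s)
        for i in range(j + 1, n):
            # Off-diagonal entry: subtract the accumulated inner product, then scale.
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = s / l[j][j]
    return l
```

Every entry of L is formed by subtracting a running sum from a matrix element, so small rounding differences in subtraction can accumulate over large problems.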
On automating domain connectivity for overset grids
NASA Technical Reports Server (NTRS)
Chiu, Ing-Tsau
1994-01-01
An alternative method for domain connectivity among systems of overset grids is presented. Reference uniform Cartesian systems of points are used to achieve highly efficient domain connectivity, and form the basis for a future fully automated system. The Cartesian systems are used to approximate body surfaces and to map the computational space of component grids. By exploiting the characteristics of Cartesian Systems, Chimera type hole-cutting and identification of donor elements for intergrid boundary points can be carried out very efficiently. The method is tested for a range of geometrically complex multiple-body overset grid systems.
Computationally efficient control allocation
NASA Technical Reports Server (NTRS)
Durham, Wayne (Inventor)
2001-01-01
A computationally efficient method for calculating near-optimal solutions to the three-objective, linear control allocation problem is disclosed. The control allocation problem is that of distributing the effort of redundant control effectors to achieve some desired set of objectives. The problem is deemed linear if control effectiveness is affine with respect to the individual control effectors. The optimal solution is that which exploits the collective maximum capability of the effectors within their individual physical limits. Computational efficiency is measured by the number of floating-point operations required for solution. The method presented returned optimal solutions in more than 90% of the cases examined; non-optimal solutions returned by the method were typically much less than 1% different from optimal and the errors tended to become smaller than 0.01% as the number of controls was increased. The magnitude of the errors returned by the present method was much smaller than those that resulted from either pseudoinverse or cascaded generalized inverse solutions. The computational complexity of the method presented varied linearly with increasing numbers of controls; the number of required floating point operations increased from 5.5 to seven times faster than did the minimum-norm solution (the pseudoinverse), and at about the same rate as did the cascaded generalized inverse solution. The computational requirements of the method presented were much better than that of previously described facet-searching methods which increase in proportion to the square of the number of controls.
A GPU-accelerated implicit meshless method for compressible flows
NASA Astrophysics Data System (ADS)
Zhang, Jia-Le; Ma, Zhi-Hua; Chen, Hong-Quan; Cao, Cheng
2018-05-01
This paper develops a recently proposed GPU based two-dimensional explicit meshless method (Ma et al., 2014) by devising and implementing an efficient parallel LU-SGS implicit algorithm to further improve the computational efficiency. The capability of the original 2D meshless code is extended to deal with 3D complex compressible flow problems. To resolve the inherent data dependency of the standard LU-SGS method, which causes thread-racing conditions destabilizing numerical computation, a generic rainbow coloring method is presented and applied to organize the computational points into different groups by painting neighboring points with different colors. The original LU-SGS method is modified and parallelized accordingly to perform calculations in a color-by-color manner. The CUDA Fortran programming model is employed to develop the key kernel functions to apply boundary conditions, calculate time steps, evaluate residuals as well as advance and update the solution in the temporal space. A series of two- and three-dimensional test cases including compressible flows over single- and multi-element airfoils and an M6 wing are carried out to verify the developed code. The obtained solutions agree well with experimental data and other computational results reported in the literature. Detailed analysis on the performance of the developed code reveals that the developed CPU based implicit meshless method is at least four to eight times faster than its explicit counterpart. The computational efficiency of the implicit method could be further improved by ten to fifteen times on the GPU.
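The idea of painting neighboring points with different colors can be illustrated with a generic greedy graph coloring. The sketch below is an assumption about the general technique, not the paper's rainbow coloring algorithm or its CUDA Fortran kernels; `greedy_coloring`, `group_by_color`, and the dictionary-based neighbor lists are hypothetical names and structures.

```python
def greedy_coloring(neighbors):
    """Assign each point the smallest color index not used by an already colored neighbor.

    neighbors: dict mapping a point to an iterable of its neighboring points.
    Returns a dict mapping each point to its color index.
    """
    color = {}
    for p in neighbors:
        used = {color[q] for q in neighbors[p] if q in color}
        c = 0
        while c in used:
            c += 1
        color[p] = c
    return color


def group_by_color(color):
    """Collect points that share a color; each group holds no two neighboring points."""
    groups = {}
    for p, c in color.items():
        groups.setdefault(c, []).append(p)
    return groups
```

Because no two points in a group are neighbors, all points of one color can be updated concurrently without one thread reading a value that another is writing, and the groups are then processed one after another.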
NASA Astrophysics Data System (ADS)
Mo, S.; Lu, D.; Shi, X.; Zhang, G.; Ye, M.; Wu, J.
2016-12-01
Surrogate models have shown remarkable computational efficiency in hydrological simulations involving design space exploration, sensitivity analysis, uncertainty quantification, etc. The central task of constructing a global surrogate model is to achieve a prescribed approximation accuracy with as few original model executions as possible, which requires a good design strategy to optimize the distribution of data points in the parameter domains and an effective stopping criterion to automatically terminate the design process when the desired approximation accuracy is achieved. This study proposes a novel adaptive sampling strategy, which starts from a small number of initial samples and adaptively selects additional samples by balancing the collection in unexplored regions and refinement in interesting areas. We define an efficient and effective evaluation metric based on Taylor expansion to select the most promising potential samples from candidate points, and propose a robust stopping criterion based on the approximation accuracy at new points to guarantee the achievement of the desired accuracy. The numerical results of several benchmark analytical functions indicate that the proposed approach is more computationally efficient and robust than the widely used maximin distance design and two other well-known adaptive sampling strategies. The application to two complicated multiphase flow problems further demonstrates the efficiency and effectiveness of our method in constructing global surrogate models for high-dimensional and highly nonlinear problems. Acknowledgements: This work was financially supported by the National Nature Science Foundation of China grants No. 41030746 and 41172206.
Stream Kriging: Incremental and recursive ordinary Kriging over spatiotemporal data streams
NASA Astrophysics Data System (ADS)
Zhong, Xu; Kealy, Allison; Duckham, Matt
2016-05-01
Ordinary Kriging is widely used for geospatial interpolation and estimation. Due to the O(n^3) time complexity of solving the system of linear equations, ordinary Kriging for a large set of source points is computationally intensive. Conducting real-time Kriging interpolation over continuously varying spatiotemporal data streams can therefore be especially challenging. This paper develops and tests two new strategies for improving the performance of an ordinary Kriging interpolator adapted to a stream-processing environment. These strategies rely on the expectation that, over time, source data points will frequently refer to the same spatial locations (for example, where static sensor nodes are generating repeated observations of a dynamic field). First, an incremental strategy improves efficiency in cases where a relatively small proportion of previously processed spatial locations are absent from the source points at any given iteration. Second, a recursive strategy improves efficiency in cases where there is substantial set overlap between the sets of spatial locations of source points at the current and previous iterations. These two strategies are evaluated in terms of their computational efficiency in comparison to the ordinary Kriging algorithm. The results show that these two strategies can reduce the time taken to perform the interpolation by up to 90%, and approach average-case time complexity of O(n^2) when most but not all source points refer to the same locations over time. By combining the approaches developed in this paper with existing heuristic ordinary Kriging algorithms, the conclusions indicate how further efficiency gains could potentially be accrued. The work ultimately contributes to the development of online ordinary Kriging interpolation algorithms, capable of real-time spatial interpolation with large streaming data sets.
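To show where the cubic cost comes from, the following sketch sets up and solves the baseline ordinary Kriging system from scratch for a single target location. It is not the incremental or recursive strategy of the paper. The exponential covariance model, its `length` parameter, and the function names are assumptions chosen only to keep the example self-contained.

```python
import math


def covariance(p, q, length=1.0):
    # Assumed exponential covariance model; any valid model could be substituted.
    return math.exp(-math.dist(p, q) / length)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (the O(n^3) step)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ordinary_kriging(points, values, target):
    """Return (estimate, weights) at target from the source points and their values."""
    n = len(points)
    # Covariance matrix bordered by the unbiasedness constraint (weights sum to one).
    a = [[covariance(points[i], points[j]) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [covariance(p, target) for p in points] + [1.0]
    weights = solve(a, b)[:n]  # the last unknown is the Lagrange multiplier
    return sum(w * v for w, v in zip(weights, values)), weights
```

The bordered matrix depends only on the source locations, not on the observed values or the target, so when locations repeat between iterations much of this work could in principle be reused rather than recomputed.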
Two-Dimensional Grids About Airfoils and Other Shapes
NASA Technical Reports Server (NTRS)
Sorenson, R.
1982-01-01
GRAPE computer program generates two-dimensional finite-difference grids about airfoils and other shapes by use of Poisson differential equation. GRAPE can be used with any boundary shape, even one specified by tabulated points and including limited number of sharp corners. Numerically stable and computationally fast, GRAPE provides aerodynamic analyst with efficient and consistent means of grid generation.
Continuously Deformation Monitoring of Subway Tunnel Based on Terrestrial Point Clouds
NASA Astrophysics Data System (ADS)
Kang, Z.; Tuo, L.; Zlatanova, S.
2012-07-01
The deformation monitoring of subway tunnel is of extraordinary necessity. Therefore, a method for deformation monitoring based on terrestrial point clouds is proposed in this paper. First, the traditional adjacent stations registration is replaced by section-controlled registration, so that the common control points can be used by each station and thus the error accumulation avoided within a section. Afterwards, the central axis of the subway tunnel is determined through RANSAC (Random Sample Consensus) algorithm and curve fitting. Although with very high resolution, laser points are still discrete and thus the vertical section is computed via the quadric fitting of the vicinity of interest, instead of the fitting of the whole model of a subway tunnel, which is determined by the intersection line rotated about the central axis of tunnel within a vertical plane. The extraction of the vertical section is then optimized using RANSAC for the purpose of filtering out noises. Based on the extracted vertical sections, the volume of tunnel deformation is estimated by the comparison between vertical sections extracted at the same position from different epochs of point clouds. Furthermore, the continuously extracted vertical sections are deployed to evaluate the convergent tendency of the tunnel. The proposed algorithms are verified using real datasets in terms of accuracy and computation efficiency. The experimental result of fitting accuracy analysis shows the maximum deviation between interpolated point and real point is 1.5 mm, and the minimum one is 0.1 mm; the convergent tendency of the tunnel was detected by the comparison of adjacent fitting radius. The maximum error is 6 mm, while the minimum one is 1 mm. The computation cost of vertical section abstraction is within 3 seconds/section, which proves high efficiency.
Influence matrix program for aerodynamic lifting surface theory. [in subsonic flows
NASA Technical Reports Server (NTRS)
Medan, R. T.; Ray, K. S.
1973-01-01
A users manual is described for a USA FORTRAN 4 computer program which computes an aerodynamic influence matrix and is one of several computer programs used to analyze lifting, thin wings in steady, subsonic flow according to a kernel function method lifting surface theory. The most significant features of the program are that it can treat unsymmetrical wings, control points can be placed on the leading and/or trailing edges, and a stable, efficient algorithm is used to compute the influence matrix.
Comparison of the different approaches to generate holograms from data acquired with a Kinect sensor
NASA Astrophysics Data System (ADS)
Kang, Ji-Hoon; Leportier, Thibault; Ju, Byeong-Kwon; Song, Jin Dong; Lee, Kwang-Hoon; Park, Min-Chul
2017-05-01
Data of real scenes acquired in real-time with a Kinect sensor can be processed with different approaches to generate a hologram. 3D models can be generated from a point cloud or a mesh representation. The advantage of the point cloud approach is that the computation process is well established since it involves only diffraction and propagation of point sources between parallel planes. On the other hand, the mesh representation makes it possible to reduce the number of elements necessary to represent the object. Then, even though the computation time for the contribution of a single element increases compared to a simple point, the total computation time can be reduced significantly. However, the algorithm is more complex since propagation of elemental polygons between non-parallel planes should be implemented. Finally, since a depth map of the scene is acquired at the same time as the intensity image, a depth layer approach can also be adopted. This technique is appropriate for a fast computation since propagation of an optical wavefront from one plane to another can be handled efficiently with the fast Fourier transform. Fast computation with the depth layer approach is convenient for real time applications, but the point cloud method is more appropriate when high resolution is needed. In this study, since Kinect can be used to obtain both point cloud and depth map, we examine the different approaches that can be adopted for hologram computation and compare their performance.
NASA Astrophysics Data System (ADS)
Wei, Hui; Gong, Guanghong; Li, Ni
2017-10-01
Computer-generated hologram (CGH) is a promising 3D display technology while it is challenged by heavy computation load and vast memory requirement. To solve these problems, a depth compensating CGH calculation method based on symmetry and similarity of zone plates is proposed and implemented on graphics processing unit (GPU). An improved LUT method is put forward to compute the distances between object points and hologram pixels in the XY direction. The concept of depth compensating factor is defined and used for calculating the holograms of points with different depth positions instead of layer-based methods. The proposed method is suitable for arbitrary sampling objects with lower memory usage and higher computational efficiency compared to other CGH methods. The effectiveness of the proposed method is validated by numerical and optical experiments.
Learning to assign binary weights to binary descriptor
NASA Astrophysics Data System (ADS)
Huang, Zhoudi; Wei, Zhenzhong; Zhang, Guangjun
2016-10-01
Constructing robust binary local feature descriptors is receiving increasing interest due to their binary nature, which can enable fast processing while requiring significantly less memory than their floating-point competitors. To bridge the performance gap between the binary and floating-point descriptors without increasing the computational cost of computing and matching, optimal binary weights are learned and assigned to the binary descriptor, considering that each bit might contribute differently to the distinctiveness and robustness. Technically, a large-scale regularized optimization method is applied to learn float weights for each bit of the binary descriptor. Furthermore, binary approximation for the float weights is performed by utilizing an efficient alternatively greedy strategy, which can significantly improve the discriminative power while preserving the fast matching advantage. Extensive experimental results on two challenging datasets (Brown dataset and Oxford dataset) demonstrate the effectiveness and efficiency of the proposed method.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Krityakierne, Tipaluck; Akhtar, Taimoor; Shoemaker, Christine A.
This paper presents a parallel surrogate-based global optimization method for computationally expensive objective functions that is more effective for larger numbers of processors. To reach this goal, we integrated concepts from multi-objective optimization and tabu search into, single objective, surrogate optimization. Our proposed derivative-free algorithm, called SOP, uses non-dominated sorting of points for which the expensive function has been previously evaluated. The two objectives are the expensive function value of the point and the minimum distance of the point to previously evaluated points. Based on the results of non-dominated sorting, P points from the sorted fronts are selected as centers from which many candidate points are generated by random perturbations. Based on surrogate approximation, the best candidate point is subsequently selected for expensive evaluation for each of the P centers, with simultaneous computation on P processors. Centers that previously did not generate good solutions are tabu with a given tenure. We show almost sure convergence of this algorithm under some conditions. The performance of SOP is compared with two RBF based methods. The test results show that SOP is an efficient method that can reduce time required to find a good near optimal solution. In a number of cases the efficiency of SOP is so good that SOP with 8 processors found an accurate answer in less wall-clock time than the other algorithms did with 32 processors.
Computational methods for efficient structural reliability and reliability sensitivity analysis
NASA Technical Reports Server (NTRS)
Wu, Y.-T.
1993-01-01
This paper presents recent developments in efficient structural reliability analysis methods. The paper proposes an efficient, adaptive importance sampling (AIS) method that can be used to compute reliability and reliability sensitivities. The AIS approach uses a sampling density that is proportional to the joint PDF of the random variables. Starting from an initial approximate failure domain, sampling proceeds adaptively and incrementally with the goal of reaching a sampling domain that is slightly greater than the failure domain to minimize over-sampling in the safe region. Several reliability sensitivity coefficients are proposed that can be computed directly and easily from the above AIS-based failure points. These probability sensitivities can be used for identifying key random variables and for adjusting design to achieve reliability-based objectives. The proposed AIS methodology is demonstrated using a turbine blade reliability analysis problem.
NASA Technical Reports Server (NTRS)
Chima, R. V.; Strazisar, A. J.
1982-01-01
Two and three dimensional inviscid solutions for the flow in a transonic axial compressor rotor at design speed are compared with probe and laser anemometer measurements at near-stall and maximum-flow operating points. Experimental details of the laser anemometer system and computational details of the two dimensional axisymmetric code and three dimensional Euler code are described. Comparisons are made between relative Mach number and flow angle contours, shock location, and shock strength. A procedure for using an efficient axisymmetric code to generate downstream pressure input for computationally expensive Euler codes is discussed. A film supplement shows the calculations of the two operating points with the time-marching Euler code.
Precise and efficient evaluation of gravimetric quantities at arbitrarily scattered points in space
NASA Astrophysics Data System (ADS)
Ivanov, Kamen G.; Pavlis, Nikolaos K.; Petrushev, Pencho
2017-12-01
Gravimetric quantities are commonly represented in terms of high degree surface or solid spherical harmonics. After EGM2008, such expansions routinely extend to spherical harmonic degree 2190, which makes the computation of gravimetric quantities at a large number of arbitrarily scattered points in space using harmonic synthesis, a very computationally demanding process. We present here the development of an algorithm and its associated software for the efficient and precise evaluation of gravimetric quantities, represented in high degree solid spherical harmonics, at arbitrarily scattered points in the space exterior to the surface of the Earth. The new algorithm is based on representation of the quantities of interest in solid ellipsoidal harmonics and application of the tensor product trigonometric needlets. A FORTRAN implementation of this algorithm has been developed and extensively tested. The capabilities of the code are demonstrated using as examples the disturbing potential T, height anomaly ζ , gravity anomaly Δ g , gravity disturbance δ g , north-south deflection of the vertical ξ , east-west deflection of the vertical η , and the second radial derivative T_{rr} of the disturbing potential. After a pre-computational step that takes between 1 and 2 h per quantity, the current version of the software is capable of computing on a standard PC each of these quantities in the range from the surface of the Earth up to 544 km above that surface at speeds between 20,000 and 40,000 point evaluations per second, depending on the gravimetric quantity being evaluated, while the relative error does not exceed 10^{-6} and the memory (RAM) use is 9.3 GB.
Inversion Of Jacobian Matrix For Robot Manipulators
NASA Technical Reports Server (NTRS)
Fijany, Amir; Bejczy, Antal K.
1989-01-01
Report discusses inversion of Jacobian matrix for class of six-degree-of-freedom arms with spherical wrist, i.e., with last three joints intersecting. Shows by taking advantage of simple geometry of such arms, closed-form solution of Q = J^{-1}X, which represents linear transformation from task space to joint space, obtained efficiently. Presents solutions for PUMA arm, JPL/Stanford arm, and six-revolute-joint coplanar arm along with all singular points. Main contribution of paper shows simple geometry of this type of arms exploited in performing inverse transformation without any need to compute Jacobian or its inverse explicitly. Implication of this computational efficiency advanced task-space control schemes for spherical-wrist arms implemented more efficiently.
Compression of 3D Point Clouds Using a Region-Adaptive Hierarchical Transform.
De Queiroz, Ricardo; Chou, Philip A
2016-06-01
In free-viewpoint video, there is a recent trend to represent scene objects as solids rather than using multiple depth maps. Point clouds have been used in computer graphics for a long time and with the recent possibility of real time capturing and rendering, point clouds have been favored over meshes in order to save computation. Each point in the cloud is associated with its 3D position and its color. We devise a method to compress the colors in point clouds which is based on a hierarchical transform and arithmetic coding. The transform is a hierarchical sub-band transform that resembles an adaptive variation of a Haar wavelet. The arithmetic encoding of the coefficients assumes Laplace distributions, one per sub-band. The Laplace parameter for each distribution is transmitted to the decoder using a custom method. The geometry of the point cloud is encoded using the well-established octree scanning. Results show that the proposed solution performs comparably to the current state-of-the-art, in many occasions outperforming it, while being much more computationally efficient. We believe this work represents the state-of-the-art in intra-frame compression of point clouds for real-time 3D video.
Change Detection of Mobile LIDAR Data Using Cloud Computing
NASA Astrophysics Data System (ADS)
Liu, Kun; Boehm, Jan; Alis, Christian
2016-06-01
Change detection has long been a challenging problem although a lot of research has been conducted in different fields such as remote sensing and photogrammetry, computer vision, and robotics. In this paper, we blend voxel grid and Apache Spark together to propose an efficient method to address the problem in the context of big data. Voxel grid is a regular geometry representation consisting of voxels of the same size, which suits parallel computation well. Apache Spark is a popular distributed parallel computing platform which allows fault tolerance and memory cache. These features can significantly enhance the performance of Apache Spark and result in an efficient and robust implementation. In our experiments, both synthetic and real point cloud data are employed to demonstrate the quality of our method.
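The voxel-grid side of this idea can be sketched on a single machine in a few lines. The code below is a simplified assumption about occupancy-based change detection, not the paper's method and not Spark code; `voxelize`, `detect_changes`, and the `size` parameter are hypothetical names.

```python
import math


def voxelize(points, size):
    """Map 3D points to the set of integer voxel indices they occupy."""
    return {tuple(math.floor(c / size) for c in p) for p in points}


def detect_changes(before, after, size=1.0):
    """Return (appeared, disappeared) voxels between two point-cloud epochs."""
    old, new = voxelize(before, size), voxelize(after, size)
    return new - old, old - new
```

Each point is assigned to its voxel independently of every other point, which is why a regular grid of equal-sized voxels partitions naturally across workers on a distributed platform.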
Algorithms for Efficient Computation of Transfer Functions for Large Order Flexible Systems
NASA Technical Reports Server (NTRS)
Maghami, Peiman G.; Giesy, Daniel P.
1998-01-01
An efficient and robust computational scheme is given for the calculation of the frequency response function of a large order, flexible system implemented with a linear, time invariant control system. Advantage is taken of the highly structured sparsity of the system matrix of the plant based on a model of the structure using normal mode coordinates. The computational time per frequency point of the new computational scheme is a linear function of system size, a significant improvement over traditional, full-matrix techniques whose computational times per frequency point range from quadratic to cubic functions of system size. This permits the practical frequency domain analysis of systems of much larger order than by traditional, full-matrix techniques. Formulations are given for both open- and closed-loop systems. Numerical examples are presented showing the advantages of the present formulation over traditional approaches, both in speed and in accuracy. Using a model with 703 structural modes, the present method was up to two orders of magnitude faster than a traditional method. The present method generally showed good to excellent accuracy throughout the range of test frequencies, while traditional methods gave adequate accuracy for lower frequencies, but generally deteriorated in performance at higher frequencies with worst case errors being many orders of magnitude times the correct values.
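Why normal mode coordinates give linear cost per frequency point can be seen in the simplest open-loop case, where each mode is an independent second-order system. The sketch below assumes a single input, a single displacement output, and modal damping; it is an illustration of that textbook case, not the paper's formulation (which also covers closed-loop systems), and `frequency_response` and the tuple layout are hypothetical.

```python
def frequency_response(omega, modes):
    """Open-loop response at angular frequency omega for a modal model.

    modes: iterable of (omega_k, zeta_k, b_k, c_k), i.e. natural frequency,
    damping ratio, input gain, and output gain of each mode.
    """
    s = 1j * omega
    total = 0j
    for omega_k, zeta_k, b_k, c_k in modes:
        # Each mode contributes one scalar term, so no matrix inversion is needed.
        total += c_k * b_k / (s * s + 2.0 * zeta_k * omega_k * s + omega_k ** 2)
    return total
```

The loop performs a fixed amount of work per mode, so the time per frequency point grows linearly with the number of modes, whereas inverting a full matrix at each frequency would not.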
Optimization of Angular-Momentum Biases of Reaction Wheels
NASA Technical Reports Server (NTRS)
Lee, Clifford; Lee, Allan
2008-01-01
RBOT [RWA Bias Optimization Tool (wherein RWA signifies Reaction Wheel Assembly)] is a computer program designed for computing angular momentum biases for reaction wheels used for providing spacecraft pointing in various directions as required for scientific observations. RBOT is currently deployed to support the Cassini mission to prevent operation of reaction wheels at unsafely high speeds while minimizing time in undesirable low-speed range, where elasto-hydrodynamic lubrication films in bearings become ineffective, leading to premature bearing failure. The problem is formulated as a constrained optimization problem in which maximum wheel speed limit is a hard constraint and a cost functional that increases as speed decreases below a low-speed threshold. The optimization problem is solved using a parametric search routine known as the Nelder-Mead simplex algorithm. To increase computational efficiency for extended operation involving large quantity of data, the algorithm is designed to (1) use large time increments during intervals when spacecraft attitudes or rates of rotation are nearly stationary, (2) use sinusoidal-approximation sampling to model repeated long periods of Earth-point rolling maneuvers to reduce computational loads, and (3) utilize an efficient equation to obtain wheel-rate profiles as functions of initial wheel biases based on conservation of angular momentum (in an inertial frame) using pre-computed terms.
Parallel and pipeline computation of fast unitary transforms
NASA Technical Reports Server (NTRS)
Fino, B. J.; Algazi, V. R.
1975-01-01
The letter discusses the parallel and pipeline organization of fast-unitary-transform algorithms such as the fast Fourier transform, and points out the efficiency of a combined parallel-pipeline processor of a transform such as the Haar transform, in which (2 to the n-th power) -1 hardware 'butterflies' generate a transform of order 2 to the n-th power every computation cycle.
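The butterfly count quoted above can be checked with a serial sketch of the fast Haar transform. This is a software illustration only, not the letter's hardware organization; the orthonormal scaling and the name `fast_haar` are assumptions.

```python
import math


def fast_haar(x):
    """Orthonormal Haar transform of a sequence whose length is a power of two.

    Returns (coefficients, number_of_butterflies).
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = [float(v) for v in x]
    r = 1.0 / math.sqrt(2.0)
    butterflies = 0
    length = n
    while length > 1:
        half = length // 2
        sums, diffs = [0.0] * half, [0.0] * half
        for i in range(half):
            a, b = out[2 * i], out[2 * i + 1]
            # One butterfly: a scaled sum and a scaled difference of a pair.
            sums[i] = (a + b) * r
            diffs[i] = (a - b) * r
            butterflies += 1
        # Only the sums are processed further; the differences are final outputs.
        out[:half] = sums
        out[half:length] = diffs
        length = half
    return out, butterflies
```

Each stage halves the number of values still being processed, so an order-2^n transform uses 2^(n-1) + ... + 2 + 1 = 2^n - 1 butterflies in total, matching the count in the abstract.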
ERIC Educational Resources Information Center
Shih, Ching-Hsiang
2013-01-01
This study provided that people with multiple disabilities can have a collaborative working chance in computer operations through an Enhanced Multiple Cursor Dynamic Pointing Assistive Program (EMCDPAP, a new kind of software that replaces the standard mouse driver, changes a mouse wheel into a thumb/finger poke detector, and manages mouse…
ERIC Educational Resources Information Center
Shih, Ching-Hsiang; Shih, Ching-Tien; Peng, Chin-Ling
2011-01-01
This study evaluated whether two people with multiple disabilities would be able to improve their pointing performance through an Automatic Target Acquisition Program (ATAP) and a newly developed mouse driver (i.e. a new mouse driver replaces standard mouse driver, and is able to monitor mouse movement and intercept click action). Initially, both…
NASA Astrophysics Data System (ADS)
Shayanfar, Mohsen Ali; Barkhordari, Mohammad Ali; Roudak, Mohammad Amin
2017-06-01
Monte Carlo simulation (MCS) is a useful tool for computation of probability of failure in reliability analysis. However, the large number of required random samples makes it time-consuming. Response surface method (RSM) is another common method in reliability analysis. Although RSM is widely used for its simplicity, it cannot be trusted in highly nonlinear problems due to its linear nature. In this paper, a new efficient algorithm, employing the combination of importance sampling, as a class of MCS, and RSM is proposed. In the proposed algorithm, analysis starts with importance sampling concepts and using a represented two-step updating rule of design point. This part finishes after a small number of samples are generated. Then RSM starts to work using Bucher experimental design, with the last design point and a represented effective length as the center point and radius of Bucher's approach, respectively. Through illustrative numerical examples, simplicity and efficiency of the proposed algorithm and the effectiveness of the represented rules are shown.
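The motivation for importance sampling can be illustrated with a one-dimensional toy problem. The sketch below is not the authors' algorithm (it has no design-point updating rule and no response surface). It assumes a standard normal variable Z, a hypothetical limit state g(z) = beta - z with failure when g < 0, and a sampling density centred on the design point z = beta.

```python
import math
import random


def failure_probability_mcs(beta, n, rng):
    """Crude Monte Carlo: fraction of standard normal samples with z > beta."""
    hits = sum(1 for _ in range(n) if rng.gauss(0.0, 1.0) > beta)
    return hits / n


def failure_probability_is(beta, n, rng):
    """Importance sampling with the sampling density N(beta, 1).

    Each failing sample is weighted by phi(z) / phi(z - beta),
    which simplifies to exp(-beta * z + beta**2 / 2).
    """
    total = 0.0
    for _ in range(n):
        z = rng.gauss(beta, 1.0)
        if z > beta:
            total += math.exp(-beta * z + 0.5 * beta * beta)
    return total / n


beta = 3.0
exact = 0.5 * math.erfc(beta / math.sqrt(2.0))  # 1 - Phi(beta)
rng = random.Random(0)
est_mcs = failure_probability_mcs(beta, 20000, rng)
est_is = failure_probability_is(beta, 20000, rng)
print(exact, est_mcs, est_is)
```

With the same number of samples, roughly half of the importance-sampling draws fall in the failure region, whereas crude Monte Carlo sees only a few dozen failures, which is why the weighted estimate is usually much closer to the exact value.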
Tensor Factorization for Low-Rank Tensor Completion.
Zhou, Pan; Lu, Canyi; Lin, Zhouchen; Zhang, Chao
2018-03-01
Recently, a tensor nuclear norm (TNN) based method was proposed to solve the tensor completion problem, which has achieved state-of-the-art performance on image and video inpainting tasks. However, it requires computing tensor singular value decomposition (t-SVD), which costs much computation and thus cannot efficiently handle tensor data, due to its natural large scale. Motivated by TNN, we propose a novel low-rank tensor factorization method for efficiently solving the 3-way tensor completion problem. Our method preserves the low-rank structure of a tensor by factorizing it into the product of two tensors of smaller sizes. In the optimization process, our method only needs to update two smaller tensors, which can be more efficiently conducted than computing t-SVD. Furthermore, we prove that the proposed alternating minimization algorithm can converge to a Karush-Kuhn-Tucker point. Experimental results on the synthetic data recovery, image and video inpainting tasks clearly demonstrate the superior performance and efficiency of our developed method over state-of-the-arts including the TNN and matricization methods.
Genomic cloud computing: legal and ethical points to consider
Dove, Edward S; Joly, Yann; Tassé, Anne-Marie; Burton, Paul; Chisholm, Rex; Fortier, Isabel; Goodwin, Pat; Harris, Jennifer; Hveem, Kristian; Kaye, Jane; Kent, Alistair; Knoppers, Bartha Maria; Lindpaintner, Klaus; Little, Julian; Riegman, Peter; Ripatti, Samuli; Stolk, Ronald; Bobrow, Martin; Cambon-Thomsen, Anne; Dressler, Lynn; Joly, Yann; Kato, Kazuto; Knoppers, Bartha Maria; Rodriguez, Laura Lyman; McPherson, Treasa; Nicolás, Pilar; Ouellette, Francis; Romeo-Casabona, Carlos; Sarin, Rajiv; Wallace, Susan; Wiesner, Georgia; Wilson, Julia; Zeps, Nikolajs; Simkevitz, Howard; De Rienzo, Assunta; Knoppers, Bartha M
2015-01-01
The biggest challenge in twenty-first century data-intensive genomic science is developing vast computer infrastructure and advanced software tools to perform comprehensive analyses of genomic data sets for biomedical research and clinical practice. Researchers are increasingly turning to cloud computing both as a solution to integrate data from genomics, systems biology and biomedical data mining and as an approach to analyze data to solve biomedical problems. Although cloud computing provides several benefits such as lower costs and greater efficiency, it also raises legal and ethical issues. In this article, we discuss three key 'points to consider' (data control; data security, confidentiality and transfer; and accountability) based on a preliminary review of several publicly available cloud service providers' Terms of Service. These 'points to consider' should be borne in mind by genomic research organizations when negotiating legal arrangements to store genomic data on a large commercial cloud service provider's servers. Diligent genomic cloud computing means leveraging security standards and evaluation processes as a means to protect data and entails many of the same good practices that researchers should always consider in securing their local infrastructure. PMID:25248396
Genomic cloud computing: legal and ethical points to consider.
Dove, Edward S; Joly, Yann; Tassé, Anne-Marie; Knoppers, Bartha M
2015-10-01
The biggest challenge in twenty-first century data-intensive genomic science is developing vast computer infrastructure and advanced software tools to perform comprehensive analyses of genomic data sets for biomedical research and clinical practice. Researchers are increasingly turning to cloud computing both as a solution to integrate data from genomics, systems biology and biomedical data mining and as an approach to analyze data to solve biomedical problems. Although cloud computing provides several benefits such as lower costs and greater efficiency, it also raises legal and ethical issues. In this article, we discuss three key 'points to consider' (data control; data security, confidentiality and transfer; and accountability) based on a preliminary review of several publicly available cloud service providers' Terms of Service. These 'points to consider' should be borne in mind by genomic research organizations when negotiating legal arrangements to store genomic data on a large commercial cloud service provider's servers. Diligent genomic cloud computing means leveraging security standards and evaluation processes as a means to protect data and entails many of the same good practices that researchers should always consider in securing their local infrastructure.
NASA Astrophysics Data System (ADS)
Wright, David; Thyer, Mark; Westra, Seth
2015-04-01
Highly influential data points are those that have a disproportionately large impact on model performance, parameters and predictions. However, in current hydrological modelling practice the relative influence of individual data points on hydrological model calibration is not commonly evaluated. This presentation illustrates and evaluates several influence diagnostics tools that hydrological modellers can use to assess the relative influence of data. The feasibility and importance of including influence detection diagnostics as a standard tool in hydrological model calibration is discussed. Two classes of influence diagnostics are evaluated: (1) computationally demanding numerical "case deletion" diagnostics; and (2) computationally efficient analytical diagnostics, based on Cook's distance. These diagnostics are compared against hydrologically orientated diagnostics that describe changes in the model parameters (measured through the Mahalanobis distance), performance (objective function displacement) and predictions (mean and maximum streamflow). These influence diagnostics are applied to two case studies: a stage/discharge rating curve model, and a conceptual rainfall-runoff model (GR4J). Removing a single data point from the calibration resulted in differences to mean flow predictions of up to 6% for the rating curve model, and differences to mean and maximum flow predictions of up to 10% and 17%, respectively, for the hydrological model. When using the Nash-Sutcliffe efficiency in calibration, the computationally cheaper Cook's distance metrics produce similar results to the case-deletion metrics at a fraction of the computational cost. However, Cook's distance is adapted from linear regression with inherent assumptions on the data and is therefore less flexible than case deletion. Influential point detection diagnostics show great potential to improve current hydrological modelling practices by identifying highly influential data points.
The findings of this study establish the feasibility and importance of including influential point detection diagnostics as a standard tool in hydrological model calibration. They provide the hydrologist with important information on whether model calibration is susceptible to a small number of highly influential data points. This enables the hydrologist to make a more informed decision on whether to (1) remove/retain the calibration data; (2) adjust the calibration strategy and/or hydrological model to reduce the susceptibility of model predictions to a small number of influential observations.
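The two classes of diagnostics can be compared on a toy straight-line fit. The sketch below is an illustration under stated assumptions, not the study's workflow: it uses ordinary least squares on made-up data rather than a rating curve or GR4J model, and the function names are hypothetical. For a linear model the analytical Cook's distance and the refit-based case-deletion value coincide exactly, whereas for nonlinear models they need not.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    return ybar - b * xbar, b


def residual_variance(xs, ys, a, b, p=2):
    """Unbiased residual variance of the full fit with p parameters."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / (len(xs) - p)


def cooks_analytical(xs, ys):
    """Cook's distance from residuals and leverages, without refitting."""
    n, p = len(xs), 2
    a, b = fit_line(xs, ys)
    s2 = residual_variance(xs, ys, a, b, p)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    out = []
    for x, y in zip(xs, ys):
        e = y - (a + b * x)
        h = 1.0 / n + (x - xbar) ** 2 / sxx  # leverage of this point
        out.append(e * e / (p * s2) * h / (1.0 - h) ** 2)
    return out


def cooks_case_deletion(xs, ys):
    """Same quantity obtained by refitting with each point left out."""
    p = 2
    a, b = fit_line(xs, ys)
    s2 = residual_variance(xs, ys, a, b, p)
    out = []
    for i in range(len(xs)):
        a_i, b_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        shift = sum(((a + b * x) - (a_i + b_i * x)) ** 2 for x in xs)
        out.append(shift / (p * s2))
    return out


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 9.0]  # made-up data; the last point is unusual
d_analytical = cooks_analytical(xs, ys)
d_deletion = cooks_case_deletion(xs, ys)
most_influential = max(range(len(xs)), key=lambda i: d_analytical[i])
print(most_influential, d_analytical, d_deletion)
```

The analytical version needs one fit, while case deletion needs one refit per data point, which is the cost difference the abstract refers to.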
A computational framework for automation of point defect calculations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Goyal, Anuj; Gorai, Prashun; Peng, Haowei
We have developed a complete and rigorously validated open-source Python framework to automate point defect calculations using density functional theory. Furthermore, the framework provides an effective and efficient method for defect structure generation, and creation of simple yet customizable workflows to analyze defect calculations. This package provides the capability to compute widely-accepted correction schemes to overcome finite-size effects, including (1) potential alignment, (2) image-charge correction, and (3) band filling correction to shallow defects. Using Si, ZnO and In2O3 as test examples, we demonstrate the package capabilities and validate the methodology.
Baryonic and mesonic 3-point functions with open spin indices
NASA Astrophysics Data System (ADS)
Bali, Gunnar S.; Collins, Sara; Gläßle, Benjamin; Heybrock, Simon; Korcyl, Piotr; Löffler, Marius; Rödl, Rudolf; Schäfer, Andreas
2018-03-01
We have implemented a new way of computing three-point correlation functions. It is based on a factorization of the entire correlation function into two parts which are evaluated with open spin-(and to some extent flavor-) indices. This allows us to estimate the two contributions simultaneously for many different initial and final states and momenta, with little computational overhead. We explain this factorization as well as its efficient implementation in a new library which has been written to provide the necessary functionality on modern parallel architectures and on CPUs, including Intel's Xeon Phi series.
A computational framework for automation of point defect calculations
Goyal, Anuj; Gorai, Prashun; Peng, Haowei; ...
2017-01-13
We have developed a complete and rigorously validated open-source Python framework to automate point defect calculations using density functional theory. Furthermore, the framework provides an effective and efficient method for defect structure generation, and creation of simple yet customizable workflows to analyze defect calculations. This package provides the capability to compute widely-accepted correction schemes to overcome finite-size effects, including (1) potential alignment, (2) image-charge correction, and (3) band filling correction to shallow defects. Using Si, ZnO and In2O3 as test examples, we demonstrate the package capabilities and validate the methodology.
A hardware-oriented concurrent TZ search algorithm for High-Efficiency Video Coding
NASA Astrophysics Data System (ADS)
Doan, Nghia; Kim, Tae Sung; Rhee, Chae Eun; Lee, Hyuk-Jae
2017-12-01
High-Efficiency Video Coding (HEVC) is the latest video coding standard, in which the compression performance is double that of its predecessor, the H.264/AVC standard, while the video quality remains unchanged. In HEVC, the test zone (TZ) search algorithm is widely used for integer motion estimation because it effectively searches the good-quality motion vector with a relatively small amount of computation. However, the complex computation structure of the TZ search algorithm makes it difficult to implement it in the hardware. This paper proposes a new integer motion estimation algorithm which is designed for hardware execution by modifying the conventional TZ search to allow parallel motion estimations of all prediction unit (PU) partitions. The algorithm consists of the three phases of zonal, raster, and refinement searches. At the beginning of each phase, the algorithm obtains the search points required by the original TZ search for all PU partitions in a coding unit (CU). Then, all redundant search points are removed prior to the estimation of the motion costs, and the best search points are then selected for all PUs. Compared to the conventional TZ search algorithm, experimental results show that the proposed algorithm significantly decreases the Bjøntegaard Delta bitrate (BD-BR) by 0.84%, and it also reduces the computational complexity by 54.54%.
Efficient Process Migration for Parallel Processing on Non-Dedicated Networks of Workstations
NASA Technical Reports Server (NTRS)
Chanchio, Kasidit; Sun, Xian-He
1996-01-01
This paper presents the design and preliminary implementation of MpPVM, a software system that supports process migration for PVM application programs in a non-dedicated heterogeneous computing environment. New concepts of migration point as well as migration point analysis and necessary data analysis are introduced. In MpPVM, process migrations occur only at previously inserted migration points. Migration point analysis determines appropriate locations to insert migration points, whereas necessary data analysis provides a minimum set of variables to be transferred at each migration point. A new methodology to perform reliable point-to-point data communications in a migration environment is also discussed. Finally, a preliminary implementation of MpPVM and its experimental results are presented, showing the correctness and promising performance of our process migration mechanism in a scalable non-dedicated heterogeneous computing environment. While MpPVM is developed on top of PVM, the process migration methodology introduced in this study is general and can be applied to any distributed software environment.
Method and system for data clustering for very large databases
NASA Technical Reports Server (NTRS)
Livny, Miron (Inventor); Zhang, Tian (Inventor); Ramakrishnan, Raghu (Inventor)
1998-01-01
Multi-dimensional data contained in very large databases is efficiently and accurately clustered to determine patterns therein and extract useful information from such patterns. Conventional computer processors may be used which have limited memory capacity and conventional operating speed, allowing massive data sets to be processed in a reasonable time and with reasonable computer resources. The clustering process is organized using a clustering feature tree structure wherein each clustering feature comprises the number of data points in the cluster, the linear sum of the data points in the cluster, and the square sum of the data points in the cluster. A dense region of data points is treated collectively as a single cluster, and points in sparsely occupied regions can be treated as outliers and removed from the clustering feature tree. The clustering can be carried out continuously with new data points being received and processed, and with the clustering feature tree being restructured as necessary to accommodate the information from the newly received data points.
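The clustering feature described above is additive, which is what lets dense regions be summarized and merged without revisiting raw points. The sketch below is a minimal illustration, not the patented method: the class and method names are hypothetical, the square sum is assumed to be the scalar sum of squared norms, and no tree structure or outlier handling is included.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusteringFeature:
    n: int              # number of data points in the cluster
    linear_sum: list    # component-wise sum of the points
    square_sum: float   # sum of squared norms of the points (assumed scalar)

    @classmethod
    def from_point(cls, point):
        return cls(1, list(point), sum(v * v for v in point))

    def merge(self, other):
        """Features are additive, so merging never touches the raw points."""
        return ClusteringFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.linear_sum, other.linear_sum)],
            self.square_sum + other.square_sum,
        )

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        """Root-mean-square distance of the points from the centroid."""
        c = self.centroid()
        variance = self.square_sum / self.n - sum(v * v for v in c)
        return math.sqrt(max(variance, 0.0))  # clamp tiny negative round-off


points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
cf = ClusteringFeature.from_point(points[0])
for pt in points[1:]:
    cf = cf.merge(ClusteringFeature.from_point(pt))
print(cf.n, cf.centroid(), cf.radius())
```

Because the centroid and radius follow from the three stored quantities alone, a new point can be absorbed in constant memory, which fits the abstract's emphasis on limited memory capacity.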
An efficient method to compute microlensed light curves for point sources
NASA Technical Reports Server (NTRS)
Witt, Hans J.
1993-01-01
We present a method to compute microlensed light curves for point sources. This method has the general advantage that all microimages contributing to the light curve are found. While a source moves along a straight line, all microimages are located either on the primary image track or on the secondary image tracks (loops). The primary image track extends from -infinity to +infinity and is made of many segments which are continuously connected. All the secondary image tracks (loops) begin and end on the lensing point masses. The method can be applied to any microlensing situation with point masses in the deflector plane, even for the overcritical case and surface densities close to the critical. Furthermore, we present general rules to evaluate the light curve for a straight track arbitrarily placed in the caustic network of a sample of many point masses.
Ren, Qiang; Nagar, Jogender; Kang, Lei; Bian, Yusheng; Werner, Ping; Werner, Douglas H
2017-05-18
A highly efficient numerical approach for simulating the wideband optical response of nano-architectures comprised of Drude-Critical Points (DCP) media (e.g., gold and silver) is proposed and validated through comparing with commercial computational software. The kernel of this algorithm is the subdomain level discontinuous Galerkin time domain (DGTD) method, which can be viewed as a hybrid of the spectral-element time-domain method (SETD) and the finite-element time-domain (FETD) method. An hp-refinement technique is applied to decrease the Degrees-of-Freedom (DoFs) and computational requirements. The collocated E-J scheme facilitates solving the auxiliary equations by converting the inversions of matrices to simpler vector manipulations. A new hybrid time stepping approach, which couples the Runge-Kutta and Newmark methods, is proposed to solve the temporal auxiliary differential equations (ADEs) with a high degree of efficiency. The advantages of this new approach, in terms of computational resource overhead and accuracy, are validated through comparison with well-known commercial software for three diverse cases, which cover both near-field and far-field properties with plane wave and lumped port sources. The presented work provides the missing link between DCP dispersive models and FETD and/or SETD based algorithms. It is a competitive candidate for numerically studying the wideband plasmonic properties of DCP media.
A hardware implementation of the discrete Pascal transform for image processing
NASA Astrophysics Data System (ADS)
Goodman, Thomas J.; Aburdene, Maurice F.
2006-02-01
The discrete Pascal transform is a polynomial transform with applications in pattern recognition, digital filtering, and digital image processing. It already has been shown that the Pascal transform matrix can be decomposed into a product of binary matrices. Such a factorization leads to a fast and efficient hardware implementation without the use of multipliers, which consume large amounts of hardware. We recently developed a field-programmable gate array (FPGA) implementation to compute the Pascal transform. Our goal was to demonstrate the computational efficiency of the transform while keeping hardware requirements at a minimum. Images are uploaded into memory from a remote computer prior to processing, and the transform coefficients can be offloaded from the FPGA board for analysis. Design techniques like as-soon-as-possible scheduling and adder sharing allowed us to develop a fast and efficient system. An eight-point, one-dimensional transform completes in 13 clock cycles and requires only four adders. An 8x8 two-dimensional transform completes in 240 cycles and requires only a top-level controller in addition to the one-dimensional transform hardware. Finally, through minor modifications to the controller, the transform operations can be pipelined to achieve 100% utilization of the four adders, allowing one eight-point transform to complete every seven clock cycles.
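A software reference for the transform can help when checking a hardware design. The sketch below assumes one common definition of the discrete Pascal transform matrix, P[i][j] = (-1)^j * C(i, j) for j <= i and 0 otherwise. It uses a direct matrix product, not the adder-only binary factorization or the FPGA design described above, and the function names are hypothetical.

```python
from math import comb


def pascal_matrix(n):
    """Lower-triangular matrix with entries (-1)**j * C(i, j) (assumed definition)."""
    return [[(-1) ** j * comb(i, j) if j <= i else 0 for j in range(n)]
            for i in range(n)]


def pascal_transform(x):
    """Direct O(n^2) evaluation with integer arithmetic."""
    p = pascal_matrix(len(x))
    return [sum(p[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]


signal = [1, 2, 3, 4, 5, 6, 7, 8]
spectrum = pascal_transform(signal)
roundtrip = pascal_transform(spectrum)  # under this definition P is its own inverse
print(spectrum)   # [1, -1, 0, 0, 0, 0, 0, 0]
print(roundtrip)  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Under this definition a sequence that follows a degree-d polynomial has at most d + 1 nonzero coefficients, which is why the transform is described as a polynomial transform, and every entry is an integer, consistent with a multiplier-free implementation.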
PRESAGE: Protecting Structured Address Generation against Soft Errors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sharma, Vishal C.; Gopalakrishnan, Ganesh; Krishnamoorthy, Sriram
Modern computer scaling trends in pursuit of larger component counts and power efficiency have, unfortunately, led to less reliable hardware and consequently soft errors escaping into application data ("silent data corruptions"). Techniques to enhance system resilience hinge on the availability of efficient error detectors that have high detection rates, low false positive rates, and lower computational overhead. Unfortunately, efficient detectors to detect faults during address generation (to index large arrays) have not been widely researched. We present a novel lightweight compiler-driven technique called PRESAGE for detecting bit-flips affecting structured address computations. A key insight underlying PRESAGE is that any address computation scheme that flows an already incurred error is better than a scheme that corrupts one particular array access but otherwise (falsely) appears to compute perfectly. Enabling the flow of errors allows one to situate detectors at loop exit points, and helps turn silent corruptions into easily detectable error situations. Our experiments using PolyBench benchmark suite indicate that PRESAGE-based error detectors have a high error-detection rate while incurring low overheads.
PRESAGE: Protecting Structured Address Generation against Soft Errors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sharma, Vishal C.; Gopalakrishnan, Ganesh; Krishnamoorthy, Sriram
Modern computer scaling trends in pursuit of larger component counts and power efficiency have, unfortunately, led to less reliable hardware and consequently soft errors escaping into application data ("silent data corruptions"). Techniques to enhance system resilience hinge on the availability of efficient error detectors that have high detection rates, low false positive rates, and lower computational overhead. Unfortunately, efficient detectors to detect faults during address generation have not been widely researched (especially in the context of indexing large arrays). We present a novel lightweight compiler-driven technique called PRESAGE for detecting bit-flips affecting structured address computations. A key insight underlying PRESAGE is that any address computation scheme that propagates an already incurred error is better than a scheme that corrupts one particular array access but otherwise (falsely) appears to compute perfectly. Ensuring the propagation of errors allows one to place detectors at loop exit points and helps turn silent corruptions into easily detectable error situations. Our experiments using the PolyBench benchmark suite indicate that PRESAGE-based error detectors have a high error-detection rate while incurring low overheads.
NASA Astrophysics Data System (ADS)
Li, X.; Li, S. W.
2012-07-01
In this paper, an efficient global optimization algorithm in the field of artificial intelligence, named Particle Swarm Optimization (PSO), is introduced into close range photogrammetric data processing. PSO can be applied to obtain the approximate values of exterior orientation elements under the condition that multi-intersection photography and a small portable plane control frame are used. PSO, put forward by an American social psychologist J. Kennedy and an electrical engineer R.C. Eberhart, is a stochastic global optimization method based on swarm intelligence, which was inspired by social behavior of bird flocking or fish schooling. The strategy of obtaining the approximate values of exterior orientation elements using PSO is as follows: in terms of image coordinate observed values and space coordinates of few control points, the equations of calculating the image coordinate residual errors can be given. The sum of absolute value of each image coordinate is minimized to be the objective function. The difference between image coordinate observed value and the image coordinate computed through collinear condition equation is defined as the image coordinate residual error. Firstly a gross area of exterior orientation elements is given, and then the adjustment of other parameters is made to get the particles fly in the gross area. After iterative computation for certain times, the satisfied approximate values of exterior orientation elements are obtained. By doing so, the procedures like positioning and measuring space control points in close range photogrammetry can be avoided. Obviously, this method can improve the surveying efficiency greatly and at the same time can decrease the surveying cost. And during such a process, only one small portable control frame with a couple of control points is employed, and there are no strict requirements for the space distribution of control points. 
In order to verify the effectiveness of this algorithm, two experiments are carried out. In the first experiment, images of a standard grid board are taken according to multi-intersection photography using digital camera. Three points or six points which are located on the left-down corner of the standard grid are regarded as control points respectively, and the exterior orientation elements of each image are computed through PSO, and compared with these elements computed through bundle adjustment. In the second experiment, the exterior orientation elements obtained from the first experiment are used as approximate values in bundle adjustment and then the space coordinates of other grid points on the board can be computed. The coordinate difference of grid points between these computed space coordinates and their known coordinates can be used to compute the accuracy. The point accuracy computed in above experiments are ±0.76mm and ±0.43mm respectively. The above experiments prove the effectiveness of PSO used in close range photogrammetry to compute approximate values of exterior orientation elements, and the algorithm can meet the requirement of higher accuracy. In short, PSO can get better results in a faster, cheaper way compared with other surveying methods in close range photogrammetry.
NASA Astrophysics Data System (ADS)
Xu, Jun; Dang, Chao; Kong, Fan
2017-10-01
This paper presents a new method for efficient structural reliability analysis. In this method, a rotational quasi-symmetric point method (RQ-SPM) is proposed for evaluating the fractional moments of the performance function. Then, the derivation of the performance function's probability density function (PDF) is carried out based on the maximum entropy method in which constraints are specified in terms of fractional moments. In this regard, the probability of failure can be obtained by a simple integral over the performance function's PDF. Six examples, including a finite element-based reliability analysis and a dynamic system with strong nonlinearity, are used to illustrate the efficacy of the proposed method. All the computed results are compared with those by Monte Carlo simulation (MCS). It is found that the proposed method can provide very accurate results with low computational effort.
Massively parallel multicanonical simulations
NASA Astrophysics Data System (ADS)
Gross, Jonathan; Zierenberg, Johannes; Weigel, Martin; Janke, Wolfhard
2018-03-01
Generalized-ensemble Monte Carlo simulations such as the multicanonical method and similar techniques are among the most efficient approaches for simulations of systems undergoing discontinuous phase transitions or with rugged free-energy landscapes. As Markov chain methods, they are inherently serial computationally. It was demonstrated recently, however, that a combination of independent simulations that communicate weight updates at variable intervals allows for the efficient utilization of parallel computational resources for multicanonical simulations. Implementing this approach for the many-thread architecture provided by current generations of graphics processing units (GPUs), we show how it can be efficiently employed with on the order of 10^4 parallel walkers and beyond, thus constituting a versatile tool for Monte Carlo simulations in the era of massively parallel computing. We provide the fully documented source code for the approach applied to the paradigmatic example of the two-dimensional Ising model as starting point and reference for practitioners in the field.
Efficient Jacobian inversion for the control of simple robot manipulators
NASA Technical Reports Server (NTRS)
Fijany, Amir; Bejczy, Antal K.
1988-01-01
Symbolic inversion of the Jacobian matrix for spherical wrist arms is investigated. It is shown that, taking advantage of the simple geometry of these arms, the closed-form solution of the system Q = J^{-1} X, representing a transformation from task space to joint space, can be obtained very efficiently. The solutions for PUMA, Stanford, and a six-revolute-joint coplanar arm, along with all singular points, are presented. The solution for each joint variable is found as an explicit function of the singular points, which provides better insight into the effect of different singular points on the motion and force exertion of each individual joint. For the above arms, the computation cost of the solution is on the same order as the cost of the forward kinematic solution, and it is significantly reduced if the forward kinematic solution is already obtained. A comparison with previous methods shows that this method is the most efficient to date.
Billings, Seth D.; Boctor, Emad M.; Taylor, Russell H.
2015-01-01
We present a probabilistic registration algorithm that robustly solves the problem of rigid-body alignment between two shapes with high accuracy, by aptly modeling measurement noise in each shape, whether isotropic or anisotropic. For point-cloud shapes, the probabilistic framework additionally enables modeling locally-linear surface regions in the vicinity of each point to further improve registration accuracy. The proposed Iterative Most-Likely Point (IMLP) algorithm is formed as a variant of the popular Iterative Closest Point (ICP) algorithm, which iterates between point-correspondence and point-registration steps. IMLP’s probabilistic framework is used to incorporate a generalized noise model into both the correspondence and the registration phases of the algorithm, hence its name as a most-likely point method rather than a closest-point method. To efficiently compute the most-likely correspondences, we devise a novel search strategy based on a principal direction (PD)-tree search. We also propose a new approach to solve the generalized total-least-squares (GTLS) sub-problem of the registration phase, wherein the point correspondences are registered under a generalized noise model. Our GTLS approach has improved accuracy, efficiency, and stability compared to prior methods presented for this problem and offers a straightforward implementation using standard least squares. We evaluate the performance of IMLP relative to a large number of prior algorithms including ICP, a robust variant on ICP, Generalized ICP (GICP), and Coherent Point Drift (CPD), as well as drawing close comparison with the prior anisotropic registration methods of GTLS-ICP and A-ICP. The performance of IMLP is shown to be superior with respect to these algorithms over a wide range of noise conditions, outliers, and misalignments using both mesh and point-cloud representations of various shapes. PMID:25748700
Parallel, adaptive finite element methods for conservation laws
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Devine, Karen D.; Flaherty, Joseph E.
1994-01-01
We construct parallel finite element methods for the solution of hyperbolic conservation laws in one and two dimensions. Spatial discretization is performed by a discontinuous Galerkin finite element method using a basis of piecewise Legendre polynomials. Temporal discretization utilizes a Runge-Kutta method. Dissipative fluxes and projection limiting prevent oscillations near solution discontinuities. A posteriori estimates of spatial errors are obtained by a p-refinement technique using superconvergence at Radau points. The resulting method is of high order and may be parallelized efficiently on MIMD computers. We compare results using different limiting schemes and demonstrate parallel efficiency through computations on an NCUBE/2 hypercube. We also present results using adaptive h- and p-refinement to reduce the computational cost of the method.
An efficient method for removing point sources from full-sky radio interferometric maps
NASA Astrophysics Data System (ADS)
Berger, Philippe; Oppermann, Niels; Pen, Ue-Li; Shaw, J. Richard
2017-12-01
A new generation of wide-field radio interferometers designed for 21-cm surveys is being built as drift scan instruments allowing them to observe large fractions of the sky. With large numbers of antennas and frequency channels, the enormous instantaneous data rates of these telescopes require novel, efficient, data management and analysis techniques. The m-mode formalism exploits the periodicity of such data with the sidereal day, combined with the assumption of statistical isotropy of the sky, to achieve large computational savings and render optimal analysis methods computationally tractable. We present an extension to that work that allows us to adopt a more realistic sky model and treat objects such as bright point sources. We develop a linear procedure for deconvolving maps, using a Wiener filter reconstruction technique, which simultaneously allows filtering of these unwanted components. We construct an algorithm, based on the Sherman-Morrison-Woodbury formula, to efficiently invert the data covariance matrix, as required for any optimal signal-to-noise ratio weighting. The performance of our algorithm is demonstrated using simulations of a cylindrical transit telescope.
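For reference, the Sherman-Morrison-Woodbury formula named in the abstract above can be written in its standard form as follows. The symbols A, U, C and V are generic placeholders and are not taken from the paper: A stands for a matrix that is cheap to invert, and UCV for a low-rank correction.

```latex
\[
  (A + U C V)^{-1}
  = A^{-1} - A^{-1} U \left( C^{-1} + V A^{-1} U \right)^{-1} V A^{-1}
\]
% A: n x n, invertible; C: k x k, invertible; U: n x k; V: k x n.
% For k << n, the only new inverse required is the k x k matrix
% C^{-1} + V A^{-1} U.
```

Under that assumption the inversion cost is driven by the small k x k system rather than by the full n x n matrix, which is the general reason the identity is useful for covariance matrices with low-rank components.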
Optimizing Cubature for Efficient Integration of Subspace Deformations
An, Steven S.; Kim, Theodore; James, Doug L.
2009-01-01
We propose an efficient scheme for evaluating nonlinear subspace forces (and Jacobians) associated with subspace deformations. The core problem we address is efficient integration of the subspace force density over the 3D spatial domain. Similar to Gaussian quadrature schemes that efficiently integrate functions that lie in particular polynomial subspaces, we propose cubature schemes (multi-dimensional quadrature) optimized for efficient integration of force densities associated with particular subspace deformations, particular materials, and particular geometric domains. We support generic subspace deformation kinematics, and nonlinear hyperelastic materials. For an r-dimensional deformation subspace with O(r) cubature points, our method is able to evaluate subspace forces at O(r^2) cost. We also describe composite cubature rules for runtime error estimation. Results are provided for various subspace deformation models, several hyperelastic materials (St. Venant-Kirchhoff, Mooney-Rivlin, Arruda-Boyce), and multimodal (graphics, haptics, sound) applications. We show dramatically better efficiency than traditional Monte Carlo integration. CR Categories: I.6.8 [Simulation and Modeling]: Types of Simulation—Animation; I.3.5 [Computer Graphics]: Computational Geometry and Object Modeling—Physically based modeling; G.1.4 [Mathematics of Computing]: Numerical Analysis—Quadrature and Numerical Differentiation. PMID:19956777
Speeding up Coarse Point Cloud Registration by Threshold-Independent Baysac Match Selection
NASA Astrophysics Data System (ADS)
Kang, Z.; Lindenbergh, R.; Pu, S.
2016-06-01
This paper presents an algorithm for the automatic registration of terrestrial point clouds by match selection using an efficient conditional sampling method, threshold-independent BaySAC (BAYes SAmpling Consensus), and employs the error metric of average point-to-surface residual to reduce the random measurement error and then approach the real registration error. BaySAC and other basic sampling algorithms usually need to artificially determine a threshold by which inlier points are identified, which leads to a threshold-dependent verification process. Therefore, we applied the LMedS method to construct the cost function that is used to determine the optimum model to reduce the influence of human factors and improve the robustness of the model estimate. Point-to-point and point-to-surface error metrics are most commonly used. However, point-to-point error in general consists of at least two components, random measurement error and systematic error as a result of a remaining error in the found rigid body transformation. Thus we employ the measure of the average point-to-surface residual to evaluate the registration accuracy. The proposed approaches, together with a traditional RANSAC approach, are tested on four data sets acquired by three different scanners in terms of their computational efficiency and quality of the final registration. The registration results show the standard deviation of the average point-to-surface residuals is reduced from 1.4 cm (plain RANSAC) to 0.5 cm (threshold-independent BaySAC). The results also show that, compared to the performance of RANSAC, our BaySAC strategies lead to fewer iterations and lower computational cost when the hypothesis set is contaminated with more outliers.
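The threshold-free model selection mentioned in the abstract above can be illustrated with a minimal sketch of a least-median-of-squares (LMedS) cost. This is not the authors' implementation: the function names and the idea of passing each candidate model as a list of precomputed residuals are assumptions made for illustration only.

```python
from statistics import median


def lmeds_cost(residuals):
    """LMedS cost: the median of the squared residuals (no inlier threshold)."""
    return median(r * r for r in residuals)


def select_model(candidates):
    """Return the key of the candidate with the smallest LMedS cost.

    `candidates` is a hypothetical mapping from a model label to the
    residuals of all matches under that model.
    """
    return min(candidates, key=lambda name: lmeds_cost(candidates[name]))


# Model "a" fits most matches well but has one gross outlier;
# model "b" fits every match equally poorly.
candidates = {
    "a": [0.1, 0.1, 0.1, 5.0, 0.1],
    "b": [1.0, 1.0, 1.0, 1.0, 1.0],
}
best = select_model(candidates)
```

Because the median ignores the single large residual, the cost prefers model "a", whereas a mean-squared cost would prefer "b". No inlier threshold has to be chosen by hand.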
NASA Technical Reports Server (NTRS)
Choudhary, Alok Nidhi; Leung, Mun K.; Huang, Thomas S.; Patel, Janak H.
1989-01-01
Computer vision systems employ a sequence of vision algorithms in which the output of an algorithm is the input of the next algorithm in the sequence. Algorithms that constitute such systems exhibit vastly different computational characteristics, and therefore, require different data decomposition techniques and efficient load balancing techniques for parallel implementation. However, since the input data for a task is produced as the output data of the previous task, this information can be exploited to perform knowledge based data decomposition and load balancing. Presented here are algorithms for a motion estimation system. The motion estimation is based on the point correspondence between the involved images which are a sequence of stereo image pairs. Researchers propose algorithms to obtain point correspondences by matching feature points among stereo image pairs at any two consecutive time instants. Furthermore, the proposed algorithms employ non-iterative procedures, which results in saving considerable amounts of computation time. The system consists of the following steps: (1) extraction of features; (2) stereo match of images in one time instant; (3) time match of images from consecutive time instants; (4) stereo match to compute final unambiguous points; and (5) computation of motion parameters.
A note on parallel and pipeline computation of fast unitary transforms
NASA Technical Reports Server (NTRS)
Fino, B. J.; Algazi, V. R.
1974-01-01
The parallel and pipeline organization of fast unitary transform algorithms such as the Fast Fourier Transform is discussed. The efficiency of a combined parallel-pipeline processor is pointed out for a transform such as the Haar transform, in which 2 to the n minus 1 power hardware butterflies generate a transform of order 2 to the n power every computation cycle.
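As a software illustration of the butterfly structure referred to above, the following is a minimal sketch of an orthonormal fast Haar transform that also counts the sum/difference butterflies it performs. It is a sequential sketch written for this listing, not the hardware organization of the report. The function name `fast_haar` and the coefficient ordering are assumptions.

```python
import math


def fast_haar(x):
    """Orthonormal fast Haar transform of a sequence of length 2**n.

    Returns (coefficients, butterfly_count). Each butterfly maps a pair
    (a, b) to ((a + b) / sqrt(2), (a - b) / sqrt(2)).
    """
    size = len(x)
    if size == 0 or size & (size - 1):
        raise ValueError("length must be a power of two")
    out = [float(v) for v in x]
    scale = 1.0 / math.sqrt(2.0)
    butterflies = 0
    length = size
    while length > 1:
        half = length // 2
        sums = [(out[2 * i] + out[2 * i + 1]) * scale for i in range(half)]
        diffs = [(out[2 * i] - out[2 * i + 1]) * scale for i in range(half)]
        # Sums move to the front and are processed again at the next stage;
        # differences are final coefficients.
        out[:length] = sums + diffs
        butterflies += half
        length = half
    return out, butterflies


coeffs, count = fast_haar([4, 2, 5, 5])
```

For an input of length 2^n the stages use 2^(n-1), 2^(n-2), ..., 1 butterflies, so the sketch performs 2^n - 1 butterflies in total. If each stage is given its own hardware, a pipeline could in principle accept a new input vector every cycle.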
Fast Computation and Assessment Methods in Power System Analysis
NASA Astrophysics Data System (ADS)
Nagata, Masaki
Power system analysis is essential for efficient and reliable power system operation and control. Recently, online security assessment systems have become important, as more efficient use of power networks is increasingly required. In this article, fast power system analysis techniques such as contingency screening, parallel processing and intelligent systems application are briefly surveyed from the viewpoint of their application to online dynamic security assessment.
hp-Adaptive time integration based on the BDF for viscous flows
NASA Astrophysics Data System (ADS)
Hay, A.; Etienne, S.; Pelletier, D.; Garon, A.
2015-06-01
This paper presents a procedure based on the Backward Differentiation Formulas of order 1 to 5 to obtain efficient time integration of the incompressible Navier-Stokes equations. The adaptive algorithm performs both stepsize and order selections to control respectively the solution accuracy and the computational efficiency of the time integration process. The stepsize selection (h-adaptivity) is based on a local error estimate and an error controller to guarantee that the numerical solution accuracy is within a user prescribed tolerance. The order selection (p-adaptivity) relies on the idea that low-accuracy solutions can be computed efficiently by low order time integrators while accurate solutions require high order time integrators to keep computational time low. The selection is based on a stability test that detects growing numerical noise and deems a method of order p stable if there is no method of lower order that delivers the same solution accuracy for a larger stepsize. Hence, it guarantees both that (1) the used method of integration operates inside of its stability region and (2) the time integration procedure is computationally efficient. The proposed time integration procedure also features time-step rejection and quarantine mechanisms, a modified Newton method with a predictor, and dense output techniques to compute the solution at off-step points.
Multigrid methods for bifurcation problems: The self adjoint case
NASA Technical Reports Server (NTRS)
Taasan, Shlomo
1987-01-01
This paper deals with multigrid methods for computational problems that arise in the theory of bifurcation and is restricted to the self adjoint case. The basic problem is to solve for arcs of solutions, a task that is done successfully with an arc length continuation method. Other important issues are, for example, detecting and locating singular points as part of the continuation process, switching branches at bifurcation points, etc. Multigrid methods have been applied to continuation problems. These methods work well at regular points and at limit points, while they may encounter difficulties in the vicinity of bifurcation points. A new continuation method that is very efficient also near bifurcation points is presented here. The other issues mentioned above are also treated very efficiently with appropriate multigrid algorithms. For example, it is shown that limit points and bifurcation points can be solved for directly by a multigrid algorithm. Moreover, the algorithms presented here solve the corresponding problems in just a few work units (about 10 or less), where a work unit is the work involved in one local relaxation on the finest grid.
Guiffant, Gérard; Durussel, Jean Jacques; Flaud, Patrice; Vigier, Jean Pierre; Merckx, Jacques
2012-01-01
The use of totally implantable venous access devices developed as a medical device allowing mid- and long-term, frequent, repeated, or continuous injection of therapeutic products, by vascular, cavitary, or perineural access. The effective flushing of these devices is a central element to assure long-lasting use. Our experimental work demonstrates that directing the Huber point needle opening in the diametrically opposite direction of the implantable port exit channel increases the flushing efficiency. These results are consolidated by numerical computations, which support recommendations not only for their maintenance, but also for their use. PMID:23166455
Efficient terrestrial laser scan segmentation exploiting data structure
NASA Astrophysics Data System (ADS)
Mahmoudabadi, Hamid; Olsen, Michael J.; Todorovic, Sinisa
2016-09-01
New technologies such as lidar enable the rapid collection of massive datasets to model a 3D scene as a point cloud. However, while hardware technology continues to advance, processing 3D point clouds into informative models remains complex and time consuming. A common approach to increase processing efficiency is to segment the point cloud into smaller sections. This paper proposes a novel approach for point cloud segmentation using computer vision algorithms to analyze panoramic representations of individual laser scans. These panoramas can be quickly created using an inherent neighborhood structure that is established during the scanning process, which scans at fixed angular increments in a cylindrical or spherical coordinate system. In the proposed approach, a selected image segmentation algorithm is applied on several input layers exploiting this angular structure including laser intensity, range, normal vectors, and color information. These segments are then mapped back to the 3D point cloud so that modeling can be completed more efficiently. This approach does not depend on pre-defined mathematical models and consequently does not require setting parameters for them. Unlike common geometrical point cloud segmentation methods, the proposed method employs the colorimetric and intensity data as another source of information. The proposed algorithm is demonstrated on several datasets encompassing a variety of scenes and objects. Results show a very high perceptual (visual) level of segmentation and thereby the feasibility of the proposed algorithm. The proposed method is also more efficient compared to Random Sample Consensus (RANSAC), which is a common approach for point cloud segmentation.
NASA Technical Reports Server (NTRS)
Bonhaus, Daryl L.; Wornom, Stephen F.
1991-01-01
Two codes which solve the 3-D Thin Layer Navier-Stokes (TLNS) equations are used to compute the steady state flow for two test cases representing typical finite wings at transonic conditions. Several grids of C-O topology and varying point densities are used to determine the effects of grid refinement. After a description of each code and test case, standards for determining code efficiency and accuracy are defined and applied to determine the relative performance of the two codes in predicting turbulent transonic wing flows. Comparisons of computed surface pressure distributions with experimental data are made.
Matching point-of care devices to clinicians for positive outcomes.
Utterback, Karen; Waldo, Billie H
2005-07-01
Home care clinicians' use of point-of-care (POC) technology has increased 63% in the past 5 years. Although there are more POC system choices, matching the right device to each clinician's role is a challenge. This article clarifies the uses of laptop or notebook computer, personal digital assistants (PDAs), telephony, or automated telehealth, suggesting ways these technologies can result in clinical efficiencies, care coordination, and regulatory compliance.
MOLAR: Modular Linux and Adaptive Runtime Support for HEC OS/R Research
DOE Office of Scientific and Technical Information (OSTI.GOV)
Frank Mueller
2009-02-05
MOLAR is a multi-institution research effort that concentrates on adaptive, reliable, and efficient operating and runtime system solutions for ultra-scale high-end scientific computing on the next generation of supercomputers. This research addresses the challenges outlined by the FAST-OS (forum to address scalable technology for runtime and operating systems) and HECRTF (high-end computing revitalization task force) activities by providing a modular Linux and adaptable runtime support for high-end computing operating and runtime systems. The MOLAR research has the following goals to address these issues. (1) Create a modular and configurable Linux system that allows customized changes based on the requirements of the applications, runtime systems, and cluster management software. (2) Build runtime systems that leverage the OS modularity and configurability to improve efficiency, reliability, scalability, ease-of-use, and provide support to legacy and promising programming models. (3) Advance computer reliability, availability and serviceability (RAS) management systems to work cooperatively with the OS/R to identify and preemptively resolve system issues. (4) Explore the use of advanced monitoring and adaptation to improve application performance and predictability of system interruptions. The overall goal of the research conducted at NCSU is to develop scalable algorithms for high-availability without single points of failure and without single points of control.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, W Michael; Kohlmeyer, Axel; Plimpton, Steven J
The use of accelerators such as graphics processing units (GPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high-performance computers, machines with nodes containing more than one type of floating-point processor (e.g. CPU and GPU), are now becoming more prevalent due to these advantages. In this paper, we present a continuation of previous work implementing algorithms for using accelerators into the LAMMPS molecular dynamics software for distributed memory parallel hybrid machines. In our previous work, we focused on acceleration for short-range models with an approach intended to harness the processing power of both the accelerator and (multi-core) CPUs. To augment the existing implementations, we present an efficient implementation of long-range electrostatic force calculation for molecular dynamics. Specifically, we present an implementation of the particle-particle particle-mesh method based on the work by Harvey and De Fabritiis. We present benchmark results on the Keeneland InfiniBand GPU cluster. We provide a performance comparison of the same kernels compiled with both CUDA and OpenCL. We discuss limitations to parallel efficiency and future directions for improving performance on hybrid or heterogeneous computers.
NASA Technical Reports Server (NTRS)
Deepak, A.; Fluellen, A.
1978-01-01
An efficient numerical method of multiple quadratures, the Conroy method, is applied to the problem of computing multiple scattering contributions in the radiative transfer through realistic planetary atmospheres. A brief error analysis of the method is given and comparisons are drawn with the more familiar Monte Carlo method. Both methods are stochastic problem-solving models of a physical or mathematical process and utilize the sampling scheme for points distributed over a definite region. In the Monte Carlo scheme the sample points are distributed randomly over the integration region. In the Conroy method, the sample points are distributed systematically, such that the point distribution forms a unique, closed, symmetrical pattern which effectively fills the region of the multidimensional integration. The methods are illustrated by two simple examples: one, of multidimensional integration involving two independent variables, and the other, of computing the second order scattering contribution to the sky radiance.
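The contrast drawn above between randomly and systematically placed sample points can be sketched for a two-variable integral. Note the assumption: the systematic rule below is a plain midpoint grid, used only as a stand-in for "systematically distributed points". It is not the Conroy point pattern, whose construction is not given in the abstract. The integrand and all function names are likewise illustrative.

```python
import math
import random


def integrand(x, y):
    """Smooth test function on the unit square (illustrative choice)."""
    return math.exp(-(x * x + y * y))


def monte_carlo(f, n_points, seed=0):
    """Average f over n_points uniformly random points in [0, 1]^2."""
    rng = random.Random(seed)
    total = sum(f(rng.random(), rng.random()) for _ in range(n_points))
    return total / n_points


def systematic_grid(f, points_per_axis):
    """Average f over the cell midpoints of a regular grid on [0, 1]^2."""
    m = points_per_axis
    total = 0.0
    for i in range(m):
        for j in range(m):
            total += f((i + 0.5) / m, (j + 0.5) / m)
    return total / (m * m)


# Closed-form value of the integral, used only to measure the error.
EXACT = (math.sqrt(math.pi) / 2.0 * math.erf(1.0)) ** 2

mc_error = abs(monte_carlo(integrand, 32 * 32) - EXACT)
grid_error = abs(systematic_grid(integrand, 32) - EXACT)
```

Both estimates use 1024 evaluations of the integrand. For a smooth integrand such as this one, the systematic rule would typically be expected to give the smaller error, although the Monte Carlo result depends on the seed.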
A method to approximate a closest loadability limit using multiple load flow solutions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yorino, Naoto; Harada, Shigemi; Cheng, Haozhong
A new method is proposed to approximate a closest loadability limit (CLL), or closest saddle node bifurcation point, using a pair of multiple load flow solutions. More strictly, the obtainable points by the method are the stationary points including not only CLL but also farthest and saddle points. An operating solution and a low voltage load flow solution are used to efficiently estimate the node injections at a CLL as well as the left and right eigenvectors corresponding to the zero eigenvalue of the load flow Jacobian. They can be used in monitoring loadability margin, in identification of weak spots in a power system and in the examination of an optimal control against voltage collapse. Most of the computation time of the proposed method is taken in calculating the load flow solution pair. The remaining computation time is less than that of an ordinary load flow.
A Computational Framework for Automation of Point Defect Calculations
NASA Astrophysics Data System (ADS)
Goyal, Anuj; Gorai, Prashun; Peng, Haowei; Lany, Stephan; Stevanovic, Vladan; National Renewable Energy Laboratory, Golden, Colorado 80401 Collaboration
A complete and rigorously validated open-source Python framework to automate point defect calculations using density functional theory has been developed. The framework provides an effective and efficient method for defect structure generation, and creation of simple yet customizable workflows to analyze defect calculations. The package provides the capability to compute widely accepted correction schemes to overcome finite-size effects, including (1) potential alignment, (2) image-charge correction, and (3) band filling correction to shallow defects. Using Si, ZnO and In2O3 as test examples, we demonstrate the package capabilities and validate the methodology. We believe that a robust automated tool like this will enable the materials by design community to assess the impact of point defects on materials performance.
Implementing direct, spatially isolated problems on transputer networks
NASA Technical Reports Server (NTRS)
Ellis, Graham K.
1988-01-01
Parametric studies were performed on transputer networks of up to 40 processors to determine how to implement and maximize the performance of the solution of problems where no processor-to-processor data transfer is required for the problem solution (spatially isolated). Two types of problems are investigated: a computationally intensive problem where the solution required the transmission of 160 bytes of data through the parallel network, and a communication intensive example that required the transmission of 3 Mbytes of data through the network. This data consists of solutions being sent back to the host processor and not intermediate results for another processor to work on. Studies were performed on both integer and floating-point transputers. The latter features an on-chip floating-point math unit and offers approximately an order of magnitude performance increase over the integer transputer on real valued computations. The results indicate that a minimum amount of work is required on each node per communication to achieve high network speedups (efficiencies). The floating-point processor requires approximately an order of magnitude more work per communication than the integer processor because of the floating-point unit's increased computing capacity.
Cook, David A; Sorensen, Kristi J; Wilkinson, John M; Berger, Richard A
2013-11-25
Answering clinical questions affects patient-care decisions and is important to continuous professional development. The process of point-of-care learning is incompletely understood. To understand what barriers and enabling factors influence physician point-of-care learning and what decisions physicians face during this process. Focus groups with grounded theory analysis. Focus group discussions were transcribed and then analyzed using a constant comparative approach to identify barriers, enabling factors, and key decisions related to physician information-seeking activities. Academic medical center and outlying community sites. Purposive sample of 50 primary care and subspecialist internal medicine and family medicine physicians, interviewed in 11 focus groups. Insufficient time was the main barrier to point-of-care learning. Other barriers included the patient comorbidities and contexts, the volume of available information, not knowing which resource to search, doubt that the search would yield an answer, difficulty remembering questions for later study, and inconvenient access to computers. Key decisions were whether to search (reasons to search included infrequently seen conditions, practice updates, complex questions, and patient education), when to search (before, during, or after the clinical encounter), where to search (with the patient present or in a separate room), what type of resource to use (colleague or computer), what specific resource to use (influenced first by efficiency and second by credibility), and when to stop. Participants noted that key features of efficiency (completeness, brevity, and searchability) are often in conflict. Physicians perceive that insufficient time is the greatest barrier to point-of-care learning, and efficiency is the most important determinant in selecting an information source. Designing knowledge resources and systems to target key decisions may improve learning and patient care.
Efficient Algorithms for Segmentation of Item-Set Time Series
NASA Astrophysics Data System (ADS)
Chundi, Parvathi; Rosenkrantz, Daniel J.
We propose a special type of time series, which we call an item-set time series, to facilitate the temporal analysis of software version histories, email logs, stock market data, etc. In an item-set time series, each observed data value is a set of discrete items. We formalize the concept of an item-set time series and present efficient algorithms for segmenting a given item-set time series. Segmentation of a time series partitions the time series into a sequence of segments where each segment is constructed by combining consecutive time points of the time series. Each segment is associated with an item set that is computed from the item sets of the time points in that segment, using a function which we call a measure function. We then define a concept called the segment difference, which measures the difference between the item set of a segment and the item sets of the time points in that segment. The segment difference values are required to construct an optimal segmentation of the time series. We describe novel and efficient algorithms to compute segment difference values for each of the measure functions described in the paper. We outline a dynamic programming based scheme to construct an optimal segmentation of the given item-set time series. We use the item-set time series segmentation techniques to analyze the temporal content of three different data sets—Enron email, stock market data, and a synthetic data set. The experimental results show that an optimal segmentation of item-set time series data captures much more temporal content than a segmentation constructed based on the number of time points in each segment, without examining the item set data at the time points, and can be used to analyze different types of temporal data.
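The dynamic programming scheme outlined above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the measure function is taken to be set union, and the segment difference is taken to be the summed size of the symmetric difference between the segment's item set and each time point's item set. Both definitions, and the function names, are hypothetical choices made here.

```python
def segment_difference(points, i, j):
    """Assumed cost of merging time points i..j-1 into one segment."""
    segment_set = set().union(*points[i:j])  # assumed measure: union
    return sum(len(segment_set ^ p) for p in points[i:j])


def optimal_segmentation(points, k):
    """Split `points` (a list of sets) into k contiguous segments of
    minimum total segment difference. Returns (cost, [(start, end), ...])
    with half-open index ranges."""
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of time points")
    inf = float("inf")
    # cost[s][j]: best cost of covering the first j points with s segments.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0
    for s in range(1, k + 1):
        for j in range(s, n + 1):
            for i in range(s - 1, j):
                if cost[s - 1][i] == inf:
                    continue
                c = cost[s - 1][i] + segment_difference(points, i, j)
                if c < cost[s][j]:
                    cost[s][j] = c
                    back[s][j] = i
    # Walk the back-pointers to recover the segment boundaries.
    bounds = []
    j = n
    for s in range(k, 0, -1):
        i = back[s][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return cost[k][n], bounds


series = [{"a"}, {"a"}, {"b"}, {"b"}]
total, segments = optimal_segmentation(series, 2)
```

On this toy series the boundary falls between the second and third time points, where the item set changes. The sketch recomputes each segment difference from scratch, so it is far less efficient than the precomputation of segment difference values that the abstract describes.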
The monitoring and managing application of cloud computing based on Internet of Things.
Luo, Shiliang; Ren, Bin
2016-07-01
Cloud computing and the Internet of Things are two hot topics in the field of Internet applications. Applications of both technologies are widely discussed and researched, but much less so in the field of medical monitoring and management. In this paper, we therefore study and analyze the application of cloud computing and the Internet of Things in the medical field, and we combine the two techniques for medical monitoring and management. First, the model architecture for a remote monitoring cloud platform of healthcare information (RMCPHI) was established. Then the RMCPHI architecture was analyzed. Finally, an efficient PSOSAA algorithm was proposed for the medical monitoring and managing application of cloud computing. Simulation results showed that our proposed scheme can improve efficiency by about 50%.
A GPU accelerated and error-controlled solver for the unbounded Poisson equation in three dimensions
NASA Astrophysics Data System (ADS)
Exl, Lukas
2017-12-01
An efficient solver for the three dimensional free-space Poisson equation is presented. The underlying numerical method is based on finite Fourier series approximation. While the error of all involved approximations can be fully controlled, the overall computation error is driven by the convergence of the finite Fourier series of the density. For smooth and fast-decaying densities the proposed method will be spectrally accurate. The method scales with O(N log N) operations, where N is the total number of discretization points in the Cartesian grid. The majority of the computational costs come from fast Fourier transforms (FFT), which makes it ideal for GPU computation. Several numerical computations on CPU and GPU validate the method and show efficiency and convergence behavior. Tests are performed using the Vienna Scientific Cluster 3 (VSC3). A free MATLAB implementation for CPU and GPU is provided to the interested community.
Metalevel programming in robotics: Some issues
NASA Technical Reports Server (NTRS)
Kumarn, A.; Parameswaran, N.
1987-01-01
Computing in robotics has two important requirements: efficiency and flexibility. Algorithms for robot actions are implemented usually in procedural languages such as VAL and AL. But, since their excessive bindings create inflexible structures of computation, it is proposed that Logic Programming is a more suitable language for robot programming due to its non-determinism, declarative nature, and provision for metalevel programming. Logic Programming, however, results in inefficient computations. As a solution to this problem, researchers discuss a framework in which controls can be described to improve efficiency. They have divided controls into: (1) in-code and (2) metalevel and discussed them with reference to selection of rules and dataflow. Researchers illustrated the merit of Logic Programming by modelling the motion of a robot from one point to another avoiding obstacles.
Computationally efficient real-time interpolation algorithm for non-uniform sampled biosignals
Eftekhar, Amir; Kindt, Wilko; Constandinou, Timothy G.
2016-01-01
This Letter presents a novel, computationally efficient interpolation method that has been optimised for use in electrocardiogram baseline drift removal. In the authors’ previous Letter three isoelectric baseline points per heartbeat are detected, and here utilised as interpolation points. As an extension from linear interpolation, their algorithm segments the interpolation interval and utilises different piecewise linear equations. Thus, the algorithm produces a linear curvature that is computationally efficient while interpolating non-uniform samples. The proposed algorithm is tested using sinusoids with different fundamental frequencies from 0.05 to 0.7 Hz and also validated with real baseline wander data acquired from the Massachusetts Institute of Technology University and Boston's Beth Israel Hospital (MIT-BIH) Noise Stress Database. The synthetic data results show a root mean square (RMS) error of 0.9 μV (mean), 0.63 μV (median) and 0.6 μV (standard deviation) per heartbeat on a 1 mVp–p 0.1 Hz sinusoid. On real data, they obtain an RMS error of 10.9 μV (mean), 8.5 μV (median) and 9.0 μV (standard deviation) per heartbeat. Cubic spline interpolation and linear interpolation, on the other hand, show 10.7 μV, 11.6 μV (mean), 7.8 μV, 8.9 μV (median) and 9.8 μV, 9.3 μV (standard deviation) per heartbeat. PMID:27382478
Computationally efficient real-time interpolation algorithm for non-uniform sampled biosignals.
Guven, Onur; Eftekhar, Amir; Kindt, Wilko; Constandinou, Timothy G
2016-06-01
This Letter presents a novel, computationally efficient interpolation method that has been optimised for use in electrocardiogram baseline drift removal. In the authors' previous Letter three isoelectric baseline points per heartbeat are detected, and here utilised as interpolation points. As an extension from linear interpolation, their algorithm segments the interpolation interval and utilises different piecewise linear equations. Thus, the algorithm produces a linear curvature that is computationally efficient while interpolating non-uniform samples. The proposed algorithm is tested using sinusoids with different fundamental frequencies from 0.05 to 0.7 Hz and also validated with real baseline wander data acquired from the Massachusetts Institute of Technology University and Boston's Beth Israel Hospital (MIT-BIH) Noise Stress Database. The synthetic data results show a root mean square (RMS) error of 0.9 μV (mean), 0.63 μV (median) and 0.6 μV (standard deviation) per heartbeat on a 1 mVp-p 0.1 Hz sinusoid. On real data, they obtain an RMS error of 10.9 μV (mean), 8.5 μV (median) and 9.0 μV (standard deviation) per heartbeat. Cubic spline interpolation and linear interpolation, on the other hand, show 10.7 μV, 11.6 μV (mean), 7.8 μV, 8.9 μV (median) and 9.8 μV, 9.3 μV (standard deviation) per heartbeat.
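The plain linear interpolation that the Letter takes as its starting point can be sketched as below for non-uniformly spaced baseline points. This is only the simpler reference method, not the authors' segmented piecewise-linear algorithm. The function names, the choice to hold the end values outside the knot range, and the toy numbers are all assumptions.

```python
from bisect import bisect_right


def interpolate_baseline(knot_times, knot_values, query_times):
    """Linearly interpolate a baseline through non-uniformly spaced knots.

    `knot_times` must be strictly increasing. Queries outside the knot
    range hold the nearest end value (an assumption of this sketch).
    """
    if len(knot_times) != len(knot_values) or len(knot_times) < 2:
        raise ValueError("need at least two knots of matching length")
    baseline = []
    for t in query_times:
        if t <= knot_times[0]:
            baseline.append(knot_values[0])
        elif t >= knot_times[-1]:
            baseline.append(knot_values[-1])
        else:
            hi = bisect_right(knot_times, t)  # first knot strictly after t
            lo = hi - 1
            frac = (t - knot_times[lo]) / (knot_times[hi] - knot_times[lo])
            baseline.append(
                knot_values[lo] + frac * (knot_values[hi] - knot_values[lo])
            )
    return baseline


def remove_baseline(sample_times, samples, knot_times, knot_values):
    """Subtract the interpolated baseline from each sample."""
    baseline = interpolate_baseline(knot_times, knot_values, sample_times)
    return [s - b for s, b in zip(samples, baseline)]


# Toy example: three unevenly spaced baseline points.
estimated = interpolate_baseline([0.0, 1.0, 3.0], [0.0, 2.0, 0.0], [0.5, 2.0, 3.0])
```

Each query costs one binary search plus a handful of arithmetic operations, which is the kind of low per-sample cost that makes piecewise-linear schemes attractive for real-time use.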
DOE Office of Scientific and Technical Information (OSTI.GOV)
Madduri, Kamesh; Im, Eun-Jin; Ibrahim, Khaled Z.
The next decade of high-performance computing (HPC) systems will see a rapid evolution and divergence of multi- and manycore architectures as power and cooling constraints limit increases in microprocessor clock speeds. Understanding efficient optimization methodologies on diverse multicore designs in the context of demanding numerical methods is one of the greatest challenges faced today by the HPC community. In this paper, we examine the efficient multicore optimization of GTC, a petascale gyrokinetic toroidal fusion code for studying plasma microturbulence in tokamak devices. For GTC’s key computational components (charge deposition and particle push), we explore efficient parallelization strategies across a broad range of emerging multicore designs, including the recently-released Intel Nehalem-EX, the AMD Opteron Istanbul, and the highly multithreaded Sun UltraSparc T2+. We also present the first study on tuning gyrokinetic particle-in-cell (PIC) algorithms for graphics processors, using the NVIDIA C2050 (Fermi). Our work discusses several novel optimization approaches for gyrokinetic PIC, including mixed-precision computation, particle binning and decomposition strategies, grid replication, SIMDized atomic floating-point operations, and effective GPU texture memory utilization. Overall, we achieve significant performance improvements of 1.3–4.7× on these complex PIC kernels, despite the inherent challenges of data dependency and locality. Finally, our work also points to several architectural and programming features that could significantly enhance PIC performance and productivity on next-generation architectures.
Flow dynamics and energy efficiency of flow in the left ventricle during myocardial infarction.
Vasudevan, Vivek; Low, Adriel Jia Jun; Annamalai, Sarayu Parimal; Sampath, Smita; Poh, Kian Keong; Totman, Teresa; Mazlan, Muhammad; Croft, Grace; Richards, A Mark; de Kleijn, Dominique P V; Chin, Chih-Liang; Yap, Choon Hwai
2017-10-01
Cardiovascular disease is a leading cause of death worldwide, where myocardial infarction (MI) is a major category. After infarction, the heart has difficulty providing sufficient energy for circulation, and thus, understanding the heart's energy efficiency is important. We induced MI in a porcine animal model via circumflex ligation and acquired multiple-slice cine magnetic resonance (MR) images in a longitudinal manner: before infarction, and 1 week (acute) and 4 weeks (chronic) after infarction. Computational fluid dynamic simulations were performed based on MR images to obtain detailed fluid dynamics and energy dynamics of the left ventricles. Results showed that the energy efficiency of flow through the heart decreased at the acute time point. Since the heart was observed to experience changes in heart rate, stroke volume and chamber size over the two post-infarction time points, simulations were performed to test the effect of each of the three parameters. Increasing heart rate and stroke volume were found to significantly decrease flow energy efficiency, but the effect of chamber size was inconsistent. Strong complex interplay was observed between the three parameters, necessitating the use of non-dimensional parameterization to characterize flow energy efficiency. The ratio of Reynolds to Strouhal number, which is a form of Womersley number, was found to be the most effective non-dimensional parameter to represent energy efficiency of flow in the heart. We believe that this non-dimensional number can be computed for clinical cases via ultrasound and hypothesize that it can serve as a biomarker for clinical evaluations.
Advanced Unstructured Grid Generation for Complex Aerodynamic Applications
NASA Technical Reports Server (NTRS)
Pirzadeh, Shahyar Z.
2008-01-01
A new approach for distribution of grid points on the surface and in the volume has been developed and implemented in the NASA unstructured grid generation code VGRID. In addition to the point and line sources of prior work, the new approach utilizes surface and volume sources for automatic curvature-based grid sizing and convenient point distribution in the volume. A new exponential growth function produces smoother and more efficient grids and provides superior control over distribution of grid points in the field. All types of sources support anisotropic grid stretching which not only improves the grid economy but also provides more accurate solutions for certain aerodynamic applications. The new approach does not require a three-dimensional background grid as in the previous methods. Instead, it makes use of an efficient bounding-box auxiliary medium for storing grid parameters defined by surface sources. The new approach is less memory-intensive and more efficient computationally. The grids generated with the new method either eliminate the need for adaptive grid refinement for a certain class of problems or provide high quality initial grids that would enhance the performance of many adaptation methods.
Advanced Unstructured Grid Generation for Complex Aerodynamic Applications
NASA Technical Reports Server (NTRS)
Pirzadeh, Shahyar
2010-01-01
A new approach for distribution of grid points on the surface and in the volume has been developed. In addition to the point and line sources of prior work, the new approach utilizes surface and volume sources for automatic curvature-based grid sizing and convenient point distribution in the volume. A new exponential growth function produces smoother and more efficient grids and provides superior control over distribution of grid points in the field. All types of sources support anisotropic grid stretching which not only improves the grid economy but also provides more accurate solutions for certain aerodynamic applications. The new approach does not require a three-dimensional background grid as in the previous methods. Instead, it makes use of an efficient bounding-box auxiliary medium for storing grid parameters defined by surface sources. The new approach is less memory-intensive and more efficient computationally. The grids generated with the new method either eliminate the need for adaptive grid refinement for a certain class of problems or provide high quality initial grids that would enhance the performance of many adaptation methods.
Gartner, Thomas E; Epps, Thomas H; Jayaraman, Arthi
2016-11-08
We describe an extension of the Gibbs ensemble molecular dynamics (GEMD) method for studying phase equilibria. Our modifications to GEMD allow for direct control over particle transfer between phases and improve the method's numerical stability. Additionally, we found that the modified GEMD approach had advantages in computational efficiency in comparison to a hybrid Monte Carlo (MC)/MD Gibbs ensemble scheme in the context of the single component Lennard-Jones fluid. We note that this increase in computational efficiency does not compromise the close agreement of phase equilibrium results between the two methods. However, numerical instabilities in the GEMD scheme hamper GEMD's use near the critical point. We propose that the computationally efficient GEMD simulations can be used to map out the majority of the phase window, with hybrid MC/MD used as a follow up for conditions under which GEMD may be unstable (e.g., near-critical behavior). In this manner, we can capitalize on the contrasting strengths of these two methods to enable the efficient study of phase equilibria for systems that present challenges for a purely stochastic GEMC method, such as dense or low temperature systems, and/or those with complex molecular topologies.
Computer Program for the Design and Off-Design Performance of Turbojet and Turbofan Engine Cycles
NASA Technical Reports Server (NTRS)
Morris, S. J.
1978-01-01
The rapid computer program is designed to be run in a stand-alone mode or operated within a larger program. The computation is based on a simplified one-dimensional gas turbine cycle. Each component in the engine is modeled thermodynamically. The component efficiencies used in the thermodynamic modeling are scaled for the off-design conditions from input design point values using empirical trends which are included in the computer code. The engine cycle program is capable of producing reasonable engine performance prediction with a minimum of computer execute time. The current computer execute time on the IBM 360/67 for one Mach number, one altitude, and one power setting is about 0.1 seconds. The principal assumption used in the calculation is that the compressor is operated along a line of maximum adiabatic efficiency on the compressor map. The fluid properties are computed for the combustion mixture, but dissociation is not included. The procedure included in the program is only for the combustion of JP-4, methane, or hydrogen.
Tsumori, Nobuhiro; Takahashi, Motoki; Sakuma, Yoshiki; Saiki, Toshiharu
2011-10-10
We examined the near-field collection efficiency of near-infrared radiation for an aperture probe. We used InAs quantum dots as ideal point light sources with emission wavelengths ranging from 1.1 to 1.6 μm. We experimentally investigated the wavelength dependence of the collection efficiency and compared the results with computational simulations that modeled the actual probe structure. The observed degradation in the collection efficiency is attributed to the cutoff characteristics of the gold-clad tapered waveguide, which approaches an ideal conductor at near-infrared wavelengths. © 2011 Optical Society of America
Full On-Device Stay Points Detection in Smartphones for Location-Based Mobile Applications.
Pérez-Torres, Rafael; Torres-Huitzil, César; Galeana-Zapién, Hiram
2016-10-13
The tracking of frequently visited places, also known as stay points, is a critical feature in location-aware mobile applications as a way to adapt the information and services provided to smartphone users according to their moving patterns. Location-based applications usually employ the GPS receiver along with Wi-Fi hot-spots and cellular cell tower mechanisms for estimating user location. Typically, fine-grained GPS location data are collected by the smartphone and transferred to dedicated servers for trajectory analysis and stay points detection. Such a Mobile Cloud Computing approach has been successfully employed for extending the smartphone's battery lifetime by exchanging computation costs, assuming that on-device stay points detection is prohibitive. In this article, we propose and validate the feasibility of having an alternative event-driven mechanism for stay points detection that is executed fully on-device, and that provides higher energy savings by avoiding communication costs. Our solution is encapsulated in a sensing middleware for Android smartphones, where a stream of GPS location updates is collected in the background, supporting duty cycling schemes, and incrementally analyzed following an event-driven paradigm for stay points detection. To evaluate the performance of the proposed middleware, real world experiments were conducted under different stress levels, validating its power efficiency when compared against a Mobile Cloud Computing oriented solution.
Strategies for efficient resolution analysis in full-waveform inversion
NASA Astrophysics Data System (ADS)
Fichtner, A.; van Leeuwen, T.; Trampert, J.
2016-12-01
Full-waveform inversion is developing into a standard method in the seismological toolbox. It combines numerical wave propagation for heterogeneous media with adjoint techniques in order to improve tomographic resolution. However, resolution becomes increasingly difficult to quantify because of the enormous computational requirements. Here we present two families of methods that can be used for efficient resolution analysis in full-waveform inversion. They are based on the targeted extraction of resolution proxies from the Hessian matrix, which is too large to store and to compute explicitly. Fourier methods rest on the application of the Hessian to Earth models with harmonic oscillations. This yields the Fourier spectrum of the Hessian for few selected wave numbers, from which we can extract properties of the tomographic point-spread function for any point in space. Random probing methods use uncorrelated, random test models instead of harmonic oscillations. Auto-correlating the Hessian-model applications for sufficiently many test models also characterises the point-spread function. Both Fourier and random probing methods provide a rich collection of resolution proxies. These include position- and direction-dependent resolution lengths, and the volume of point-spread functions as indicator of amplitude recovery and inter-parameter trade-offs. The computational requirements of these methods are equivalent to approximately 7 conjugate-gradient iterations in full-waveform inversion. This is significantly less than the optimisation itself, which may require tens to hundreds of iterations to reach convergence. In addition to the theoretical foundations of the Fourier and random probing methods, we show various illustrative examples from real-data full-waveform inversion for crustal and mantle structure.
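To make the random-probing idea concrete, here is a minimal sketch of a Hutchinson-style diagonal estimator, which recovers the diagonal of an operator from matrix-vector products with random test vectors. This is a simpler, well-known technique swapped in for illustration; it is not the autocorrelation-based point-spread-function analysis of the abstract. The 3x3 matrix `H`, the helper `apply_h`, and the probe count are hypothetical stand-ins for a Hessian that would in practice be available only through its action on a model.

```python
import random


def estimate_diagonal(apply_h, n, num_probes, seed=0):
    """Estimate diag(H) from products H*v with random +/-1 vectors v.

    For independent +/-1 entries, the expectation of v[i] * (H v)[i]
    equals H[i][i], so averaging over probes converges to the diagonal.
    """
    rng = random.Random(seed)
    acc = [0.0] * n
    for _ in range(num_probes):
        v = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        hv = apply_h(v)                  # only H*v is needed, never H itself
        for i in range(n):
            acc[i] += v[i] * hv[i]
    return [a / num_probes for a in acc]


# Hypothetical small symmetric operator standing in for a Hessian.
H = [[4.0, 0.5, 0.0],
     [0.5, 3.0, 0.2],
     [0.0, 0.2, 2.0]]


def apply_h(v):
    """Matrix-free interface: return H*v for a given vector v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in H]


diag_estimate = estimate_diagonal(apply_h, n=3, num_probes=2000)
```

The estimate approaches the true diagonal (4, 3, 2) as the number of probes grows, with an error that shrinks roughly as the inverse square root of the probe count.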
Multi-petascale highly efficient parallel supercomputer
Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen -Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O'Brien, John K.; O'Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Smith, Brian; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng
2015-07-14
A multi-petascale, highly efficient parallel supercomputer provides 100 petaOPS-scale computing at decreased cost, power and footprint, and allows for a maximum packaging density of processing nodes from an interconnect point of view. The supercomputer exploits technological advances in VLSI that enable a computing model in which many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, each having full access to all system resources. This enables adaptive partitioning of the processors to functions such as compute or messaging I/O on an application-by-application basis and, preferably, adaptive partitioning of functions in accordance with various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that maximizes the throughput of packet communications between nodes and minimizes latency.
NASA Astrophysics Data System (ADS)
Böhm, J.; Bredif, M.; Gierlinger, T.; Krämer, M.; Lindenberg, R.; Liu, K.; Michel, F.; Sirmacek, B.
2016-06-01
Current 3D data capturing as implemented on for example airborne or mobile laser scanning systems is able to efficiently sample the surface of a city by billions of unselective points during one working day. What is still difficult is to extract and visualize meaningful information hidden in these point clouds with the same efficiency. This is where the FP7 IQmulus project enters the scene. IQmulus is an interactive facility for processing and visualizing big spatial data. In this study the potential of IQmulus is demonstrated on a laser mobile mapping point cloud of 1 billion points sampling ~ 10 km of street environment in Toulouse, France. After the data is uploaded to the IQmulus Hadoop Distributed File System, a workflow is defined by the user consisting of retiling the data followed by a PCA driven local dimensionality analysis, which runs efficiently on the IQmulus cloud facility using a Spark implementation. Points scattering in 3 directions are clustered in the tree class, and are separated next into individual trees. Five hours of processing at the 12 node computing cluster results in the automatic identification of 4000+ urban trees. Visualization of the results in the IQmulus fat client helps users to appreciate the results, and developers to identify remaining flaws in the processing workflow.
NASA Astrophysics Data System (ADS)
Dimitrakopoulos, Panagiotis
2018-03-01
The calculation of polytropic efficiencies is a very important task, especially during the development of new compression units, like compressor impellers, stages and stage groups. Such calculations are also crucial for the determination of the performance of a whole compressor. As processors and computational capacities have been substantially improved in recent years, the need has emerged for a new, rigorous, robust, accurate and at the same time standardized method for the computation of polytropic efficiencies, especially one based on the thermodynamics of real gases. The proposed method is based on the rigorous definition of the polytropic efficiency. The input consists of pressure and temperature values at the end points of the compression path (suction and discharge), for a given working fluid. The average relative error for the studied cases was 0.536 %. Thus, this high-accuracy method is proposed for efficiency calculations related to turbocompressors and their compression units, especially when they are operating at high power levels, for example in jet engines and high-power plants.
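For orientation, the sketch below computes the polytropic efficiency of a compression from the same four inputs (suction and discharge pressure and temperature) using the textbook ideal-gas relation with a constant heat capacity ratio. This is a deliberate simplification and not the real-gas method proposed in the abstract; the function name and the default `gamma=1.4` (roughly air) are assumptions made for the example.

```python
import math


def polytropic_efficiency_ideal_gas(p1, t1, p2, t2, gamma=1.4):
    """Polytropic compression efficiency for an ideal gas, constant gamma.

    p1, p2: suction and discharge pressure (same units, absolute).
    t1, t2: suction and discharge temperature (kelvin).
    Uses eta_p = ((gamma - 1) / gamma) * ln(p2 / p1) / ln(t2 / t1).
    """
    if p2 <= p1 or t2 <= t1:
        raise ValueError("expected a compression: p2 > p1 and t2 > t1")
    return ((gamma - 1.0) / gamma) * math.log(p2 / p1) / math.log(t2 / t1)
```

A loss-free (isentropic) compression gives an efficiency of exactly 1 under these assumptions, and any extra temperature rise at the same pressure ratio lowers the value.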
A wavefront orientation method for precise numerical determination of tsunami travel time
NASA Astrophysics Data System (ADS)
Fine, I. V.; Thomson, R. E.
2013-04-01
We present a highly accurate and computationally efficient method (herein, the "wavefront orientation method") for determining the travel time of oceanic tsunamis. Based on Huygens principle, the method uses an eight-point grid-point pattern and the most recent information on the orientation of the advancing wave front to determine the time for a tsunami to travel to a specific oceanic location. The method is shown to provide improved accuracy and reduced anisotropy compared with the conventional multiple grid-point method presently in widespread use.
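As background for this abstract, the following sketch computes tsunami travel times on a regular depth grid with a plain shortest-path search over the eight neighbouring grid points, using the long-wave speed sqrt(g*h). It is in the spirit of the grid-point methods the abstract compares against and does not implement the wavefront orientation method itself. The function name, the flat-grid assumption (uniform spacing `dx` in metres) and the averaging of speeds between neighbouring cells are choices made for the example.

```python
import heapq
import math

G = 9.81  # gravitational acceleration, m/s^2

# Offsets to the eight neighbouring grid points.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def travel_times(depth, dx, source):
    """Dijkstra travel times (seconds) from source over a depth grid.

    depth: 2D list of water depths in metres (<= 0 means land).
    dx: grid spacing in metres; source: (row, col) of a water cell.
    """
    rows, cols = len(depth), len(depth[0])
    times = [[math.inf] * cols for _ in range(rows)]
    sr, sc = source
    times[sr][sc] = 0.0
    heap = [(0.0, sr, sc)]
    while heap:
        t, r, c = heapq.heappop(heap)
        if t > times[r][c]:
            continue                      # stale heap entry
        speed_here = math.sqrt(G * depth[r][c])
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if depth[nr][nc] <= 0.0:
                continue                  # waves do not cross land cells
            speed_there = math.sqrt(G * depth[nr][nc])
            dist = dx * math.hypot(dr, dc)
            nt = t + 2.0 * dist / (speed_here + speed_there)
            if nt < times[nr][nc]:
                times[nr][nc] = nt
                heapq.heappush(heap, (nt, nr, nc))
    return times
```

Because paths are restricted to eight directions, travel times along other headings are overestimated; this directional bias is the kind of anisotropy the abstract reports reducing.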
NASA Astrophysics Data System (ADS)
Caliari, Marco; Zuccher, Simone
2017-04-01
Although Fourier series approximation is ubiquitous in computational physics owing to the Fast Fourier Transform (FFT) algorithm, efficient techniques for the fast evaluation of a three-dimensional truncated Fourier series at a set of arbitrary points are quite rare, especially in MATLAB language. Here we employ the Nonequispaced Fast Fourier Transform (NFFT, by J. Keiner, S. Kunis, and D. Potts), a C library designed for this purpose, and provide a Matlab® and GNU Octave interface that makes NFFT easily available to the Numerical Analysis community. We test the effectiveness of our package in the framework of quantum vortex reconnections, where pseudospectral Fourier methods are commonly used and local high resolution is required in the post-processing stage. We show that the efficient evaluation of a truncated Fourier series at arbitrary points provides excellent results at a computational cost much smaller than carrying out a numerical simulation of the problem on a sufficiently fine regular grid that can reproduce comparable details of the reconnecting vortices.
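The operation that NFFT accelerates can be stated in a few lines. The sketch below evaluates a truncated Fourier series at arbitrary points by direct summation, shown in one dimension for brevity; its cost grows as the number of coefficients times the number of points, which is what makes a fast algorithm attractive. It does not call the NFFT library or the interface described in the abstract, and the function name and the dictionary layout of the coefficients are assumptions for the example.

```python
import cmath


def eval_fourier_series(coeffs, points):
    """Directly evaluate f(x) = sum_k c_k * exp(i*k*x) at arbitrary points.

    coeffs: dict mapping integer wavenumber k to complex coefficient c_k.
    points: iterable of real x values (need not be equispaced).
    """
    return [sum(c * cmath.exp(1j * k * x) for k, c in coeffs.items())
            for x in points]
```

With coefficients of 0.5 at wavenumbers +1 and -1, the series reduces to cos(x), which gives a quick sanity check at any chosen point.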
2010-10-01
bodies becomes greater as surface asperities wear down (Hutchings, 1992). We characterize friction damage by a change in the friction coefficient...points are such a set, and satisfy an additional constraint in which the skew (third moment) is minimized, which reduces the average error for a...On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10, 197–208. Hutchings, I. M. (1992). Tribology: friction
NASA Astrophysics Data System (ADS)
Liu, Youshan; Teng, Jiwen; Xu, Tao; Badal, José
2017-05-01
The mass-lumped method avoids the cost of inverting the mass matrix and simultaneously maintains spatial accuracy by adopting additional interior integration points, known as cubature points. To date, such points are only known analytically in tensor domains, such as quadrilateral or hexahedral elements. Thus, the diagonal-mass-matrix spectral element method (SEM) in non-tensor domains always relies on numerically computed interpolation points or quadrature points. However, only the cubature points for degrees 1 to 6 are known, which is the reason that we have developed a p-norm-based optimization algorithm to obtain higher-order cubature points. In this way, we obtain and tabulate new cubature points with all positive integration weights for degrees 7 to 9. The dispersion analysis illustrates that the dispersion relation determined from the new optimized cubature points is comparable to that of the mass and stiffness matrices obtained by exact integration. Simultaneously, the Lebesgue constant for the new optimized cubature points indicates its surprisingly good interpolation properties. As a result, such points provide both good interpolation properties and integration accuracy. The Courant-Friedrichs-Lewy (CFL) numbers are tabulated for the conventional Fekete-based triangular spectral element (TSEM), the TSEM with exact integration, and the optimized cubature-based TSEM (OTSEM). A complementary study demonstrates the spectral convergence of the OTSEM. A numerical example conducted on a half-space model demonstrates that the OTSEM improves the accuracy by approximately one order of magnitude compared to the conventional Fekete-based TSEM. In particular, the accuracy of the 7th-order OTSEM is even higher than that of the 14th-order Fekete-based TSEM. Furthermore, the OTSEM produces a result that can compete in accuracy with the quadrilateral SEM (QSEM). The high accuracy of the OTSEM is also tested with a non-flat topography model. 
In terms of computational efficiency, the OTSEM is more efficient than the Fekete-based TSEM, although it is slightly costlier than the QSEM when a comparable numerical accuracy is required.
Numerical simulation using vorticity-vector potential formulation
NASA Technical Reports Server (NTRS)
Tokunaga, Hiroshi
1993-01-01
An accurate and efficient computational method is needed for three-dimensional incompressible viscous flows in engineering applications. On solving the turbulent shear flows directly or using the subgrid scale model, it is indispensable to resolve the small scale fluid motions as well as the large scale motions. From this point of view, the pseudo-spectral method has been used so far as the computational method. However, the finite difference or the finite element methods are widely applied for computing flows of practical importance since these methods are easily applied to flows with complex geometric configurations. There exist, however, several problems in applying the finite difference method to direct and large eddy simulations. Accuracy is one of the most important problems. This point was already addressed by the present author in direct simulations of the instability of the plane Poiseuille flow and also of the transition to turbulence. In order to obtain high efficiency, the multi-grid Poisson solver is combined with the higher-order, accurate finite difference method. The formulation method is also one of the most important problems in applying the finite difference method to incompressible turbulent flows. The three-dimensional Navier-Stokes equations have been solved so far in the primitive variables formulation. One of the major difficulties of this method is the rigorous satisfaction of the equation of continuity. In general, the staggered grid is used for the satisfaction of the solenoidal condition for the velocity field at the wall boundary. However, the velocity field satisfies the equation of continuity automatically in the vorticity-vector potential formulation. From this point of view, the vorticity-vector potential method was extended to the generalized coordinate system. 
In the present article, we adopt the vorticity-vector potential formulation, the generalized coordinate system, and the 4th-order accurate difference method as the computational method. We present the computational method and apply the present method to computations of flows in a square cavity at large Reynolds number in order to investigate its effectiveness.
Dimension improvement in Dhar's refutation of the Eden conjecture
NASA Astrophysics Data System (ADS)
Bertrand, Quentin; Pertinand, Jules
2018-03-01
We consider the Eden model on the d-dimensional hypercubical unoriented lattice, for large d. Initially, every lattice point is healthy, except the origin which is infected. Then, each infected lattice point contaminates any of its neighbours with rate 1. The Eden model is equivalent to first passage percolation, with exponential passage times on edges. The Eden conjecture states that the limit shape of the Eden model is a Euclidean ball. By pushing the computations of Dhar [5] a little further with modern computers and an efficient implementation, we obtain improved bounds for the speed of infection. This shows that the Eden conjecture does not hold in dimensions greater than 22 (the lowest dimension previously known was 35).
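The equivalence with first passage percolation mentioned above can be sketched directly: give every lattice edge an independent exponential passage time with rate 1 and compute shortest-path times from the origin. The code below does this inside a finite box, which is an assumption of the example (paths leaving the box are ignored, so times near the boundary are slightly overestimated). It is an illustration of the model only and has no connection to the bound computations of the paper; the function name and parameters are hypothetical.

```python
import heapq
import random


def infection_times(radius, dim=2, seed=0):
    """First-passage times from the origin on a box of Z^dim.

    Each undirected edge gets an independent Exp(1) passage time, sampled
    lazily and cached so both endpoints see the same value.
    """
    rng = random.Random(seed)
    origin = (0,) * dim
    times = {origin: 0.0}
    weights = {}
    done = set()
    heap = [(0.0, origin)]
    while heap:
        t, site = heapq.heappop(heap)
        if site in done:
            continue
        done.add(site)
        for axis in range(dim):
            for step in (-1, 1):
                nb = site[:axis] + (site[axis] + step,) + site[axis + 1:]
                if max(abs(x) for x in nb) > radius:
                    continue              # stay inside the finite box
                edge = (min(site, nb), max(site, nb))
                if edge not in weights:
                    weights[edge] = rng.expovariate(1.0)
                nt = t + weights[edge]
                if nt < times.get(nb, float("inf")):
                    times[nb] = nt
                    heapq.heappush(heap, (nt, nb))
    return times
```

The set of sites with infection time at most t is the infected cluster at time t, whose rescaled shape is the object of the conjecture.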
Unified Ultrasonic/Eddy-Current Data Acquisition
NASA Technical Reports Server (NTRS)
Chern, E. James; Butler, David W.
1993-01-01
Imaging station for detecting cracks and flaws in solid materials developed combining both ultrasonic C-scan and eddy-current imaging. Incorporation of both techniques into one system eliminates duplication of computers and of mechanical scanners; unifies acquisition, processing, and storage of data; reduces setup time for repetitious ultrasonic and eddy-current scans; and increases efficiency of system. Same mechanical scanner used to maneuver either ultrasonic or eddy-current probe over specimen and acquire point-by-point data. For ultrasonic scanning, probe linked to ultrasonic pulser/receiver circuit card, while, for eddy-current imaging, probe linked to impedance-analyzer circuit card. Both ultrasonic and eddy-current imaging subsystems share same desktop-computer controller, containing dedicated plug-in circuit boards for each.
Fast ground filtering for TLS data via Scanline Density Analysis
NASA Astrophysics Data System (ADS)
Che, Erzhuo; Olsen, Michael J.
2017-07-01
Terrestrial Laser Scanning (TLS) efficiently collects 3D information based on lidar (light detection and ranging) technology. TLS has been widely used in topographic mapping, engineering surveying, forestry, industrial facilities, cultural heritage, and so on. Ground filtering is a common procedure in lidar data processing, which separates the point cloud data into ground points and non-ground points. Effective ground filtering is helpful for subsequent procedures such as segmentation, classification, and modeling. Numerous ground filtering algorithms have been developed for Airborne Laser Scanning (ALS) data. However, many of these are error prone in application to TLS data because of its different angle of view and highly variable resolution. Further, many ground filtering techniques are limited in application within challenging topography and experience difficulty coping with some objects such as short vegetation, steep slopes, and so forth. Lastly, due to the large size of point cloud data, operations such as data traversing, multiple iterations, and neighbor searching significantly affect the computation efficiency. In order to overcome these challenges, we present an efficient ground filtering method for TLS data via a Scanline Density Analysis, which is very fast because it exploits the grid structure storing TLS data. The process first separates the ground candidates, density features, and unidentified points based on an analysis of point density within each scanline. Second, a region growth using the scan pattern is performed to cluster the ground candidates and further refine the ground points (clusters). In the experiment, the effectiveness, parameter robustness, and efficiency of the proposed method are demonstrated with datasets collected from an urban scene and a natural scene, respectively.
Secure and Efficient k-NN Queries
Asif, Hafiz; Vaidya, Jaideep; Shafiq, Basit; Adam, Nabil
2017-01-01
Given the morass of available data, ranking and best match queries are often used to find records of interest. As such, k-NN queries, which give the k closest matches to a query point, are of particular interest, and have many applications. We study this problem in the context of the financial sector, wherein an investment portfolio database is queried for matching portfolios. Given the sensitivity of the information involved, our key contribution is to develop a secure k-NN computation protocol that can enable the computation of k-NN queries in a distributed multi-party environment while taking domain semantics into account. The experimental results show that the proposed protocols are extremely efficient. PMID:29218333
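For readers unfamiliar with the query type, the sketch below shows a plain, non-secure k-NN query: it returns the k records closest to a query point under Euclidean distance. It involves no encryption, no multi-party computation and no domain semantics, so it only illustrates what the secure protocol in the abstract computes, not how. The function name and the tuple representation of records are assumptions for the example.

```python
import heapq
import math


def k_nearest(records, query, k):
    """Return the k records closest to query (Euclidean), nearest first.

    records: iterable of equal-length numeric tuples; query: one such tuple.
    """
    return heapq.nsmallest(k, records, key=lambda rec: math.dist(rec, query))
```

For example, with records (0, 0), (1, 1), (5, 5) and (2, 2) and the query (0.9, 0.9), asking for k = 2 returns (1, 1) followed by (0, 0).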
Progress on a Taylor weak statement finite element algorithm for high-speed aerodynamic flows
NASA Technical Reports Server (NTRS)
Baker, A. J.; Freels, J. D.
1989-01-01
A new finite element numerical Computational Fluid Dynamics (CFD) algorithm has matured to the point of efficiently solving two-dimensional high speed real-gas compressible flow problems in generalized coordinates on modern vector computer systems. The algorithm employs a Taylor Weak Statement classical Galerkin formulation, a variably implicit Newton iteration, and a tensor matrix product factorization of the linear algebra Jacobian under a generalized coordinate transformation. Allowing for a general two-dimensional conservation law system, the algorithm has been exercised on the Euler and laminar forms of the Navier-Stokes equations. Real-gas fluid properties are admitted, and numerical results verify solution accuracy, efficiency, and stability over a range of test problem parameters.
NASA Astrophysics Data System (ADS)
Wang, Xu-yang; Zhdanov, Dmitry D.; Potemin, Igor S.; Wang, Ying; Cheng, Han
2016-10-01
One of the challenges of augmented reality is a seamless combination of objects of the real and virtual worlds, for example light sources. We suggest measurement and computation models for reconstruction of the light source position. The model is based on the dependence of the luminance of a small diffuse surface directly illuminated by a point-like source placed at a short distance from the observer or camera. The advantage of the computational model is the ability to eliminate the effects of indirect illumination. The paper presents a number of examples to illustrate the efficiency and accuracy of the proposed method.
On automating domain connectivity for overset grids
NASA Technical Reports Server (NTRS)
Chiu, Ing-Tsau; Meakin, Robert L.
1995-01-01
An alternative method for domain connectivity among systems of overset grids is presented. Reference uniform Cartesian systems of points are used to achieve highly efficient domain connectivity, and form the basis for a future fully automated system. The Cartesian systems are used to approximate body surfaces and to map the computational space of component grids. By exploiting the characteristics of Cartesian systems, Chimera type hole-cutting and identification of donor elements for intergrid boundary points can be carried out very efficiently. The method is tested for a range of geometrically complex multiple-body overset grid systems. A dynamic hole expansion/contraction algorithm is also implemented to obtain optimum domain connectivity; however, it is tested only for geometry of generic shapes.
USDA-ARS's Scientific Manuscript database
Hydrologic models are essential tools for environmental assessment of agricultural non-point source pollution. The automatic calibration of hydrologic models, though efficient, demands significant computational power, which can limit its application. The study objective was to investigate a cost e...
Hierarchical extraction of urban objects from mobile laser scanning data
NASA Astrophysics Data System (ADS)
Yang, Bisheng; Dong, Zhen; Zhao, Gang; Dai, Wenxia
2015-01-01
Point clouds collected in urban scenes contain a huge number of points (e.g., billions), numerous objects with significant size variability, complex and incomplete structures, and variable point densities, raising great challenges for the automated extraction of urban objects in the field of photogrammetry, computer vision, and robotics. This paper addresses these challenges by proposing an automated method to extract urban objects robustly and efficiently. The proposed method generates multi-scale supervoxels from 3D point clouds using the point attributes (e.g., colors, intensities) and spatial distances between points, and then segments the supervoxels rather than individual points by combining graph based segmentation with multiple cues (e.g., principal direction, colors) of the supervoxels. The proposed method defines a set of rules for merging segments into meaningful units according to types of urban objects and forms the semantic knowledge of urban objects for the classification of objects. Finally, the proposed method extracts and classifies urban objects in a hierarchical order ranked by the saliency of the segments. Experiments show that the proposed method is efficient and robust for extracting buildings, streetlamps, trees, telegraph poles, traffic signs, cars, and enclosures from mobile laser scanning (MLS) point clouds, with an overall accuracy of 92.3%.
NASA Astrophysics Data System (ADS)
Levi, Michele; Steinhoff, Jan
2017-12-01
We present a novel public package ‘EFTofPNG’ for high precision computation in the effective field theory of post-Newtonian (PN) gravity, including spins. We created this package in view of the timely need to publicly share automated computation tools, which integrate the various types of physics manifested in the expected increasing influx of gravitational wave (GW) data. Hence, we created a free and open source package, which is self-contained, modular, all-inclusive, and accessible to the classical gravity community. The ‘EFTofPNG’ Mathematica package also uses the power of the ‘xTensor’ package, suited for complicated tensor computation, where our coding also strategically approaches the generic generation of Feynman contractions, which is universal to all perturbation theories in physics, by efficiently treating n-point functions as tensors of rank n. The package currently contains four independent units, which serve as subsidiaries to the main one. Its final unit serves as a pipeline chain for the obtainment of the final GW templates, and provides the full computation of derivatives and physical observables of interest. The upcoming ‘EFTofPNG’ package version 1.0 should cover the point mass sector, and all the spin sectors, up to the fourth PN order, and the two-loop level. We expect and strongly encourage public development of the package to improve its efficiency, and to extend it to further PN sectors, and observables useful for the waveform modelling.
Liu, Yong-Kuo; Chao, Nan; Xia, Hong; Peng, Min-Jun; Ayodeji, Abiodun
2018-05-17
This paper presents an improved and efficient virtual reality-based adaptive dose assessment method (VRBAM) applicable to the cutting and dismantling tasks in nuclear facility decommissioning. The method combines the modeling strength of virtual reality with the flexibility of adaptive technology. The initial geometry is designed with the three-dimensional computer-aided design tools, and a hybrid model composed of cuboids and a point-cloud is generated automatically according to the virtual model of the object. In order to improve the efficiency of dose calculation while retaining accuracy, the hybrid model is converted to a weighted point-cloud model, and the point kernels are generated by adaptively simplifying the weighted point-cloud model according to the detector position, an approach that is suitable for arbitrary geometries. The dose rates are calculated with the Point-Kernel method. To account for radiation scattering effects, buildup factors are calculated with the Geometric-Progression formula in the fitting function. The geometric modeling capability of VRBAM was verified by simulating basic geometries, which included a convex surface, a concave surface, a flat surface and their combination. The simulation results show that the VRBAM is more flexible and superior to other approaches in modeling complex geometries. In this paper, the computation time and dose rate results obtained from the proposed method were also compared with those obtained using the MCNP code and an earlier virtual reality-based method (VRBM) developed by the same authors. © 2018 IOP Publishing Ltd.
SCUT: clinical data organization for physicians using pen computers.
Wormuth, D. W.
1992-01-01
The role of computers in assisting physicians with patient care is rapidly advancing. One of the significant obstacles to efficient use of computers in patient care has been the unavailability of reasonably configured portable computers. Lightweight portable computers are becoming more attractive as physician data-management devices, but still pose a significant problem with bedside use. The advent of computers that accept input from a pen and have no keyboard presents a usable computer platform that enables physicians to perform clinical computing at the bedside. This paper describes a prototype system to maintain an electronic "scut" sheet. SCUT makes use of pen input and background rule checking to enhance patient care. GO Corporation's PenPoint Operating System is used to implement the SCUT project. PMID:1483012
Simulation of a combined-cycle engine
NASA Technical Reports Server (NTRS)
Vangerpen, Jon
1991-01-01
A FORTRAN computer program was developed to simulate the performance of combined-cycle engines. These engines combine features of both gas turbines and reciprocating engines. The computer program can simulate both design point and off-design operation. Widely varying engine configurations can be evaluated for their power, performance, and efficiency as well as the influence of altitude and air speed. Although the program was developed to simulate aircraft engines, it can be used with equal success for stationary and automotive applications.
A Single LiDAR-Based Feature Fusion Indoor Localization Algorithm.
Wang, Yun-Ting; Peng, Chao-Chung; Ravankar, Ankit A; Ravankar, Abhijeet
2018-04-23
In past years, there has been significant progress in the field of indoor robot localization. To precisely recover its position, a robot usually relies on multiple on-board sensors. Nevertheless, this affects the overall system cost and increases computation. In this research work, we consider a light detection and ranging (LiDAR) device as the only sensor for detecting surroundings and propose an efficient indoor localization algorithm. To reduce the computational effort and preserve localization robustness, a weighted parallel iterative closed point (WP-ICP) with interpolation is presented. Compared to the traditional ICP, the point cloud is first processed to extract corner and line features before applying point registration. Then, points labeled as corners are matched only with corner candidates. Similarly, points labeled as lines are matched only with line candidates. Moreover, their ICP confidence levels are also fused in the algorithm, which makes the pose estimation less sensitive to environment uncertainties. The proposed WP-ICP architecture reduces the probability of mismatch and thereby reduces the number of ICP iterations. Finally, based on given well-constructed indoor layouts, experimental comparisons are carried out under both clean and perturbed environments. It is shown that the proposed method significantly reduces computation effort while preserving localization precision.
A Single LiDAR-Based Feature Fusion Indoor Localization Algorithm
Wang, Yun-Ting; Peng, Chao-Chung; Ravankar, Ankit A.; Ravankar, Abhijeet
2018-01-01
In past years, there has been significant progress in the field of indoor robot localization. To precisely recover its position, a robot usually relies on multiple on-board sensors. Nevertheless, this affects the overall system cost and increases computation. In this research work, we consider a light detection and ranging (LiDAR) device as the only sensor for detecting surroundings and propose an efficient indoor localization algorithm. To reduce the computational effort and preserve localization robustness, a weighted parallel iterative closed point (WP-ICP) with interpolation is presented. Compared to the traditional ICP, the point cloud is first processed to extract corner and line features before applying point registration. Then, points labeled as corners are matched only with corner candidates. Similarly, points labeled as lines are matched only with line candidates. Moreover, their ICP confidence levels are also fused in the algorithm, which makes the pose estimation less sensitive to environment uncertainties. The proposed WP-ICP architecture reduces the probability of mismatch and thereby reduces the number of ICP iterations. Finally, based on given well-constructed indoor layouts, experimental comparisons are carried out under both clean and perturbed environments. It is shown that the proposed method significantly reduces computation effort while preserving localization precision. PMID:29690624
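The label-restricted matching step described in this abstract (corners matched only with corners, lines only with lines) can be illustrated with a minimal sketch. This is not the authors' WP-ICP implementation: the function name `match_by_label`, the `(x, y, label)` tuple layout, and the brute-force nearest-neighbour search are assumptions made for illustration, and the confidence weighting and interpolation are omitted.

```python
import math

def match_by_label(scan, reference):
    """Pair each scan point with its nearest reference point of the same label.

    scan, reference: lists of (x, y, label) tuples (hypothetical layout).
    Returns a list of (scan_index, reference_index) pairs.
    """
    pairs = []
    for i, (x, y, label) in enumerate(scan):
        best_j, best_d = None, math.inf
        for j, (rx, ry, rlabel) in enumerate(reference):
            if rlabel != label:
                continue  # corners never compete with line points
            d = math.hypot(x - rx, y - ry)
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            pairs.append((i, best_j))
    return pairs

scan = [(0.0, 0.0, "corner"), (1.0, 0.0, "line")]
reference = [(0.1, 0.0, "line"), (0.5, 0.0, "corner"), (1.2, 0.0, "line")]
pairs = match_by_label(scan, reference)
```

In this toy input the scan corner is paired with the only reference corner even though a line point lies closer, which is the kind of mismatch the label restriction is meant to avoid.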
NASA Astrophysics Data System (ADS)
Kacem, S.; Eichwald, O.; Ducasse, O.; Renon, N.; Yousfi, M.; Charrada, K.
2012-01-01
Streamer dynamics are characterized by the fast propagation of ionized shock waves at the nanosecond scale under very sharp space charge variations. Modelling streamer dynamics requires the solution of charged particle transport equations coupled to the elliptic Poisson's equation. The latter has to be solved at each time step of the streamer evolution in order to follow the propagation of the resulting space charge electric field. In the present paper, full multigrid (FMG) and multigrid (MG) methods have been adapted to solve Poisson's equation for streamer discharge simulations between asymmetric electrodes. The validity of the FMG method for the computation of the potential field is first shown by direct comparison with the analytic solution of the Laplacian potential in the case of a point-to-plane geometry. The efficiency of the method is also compared with the classical successive over-relaxation (SOR) method and the MUltifrontal Massively Parallel Solver (MUMPS). The MG method is then applied to the simulation of positive streamer propagation, and its efficiency is evaluated by comparison with the SOR and MUMPS methods in the chosen point-to-plane configuration. Very good agreement is obtained between the three methods for all electro-hydrodynamic characteristics of the streamer during its propagation in the inter-electrode gap. However, with the MG method, Poisson's equation is solved at least 2 times faster under our simulation conditions.
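For readers unfamiliar with the SOR baseline mentioned in this abstract, the following is a minimal sketch of successive over-relaxation on a 1D model problem, -u'' = f on (0, 1) with u(0) = u(1) = 0. The 1D setting, the function name `sor_poisson_1d`, the relaxation factor 1.8, and the stopping rule are illustrative assumptions; the paper itself treats a point-to-plane electrode geometry.

```python
import math

def sor_poisson_1d(f, n, omega=1.8, tol=1e-10, max_iter=20000):
    """Solve -u'' = f on (0, 1), u(0) = u(1) = 0, with SOR on n interior points.

    Returns (u, iterations); u includes the two boundary values.
    """
    h = 1.0 / (n + 1)
    u = [0.0] * (n + 2)
    rhs = [f(i * h) for i in range(n + 2)]
    for it in range(1, max_iter + 1):
        max_change = 0.0
        for i in range(1, n + 1):
            # Gauss-Seidel value from the 3-point Laplacian stencil
            gs = 0.5 * (u[i - 1] + u[i + 1] + h * h * rhs[i])
            # over-relax: blend the old value and the Gauss-Seidel value
            new = (1.0 - omega) * u[i] + omega * gs
            max_change = max(max_change, abs(new - u[i]))
            u[i] = new
        if max_change < tol:
            return u, it
    return u, max_iter

# Model problem with exact solution u(x) = sin(pi * x)
u, iters = sor_poisson_1d(lambda x: math.pi ** 2 * math.sin(math.pi * x), n=31)
```

The iteration count of a single-grid method such as this grows with grid resolution, which is the usual motivation for multigrid approaches.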
Gyrokinetic particle-in-cell optimization on emerging multi- and manycore platforms
Madduri, Kamesh; Im, Eun-Jin; Ibrahim, Khaled Z.; ...
2011-03-02
The next decade of high-performance computing (HPC) systems will see a rapid evolution and divergence of multi- and manycore architectures as power and cooling constraints limit increases in microprocessor clock speeds. Understanding efficient optimization methodologies on diverse multicore designs in the context of demanding numerical methods is one of the greatest challenges faced today by the HPC community. In this paper, we examine the efficient multicore optimization of GTC, a petascale gyrokinetic toroidal fusion code for studying plasma microturbulence in tokamak devices. For GTC’s key computational components (charge deposition and particle push), we explore efficient parallelization strategies across a broad range of emerging multicore designs, including the recently-released Intel Nehalem-EX, the AMD Opteron Istanbul, and the highly multithreaded Sun UltraSparc T2+. We also present the first study on tuning gyrokinetic particle-in-cell (PIC) algorithms for graphics processors, using the NVIDIA C2050 (Fermi). Our work discusses several novel optimization approaches for gyrokinetic PIC, including mixed-precision computation, particle binning and decomposition strategies, grid replication, SIMDized atomic floating-point operations, and effective GPU texture memory utilization. Overall, we achieve significant performance improvements of 1.3–4.7× on these complex PIC kernels, despite the inherent challenges of data dependency and locality. Finally, our work also points to several architectural and programming features that could significantly enhance PIC performance and productivity on next-generation architectures.
A fast point-cloud computing method based on spatial symmetry of Fresnel field
NASA Astrophysics Data System (ADS)
Wang, Xiangxiang; Zhang, Kai; Shen, Chuan; Zhu, Wenliang; Wei, Sui
2017-10-01
Real-time holographic video display systems require computer-generated holograms (CGH) with a high spatial-bandwidth product (SBP), which poses a great computational challenge. This paper builds on the point-cloud method and exploits two properties: the reversibility of Fresnel diffraction along the propagation direction, and the spatial symmetry of the fringe pattern of a point source, known as the Gabor zone plate, which can serve as a basis for fast calculation of the diffraction field in CGH. A fast Fresnel CGH method based on the novel look-up table (N-LUT) method is proposed. First, the principal fringe patterns (PFPs) at the virtual plane are pre-calculated by the acceleration algorithm and stored. Second, the Fresnel diffraction fringe pattern at the dummy plane is obtained. Finally, the field is propagated by Fresnel diffraction from the dummy plane to the hologram plane. Simulations and optical experiments based on Liquid Crystal On Silicon (LCOS) were set up to demonstrate the validity of the proposed method. While preserving the quality of the 3D reconstruction, the proposed method shortens the computation time and improves computational efficiency.
Full On-Device Stay Points Detection in Smartphones for Location-Based Mobile Applications
Pérez-Torres, Rafael; Torres-Huitzil, César; Galeana-Zapién, Hiram
2016-01-01
The tracking of frequently visited places, also known as stay points, is a critical feature in location-aware mobile applications as a way to adapt the information and services provided to smartphone users according to their moving patterns. Location-based applications usually employ the GPS receiver along with Wi-Fi hot-spots and cellular cell tower mechanisms for estimating user location. Typically, fine-grained GPS location data are collected by the smartphone and transferred to dedicated servers for trajectory analysis and stay points detection. Such a Mobile Cloud Computing approach has been successfully employed for extending the smartphone’s battery lifetime by exchanging computation costs, assuming that on-device stay points detection is prohibitive. In this article, we propose and validate the feasibility of an alternative event-driven mechanism for stay points detection that is executed fully on-device and that provides higher energy savings by avoiding communication costs. Our solution is encapsulated in a sensing middleware for Android smartphones, where a stream of GPS location updates is collected in the background, supporting duty cycling schemes, and incrementally analyzed following an event-driven paradigm for stay points detection. To evaluate the performance of the proposed middleware, real-world experiments were conducted under different stress levels, validating its power efficiency when compared against a Mobile Cloud Computing oriented solution. PMID:27754388
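The idea of incrementally analyzing a stream of GPS fixes can be sketched as follows. This is a generic distance-and-time threshold heuristic, not the middleware described in the abstract: the class name `StayPointDetector`, the 100 m and 600 s thresholds, and the use of the first fix of a candidate stay as its anchor are all assumptions made for illustration.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

class StayPointDetector:
    """Processes one fix at a time, so no trajectory has to be uploaded."""

    def __init__(self, dist_m=100.0, min_stay_s=600.0):
        self.dist_m = dist_m          # hypothetical radius of a stay
        self.min_stay_s = min_stay_s  # hypothetical minimum dwell time
        self.cluster = []             # fixes (t, lat, lon) of the candidate stay

    def _close_cluster(self):
        """Return (lat, lon, t_arrive, t_leave) if the dwell was long enough."""
        if not self.cluster:
            return None
        t_arrive, t_leave = self.cluster[0][0], self.cluster[-1][0]
        if t_leave - t_arrive < self.min_stay_s:
            return None
        n = len(self.cluster)
        lat = sum(fix[1] for fix in self.cluster) / n
        lon = sum(fix[2] for fix in self.cluster) / n
        return (lat, lon, t_arrive, t_leave)

    def on_fix(self, t, lat, lon):
        """Handle one location update; return a stay point when one completes."""
        if self.cluster:
            _, alat, alon = self.cluster[0]
            if haversine_m(alat, alon, lat, lon) > self.dist_m:
                stay = self._close_cluster()
                self.cluster = [(t, lat, lon)]  # start a new candidate stay
                return stay
        self.cluster.append((t, lat, lon))
        return None
```

A location callback would call `on_fix` for every update and react only when it returns a tuple, which keeps the per-fix work constant apart from the averaging done when a stay closes.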
NASA Technical Reports Server (NTRS)
Jothiprasad, Giridhar; Mavriplis, Dimitri J.; Caughey, David A.
2002-01-01
The rapid increase in available computational power over the last decade has enabled higher resolution flow simulations and more widespread use of unstructured grid methods for complex geometries. While much of this effort has been focused on steady-state calculations in the aerodynamics community, the need to accurately predict off-design conditions, which may involve substantial amounts of flow separation, points to the need to efficiently simulate unsteady flow fields. Accurate unsteady flow simulations can easily require several orders of magnitude more computational effort than a corresponding steady-state simulation. For this reason, techniques for improving the efficiency of unsteady flow simulations are required in order to make such calculations feasible in the foreseeable future. The purpose of this work is to investigate possible reductions in computer time due to the choice of an efficient time-integration scheme from a series of schemes differing in the order of time-accuracy, and by the use of more efficient techniques to solve the nonlinear equations which arise while using implicit time-integration schemes. This investigation is carried out in the context of a two-dimensional unstructured mesh laminar Navier-Stokes solver.
NASA Technical Reports Server (NTRS)
Lyster, P. M.; Liewer, P. C.; Decyk, V. K.; Ferraro, R. D.
1995-01-01
A three-dimensional electrostatic particle-in-cell (PIC) plasma simulation code has been developed on coarse-grain distributed-memory massively parallel computers with message passing communications. Our implementation is the generalization to three dimensions of the general concurrent particle-in-cell (GCPIC) algorithm. In the GCPIC algorithm, the particle computation is divided among the processors using a domain decomposition of the simulation domain. In a three-dimensional simulation, the domain can be partitioned into one-, two-, or three-dimensional subdomains ("slabs," "rods," or "cubes") and we investigate the efficiency of the parallel implementation of the push for all three choices. The present implementation runs on the Intel Touchstone Delta machine at Caltech, a multiple-instruction-multiple-data (MIMD) parallel computer with 512 nodes. We find that the parallel efficiency of the push is very high, with the ratio of communication to computation time in the range 0.3%-10.0%. The highest efficiency (> 99%) occurs for a large, scaled problem with 64^3 particles per processing node (approximately 134 million particles on 512 nodes) which has a push time of about 250 ns per particle per time step. We have also developed expressions for the timing of the code which are a function of both code parameters (number of grid points, particles, etc.) and machine-dependent parameters (effective FLOP rate, and the effective interprocessor bandwidths for the communication of particles and grid points). These expressions can be used to estimate the performance of scaled problems, including those with inhomogeneous plasmas, on other parallel machines once the machine-dependent parameters are known.
Identity-Based Authentication for Cloud Computing
NASA Astrophysics Data System (ADS)
Li, Hongwei; Dai, Yuanshun; Tian, Ling; Yang, Haomiao
Cloud computing is a recently developed technology for complex systems in which massive-scale services are shared among numerous users. Therefore, authentication of both users and services is a significant issue for the trust and security of cloud computing. The SSL Authentication Protocol (SAP), once applied in cloud computing, becomes so complicated that users face a heavy load in both computation and communication. This paper, based on the identity-based hierarchical model for cloud computing (IBHMCC) and its corresponding encryption and signature schemes, presents a new identity-based authentication protocol for cloud computing and services. Simulation testing shows that the authentication protocol is more lightweight and efficient than SAP, especially on the user side. This merit, together with its great scalability, makes our model well suited to the massive-scale cloud.
NASA Technical Reports Server (NTRS)
Ramins, Peter; Force, Dale A.; Kosmahl, Henry G.
1987-01-01
A computational procedure for the design of traveling-wave-tube (TWT)/refocuser/multistage depressed collector (MDC) systems was used to design a short, permanent-magnet refocusing system and a highly efficient MDC for a medium-power, dual-mode, 4.8- to 9.6-GHz TWT. The computations were carried out with advanced, multidimensional computer programs which model the electron beam and follow the trajectories of representative charges from the radiofrequency (RF) input of the TWT, through the slow-wave structure and refocusing section, to their points of impact in the depressed collector. Secondary emission losses in the MDC were treated semiquantitatively by injecting representative secondary-electron-emission current into the MDC analysis at the point of impact of each primary beam. A comparison of computed and measured TWT and MDC performance showed very good agreement. The electrodes of the MDC were fabricated from a particular form of isotropic graphite that was selected for its low secondary electron yield, ease of machinability, and vacuum properties.
NASA Technical Reports Server (NTRS)
Palmer, Grant; Venkatapathy, Ethiraj
1993-01-01
Three solution algorithms, explicit underrelaxation, point implicit, and lower upper symmetric Gauss-Seidel (LUSGS), are used to compute nonequilibrium flow around the Apollo 4 return capsule at 62 km altitude. By varying the Mach number, the efficiency and robustness of the solution algorithms were tested for different levels of chemical stiffness. The performance of the solution algorithms degraded as the Mach number and stiffness of the flow increased. At Mach 15, 23, and 30, the LUSGS method produces an eight order of magnitude drop in the L2 norm of the energy residual in 1/3 to 1/2 the Cray C-90 computer time as compared to the point implicit and explicit under-relaxation methods. The explicit under-relaxation algorithm experienced convergence difficulties at Mach 23 and above. At Mach 40 the performance of the LUSGS algorithm deteriorates to the point it is out-performed by the point implicit method. The effects of the viscous terms are investigated. Grid dependency questions are explored.
Efficient Delaunay Tessellation through K-D Tree Decomposition
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morozov, Dmitriy; Peterka, Tom
Delaunay tessellations are fundamental data structures in computational geometry. They are important in data analysis, where they can represent the geometry of a point set or approximate its density. The algorithms for computing these tessellations at scale perform poorly when the input data is unbalanced. We investigate the use of k-d trees to evenly distribute points among processes and compare two strategies for picking split points between domain regions. Because resulting point distributions no longer satisfy the assumptions of existing parallel Delaunay algorithms, we develop a new parallel algorithm that adapts to its input and prove its correctness. We evaluate the new algorithm using two late-stage cosmology datasets. The new running times are up to 50 times faster using k-d tree compared with regular grid decomposition. Moreover, in the unbalanced data sets, decomposing the domain into a k-d tree is up to five times faster than decomposing it into a regular grid.
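The load-balancing effect of a k-d tree decomposition can be seen in a small serial sketch: splitting at the median along alternating axes gives blocks of nearly equal size regardless of how clustered the points are. This is only an illustration of the splitting idea; the function name `kd_decompose` and the exact-median split via sorting are assumptions, and the abstract's parallel algorithm and its two split-point strategies are not reproduced here.

```python
def kd_decompose(points, levels, axis=0):
    """Split points into up to 2**levels blocks of near-equal size.

    points: list of coordinate tuples; the split axis cycles with depth.
    """
    if levels == 0 or len(points) <= 1:
        return [points]
    dim = len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2          # median split balances the two halves
    nxt = (axis + 1) % dim
    return (kd_decompose(ordered[:mid], levels - 1, nxt)
            + kd_decompose(ordered[mid:], levels - 1, nxt))

# Strongly clustered input: 90 points near the origin, 10 far away
clustered = [(0.001 * i, 0.002 * i) for i in range(90)]
clustered += [(100.0 + i, 50.0 + i) for i in range(10)]
blocks = kd_decompose(clustered, levels=2)
sizes = [len(b) for b in blocks]
```

A regular 2 x 2 grid over the same bounding box would place 90 of the 100 points in a single cell, whereas the median splits yield four blocks of 25 points each.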
NASA Astrophysics Data System (ADS)
López, J.; Hernández, J.; Gómez, P.; Faura, F.
2018-02-01
The VOFTools library includes efficient analytical and geometrical routines for (1) area/volume computation, (2) truncation operations that typically arise in VOF (volume of fluid) methods, (3) area/volume conservation enforcement (VCE) in PLIC (piecewise linear interface calculation) reconstruction, and (4) computation of the distance from a given point to the reconstructed interface. The computation of a polyhedron volume uses an efficient formula based on a quadrilateral decomposition and a 2D projection of each polyhedron face. The analytical VCE method is based on coupling an interpolation procedure to bracket the solution with an improved final calculation step based on the above volume computation formula. Although the library was originally created to help develop highly accurate advection and reconstruction schemes in the context of VOF methods, it may have more general applications. To assess the performance of the supplied routines, different tests, which are provided in FORTRAN and C, were implemented for several 2D and 3D geometries.
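As background for the volume computation mentioned in this abstract, the sketch below computes a polyhedron volume with the textbook signed-tetrahedron (divergence theorem) approach. It is deliberately not the library's formula, which the abstract describes as a quadrilateral decomposition with a 2D projection of each face; the function name `polyhedron_volume` and the face representation are assumptions for illustration.

```python
def polyhedron_volume(vertices, faces):
    """Volume of a closed polyhedron.

    vertices: list of (x, y, z) tuples.
    faces: lists of vertex indices, ordered counter-clockwise when the face
           is viewed from outside the polyhedron.
    """
    six_vol = 0.0
    for face in faces:
        a = vertices[face[0]]
        # fan-triangulate the face and sum signed tetrahedra with the origin
        for k in range(1, len(face) - 1):
            b = vertices[face[k]]
            c = vertices[face[k + 1]]
            # scalar triple product a . (b x c)
            six_vol += (a[0] * (b[1] * c[2] - b[2] * c[1])
                        + a[1] * (b[2] * c[0] - b[0] * c[2])
                        + a[2] * (b[0] * c[1] - b[1] * c[0]))
    return six_vol / 6.0

# Unit cube with outward-oriented faces
cube_vertices = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
                 (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
cube_faces = [[0, 3, 2, 1], [4, 5, 6, 7], [0, 1, 5, 4],
              [3, 7, 6, 2], [0, 4, 7, 3], [1, 2, 6, 5]]
cube_volume = polyhedron_volume(cube_vertices, cube_faces)
```

The result is independent of where the origin lies as long as the surface is closed and all faces are oriented consistently.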
Efficient clustering aggregation based on data fragments.
Wu, Ou; Hu, Weiming; Maybank, Stephen J; Zhu, Mingliang; Li, Bing
2012-06-01
Clustering aggregation, known as clustering ensembles, has emerged as a powerful technique for combining different clustering results to obtain a single better clustering. Existing clustering aggregation algorithms are applied directly to data points, in what is referred to as the point-based approach. The algorithms are inefficient if the number of data points is large. We define an efficient approach for clustering aggregation based on data fragments. In this fragment-based approach, a data fragment is any subset of the data that is not split by any of the clustering results. To establish the theoretical bases of the proposed approach, we prove that clustering aggregation can be performed directly on data fragments under two widely used goodness measures for clustering aggregation taken from the literature. Three new clustering aggregation algorithms are described. The experimental results obtained using several public data sets show that the new algorithms have lower computational complexity than three well-known existing point-based clustering aggregation algorithms (Agglomerative, Furthest, and LocalSearch); nevertheless, the new algorithms do not sacrifice the accuracy.
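The definition of a data fragment given in this abstract (a subset of the data that no input clustering splits) suggests a direct construction: points that carry the same label in every clustering belong to the same maximal fragment. The sketch below follows that reading; the function name `data_fragments` and the list-of-label-lists input format are assumptions, and the aggregation algorithms themselves are not shown.

```python
from collections import defaultdict

def data_fragments(clusterings):
    """Group point indices that share a label in every clustering.

    clusterings: list of label lists, each of length n (one label per point).
    Returns a list of fragments, each a list of point indices.
    """
    n = len(clusterings[0])
    groups = defaultdict(list)
    for i in range(n):
        # the tuple of labels across all clusterings identifies the fragment
        signature = tuple(labels[i] for labels in clusterings)
        groups[signature].append(i)
    return list(groups.values())

clusterings = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
]
fragments = data_fragments(clusterings)
```

Six points collapse to four fragments in this toy input, so an aggregation algorithm working on fragments would handle four items instead of six; the saving grows when the input clusterings agree on large groups of points.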
NASA Astrophysics Data System (ADS)
Torrungrueng, Danai; Johnson, Joel T.; Chou, Hsi-Tseng
2002-03-01
The novel spectral acceleration (NSA) algorithm has been shown to produce an $\mathcal{O}(N_{tot})$ efficient iterative method of moments for the computation of radiation/scattering from both one-dimensional (1-D) and two-dimensional large-scale quasi-planar structures, where $N_{tot}$ is the total number of unknowns to be solved. This method accelerates the matrix-vector multiplication in an iterative method of moments solution and divides contributions between points into "strong" (exact matrix elements) and "weak" (NSA algorithm) regions. The NSA method is based on a spectral representation of the electromagnetic Green's function and appropriate contour deformation, resulting in a fast multipole-like formulation in which contributions from large numbers of points to a single point are evaluated simultaneously. In the standard NSA algorithm the NSA parameters are derived on the basis of the assumption that the outermost possible saddle point, $\phi_{s,max}$, along the real axis in the complex angular domain is small. For given height variations of quasi-planar structures, this assumption can be satisfied by adjusting the size of the strong region $L_s$. However, for quasi-planar structures with large height variations, the adjusted size of the strong region is typically large, resulting in significant increases in computational time for the computation of the strong-region contribution and degrading overall efficiency of the NSA algorithm. In addition, for the case of extremely large scale structures, studies based on the physical optics approximation and a flat surface assumption show that the given NSA parameters in the standard NSA algorithm may yield inaccurate results. In this paper, analytical formulas associated with the NSA parameters for an arbitrary value of $\phi_{s,max}$ are presented, resulting in more flexibility in selecting $L_s$ to compromise between the computation of the contributions of the strong and weak regions.
In addition, a "multilevel" algorithm, decomposing 1-D extremely large scale quasi-planar structures into more than one weak region and appropriately choosing the NSA parameters for each weak region, is incorporated into the original NSA method to improve its accuracy.
Efficient Skeletonization of Volumetric Objects.
Zhou, Yong; Toga, Arthur W
1999-07-01
Skeletonization promises to become a powerful tool for compact shape description, path planning, and other applications. However, current techniques can seldom efficiently process real, complicated 3D data sets, such as MRI and CT data of human organs. In this paper, we present an efficient voxel-coding based algorithm for skeletonization of 3D voxelized objects. The skeletons are interpreted as connected centerlines consisting of sequences of medial points of consecutive clusters. These centerlines are initially extracted as paths of voxels, followed by medial point replacement, refinement, smoothing, and connection operations. The voxel-coding techniques have been proposed for each of these operations in a uniform and systematic fashion. In addition to preserving basic connectivity and centeredness, the algorithm is characterized by straightforward computation, no sensitivity to object boundary complexity, explicit extraction of ready-to-parameterize and branch-controlled skeletons, and efficient object hole detection. These issues are rarely discussed in traditional methods. A range of 3D medical MRI and CT data sets were used for testing the algorithm, demonstrating its utility.
Using speech recognition to enhance the Tongue Drive System functionality in computer access.
Huo, Xueliang; Ghovanloo, Maysam
2011-01-01
Tongue Drive System (TDS) is a wireless tongue operated assistive technology (AT), which can enable people with severe physical disabilities to access computers and drive powered wheelchairs using their volitional tongue movements. TDS offers six discrete commands, simultaneously available to the users, for pointing and typing as a substitute for mouse and keyboard in computer access, respectively. To enhance the TDS performance in typing, we have added a microphone, an audio codec, and a wireless audio link to its readily available 3-axial magnetic sensor array, and combined it with a commercially available speech recognition software, the Dragon Naturally Speaking, which is regarded as one of the most efficient ways for text entry. Our preliminary evaluations indicate that the combined TDS and speech recognition technologies can provide end users with significantly higher performance than using each technology alone, particularly in completing tasks that require both pointing and text entry, such as web surfing.
Efficient scatter model for simulation of ultrasound images from computed tomography data
NASA Astrophysics Data System (ADS)
D'Amato, J. P.; Lo Vercio, L.; Rubi, P.; Fernandez Vera, E.; Barbuzza, R.; Del Fresno, M.; Larrabide, I.
2015-12-01
Background and motivation: Real-time ultrasound simulation refers to the process of computationally creating fully synthetic ultrasound images instantly. Due to the high value of specialized low-cost training for healthcare professionals, there is growing interest in the use of this technology and in the development of high-fidelity systems that simulate the acquisition of echographic images. The objective is to create an efficient and reproducible simulator that can run on either notebooks or desktops using low-cost devices. Materials and methods: We present an interactive ultrasound simulator based on CT data. This simulator is based on ray-casting and provides real-time interaction capabilities. The simulation of scattering that is coherent with the transducer position in real time is also introduced. Such noise is produced using a simplified model of multiplicative noise and convolution with point spread functions (PSF) tailored for this purpose. Results: The generation of scattering maps was revised to improve its computational efficiency. This allowed a more efficient simulation of coherent scattering in the synthetic echographic images while providing highly realistic results. We describe some quality and performance metrics to validate these results, where a performance of up to 55 fps was achieved. Conclusion: The proposed technique for real-time scattering modeling provides realistic yet computationally efficient scatter distributions. The error between the original image and the simulated scattering image was compared for the proposed method and the state of the art, showing negligible differences in its distribution.
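The "multiplicative noise and convolution with point spread functions" model named in this abstract can be illustrated on a single scanline. Everything specific in the sketch is an assumption made for illustration: the 1D setting, the Gaussian noise and Gaussian PSF, the clamped edge handling, and the function names. The authors' tailored PSFs and their 2D scattering maps are not reproduced.

```python
import math
import random

def gaussian_psf(radius, sigma):
    """Normalized 1D Gaussian kernel with 2 * radius + 1 taps."""
    taps = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]

def convolve_same(signal, kernel):
    """Convolution returning len(signal) samples; edge samples are clamped.

    The kernel is assumed symmetric, so no flipping is needed.
    """
    r = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - r, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out

def simulate_speckle(echo, noise_std=0.3, psf_radius=3, psf_sigma=1.0, seed=0):
    """Multiply each echo sample by random gain, then blur with the PSF."""
    rng = random.Random(seed)
    noisy = [v * max(0.0, 1.0 + rng.gauss(0.0, noise_std)) for v in echo]
    return convolve_same(noisy, gaussian_psf(psf_radius, psf_sigma))

scanline = [0.0] * 10 + [1.0] * 20 + [0.2] * 10
speckled = simulate_speckle(scanline)
```

Because the noise is multiplicative, regions with no echo stay dark while bright regions receive proportionally stronger texture, and the PSF then correlates neighbouring samples.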
Efficient Open Source Lidar for Desktop Users
NASA Astrophysics Data System (ADS)
Flanagan, Jacob P.
Lidar (Light Detection and Ranging) is a remote sensing technology that utilizes a device similar to a rangefinder to determine a distance to a target. A laser pulse is shot at an object and the time it takes for the pulse to return is measured. The distance to the object is easily calculated using the speed of light. For lidar, this laser is moved (primarily in a rotational movement usually accompanied by a translational movement) and records the distances to objects several thousands of times per second. From this, a 3 dimensional structure can be procured in the form of a point cloud. A point cloud is a collection of 3 dimensional points with at least an x, a y and a z attribute. These 3 attributes represent the position of a single point in 3 dimensional space. Other attributes can be associated with the points that include properties such as the intensity of the return pulse, the color of the target or even the time the point was recorded. Another very useful, post processed attribute is point classification where a point is associated with the type of object the point represents (e.g., ground). Lidar has gained popularity and advancements in the technology have made its collection easier and cheaper, creating larger and denser datasets. Handling this data more efficiently has become a necessity; the processing, visualizing or even simply loading of lidar can be computationally intensive due to its very large size. Standard remote sensing and geographical information systems (GIS) software (ENVI, ArcGIS, etc.) was not originally built for optimized point cloud processing; its implementation is an afterthought and therefore inefficient. Newer, more optimized software for point cloud processing (QTModeler, TopoDOT, etc.) usually lacks more advanced processing tools, requires higher end computers and is very costly.
Existing open source lidar software approaches the loading and processing of lidar in an iterative fashion that requires batch coding and processing time that could take months for a standard lidar dataset. This project attempts to build software with the best approach for creating, importing and exporting, manipulating and processing lidar, especially in the environmental field. Development of this software is described in 3 sections: (1) explanation of the search methods for efficiently extracting the "area of interest" (AOI) data from disk (file space), (2) using file space (for storage), budgeting memory space (for efficient processing) and moving between the two, and (3) method development for creating lidar products (usually raster based) used in environmental modeling and analysis (e.g., hydrology feature extraction, geomorphological studies, ecology modeling, etc.).
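The time-of-flight relation described in the abstract above can be illustrated with a minimal Python sketch. The function name `tof_distance` is hypothetical and not part of the software described; the only assumption is that the pulse travels to the target and back at the vacuum speed of light, so the one-way range is half the round-trip path.

```python
# Speed of light in vacuum (exact SI value), metres per second.
C = 299_792_458.0


def tof_distance(round_trip_s: float) -> float:
    """One-way range in metres from a round-trip pulse time in seconds."""
    # The pulse covers the sensor-target distance twice, hence the factor 1/2.
    return C * round_trip_s / 2.0


if __name__ == "__main__":
    # A pulse that returns after one microsecond corresponds to roughly 150 m.
    print(tof_distance(1e-6))
```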
NASA Astrophysics Data System (ADS)
Xu, Jincheng; Liu, Wei; Wang, Jin; Liu, Linong; Zhang, Jianfeng
2018-02-01
De-absorption pre-stack time migration (QPSTM) compensates for the absorption and dispersion of seismic waves by introducing an effective Q parameter, thereby making it an effective tool for 3D, high-resolution imaging of seismic data. Although the optimal aperture obtained via stationary-phase migration reduces the computational cost of 3D QPSTM and yields 3D stationary-phase QPSTM, the associated computational efficiency is still the main problem in the processing of 3D, high-resolution images for real large-scale seismic data. In the current paper, we proposed a division method for large-scale, 3D seismic data to optimize the performance of stationary-phase QPSTM on clusters of graphics processing units (GPU). Then, we designed an imaging point parallel strategy to achieve an optimal parallel computing performance. Afterward, we adopted an asynchronous double buffering scheme for multi-stream to perform the GPU/CPU parallel computing. Moreover, several key optimization strategies of computation and storage based on the compute unified device architecture (CUDA) were adopted to accelerate the 3D stationary-phase QPSTM algorithm. Compared with the initial GPU code, the implementation of the key optimization steps, including thread optimization, shared memory optimization, register optimization and special function units (SFU), greatly improved the efficiency. A numerical example employing real large-scale, 3D seismic data showed that our scheme is nearly 80 times faster than the CPU-QPSTM algorithm. Our GPU/CPU heterogeneous parallel computing framework significantly reduces the computational cost and facilitates 3D high-resolution imaging for large-scale seismic data.
NASA Technical Reports Server (NTRS)
Ramins, P.; Force, D. A.; Palmer, R. W.; Dayton, J. A., Jr.; Kosmahl, H. G.
1986-01-01
A computational procedure for the design of TWT-refocuser-MDC systems was used to design a short 'dynamic' refocusing system and highly efficient four-stage depressed collector for a 200-W 8-18-GHz TWT. The computations were carried out with advanced multidimensional computer programs which model the electron beam as a series of disks of charge and follow their trajectories from the RF input of the TWT, through the slow-wave structure and refocusing section, to their points of impact in the depressed collector. Secondary emission losses in the MDC were treated semiquantitatively by injecting a representative beam of secondary electrons into the MDC analysis at the point of impact of each primary beam. A comparison of computed and measured TWT and MDC performance showed very good agreement. The electrodes of the MDC were fabricated from a particular form of isotropic graphite that was selected for its low secondary electron yield, thermal expansion characteristics, ease of machinability and vacuum properties. This MDC was tested at CW for more than 1000 h with negligible degradation in TWT and MDC performances.
Robust Vision-Based Pose Estimation Algorithm for a UAV with Known Gravity Vector
NASA Astrophysics Data System (ADS)
Kniaz, V. V.
2016-06-01
Accurate estimation of camera external orientation with respect to a known object is one of the central problems in photogrammetry and computer vision. In recent years this problem has gained increasing attention in the field of UAV autonomous flight. Such an application requires real-time performance and robustness of the external orientation estimation algorithm. The accuracy of the solution is strongly dependent on the number of reference points visible on the given image. The problem only has an analytical solution if 3 or more reference points are visible. However, in limited visibility conditions it is often necessary to perform external orientation with only 2 visible reference points. In such a case the solution can be found if the gravity vector direction in the camera coordinate system is known. A number of algorithms for external orientation estimation for the case of 2 known reference points and a gravity vector have been developed to date. Most of these algorithms provide an analytical solution in the form of a polynomial equation that is subject to large errors in the case of complex reference point configurations. This paper is focused on the development of a new computationally effective and robust algorithm for external orientation based on positions of 2 known reference points and a gravity vector. The algorithm implementation for guidance of a Parrot AR.Drone 2.0 micro-UAV is discussed. The experimental evaluation of the algorithm proved its computational efficiency and robustness against errors in reference point positions and complex configurations.
Comparison of Nonequilibrium Solution Algorithms Applied to Chemically Stiff Hypersonic Flows
NASA Technical Reports Server (NTRS)
Palmer, Grant; Venkatapathy, Ethiraj
1995-01-01
Three solution algorithms, explicit under-relaxation, point implicit, and lower-upper symmetric Gauss-Seidel, are used to compute nonequilibrium flow around the Apollo 4 return capsule at the 62-km altitude point in its descent trajectory. By varying the Mach number, the efficiency and robustness of the solution algorithms were tested for different levels of chemical stiffness. The performance of the solution algorithms degraded as the Mach number and stiffness of the flow increased. At Mach 15 and 30, the lower-upper symmetric Gauss-Seidel method produces an eight order of magnitude drop in the energy residual in one-third to one-half the Cray C-90 computer time as compared to the point implicit and explicit under-relaxation methods. The explicit under-relaxation algorithm experienced convergence difficulties at Mach 30 and above. At Mach 40 the performance of the lower-upper symmetric Gauss-Seidel algorithm deteriorates to the point that it is outperformed by the point implicit method. The effects of the viscous terms are investigated. Grid dependency questions are explored.
Finite element computation on nearest neighbor connected machines
NASA Technical Reports Server (NTRS)
Mcaulay, A. D.
1984-01-01
Research aimed at faster, more cost effective parallel machines and algorithms for improving designer productivity with finite element computations is discussed. A set of 8 boards, containing 4 nearest neighbor connected arrays of commercially available floating point chips and substantial memory, are inserted into a commercially available machine. One-tenth Mflop (64 bit operation) processors provide an 89% efficiency when solving the equations arising in a finite element problem for a single variable regular grid of size 40 by 40 by 40. This is approximately 15 to 20 times faster than a much more expensive machine such as a VAX 11/780 used in double precision. The efficiency falls off as faster or more processors are envisaged because communication times become dominant. A novel successive overrelaxation algorithm which uses cyclic reduction in order to permit data transfer and computation to overlap in time is proposed.
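For readers unfamiliar with successive overrelaxation, the following is a minimal sketch of the textbook serial SOR iteration in Python. It is not the cyclic-reduction variant proposed in the abstract above, and the function name `sor_solve`, the relaxation factor and the tolerance are illustrative assumptions.

```python
def sor_solve(a, b, omega=1.5, tol=1e-10, max_iter=10_000):
    """Solve a x = b with plain successive overrelaxation (SOR).

    `a` is a square matrix given as a list of rows and `b` the right-hand side.
    Convergence is expected for symmetric positive definite `a` and 0 < omega < 2.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(n):
            # Off-diagonal contribution, using already-updated entries of x.
            sigma = sum(a[i][j] * x[j] for j in range(n) if j != i)
            # Blend the old value with the Gauss-Seidel value using omega.
            new = (1.0 - omega) * x[i] + omega * (b[i] - sigma) / a[i][i]
            max_change = max(max_change, abs(new - x[i]))
            x[i] = new
        if max_change < tol:
            break
    return x
```

With `omega = 1` the loop reduces to Gauss-Seidel; values between 1 and 2 over-relax each update, which is what typically speeds up convergence on finite element and finite difference systems.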
Precise and fast spatial-frequency analysis using the iterative local Fourier transform.
Lee, Sukmock; Choi, Heejoo; Kim, Dae Wook
2016-09-19
The use of the discrete Fourier transform has decreased since the introduction of the fast Fourier transform (fFT), which is a numerically efficient computing process. This paper presents the iterative local Fourier transform (ilFT), a set of new processing algorithms that iteratively apply the discrete Fourier transform within a local and optimal frequency domain. The new technique achieves 210 times higher frequency resolution than the fFT within a comparable computation time. The method's superb computing efficiency, high resolution, spectrum zoom-in capability, and overall performance are evaluated and compared to other advanced high-resolution Fourier transform techniques, such as the fFT combined with several fitting methods. The effectiveness of the ilFT is demonstrated through the data analysis of a set of Talbot self-images (1280 × 1024 pixels) obtained with an experimental setup using grating in a diverging beam produced by a coherent point source.
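The general idea of evaluating the discrete Fourier transform on a progressively narrower local frequency grid can be illustrated with a toy Python sketch. This is not the authors' ilFT implementation; the helper names `dft_at` and `refine_peak`, the grid size and the iteration count are all assumptions made for illustration.

```python
import cmath


def dft_at(x, f):
    """DFT of sequence x evaluated at a (possibly fractional) frequency f, in bins."""
    n_samples = len(x)
    return sum(v * cmath.exp(-2j * cmath.pi * f * n / n_samples)
               for n, v in enumerate(x))


def refine_peak(x, center, half_width=1.0, points=11, iterations=8):
    """Locate a spectral peak near `center` by repeated local DFT evaluation."""
    for _ in range(iterations):
        step = 2.0 * half_width / (points - 1)
        # Local frequency grid spanning [center - half_width, center + half_width].
        grid = [center - half_width + k * step for k in range(points)]
        # Keep the grid frequency with the largest spectral magnitude.
        center = max(grid, key=lambda f: abs(dft_at(x, f)))
        # Shrink the search window to one grid step around the new center.
        half_width = step
    return center
```

Starting from the strongest integer bin of an ordinary transform, each pass narrows the window by a factor of five in this sketch, so a handful of passes resolves the peak far below the one-bin spacing of the standard transform.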
Parametric Human Body Reconstruction Based on Sparse Key Points.
Cheng, Ke-Li; Tong, Ruo-Feng; Tang, Min; Qian, Jing-Ye; Sarkis, Michel
2016-11-01
We propose an automatic parametric human body reconstruction algorithm which can efficiently construct a model using a single Kinect sensor. A user needs to stand still in front of the sensor for a couple of seconds to measure the range data. The user's body shape and pose will then be automatically constructed in several seconds. Traditional methods optimize dense correspondences between range data and meshes. In contrast, our proposed scheme relies on sparse key points for the reconstruction. It employs regression to find the corresponding key points between the scanned range data and some annotated training data. We design two kinds of feature descriptors as well as corresponding regression stages to make the regression robust and accurate. Our scheme follows with dense refinement where a pre-factorization method is applied to improve the computational efficiency. Compared with other methods, our scheme achieves similar reconstruction accuracy but significantly reduces runtime.
Face pose tracking using the four-point algorithm
NASA Astrophysics Data System (ADS)
Fung, Ho Yin; Wong, Kin Hong; Yu, Ying Kin; Tsui, Kwan Pang; Kam, Ho Chuen
2017-06-01
In this paper, we have developed an algorithm to track the pose of a human face robustly and efficiently. Face pose estimation is very useful in many applications such as building virtual reality systems and creating an alternative input method for the disabled. Firstly, we have modified a face detection toolbox called DLib for the detection of a face in front of a camera. The detected face features are passed to a pose estimation method, known as the four-point algorithm, for pose computation. The theory applied and the technical problems encountered during system development are discussed in the paper. It is demonstrated that the system is able to track the pose of a face in real time using a consumer grade laptop computer.
NASA Astrophysics Data System (ADS)
Meng, Zeng; Yang, Dixiong; Zhou, Huanlin; Yu, Bo
2018-05-01
The first order reliability method has been extensively adopted for reliability-based design optimization (RBDO), but it shows inaccuracy in calculating the failure probability with highly nonlinear performance functions. Thus, the second order reliability method is required to evaluate the reliability accurately. However, its application for RBDO is quite challenging owing to the expensive computational cost incurred by the repeated reliability evaluation and Hessian calculation of probabilistic constraints. In this article, a new improved stability transformation method is proposed to search the most probable point efficiently, and the Hessian matrix is calculated by the symmetric rank-one update. The computational capability of the proposed method is illustrated and compared to the existing RBDO approaches through three mathematical and two engineering examples. The comparison results indicate that the proposed method is very efficient and accurate, providing an alternative tool for RBDO of engineering structures.
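The symmetric rank-one (SR1) update mentioned in the abstract above has a standard textbook form, sketched here in Python. How the authors embed it in their RBDO loop is not stated in the abstract; the function name `sr1_update` and the skip threshold `eps` are illustrative assumptions.

```python
def sr1_update(B, s, y, eps=1e-8):
    """Return the SR1-updated Hessian approximation as a new list of rows.

    B is the current symmetric approximation, s the step between two points,
    and y the corresponding change in the gradient.
    """
    n = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    # Residual of the secant condition B s = y.
    r = [y[i] - Bs[i] for i in range(n)]
    denom = sum(r[i] * s[i] for i in range(n))
    r_norm = sum(v * v for v in r) ** 0.5
    s_norm = sum(v * v for v in s) ** 0.5
    # Common safeguard: skip the update when the denominator is too small.
    if abs(denom) <= eps * r_norm * s_norm:
        return [row[:] for row in B]
    # Rank-one correction r r^T / (r^T s) keeps the matrix symmetric.
    return [[B[i][j] + r[i] * r[j] / denom for j in range(n)] for i in range(n)]
```

After a successful update the new matrix satisfies the secant condition exactly, so curvature information is accumulated from gradient differences without forming second derivatives directly.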
An efficient HZETRN (a galactic cosmic ray transport code)
NASA Technical Reports Server (NTRS)
Shinn, Judy L.; Wilson, John W.
1992-01-01
An accurate and efficient engineering code for analyzing the shielding requirements against the high-energy galactic heavy ions is needed. The HZETRN is a deterministic code developed at Langley Research Center that is constantly under improvement both in physics and numerical computation and is targeted for such use. One problem area connected with the space-marching technique used in this code is the propagation of the local truncation error. By improving the numerical algorithms for interpolation, integration, and grid distribution formula, the efficiency of the code is increased by a factor of eight as the number of energy grid points is reduced. The numerical accuracy of better than 2 percent for a shield thickness of 150 g/cm^2 is found when a 45 point energy grid is used. The propagating step size, which is related to the perturbation theory, is also reevaluated.
Efficient Boundary Extraction of BSP Solids Based on Clipping Operations.
Wang, Charlie C L; Manocha, Dinesh
2013-01-01
We present an efficient algorithm to extract the manifold surface that approximates the boundary of a solid represented by a Binary Space Partition (BSP) tree. Our polygonization algorithm repeatedly performs clipping operations on volumetric cells that correspond to a spatial convex partition and computes the boundary by traversing the connected cells. We use point-based representations along with finite-precision arithmetic to improve the efficiency and generate the B-rep approximation of a BSP solid. The core of our polygonization method is a novel clipping algorithm that uses a set of logical operations to make it resistant to degeneracies resulting from limited precision of floating-point arithmetic. The overall BSP to B-rep conversion algorithm can accurately generate boundaries with sharp and small features, and is faster than prior methods. At the end of this paper, we use this algorithm for a few geometric processing applications including Boolean operations, model repair, and mesh reconstruction.
A GENERAL ALGORITHM FOR THE CONSTRUCTION OF CONTOUR PLOTS
NASA Technical Reports Server (NTRS)
Johnson, W.
1994-01-01
The graphical presentation of experimentally or theoretically generated data sets frequently involves the construction of contour plots. A general computer algorithm has been developed for the construction of contour plots. The algorithm provides for efficient and accurate contouring with a modular approach which allows flexibility in modifying the algorithm for special applications. The algorithm accepts as input data values at a set of points irregularly distributed over a plane. The algorithm is based on an interpolation scheme in which the points in the plane are connected by straight line segments to form a set of triangles. In general, the data is smoothed using a least-squares-error fit of the data to a bivariate polynomial. To construct the contours, interpolation along the edges of the triangles is performed, using the bivariate polynomial if data smoothing was performed. Once the contour points have been located, the contour may be drawn. This program is written in FORTRAN IV for batch execution and has been implemented on an IBM 360 series computer with a central memory requirement of approximately 100K of 8-bit bytes. This computer algorithm was developed in 1981.
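The edge-interpolation step described in the abstract above can be sketched in Python for the simplest case. The sketch assumes plain linear interpolation between the two vertex values of one triangle edge (the unsmoothed case); the function name `edge_crossing` is hypothetical and the original program is FORTRAN IV, not Python.

```python
def edge_crossing(p0, v0, p1, v1, level):
    """Point on segment p0-p1 where the linearly interpolated value equals `level`.

    p0 and p1 are (x, y) tuples, v0 and v1 the data values at those vertices.
    Returns None when the contour level does not cross this edge.
    """
    # No crossing if both vertex values lie strictly on the same side of the
    # level, or if the edge is flat (no unique crossing point).
    if (v0 - level) * (v1 - level) > 0 or v0 == v1:
        return None
    # Fraction of the way from p0 to p1 at which the level is reached.
    t = (level - v0) / (v1 - v0)
    return (p0[0] + t * (p1[0] - p0[0]), p0[1] + t * (p1[1] - p0[1]))
```

Applying this to the three edges of every triangle yields the contour points, which are then joined to draw the contour line for that level.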
Managing Information On Technical Requirements
NASA Technical Reports Server (NTRS)
Mauldin, Lemuel E., III; Hammond, Dana P.
1993-01-01
Technical Requirements Analysis and Control Systems/Initial Operating Capability (TRACS/IOC) computer program provides supplemental software tools for analysis, control, and interchange of project requirements so qualified project members have access to pertinent project information, even if in different locations. Enables users to analyze and control requirements, serves as focal point for project requirements, and integrates system supporting efficient and consistent operations. TRACS/IOC is HyperCard stack for use on Macintosh computers running HyperCard 1.2 or later and Oracle 1.2 or later.
2011-02-01
expected, with increased loading (or reduced axial -chord to pitch ratio for a given turning). In addition to minimizing design-point loss due to...5 Figure 2. Computed loading diagrams and Reynolds lapse rates for aft- (L1A) and mid- loaded (L1M) LPT blading (Clark et al., 2009...reference 22 in Welch, 2010) accomplishing the same 95° flow turning at high aerodynamic loading (Z = 1.34). .................8 Figure 3. Computed 2-D
Cubic spline numerical solution of an ablation problem with convective backface cooling
NASA Astrophysics Data System (ADS)
Lin, S.; Wang, P.; Kahawita, R.
1984-08-01
An implicit numerical technique using cubic splines is presented for solving an ablation problem on a thin wall with convective cooling. A non-uniform computational mesh with 6 grid points has been used for the numerical integration. The method has been found to be computationally efficient, providing, for the case under consideration, an overall error of about 1 percent. The results obtained indicate that the convective cooling is an important factor in reducing the ablation thickness.
Program finds centrifugal compressor operating point
DOE Office of Scientific and Technical Information (OSTI.GOV)
Campos, M.C.M.M.; Rodrigues, P.S.B.
1990-09-01
This article presents the Scop program, a computational procedure developed using Fortran 77 language to find the operating point of centrifugal compressors starting from performance curves. Characteristics or performance curves traditionally are employed by manufacturers to inform users about turbocompressor behavior. Usually, these curves have polytropic head, H, and corresponding polytropic efficiency, η, plus rotation speed, N, and inlet volumetric flowrate, Q, as parameters. Two families of curves can be identified in this figure. One provides head-flow relationships for several speeds and the other refers to isoefficiency curves.
Pricing and simulation for real estate index options: Radial basis point interpolation
NASA Astrophysics Data System (ADS)
Gong, Pu; Zou, Dong; Wang, Jiayue
2018-06-01
This study employs the meshfree radial basis point interpolation (RBPI) for pricing real estate derivatives contingent on real estate index. This method combines radial and polynomial basis functions, which can guarantee the interpolation scheme with Kronecker property and effectively improve accuracy. An exponential change of variables, a mesh refinement algorithm and the Richardson extrapolation are employed in this study to implement the RBPI. Numerical results are presented to examine the computational efficiency and accuracy of our method.
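Richardson extrapolation, one of the ingredients named in the abstract above, can be illustrated with a small Python sketch. The example applies it to a central-difference derivative rather than to option pricing, and the function names `central_diff` and `richardson` are illustrative assumptions; the abstract does not say how the authors combine it with RBPI.

```python
def central_diff(f, x, h):
    """Second-order central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def richardson(coarse, fine, order):
    """Combine approximations computed with steps h and h/2.

    `order` is the leading error order p of the underlying method; the
    combination cancels the h**p error term.
    """
    factor = 2.0 ** order
    return (factor * fine - coarse) / (factor - 1.0)


if __name__ == "__main__":
    def cube(t):
        return t ** 3

    coarse = central_diff(cube, 2.0, 0.5)   # contains an h**2 error term
    fine = central_diff(cube, 2.0, 0.25)    # same method, half the step
    print(coarse, fine, richardson(coarse, fine, order=2))
```

For this cubic the central difference has a pure h² error, so the extrapolated value recovers the exact derivative 3x² = 12 up to rounding.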
Efficient biprediction decision scheme for fast high efficiency video coding encoding
NASA Astrophysics Data System (ADS)
Park, Sang-hyo; Lee, Seung-ho; Jang, Euee S.; Jun, Dongsan; Kang, Jung-Won
2016-11-01
An efficient biprediction decision scheme of high efficiency video coding (HEVC) is proposed for fast-encoding applications. For low-delay video applications, bidirectional prediction can be used to increase compression performance efficiently with previous reference frames. However, at the same time, the computational complexity of the HEVC encoder is significantly increased due to the additional biprediction search. Although some research has attempted to reduce this complexity, whether the prediction is strongly related to both motion complexity and prediction modes in a coding unit has not yet been investigated. A method that avoids most compression-inefficient search points is proposed so that the computational complexity of the motion estimation process can be dramatically decreased. To determine if biprediction is critical, the proposed method exploits the stochastic correlation of the context of prediction units (PUs): the direction of a PU and the accuracy of a motion vector. Through experimental results, the proposed method showed that the time complexity of biprediction can be reduced to 30% on average, outperforming existing methods in view of encoding time, number of function calls, and memory access.
NASA Astrophysics Data System (ADS)
Havu, Vile; Blum, Volker; Scheffler, Matthias
2007-03-01
Numeric atom-centered local orbitals (NAO) are efficient basis sets for all-electron electronic structure theory. The locality of NAOs can be exploited to render (in principle) all operations of the self-consistency cycle O(N). This is straightforward for 3D integrals using domain decomposition into spatially close subsets of integration points, enabling critical computational savings that are effective from ~tens of atoms (no significant overhead for smaller systems) and make large systems (100s of atoms) computationally feasible. Using a new all-electron NAO-based code [1], we investigate the quantitative impact of exploiting this locality on two distinct classes of systems: large light-element molecules [Alanine-based polypeptide chains (Ala)_n], and compact transition metal clusters. Strict NAO locality is achieved by imposing a cutoff potential with an onset radius r_c, and exploited by appropriately shaped integration domains (subsets of integration points). Conventional tight r_c ≤ 3 Å have no measurable accuracy impact in (Ala)_n, but introduce inaccuracies of 20-30 meV/atom in Cu_n. The domain shape impacts the computational effort by only 10-20% for reasonable r_c. [1] V. Blum, R. Gehrke, P. Havu, V. Havu, M. Scheffler, The FHI Ab Initio Molecular Simulations (aims) Project, Fritz-Haber-Institut, Berlin (2006).
Liu, Derek; Sloboda, Ron S
2014-05-01
Boyer and Mok proposed a fast calculation method employing the Fourier transform (FT), for which calculation time is independent of the number of seeds but seed placement is restricted to calculation grid points. Here an interpolation method is described enabling unrestricted seed placement while preserving the computational efficiency of the original method. The Iodine-125 seed dose kernel was sampled and selected values were modified to optimize interpolation accuracy for clinically relevant doses. For each seed, the kernel was shifted to the nearest grid point via convolution with a unit impulse, implemented in the Fourier domain. The remaining fractional shift was performed using a piecewise third-order Lagrange filter. Implementation of the interpolation method greatly improved FT-based dose calculation accuracy. The dose distribution was accurate to within 2% beyond 3 mm from each seed. Isodose contours were indistinguishable from explicit TG-43 calculation. Dose-volume metric errors were negligible. Computation time for the FT interpolation method was essentially the same as Boyer's method. A FT interpolation method for permanent prostate brachytherapy TG-43 dose calculation was developed which expands upon Boyer's original method and enables unrestricted seed placement. The proposed method substantially improves the clinically relevant dose accuracy with negligible additional computation cost, preserving the efficiency of the original method.
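The fractional-shift step described in the abstract above relies on a third-order Lagrange filter. The Python sketch below computes the standard four-point cubic Lagrange interpolation weights for a fractional offset; the function name `lagrange_weights` and the node placement at -1, 0, 1, 2 are assumptions, since the abstract does not give the authors' exact filter layout.

```python
def lagrange_weights(d, nodes=(-1, 0, 1, 2)):
    """Lagrange interpolation weights for evaluating at offset d.

    With the default four nodes the result is a third-order (cubic) filter;
    d is the fractional position measured in grid spacings from node 0.
    """
    weights = []
    for k in nodes:
        w = 1.0
        for m in nodes:
            if m != k:
                # Basis polynomial of node k evaluated at d.
                w *= (d - m) / (k - m)
        weights.append(w)
    return weights


def interpolate(samples, d, nodes=(-1, 0, 1, 2)):
    """Weighted sum of the samples taken at `nodes`, evaluated at offset d."""
    return sum(w * s for w, s in zip(lagrange_weights(d, nodes), samples))
```

The weights always sum to one and reproduce any cubic polynomial exactly, which is why a filter of this kind can move a smooth kernel by a fraction of a grid cell with little distortion.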
NASA Astrophysics Data System (ADS)
Christensen, David B.; Basaeri, Hamid; Roundy, Shad
2017-12-01
In acoustic power transfer systems, a receiver is displaced from a transmitter by an axial depth, a lateral offset (alignment), and a rotation angle (orientation). In systems where the receiver’s position is not fixed, such as a receiver implanted in biological tissue, slight variations in depth, orientation, or alignment can cause significant variations in the received voltage and power. To address this concern, this paper presents a computationally efficient technique to model the effects of depth, orientation, and alignment via ray tracing (DOART) on received voltage and power in acoustic power transfer systems. DOART combines transducer circuit equivalent models, a modified version of Huygens principle, and ray tracing to simulate pressure wave propagation and reflection between a transmitter and a receiver in a homogeneous medium. A reflected grid method is introduced to calculate propagation distances, reflection coefficients, and initial vectors between a point on the transmitter and a point on the receiver for an arbitrary number of reflections. DOART convergence and simulation time per data point is discussed as a function of the number of reflections and elements chosen. Finally, experimental data is compared to DOART simulation data in terms of magnitude and shape of the received voltage signal.
Efficient visibility encoding for dynamic illumination in direct volume rendering.
Kronander, Joel; Jönsson, Daniel; Löw, Joakim; Ljung, Patric; Ynnerman, Anders; Unger, Jonas
2012-03-01
We present an algorithm that enables real-time dynamic shading in direct volume rendering using general lighting, including directional lights, point lights, and environment maps. Real-time performance is achieved by encoding local and global volumetric visibility using spherical harmonic (SH) basis functions stored in an efficient multiresolution grid over the extent of the volume. Our method enables high-frequency shadows in the spatial domain, but is limited to a low-frequency approximation of visibility and illumination in the angular domain. In a first pass, level of detail (LOD) selection in the grid is based on the current transfer function setting. This enables rapid online computation and SH projection of the local spherical distribution of visibility information. Using a piecewise integration of the SH coefficients over the local regions, the global visibility within the volume is then computed. By representing the light sources using their SH projections, the integral over lighting, visibility, and isotropic phase functions can be efficiently computed during rendering. The utility of our method is demonstrated in several examples showing the generality and interactive performance of the approach.
Paraskevopoulou, Sivylla E; Barsakcioglu, Deren Y; Saberi, Mohammed R; Eftekhar, Amir; Constandinou, Timothy G
2013-04-30
Next generation neural interfaces aspire to achieve real-time multi-channel systems by integrating spike sorting on chip to overcome limitations in communication channel capacity. The feasibility of this approach relies on developing highly efficient algorithms for feature extraction and clustering with the potential of low-power hardware implementation. We are proposing a feature extraction method, not requiring any calibration, based on first and second derivative features of the spike waveform. The accuracy and computational complexity of the proposed method are quantified and compared against commonly used feature extraction methods, through simulation across four datasets (with different single units) at multiple noise levels (ranging from 5 to 20% of the signal amplitude). The average classification error is shown to be below 7% with a computational complexity of 2N-3, where N is the number of sample points of each spike. Overall, this method presents a good trade-off between accuracy and computational complexity and is thus particularly well-suited for hardware-efficient implementation.
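A minimal Python sketch of first and second discrete derivatives of a spike waveform is given below. It assumes simple sample-to-sample differences, which is one plausible reading of the abstract above; the specific features the authors extract from these derivatives are not stated there, and the function name `derivative_features` is hypothetical.

```python
def derivative_features(spike):
    """First and second discrete derivatives of a sampled spike waveform.

    For N samples this uses N-1 subtractions for the first difference and
    N-2 for the second, i.e. 2N-3 subtractions in total.
    """
    # First difference: slope between consecutive samples.
    first = [spike[i + 1] - spike[i] for i in range(len(spike) - 1)]
    # Second difference: change in slope between consecutive intervals.
    second = [first[i + 1] - first[i] for i in range(len(first) - 1)]
    return first, second
```

Under this assumption the operation count agrees with the 2N-3 figure quoted in the abstract, and the computation needs only subtractions, which is consistent with the stated aim of a hardware-efficient implementation.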
SASS: A symmetry adapted stochastic search algorithm exploiting site symmetry
NASA Astrophysics Data System (ADS)
Wheeler, Steven E.; Schleyer, Paul v. R.; Schaefer, Henry F.
2007-03-01
A simple symmetry adapted search algorithm (SASS) exploiting point group symmetry increases the efficiency of systematic explorations of complex quantum mechanical potential energy surfaces. In contrast to previously described stochastic approaches, which do not employ symmetry, candidate structures are generated within simple point groups, such as C2, Cs, and C2v. This facilitates efficient sampling of the 3N-6 dimensional configuration space and increases the speed and effectiveness of quantum chemical geometry optimizations. Pople's concept of framework groups [J. Am. Chem. Soc. 102, 4615 (1980)] is used to partition the configuration space into structures spanning all possible distributions of sets of symmetry equivalent atoms. This provides an efficient means of computing all structures of a given symmetry with minimum redundancy. This approach also is advantageous for generating initial structures for global optimizations via genetic algorithm and other stochastic global search techniques. Application of the SASS method is illustrated by locating 14 low-lying stationary points on the cc-pwCVDZ ROCCSD(T) potential energy surface of Li5H2. The global minimum structure is identified, along with many unique, nonintuitive, energetically favorable isomers.
An approach for the regularization of a power flow solution around the maximum loading point
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kataoka, Y.
1992-08-01
In the conventional power flow solution, the boundary conditions are directly specified by active power and reactive power at each node, so that the singular point coincides with the maximum loading point. For this reason, the computations are often disturbed by ill-conditioning. This paper proposes a new method for obtaining wide-range regularity by modifying the conventional power flow solution method, thereby eliminating the singular point or shifting it to the region with a voltage lower than that of the maximum loading point. The continuous execution of V-P curves including the maximum loading point is then realized. The efficiency and effectiveness of the method are tested in a practical 598-node system in comparison with the conventional method.
Banerjee, Jayeeta; Bhattacharyya, Moushum
2011-12-01
There is a rapid shifting of media: from printed paper to computer screens. This transition is modifying the process of how we read and understand text. The efficiency of reading is dependent on how ergonomically the visual information is presented. Font types and size characteristics have been shown to affect reading. A detailed investigation of the effect of the font type and size on reading on computer screens has been carried out by using subjective, objective and physiological evaluation methods on young adults. A group of young participants volunteered for this study. Two types of fonts were used: Serif fonts (Times New Roman, Georgia, Courier New) and Sans serif fonts (Verdana, Arial, Tahoma). All fonts were presented in 10, 12 and 14 point sizes. This study used a 6 X 3 (font type X size) design matrix. Participants read 18 passages of approximately the same length and reading level on a computer monitor. Reading time, ranking and overall mental workload were measured. Eye movements were recorded by a binocular eye movement recorder. Reading time was minimum for Courier New 14 point. The participants' ranking was highest and mental workload was least for Verdana 14 point. The pupil diameter, fixation duration and gaze duration were least for Courier New 14 point. The present study recommends using 14 point sized fonts for reading on a computer screen. Courier New is recommended for fast reading, while Verdana is recommended for on-screen presentation. The outcome of this study will serve as a guideline for PC users, software developers, web page designers and the computer industry as a whole.
GW/Bethe-Salpeter calculations for charged and model systems from real-space DFT
NASA Astrophysics Data System (ADS)
Strubbe, David A.
GW and Bethe-Salpeter (GW/BSE) calculations use mean-field input from density-functional theory (DFT) calculations to compute excited states of a condensed-matter system. Many parts of a GW/BSE calculation are efficiently performed in a plane-wave basis, and extensive effort has gone into optimizing and parallelizing plane-wave GW/BSE codes for large-scale computations. Most straightforwardly, plane-wave DFT can be used as a starting point, but real-space DFT is also an attractive starting point: it is systematically convergeable like plane waves, can take advantage of efficient domain parallelization for large systems, and is well suited physically for finite and especially charged systems. The flexibility of a real-space grid also allows convenient calculations on non-atomic model systems. I will discuss the interfacing of a real-space (TD)DFT code (Octopus, www.tddft.org/programs/octopus) with a plane-wave GW/BSE code (BerkeleyGW, www.berkeleygw.org), consider performance issues and accuracy, and present some applications to simple and paradigmatic systems that illuminate fundamental properties of these approximations in many-body perturbation theory.
A straightforward method to compute average stochastic oscillations from data samples.
Júlvez, Jorge
2015-10-19
Many biological systems exhibit sustained stochastic oscillations in their steady state. Assessing these oscillations is usually a challenging task due to the potential variability of the amplitude and frequency of the oscillations over time. As a result of this variability, when several stochastic replications are averaged, the oscillations are flattened and can be overlooked. This can easily lead to the erroneous conclusion that the system reaches a constant steady state. This paper proposes a straightforward method to detect and assess stochastic oscillations. The basis of the method is the use of polar coordinates for systems with two species, and cylindrical coordinates for systems with more than two species. By slightly modifying these coordinate systems, it is possible to compute the total angular distance run by the system and the average Euclidean distance to a reference point. This allows us to compute confidence intervals, both for the average angular speed and for the distance to a reference point, from a set of replications. The use of polar (or cylindrical) coordinates provides a new perspective of the system dynamics. The mean trajectory that can be obtained by averaging the usual cartesian coordinates of the samples informs about the trajectory of the center of mass of the replications. In contrast to such a mean cartesian trajectory, the mean polar trajectory can be used to compute the average circular motion of those replications, and therefore can yield evidence about sustained steady state oscillations. Both the coordinate transformation and the computation of confidence intervals can be carried out efficiently. This results in an efficient method to evaluate stochastic oscillations.
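The contrast between the mean cartesian trajectory and the polar view can be sketched in a few lines. The following is a minimal illustration for a two-species system, not the paper's implementation: the function name `polar_summary`, the reference point, and the two synthetic replications are all assumptions made for the example, and the "slight modification" of the coordinates is interpreted here as accumulating angle increments so that full turns are not lost.

```python
import math

def polar_summary(xs, ys, ref, dt):
    """Return (total_angle, mean_angular_speed, mean_distance) for one replication.

    xs, ys: sampled values of the two species; ref: reference point (hypothetical);
    dt: sampling interval.
    """
    rx, ry = ref
    angles = [math.atan2(y - ry, x - rx) for x, y in zip(xs, ys)]
    total = 0.0
    for a0, a1 in zip(angles, angles[1:]):
        # Wrap each increment into [-pi, pi) so that completed turns accumulate.
        total += (a1 - a0 + math.pi) % (2 * math.pi) - math.pi
    duration = dt * (len(xs) - 1)
    mean_dist = sum(math.hypot(x - rx, y - ry) for x, y in zip(xs, ys)) / len(xs)
    return total, total / duration, mean_dist

# Two synthetic replications circling the reference point in antiphase.
ref = (5.0, 5.0)
dt = 0.1
times = [k * dt for k in range(201)]
ax = [5 + 2 * math.cos(t) for t in times]
ay = [5 + 2 * math.sin(t) for t in times]
bx = [5 + 2 * math.cos(t + math.pi) for t in times]
by = [5 + 2 * math.sin(t + math.pi) for t in times]

# Averaging cartesian coordinates flattens the oscillation onto the reference point.
mean_x = [(p + q) / 2 for p, q in zip(ax, bx)]
mean_y = [(p + q) / 2 for p, q in zip(ay, by)]
cartesian_radius = max(math.hypot(x - ref[0], y - ref[1]) for x, y in zip(mean_x, mean_y))

summary_a = polar_summary(ax, ay, ref, dt)
summary_b = polar_summary(bx, by, ref, dt)
print(cartesian_radius, summary_a, summary_b)
```

In this toy case the averaged cartesian trajectory sits on the reference point, while both replications report the same angular speed and distance. Confidence intervals over many replications would be computed from such per-replication summaries.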
NASA Technical Reports Server (NTRS)
Cline, M. C.
1981-01-01
A computer program, VNAP2, for calculating turbulent (as well as laminar and inviscid), steady, and unsteady flow is presented. It solves the two dimensional, time dependent, compressible Navier-Stokes equations. The turbulence is modeled with either an algebraic mixing length model, a one equation model, or the Jones-Launder two equation model. The geometry may be a single or a dual flowing stream. The interior grid points are computed using the unsplit MacCormack scheme. Two options to speed up the calculations for high Reynolds number flows are included. The boundary grid points are computed using a reference plane characteristic scheme with the viscous terms treated as source functions. An explicit artificial viscosity is included for shock computations. The fluid is assumed to be a perfect gas. The flow boundaries may be arbitrary curved solid walls, inflow/outflow boundaries, or free jet envelopes. Typical problems that can be solved concern nozzles, inlets, jet powered afterbodies, airfoils, and free jet expansions. The accuracy and efficiency of the program are shown by calculations of several inviscid and turbulent flows. The program and its use are described completely, and six sample cases and a code listing are included.
A superellipsoid-plane model for simulating foot-ground contact during human gait.
Lopes, D S; Neptune, R R; Ambrósio, J A; Silva, M T
2016-01-01
Musculoskeletal models and forward dynamics simulations of human movement often include foot-ground interactions, with the foot-ground contact forces often determined using a constitutive model that depends on material properties and contact kinematics. When using soft constraints to model the foot-ground interactions, the kinematics of the minimum distance between the foot and planar ground needs to be computed. Due to their geometric simplicity, a considerable number of studies have used point-plane elements to represent these interacting bodies, but few studies have provided comparisons between point contact elements and other geometrically based analytical solutions. The objective of this work was to develop a more general-purpose superellipsoid-plane contact model that can be used to determine the three-dimensional foot-ground contact forces. As an example application, the model was used in a forward dynamics simulation of human walking. Simulation results and execution times were compared with a point-like viscoelastic contact model. Both models produced realistic ground reaction forces and kinematics with similar computational efficiency. However, solving the equations of motion with the surface contact model was found to be more efficient (~18% faster), and on average numerically ~37% less stiff. The superellipsoid-plane elements are also more versatile than point-like elements in that they allow for volumetric contact during three-dimensional motions (e.g. rotating, rolling, and sliding). In addition, the superellipsoid-plane element is geometrically accurate and easily integrated within multibody simulation code. These advantages make the use of superellipsoid-plane contact models in musculoskeletal simulations an appealing alternative to point-like elements.
Programming the Navier-Stokes computer: An abstract machine model and a visual editor
NASA Technical Reports Server (NTRS)
Middleton, David; Crockett, Tom; Tomboulian, Sherry
1988-01-01
The Navier-Stokes computer is a parallel computer designed to solve Computational Fluid Dynamics problems. Each processor contains several floating point units which can be configured under program control to implement a vector pipeline with several inputs and outputs. Since the development of an effective compiler for this computer appears to be very difficult, machine level programming seems necessary and support tools for this process have been studied. These support tools are organized into a graphical program editor. A programming process is described by which appropriate computations may be efficiently implemented on the Navier-Stokes computer. The graphical editor would support this programming process, verifying various programmer choices for correctness and deducing values such as pipeline delays and network configurations. Step by step details are provided and demonstrated with two example programs.
MUTILS - a set of efficient modeling tools for multi-core CPUs implemented in MEX
NASA Astrophysics Data System (ADS)
Krotkiewski, Marcin; Dabrowski, Marcin
2013-04-01
The need for computational performance is common in scientific applications, and in particular in numerical simulations, where high resolution models require efficient processing of large amounts of data. Especially in the context of geological problems the need to increase the model resolution to resolve physical and geometrical complexities seems to have no limits. Alas, the performance of new generations of CPUs does not improve any longer by simply increasing clock speeds. Current industrial trends are to increase the number of computational cores. As a result, parallel implementations are required in order to fully utilize the potential of new processors, and to study more complex models. We target simulations on small to medium scale shared memory computers: laptops and desktop PCs with ~8 CPU cores and up to tens of GB of memory to high-end servers with ~50 CPU cores and hundreds of GB of memory. In this setting MATLAB is often the environment of choice for scientists who want to implement their own models with little effort. It is a useful general purpose mathematical software package, but due to its versatility some of its functionality is not as efficient as it could be. In particular, the challenges of modern multi-core architectures are not fully addressed. We have developed MILAMIN 2 - an efficient FEM modeling environment written in native MATLAB. Amongst others, MILAMIN provides functions to define model geometry, generate and convert structured and unstructured meshes (also through interfaces to external mesh generators), compute element and system matrices, apply boundary conditions, solve the system of linear equations, address non-linear and transient problems, and perform post-processing. MILAMIN strives to combine the ease of code development and the computational efficiency. Where possible, the code is optimized and/or parallelized within the MATLAB framework.
Native MATLAB is augmented with the MUTILS library - a set of MEX functions that implement the computationally intensive, performance critical parts of the code, which we have identified to be bottlenecks. Here, we discuss the functionality and performance of the MUTILS library. Currently, it includes: 1. time and memory efficient assembly of sparse matrices for FEM simulations 2. parallel sparse matrix - vector product with optimizations specific to symmetric matrices and multiple degrees of freedom per node 3. parallel point in triangle location and point in tetrahedron location for unstructured, adaptive 2D and 3D meshes (useful for 'marker in cell' type of methods) 4. parallel FEM interpolation for 2D and 3D meshes of elements of different types and orders, and for different number of degrees of freedom per node 5. a stand-alone, MEX implementation of the Conjugate Gradients iterative solver 6. interface to METIS graph partitioning and a fast implementation of RCM reordering
Efficient Implementation of an Optimal Interpolator for Large Spatial Data Sets
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.
2007-01-01
Interpolating scattered data points is a problem of wide ranging interest. A number of approaches for interpolation have been proposed both from theoretical domains such as computational geometry and in application fields such as geostatistics. Our motivation arises from geological and mining applications. In many instances data can be costly to compute and are available only at nonuniformly scattered positions. Because of the high cost of collecting measurements, high accuracy is required in the interpolants. One of the most popular interpolation methods in this field is called ordinary kriging. It is popular because it is a best linear unbiased estimator. The price for its statistical optimality is that the estimator is computationally very expensive. This is because the value of each interpolant is given by the solution of a large dense linear system. In practice, kriging problems have been solved approximately by restricting the domain to a small local neighborhood of points that lie near the query point. Determining the proper size for this neighborhood is solved by ad hoc methods, and it has been shown that this approach leads to undesirable discontinuities in the interpolant. Recently a more principled approach to approximating kriging has been proposed based on a technique called covariance tapering. This process achieves its efficiency by replacing the large dense kriging system with a much sparser linear system. This technique has been applied to a restriction of our problem, called simple kriging, which is not unbiased for general data sets. In this paper we generalize these results by showing how to apply covariance tapering to the more general problem of ordinary kriging. Through experimentation we demonstrate the space and time efficiency and accuracy of approximating ordinary kriging through the use of covariance tapering combined with iterative methods for solving large sparse systems.
We demonstrate our approach on large data sizes arising both from synthetic sources and from real applications.
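To make the sparsification step concrete, here is a small sketch of how a taper turns a dense covariance matrix into a sparse one. It is an illustration under stated assumptions rather than the authors' code: the exponential covariance model, the spherical taper, the parameter values, and the function names are all chosen for the example. The sketch only builds the tapered matrix; it does not assemble or solve the ordinary kriging system.

```python
import math

def exp_cov(h, sill=1.0, corr_range=10.0):
    """Exponential covariance model (an assumed choice for illustration)."""
    return sill * math.exp(-h / corr_range)

def spherical_taper(h, theta):
    """Compactly supported taper: exactly zero for separations h >= theta."""
    if h >= theta:
        return 0.0
    r = h / theta
    return 1.0 - 1.5 * r + 0.5 * r ** 3

def tapered_cov_matrix(points, theta):
    """Entrywise product of covariance and taper for every pair of points."""
    matrix = []
    for p in points:
        row = []
        for q in points:
            h = math.dist(p, q)
            row.append(exp_cov(h) * spherical_taper(h, theta))
        matrix.append(row)
    return matrix

# Hypothetical data locations on a 6 x 6 unit grid.
points = [(float(i), float(j)) for i in range(6) for j in range(6)]
theta = 2.5
C = tapered_cov_matrix(points, theta)
nonzeros = sum(1 for row in C for value in row if value != 0.0)
print(nonzeros, "of", len(points) ** 2, "entries are non-zero")
```

Because every pair of points farther apart than `theta` contributes an exact zero, the matrix can be stored in a sparse format and handed to an iterative solver, which is the source of the savings the abstract describes.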
NASA Astrophysics Data System (ADS)
Ferrando, N.; Gosálvez, M. A.; Cerdá, J.; Gadea, R.; Sato, K.
2011-03-01
Presently, dynamic surface-based models are required to contain increasingly larger numbers of points and to propagate them over longer time periods. For large numbers of surface points, the octree data structure can be used as a balance between low memory occupation and relatively rapid access to the stored data. For evolution rules that depend on neighborhood states, extended simulation periods can be obtained by using simplified atomistic propagation models, such as the Cellular Automata (CA). This method, however, has an intrinsic parallel updating nature and the corresponding simulations are highly inefficient when performed on classical Central Processing Units (CPUs), which are designed for the sequential execution of tasks. In this paper, a series of guidelines is presented for the efficient adaptation of octree-based, CA simulations of complex, evolving surfaces into massively parallel computing hardware. A Graphics Processing Unit (GPU) is used as a cost-efficient example of the parallel architectures. For the actual simulations, we consider the surface propagation during anisotropic wet chemical etching of silicon as a computationally challenging process with a wide-spread use in microengineering applications. A continuous CA model that is intrinsically parallel in nature is used for the time evolution. Our study strongly indicates that parallel computations of dynamically evolving surfaces simulated using CA methods are significantly benefited by the incorporation of octrees as support data structures, substantially decreasing the overall computational time and memory usage.
NASA Astrophysics Data System (ADS)
Mössinger, Peter; Jester-Zürker, Roland; Jung, Alexander
2015-01-01
Numerical investigations of hydraulic turbo machines under steady-state conditions are state of the art in current product development processes. Nevertheless, increasing computational resources allow refined discretization methods, more sophisticated turbulence models and therefore better predictions of results as well as the quantification of existing uncertainties. Single stage investigations are done using in-house tools for meshing and set-up procedure. Besides different model domains and a mesh study to reduce mesh dependencies, the variation of several eddy viscosity and Reynolds stress turbulence models is investigated. All obtained results are compared with available model test data. In addition to global values, measured magnitudes in the vaneless space, at runner blade and draft tube positions in terms of pressure and velocity are considered. From there it is possible to estimate the influence and relevance of various model domains depending on different operating points and numerical variations. Good agreement can be found for pressure and velocity measurements with all model configurations and all turbulence models except the BSL-RSM model. At part load, deviations in hydraulic efficiency are large, whereas at the best efficiency and high load operating points the efficiencies are close to the measurement. A consideration of the runner side gap geometry as well as a refined mesh is able to improve the results in relation to either hydraulic efficiency or velocity distribution, with the drawbacks of less stable numerics and increasing computational time.
Multi-Party Privacy-Preserving Set Intersection with Quasi-Linear Complexity
NASA Astrophysics Data System (ADS)
Cheon, Jung Hee; Jarecki, Stanislaw; Seo, Jae Hong
Secure computation of the set intersection functionality allows n parties to find the intersection between their datasets without revealing anything else about them. An efficient protocol for such a task could have multiple potential applications in commerce, health care, and security. However, all currently known secure set intersection protocols for n>2 parties have computational costs that are quadratic in the (maximum) number of entries in the dataset contributed by each party, making secure computation of the set intersection only practical for small datasets. In this paper, we describe the first multi-party protocol for securely computing the set intersection functionality with both the communication and the computation costs that are quasi-linear in the size of the datasets. For a fixed security parameter, our protocols require O(n²k) bits of communication and Õ(n²k) group multiplications per player in the malicious adversary setting, where k is the size of each dataset. Our protocol follows the basic idea of the protocol proposed by Kissner and Song, but we gain efficiency by using different representations of the polynomials associated with users' datasets and careful employment of algorithms that interpolate or evaluate polynomials on multiple points more efficiently. Moreover, the proposed protocol is robust. This means that the protocol outputs the desired result even if some corrupted players leave during the execution of the protocol.
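The polynomial view of a dataset that underlies this family of protocols can be shown without any cryptography. The sketch below is deliberately insecure and is not the protocol of the paper: it works on plaintext values, uses only two parties, and the modulus, function names, and example sets are assumptions for illustration. It shows only the algebraic fact that a randomly masked combination of two set polynomials vanishes on the common elements.

```python
import random

P = 2**61 - 1  # prime field modulus (an illustrative choice)

def poly_from_set(elems):
    """Coefficients, lowest degree first, of the product of (x - e) over the set, mod P."""
    coeffs = [1]
    for e in elems:
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            nxt[i] = (nxt[i] - e * c) % P      # contribution of the -e term
            nxt[i + 1] = (nxt[i + 1] + c) % P  # contribution of the x term
        coeffs = nxt
    return coeffs

def poly_eval(coeffs, x):
    """Horner evaluation mod P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def masked_eval(f, g, r, s, x):
    """Evaluate f*r + g*s at x; this is zero whenever x is a root of both f and g."""
    return (poly_eval(f, x) * poly_eval(r, x) + poly_eval(g, x) * poly_eval(s, x)) % P

set_a = [3, 7, 11, 20]
set_b = [5, 7, 20, 42]
f, g = poly_from_set(set_a), poly_from_set(set_b)

rng = random.Random(0)
r = [rng.randrange(P) for _ in range(len(set_a) + 1)]  # random masking polynomials
s = [rng.randrange(P) for _ in range(len(set_b) + 1)]

common = [a for a in set_a if masked_eval(f, g, r, s, a) == 0]
print(common)
```

An element outside the intersection passes the test only if a random mask happens to vanish there, which has negligible probability over a field of this size. In a real protocol these evaluations would be carried out on encrypted polynomials, and the efficiency gains described in the abstract come from how the polynomials are represented and evaluated at many points.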
User-assisted video segmentation system for visual communication
NASA Astrophysics Data System (ADS)
Wu, Zhengping; Chen, Chun
2002-01-01
Video segmentation plays an important role in efficient storage and transmission in visual communication. In this paper, we introduce a novel video segmentation system using point tracking and contour formation techniques. Inspired by results from the study of the human visual system, we split the video segmentation problem into three separate phases: user-assisted feature point selection, automatic feature point tracking, and contour formation. This splitting relieves the computer of ill-posed automatic segmentation problems, and allows a higher level of flexibility of the method. First, precise feature points are found using a combination of user assistance and an eigenvalue-based adjustment. Second, the feature points in the remaining frames are obtained using motion estimation and point refinement. Finally, contour formation is used to extract the object, and a point insertion process provides the feature points for the next frame's tracking.
A hybrid framework for quantifying the influence of data in hydrological model calibration
NASA Astrophysics Data System (ADS)
Wright, David P.; Thyer, Mark; Westra, Seth; McInerney, David
2018-06-01
Influence diagnostics aim to identify a small number of influential data points that have a disproportionate impact on the model parameters and/or predictions. The key issues with current influence diagnostic techniques are that the regression-theory approaches do not provide hydrologically relevant influence metrics, while the case-deletion approaches are computationally expensive to calculate. The main objective of this study is to introduce a new two-stage hybrid framework that overcomes these challenges, by delivering hydrologically relevant influence metrics in a computationally efficient manner. Stage one uses computationally efficient regression-theory influence diagnostics to identify the most influential points based on Cook's distance. Stage two then uses case-deletion influence diagnostics to quantify the influence of points using hydrologically relevant metrics. To illustrate the application of the hybrid framework, we conducted three experiments on 11 hydro-climatologically diverse Australian catchments using the GR4J hydrological model. The first experiment investigated how many data points from stage one need to be retained in order to reliably identify those points that have the highest influence on hydrologically relevant metrics. We found that a choice of 30-50 is suitable for hydrological applications similar to those explored in this study (30 points identified the most influential data 98% of the time and reduced the required recalibrations by 99% for a 10 year calibration period). The second experiment found little evidence of a change in the magnitude of influence with increasing calibration period length from 1, 2, 5 to 10 years. Even for 10 years the impact of influential points can still be high (>30% influence on maximum predicted flows). The third experiment compared the standard least squares (SLS) objective function with the weighted least squares (WLS) objective function on a 10 year calibration period.
In two out of three flow metrics there was evidence that SLS, with the assumption of homoscedastic residual error, identified data points with higher influence (largest changes of 40%, 10%, and 44% for the maximum, mean, and low flows, respectively) than WLS, with the assumption of heteroscedastic residual errors (largest changes of 26%, 6%, and 6% for the maximum, mean, and low flows, respectively). The hybrid framework complements existing model diagnostic tools and can be applied to a wide range of hydrological modelling scenarios.
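The two-stage idea can be sketched on a toy problem. The example below substitutes a straight-line least-squares fit for the GR4J model and uses the maximum fitted value as a stand-in for a flow metric; the function names, the data, and the shortlist size are all assumptions for illustration and do not reproduce the study's setup. Stage one ranks points by Cook's distance from a single fit, and stage two refits only for the shortlisted points.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    return ybar - slope * xbar, slope

def cooks_distances(xs, ys):
    """Cook's distance of every point, computed from one fit (no refitting)."""
    n, p = len(xs), 2
    a, b = fit_line(xs, ys)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    s2 = sum(e * e for e in resid) / (n - p)
    out = []
    for x, e in zip(xs, resid):
        h = 1.0 / n + (x - xbar) ** 2 / sxx  # leverage of this point
        out.append(e * e / (p * s2) * h / (1.0 - h) ** 2)
    return out

def two_stage_influence(xs, ys, keep):
    """Stage 1: shortlist by Cook's distance. Stage 2: case deletion on the shortlist."""
    d = cooks_distances(xs, ys)
    shortlist = sorted(range(len(xs)), key=lambda i: d[i], reverse=True)[:keep]
    a, b = fit_line(xs, ys)
    base_max = max(a + b * x for x in xs)
    influence = {}
    for i in shortlist:
        a_i, b_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        new_max = max(a_i + b_i * x for x in xs)
        influence[i] = (new_max - base_max) / base_max  # relative change in the metric
    return influence

# Synthetic data: a clean line with one corrupted observation at index 9.
xs = [float(x) for x in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
ys[9] = 40.0
influence = two_stage_influence(xs, ys, keep=3)
most_influential = max(influence, key=lambda i: abs(influence[i]))
print(most_influential, influence)
```

Only three refits are needed here instead of ten. With an expensive model calibration in place of `fit_line`, that reduction in refits is where the savings reported in the abstract would come from.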
NASA Astrophysics Data System (ADS)
Li, Xuxu; Li, Xinyang; Wang, Caixia
2018-03-01
This paper proposes an efficient approach to decrease the computational costs of correlation-based centroiding methods used for point source Shack-Hartmann wavefront sensors. Four typical similarity functions have been compared, i.e. the absolute difference function (ADF), ADF square (ADF2), square difference function (SDF), and cross-correlation function (CCF) using the Gaussian spot model. By combining them with fast search algorithms, such as three-step search (TSS), two-dimensional logarithmic search (TDL), cross search (CS), and orthogonal search (OS), computational costs can be reduced drastically without affecting the accuracy of centroid detection. Specifically, OS reduces calculation consumption by 90%. A comprehensive simulation indicates that CCF exhibits a better performance than other functions under various light-level conditions. Besides, the effectiveness of fast search algorithms has been verified.
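As an illustration of how a fast search cuts the number of similarity evaluations, here is a generic three-step search (TSS) sketch. It is not taken from the paper: the function name, the quadratic cost standing in for a difference-type similarity function, and the search window are assumptions for the example.

```python
def three_step_search(cost, start=(0, 0), step=4):
    """Coarse-to-fine search over integer shifts.

    At each stage the eight neighbours at the current step size are evaluated,
    the centre moves to the best point found, and the step is halved.
    Returns the best shift and the number of cost evaluations.
    """
    best = start
    best_cost = cost(*best)
    evaluations = 1
    while step >= 1:
        cx, cy = best  # centre stays fixed while its neighbours are scanned
        for dx in (-step, 0, step):
            for dy in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue
                cand = (cx + dx, cy + dy)
                c = cost(*cand)
                evaluations += 1
                if c < best_cost:
                    best, best_cost = cand, c
        step //= 2
    return best, evaluations

# Hypothetical unimodal cost with its minimum at a shift of (5, -3).
def toy_cost(dx, dy):
    return (dx - 5) ** 2 + (dy + 3) ** 2

shift, n_eval = three_step_search(toy_cost)
print(shift, n_eval)
```

For this toy cost the search visits 25 positions, whereas an exhaustive scan of the same window of shifts from -7 to 7 in each direction would visit 225. The shortcut relies on the cost surface having a single well-behaved minimum, which is why such searches suit smooth, spot-like correlation surfaces.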
NASA Astrophysics Data System (ADS)
Diehl, Roger E.; Schinnerer, Ralph G.; Williamson, Walton E.; Boden, Daryl G.
The present conference discusses topics in orbit determination, tethered satellite systems, celestial mechanics, guidance optimization, flexible body dynamics and control, attitude dynamics and control, Mars mission analyses, earth-orbiting mission analysis/debris, space probe mission analyses, and orbital computation numerical analyses. Attention is given to electrodynamic forces for control of tethered satellite systems, orbiting debris threats to asteroid flyby missions, launch velocity requirements for interceptors of short range ballistic missiles, transfers between libration-point orbits in the elliptic restricted problem, minimum fuel spacecraft reorientation, orbital guidance for hitting a fixed point at maximum speed, efficient computation of satellite visibility periods, orbit decay and reentry prediction for space debris, and the determination of satellite close approaches.
NASA Technical Reports Server (NTRS)
Diehl, Roger E. (Editor); Schinnerer, Ralph G. (Editor); Williamson, Walton E. (Editor); Boden, Daryl G. (Editor)
1992-01-01
The present conference discusses topics in orbit determination, tethered satellite systems, celestial mechanics, guidance optimization, flexible body dynamics and control, attitude dynamics and control, Mars mission analyses, earth-orbiting mission analysis/debris, space probe mission analyses, and orbital computation numerical analyses. Attention is given to electrodynamic forces for control of tethered satellite systems, orbiting debris threats to asteroid flyby missions, launch velocity requirements for interceptors of short range ballistic missiles, transfers between libration-point orbits in the elliptic restricted problem, minimum fuel spacecraft reorientation, orbital guidance for hitting a fixed point at maximum speed, efficient computation of satellite visibility periods, orbit decay and reentry prediction for space debris, and the determination of satellite close approaches.
Techniques of EMG signal analysis: detection, processing, classification and applications
Hussain, M.S.; Mohd-Yasin, F.
2006-01-01
Electromyography (EMG) signals can be used for clinical/biomedical applications, Evolvable Hardware Chip (EHW) development, and modern human computer interaction. EMG signals acquired from muscles require advanced methods for detection, decomposition, processing, and classification. The purpose of this paper is to illustrate the various methodologies and algorithms for EMG signal analysis to provide efficient and effective ways of understanding the signal and its nature. We further point out some of the hardware implementations using EMG, focusing on applications related to prosthetic hand control, grasp recognition, and human computer interaction. A comparison study is also given to show the performance of various EMG signal analysis methods. This paper provides researchers with a good understanding of the EMG signal and its analysis procedures. This knowledge will help them develop more powerful, flexible, and efficient applications. PMID:16799694
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nitao, J J
The goal of the Event Reconstruction Project is to find the location and strength of atmospheric release points, both stationary and moving. Source inversion relies on observational data as input. The methodology is sufficiently general to allow various forms of data. In this report, the authors will focus primarily on concentration measurements obtained at point monitoring locations at various times. The algorithms being investigated in the Project are the MCMC (Markov Chain Monte Carlo), SMC (Sequential Monte Carlo) Methods, classical inversion methods, and hybrids of these. They refer the reader to the report by Johannesson et al. (2004) for explanations of these methods. These methods require computing the concentrations at all monitoring locations for a given ''proposed'' source characteristic (locations and strength history). It is anticipated that the largest portion of the CPU time will be spent performing this computation. MCMC and SMC will require this computation to be done at least tens of thousands of times. Therefore, an efficient means of computing forward model predictions is important to making the inversion practical. In this report they show how Green's functions and reciprocal Green's functions can significantly accelerate forward model computations. First, instead of computing a plume for each possible source strength history, they can compute plumes from unit impulse sources only. By using linear superposition, they can obtain the response for any strength history. This response is given by the forward Green's function. Second, they may use the law of reciprocity. Suppose that they require the concentration at a single monitoring point x_m due to a potential (unit impulse) source that is located at x_s. Instead of computing a plume with source location x_s, they compute a ''reciprocal plume'' whose (unit impulse) source is at the monitoring location x_m.
The reciprocal plume is computed using a reversed-direction wind field. The wind field and transport coefficients must also be appropriately time-reversed. Reciprocity says that the concentration of the reciprocal plume at x_s is related to the desired concentration at x_m. Since there are far fewer monitoring points than potential source locations, the number of forward model computations is drastically reduced.
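The superposition step can be illustrated with a short sketch. It assumes, purely for the example, a time-invariant transport field, so that the response at one monitor to a unit impulse released at one source location depends only on the elapsed time. Under that assumption the concentration history for any strength history is a discrete convolution. The function name and the numbers are hypothetical.

```python
def superpose(impulse_response, strength_history):
    """Concentration history at one monitor from a precomputed unit-impulse response.

    impulse_response[j]: concentration j time steps after a unit release.
    strength_history[k]: source strength released at time step k.
    """
    n = len(impulse_response) + len(strength_history) - 1
    conc = [0.0] * n
    for k, q in enumerate(strength_history):
        for j, g in enumerate(impulse_response):
            conc[k + j] += q * g  # each release contributes a scaled, delayed copy
    return conc

# One expensive plume run would supply the impulse response; these values are made up.
g = [0.0, 0.5, 0.25]

# Any number of proposed strength histories can then be evaluated cheaply.
history_1 = [2.0, 0.0, 4.0]
history_2 = [1.0, 1.0, 1.0]
print(superpose(g, history_1))
print(superpose(g, history_2))
```

In a sampling loop such as MCMC, the impulse responses would be computed once per monitor (using reciprocal plumes, as the report describes) and each proposed source would then cost only a convolution like the one above.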
NASA Technical Reports Server (NTRS)
Hedgley, D. R., Jr.
1982-01-01
The demand for computer-generated perspective projections of three dimensional objects has escalated. A general solution was developed. The theoretical solution to this problem is presented. The method is very efficient as it minimizes the selection of points and comparison of line segments and hence avoids the devastation of square-law growth.
Development of full wave code for modeling RF fields in hot non-uniform plasmas
NASA Astrophysics Data System (ADS)
Zhao, Liangji; Svidzinski, Vladimir; Spencer, Andrew; Kim, Jin-Soo
2016-10-01
FAR-TECH, Inc. is developing a full wave RF modeling code to model RF fields in fusion devices and in general plasma applications. As an important component of the code, an adaptive meshless technique is introduced to solve the wave equations, which allows resolving plasma resonances efficiently and adapting to the complexity of antenna geometry and device boundary. The computational points are generated using either a point elimination method or a force balancing method based on the monitor function, which is calculated by solving the cold plasma dispersion equation locally. Another part of the code is the conductivity kernel calculation, used for modeling the nonlocal hot plasma dielectric response. The conductivity kernel is calculated on a coarse grid of test points and then interpolated linearly onto the computational points. All the components of the code are parallelized using MPI and OpenMP libraries to optimize the execution speed and memory. The algorithm and the results of our numerical approach to solving 2-D wave equations in a tokamak geometry will be presented. Work is supported by the U.S. DOE SBIR program.
NASA Technical Reports Server (NTRS)
Bernstein, Ira B.; Brookshaw, Leigh; Fox, Peter A.
1992-01-01
The present numerical method for accurate and efficient solution of systems of linear equations proceeds by numerically developing a set of basis solutions characterized by slowly varying dependent variables. The solutions thus obtained are shown to have a computational overhead largely independent of the small size of the scale length which characterizes the solutions; in many cases, the technique obviates series solutions near singular points, and its known sources of error can be easily controlled without a substantial increase in computational time.
Floating-point scaling technique for sources separation automatic gain control
NASA Astrophysics Data System (ADS)
Fermas, A.; Belouchrani, A.; Ait-Mohamed, O.
2012-07-01
Based on the floating-point representation and taking advantage of scaling factor indetermination in blind source separation (BSS) processing, we propose a scaling technique applied to the separation matrix, to avoid the saturation or the weakness in the recovered source signals. This technique performs an automatic gain control in an on-line BSS environment. We demonstrate the effectiveness of this technique by using the implementation of a division-free BSS algorithm with two inputs, two outputs. The proposed technique is computationally cheaper and efficient for a hardware implementation compared to the Euclidean normalisation.
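As a rough illustration of the idea in the abstract above, the sketch below rescales each row of a separation matrix by a power of two taken from the floating-point exponent of its largest entry, so that no division or square root is required. The helper name scale_rows_pow2 and the target range [0.5, 1) are assumptions made for this example; the abstract does not give the authors' exact rule.

```python
import math

def scale_rows_pow2(W):
    """Rescale each row of W by a power of two so its largest |entry| lies in [0.5, 1).

    Hypothetical helper: it reads only the floating-point exponent (math.frexp)
    and shifts exponents (math.ldexp), so no division or square root is used.
    """
    scaled = []
    for row in W:
        peak = max(abs(x) for x in row)
        if peak == 0.0:
            scaled.append(list(row))  # an all-zero row needs no scaling
            continue
        _, exponent = math.frexp(peak)  # peak = m * 2**exponent with 0.5 <= m < 1
        scaled.append([math.ldexp(x, -exponent) for x in row])
    return scaled

# Example separation matrix with one very large and one very small row.
W = [[1200.0, -300.0], [0.002, 0.0005]]
S = scale_rows_pow2(W)
```

Because every entry in a row is multiplied by the same power of two, the ratios within a row are unchanged, which is compatible with the scaling indeterminacy of BSS mentioned in the abstract.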
The Octree Encoding Method for Efficient Solid Modeling.
1982-08-01
vertex point of a test obel is interior or exterior to the object. If the object is a convex polyhedron, the surface is described by a L 55 yp Test Obel...values can be generated by adding an offset from a pre-computed table. For a convex polyhedron, if a point is in the positive half-space of all face...planes, then it is interior to the polyhedron. If it is in the negative half-plane of any face plane, then it is exterior to the object. Otherwise, the
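The half-space test described in the excerpt above can be sketched as follows. The plane representation (a, b, c, d), the sign convention, the tolerance, and the "boundary" label for the remaining case are assumptions for this example; the excerpt is truncated before it says how that last case is handled.

```python
def classify_point(point, face_planes, tol=1e-9):
    """Classify a point against a convex polyhedron given by its face planes.

    Each plane is (a, b, c, d), with a*x + b*y + c*z + d > 0 on the interior
    side (an assumed sign convention for this sketch).
    """
    x, y, z = point
    on_boundary = False
    for a, b, c, d in face_planes:
        s = a * x + b * y + c * z + d
        if s < -tol:
            return "exterior"  # negative side of at least one face plane
        if s <= tol:
            on_boundary = True  # lies on this face plane within tolerance
    return "boundary" if on_boundary else "interior"

# Unit cube [0, 1]^3 described by six planes with inward-pointing normals.
CUBE = [
    (1, 0, 0, 0), (-1, 0, 0, 1),
    (0, 1, 0, 0), (0, -1, 0, 1),
    (0, 0, 1, 0), (0, 0, -1, 1),
]
```

The early return on the first negative plane keeps the exterior case cheap, which matters when the test is applied to every vertex of many obels.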
A one-model approach based on relaxed combinations of inputs for evaluating input congestion in DEA
NASA Astrophysics Data System (ADS)
Khodabakhshi, Mohammad
2009-08-01
This paper provides a one-model approach of input congestion based on input relaxation model developed in data envelopment analysis (e.g. [G.R. Jahanshahloo, M. Khodabakhshi, Suitable combination of inputs for improving outputs in DEA with determining input congestion -- Considering textile industry of China, Applied Mathematics and Computation (1) (2004) 263-273; G.R. Jahanshahloo, M. Khodabakhshi, Determining assurance interval for non-Archimedean ele improving outputs model in DEA, Applied Mathematics and Computation 151 (2) (2004) 501-506; M. Khodabakhshi, A super-efficiency model based on improved outputs in data envelopment analysis, Applied Mathematics and Computation 184 (2) (2007) 695-703; M. Khodabakhshi, M. Asgharian, An input relaxation measure of efficiency in stochastic data analysis, Applied Mathematical Modelling 33 (2009) 2010-2023]). This approach reduces the three problems solved in the two-model approach, introduced in the first of the above-mentioned references, to two problems, which is certainly important from a computational point of view. The model is applied to a set of data extracted from ISI database to estimate input congestion of 12 Canadian business schools.
Communication target object recognition for D2D connection with feature size limit
NASA Astrophysics Data System (ADS)
Ok, Jiheon; Kim, Soochang; Kim, Young-hoon; Lee, Chulhee
2015-03-01
Recently, a new concept of device-to-device (D2D) communication, which is called "point-and-link communication", has attracted great attention due to its intuitive and simple operation. This approach enables users to communicate with target devices without any pre-identification information such as SSIDs or MAC addresses by selecting the target image displayed on the user's own device. In this paper, we present an efficient object matching algorithm that can be applied to look(point)-and-link communications for mobile services. Due to the limited channel bandwidth and low computational power of mobile terminals, the matching algorithm should satisfy low-complexity, low-memory and realtime requirements. To meet these requirements, we propose fast and robust feature extraction by considering the descriptor size and processing time. The proposed algorithm utilizes a HSV color histogram, SIFT (Scale Invariant Feature Transform) features and object aspect ratios. To reduce the descriptor size under 300 bytes, a limited number of SIFT key points were chosen as feature points and histograms were binarized while maintaining required performance. Experimental results show the robustness and the efficiency of the proposed algorithm.
Nam, Junghyun; Choo, Kim-Kwang Raymond; Han, Sangchul; Kim, Moonseong; Paik, Juryon; Won, Dongho
2015-01-01
A smart-card-based user authentication scheme for wireless sensor networks (hereafter referred to as a SCA-WSN scheme) is designed to ensure that only users who possess both a smart card and the corresponding password are allowed to gain access to sensor data and their transmissions. Despite many research efforts in recent years, it remains a challenging task to design an efficient SCA-WSN scheme that achieves user anonymity. The majority of published SCA-WSN schemes use only lightweight cryptographic techniques (rather than public-key cryptographic techniques) for the sake of efficiency, and have been demonstrated to suffer from the inability to provide user anonymity. Some schemes employ elliptic curve cryptography for better security but require sensors with strict resource constraints to perform computationally expensive scalar-point multiplications; despite the increased computational requirements, these schemes do not provide user anonymity. In this paper, we present a new SCA-WSN scheme that not only achieves user anonymity but also is efficient in terms of the computation loads for sensors. Our scheme employs elliptic curve cryptography but restricts its use only to anonymous user-to-gateway authentication, thereby allowing sensors to perform only lightweight cryptographic operations. Our scheme also enjoys provable security in a formal model extended from the widely accepted Bellare-Pointcheval-Rogaway (2000) model to capture the user anonymity property and various SCA-WSN specific attacks (e.g., stolen smart card attacks, node capture attacks, privileged insider attacks, and stolen verifier attacks). PMID:25849359
Ewald Electrostatics for Mixtures of Point and Continuous Line Charges.
Antila, Hanne S; Tassel, Paul R Van; Sammalkorpi, Maria
2015-10-15
Many charged macro- or supramolecular systems, such as DNA, are approximately rod-shaped and, to the lowest order, may be treated as continuous line charges. However, the standard method used to calculate electrostatics in molecular simulation, the Ewald summation, is designed to treat systems of point charges. We extend the Ewald concept to a hybrid system containing both point charges and continuous line charges. We find the calculated force between a point charge and (i) a continuous line charge and (ii) a discrete line charge consisting of uniformly spaced point charges to be numerically equivalent when the separation greatly exceeds the discretization length. At shorter separations, discretization induces deviations in the force and energy, and point charge-point charge correlation effects. Because significant computational savings are also possible, the continuous line charge Ewald method presented here offers the possibility of accurate and efficient electrostatic calculations.
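The equivalence claim in the abstract above can be checked in a much simpler setting than Ewald summation. The sketch below uses plain, non-periodic Coulomb forces (not the authors' method) to compare a finite continuous line charge with its discretized counterpart. The geometry (test charge on the perpendicular bisector), the units with k = 1, and all function names are assumptions chosen for illustration.

```python
import math

def force_continuous(d, length, lam, q=1.0, k=1.0):
    """Perpendicular force on charge q at distance d from the midpoint of a
    uniform finite line charge (closed form; k stands for 1/(4*pi*eps0))."""
    return k * q * lam * length / (d * math.sqrt(d * d + 0.25 * length * length))

def force_discrete(d, length, lam, n, q=1.0, k=1.0):
    """Same force when the line is replaced by n equal point charges placed
    at the centers of n equal segments."""
    spacing = length / n
    dq = lam * spacing
    total = 0.0
    for i in range(n):
        x = -0.5 * length + (i + 0.5) * spacing
        total += k * q * dq * d / (d * d + x * x) ** 1.5
    return total

def relative_deviation(d, length=10.0, lam=1.0, n=100):
    """Relative difference between the discrete and continuous forces."""
    fc = force_continuous(d, length, lam)
    fd = force_discrete(d, length, lam, n)
    return abs(fd - fc) / fc

far = relative_deviation(1.0)    # separation = 10 discretization lengths
near = relative_deviation(0.02)  # separation = 0.2 discretization lengths
```

With these parameters the discretization length is 0.1, so the first call probes a separation well above it and the second a separation well below it, where the individual point charges are resolved.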
Blind quantum computing with weak coherent pulses.
Dunjko, Vedran; Kashefi, Elham; Leverrier, Anthony
2012-05-18
The universal blind quantum computation (UBQC) protocol [A. Broadbent, J. Fitzsimons, and E. Kashefi, in Proceedings of the 50th Annual IEEE Symposium on Foundations of Computer Science (IEEE Computer Society, Los Alamitos, CA, USA, 2009), pp. 517-526.] allows a client to perform quantum computation on a remote server. In an ideal setting, perfect privacy is guaranteed if the client is capable of producing specific, randomly chosen single qubit states. While from a theoretical point of view, this may constitute the lowest possible quantum requirement, from a pragmatic point of view, generation of such states to be sent along long distances can never be achieved perfectly. We introduce the concept of ϵ blindness for UBQC, in analogy to the concept of ϵ security developed for other cryptographic protocols, allowing us to characterize the robustness and security properties of the protocol under possible imperfections. We also present a remote blind single qubit preparation protocol with weak coherent pulses for the client to prepare, in a delegated fashion, quantum states arbitrarily close to perfect random single qubit states. This allows us to efficiently achieve ϵ-blind UBQC for any ϵ>0, even if the channel between the client and the server is arbitrarily lossy.
Blind Quantum Computing with Weak Coherent Pulses
NASA Astrophysics Data System (ADS)
Dunjko, Vedran; Kashefi, Elham; Leverrier, Anthony
2012-05-01
The universal blind quantum computation (UBQC) protocol [A. Broadbent, J. Fitzsimons, and E. Kashefi, in Proceedings of the 50th Annual IEEE Symposium on Foundations of Computer Science (IEEE Computer Society, Los Alamitos, CA, USA, 2009), pp. 517-526.] allows a client to perform quantum computation on a remote server. In an ideal setting, perfect privacy is guaranteed if the client is capable of producing specific, randomly chosen single qubit states. While from a theoretical point of view, this may constitute the lowest possible quantum requirement, from a pragmatic point of view, generation of such states to be sent along long distances can never be achieved perfectly. We introduce the concept of ɛ blindness for UBQC, in analogy to the concept of ɛ security developed for other cryptographic protocols, allowing us to characterize the robustness and security properties of the protocol under possible imperfections. We also present a remote blind single qubit preparation protocol with weak coherent pulses for the client to prepare, in a delegated fashion, quantum states arbitrarily close to perfect random single qubit states. This allows us to efficiently achieve ɛ-blind UBQC for any ɛ>0, even if the channel between the client and the server is arbitrarily lossy.
NASA Technical Reports Server (NTRS)
Ramins, P.; Kosmahl, H. G.; Force, D. A.; Palmer, R. W.; Dayton, J. A., Jr.
1985-01-01
A computational procedure for the design of TWT-refocuser-MDC systems was used to design a short dynamic refocusing system and highly efficient four-stage depressed collector for a 200-W, 8- to 18-GHz, TWT. The computations were carried out with advanced, multidimensional computer programs which model the electron beam as a series of disks of charge and follow their trajectories from the RF input of the TWT, through the slow-wave structure and refocusing section, to their points of impact in the depressed collector. Secondary emission losses in the MDC were treated semi-quantitatively by injecting a representative beam of secondary electrons into the MDC analysis at the point of impact of each primary beam. A comparison of computed and measured TWT and MDC performance showed very good agreement. The electrodes of the MDC were fabricated from a particular form of isotropic graphite that was selected for its low secondary electron yield, ease of machinability, and vacuum properties. This MDC was tested (at CW) for more than 1000 hr with negligible degradation in TWT and MDC performances.
Use of Terrestrial Laser Scanner for Rigid Airport Pavement Management.
Barbarella, Maurizio; D'Amico, Fabrizio; De Blasiis, Maria Rosaria; Di Benedetto, Alessandro; Fiani, Margherita
2017-12-26
The evaluation of the structural efficiency of airport infrastructures is a complex task. Faulting is one of the most important indicators of rigid pavement performance. The aim of our study is to provide a new method for faulting detection and computation on jointed concrete pavements. Nowadays, the assessment of faulting is performed with the use of laborious and time-consuming measurements that strongly hinder aircraft traffic. We proposed a field procedure for Terrestrial Laser Scanner data acquisition and a computation flow chart in order to identify and quantify the fault size at each joint of apron slabs. The total point cloud has been used to compute the least square plane fitting those points. The best-fit plane for each slab has been computed too. The attitude of each slab plane with respect to both the adjacent ones and the apron reference plane has been determined by the normal vectors to the surfaces. Faulting has been evaluated as the difference in elevation between the slab planes along chosen sections. For a more accurate evaluation of the faulting value, we have then considered a few strips of data covering rectangular areas of different sizes across the joints. The accuracy of the estimated quantities has been computed too.
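The plane-fitting and elevation-difference steps described in the abstract above can be sketched as follows. This is a minimal illustration that assumes each slab is well described by z = a*x + b*y + c and that the normal equations are solved with Cramer's rule; the function names and the synthetic slab data are hypothetical, and the authors' actual workflow is not given in the abstract.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to a list of (x, y, z) points."""
    n = len(points)
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    D = det3(A)
    if abs(D) < 1e-12:
        raise ValueError("points are degenerate (collinear or too few)")
    coeffs = []
    for col in range(3):
        M = [row[:] for row in A]
        for r in range(3):
            M[r][col] = rhs[r]  # Cramer's rule: replace one column by rhs
        coeffs.append(det3(M) / D)
    return tuple(coeffs)  # (a, b, c)

def faulting_at(plane_a, plane_b, x, y):
    """Elevation of plane_b minus elevation of plane_a at the point (x, y)."""
    za = plane_a[0] * x + plane_a[1] * y + plane_a[2]
    zb = plane_b[0] * x + plane_b[1] * y + plane_b[2]
    return zb - za

# Synthetic slabs: same 0.1% slope, second slab raised by 5 mm (units: metres).
slab_a = [(x, y, 0.001 * x) for x in range(5) for y in range(5)]
slab_b = [(x, y, 0.001 * x + 0.005) for x in range(5, 10) for y in range(5)]
fault = faulting_at(fit_plane(slab_a), fit_plane(slab_b), 4.5, 2.0)
```

Evaluating both fitted planes at a point on the joint, rather than differencing raw points, averages out per-point scanner noise in the estimate.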
NASA Astrophysics Data System (ADS)
Wang, Yongbo; Sheng, Yehua; Lu, Guonian; Tian, Peng; Zhang, Kai
2008-04-01
Surface reconstruction is an important task in the field of 3d-GIS, computer aided design and computer graphics (CAD & CG), virtual simulation and so on. Based on available incremental surface reconstruction methods, a feature-constrained surface reconstruction approach for point clouds is presented. Firstly, features are extracted from the point cloud under the rules of curvature extremes and minimum spanning tree. By projecting local sample points to the fitted tangent planes and using extracted features to guide and constrain the process of local triangulation and surface propagation, the topological relationship among sample points can be achieved. For the constructed models, a process named consistent normal adjustment and regularization is adopted to adjust the normal of each face so that the correct surface model is achieved. Experiments show that the presented approach inherits the convenient implementation and high efficiency of the traditional incremental surface reconstruction method, while avoiding improper propagation of normals across sharp edges, which means the applicability of incremental surface reconstruction is greatly improved. Above all, an appropriate k-neighborhood can help to recognize insufficiently sampled areas and boundary parts, so the presented approach can be used to reconstruct both open and closed surfaces without additional interference.
Dual domain material point method for multiphase flows
NASA Astrophysics Data System (ADS)
Zhang, Duan
2017-11-01
Although the particle-in-cell method was first invented in the 60's for fluid computations, one of its later versions, the material point method, is mostly used for solid calculations. Recent development of the multi-velocity formulations for multiphase flows and fluid-structure interactions requires the Lagrangian capability of the method be combined with Eulerian calculations for fluids. Because of different numerical representations of the materials, additional numerical schemes are needed to ensure continuity of the materials. New applications of the method to compute fluid motions have revealed numerical difficulties in various versions of the method. To resolve these difficulties, the dual domain material point method is introduced and improved. Unlike other particle based methods, the material point method uses both Lagrangian particles and Eulerian mesh, therefore it avoids direct communication between particles. With this unique property and the Lagrangian capability of the method, it is shown that a multiscale numerical scheme can be efficiently built based on the dual domain material point method. In this talk, the theoretical foundation of the method will be introduced. Numerical examples will be shown. Work sponsored by the next generation code project of LANL.
Efficient robust doubly adaptive regularized regression with applications.
Karunamuni, Rohana J; Kong, Linglong; Tu, Wei
2018-01-01
We consider the problem of estimation and variable selection for general linear regression models. Regularized regression procedures have been widely used for variable selection, but most existing methods perform poorly in the presence of outliers. We construct a new penalized procedure that simultaneously attains full efficiency and maximum robustness. Furthermore, the proposed procedure satisfies the oracle properties. The new procedure is designed to achieve sparse and robust solutions by imposing adaptive weights on both the decision loss and the penalty function. The proposed method of estimation and variable selection attains full efficiency when the model is correct and, at the same time, achieves maximum robustness when outliers are present. We examine the robustness properties using the finite-sample breakdown point and an influence function. We show that the proposed estimator attains the maximum breakdown point. Furthermore, there is no loss in efficiency when there are no outliers or the error distribution is normal. For practical implementation of the proposed method, we present a computational algorithm. We examine the finite-sample and robustness properties using Monte Carlo studies. Two datasets are also analyzed.
Fast Edge Detection and Segmentation of Terrestrial Laser Scans Through Normal Variation Analysis
NASA Astrophysics Data System (ADS)
Che, E.; Olsen, M. J.
2017-09-01
Terrestrial Laser Scanning (TLS) utilizes light detection and ranging (lidar) to effectively and efficiently acquire point cloud data for a wide variety of applications. Segmentation is a common procedure of post-processing to group the point cloud into a number of clusters to simplify the data for the sequential modelling and analysis needed for most applications. This paper presents a novel method to rapidly segment TLS data based on edge detection and region growing. First, by computing the projected incidence angles and performing the normal variation analysis, the silhouette edges and intersection edges are separated from the smooth surfaces. Then a modified region growing algorithm groups the points lying on the same smooth surface. The proposed method efficiently exploits the gridded scan pattern utilized during acquisition of TLS data from most sensors and takes advantage of parallel programming to process approximately 1 million points per second. Moreover, the proposed segmentation does not require estimation of the normal at each point, which limits the errors in normal estimation propagating to segmentation. Both an indoor and outdoor scene are used for an experiment to demonstrate and discuss the effectiveness and robustness of the proposed segmentation method.
Blow, Nikolaus; Biswas, Pradipta
2017-01-01
As computers become more and more essential for everyday life, people who cannot use them are missing out on an important tool. The predominant method of interaction with a screen is a mouse, and difficulty in using a mouse can be a huge obstacle for people who would otherwise gain great value from using a computer. If mouse pointing were to be made easier, then a large number of users may be able to begin using a computer efficiently where they may previously have been unable to. The present article aimed to improve pointing speeds for people with arm or hand impairments. The authors investigated different smoothing and prediction models on a stored data set involving 25 people, and the best of these algorithms were chosen. A web-based prototype was developed combining a polynomial smoothing algorithm with a time-weighted gradient target prediction model. The adapted interface gave an average improvement of 13.5% in target selection times in a 10-person study of representative users of the system. A demonstration video of the system is available at https://youtu.be/sAzbrKHivEY.
Implementing Molecular Dynamics for Hybrid High Performance Computers - 1. Short Range Forces
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, W Michael; Wang, Peng; Plimpton, Steven J
The use of accelerators such as general-purpose graphics processing units (GPGPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high performance computers, machines with more than one type of floating-point processor, are now becoming more prevalent due to these advantages. In this work, we discuss several important issues in porting a large molecular dynamics code for use on parallel hybrid machines - 1) choosing a hybrid parallel decomposition that works on central processing units (CPUs) with distributed memory and accelerator cores with shared memory, 2) minimizing the amount of code that must be ported for efficient acceleration, 3) utilizing the available processing power from both many-core CPUs and accelerators, and 4) choosing a programming model for acceleration. We present our solution to each of these issues for short-range force calculation in the molecular dynamics package LAMMPS. We describe algorithms for efficient short range force calculation on hybrid high performance machines. We describe a new approach for dynamic load balancing of work between CPU and accelerator cores. We describe the Geryon library that allows a single code to compile with both CUDA and OpenCL for use on a variety of accelerators. Finally, we present results on a parallel test cluster containing 32 Fermi GPGPUs and 180 CPU cores.
Speech recognition for embedded automatic positioner for laparoscope
NASA Astrophysics Data System (ADS)
Chen, Xiaodong; Yin, Qingyun; Wang, Yi; Yu, Daoyin
2014-07-01
In this paper a novel speech recognition methodology based on Hidden Markov Model (HMM) is proposed for embedded Automatic Positioner for Laparoscope (APL), which includes a fixed point ARM processor as the core. The APL system is designed to assist the doctor in laparoscopic surgery, by implementing the specific doctor's vocal control to the laparoscope. Real-time response to the voice commands requires a more efficient speech recognition algorithm for the APL. In order to reduce computation cost without significant loss in recognition accuracy, both arithmetic and algorithmic optimizations are applied in the method presented. First, relying mostly on arithmetic optimizations, a fixed point frontend for speech feature analysis is built according to the ARM processor's characteristics. Then the fast likelihood computation algorithm is used to reduce computational complexity of the HMM-based recognition algorithm. The experimental results show that the method shortens the recognition time to within 0.5s, while the accuracy is higher than 99%, demonstrating its ability to achieve real-time vocal control to the APL.
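To give a sense of what a fixed point frontend involves, the sketch below shows Q15 arithmetic, a common 16-bit fixed-point format on processors without a floating-point unit. The choice of Q15, the rounding rule, and the function names are assumptions for this example; the abstract does not state which format the authors used.

```python
Q = 15
SCALE = 1 << Q  # 32768: Q15 stores x as round(x * 2**15) in a signed 16-bit word

def saturate(v):
    """Clamp an integer to the signed 16-bit range used by Q15."""
    return max(-SCALE, min(SCALE - 1, v))

def to_q15(x):
    """Quantize a float in [-1, 1) to Q15, saturating out-of-range values."""
    return saturate(int(round(x * SCALE)))

def q15_mul(a, b):
    """Multiply two Q15 values using integer operations only, with rounding."""
    return saturate((a * b + (1 << (Q - 1))) >> Q)

def from_q15(v):
    """Convert a Q15 integer back to a float (for inspection only)."""
    return v / SCALE

product = from_q15(q15_mul(to_q15(0.5), to_q15(0.25)))
```

The multiply needs only an integer product, an add, and a shift, which is why such formats suit feature-extraction loops on integer-only cores; the cost is a fixed quantization step of 2**-15 and the need to guard against overflow by saturation.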
NASA Astrophysics Data System (ADS)
Cassan, Arnaud
2017-07-01
The exoplanet detection rate from gravitational microlensing has grown significantly in recent years thanks to a great enhancement of resources and improved observational strategy. Current observatories include ground-based wide-field and/or robotic world-wide networks of telescopes, as well as space-based observatories such as satellites Spitzer or Kepler/K2. This results in a large quantity of data to be processed and analysed, which is a challenge for modelling codes because of the complexity of the parameter space to be explored and the intensive computations required to evaluate the models. In this work, I present a method that computes the quadrupole and hexadecapole approximations of the finite-source magnification more efficiently than previously available codes, with routines about six times and four times faster, respectively. The quadrupole takes just about twice the time of a point-source evaluation, which advocates for generalizing its use to large portions of the light curves. The corresponding routines are available as open-source Python codes.
Using Speech Recognition to Enhance the Tongue Drive System Functionality in Computer Access
Huo, Xueliang; Ghovanloo, Maysam
2013-01-01
Tongue Drive System (TDS) is a wireless tongue operated assistive technology (AT), which can enable people with severe physical disabilities to access computers and drive powered wheelchairs using their volitional tongue movements. TDS offers six discrete commands, simultaneously available to the users, for pointing and typing as a substitute for mouse and keyboard in computer access, respectively. To enhance the TDS performance in typing, we have added a microphone, an audio codec, and a wireless audio link to its readily available 3-axial magnetic sensor array, and combined it with a commercially available speech recognition software, the Dragon Naturally Speaking, which is regarded as one of the most efficient ways for text entry. Our preliminary evaluations indicate that the combined TDS and speech recognition technologies can provide end users with significantly higher performance than using each technology alone, particularly in completing tasks that require both pointing and text entry, such as web surfing. PMID:22255801
The large-scale three-point correlation function of the SDSS BOSS DR12 CMASS galaxies
NASA Astrophysics Data System (ADS)
Slepian, Zachary; Eisenstein, Daniel J.; Beutler, Florian; Chuang, Chia-Hsun; Cuesta, Antonio J.; Ge, Jian; Gil-Marín, Héctor; Ho, Shirley; Kitaura, Francisco-Shu; McBride, Cameron K.; Nichol, Robert C.; Percival, Will J.; Rodríguez-Torres, Sergio; Ross, Ashley J.; Scoccimarro, Román; Seo, Hee-Jong; Tinker, Jeremy; Tojeiro, Rita; Vargas-Magaña, Mariana
2017-06-01
We report a measurement of the large-scale three-point correlation function of galaxies using the largest data set for this purpose to date, 777 202 luminous red galaxies in the Sloan Digital Sky Survey Baryon Acoustic Oscillation Spectroscopic Survey (SDSS BOSS) DR12 CMASS sample. This work exploits the novel algorithm of Slepian & Eisenstein to compute the multipole moments of the 3PCF in O(N^2) time, with N the number of galaxies. Leading-order perturbation theory models the data well in a compressed basis where one triangle side is integrated out. We also present an accurate and computationally efficient means of estimating the covariance matrix. With these techniques, the redshift-space linear and non-linear bias are measured, with 2.6 per cent precision on the former if σ8 is fixed. The data also indicate a 2.8σ preference for the BAO, confirming the presence of BAO in the three-point function.
Automated Approach to Very High-Order Aeroacoustic Computations. Revision
NASA Technical Reports Server (NTRS)
Dyson, Rodger W.; Goodrich, John W.
2001-01-01
Computational aeroacoustics requires efficient, high-resolution simulation tools. For smooth problems, this is best accomplished with very high-order in space and time methods on small stencils. However, the complexity of highly accurate numerical methods can inhibit their practical application, especially in irregular geometries. This complexity is reduced by using a special form of Hermite divided-difference spatial interpolation on Cartesian grids, and a Cauchy-Kowalewski recursion procedure for time advancement. In addition, a stencil constraint tree reduces the complexity of interpolating grid points that are located near wall boundaries. These procedures are used to develop automatically and to implement very high-order methods (> 15) for solving the linearized Euler equations that can achieve less than one grid point per wavelength resolution away from boundaries by including spatial derivatives of the primitive variables at each grid point. The accuracy of stable surface treatments is currently limited to 11th order for grid aligned boundaries and to 2nd order for irregular boundaries.
An Automated Approach to Very High Order Aeroacoustic Computations in Complex Geometries
NASA Technical Reports Server (NTRS)
Dyson, Rodger W.; Goodrich, John W.
2000-01-01
Computational aeroacoustics requires efficient, high-resolution simulation tools. And for smooth problems, this is best accomplished with very high order in space and time methods on small stencils. But the complexity of highly accurate numerical methods can inhibit their practical application, especially in irregular geometries. This complexity is reduced by using a special form of Hermite divided-difference spatial interpolation on Cartesian grids, and a Cauchy-Kowalewski recursion procedure for time advancement. In addition, a stencil constraint tree reduces the complexity of interpolating grid points that are located near wall boundaries. These procedures are used to automatically develop and implement very high order methods (>15) for solving the linearized Euler equations that can achieve less than one grid point per wavelength resolution away from boundaries by including spatial derivatives of the primitive variables at each grid point. The accuracy of stable surface treatments is currently limited to 11th order for grid aligned boundaries and to 2nd order for irregular boundaries.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Computational Research Division, Lawrence Berkeley National Laboratory; NERSC, Lawrence Berkeley National Laboratory; Computer Science Department, University of California, Berkeley
2009-05-04
We apply auto-tuning to a hybrid MPI-pthreads lattice Boltzmann computation running on the Cray XT4 at National Energy Research Scientific Computing Center (NERSC). Previous work showed that multicore-specific auto-tuning can improve the performance of lattice Boltzmann magnetohydrodynamics (LBMHD) by a factor of 4x when running on dual- and quad-core Opteron dual-socket SMPs. We extend these studies to the distributed memory arena via a hybrid MPI/pthreads implementation. In addition to conventional auto-tuning at the local SMP node, we tune at the message-passing level to determine the optimal aspect ratio as well as the correct balance between MPI tasks and threads per MPI task. Our study presents a detailed performance analysis when moving along an isocurve of constant hardware usage: fixed total memory, total cores, and total nodes. Overall, our work points to approaches for improving intra- and inter-node efficiency on large-scale multicore systems for demanding scientific applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Atanassov, E.; Dimitrov, D., E-mail: d.slavov@bas.bg, E-mail: emanouil@parallel.bas.bg, E-mail: gurov@bas.bg; Gurov, T.
2015-10-28
The recent developments in the area of high-performance computing are driven not only by the desire for ever higher performance but also by the rising costs of electricity. The use of various types of accelerators like GPUs, Intel Xeon Phi has become mainstream and many algorithms and applications have been ported to make use of them where available. In Financial Mathematics the question of optimal use of computational resources should also take into account the limitations on space, because in many use cases the servers are deployed close to the exchanges. In this work we evaluate various algorithms for option pricing that we have implemented for different target architectures in terms of their energy and space efficiency. Since it has been established that low-discrepancy sequences may be better than pseudorandom numbers for these types of algorithms, we also test the Sobol and Halton sequences. We present the raw results, the computed metrics and conclusions from our tests.
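For readers unfamiliar with the low-discrepancy sequences mentioned in the abstract above, the sketch below generates a two-dimensional Halton sequence with the standard radical-inverse construction and uses it for a simple quasi-Monte Carlo estimate. The integrand (area of a quarter disc) is only a stand-in chosen for this example; it is not an option-pricing payoff and does not come from the paper.

```python
import math

def radical_inverse(index, base):
    """Van der Corput radical inverse: mirror the base-b digits of index
    about the radix point to obtain a value in [0, 1)."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result

def halton(n_points, bases=(2, 3)):
    """First n_points of the Halton sequence, one coprime base per dimension."""
    return [tuple(radical_inverse(i, b) for b in bases)
            for i in range(1, n_points + 1)]

def quarter_disc_area(points):
    """Fraction of points inside the unit quarter disc (exact area is pi/4)."""
    inside = sum(1 for x, y in points if x * x + y * y <= 1.0)
    return inside / len(points)

points = halton(4096)
estimate = quarter_disc_area(points)
error = abs(estimate - math.pi / 4)
```

The sequence is deterministic, so repeated runs give the same estimate; in practice a randomized variant is often used when an error estimate is needed.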
On computing the global time-optimal motions of robotic manipulators in the presence of obstacles
NASA Technical Reports Server (NTRS)
Shiller, Zvi; Dubowsky, Steven
1991-01-01
A method for computing the time-optimal motions of robotic manipulators is presented that considers the nonlinear manipulator dynamics, actuator constraints, joint limits, and obstacles. The optimization problem is reduced to a search for the time-optimal path in the n-dimensional position space. A small set of near-optimal paths is first efficiently selected from a grid, using a branch and bound search and a series of lower bound estimates on the traveling time along a given path. These paths are further optimized with a local path optimization to yield the global optimal solution. Obstacles are considered by eliminating the collision points from the tessellated space and by adding a penalty function to the motion time in the local optimization. The computational efficiency of the method stems from the reduced dimensionality of the searched space and from combining the grid search with a local optimization. The method is demonstrated in several examples for two- and six-degree-of-freedom manipulators with obstacles.
Solution of nonlinear time-dependent PDEs through componentwise approximation of matrix functions
NASA Astrophysics Data System (ADS)
Cibotarica, Alexandru; Lambers, James V.; Palchak, Elisabeth M.
2016-09-01
Exponential propagation iterative (EPI) methods provide an efficient approach to the solution of large stiff systems of ODEs, compared to standard integrators. However, the bulk of the computational effort in these methods is due to products of matrix functions and vectors, which can become very costly at high resolution due to an increase in the number of Krylov projection steps needed to maintain accuracy. In this paper, it is proposed to modify EPI methods by using Krylov subspace spectral (KSS) methods, instead of standard Krylov projection methods, to compute products of matrix functions and vectors. Numerical experiments demonstrate that this modification causes the number of Krylov projection steps to become bounded independently of the grid size, thus dramatically improving efficiency and scalability. As a result, for each test problem featured, as the total number of grid points increases, the growth in computation time is just below linear, while other methods achieved this only on selected test problems or not at all.
NASA Astrophysics Data System (ADS)
Atanassov, E.; Dimitrov, D.; Gurov, T.
2015-10-01
The recent developments in the area of high-performance computing are driven not only by the desire for ever higher performance but also by the rising costs of electricity. The use of various types of accelerators like GPUs, Intel Xeon Phi has become mainstream and many algorithms and applications have been ported to make use of them where available. In Financial Mathematics the question of optimal use of computational resources should also take into account the limitations on space, because in many use cases the servers are deployed close to the exchanges. In this work we evaluate various algorithms for option pricing that we have implemented for different target architectures in terms of their energy and space efficiency. Since it has been established that low-discrepancy sequences may be better than pseudorandom numbers for these types of algorithms, we also test the Sobol and Halton sequences. We present the raw results, the computed metrics and conclusions from our tests.
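As an illustration of the low-discrepancy idea mentioned in the abstract above, the following is a minimal sketch of a Halton sequence (built from the radical inverse) used to price a European call under Black-Scholes dynamics. The function names, the choice of payoff, and the parameter values are assumptions made for this example; the abstract does not state which option types or implementations the authors evaluated.

```python
from math import exp, sqrt
from statistics import NormalDist


def radical_inverse(index, base):
    """Mirror the base-`base` digits of `index` about the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n, bases=(2, 3)):
    """First n Halton points (index 0 is skipped so no coordinate is 0)."""
    return [tuple(radical_inverse(i, b) for b in bases) for i in range(1, n + 1)]


def qmc_european_call(s0, strike, rate, sigma, maturity, n):
    """Quasi-Monte Carlo price of a European call (illustrative assumption).

    Uses the one-dimensional Halton sequence in base 2 and maps each point
    to a standard normal draw through the inverse CDF.
    """
    inv_cdf = NormalDist().inv_cdf
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * sqrt(maturity)
    total = 0.0
    for i in range(1, n + 1):
        z = inv_cdf(radical_inverse(i, 2))
        total += max(s0 * exp(drift + vol * z) - strike, 0.0)
    return exp(-rate * maturity) * total / n
```

For a single-asset European option only one dimension is needed; path-dependent payoffs would use one Halton coordinate (one prime base) per time step.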
NASA Technical Reports Server (NTRS)
Newman, M. B.; Pipano, A.
1973-01-01
A new eigensolution routine, FEER (Fast Eigensolution Extraction Routine), used in conjunction with NASTRAN at Israel Aircraft Industries is described. The FEER program is based on an automatic matrix reduction scheme whereby the lower modes of structures with many degrees of freedom can be accurately extracted from a tridiagonal eigenvalue problem whose size is of the same order of magnitude as the number of required modes. The process is effected without arbitrary lumping of masses at selected node points or selection of nodes to be retained in the analysis set. The results of computational efficiency studies are presented, showing major arithmetic operation counts and actual computer run times of FEER as compared to other methods of eigenvalue extraction, including those available in the NASTRAN READ module. It is concluded that the tridiagonal reduction method used in FEER would serve as a valuable addition to NASTRAN for highly increased efficiency in obtaining structural vibration modes.
NASA Astrophysics Data System (ADS)
Yi, Jin; Li, Xinyu; Xiao, Mi; Xu, Junnan; Zhang, Lin
2017-01-01
Engineering design often involves different types of simulation, which results in expensive computational costs. Variable fidelity approximation-based design optimization approaches can realize effective simulation and efficiency optimization of the design space using approximation models with different levels of fidelity and have been widely used in different fields. As the foundations of variable fidelity approximation models, the selection of sample points of variable-fidelity approximation, called nested designs, is essential. In this article a novel nested maximin Latin hypercube design is constructed based on successive local enumeration and a modified novel global harmony search algorithm. In the proposed nested designs, successive local enumeration is employed to select sample points for a low-fidelity model, whereas the modified novel global harmony search algorithm is employed to select sample points for a high-fidelity model. A comparative study with multiple criteria and an engineering application are employed to verify the efficiency of the proposed nested designs approach.
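To make the maximin Latin hypercube criterion concrete, here is a minimal sketch that draws random Latin hypercube designs and keeps the one with the largest minimum pairwise distance. This is plain random search, used here only as a stand-in: it is not the successive local enumeration or the modified harmony search described in the abstract, and it does not build the nested low- and high-fidelity structure. All function names are hypothetical.

```python
import random
from itertools import combinations
from math import dist


def latin_hypercube(n, d, rng):
    """One random Latin hypercube: each dimension uses every one of n strata once."""
    columns = []
    for _ in range(d):
        perm = list(range(n))
        rng.shuffle(perm)
        columns.append(perm)
    # Place each point at the centre of its stratum in every dimension.
    return [tuple((columns[k][i] + 0.5) / n for k in range(d)) for i in range(n)]


def min_distance(points):
    """Maximin criterion: the smallest pairwise Euclidean distance."""
    return min(dist(p, q) for p, q in combinations(points, 2))


def best_of_random_lhs(n, d, trials=200, seed=0):
    """Keep the random design whose closest pair of points is farthest apart."""
    rng = random.Random(seed)
    designs = (latin_hypercube(n, d, rng) for _ in range(trials))
    return max(designs, key=min_distance)
```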
NASA Astrophysics Data System (ADS)
Chaidee, S.; Pakawanwong, P.; Suppakitpaisarn, V.; Teerasawat, P.
2017-09-01
In this work, we devise an efficient method for the land-use optimization problem based on the Laguerre Voronoi diagram. Previous Voronoi diagram-based methods are more efficient and more suitable for interactive design than discrete optimization-based methods, but, in many cases, their outputs do not satisfy area constraints. To cope with the problem, we propose a force-directed graph drawing algorithm, which automatically allocates the generating points of the Voronoi diagram to appropriate positions. Then, we construct a Laguerre Voronoi diagram based on these generating points, use linear programs to adjust each cell, and reconstruct the diagram based on the adjustment. We apply the proposed method to the practical case study of Chiang Mai University's allocated land for a mixed-use complex. For this case study, compared to other Voronoi diagram-based methods, we decrease the land allocation error by 62.557%. Although our computation time is larger than that of the previous Voronoi diagram-based method, it is still suitable for interactive design.
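For readers unfamiliar with Laguerre Voronoi (power) diagrams, the sketch below shows the cell-membership rule only: a location belongs to the generating point with the smallest power distance, i.e. squared Euclidean distance minus that point's weight. The convention that the weight is a squared radius, and the function names, are assumptions for illustration; the force-directed placement and linear-program adjustment from the abstract are not reproduced here.

```python
def power_distance(point, site, weight):
    """Squared Euclidean distance to the site minus the site's weight."""
    return sum((a - b) ** 2 for a, b in zip(point, site)) - weight


def laguerre_cell(point, sites, weights):
    """Index of the generating point whose Laguerre cell contains `point`."""
    return min(
        range(len(sites)),
        key=lambda i: power_distance(point, sites[i], weights[i]),
    )
```

Raising the weight of a site moves its cell boundaries outward, which is the handle that lets cell areas be adjusted toward target values without moving the generating points.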
Vehicle trajectory linearisation to enable efficient optimisation of the constant speed racing line
NASA Astrophysics Data System (ADS)
Timings, Julian P.; Cole, David J.
2012-06-01
A driver model is presented capable of optimising the trajectory of a simple dynamic nonlinear vehicle, at constant forward speed, so that progression along a predefined track is maximised as a function of time. In doing so, the model is able to continually operate a vehicle at its lateral-handling limit, maximising vehicle performance. The technique used forms a part of the solution to the motor racing objective of minimising lap time. A new approach of formulating the minimum lap time problem is motivated by the need for a more computationally efficient and robust tool-set for understanding on-the-limit driving behaviour. This has been achieved through set point-dependent linearisation of the vehicle model and coupling the vehicle-track system using an intrinsic coordinate description. Through this, the geometric vehicle trajectory has been linearised relative to the track reference, leading to a new path optimisation algorithm which can be formulated as a computationally efficient convex quadratic programming problem.
A power-efficient ZF precoding scheme for multi-user indoor visible light communication systems
NASA Astrophysics Data System (ADS)
Zhao, Qiong; Fan, Yangyu; Deng, Lijun; Kang, Bochao
2017-02-01
In this study, we propose a power-efficient ZF precoding scheme for visible light communication (VLC) downlink multi-user multiple-input-single-output (MU-MISO) systems, which incorporates zero-forcing (ZF) and the characteristics of VLC systems. The main idea of this scheme is that the channel matrix used to compute the pseudoinverse comes from the set of optical Access Points (APs) shared by more than one user, instead of the set of all involved serving APs, as existing ZF precoding schemes often use. By doing this, the waste of power caused by transmitting one user's data from APs that do not serve that user can be avoided. In addition, the channel matrix that must be pseudoinverted becomes smaller, which helps to reduce the computation complexity. Simulation results in two scenarios show that the proposed ZF precoding scheme has higher power efficiency, better bit error rate (BER) performance and lower computation complexity compared with traditional ZF precoding schemes.
Refinement of Methods for Evaluation of Near-Hypersingular Integrals in BEM Formulations
NASA Technical Reports Server (NTRS)
Fink, Patricia W.; Khayat, Michael A.; Wilton, Donald R.
2006-01-01
In this paper, we present advances in singularity cancellation techniques applied to integrals in BEM formulations that are nearly hypersingular. Significant advances have been made recently in singularity cancellation techniques applied to 1/R-type kernels [M. Khayat, D. Wilton, IEEE Trans. Antennas and Prop., 53, pp. 3180-3190, 2005], as well as to the gradients of these kernels [P. Fink, D. Wilton, and M. Khayat, Proc. ICEAA, pp. 861-864, Torino, Italy, 2005] on curved subdomains. In these approaches, the source triangle is divided into three tangent subtriangles with a common vertex at the normal projection of the observation point onto the source element or the extended surface containing it. The geometry of a typical tangent subtriangle and its local rectangular coordinate system with origin at the projected observation point is shown in Fig. 1. Whereas singularity cancellation techniques for 1/R-type kernels are now nearing maturity, the efficient handling of near-hypersingular kernels still needs attention. For example, in the gradient reference above, techniques are presented for computing the normal component of the gradient relative to the plane containing the tangent subtriangle. These techniques, summarized in the transformations in Table 1, are applied at the sub-triangle level and correspond particularly to the case in which the normal projection of the observation point lies within the boundary of the source element. They are found to be highly efficient as z approaches zero. Here, we extend the approach to cover two instances not previously addressed. First, we consider the case in which the normal projection of the observation point lies external to the source element. For such cases, we find that simple modifications to the transformations of Table 1 permit significant savings in computational cost.
Second, we present techniques that permit accurate computation of the tangential components of the gradient; i.e., tangent to the plane containing the source element.
Spatially explicit spectral analysis of point clouds and geospatial data
Buscombe, Daniel D.
2015-01-01
The increasing use of spatially explicit analyses of high-resolution spatially distributed data (imagery and point clouds) for the purposes of characterising spatial heterogeneity in geophysical phenomena necessitates the development of custom analytical and computational tools. In recent years, such analyses have become the basis of, for example, automated texture characterisation and segmentation, roughness and grain size calculation, and feature detection and classification, from a variety of data types. In this work, much use has been made of statistical descriptors of localised spatial variations in amplitude variance (roughness); however, the horizontal scale (wavelength) and spacing of roughness elements are rarely considered. This is despite the fact that the ratio of characteristic vertical to horizontal scales is not constant and can yield important information about physical scaling relationships. Spectral analysis is a hitherto under-utilised but powerful means to acquire statistical information about relevant amplitude and wavelength scales, simultaneously and with computational efficiency. Further, quantifying spatially distributed data in the frequency domain lends itself to the development of stochastic models for probing the underlying mechanisms which govern the spatial distribution of geological and geophysical phenomena. The software package PySESA (Python program for Spatially Explicit Spectral Analysis) has been developed for generic analyses of spatially distributed data in both the spatial and frequency domains. Developed predominantly in Python, it accesses libraries written in Cython and C++ for efficiency. It is open source and modular, and is therefore readily incorporated into, and combined with, other data analysis tools and frameworks, with particular utility for supporting research in the fields of geomorphology, geophysics, hydrography, photogrammetry and remote sensing.
The analytical and computational structure of the toolbox is described, and its functionality is illustrated with an example of high-resolution bathymetric point cloud data collected with a multibeam echosounder.
Performance analysis of a dual-tree algorithm for computing spatial distance histograms
Chen, Shaoping; Tu, Yi-Cheng; Xia, Yuni
2011-01-01
Many scientific and engineering fields produce large volumes of spatiotemporal data. The storage, retrieval, and analysis of such data impose great challenges on database systems design. Analysis of scientific spatiotemporal data often involves computing functions of all point-to-point interactions. One such analytic, the Spatial Distance Histogram (SDH), is of vital importance to scientific discovery. Recently, algorithms for efficient SDH processing in large-scale scientific databases have been proposed. These algorithms adopt a recursive tree-traversing strategy to process point-to-point distances in the visited tree nodes in batches, and thus require less time than the brute-force approach, in which all pairwise distances have to be computed. Despite the promising experimental results, the complexity of such algorithms has not been thoroughly studied. In this paper, we present an analysis of such algorithms based on a geometric modeling approach. The main technique is to transform the analysis of point counts into a problem of quantifying the area of regions where pairwise distances can be processed in batches by the algorithm. From the analysis, we conclude that the number of pairwise distances that are left to be processed decreases exponentially with more levels of the tree visited. This leads to the proof of a time complexity lower than the quadratic time needed for a brute-force algorithm and builds the foundation for a constant-time approximate algorithm. Our model is also general in that it works for a wide range of point spatial distributions, histogram types, and space-partitioning options in building the tree. PMID:21804753
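The brute-force baseline that the abstract above compares against can be sketched in a few lines: every pair of points is visited once and its distance is dropped into a fixed-width bucket, which costs quadratic time in the number of points. The function name, the equal-width buckets, and the choice to clamp large distances into the last bucket are assumptions for this sketch; the tree-based batch algorithm analysed in the paper is not shown.

```python
from itertools import combinations
from math import dist


def spatial_distance_histogram(points, bucket_width, num_buckets):
    """Brute-force SDH: count pairwise distances per equal-width bucket."""
    counts = [0] * num_buckets
    for p, q in combinations(points, 2):
        # Distances beyond the last bucket edge are clamped into the last bucket.
        bucket = min(int(dist(p, q) / bucket_width), num_buckets - 1)
        counts[bucket] += 1
    return counts
```

With N points the loop runs N(N-1)/2 times, which is the quadratic cost the dual-tree approach is designed to avoid.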
NASA Astrophysics Data System (ADS)
Lari, Z.; El-Sheimy, N.
2017-09-01
In recent years, the increasing incidence of climate-related disasters has tremendously affected our environment. In order to effectively manage and reduce dramatic impacts of such events, the development of timely disaster management plans is essential. Since these disasters are spatial phenomena, timely provision of geospatial information is crucial for effective development of response and management plans. Due to inaccessibility of the affected areas and limited budget of first-responders, timely acquisition of the required geospatial data for these applications is usually possible only using low-cost imaging and georeferencing sensors mounted on unmanned platforms. Despite rapid collection of the required data using these systems, available processing techniques are not yet capable of delivering geospatial information to responders and decision makers in a timely manner. To address this issue, this paper introduces a new technique for dense 3D reconstruction of the affected scenes which can deliver and improve the needed geospatial information incrementally. This approach is implemented based on prior 3D knowledge of the scene and employs computationally-efficient 2D triangulation, feature descriptor, feature matching and point verification techniques to optimize and speed up the dense 3D scene reconstruction procedure. To verify the feasibility and computational efficiency of the proposed approach, an experiment using a set of consecutive images collected onboard a UAV platform and prior low-density airborne laser scanning over the same area is conducted and step-by-step results are provided. A comparative analysis of the proposed approach and an available image-based dense reconstruction technique is also conducted to prove the computational efficiency and competency of this technique for delivering geospatial information with pre-specified accuracy.
A fast marching algorithm for the factored eikonal equation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Treister, Eran, E-mail: erantreister@gmail.com; Haber, Eldad, E-mail: haber@math.ubc.ca; Department of Mathematics, The University of British Columbia, Vancouver, BC
The eikonal equation is instrumental in many applications in several fields ranging from computer vision to geoscience. This equation can be efficiently solved using the iterative Fast Sweeping (FS) methods and the direct Fast Marching (FM) methods. However, when used for a point source, the original eikonal equation is known to yield inaccurate numerical solutions, because of a singularity at the source. In this case, the factored eikonal equation is often preferred, and is known to yield a more accurate numerical solution. One application that requires the solution of the eikonal equation for point sources is travel time tomography. This inverse problem may be formulated using the eikonal equation as a forward problem. While this problem has been solved using FS in the past, the more recent choice for applying it involves FM methods because of the efficiency with which sensitivities can be obtained using them. However, while several FS methods are available for solving the factored equation, the FM method is available only for the original eikonal equation. In this paper we develop a Fast Marching algorithm for the factored eikonal equation, using both first and second order finite-difference schemes. Our algorithm follows the same lines as the original FM algorithm and requires the same computational effort. In addition, we show how to obtain sensitivities using this FM method and apply travel time tomography, formulated as an inverse factored eikonal equation. Numerical results in two and three dimensions show that our algorithm solves the factored eikonal equation efficiently, and demonstrate the achieved accuracy for computing the travel time. We also demonstrate a recovery of a 2D and 3D heterogeneous medium by travel time tomography using the eikonal equation for forward modeling and inversion by Gauss–Newton.
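For orientation, the factorization referred to above is commonly written as follows. The notation (travel time T, slowness κ, source location x_0, distance function T_0, and smooth factor τ) is assumed here for illustration and is not quoted from the abstract.

```latex
% Original eikonal equation for a point source at x_0:
\[
  |\nabla T(x)|^2 = \kappa(x)^2, \qquad T(x_0) = 0 .
\]
% Multiplicative factoring with the distance function T_0(x) = \|x - x_0\|_2,
% which carries the singularity at the source and satisfies |\nabla T_0| = 1:
\[
  T(x) = T_0(x)\,\tau(x).
\]
% Substituting \nabla T = T_0 \nabla\tau + \tau \nabla T_0 gives the factored
% equation for the smooth unknown \tau:
\[
  T_0^2\,|\nabla\tau|^2 + 2\,T_0\,\tau\,\nabla T_0\cdot\nabla\tau + \tau^2 = \kappa^2 .
\]
```

Because τ stays smooth near the source, finite-difference approximations of τ avoid the large truncation error incurred when T itself is differenced there.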
Soma, David B; Homme, Jason H; Jacobson, Robert M
2013-01-01
We sought to determine if tablet computers, supported by a laboratory experience focused upon skill development, would improve not only evidence-based medicine (EBM) knowledge but also skills and behavior. We conducted a prospective cohort study where we provided tablet computers to our pediatric residents and then held a series of laboratory sessions focused on speed and efficiency in performing EBM at the bedside. We evaluated the intervention with pre- and postintervention tests and surveys based on a validated tool available for use on MedEdPORTAL. The attending pediatric hospitalists also completed surveys regarding their observations of the residents' behavior. All 38 pediatric residents completed the preintervention test and the pre- and postintervention surveys. All but one completed the posttest. All 7 attending pediatric hospitalists completed their surveys. The testing, targeted to assess EBM knowledge, revealed a median increase of 16 points out of a possible 60 points (P < .0001). We found substantial increases in individual residents' test scores across all 3 years of residency. Resident responses demonstrated statistically significant improvements in self-reported comfort with 6 out of 6 EBM skills and statistically significant increases in self-reported frequencies for 4 out of 7 EBM behaviors. Attending pediatric hospitalists reported improvements in 5 of 7 resident behaviors. This novel approach for teaching EBM to pediatric residents improved knowledge, skills, and behavior through the introduction of a tablet computer and laboratory sessions designed to teach the quick and efficient application of EBM at the bedside. Copyright © 2013 Academic Pediatric Association. Published by Elsevier Inc. All rights reserved.
Implementing Molecular Dynamics on Hybrid High Performance Computers - Three-Body Potentials
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, W Michael; Yamada, Masako
The use of coprocessors or accelerators such as graphics processing units (GPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high-performance computers, defined as machines with nodes containing more than one type of floating-point processor (e.g. CPU and GPU), are now becoming more prevalent due to these advantages. Although there has been extensive research into methods to efficiently use accelerators to improve the performance of molecular dynamics (MD) employing pairwise potential energy models, little is reported in the literature for models that include many-body effects. 3-body terms are required for many popular potentials such as MEAM, Tersoff, REBO, AIREBO, Stillinger-Weber, Bond-Order Potentials, and others. Because the per-atom simulation times are much higher for models incorporating 3-body terms, there is a clear need for efficient algorithms usable on hybrid high performance computers. Here, we report a shared-memory force-decomposition for 3-body potentials that avoids memory conflicts to allow for a deterministic code with substantial performance improvements on hybrid machines. We describe modifications necessary for use in distributed memory MD codes and show results for the simulation of water with Stillinger-Weber on the hybrid Titan supercomputer. We compare performance of the 3-body model to the SPC/E water model when using accelerators. Finally, we demonstrate that our approach can attain a speedup of 5.1 with acceleration on Titan for production simulations to study water droplet freezing on a surface.
Fast Simulation of Dynamic Ultrasound Images Using the GPU.
Storve, Sigurd; Torp, Hans
2017-10-01
Simulated ultrasound data is a valuable tool for development and validation of quantitative image analysis methods in echocardiography. Unfortunately, simulation time can become prohibitive for phantoms consisting of a large number of point scatterers. The COLE algorithm by Gao et al. is a fast convolution-based simulator that trades simulation accuracy for improved speed. We present highly efficient parallelized CPU and GPU implementations of the COLE algorithm with an emphasis on dynamic simulations involving moving point scatterers. We argue that it is crucial to minimize the amount of data transfers from the CPU to achieve good performance on the GPU. We achieve this by storing the complete trajectories of the dynamic point scatterers as spline curves in the GPU memory. This leads to good efficiency when simulating sequences consisting of a large number of frames, such as B-mode and tissue Doppler data for a full cardiac cycle. In addition, we propose a phase-based subsample delay technique that efficiently eliminates flickering artifacts seen in B-mode sequences when COLE is used without enough temporal oversampling. To assess the performance, we used a laptop computer and a desktop computer, each equipped with a multicore Intel CPU and an NVIDIA GPU. Running the simulator on a high-end TITAN X GPU, we observed two orders of magnitude speedup compared to the parallel CPU version, three orders of magnitude speedup compared to simulation times reported by Gao et al. in their paper on COLE, and a speedup of 27000 times compared to the multithreaded version of Field II, using numbers reported in a paper by Jensen. We hope that by releasing the simulator as an open-source project we will encourage its use and further development.
A rational interpolation method to compute frequency response
NASA Technical Reports Server (NTRS)
Kenney, Charles; Stubberud, Stephen; Laub, Alan J.
1993-01-01
A rational interpolation method for approximating a frequency response is presented. The method is based on a product formulation of finite differences, thereby avoiding the numerical problems incurred by near-equal-valued subtraction. Also, resonant pole and zero cancellation schemes are developed that increase the accuracy and efficiency of the interpolation method. Selection techniques of interpolation points are also discussed.
NASA Astrophysics Data System (ADS)
Latypov, Marat I.; Kalidindi, Surya R.
2017-10-01
There is a critical need for the development and verification of practically useful multiscale modeling strategies for simulating the mechanical response of multiphase metallic materials with heterogeneous microstructures. In this contribution, we present data-driven reduced order models for effective yield strength and strain partitioning in such microstructures. These models are built employing the recently developed framework of Materials Knowledge Systems that employ 2-point spatial correlations (or 2-point statistics) for the quantification of the heterostructures and principal component analyses for their low-dimensional representation. The models are calibrated to a large collection of finite element (FE) results obtained for a diverse range of microstructures with various sizes, shapes, and volume fractions of the phases. The performance of the models is evaluated by comparing the predictions of yield strength and strain partitioning in two-phase materials with the corresponding predictions from a classical self-consistent model as well as results of full-field FE simulations. The reduced-order models developed in this work show an excellent combination of accuracy and computational efficiency, and therefore present an important advance towards computationally efficient microstructure-sensitive multiscale modeling frameworks.
Selection of Thermal Worst-Case Orbits via Modified Efficient Global Optimization
NASA Technical Reports Server (NTRS)
Moeller, Timothy M.; Wilhite, Alan W.; Liles, Kaitlin A.
2014-01-01
Efficient Global Optimization (EGO) was used to select orbits with worst-case hot and cold thermal environments for the Stratospheric Aerosol and Gas Experiment (SAGE) III. The SAGE III system thermal model changed substantially since the previous selection of worst-case orbits (which did not use the EGO method), so the selections were revised to ensure the worst cases are being captured. The EGO method consists of first conducting an initial set of parametric runs, generated with a space-filling Design of Experiments (DoE) method, then fitting a surrogate model to the data and searching for points of maximum Expected Improvement (EI) to conduct additional runs. The general EGO method was modified by using a multi-start optimizer to identify multiple new test points at each iteration. This modification facilitates parallel computing and decreases the burden of user interaction when the optimizer code is not integrated with the model. Thermal worst-case orbits for SAGE III were successfully identified and shown by direct comparison to be more severe than those identified in the previous selection. The EGO method is a useful tool for this application and can result in computational savings if the initial Design of Experiments (DoE) is selected appropriately.
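The Expected Improvement criterion named above has a standard closed form when the surrogate's prediction at a candidate point is Gaussian. The sketch below writes it for a maximization problem (for example, searching for the hottest case); the function name and argument names are assumptions, and the surrogate model and multi-start optimizer from the abstract are not reproduced.

```python
from statistics import NormalDist


def expected_improvement(mean, std, best_so_far):
    """Closed-form EI for maximization under a Gaussian surrogate prediction.

    mean, std   : surrogate prediction and its standard deviation at a candidate
    best_so_far : largest objective value observed in the runs made so far
    """
    improvement = mean - best_so_far
    if std <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(improvement, 0.0)
    z = improvement / std
    standard_normal = NormalDist()
    return improvement * standard_normal.cdf(z) + std * standard_normal.pdf(z)
```

The first term rewards candidates predicted to beat the current best, and the second rewards candidates where the surrogate is uncertain, which is how EGO balances exploitation against exploration. For a worst-case cold search the same formula can be applied to the negated objective.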
NASA Astrophysics Data System (ADS)
Kim, Jeong-Gyu; Kim, Woong-Tae; Ostriker, Eve C.; Skinner, M. Aaron
2017-12-01
We present an implementation of an adaptive ray-tracing (ART) module in the Athena hydrodynamics code that accurately and efficiently handles the radiative transfer involving multiple point sources on a three-dimensional Cartesian grid. We adopt a recently proposed parallel algorithm that uses nonblocking, asynchronous MPI communications to accelerate transport of rays across the computational domain. We validate our implementation through several standard test problems, including the propagation of radiation in vacuum and the expansions of various types of H II regions. Additionally, scaling tests show that the cost of a full ray trace per source remains comparable to that of the hydrodynamics update on up to ∼10^3 processors. To demonstrate application of our ART implementation, we perform a simulation of star cluster formation in a marginally bound, turbulent cloud, finding that its star formation efficiency is 12% when both radiation pressure forces and photoionization by UV radiation are treated. We directly compare the radiation forces computed from the ART scheme with those from the M1 closure relation. Although the ART and M1 schemes yield similar results on large scales, the latter is unable to resolve the radiation field accurately near individual point sources.
Geodesic Distance Algorithm for Extracting the Ascending Aorta from 3D CT Images
Jang, Yeonggul; Jung, Ho Yub; Hong, Youngtaek; Cho, Iksung; Shim, Hackjoon; Chang, Hyuk-Jae
2016-01-01
This paper presents a method for the automatic 3D segmentation of the ascending aorta from coronary computed tomography angiography (CCTA). The segmentation is performed in three steps. First, the initial seed points are selected by minimizing a newly proposed energy function across the Hough circles. Second, the ascending aorta is segmented by geodesic distance transformation. Third, the seed points are effectively transferred through the next axial slice by a novel transfer function. Experiments are performed using a database composed of 10 patients' CCTA images. For the experiment, the ground truths are annotated manually on the axial image slices by a medical expert. A comparative evaluation with state-of-the-art commercial aorta segmentation algorithms shows that our approach is computationally more efficient and accurate under the DSC (Dice Similarity Coefficient) measurements. PMID:26904151
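The Dice Similarity Coefficient used for the evaluation above is twice the size of the overlap divided by the sum of the two region sizes. A minimal sketch follows, representing each segmentation as a collection of voxel coordinates; this representation, the function name, and the convention of returning 1.0 for two empty masks are assumptions for the example.

```python
def dice_coefficient(segmentation, ground_truth):
    """Dice Similarity Coefficient between two sets of voxel coordinates."""
    a, b = set(segmentation), set(ground_truth)
    if not a and not b:
        # Two empty masks are treated as a perfect match (a convention).
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))
```

A value of 1 means the automatic segmentation and the manual annotation coincide exactly, and 0 means they share no voxels.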
NASA Astrophysics Data System (ADS)
Ge, Xuming
2017-08-01
The coarse registration of point clouds from urban building scenes has become a key topic in applications of terrestrial laser scanning technology. Sampling-based algorithms in the random sample consensus (RANSAC) model have emerged as mainstream solutions to address coarse registration problems. In this paper, we propose a novel combined solution to automatically align two markerless point clouds from building scenes. Firstly, the method segments non-ground points from ground points. Secondly, the proposed method detects feature points from each cross section and then obtains semantic keypoints by connecting feature points with specific rules. Finally, the detected semantic keypoints from two point clouds act as inputs to a modified 4PCS algorithm. Examples are presented and the results compared with those of K-4PCS to demonstrate the main contributions of the proposed method, which are the extension of the original 4PCS to handle heavy datasets and the use of semantic keypoints to improve K-4PCS in relation to registration accuracy and computational efficiency.
Computational fluid dynamics uses in fluid dynamics/aerodynamics education
NASA Technical Reports Server (NTRS)
Holst, Terry L.
1994-01-01
The field of computational fluid dynamics (CFD) has advanced to the point where it can now be used for the purpose of fluid dynamics physics education. Because of the tremendous wealth of information available from numerical simulation, certain fundamental concepts can be efficiently communicated using an interactive graphical interrogation of the appropriate numerical simulation data base. In other situations, a large amount of aerodynamic information can be communicated to the student by interactive use of simple CFD tools on a workstation or even in a personal computer environment. The emphasis in this presentation is to discuss ideas for how this process might be implemented. Specific examples, taken from previous publications, will be used to highlight the presentation.
A Worst-Case Approach for On-Line Flutter Prediction
NASA Technical Reports Server (NTRS)
Lind, Rick C.; Brenner, Martin J.
1998-01-01
Worst-case flutter margins may be computed for a linear model with respect to a set of uncertainty operators using the structured singular value. This paper considers an on-line implementation to compute these robust margins in a flight test program. Uncertainty descriptions are updated at test points to account for unmodeled time-varying dynamics of the airplane by ensuring the robust model is not invalidated by measured flight data. Robust margins computed with respect to this uncertainty remain conservative to the changing dynamics throughout the flight. A simulation clearly demonstrates this method can improve the efficiency of flight testing by accurately predicting the flutter margin to improve safety while reducing the necessary flight time.
NASA Technical Reports Server (NTRS)
Cebeci, T.; Carr, L. W.
1978-01-01
A computer program is described which provides solutions of two dimensional equations appropriate to laminar and turbulent boundary layers for boundary conditions with an external flow which fluctuates in magnitude. The program is based on the numerical solution of the governing boundary layer equations by an efficient two point finite difference method. An eddy viscosity formulation was used to model the Reynolds shear stress term. The main features of the method are briefly described and instructions for the computer program with a listing are provided. Sample calculations to demonstrate its usage and capabilities for laminar and turbulent unsteady boundary layers with an external flow which fluctuated in magnitude are presented.
A vectorized algorithm for 3D dynamics of a tethered satellite
NASA Technical Reports Server (NTRS)
Wilson, Howard B.
1989-01-01
Equations of motion characterizing the three dimensional motion of a tethered satellite during the retrieval phase are studied. The mathematical model involves an arbitrary number of point masses connected by weightless cords. Motion occurs in a gravity gradient field. The formulation presented accounts for general functions describing support point motion, rate of tether retrieval, and arbitrary forces applied to the point masses. The matrix oriented program language MATLAB is used to produce an efficient vectorized formulation for computing natural frequencies and mode shapes for small oscillations about the static equilibrium configuration; and for integrating the nonlinear differential equations governing large amplitude motions. An example of time response pertaining to the skip rope effect is investigated.
NASA Technical Reports Server (NTRS)
Slobin, S. D.; Bathker, D. A.
1988-01-01
The gain, phase, and pointing performance of the Deep Space Network (DSN) 70 m antennas are investigated using theoretical antenna analysis computer programs that consider the gravity induced deformation of the antenna surface and quadripod structure. The microwave effects are calculated for normal subreflector focusing motion and for special fixed-subreflector conditions that may be used during the Voyager 2 Neptune encounter. The frequency stability effects of stepwise lateral and axial subreflector motions are also described. Comparisons with recently measured antenna efficiency and subreflector motion tests are presented. A modification to the existing 70 m antenna pointing squint correction constant is proposed.
Energy-efficient STDP-based learning circuits with memristor synapses
NASA Astrophysics Data System (ADS)
Wu, Xinyu; Saxena, Vishal; Campbell, Kristy A.
2014-05-01
It is now accepted that the traditional von Neumann architecture, with processor and memory separation, is ill suited to process parallel data streams which a mammalian brain can efficiently handle. Moreover, researchers now envision computing architectures which enable cognitive processing of massive amounts of data by identifying spatio-temporal relationships in real time and solving complex pattern recognition problems. Memristor cross-point arrays, integrated with standard CMOS technology, are expected to result in massively parallel and low-power neuromorphic computing architectures. Recently, significant progress has been made in spiking neural networks (SNN) which emulate data processing in the cortical brain. These architectures comprise a dense network of neurons and the synapses formed between the axons and dendrites. Further, unsupervised or supervised competitive learning schemes are being investigated for global training of the network. In contrast to a software implementation, hardware realization of these networks requires massive circuit overhead for addressing and individually updating network weights. Instead, we employ bio-inspired learning rules such as spike-timing-dependent plasticity (STDP) to efficiently update the network weights locally. To realize SNNs on a chip, we propose to use densely integrated mixed-signal integrate-and-fire neurons (IFNs) and cross-point arrays of memristors in the back-end-of-the-line (BEOL) of CMOS chips. Novel IFN circuits have been designed to drive memristive synapses in parallel while maintaining overall power efficiency (<1 pJ/spike/synapse), even at spike rates greater than 10 MHz. We present circuit design details and simulation results of the IFN with memristor synapses, its response to incoming spike trains and STDP learning characterization.
Implicit method for the computation of unsteady flows on unstructured grids
NASA Technical Reports Server (NTRS)
Venkatakrishnan, V.; Mavriplis, D. J.
1995-01-01
An implicit method for the computation of unsteady flows on unstructured grids is presented. Following a finite difference approximation for the time derivative, the resulting nonlinear system of equations is solved at each time step by using an agglomeration multigrid procedure. The method allows for arbitrarily large time steps and is efficient in terms of computational effort and storage. Inviscid and viscous unsteady flows are computed to validate the procedure. The issue of the mass matrix which arises with vertex-centered finite volume schemes is addressed. The present formulation allows the mass matrix to be inverted indirectly. A mesh point movement and reconnection procedure is described that allows the grids to evolve with the motion of bodies. As an example of flow over bodies in relative motion, flow over a multi-element airfoil system undergoing deployment is computed.
Quantum simulation of quantum field theory using continuous variables
Marshall, Kevin; Pooser, Raphael C.; Siopsis, George; ...
2015-12-14
Much progress has been made in the field of quantum computing using continuous variables over the last couple of years. This includes the generation of extremely large entangled cluster states (10,000 modes, in fact) as well as a fault tolerant architecture. This has led to the point that continuous-variable quantum computing can indeed be thought of as a viable alternative for universal quantum computing. With that in mind, we present a new algorithm for continuous-variable quantum computers which gives an exponential speedup over the best known classical methods. Specifically, this relates to efficiently calculating the scattering amplitudes in scalar bosonic quantum field theory, a problem that is known to be hard using a classical computer. Thus, we give an experimental implementation based on cluster states that is feasible with today's technology.
Quantum simulation of quantum field theory using continuous variables
DOE Office of Scientific and Technical Information (OSTI.GOV)
Marshall, Kevin; Pooser, Raphael C.; Siopsis, George
Much progress has been made in the field of quantum computing using continuous variables over the last couple of years. This includes the generation of extremely large entangled cluster states (10,000 modes, in fact) as well as a fault tolerant architecture. This has led to the point that continuous-variable quantum computing can indeed be thought of as a viable alternative for universal quantum computing. With that in mind, we present a new algorithm for continuous-variable quantum computers which gives an exponential speedup over the best known classical methods. Specifically, this relates to efficiently calculating the scattering amplitudes in scalar bosonic quantum field theory, a problem that is known to be hard using a classical computer. Thus, we give an experimental implementation based on cluster states that is feasible with today's technology.
Multi-Depth-Map Raytracing for Efficient Large-Scene Reconstruction.
Arikan, Murat; Preiner, Reinhold; Wimmer, Michael
2016-02-01
With the enormous advances of the acquisition technology over the last years, fast processing and high-quality visualization of large point clouds have gained increasing attention. Commonly, a mesh surface is reconstructed from the point cloud and a high-resolution texture is generated over the mesh from the images taken at the site to represent surface materials. However, this global reconstruction and texturing approach becomes impractical with increasing data sizes. Recently, due to its potential for scalability and extensibility, a method for texturing a set of depth maps in a preprocessing and stitching them at runtime has been proposed to represent large scenes. However, the rendering performance of this method is strongly dependent on the number of depth maps and their resolution. Moreover, for the proposed scene representation, every single depth map has to be textured by the images, which in practice heavily increases processing costs. In this paper, we present a novel method to break these dependencies by introducing an efficient raytracing of multiple depth maps. In a preprocessing phase, we first generate high-resolution textured depth maps by rendering the input points from image cameras and then perform a graph-cut based optimization to assign a small subset of these points to the images. At runtime, we use the resulting point-to-image assignments (1) to identify for each view ray which depth map contains the closest ray-surface intersection and (2) to efficiently compute this intersection point. The resulting algorithm accelerates both the texturing and the rendering of the depth maps by an order of magnitude.
An effective and secure key-management scheme for hierarchical access control in E-medicine system.
Odelu, Vanga; Das, Ashok Kumar; Goswami, Adrijit
2013-04-01
Recently, several hierarchical access control schemes have been proposed in the literature to provide security for e-medicine systems. However, most of them are either insecure against the 'man-in-the-middle' attack or require high storage and computational overheads. Wu and Chen proposed a key management method to solve dynamic access control problems in a user hierarchy based on a hybrid cryptosystem. Though their scheme improves computational efficiency over Nikooghadam et al.'s approach, it suffers from large storage space for public parameters in the public domain and from computational inefficiency due to costly elliptic curve point multiplication. Recently, Nikooghadam and Zakerolhosseini showed that Wu-Chen's scheme is vulnerable to a man-in-the-middle attack. In order to remedy this security weakness in Wu-Chen's scheme, they proposed a secure scheme which is again based on ECC (elliptic curve cryptography) and an efficient one-way hash function. However, their scheme incurs a huge computational cost for providing verification of public information in the public domain, as it uses an ECC digital signature, which is costly when compared to a symmetric-key cryptosystem. In this paper, we propose an effective access control scheme in a user hierarchy which is based only on a symmetric-key cryptosystem and an efficient one-way hash function. We show that our scheme significantly reduces the storage space for both public and private domains, and the computational complexity, when compared to Wu-Chen's scheme, Nikooghadam-Zakerolhosseini's scheme, and other related schemes. Through informal and formal security analysis, we further show that our scheme is secure against various attacks, including the man-in-the-middle attack. Moreover, dynamic access control problems in our scheme are also solved efficiently compared to other related schemes, making our scheme well suited to practical applications of e-medicine systems.
An efficient implementation of a high-order filter for a cubed-sphere spectral element model
NASA Astrophysics Data System (ADS)
Kang, Hyun-Gyu; Cheong, Hyeong-Bin
2017-03-01
A parallel-scalable, isotropic, scale-selective spatial filter was developed for the cubed-sphere spectral element model on the sphere. The filter equation is a high-order elliptic (Helmholtz) equation based on the spherical Laplacian operator, which is transformed into cubed-sphere local coordinates. The Laplacian operator is discretized on the computational domain, i.e., on each cell, by the spectral element method with Gauss-Lobatto Lagrange interpolating polynomials (GLLIPs) as the orthogonal basis functions. On the global domain, the discrete filter equation yielded a linear system represented by a highly sparse matrix. The density of this matrix increases quadratically (linearly) with the order of GLLIP (order of the filter), and the linear system is solved in only O(Ng) operations, where Ng is the total number of grid points. The solution, obtained by a row reduction method, demonstrated the typical accuracy and convergence rate of the cubed-sphere spectral element method. To achieve computational efficiency on parallel computers, the linear system was treated by an inverse matrix method (a sparse matrix-vector multiplication). The density of the inverse matrix was lowered to only a few times that of the original sparse matrix without degrading the accuracy of the solution. For better computational efficiency, a local-domain high-order filter was introduced: the filter equation is applied to multiple cells, and only the central cell is then used to reconstruct the filtered field. The parallel efficiency of applying the inverse matrix method to the global- and local-domain filter was evaluated by the scalability on a distributed-memory parallel computer. The scale-selective performance of the filter was demonstrated on Earth topography. The usefulness of the filter as a hyper-viscosity for the vorticity equation was also demonstrated.
Exploring Neutrino Oscillation Parameter Space with a Monte Carlo Algorithm
NASA Astrophysics Data System (ADS)
Espejel, Hugo; Ernst, David; Cogswell, Bernadette; Latimer, David
2015-04-01
The χ2 (or likelihood) function for a global analysis of neutrino oscillation data is first calculated as a function of the neutrino mixing parameters. A computational challenge is to obtain the minima or the allowed regions for the mixing parameters. The conventional approach is to calculate the χ2 (or likelihood) function on a grid for a large number of points, and then marginalize over the likelihood function. As the number of parameters increases with the number of neutrinos, making the calculation numerically efficient becomes necessary. We implement a new Monte Carlo algorithm (D. Foreman-Mackey, D. W. Hogg, D. Lang and J. Goodman, Publications of the Astronomical Society of the Pacific, 125 306 (2013)) to determine its computational efficiency at finding the minima and allowed regions. We examine a realistic example to compare the historical and the new methods.
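To make the sampling idea concrete, the following is a minimal sketch of Monte Carlo exploration of a χ2 surface. It uses a plain random-walk Metropolis sampler, not the ensemble sampler of the cited Foreman-Mackey et al. reference, and the two-parameter quadratic χ2 function, its minimum, the step size and the chain length are all hypothetical choices made for illustration.

```python
import math
import random


def chi2(theta):
    """Hypothetical toy chi-square surface: a quadratic bowl with minimum at (0.3, 0.5)."""
    x, y = theta
    return ((x - 0.3) / 0.05) ** 2 + ((y - 0.5) / 0.1) ** 2


def metropolis(chi2_fn, start, step, n_steps, seed=0):
    """Random-walk Metropolis sampling of the likelihood L proportional to exp(-chi2 / 2)."""
    rng = random.Random(seed)
    current = list(start)
    current_chi2 = chi2_fn(current)
    best, best_chi2 = list(current), current_chi2
    chain = []
    for _ in range(n_steps):
        # Propose a Gaussian step in every parameter.
        proposal = [c + rng.gauss(0.0, step) for c in current]
        proposal_chi2 = chi2_fn(proposal)
        delta = proposal_chi2 - current_chi2
        # Accept with probability min(1, exp(-delta / 2)).
        if delta <= 0 or rng.random() < math.exp(-0.5 * delta):
            current, current_chi2 = proposal, proposal_chi2
            if current_chi2 < best_chi2:
                best, best_chi2 = list(current), current_chi2
        chain.append(list(current))
    return chain, best, best_chi2


chain, best, best_chi2 = metropolis(chi2, start=[0.0, 0.0], step=0.05, n_steps=20000)
```

The point with the lowest χ2 visited approximates the minimum, while the density of the chain approximates the likelihood, so allowed regions can be read off from the samples instead of from a grid whose size grows exponentially with the number of parameters.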
Cellular automatons applied to gas dynamic problems
NASA Technical Reports Server (NTRS)
Long, Lyle N.; Coopersmith, Robert M.; Mclachlan, B. G.
1987-01-01
This paper compares the results of a relatively new computational fluid dynamics method, cellular automatons, with experimental data and analytical results. This technique has been shown to qualitatively predict fluidlike behavior; however, there have been few published comparisons with experiment or other theories. Comparisons are made for a one-dimensional supersonic piston problem, Stokes first problem, and the flow past a normal flat plate. These comparisons are used to assess the ability of the method to accurately model fluid dynamic behavior and to point out its limitations. Reasonable results were obtained for all three test cases, but the fundamental limitations of cellular automatons are numerous. It may be misleading, at this time, to say that cellular automatons are a computationally efficient technique. Other methods, based on continuum or kinetic theory, would also be very efficient if as little of the physics were included.
NASA Astrophysics Data System (ADS)
Thompson, Kyle Bonner
An algorithm is described to efficiently compute aerothermodynamic design sensitivities using a decoupled variable set. In a conventional approach to computing design sensitivities for reacting flows, the species continuity equations are fully coupled to the conservation laws for momentum and energy. In this algorithm, the species continuity equations are solved separately from the mixture continuity, momentum, and total energy equations. This decoupling simplifies the implicit system, so that the flow solver can be made significantly more efficient, with very little penalty on overall scheme robustness. Most importantly, the computational cost of the point implicit relaxation is shown to scale linearly with the number of species for the decoupled system, whereas the fully coupled approach scales quadratically. Also, the decoupled method significantly reduces the cost in wall time and memory in comparison to the fully coupled approach. This decoupled approach for computing design sensitivities with the adjoint system is demonstrated for inviscid flow in chemical non-equilibrium around a re-entry vehicle with a retro-firing annular nozzle. The sensitivities of the surface temperature and mass flow rate through the nozzle plenum are computed with respect to plenum conditions and verified against sensitivities computed using a complex-variable finite-difference approach. The decoupled scheme significantly reduces the computational time and memory required to complete the optimization, making this an attractive method for high-fidelity design of hypersonic vehicles.
An 81.6 μW FastICA processor for epileptic seizure detection.
Yang, Chia-Hsiang; Shih, Yi-Hsin; Chiueh, Herming
2015-02-01
To improve the performance of epileptic seizure detection, independent component analysis (ICA) is applied to multi-channel signals to separate artifacts and signals of interest. FastICA is an efficient algorithm to compute ICA. To reduce the energy dissipation, eigenvalue decomposition (EVD) is utilized in the preprocessing stage to reduce the convergence time of iterative calculation of ICA components. EVD is computed efficiently through an array structure of processing elements running in parallel. Area-efficient EVD architecture is realized by leveraging the approximate Jacobi algorithm, leading to a 77.2% area reduction. By choosing a proper memory element and a reduced wordlength, the power and area of storage memory are reduced by 95.6% and 51.7%, respectively. The chip area is minimized through fixed-point implementation and architectural transformations. Given a latency constraint of 0.1 s, an 86.5% area reduction is achieved compared to the direct-mapped architecture. Fabricated in 90 nm CMOS, the core area of the chip is 0.40 mm². The FastICA processor, part of an integrated epileptic control SoC, dissipates 81.6 μW at 0.32 V. The computation delay of a frame of 256 samples for 8 channels is 84.2 ms. Compared to prior work, 0.5% power dissipation, 26.7% silicon area, and 3.4× computation speedup are achieved. The performance of the chip was verified on a human dataset.
NASA Technical Reports Server (NTRS)
Sanz, J.; Pischel, K.; Hubler, D.
1992-01-01
An application for parallel computation on a combined cluster of powerful workstations and supercomputers was developed. A Parallel Virtual Machine (PVM) is used as message passage language on a macro-tasking parallelization of the Aerodynamic Inverse Design and Analysis for a Full Engine computer code. The heterogeneous nature of the cluster is perfectly handled by the controlling host machine. Communication is established via Ethernet with the TCP/IP protocol over an open network. A reasonable overhead is imposed for internode communication, rendering an efficient utilization of the engaged processors. Perhaps one of the most interesting features of the system is its versatile nature, that permits the usage of the computational resources available that are experiencing less use at a given point in time.
Quantified Event Automata: Towards Expressive and Efficient Runtime Monitors
NASA Technical Reports Server (NTRS)
Barringer, Howard; Falcone, Ylies; Havelund, Klaus; Reger, Giles; Rydeheard, David
2012-01-01
Runtime verification is the process of checking a property on a trace of events produced by the execution of a computational system. Runtime verification techniques have recently focused on parametric specifications where events take data values as parameters. These techniques exist on a spectrum inhabited by both efficient and expressive techniques. These characteristics are usually shown to be conflicting: in state-of-the-art solutions, efficiency is obtained at the cost of loss of expressiveness and vice versa. To seek a solution to this conflict we explore a new point on the spectrum by defining an alternative runtime verification approach. We introduce a new formalism for concisely capturing expressive specifications with parameters. Our technique is more expressive than the currently most efficient techniques while at the same time allowing for optimizations.
NASA Astrophysics Data System (ADS)
Dettmer, J.; Quijano, J. E.; Dosso, S. E.; Holland, C. W.; Mandolesi, E.
2016-12-01
Geophysical seabed properties are important for the detection and classification of unexploded ordnance. However, current surveying methods such as vertical seismic profiling, coring, or inversion are of limited use when surveying large areas with high spatial sampling density. We consider surveys based on a source and receiver array towed by an autonomous vehicle which produce large volumes of seabed reflectivity data that contain unprecedented and detailed seabed information. The data are analyzed with a particle filter, which requires efficient reflection-coefficient computation, efficient inversion algorithms and efficient use of computer resources. The filter quantifies information content of multiple sequential data sets by considering results from previous data along the survey track to inform the importance sampling at the current point. Challenges arise from environmental changes along the track where the number of sediment layers and their properties change. This is addressed by a trans-dimensional model in the filter which allows layering complexity to change along a track. Efficiency is improved by likelihood tempering of various particle subsets and including exchange moves (parallel tempering). The filter is implemented on a hybrid computer that combines central processing units (CPUs) and graphics processing units (GPUs) to exploit three levels of parallelism: (1) fine-grained parallel computation of spherical reflection coefficients with a GPU implementation of Levin integration; (2) updating particles by concurrent CPU processes which exchange information using automatic load balancing (coarse grained parallelism); (3) overlapping CPU-GPU communication (a major bottleneck) with GPU computation by staggering CPU access to the multiple GPUs. The algorithm is applied to spherical reflection coefficients for data sets along a 14-km track on the Malta Plateau, Mediterranean Sea. We demonstrate substantial efficiency gains over previous methods. 
[This research was supported in part by the U.S. Dept. of Defense, through the Strategic Environmental Research and Development Program (SERDP).]
Efficient volumetric estimation from plenoptic data
NASA Astrophysics Data System (ADS)
Anglin, Paul; Reeves, Stanley J.; Thurow, Brian S.
2013-03-01
The commercial release of the Lytro camera, and greater availability of plenoptic imaging systems in general, have given the image processing community cost-effective tools for light-field imaging. While this data is most commonly used to generate planar images at arbitrary focal depths, reconstruction of volumetric fields is also possible. Similarly, deconvolution is a technique that is conventionally used in planar image reconstruction, or deblurring, algorithms. However, when leveraged with the ability of a light-field camera to quickly reproduce multiple focal planes within an imaged volume, deconvolution offers a computationally efficient method of volumetric reconstruction. Related research has shown that light-field imaging systems in conjunction with tomographic reconstruction techniques are also capable of estimating the imaged volume and have been successfully applied to particle image velocimetry (PIV). However, while tomographic volumetric estimation through algorithms such as multiplicative algebraic reconstruction techniques (MART) has proven to be highly accurate, it is computationally intensive. In this paper, the reconstruction problem is shown to be solvable by deconvolution. Deconvolution offers significant improvement in computational efficiency through the use of fast Fourier transforms (FFTs) when compared to other tomographic methods. This work describes a deconvolution algorithm designed to reconstruct a 3-D particle field from simulated plenoptic data. A 3-D extension of existing 2-D FFT-based refocusing techniques is presented to further improve efficiency when computing object focal stacks and system point spread functions (PSF). Reconstruction artifacts are identified; their underlying source and methods of mitigation are explored where possible, and reconstructions of simulated particle fields are provided.
Probability, statistics, and computational science.
Beerenwinkel, Niko; Siebourg, Juliane
2012-01-01
In this chapter, we review basic concepts from probability theory and computational statistics that are fundamental to evolutionary genomics. We provide a very basic introduction to statistical modeling and discuss general principles, including maximum likelihood and Bayesian inference. Markov chains, hidden Markov models, and Bayesian network models are introduced in more detail as they occur frequently and in many variations in genomics applications. In particular, we discuss efficient inference algorithms and methods for learning these models from partially observed data. Several simple examples are given throughout the text, some of which point to models that are discussed in more detail in subsequent chapters.
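As an illustration of the kind of efficient inference algorithm meant here, the following is a minimal sketch of the forward recursion for a hidden Markov model, which computes the probability of an observation sequence in time linear in its length instead of summing over all hidden paths. The two-state model and all of its probabilities are hypothetical and are not taken from the chapter.

```python
def forward_likelihood(obs, init, trans, emit):
    """Probability of a non-empty observation sequence under an HMM (forward recursion).

    init[i]     : probability that the chain starts in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability that state i emits symbol o
    """
    n = len(init)
    # alpha[i] = P(observations so far, current state = i)
    alpha = [init[i] * emit[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            emit[j][o] * sum(alpha[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return sum(alpha)


# Hypothetical two-state model emitting two symbols (0 and 1).
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.2, 0.8]]
emit = [[0.7, 0.3], [0.1, 0.9]]
likelihood = forward_likelihood([0, 1], init, trans, emit)
```

For long sequences the recursion is usually carried out with rescaling or in log space to avoid numerical underflow; that refinement is omitted from this sketch.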
BRYNTRN: A baryon transport model
NASA Technical Reports Server (NTRS)
Wilson, John W.; Townsend, Lawrence W.; Nealy, John E.; Chun, Sang Y.; Hong, B. S.; Buck, Warren W.; Lamkin, S. L.; Ganapol, Barry D.; Khan, Ferdous; Cucinotta, Francis A.
1989-01-01
The development of an interaction data base and a numerical solution to the transport of baryons through an arbitrary shield material based on a straight ahead approximation of the Boltzmann equation are described. The code is most accurate for continuous energy boundary values, but gives reasonable results for discrete spectra at the boundary using even a relatively coarse energy grid (30 points) and large spatial increments (1 cm in H2O). The resulting computer code is self-contained, efficient and ready to use. The code requires only a very small fraction of the computer resources required for Monte Carlo codes.
A reflection model for eclipsing binary stars
NASA Technical Reports Server (NTRS)
Wood, D. B.
1973-01-01
A highly accurate reflection model has been developed which emphasizes efficiency of computer calculation. It is assumed that the heating of the irradiated star must depend upon the following properties of the irradiating star: (1) effective temperature; (2) apparent area as seen from a point on the surface of the irradiated star; (3) limb darkening; and (4) zenith distance of the apparent centre as seen from a point on the surface of the irradiated star. The algorithm eliminates the need to integrate over the irradiating star while providing a highly accurate representation of the integrated bolometric flux, even for gravitationally distorted stars.
Neural network-based feature point descriptors for registration of optical and SAR images
NASA Astrophysics Data System (ADS)
Abulkhanov, Dmitry; Konovalenko, Ivan; Nikolaev, Dmitry; Savchik, Alexey; Shvets, Evgeny; Sidorchuk, Dmitry
2018-04-01
Registration of images of different nature is an important technique used in image fusion, change detection, efficient information representation and other problems of computer vision. Solving this task using feature-based approaches is usually more complex than registration of several optical images because traditional feature descriptors (SIFT, SURF, etc.) perform poorly when images have different nature. In this paper we consider the problem of registration of SAR and optical images. We train a neural network to build feature point descriptors and use the RANSAC algorithm to align found matches. Experimental results are presented that confirm the method's effectiveness.
An analytical method to predict efficiency of aircraft gearboxes
NASA Technical Reports Server (NTRS)
Anderson, N. E.; Loewenthal, S. H.; Black, J. D.
1984-01-01
A spur gear efficiency prediction method previously developed by the authors was extended to include power loss of planetary gearsets. A friction coefficient model was developed for MIL-L-7808 oil based on disc machine data. This combined with the recent capability of predicting losses in spur gears of nonstandard proportions allows the calculation of power loss for complete aircraft gearboxes that utilize spur gears. The method was applied to the T56/501 turboprop gearbox and compared with measured test data. Bearing losses were calculated with large scale computer programs. Breakdowns of the gearbox losses point out areas for possible improvement.
Efficient Kriging via Fast Matrix-Vector Products
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Raykar, Vikas C.; Duraiswami, Ramani; Mount, David M.
2008-01-01
Interpolating scattered data points is a problem of wide ranging interest. Ordinary kriging is an optimal scattered data estimator, widely used in geosciences and remote sensing. A generalized version of this technique, called cokriging, can be used for image fusion of remotely sensed data. However, it is computationally very expensive for large data sets. We demonstrate the time efficiency and accuracy of approximating ordinary kriging through the use of fast matrix-vector products combined with iterative methods. We used methods based on the fast multipole methods and nearest neighbor searching techniques for implementations of the fast matrix-vector products.
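The role of the matrix-vector product in such an iterative solve can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' method: it solves the covariance system C w = c0 without the unbiasedness constraint of ordinary kriging (so it corresponds to simple kriging with zero mean), it uses the conjugate gradient method as the iterative solver, and the dense `matvec` closure stands in for a fast approximate product. The Gaussian covariance, nugget value, points and data values are hypothetical.

```python
import math


def gaussian_cov(p, q, length=1.0):
    """Hypothetical Gaussian covariance between two 2-D points."""
    d2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-d2 / (length * length))


def make_matvec(points, nugget=1e-6):
    """Return v -> (C + nugget*I) v. A fast approximate product would replace this."""
    def matvec(v):
        return [
            sum(gaussian_cov(p, q) * vj for q, vj in zip(points, v)) + nugget * vi
            for p, vi in zip(points, v)
        ]
    return matvec


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=200):
    """Solve A x = b for symmetric positive definite A using only products A v."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rs = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if math.sqrt(rs) < tol:
            break
        Ap = matvec(p)
        alpha = rs / sum(pi * ai for pi, ai in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 2.0, 3.0, 4.0]
target = (0.5, 0.5)

matvec = make_matvec(points)
rhs = [gaussian_cov(p, target) for p in points]
weights = conjugate_gradient(matvec, rhs)
estimate = sum(w * v for w, v in zip(weights, values))
```

Because the solver touches the matrix only through `matvec`, the cost per iteration is the cost of one product, which is where a fast approximation would pay off for large data sets.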
Use of Terrestrial Laser Scanner for Rigid Airport Pavement Management
Di Benedetto, Alessandro; Fiani, Margherita
2017-01-01
The evaluation of the structural efficiency of airport infrastructures is a complex task. Faulting is one of the most important indicators of rigid pavement performance. The aim of our study is to provide a new method for faulting detection and computation on jointed concrete pavements. Nowadays, the assessment of faulting is performed with the use of laborious and time-consuming measurements that strongly hinder aircraft traffic. We proposed a field procedure for Terrestrial Laser Scanner data acquisition and a computation flow chart in order to identify and quantify the fault size at each joint of apron slabs. The total point cloud has been used to compute the least square plane fitting those points. The best-fit plane for each slab has been computed too. The attitude of each slab plane with respect to both the adjacent ones and the apron reference plane has been determined by the normal vectors to the surfaces. Faulting has been evaluated as the difference in elevation between the slab planes along chosen sections. For a more accurate evaluation of the faulting value, we have then considered a few strips of data covering rectangular areas of different sizes across the joints. The accuracy of the estimated quantities has been computed too. PMID:29278386
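The plane-based computation described in this abstract can be sketched in a few lines. The sketch assumes each slab's points are given as an (n, 3) array and that the planes are far from vertical, so a fit of the form z = a*x + b*y + c is adequate; the function names are hypothetical.

```python
import numpy as np

def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to an (n, 3) point array."""
    X = np.column_stack([points[:, 0], points[:, 1], np.ones(len(points))])
    coeffs, *_ = np.linalg.lstsq(X, points[:, 2], rcond=None)
    return coeffs                     # array([a, b, c])

def unit_normal(coeffs):
    """Upward unit normal of the plane z = a*x + b*y + c."""
    a, b, _ = coeffs
    n = np.array([-a, -b, 1.0])
    return n / np.linalg.norm(n)

def tilt_angle(coeffs_1, coeffs_2):
    """Angle in radians between the normals of two fitted planes."""
    c = np.clip(unit_normal(coeffs_1) @ unit_normal(coeffs_2), -1.0, 1.0)
    return np.arccos(c)

def faulting(coeffs_1, coeffs_2, x, y):
    """Elevation difference (plane 2 minus plane 1) at a joint location."""
    q = np.array([x, y, 1.0])
    return coeffs_2 @ q - coeffs_1 @ q
```

Evaluating `faulting` at several positions along a joint gives one value per chosen section, while `tilt_angle` summarizes the relative attitude of adjacent slabs.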
A two-layer multiple-time-scale turbulence model and grid independence study
NASA Technical Reports Server (NTRS)
Kim, S.-W.; Chen, C.-P.
1989-01-01
A two-layer multiple-time-scale turbulence model is presented. The near-wall model is based on the classical Kolmogorov-Prandtl turbulence hypothesis and the semi-empirical logarithmic law of the wall. In the two-layer model presented, the computational domain of the conservation of mass equation and the mean momentum equation penetrated up to the wall, where no slip boundary condition has been prescribed; and the near wall boundary of the turbulence equations has been located at the fully turbulent region, yet very close to the wall, where the standard wall function method has been applied. Thus, the conservation of mass constraint can be satisfied more rigorously in the two-layer model than in the standard wall function method. In most of the two-layer turbulence models, the number of grid points to be used inside the near-wall layer posed the issue of computational efficiency. The present finite element computational results showed that the grid independent solutions were obtained with as small as two grid points, i.e., one quadratic element, inside the near wall layer. Comparison of the computational results obtained by using the two-layer model and those obtained by using the wall function method is also presented.
Experimental data filtration algorithm
NASA Astrophysics Data System (ADS)
Oanta, E.; Tamas, R.; Danisor, A.
2017-08-01
Experimental data reduction is an important topic because the resulting information is used to calibrate the theoretical models and to verify the accuracy of their results. The paper presents some ideas used to extract a subset of points from the initial set of points which defines an experimentally acquired curve. The objective is to get a subset with significantly fewer points than the initial data set and which accurately defines a smooth curve that preserves the shape of the initial curve. Being a general study, we used only data filtering criteria based on geometric features that at a later stage may be related to upper level conditions specific to the phenomenon under investigation. Five algorithms were conceived and implemented in original software consisting of more than 1800 lines of code, which has a flexible structure that allows us to easily update it using new algorithms. The software instrument was used to process the data of several case studies. Conclusions are drawn regarding the values of the parameters used in the algorithms to decide if a series of points may be considered either noise or a relevant part of the curve. Being a general analysis, the result is a computer-based trial-and-error method that efficiently solves this kind of problem.
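The abstract does not say which five algorithms were implemented. As an illustration of a purely geometric criterion for keeping a shape-preserving subset of points, the sketch below uses the well-known Ramer-Douglas-Peucker simplification, which is a stand-in chosen here and not necessarily one of the paper's methods. The tolerance `tol` plays the role of the tunable parameter that decides whether a point is noise or a relevant part of the curve.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all are (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))         # clamp the projection to the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(points, tol):
    """Keep only the points needed to stay within tol of the original curve."""
    if len(points) < 3:
        return list(points)
    dists = [point_segment_distance(p, points[0], points[-1])
             for p in points[1:-1]]
    i_max = max(range(len(dists)), key=dists.__getitem__)
    if dists[i_max] <= tol:
        return [points[0], points[-1]]
    split = i_max + 1                 # index of the farthest interior point
    left = simplify(points[:split + 1], tol)
    right = simplify(points[split:], tol)
    return left[:-1] + right          # drop the duplicated split point
```

Raising `tol` discards more points as noise; lowering it preserves finer features of the acquired curve.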
Cardamone, Salvatore; Hughes, Timothy J; Popelier, Paul L A
2014-06-14
Atomistic simulation of chemical systems is currently limited by the elementary description of electrostatics that atomic point-charges offer. Unfortunately, a model of one point-charge for each atom fails to capture the anisotropic nature of electronic features such as lone pairs or π-systems. Higher order electrostatic terms, such as those offered by a multipole moment expansion, naturally recover these important electronic features. The question remains as to why such a description has not yet been widely adopted by popular molecular mechanics force fields. There are two widely-held misconceptions about the more rigorous formalism of multipolar electrostatics: (1) Accuracy: the implementation of multipole moments, compared to point-charges, offers little to no advantage in terms of an accurate representation of a system's energetics, structure and dynamics. (2) Efficiency: atomistic simulation using multipole moments is computationally prohibitive compared to simulation using point-charges. Whilst the second of these may have found some basis when computational power was a limiting factor, the first has no theoretical grounding. In the current work, we disprove the two statements above and systematically demonstrate that multipole moments are not discredited by either. We hope that this perspective will help in catalysing the transition to more realistic electrostatic modelling, to be adopted by popular molecular simulation software.
Rodríguez, Alfonso; Valverde, Juan; Portilla, Jorge; Otero, Andrés; Riesgo, Teresa; de la Torre, Eduardo
2018-06-08
Cyber-Physical Systems are experiencing a paradigm shift in which processing has been relocated to the distributed sensing layer and is no longer performed in a centralized manner. This approach, usually referred to as Edge Computing, demands the use of hardware platforms that are able to manage the steadily increasing requirements in computing performance, while keeping energy efficiency and the adaptability imposed by the interaction with the physical world. In this context, SRAM-based FPGAs and their inherent run-time reconfigurability, when coupled with smart power management strategies, are a suitable solution. However, they usually fail in user accessibility and ease of development. In this paper, an integrated framework to develop FPGA-based high-performance embedded systems for Edge Computing in Cyber-Physical Systems is presented. This framework provides a hardware-based processing architecture, an automated toolchain, and a runtime to transparently generate and manage reconfigurable systems from high-level system descriptions without additional user intervention. Moreover, it provides users with support for dynamically adapting the available computing resources to switch the working point of the architecture in a solution space defined by computing performance, energy consumption and fault tolerance. Results show that it is indeed possible to explore this solution space at run time and prove that the proposed framework is a competitive alternative to software-based edge computing platforms, being able to provide not only faster solutions, but also higher energy efficiency for computing-intensive algorithms with significant levels of data-level parallelism.
Toward a Global Bundle Adjustment of SPOT 5 - HRS Images
NASA Astrophysics Data System (ADS)
Massera, S.; Favé, P.; Gachet, R.; Orsoni, A.
2012-07-01
The HRS (High Resolution Stereoscopic) instrument carried on SPOT 5 enables quasi-simultaneous acquisition of stereoscopic images on wide segments - 120 km wide - with two forward and backward-looking telescopes observing the Earth with an angle of 20° ahead and behind the vertical. For 8 years IGN (Institut Géographique National) has been developing techniques to achieve spatiotriangulation of these images. During this time the capacities of bundle adjustment of SPOT 5 - HRS spatial images have largely improved. Today a global single block composed of about 20,000 images can be computed in reasonable calculation time. The progression was achieved step by step: the first computed blocks were composed of only 40 images, then bigger blocks were computed. Finally, only one global block is now computed. At the same time calculation tools have improved: for example, the adjustment of 2,000 images of North Africa takes about 2 minutes whereas 8 hours were needed two years ago. To reach such a result, new independent software was developed to compute fast and efficient bundle adjustments. At the same time the equipment - GCPs (Ground Control Points) and tie points - and the techniques have also evolved over the last 10 years. Studies were made to get recommendations about the equipment in order to make an accurate single block. Tie points can now be quickly and automatically computed with SURF (Speeded Up Robust Features) techniques. Today the updated equipment is composed of about 500 GCPs, and studies show that the ideal configuration is around 100 tie points per square degree. With such equipment, the location of the global HRS block becomes accurate to a few meters, whereas non-adjusted images are only accurate to 15 m. This paper will describe the methods used at IGN Espace to compute a global single block composed of almost 20,000 HRS images, 500 GCPs and several million tie points in reasonable calculation time. Using such a block offers many advantages. 
Because the global block is unique, it becomes easier to manage the history and the successive evolutions of the computations (new images, new GCPs or tie points). The location is now unique and consequently coherent all around the world, avoiding steps and artifacts on the borders of DSMs (Digital Surface Models) and OrthoImages historically calculated from different blocks. Extrapolation far from GCPs at the edges of images is no longer done. Using the global block as a reference will allow new images from other sources to be easily located on this reference.
Superposition and alignment of labeled point clouds.
Fober, Thomas; Glinca, Serghei; Klebe, Gerhard; Hüllermeier, Eyke
2011-01-01
Geometric objects are often represented approximately in terms of a finite set of points in three-dimensional Euclidean space. In this paper, we extend this representation to what we call labeled point clouds. A labeled point cloud is a finite set of points, where each point is not only associated with a position in three-dimensional space, but also with a discrete class label that represents a specific property. This type of model is especially suitable for modeling biomolecules such as proteins and protein binding sites, where a label may represent an atom type or a physico-chemical property. Proceeding from this representation, we address the question of how to compare two labeled point clouds in terms of their similarity. Using fuzzy modeling techniques, we develop a suitable similarity measure as well as an efficient evolutionary algorithm to compute it. Moreover, we consider the problem of establishing an alignment of the structures in the sense of a one-to-one correspondence between their basic constituents. From a biological point of view, alignments of this kind are of great interest, since mutually corresponding molecular constituents offer important information about evolution and heredity, and can also serve as a means to explain a degree of similarity. In this paper, we therefore develop a method for computing pairwise or multiple alignments of labeled point clouds. To this end, we proceed from an optimal superposition of the corresponding point clouds and construct an alignment which is as much as possible in agreement with the neighborhood structure established by this superposition. We apply our methods to the structural analysis of protein binding sites.
Solving Constraint Satisfaction Problems with Networks of Spiking Neurons
Jonke, Zeno; Habenschuss, Stefan; Maass, Wolfgang
2016-01-01
Networks of neurons in the brain apply, unlike processors in our current generation of computer hardware, an event-based processing strategy, where short pulses (spikes) are emitted sparsely by neurons to signal the occurrence of an event at a particular point in time. Such spike-based computations promise to be substantially more power-efficient than traditional clocked processing schemes. However, it turns out to be surprisingly difficult to design networks of spiking neurons that can solve difficult computational problems on the level of single spikes, rather than rates of spikes. We present here a new method for designing networks of spiking neurons via an energy function. Furthermore, we show how the energy function of a network of stochastically firing neurons can be shaped in a transparent manner by composing the networks of simple stereotypical network motifs. We show that this design approach enables networks of spiking neurons to produce approximate solutions to difficult (NP-hard) constraint satisfaction problems from the domains of planning/optimization and verification/logical inference. The resulting networks employ noise as a computational resource. Nevertheless, the timing of spikes plays an essential role in their computations. Furthermore, networks of spiking neurons carry out for the Traveling Salesman Problem a more efficient stochastic search for good solutions compared with stochastic artificial neural networks (Boltzmann machines) and Gibbs sampling. PMID:27065785
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kumar, Jitendra; Hargrove Jr., William Walter; Norman, Steven P
Vegetation canopy structure is a critically important habitat characteristic for many threatened and endangered birds and other animal species, and it is key information needed by forest and wildlife managers for monitoring and managing forest resources, conservation planning and fostering biodiversity. Advances in Light Detection and Ranging (LiDAR) technologies have enabled remote sensing-based studies of vegetation canopies by capturing three-dimensional structures, yielding information not available in two-dimensional images of the landscape provided by traditional multi-spectral remote sensing platforms. However, the large volume data sets produced by airborne LiDAR instruments pose a significant computational challenge, requiring algorithms to identify and analyze patterns of interest buried within LiDAR point clouds in a computationally efficient manner, utilizing state-of-the-art computing infrastructure. We developed and applied a computationally efficient approach to analyze a large volume of LiDAR data and to characterize and map the vegetation canopy structures for 139,859 hectares (540 sq. miles) in the Great Smoky Mountains National Park. This study helps improve our understanding of the distribution of vegetation and animal habitats in this extremely diverse ecosystem.
An efficient algorithm for the generalized Foldy-Lax formulation
NASA Astrophysics Data System (ADS)
Huang, Kai; Li, Peijun; Zhao, Hongkai
2013-02-01
Consider the scattering of a time-harmonic plane wave incident on a two-scale heterogeneous medium, which consists of scatterers that are much smaller than the wavelength and extended scatterers that are comparable to the wavelength. In this work we treat those small scatterers as isotropic point scatterers and use a generalized Foldy-Lax formulation to model wave propagation and capture multiple scattering among point scatterers and extended scatterers. Our formulation is given as a coupled system, which combines the original Foldy-Lax formulation for the point scatterers and the regular boundary integral equation for the extended obstacle scatterers. The existence and uniqueness of the solution for the formulation is established in terms of physical parameters such as the scattering coefficient and the separation distances. Computationally, an efficient physically motivated Gauss-Seidel iterative method is proposed to solve the coupled system, where only a linear system of algebraic equations for point scatterers or a boundary integral equation for a single extended obstacle scatterer is required to solve at each step of iteration. The convergence of the iterative method is also characterized in terms of physical parameters. Numerical tests for the far-field patterns of scattered fields arising from uniformly or randomly distributed point scatterers and single or multiple extended obstacle scatterers are presented.
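The structure of the Gauss-Seidel iteration described in this abstract can be illustrated on a generic two-block linear system. In the sketch below the blocks `A11` and `A22` stand in for the point-scatterer system and the boundary integral equation of one extended scatterer, and `A12`, `A21` for their coupling; these names and the use of small dense matrices are assumptions made for illustration only.

```python
import numpy as np

def block_gauss_seidel(A11, A12, A21, A22, b1, b2, tol=1e-12, max_iter=200):
    """Solve [[A11, A12], [A21, A22]] [x1; x2] = [b1; b2] block by block."""
    x1 = np.zeros_like(b1)
    x2 = np.zeros_like(b2)
    for _ in range(max_iter):
        # Each half-step solves only one diagonal block, using the latest
        # available value of the other unknown on the right-hand side.
        x1_new = np.linalg.solve(A11, b1 - A12 @ x2)
        x2_new = np.linalg.solve(A22, b2 - A21 @ x1_new)
        change = np.linalg.norm(x1_new - x1) + np.linalg.norm(x2_new - x2)
        x1, x2 = x1_new, x2_new
        if change < tol:
            break
    return x1, x2
```

The iteration converges quickly when the coupling blocks are weak relative to the diagonal blocks, which is consistent with the abstract's statement that convergence is characterized by physical parameters such as separation distances.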
A multimedia retrieval framework based on semi-supervised ranking and relevance feedback.
Yang, Yi; Nie, Feiping; Xu, Dong; Luo, Jiebo; Zhuang, Yueting; Pan, Yunhe
2012-04-01
We present a new framework for multimedia content analysis and retrieval which consists of two independent algorithms. First, we propose a new semi-supervised algorithm called ranking with Local Regression and Global Alignment (LRGA) to learn a robust Laplacian matrix for data ranking. In LRGA, for each data point, a local linear regression model is used to predict the ranking scores of its neighboring points. A unified objective function is then proposed to globally align the local models from all the data points so that an optimal ranking score can be assigned to each data point. Second, we propose a semi-supervised long-term Relevance Feedback (RF) algorithm to refine the multimedia data representation. The proposed long-term RF algorithm utilizes both the multimedia data distribution in multimedia feature space and the history RF information provided by users. A trace ratio optimization problem is then formulated and solved by an efficient algorithm. The algorithms have been applied to several content-based multimedia retrieval applications, including cross-media retrieval, image retrieval, and 3D motion/pose data retrieval. Comprehensive experiments on four data sets have demonstrated its advantages in precision, robustness, scalability, and computational efficiency.
Churkin, Alexander; Barash, Danny
2008-01-01
Background: RNAmute is an interactive Java application which, given an RNA sequence, calculates the secondary structure of all single point mutations and organizes them into categories according to their similarity to the predicted structure of the wild type. The secondary structure predictions are performed using the Vienna RNA package. A more efficient implementation of RNAmute is needed, however, to extend from the case of single point mutations to the general case of multiple point mutations, which may often be desired for computational predictions alongside mutagenesis experiments. But analyzing multiple point mutations, a process that requires traversing all possible mutations, becomes highly expensive since the running time is O(n^m) for a sequence of length n with m-point mutations. Using Vienna's RNAsubopt, we present a method that selects only those mutations, based on stability considerations, which are likely to be conformational rearranging. The approach is best examined using the dot plot representation for RNA secondary structure. Results: Using RNAsubopt, the suboptimal solutions for a given wild-type sequence are calculated once. Then, specific mutations are selected that are most likely to cause a conformational rearrangement. For an RNA sequence of about 100 nts and 3-point mutations (n = 100, m = 3), for example, the proposed method reduces the running time from several hours or even days to several minutes, thus enabling the practical application of RNAmute to the analysis of multiple-point mutations. Conclusion: A highly efficient addition to RNAmute that is as user friendly as the original application but that facilitates the practical analysis of multiple-point mutations is presented. Such an extension can now be exploited prior to site-directed mutagenesis experiments by virologists, for example, who investigate the change of function in an RNA virus via mutations that disrupt important motifs in its secondary structure. 
A complete explanation of the application, called MultiRNAmute, is available at [1]. PMID:18445289
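The growth in the number of mutants can be made concrete with a short calculation. Assuming a four-letter alphabet, choosing m positions out of n and substituting each with one of the three other bases gives C(n, m) * 3^m distinct mutants, which grows like n^m for fixed m. The function names below are hypothetical and the brute-force enumerator is only practical for short sequences.

```python
from itertools import combinations, product
from math import comb

BASES = "ACGU"

def count_mutants(n, m):
    """Number of distinct m-point mutants of a length-n RNA sequence."""
    return comb(n, m) * 3 ** m

def enumerate_mutants(seq, m):
    """Yield every m-point mutant of seq (brute force, for small inputs)."""
    for positions in combinations(range(len(seq)), m):
        # At each chosen position, try the three bases that differ.
        choices = [[b for b in BASES if b != seq[i]] for i in positions]
        for subs in product(*choices):
            mutant = list(seq)
            for i, b in zip(positions, subs):
                mutant[i] = b
            yield "".join(mutant)

print(count_mutants(100, 1))   # 300
print(count_mutants(100, 3))   # 4365900
```

For n = 100 and m = 3 this is roughly 4.4 million candidate sequences, each needing its own structure prediction under exhaustive traversal, which is why pre-selecting mutations is attractive.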
From transistor to trapped-ion computers for quantum chemistry.
Yung, M-H; Casanova, J; Mezzacapo, A; McClean, J; Lamata, L; Aspuru-Guzik, A; Solano, E
2014-01-07
Over the last few decades, quantum chemistry has progressed through the development of computational methods based on modern digital computers. However, these methods can hardly fulfill the exponentially-growing resource requirements when applied to large quantum systems. As pointed out by Feynman, this restriction is intrinsic to all computational models based on classical physics. Recently, the rapid advancement of trapped-ion technologies has opened new possibilities for quantum control and quantum simulations. Here, we present an efficient toolkit that exploits both the internal and motional degrees of freedom of trapped ions for solving problems in quantum chemistry, including molecular electronic structure, molecular dynamics, and vibronic coupling. We focus on applications that go beyond the capacity of classical computers, but may be realizable on state-of-the-art trapped-ion systems. These results allow us to envision a new paradigm of quantum chemistry that shifts from the current transistor to a near-future trapped-ion-based technology.
From transistor to trapped-ion computers for quantum chemistry
Yung, M.-H.; Casanova, J.; Mezzacapo, A.; McClean, J.; Lamata, L.; Aspuru-Guzik, A.; Solano, E.
2014-01-01
Over the last few decades, quantum chemistry has progressed through the development of computational methods based on modern digital computers. However, these methods can hardly fulfill the exponentially-growing resource requirements when applied to large quantum systems. As pointed out by Feynman, this restriction is intrinsic to all computational models based on classical physics. Recently, the rapid advancement of trapped-ion technologies has opened new possibilities for quantum control and quantum simulations. Here, we present an efficient toolkit that exploits both the internal and motional degrees of freedom of trapped ions for solving problems in quantum chemistry, including molecular electronic structure, molecular dynamics, and vibronic coupling. We focus on applications that go beyond the capacity of classical computers, but may be realizable on state-of-the-art trapped-ion systems. These results allow us to envision a new paradigm of quantum chemistry that shifts from the current transistor to a near-future trapped-ion-based technology. PMID:24395054
Some safe and sensible shortcuts for efficiently upscaled updates of existing elevation models.
NASA Astrophysics Data System (ADS)
Knudsen, Thomas; Aasbjerg Nielsen, Allan
2013-04-01
The Danish national elevation model, DK-DEM, was introduced in 2009 and is based on LiDAR data collected in the time frame 2005-2007. Hence, DK-DEM is aging, and it is time to consider how to integrate new data with the current model in a way that improves the representation of new landscape features, while still preserving the overall (very high) quality of the model. In LiDAR terms, 2005 is equivalent to some time between the palaeolithic and the neolithic. So evidently, when (and if) an update project is launched, we may expect some notable improvements due to the technical and scientific developments from the last half decade. To estimate the magnitude of these potential improvements, and to devise efficient and effective ways of integrating the new and old data, we currently carry out a number of case studies based on comparisons between the current terrain model (with a ground sample distance, GSD, of 1.6 m), and a number of new high resolution point clouds (10-70 points/m2). Not knowing anything about the terms of a potential update project, we consider multiple scenarios ranging from business as usual: A new model with the same GSD, but improved precision, to aggressive upscaling: A new model with 4 times better GSD, i.e. a 16-fold increase in the amount of data. Especially in the latter case speeding up the gridding process is important. Luckily recent results from one of our case studies reveal that for very high resolution data in smooth terrain (which is the common case in Denmark), using local mean (LM) as grid value estimator is only negligibly worse than using the theoretically "best" estimator, i.e. ordinary kriging (OK) with rigorous modelling of the semivariogram. The bias in a leave one out cross validation differs on the micrometer level, while the RMSE differs on the 0.1 mm level. This is fortunate, since a LM estimator can be implemented in plain stream mode, letting the points from the unstructured point cloud (i.e. 
no TIN generation) stream through the processor, individually contributing to the nearest grid posts in a memory mapped grid file. Algorithmically this is very efficient, but it would be even more efficient if we did not have to handle so much data. Another of our recent case studies focuses on this. The basic idea is to ignore data that does not tell us anything new. We do this by looking at anomalies between the current height model and the new point cloud, then computing a correction grid for the current model. Points with insignificant anomalies are simply removed from the point cloud, and the correction grid is computed using the remaining point anomalies only. Hence, we only compute updates in areas of significant change, speeding up the process, and giving us new insight of the precision of the current model which in turn results in improved metadata for both the current and the new model. Currently we focus on simple approaches for creating a smooth update process for integration of heterogeneous data sets. On the other hand, as years go by and multiple generations of data become available, more advanced approaches will probably become necessary (e.g. a multi campaign bundle adjustment, improving the oldest data using cross-over adjustment with newer campaigns). But to prepare for such approaches, it is important already now to organize and evaluate the ancillary (GPS, INS) and engineering level data for the current data sets. This is essential if future generations of DEM users should be able to benefit from future conceptions of "some safe and sensible shortcuts for efficiently upscaled updates of existing elevation models".
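The two shortcuts described in this abstract, stream-mode local mean gridding and discarding points with insignificant anomalies, can be sketched as below. The sketch assumes points arrive as (x, y, z) tuples from any iterable, uses plain in-memory lists rather than a memory mapped grid file, and takes the existing model as a callable `old_height(x, y)`; all names and the threshold are hypothetical.

```python
import math

def local_mean_grid(points, x0, y0, gsd, n_cols, n_rows):
    """Average the heights of streamed points at their nearest grid post."""
    sums = [[0.0] * n_cols for _ in range(n_rows)]
    counts = [[0] * n_cols for _ in range(n_rows)]
    for x, y, z in points:            # one pass, no TIN, no sorting
        col = math.floor((x - x0) / gsd + 0.5)
        row = math.floor((y - y0) / gsd + 0.5)
        if 0 <= col < n_cols and 0 <= row < n_rows:
            sums[row][col] += z
            counts[row][col] += 1
    # Posts that received no points are reported as None.
    return [[s / c if c else None for s, c in zip(srow, crow)]
            for srow, crow in zip(sums, counts)]

def significant_anomalies(points, old_height, threshold):
    """Yield (x, y, dz) only where the new point disagrees with the old model."""
    for x, y, z in points:
        dz = z - old_height(x, y)
        if abs(dz) >= threshold:
            yield x, y, dz
```

Feeding the output of `significant_anomalies` into `local_mean_grid` would give a correction grid that is populated only in areas of significant change.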
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pal, Ranjan; Chelmis, Charalampos; Aman, Saima
The advent of smart meters and advanced communication infrastructures catalyzes numerous smart grid applications such as dynamic demand response, and paves the way to solve challenging research problems in sustainable energy consumption. The space of solution possibilities is restricted primarily by the huge amount of generated data requiring considerable computational resources and efficient algorithms. To overcome this Big Data challenge, data clustering techniques have been proposed. Current approaches however do not scale in the face of the “increasing dimensionality” problem where a cluster point is represented by the entire customer consumption time series. To overcome this aspect we first rethink the way cluster points are created and designed, and then design an efficient online clustering technique for demand response (DR) in order to analyze high volume, high dimensional energy consumption time series data at scale, and on the fly. Our online algorithm is randomized in nature, and provides optimal performance guarantees in a computationally efficient manner. Unlike prior work we (i) study the consumption properties of the whole population simultaneously rather than developing individual models for each customer separately, claiming it to be a ‘killer’ approach that breaks the “curse of dimensionality” in online time series clustering, and (ii) provide tight performance guarantees in theory to validate our approach. Our insights are driven by the field of sociology, where collective behavior often emerges as the result of individual patterns and lifestyles.
Three-dimensional tracking for efficient fire fighting in complex situations
NASA Astrophysics Data System (ADS)
Akhloufi, Moulay; Rossi, Lucile
2009-05-01
Each year, hundreds of millions of hectares of forest burn, causing human and economic losses. For efficient fire fighting, the personnel on the ground need tools permitting the prediction of fire front propagation. In this work, we present a new technique for automatically tracking fire spread in three-dimensional space. The proposed approach uses a stereo system to extract a 3D shape from fire images. A new segmentation technique is proposed and permits the extraction of fire regions in complex unstructured scenes. It works in the visible spectrum and combines information extracted from YUV and RGB color spaces. Unlike other techniques, our algorithm does not require previous knowledge about the scene. The resulting fire regions are classified into different homogeneous zones using clustering techniques. Contours are then extracted and a feature detection algorithm is used to detect interest points like local maxima and corners. Extracted points from stereo images are then used to compute the 3D shape of the fire front. The resulting data make it possible to build the fire volume. The final model is used to compute important spatial and temporal fire characteristics such as spread dynamics, local orientation, heading direction, etc. Tests conducted on the ground show the efficiency of the proposed scheme. This scheme is being integrated with a fire spread mathematical model in order to predict and anticipate the fire behaviour during fire fighting. Also of interest to fire-fighters is the proposed automatic segmentation technique, which can be used in early detection of fire in complex scenes.
Computational wing optimization and comparisons with experiment for a semi-span wing model
NASA Technical Reports Server (NTRS)
Waggoner, E. G.; Haney, H. P.; Ballhaus, W. F.
1978-01-01
A computational wing optimization procedure was developed and verified by an experimental investigation of a semi-span variable camber wing model in the NASA Ames Research Center 14 foot transonic wind tunnel. The Bailey-Ballhaus transonic potential flow analysis and Woodward-Carmichael linear theory codes were linked to Vanderplaats constrained minimization routine to optimize model configurations at several subsonic and transonic design points. The 35 deg swept wing is characterized by multi-segmented leading and trailing edge flaps whose hinge lines are swept relative to the leading and trailing edges of the wing. By varying deflection angles of the flap segments, camber and twist distribution can be optimized for different design conditions. Results indicate that numerical optimization can be both an effective and efficient design tool. The optimized configurations had as good or better lift to drag ratios at the design points as the best designs previously tested during an extensive parametric study.
An Automatic Registration Algorithm for 3D Maxillofacial Model
NASA Astrophysics Data System (ADS)
Qiu, Luwen; Zhou, Zhongwei; Guo, Jixiang; Lv, Jiancheng
2016-09-01
3D image registration aims at aligning two 3D data sets in a common coordinate system, and has been widely used in computer vision, pattern recognition and computer assisted surgery. One challenging problem in 3D registration is that point-wise correspondences between two point sets are often unknown a priori. In this work, we develop an automatic algorithm for the registration of 3D maxillofacial models, including the facial surface model and the skull model. Our proposed registration algorithm can achieve a good alignment result between partial and whole maxillofacial models in spite of ambiguous matching, which has a potential application in oral and maxillofacial reparative and reconstructive surgery. The proposed algorithm includes three steps: (1) 3D-SIFT features extraction and FPFH descriptors construction; (2) feature matching using SAC-IA; (3) coarse rigid alignment and refinement by ICP. Experiments on facial surfaces and mandible skull models demonstrate the efficiency and robustness of our algorithm.
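Both the coarse rigid alignment and each ICP refinement iteration in step (3) need a best-fit rotation and translation for a set of paired points. A minimal sketch of that sub-step is given below, using the standard SVD-based (Kabsch) solution; it assumes the correspondences are already known, and the function name is hypothetical rather than taken from the paper.

```python
import numpy as np

def rigid_transform(src, dst):
    """Best-fit rotation R and translation t with dst ~ src @ R.T + t.

    src and dst are (n, 3) arrays of corresponding points.
    """
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)      # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection: force det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```

In an ICP loop this function would be called repeatedly, with the correspondences re-estimated by nearest-neighbour search after each update of the pose.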
Gaussian process surrogates for failure detection: A Bayesian experimental design approach
NASA Astrophysics Data System (ADS)
Wang, Hongqiao; Lin, Guang; Li, Jinglai
2016-05-01
An important task of uncertainty quantification is to identify the probability of undesired events, in particular, system failures, caused by various sources of uncertainties. In this work we consider the construction of Gaussian process surrogates for failure detection and failure probability estimation. In particular, we consider the situation in which the underlying computer models are extremely expensive, and in this setting, determining the sampling points in the state space is of essential importance. We formulate the problem as an optimal experimental design for Bayesian inferences of the limit state (i.e., the failure boundary) and propose an efficient numerical scheme to solve the resulting optimization problem. In particular, the proposed limit-state inference method is capable of determining multiple sampling points at a time, and thus it is well suited for problems where multiple computer simulations can be performed in parallel. The accuracy and performance of the proposed method are demonstrated by both academic and practical examples.
NASA Technical Reports Server (NTRS)
Mclennan, G. A.
1986-01-01
This report describes, and is a User's Manual for, a computer code (ANL/RBC) which calculates cycle performance for Rankine bottoming cycles extracting heat from a specified source gas stream. The code calculates cycle power and efficiency and the sizes for the heat exchangers, using tabular input of the properties of the cycle working fluid. An option is provided to calculate the costs of system components from user defined input cost functions. These cost functions may be defined in equation form or by numerical tabular data. A variety of functional forms have been included for these functions and they may be combined to create very general cost functions. An optional calculation mode can be used to determine the off-design performance of a system when operated away from the design-point, using the heat exchanger areas calculated for the design-point.
NASA Technical Reports Server (NTRS)
Rosenberg, L. S.; Revere, W. R.; Selcuk, M. K.
1981-01-01
A computer simulation code was employed to evaluate several generic types of solar power systems (up to 10 MWe). Details of the simulation methodology and the solar plant concepts are given along with cost and performance results. The Solar Energy Simulation computer code (SESII) was used, which optimizes the size of the collector field and energy storage subsystem for given engine-generator and energy-transport characteristics. Nine plant types were examined which employed combinations of different technology options, such as: distributed or central receivers with one- or two-axis tracking or no tracking; point- or line-focusing concentrators; central or distributed power conversion; Rankine, Brayton, or Stirling thermodynamic cycles; and thermal or electrical storage. Optimal cost curves were plotted as a function of levelized busbar energy cost and annualized plant capacity. Point-focusing distributed receiver systems were found to be most efficient (17-26 percent).
Material point method modeling in oil and gas reservoirs
Vanderheyden, William Brian; Zhang, Duan
2016-06-28
A computer system and method of simulating the behavior of an oil and gas reservoir including changes in the margins of frangible solids. A system of equations including state equations such as momentum, and conservation laws such as mass conservation and volume fraction continuity, are defined and discretized for at least two phases in a modeled volume, one of which corresponds to frangible material. A material point model technique is used for numerically solving the system of discretized equations, to derive fluid flow at each of a plurality of mesh nodes in the modeled volume, and the velocity at each of a plurality of particles representing the frangible material in the modeled volume. A time-splitting technique improves the computational efficiency of the simulation while maintaining accuracy on the deformation scale. The method can be applied to derive accurate upscaled model equations for larger volume scale simulations.
Optimization of Time-Dependent Particle Tracing Using Tetrahedral Decomposition
NASA Technical Reports Server (NTRS)
Kenwright, David; Lane, David
1995-01-01
An efficient algorithm is presented for computing particle paths, streak lines and time lines in time-dependent flows with moving curvilinear grids. The integration, velocity interpolation and step-size control are all performed in physical space which avoids the need to transform the velocity field into computational space. This leads to higher accuracy because there are no Jacobian matrix approximations or expensive matrix inversions. Integration accuracy is maintained using an adaptive step-size control scheme which is regulated by the path line curvature. The problem of cell-searching, point location and interpolation in physical space is simplified by decomposing hexahedral cells into tetrahedral cells. This enables the point location to be done analytically and substantially faster than with a Newton-Raphson iterative method. Results presented show this algorithm is up to six times faster than particle tracers which operate on hexahedral cells yet produces almost identical particle trajectories.
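The analytical point location mentioned in this abstract can be illustrated with barycentric coordinates. The following is a minimal sketch, not the authors' code: the function names (`barycentric`, `contains`, `interpolate`) are hypothetical, and it assumes a single non-degenerate tetrahedron with a quantity stored at its four vertices. A point lies inside the cell when all four coordinates are non-negative, and the same coordinates serve as linear interpolation weights.

```python
def sub(a, b):
    """Component-wise difference of two 3-vectors."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def det3(a, b, c):
    """Scalar triple product a . (b x c), i.e. the 3x3 determinant."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def barycentric(p, v0, v1, v2, v3):
    """Barycentric coordinates of p in tetrahedron (v0, v1, v2, v3) via Cramer's rule."""
    e1, e2, e3, d = sub(v1, v0), sub(v2, v0), sub(v3, v0), sub(p, v0)
    vol = det3(e1, e2, e3)  # six times the signed volume
    if vol == 0:
        raise ValueError("degenerate tetrahedron")
    l1 = det3(d, e2, e3) / vol
    l2 = det3(e1, d, e3) / vol
    l3 = det3(e1, e2, d) / vol
    return (1.0 - l1 - l2 - l3, l1, l2, l3)

def contains(weights, tol=1e-12):
    """True when the point is inside or on the boundary of the cell."""
    return all(w >= -tol for w in weights)

def interpolate(weights, vertex_values):
    """Linear interpolation of a scalar stored at the four vertices."""
    return sum(w * u for w, u in zip(weights, vertex_values))

# Unit tetrahedron and a point at its centroid.
tet = ((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1))
w_in = barycentric((0.25, 0.25, 0.25), *tet)
w_out = barycentric((1.0, 1.0, 1.0), *tet)
```

No iteration is needed: the location test and the interpolation weights both come from four determinants, which is the reason a direct evaluation can be cheaper than a Newton-Raphson search in a hexahedral cell.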
Ding, Xuemei; Bucholc, Magda; Wang, Haiying; Glass, David H; Wang, Hui; Clarke, Dave H; Bjourson, Anthony John; Dowey, Le Roy C; O'Kane, Maurice; Prasad, Girijesh; Maguire, Liam; Wong-Lin, KongFatt
2018-06-27
There is currently a lack of an efficient, objective and systemic approach towards the classification of Alzheimer's disease (AD), due to its complex etiology and pathogenesis. As AD is inherently dynamic, it is also not clear how the relationships among AD indicators vary over time. To address these issues, we propose a hybrid computational approach for AD classification and evaluate it on the heterogeneous longitudinal AIBL dataset. Specifically, using clinical dementia rating as an index of AD severity, the most important indicators (mini-mental state examination, logical memory recall, grey matter and cerebrospinal volumes from MRI and active voxels from PiB-PET brain scans, ApoE, and age) can be automatically identified from parallel data mining algorithms. In this work, Bayesian network modelling across different time points is used to identify and visualize time-varying relationships among the significant features, and importantly, in an efficient way using only coarse-grained data. Crucially, our approach suggests key data features and their appropriate combinations that are relevant for AD severity classification with high accuracy. Overall, our study provides insights into AD developments and demonstrates the potential of our approach in supporting efficient AD diagnosis.
NASA Astrophysics Data System (ADS)
Peng, Haijun; Wang, Wei
2016-10-01
An adaptive surrogate model-based multi-objective optimization strategy that combines the benefits of invariant manifolds and low-thrust control toward developing a low-computational-cost transfer trajectory between libration orbits around the L1 and L2 libration points in the Sun-Earth system has been proposed in this paper. A new structure for a multi-objective transfer trajectory optimization model that divides the transfer trajectory into several segments and gives the dominations for invariant manifolds and low-thrust control in different segments has been established. To reduce the computational cost of multi-objective transfer trajectory optimization, a mixed sampling strategy-based adaptive surrogate model has been proposed. Numerical simulations show that the results obtained from the adaptive surrogate-based multi-objective optimization are in agreement with the results obtained using direct multi-objective optimization methods, and the computational workload of the adaptive surrogate-based multi-objective optimization is only approximately 10% of that of direct multi-objective optimization. Furthermore, the generating efficiency of the Pareto points of the adaptive surrogate-based multi-objective optimization is approximately 8 times that of the direct multi-objective optimization. Therefore, the proposed adaptive surrogate-based multi-objective optimization provides obvious advantages over direct multi-objective optimization methods.
Advanced display object selection methods for enhancing user-computer productivity
NASA Technical Reports Server (NTRS)
Osga, Glenn A.
1993-01-01
The User-Interface Technology Branch at NCCOSC RDT&E Division has been conducting a series of studies to address the suitability of commercial off-the-shelf (COTS) graphic user-interface (GUI) methods for efficiency and performance in critical naval combat systems. This paper presents an advanced selection algorithm and method developed to increase user performance when making selections on tactical displays. The method has also been applied with considerable success to a variety of cursor and pointing tasks. Typical GUIs allow user selection by: (1) moving a cursor with a pointing device such as a mouse, trackball, joystick, or touchscreen; and (2) placing the cursor on the object. Examples of GUI objects are the buttons, icons, folders, scroll bars, etc. used in many personal computer and workstation applications. This paper presents an improved method of selection and the theoretical basis for the significant performance gains achieved with various input devices tested. The method is applicable to all GUI styles and display sizes, and is particularly useful for selections on small screens such as notebook computers. Considering the number of work-hours spent pointing and clicking across all styles of available graphic user-interfaces, the cost/benefit in applying this method to graphic user-interfaces is substantial, with the potential for increasing productivity across thousands of users and applications.
Computationally-Efficient Minimum-Time Aircraft Routes in the Presence of Winds
NASA Technical Reports Server (NTRS)
Jardin, Matthew R.
2004-01-01
A computationally efficient algorithm for minimizing the flight time of an aircraft in a variable wind field has been invented. The algorithm, referred to as Neighboring Optimal Wind Routing (NOWR), is based upon neighboring-optimal-control (NOC) concepts and achieves minimum-time paths by adjusting aircraft heading according to wind conditions at an arbitrary number of wind measurement points along the flight route. The NOWR algorithm may either be used in a fast-time mode to compute minimum-time routes prior to flight, or may be used in a feedback mode to adjust aircraft heading in real-time. By traveling minimum-time routes instead of direct great-circle routes, flights across the United States can save an average of about 7 minutes, and as much as one hour of flight time during periods of strong jet-stream winds. The neighboring optimal routes computed via the NOWR technique have been shown to be within 1.5 percent of the absolute minimum-time routes for flights across the continental United States. On a typical 450-MHz Sun Ultra workstation, the NOWR algorithm produces complete minimum-time routes in less than 40 milliseconds. This corresponds to a rate of 25 optimal routes per second. The closest comparable optimization technique runs approximately 10 times slower. Airlines currently use various trial-and-error search techniques to determine which of a set of commonly traveled routes will minimize flight time. These algorithms are too computationally expensive for use in real-time systems, or in systems where many optimal routes need to be computed in a short amount of time. Instead of operating in real-time, airlines will typically plan a trajectory several hours in advance using wind forecasts. If winds change significantly from forecasts, the resulting flights will no longer be minimum-time. The need for a computationally efficient wind-optimal routing algorithm is even greater in the case of new air-traffic-control automation concepts.
For air-traffic-control automation, thousands of wind-optimal routes may need to be computed and checked for conflicts in just a few minutes. These factors motivated the need for a more efficient wind-optimal routing algorithm.
NASA Astrophysics Data System (ADS)
Yaparova, N.
2017-10-01
We consider the problem of heating a cylindrical body with an internal thermal source when the main characteristics of the material such as specific heat, thermal conductivity and material density depend on the temperature at each point of the body. We can control the surface temperature and the heat flow from the surface inside the cylinder, but it is impossible to measure the temperature on the axis and the initial temperature in the entire body. This problem is associated with the temperature measurement challenge and appears in non-destructive testing, in thermal monitoring of heat treatment and technical diagnostics of operating equipment. The mathematical model of heating is represented as a nonlinear parabolic PDE with an unknown initial condition. In this problem, both the Dirichlet and Neumann boundary conditions are given and it is required to calculate the temperature values at the internal points of the body. To solve this problem, we propose a numerical method based on finite-difference equations and a regularization technique. The computational scheme involves solving the problem at each spatial step. As a result, we obtain the temperature function at each internal point of the cylinder beginning from the surface down to the axis. The application of the regularization technique ensures the stability of the scheme and allows us to significantly simplify the computational procedure. We investigate the stability of the computational scheme and prove the dependence of the stability on the discretization steps and error level of the measurement results. To obtain the experimental temperature error estimates, computational experiments were carried out. The computational results are consistent with the theoretical error estimates and confirm the efficiency and reliability of the proposed computational scheme.
NASA Astrophysics Data System (ADS)
Takemiya, Tetsushi
In modern aerospace engineering, the physics-based computational design method is becoming more important, as it is more efficient than experiments and because it is more suitable in designing new types of aircraft (e.g., unmanned aerial vehicles or supersonic business jets) than the conventional design method, which heavily relies on historical data. To enhance the reliability of the physics-based computational design method, researchers have made tremendous efforts to improve the fidelity of models. However, high-fidelity models require longer computational time, so the advantage of efficiency is partially lost. This problem has been overcome with the development of variable fidelity optimization (VFO). In VFO, different fidelity models are simultaneously employed in order to improve the speed and the accuracy of convergence in an optimization process. Among the various types of VFO methods, one of the most promising methods is the approximation management framework (AMF). In the AMF, objective and constraint functions of a low-fidelity model are scaled at a design point so that the scaled functions, which are referred to as "surrogate functions," match those of a high-fidelity model. Since scaling functions and the low-fidelity model constitute the surrogate functions, evaluating the surrogate functions is faster than evaluating the high-fidelity model. Therefore, in the optimization process, in which gradient-based optimization is implemented and thus many function calls are required, the surrogate functions are used instead of the high-fidelity model to obtain a new design point. The best feature of the AMF is that it may converge to a local optimum of the high-fidelity model in much less computational time than optimization with the high-fidelity model alone.
However, through literature surveys and implementations of the AMF, the author found that (1) the AMF is very vulnerable when the computational analysis models have numerical noise, which is very common in high-fidelity models, and that (2) the AMF terminates optimization erroneously when the optimization problems have constraints. The first problem is due to inaccuracy in computing derivatives in the AMF, and the second problem is due to erroneous treatment of the trust region ratio, which sets the size of the domain for an optimization in the AMF. In order to solve the first problem of the AMF, the automatic differentiation (AD) technique, which reads the codes of analysis models and automatically generates new derivative codes based on some mathematical rules, is applied. If derivatives are computed with the generated derivative code, they are analytical, and the required computational time is independent of the number of design variables, which is very advantageous for realistic aerospace engineering problems. However, if analysis models implement iterative computations such as computational fluid dynamics (CFD), which solves systems of partial differential equations iteratively, computing derivatives through the AD requires a massive memory size. The author solved this deficiency by modifying the AD approach and developing a more efficient implementation with CFD, and successfully applied the AD to general CFD software. In order to solve the second problem of the AMF, the governing equation of the trust region ratio, which is very strict against the violation of constraints, is modified so that it can accept the violation of constraints within some tolerance. By accepting violations of constraints during the optimization process, the AMF can continue optimization without terminating prematurely and eventually find the true optimum design point.
With these modifications, the AMF is referred to as "Robust AMF," and it is applied to airfoil and wing aerodynamic design problems using Euler CFD software. The former problem has 21 design variables, and the latter 64. In both problems, derivatives computed with the proposed AD method are first compared with those computed with the finite differentiation (FD) method, and then, the Robust AMF is implemented along with the sequential quadratic programming (SQP) optimization method with only high-fidelity models. The proposed AD method computes derivatives more accurately and faster than the FD method, and the Robust AMF successfully optimizes shapes of the airfoil and the wing in a much shorter time than SQP with only high-fidelity models. These results clearly show the effectiveness of the Robust AMF. Finally, the feasibility of reducing computational time for calculating derivatives and the necessity of AMF with an optimum design point always in the feasible region are discussed as future work.
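The scaling idea behind the AMF can be sketched in one dimension. The abstract does not give the scaling formula, so the first-order additive correction used here is an assumption (multiplicative scaling is another common choice), and the model functions and the names `make_surrogate` and `trust_region_ratio` are hypothetical. The surrogate matches the high-fidelity value and slope at the design point, and the trust region ratio compares the actual high-fidelity improvement with the improvement the surrogate predicted.

```python
def make_surrogate(f_lo, df_lo, f_hi, df_hi, x0):
    """Build s(x) = f_lo(x) + a + b*(x - x0) with s(x0) = f_hi(x0), s'(x0) = f_hi'(x0).

    Assumption: additive first-order scaling; the source only says the
    low-fidelity functions are "scaled" to match the high-fidelity model.
    """
    a = f_hi(x0) - f_lo(x0)    # zeroth-order mismatch at the design point
    b = df_hi(x0) - df_lo(x0)  # first-order (slope) mismatch at the design point
    return lambda x: f_lo(x) + a + b * (x - x0)

def trust_region_ratio(f_hi, surrogate, x0, x_new):
    """Actual reduction of f_hi divided by the reduction predicted by the surrogate."""
    predicted = surrogate(x0) - surrogate(x_new)
    actual = f_hi(x0) - f_hi(x_new)
    return actual / predicted

# Hypothetical models: an "expensive" objective and a cheaper, mis-scaled one.
f_hi = lambda x: (x - 2.0) ** 2 + 1.0
df_hi = lambda x: 2.0 * (x - 2.0)
f_lo = lambda x: 2.0 * (x - 1.5) ** 2
df_lo = lambda x: 4.0 * (x - 1.5)

x0 = 0.0
s = make_surrogate(f_lo, df_lo, f_hi, df_hi, x0)
rho = trust_region_ratio(f_hi, s, x0, 1.0)  # candidate step from x0 = 0 to x = 1
```

In a trust region scheme, a ratio near or above one usually lets the region grow, while a small or negative ratio shrinks it and rejects the step. The exact thresholds, and the constraint tolerance the abstract describes, are not specified in the source.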
NASA Astrophysics Data System (ADS)
Zoka, Yoshifumi; Yorino, Naoto; Kawano, Koki; Suenari, Hiroyasu
This paper proposes a fast computation method for Available Transfer Capability (ATC) with respect to thermal and voltage magnitude limits. In the paper, ATC is formulated as an optimization problem. To make the N-1 outage contingency calculations efficient, linear sensitivity methods are applied for screening and ranking all contingency selections with respect to the thermal and voltage magnitude limit margins to identify the severest case. In addition, homotopy functions are used for the generator QV constraints to reduce the maximum error of the linear estimation. Then, the Primal-Dual Interior Point Method (PDIPM) is used to solve the optimization problem for the severest case only, so that the solutions of ATC can be obtained efficiently. The effectiveness of the proposed method is demonstrated through the IEEE 30-, 57-, and 118-bus systems.
An Efficient Solution Method for Multibody Systems with Loops Using Multiple Processors
NASA Technical Reports Server (NTRS)
Ghosh, Tushar K.; Nguyen, Luong A.; Quiocho, Leslie J.
2015-01-01
This paper describes a multibody dynamics algorithm formulated for parallel implementation on multiprocessor computing platforms using the divide-and-conquer approach. The system of interest is a general topology of rigid and elastic articulated bodies with or without loops. The algorithm divides the multibody system into a number of smaller sets of bodies in chain or tree structures, called "branches" at convenient joints called "connection points", and uses an Order-N (O (N)) approach to formulate the dynamics of each branch in terms of the unknown spatial connection forces. The equations of motion for the branches, leaving the connection forces as unknowns, are implemented in separate processors in parallel for computational efficiency, and the equations for all the unknown connection forces are synthesized and solved in one or several processors. The performances of two implementations of this divide-and-conquer algorithm in multiple processors are compared with an existing method implemented on a single processor.
NASA Astrophysics Data System (ADS)
Moreno, Javier; Somolinos, Álvaro; Romero, Gustavo; González, Iván; Cátedra, Felipe
2017-08-01
A method for the rigorous computation of the electromagnetic scattering of large dielectric volumes is presented. One goal is to simplify the analysis of large dielectric targets with translational symmetries by taking advantage of their Toeplitz symmetry. The matrix-fill stage of the Method of Moments is then carried out efficiently because the number of coupling terms to compute is reduced. The Multilevel Fast Multipole Method is applied to solve the problem. Structured meshes are obtained efficiently to approximate the dielectric volumes. The regular mesh grid is achieved by using parallelepipeds whose centres have been identified as internal to the target. The ray casting algorithm is used to classify the parallelepiped centres. It may become a bottleneck when too many points are evaluated in volumes defined by parametric surfaces, so a hierarchical algorithm is proposed to minimize the number of evaluations. Measurements and analytical results are included for validation purposes.
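The ray casting classification of cell centres can be illustrated in a reduced setting. This sketch is a two-dimensional analogue with a polygonal boundary, which is an assumption made for brevity: the paper works with volumes bounded by parametric surfaces, and the names `inside` and `classify_centres` are hypothetical. A centre is internal when a ray from it crosses the boundary an odd number of times.

```python
def inside(point, polygon):
    """Even-odd ray casting: cast a ray in +x and count boundary crossings."""
    x, y = point
    crossings_odd = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                crossings_odd = not crossings_odd
    return crossings_odd

def classify_centres(polygon, x_min, y_min, step, nx, ny):
    """Return the centres of the regular grid cells that fall inside the polygon."""
    kept = []
    for j in range(ny):
        for i in range(nx):
            centre = (x_min + (i + 0.5) * step, y_min + (j + 0.5) * step)
            if inside(centre, polygon):
                kept.append(centre)
    return kept

square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
# A 4 x 4 grid of unit cells covering [-1, 3] x [-1, 3]; only 4 centres lie in the square.
centres = classify_centres(square, -1.0, -1.0, 1.0, 4, 4)
```

Every centre costs one pass over the boundary, which is why the number of evaluated points matters and why a hierarchical scheme that classifies whole blocks of cells at once can reduce the work.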
Computational Aerothermodynamics in Aeroassist Applications
NASA Technical Reports Server (NTRS)
Gnoffo, Peter A.
2001-01-01
Aeroassisted planetary entry uses atmospheric drag to decelerate spacecraft from super-orbital to orbital or suborbital velocities. Numerical simulation of flow fields surrounding these spacecraft during hypersonic atmospheric entry is required to define aerothermal loads. The severe compression in the shock layer in front of the vehicle and subsequent, rapid expansion into the wake are characterized by high temperature, thermo-chemical nonequilibrium processes. Implicit algorithms required for efficient, stable computation of the governing equations involving disparate time scales of convection, diffusion, chemical reactions, and thermal relaxation are discussed. Robust point-implicit strategies are utilized in the initialization phase; less robust but more efficient line-implicit strategies are applied in the endgame. Applications to ballutes (balloon-like decelerators) in the atmospheres of Venus, Mars, Titan, Saturn, and Neptune and a Mars Sample Return Orbiter (MSRO) are featured. Examples are discussed where time-accurate simulation is required to achieve a steady-state solution.
Efficient computation paths for the systematic analysis of sensitivities
NASA Astrophysics Data System (ADS)
Greppi, Paolo; Arato, Elisabetta
2013-01-01
A systematic sensitivity analysis requires computing the model on all points of a multi-dimensional grid covering the domain of interest, defined by the ranges of variability of the inputs. The main issues in performing such analyses efficiently on algebraic models are handling solution failures within and close to the feasible region and minimizing the total iteration count. Scanning the domain in the obvious order is sub-optimal in terms of total iterations and is likely to cause many solution failures. The problem of choosing a better order can be translated geometrically into finding Hamiltonian paths on certain grid graphs. This work proposes two paths, one based on a mixed-radix Gray code and the other, a quasi-spiral path, produced by a novel heuristic algorithm. Some simple, easy-to-visualize examples are presented, followed by performance results for the quasi-spiral algorithm and the practical application of the different paths in a process simulation tool.
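A mixed-radix Gray code path of the kind mentioned in this abstract can be sketched as follows. This is a generic reflected Gray code, not necessarily the exact variant used by the authors, and the name `gray_path` is hypothetical. Each grid point is visited exactly once, and consecutive points differ by one step in a single input, so every solve can start from a neighbouring converged solution.

```python
def gray_path(radices):
    """Yield every index tuple of a grid with the given sizes in reflected Gray order.

    Consecutive tuples differ by +1 or -1 in exactly one coordinate, so the
    sequence is a Hamiltonian path on the grid graph.
    """
    if not radices:
        yield ()
        return
    sub_path = list(gray_path(radices[1:]))
    for i in range(radices[0]):
        # Reverse the inner sweep on every other pass so the path never jumps.
        ordered = sub_path if i % 2 == 0 else reversed(sub_path)
        for tail in ordered:
            yield (i,) + tail

# A grid with 3 x 2 x 4 levels of three hypothetical input variables.
path = list(gray_path((3, 2, 4)))
```

By contrast, the obvious nested-loop order resets the inner indices to zero at the end of every sweep, which moves several inputs at once and gives the solver a poor initial guess.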
Regional agriculture surveys using ERTS-1 data
NASA Technical Reports Server (NTRS)
Draeger, W. C.; Nichols, J. D.; Benson, A. S.; Larrabee, D. G.; Jenkus, W. M.; Hay, C. M.
1974-01-01
The Center for Remote Sensing Research has conducted studies designed to evaluate the potential application of ERTS data in performing agricultural inventories, and to develop efficient methods of data handling and analysis useful in the operational context for performing large area surveys. This work has resulted in the development of an integrated system utilizing both human and computer analysis of ground, aerial, and space imagery, which has been shown to be very efficient for regional crop acreage inventories. The technique involves: (1) the delineation of ERTS images into relatively homogeneous strata by human interpreters; (2) the point-by-point classification of the area within each stratum on the basis of crop type using a human/machine interactive digital image processing system; and (3) a multistage sampling procedure for the collection of supporting aerial and ground data used in the adjustment and verification of the classification results.
An interior-point method-based solver for simulation of aircraft parts riveting
NASA Astrophysics Data System (ADS)
Stefanova, Maria; Yakunin, Sergey; Petukhova, Margarita; Lupuleac, Sergey; Kokkolaras, Michael
2018-05-01
The particularities of the aircraft parts riveting process simulation necessitate the solution of a large number of contact problems. A primal-dual interior-point method-based solver is proposed for solving such problems efficiently. The proposed method features a worst-case polynomial complexity bound on the number of iterations, expressed in terms of n and ε, where n is the dimension of the problem and ε is a threshold related to the desired accuracy. In practice, the convergence is often faster than this worst-case bound, which makes the method applicable to large-scale problems. The computational challenge is solving the system of linear equations because the associated matrix is ill-conditioned. To that end, the authors introduce a preconditioner and a strategy for determining effective initial guesses based on the physics of the problem. Numerical results are compared with ones obtained using the Goldfarb-Idnani algorithm. The results demonstrate the efficiency of the proposed method.
Tavares, Ana Beatriz M L A; Lima Neto, José X; Fulco, Umberto L; Albuquerque, Eudenilson L
2018-01-30
Much of the recent excitement in the cancer immunotherapy approach has been generated by the recognition that immune checkpoint proteins, like the receptor PD-1, can be blocked by antibody-based drugs with profound effects. Promising clinical data have already been released pointing to the efficiency of the drug pembrolizumab to block the PD-1 pathway, triggering the T-lymphocytes to destroy the cancer cells. Thus, a deep understanding of this drug/receptor complex is essential for the improvement of new drugs targeting the protein PD-1. In this context, by employing quantum chemistry methods based on the Density Functional Theory (DFT), we investigate in silico the binding energy features of the receptor PD-1 in complex with its drug inhibitor. Our computational results give a better understanding of the binding mechanisms, being also an efficient alternative towards the development of antibody-based drugs, pointing to new treatments for cancer therapy.
Spatial adaptation procedures on tetrahedral meshes for unsteady aerodynamic flow calculations
NASA Technical Reports Server (NTRS)
Rausch, Russ D.; Batina, John T.; Yang, Henry T. Y.
1993-01-01
Spatial adaptation procedures for the accurate and efficient solution of steady and unsteady inviscid flow problems are described. The adaptation procedures were developed and implemented within a three-dimensional, unstructured-grid, upwind-type Euler code. These procedures involve mesh enrichment and mesh coarsening to either add points in high gradient regions of the flow or remove points where they are not needed, respectively, to produce solutions of high spatial accuracy at minimal computational cost. A detailed description of the enrichment and coarsening procedures is presented, and comparisons with experimental data for an ONERA M6 wing and an exact solution for a shock-tube problem are given to provide an assessment of the accuracy and efficiency of the capability. Steady and unsteady results, obtained using spatial adaptation procedures, are shown to be of high spatial accuracy, primarily in that discontinuities such as shock waves are captured very sharply.
Grover, Ginni; DeLuca, Keith; Quirin, Sean; DeLuca, Jennifer; Piestun, Rafael
2012-01-01
Super-resolution imaging with photo-activatable or photo-switchable probes is a promising tool in biological applications to reveal previously unresolved intra-cellular details with visible light. This field benefits from developments in the areas of molecular probes, optical systems, and computational post-processing of the data. The joint design of optics and reconstruction processes using double-helix point spread functions (DH-PSF) provides high resolution three-dimensional (3D) imaging over a long depth-of-field. We demonstrate for the first time a method integrating a Fisher information efficient DH-PSF design, a surface relief optical phase mask, and an optimal 3D localization estimator. 3D super-resolution imaging using photo-switchable dyes reveals the 3D microtubule network in mammalian cells with localization precision approaching the information theoretical limit over a depth of 1.2 µm. PMID:23187521
Exploring the Feasibility of a DNA Computer: Design of an ALU Using Sticker-Based DNA Model.
Sarkar, Mayukh; Ghosal, Prasun; Mohanty, Saraju P
2017-09-01
Since its inception, DNA computing has advanced to offer an extremely powerful, energy-efficient emerging technology for solving hard computational problems with its inherent massive parallelism and extremely high data density. This would be much more powerful and general purpose when combined with the well-known algorithmic solutions that exist for conventional computing architectures, using a suitable ALU. Thus, a specifically designed DNA Arithmetic and Logic Unit (ALU) that can address operations suitable for both domains can bridge the gap between the two. An ALU must be able to perform all possible logic operations, including NOT, OR, AND, XOR, NOR, NAND, and XNOR; compare and shift operations; and integer and floating point arithmetic operations (addition, subtraction, multiplication, and division). In this paper, the design of an ALU is proposed using the sticker-based DNA model, with an experimental feasibility analysis. The novelties of this paper are manifold. First, the integer arithmetic operations performed here are 2's complement arithmetic, and the floating point operations follow the IEEE 754 floating point format, closely resembling a conventional ALU. Also, the output of each operation can be reused for any subsequent operation. So any algorithm or program logic that users can think of can be implemented directly on the DNA computer without any modification. Second, once the basic operations of the sticker model can be automated, the implementations proposed in this paper become highly suitable for designing a fully automated ALU. Third, the proposed approaches are easy to implement. Finally, these approaches can work on sufficiently large binary numbers.
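The 2's complement integer arithmetic named in this abstract can be sketched at the bit level. The code below is an ordinary software ripple-carry adder, not the sticker-model procedure itself. The function names and the 8-bit word size are assumptions chosen for illustration. Bits are stored least significant first, loosely mirroring how a bit-string model would process one position at a time.

```python
def to_bits(value, width=8):
    """Encode an integer as a list of bits (least significant first), 2's complement."""
    return [(value >> i) & 1 for i in range(width)]

def from_bits(bits):
    """Decode a 2's complement bit list back into a Python integer."""
    value = sum(bit << i for i, bit in enumerate(bits))
    if bits[-1]:  # sign bit set: the pattern represents a negative number
        value -= 1 << len(bits)
    return value

def ripple_add(a_bits, b_bits):
    """Add two equal-width bit lists with XOR/AND/OR gates; the final carry is dropped."""
    result, carry = [], 0
    for a, b in zip(a_bits, b_bits):
        result.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return result

def negate(bits):
    """2's complement negation: invert every bit, then add one."""
    inverted = [1 - b for b in bits]
    return ripple_add(inverted, to_bits(1, len(bits)))

total = from_bits(ripple_add(to_bits(5), to_bits(-3)))
wrapped = from_bits(ripple_add(to_bits(100), to_bits(100)))  # overflows 8 bits
```

Because subtraction reduces to `ripple_add(a, negate(b))`, a single adder built from basic logic operations covers both operations, which is one reason 2's complement is the usual choice for an ALU.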
Slicing Method for curved façade and window extraction from point clouds
NASA Astrophysics Data System (ADS)
Iman Zolanvari, S. M.; Laefer, Debra F.
2016-09-01
Laser scanning technology is a fast and reliable method to survey structures. However, the automatic conversion of such data into solid models for computation remains a major challenge, especially where non-rectilinear features are present. Since openings and the overall dimensions of the buildings are the most critical elements in computational models for structural analysis, this article introduces the Slicing Method as a new, computationally-efficient method for extracting overall façade and window boundary points for reconstructing a façade into a geometry compatible with computational modelling. After finding a principal plane, the technique slices a façade into limited portions, with each slice representing a unique, imaginary section passing through a building. This is done along a façade's principal axes to segregate window and door openings from structural portions of the load-bearing masonry walls. The method detects each opening area's boundaries, as well as the overall boundary of the façade, in part, by using a one-dimensional projection to accelerate processing. Slices were optimised as 14.3 slices per vertical metre of building and 25 slices per horizontal metre of building, irrespective of building configuration or complexity. The proposed procedure was validated by its application to three highly decorative, historic brick buildings. Accuracy in excess of 93% was achieved with no manual intervention on highly complex buildings and nearly 100% on simple ones. Furthermore, computational times were less than 3 sec for data sets up to 2.6 million points, while similar existing approaches required more than 16 hr for such datasets.
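The idea of using a one-dimensional projection to find openings can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the function name `find_openings`, the `min_gap` threshold, and the sample coordinates are all hypothetical. Each slice is reduced to the sorted positions of its points along one axis, and any unusually wide gap between neighbouring points is reported as a candidate opening.

```python
def find_openings(xs, min_gap):
    """Return (start, end) intervals where consecutive projected points
    in one slice are farther apart than min_gap (hypothetical threshold)."""
    ordered = sorted(xs)
    openings = []
    for left, right in zip(ordered, ordered[1:]):
        if right - left > min_gap:
            openings.append((left, right))
    return openings


# One horizontal slice through a wall: points every 0.1 m, with a
# window between x = 1.0 m and x = 2.0 m where no masonry was scanned.
slice_xs = [i / 10 for i in range(0, 11)] + [i / 10 for i in range(20, 31)]
openings = find_openings(slice_xs, min_gap=0.5)
print(openings)
```

Sorting dominates the cost per slice, which is one plausible reason a one-dimensional projection is cheap compared with searching the full point cloud.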
Floating-to-Fixed-Point Conversion for Digital Signal Processors
NASA Astrophysics Data System (ADS)
Menard, Daniel; Chillet, Daniel; Sentieys, Olivier
2006-12-01
Digital signal processing applications are specified with floating-point data types but they are usually implemented in embedded systems with fixed-point arithmetic to minimise cost and power consumption. Thus, methodologies which establish automatically the fixed-point specification are required to reduce the application time-to-market. In this paper, a new methodology for the floating-to-fixed point conversion is proposed for software implementations. The aim of our approach is to determine the fixed-point specification which minimises the code execution time for a given accuracy constraint. Compared to previous methodologies, our approach takes into account the DSP architecture to optimise the fixed-point formats and the floating-to-fixed-point conversion process is coupled with the code generation process. The fixed-point data types and the position of the scaling operations are optimised to reduce the code execution time. To evaluate the fixed-point computation accuracy, an analytical approach is used to reduce the optimisation time compared to the existing methods based on simulation. The methodology stages are described and several experiment results are presented to underline the efficiency of this approach.
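As a rough illustration of what a fixed-point specification involves, the sketch below quantizes floating-point values to a signed 16-bit format with 15 fractional bits (commonly called Q15) and multiplies them with an explicit scaling shift. The helper names `to_fixed`, `to_float`, and `fixed_mul`, and the choice of Q15, are assumptions made for this example; the methodology in the abstract chooses formats and scaling positions automatically, which is not attempted here.

```python
def to_fixed(value, frac_bits, word_bits=16):
    """Quantize a float to a signed fixed-point integer with saturation."""
    scaled = round(value * (1 << frac_bits))
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    return max(lo, min(hi, scaled))


def to_float(fixed, frac_bits):
    """Convert a fixed-point integer back to a float."""
    return fixed / (1 << frac_bits)


def fixed_mul(a, b, frac_bits):
    """Multiply two fixed-point values; the right shift is the scaling operation."""
    return (a * b) >> frac_bits


FRAC = 15  # Q15: 1 sign bit, 15 fractional bits
a = to_fixed(0.5, FRAC)
b = to_fixed(-0.25, FRAC)
product = fixed_mul(a, b, FRAC)
print(a, b, product, to_float(product, FRAC))
```

Every shift of this kind costs instructions on a real DSP, which is why where the scaling operations are placed affects execution time as well as accuracy.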
Nonvolatile Memory Materials for Neuromorphic Intelligent Machines.
Jeong, Doo Seok; Hwang, Cheol Seong
2018-04-18
Recent progress in deep learning extends the capability of artificial intelligence to various practical tasks, making the deep neural network (DNN) an extremely versatile hypothesis. While such DNN is virtually built on contemporary data centers of the von Neumann architecture, physical (in part) DNN of non-von Neumann architecture, also known as neuromorphic computing, can remarkably improve learning and inference efficiency. Particularly, resistance-based nonvolatile random access memory (NVRAM) highlights its handy and efficient application to the multiply-accumulate (MAC) operation in an analog manner. Here, an overview is given of the available types of resistance-based NVRAMs and their technological maturity from the material- and device-points of view. Examples within the strategy are subsequently addressed in comparison with their benchmarks (virtual DNN in deep learning). A spiking neural network (SNN) is another type of neural network that is more biologically plausible than the DNN. The successful incorporation of resistance-based NVRAM in SNN-based neuromorphic computing offers an efficient solution to the MAC operation and spike timing-based learning in nature. This strategy is exemplified from a material perspective. Intelligent machines are categorized according to their architecture and learning type. Also, the functionality and usefulness of NVRAM-based neuromorphic computing are addressed. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
NASA Astrophysics Data System (ADS)
Song, Seok Goo; Kwak, Sangmin; Lee, Kyungbook; Park, Donghee
2017-04-01
Predicting the intensity and variability of strong ground motions is a critical element of seismic hazard assessment. The characteristics and variability of the earthquake rupture process may be a dominant factor in determining the intensity and variability of near-source strong ground motions. Song et al. (2014) demonstrated that the variability of earthquake rupture scenarios could be effectively quantified in the framework of 1-point and 2-point statistics of earthquake source parameters, constrained by rupture dynamics and past events. The developed pseudo-dynamic source modeling schemes were also validated against the recorded ground motion data of past events and empirical ground motion prediction equations (GMPEs) at the broadband platform (BBP) developed by the Southern California Earthquake Center (SCEC). Recently we improved the computational efficiency of the developed pseudo-dynamic source-modeling scheme by adopting the nonparametric co-regionalization algorithm, initially introduced and applied in geostatistics. We also investigated the effect of the earthquake rupture process on near-source ground motion characteristics in the framework of 1-point and 2-point statistics, particularly focusing on the forward directivity region. Finally we will discuss whether the pseudo-dynamic source modeling can reproduce the variability (standard deviation) of empirical GMPEs and the efficiency of 1-point and 2-point statistics to address the variability of ground motions.
Circling motion and screen edges as an alternative input method for on-screen target manipulation.
Ka, Hyun W; Simpson, Richard C
2017-04-01
To investigate a new alternative interaction method, called circling interface, for manipulating on-screen objects. To specify a target, the user makes a circling motion around the target. To specify a desired pointing command with the circling interface, each edge of the screen is used. The user selects a command before circling the target. To evaluate the circling interface, we conducted an experiment with 16 participants, comparing the performance on pointing tasks with different combinations of selection method (circling interface, physical mouse and dwelling interface) and input device (normal computer mouse, head pointer and joystick mouse emulator). A circling interface is compatible with many types of pointing devices, not requiring physical activation of mouse buttons, and is more efficient than dwell-clicking. Across all common pointing operations, the circling interface had a tendency to produce faster performance with a head-mounted mouse emulator than with a joystick mouse. The performance accuracy of the circling interface outperformed the dwelling interface. It was demonstrated that the circling interface has the potential as another alternative pointing method for selecting and manipulating objects in a graphical user interface. Implications for Rehabilitation A circling interface will improve clinical practice by providing an alternative pointing method that does not require physically activating mouse buttons and is more efficient than dwell-clicking. The Circling interface can also work with AAC devices.
Efficient grid-based techniques for density functional theory
NASA Astrophysics Data System (ADS)
Rodriguez-Hernandez, Juan Ignacio
Understanding the chemical and physical properties of molecules and materials at a fundamental level often requires quantum-mechanical models for these substances' electronic structure. This type of many-body quantum mechanics calculation is computationally demanding, hindering its application to substances with more than a few hundred atoms. The supreme goal of much research in quantum chemistry, and the topic of this dissertation, is to develop more efficient computational algorithms for electronic structure calculations. In particular, this dissertation develops two new numerical integration techniques for computing molecular and atomic properties within conventional Kohn-Sham-Density Functional Theory (KS-DFT) of molecular electronic structure. The first of these grid-based techniques is based on the transformed sparse grid construction. In this construction, a sparse grid is generated in the unit cube and then mapped to real space according to the pro-molecular density using the conditional distribution transformation. The transformed sparse grid was implemented in program deMon2k, where it is used as the numerical integrator for the exchange-correlation energy and potential in the KS-DFT procedure. We tested our grid by computing ground state energies, equilibrium geometries, and atomization energies. The accuracy on these test calculations shows that our grid is more efficient than some previous integration methods: our grids use fewer points to obtain the same accuracy. The transformed sparse grids were also tested for integrating, interpolating and differentiating in different dimensions (n = 1,2,3,6). The second technique is a grid-based method for computing atomic properties within QTAIM. It was also implemented in deMon2k. The performance of the method was tested by computing QTAIM atomic energies, charges, dipole moments, and quadrupole moments. For medium accuracy, our method is the fastest one we know of.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, Y; Tan, J; Jiang, S
Purpose: High dose rate (HDR) brachytherapy treatment planning is conventionally performed in a manual fashion. Yet it is highly desirable to perform computerized automated planning to improve treatment planning efficiency, eliminate human errors, and reduce plan quality variation. The goal of this research is to develop an automatic treatment planning tool for HDR brachytherapy with a cylinder applicator for vaginal cancer. Methods: After inserting the cylinder applicator into the patient, a CT scan was acquired and was loaded to an in-house developed treatment planning software. The cylinder applicator was automatically segmented using image-processing techniques. CTV was generated based on user-specified treatment depth and length. Locations of relevant points (apex point, prescription point, and vaginal surface point), central applicator channel coordinates, and dwell positions were determined according to their geometric relations with the applicator. Dwell time was computed through an inverse optimization process. The planning information was written into DICOM-RT plan and structure files to transfer the automatically generated plan to a commercial treatment planning system for plan verification and delivery. Results: We have tested the system retrospectively in nine patients treated with vaginal cylinder applicator. These cases were selected with different treatment prescriptions, lengths, depths, and cylinder diameters to represent a large patient population. Our system was able to generate treatment plans for these cases with clinically acceptable quality. Computation time varied from 3–6 min. Conclusion: We have developed a system to perform automated treatment planning for HDR brachytherapy with a cylinder applicator. Such a novel system has greatly improved treatment planning efficiency and reduced plan quality variation. It also served as a testbed to demonstrate the feasibility of automatic HDR treatment planning for more complicated cases.
Efficiency and security problems of anonymous key agreement protocol based on chaotic maps
NASA Astrophysics Data System (ADS)
Yoon, Eun-Jun
2012-07-01
In 2011, Niu-Wang proposed an anonymous key agreement protocol based on chaotic maps in [Niu Y, Wang X. An anonymous key agreement protocol based on chaotic maps. Commun Nonlinear Sci Simulat 2011;16(4):1986-92]. Niu-Wang's protocol not only achieves session key agreement between a server and a user, but also allows the user to anonymously interact with the server. Nevertheless, this paper points out that Niu-Wang's protocol has the following efficiency and security problems: (1) The protocol has a computational efficiency problem when a trusted third party decrypts the message sent by the user. (2) The protocol is vulnerable to a Denial of Service (DoS) attack based on illegal message modification by an attacker.
Radio-frequency measurement in semiconductor quantum computation
NASA Astrophysics Data System (ADS)
Han, TianYi; Chen, MingBo; Cao, Gang; Li, HaiOu; Xiao, Ming; Guo, GuoPing
2017-05-01
Semiconductor quantum dots have attracted wide interest for the potential realization of quantum computation. To realize efficient quantum computation, fast manipulation and the corresponding readout are necessary. In the past few decades, considerable progress in quantum manipulation has been achieved experimentally. To meet the requirements of high-speed readout, radio-frequency (RF) measurement has been developed in recent years, such as RF-QPC (radio-frequency quantum point contact) and RF-DGS (radio-frequency dispersive gate sensor). Here we specifically demonstrate the principle of radio-frequency reflectometry, then review the development and applications of RF measurement, which provides a feasible way to achieve high-bandwidth readout in quantum coherent control and also enriches the methods to study these artificial mesoscopic quantum systems. Finally, we discuss the prospects for radio-frequency reflectometry in scaling up quantum computing models.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sharma, D; Badano, A; Sempau, J
Purpose: Variance reduction techniques (VRTs) are employed in Monte Carlo simulations to obtain estimates with reduced statistical uncertainty for a given simulation time. In this work, we study the bias and efficiency of a VRT for estimating the response of imaging detectors. Methods: We implemented Directed Sampling (DS), preferentially directing a fraction of emitted optical photons directly towards the detector by altering the isotropic model. The weight of each optical photon is appropriately modified to maintain simulation estimates unbiased. We use a Monte Carlo tool called fastDETECT2 (part of the hybridMANTIS open-source package) for optical transport, modified for VRT. The weight of each photon is calculated as the ratio of original probability (no VRT) and the new probability for a particular direction. For our analysis of bias and efficiency, we use pulse height spectra, point response functions, and Swank factors. We obtain results for a variety of cases including analog (no VRT, isotropic distribution), and DS with 0.2 and 0.8 optical photons directed towards the sensor plane. We used 10,000, 25-keV primaries. Results: The Swank factor for all cases in our simplified model converged fast (within the first 100 primaries) to a stable value of 0.9. The root mean square error per pixel for DS VRT for the point response function between analog and VRT cases was approximately 5e-4. Conclusion: Our preliminary results suggest that DS VRT does not affect the estimate of the mean for the Swank factor. Our findings indicate that it may be possible to design VRTs for imaging detector simulations to increase computational efficiency without introducing bias.
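The weighting rule described above (original probability divided by new probability for the sampled direction) can be illustrated with a toy model. The sketch below is not fastDETECT2 and makes several simplifying assumptions: a photon direction is reduced to mu = cos(theta), isotropic emission means mu is uniform on [-1, 1], and the "detector" is simply the cone mu >= mu0. The function names and the value of mu0 are hypothetical.

```python
import random


def analog_estimate(n, mu0, rng):
    """Isotropic emission: mu = cos(theta) is uniform on [-1, 1]."""
    hits = sum(1 for _ in range(n) if rng.uniform(-1.0, 1.0) >= mu0)
    return hits / n


def directed_estimate(n, mu0, fraction, rng):
    """Send `fraction` of photons into the cone mu >= mu0, reweighting each."""
    p_orig = 0.5  # isotropic probability density in mu
    total = 0.0
    for _ in range(n):
        if rng.random() < fraction:
            mu = rng.uniform(mu0, 1.0)    # directed toward the detector
        else:
            mu = rng.uniform(-1.0, 1.0)   # unmodified isotropic emission
        if mu >= mu0:
            # Density of the biased (mixture) distribution inside the cone.
            p_new = (1.0 - fraction) * 0.5 + fraction / (1.0 - mu0)
            total += p_orig / p_new       # weighted score for a detector hit
    return total / n


rng = random.Random(0)
mu0 = 0.9
exact = (1.0 - mu0) / 2.0                 # true probability of reaching the detector
analog = analog_estimate(20000, mu0, rng)
directed = directed_estimate(20000, mu0, 0.8, rng)
print(exact, analog, directed)
```

Both estimators target the same mean, but the directed one scores on most histories with a small weight, so its statistical scatter is much lower for the same number of photons.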
NASA Astrophysics Data System (ADS)
Xin, Meiting; Li, Bing; Yan, Xiao; Chen, Lei; Wei, Xiang
2018-02-01
A robust coarse-to-fine registration method based on the backpropagation (BP) neural network and shift window technology is proposed in this study. Specifically, there are three steps: coarse alignment between the model data and measured data, data simplification based on the BP neural network and point reservation in the contour region of point clouds, and fine registration with the reweighted iterative closest point algorithm. In the process of rough alignment, the initial rotation matrix and the translation vector between the two datasets are obtained. After performing subsequent simplification operations, the number of points can be reduced greatly. Therefore, the time and space complexity of the accurate registration can be significantly reduced. The experimental results show that the proposed method improves the computational efficiency without loss of accuracy.
Assessment of Preconditioner for a USM3D Hierarchical Adaptive Nonlinear Method (HANIM) (Invited)
NASA Technical Reports Server (NTRS)
Pandya, Mohagna J.; Diskin, Boris; Thomas, James L.; Frink, Neal T.
2016-01-01
Enhancements to the previously reported mixed-element USM3D Hierarchical Adaptive Nonlinear Iteration Method (HANIM) framework have been made to further improve robustness, efficiency, and accuracy of computational fluid dynamic simulations. The key enhancements include a multi-color line-implicit preconditioner, a discretely consistent symmetry boundary condition, and a line-mapping method for the turbulence source term discretization. The USM3D iterative convergence for the turbulent flows is assessed on four configurations. The configurations include a two-dimensional (2D) bump-in-channel, the 2D NACA 0012 airfoil, a three-dimensional (3D) bump-in-channel, and a 3D hemisphere cylinder. The Reynolds Averaged Navier Stokes (RANS) solutions have been obtained using a Spalart-Allmaras turbulence model and families of uniformly refined nested grids. Two types of HANIM solutions using line- and point-implicit preconditioners have been computed. Additional solutions using the point-implicit preconditioner alone (PA) method that broadly represents the baseline solver technology have also been computed. The line-implicit HANIM shows superior iterative convergence in most cases with progressively increasing benefits on finer grids.
Swirling Flow Computation at the Trailing Edge of Radial-Axial Hydraulic Turbines
NASA Astrophysics Data System (ADS)
Susan-Resiga, Romeo; Muntean, Sebastian; Popescu, Constantin
2016-11-01
Modern hydraulic turbines require optimized runners within a range of operating points with respect to minimum weighted average draft tube losses and/or flow instabilities. Tractable optimization methodologies must include realistic estimations of the swirling flow exiting the runner and further ingested by the draft tube, prior to runner design. The paper presents a new mathematical model and the associated numerical algorithm for computing the swirling flow at the trailing edge of a Francis turbine runner, operated at arbitrary discharge. The general turbomachinery throughflow theory is particularized for an arbitrary hub-to-shroud line in the meridian half-plane and the resulting boundary value problem is solved with the finite element method. The results obtained with the present model are validated against full 3D runner flow computations within a range of discharge values. The mathematical model incorporates the full information for the relative flow direction, as well as the curvatures of the hub-to-shroud line and meridian streamlines, respectively. It is shown that the flow direction can be frozen within a range of operating points in the neighborhood of the best efficiency regime.
A New Approach for Constructing Highly Stable High Order CESE Schemes
NASA Technical Reports Server (NTRS)
Chang, Sin-Chung
2010-01-01
A new approach is devised to construct high order CESE schemes which would avoid the common shortcomings of traditional high order schemes including: (a) susceptibility to computational instabilities; (b) computational inefficiency due to their local implicit nature (i.e., at each mesh point, one must solve a system of linear/nonlinear equations involving all the mesh variables associated with this mesh point); (c) use of large and elaborate stencils which complicates boundary treatments and also makes efficient parallel computing much harder; (d) difficulties in applications involving complex geometries; and (e) use of problem-specific techniques which are needed to overcome stability problems but often cause undesirable side effects. In fact it will be shown that, with the aid of a conceptual leap, one can build from a given 2nd-order CESE scheme its 4th-, 6th-, 8th-,... order versions which have the same stencil and same stability conditions of the 2nd-order scheme, and also retain all other advantages of the latter scheme. A sketch of multidimensional extensions will also be provided.
Efficient Implementation of an Optimal Interpolator for Large Spatial Data Sets
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.
2007-01-01
Scattered data interpolation is a problem of interest in numerous areas such as electronic imaging, smooth surface modeling, and computational geometry. Our motivation arises from applications in geology and mining, which often involve large scattered data sets and a demand for high accuracy. The method of choice is ordinary kriging. This is because it is a best unbiased estimator. Unfortunately, this interpolant is computationally very expensive to compute exactly. For n scattered data points, computing the value of a single interpolant involves solving a dense linear system of size roughly n x n. This is infeasible for large n. In practice, kriging is solved approximately by local approaches that are based on considering only a relatively small number of points that lie close to the query point. There are many problems with this local approach, however. The first is that determining the proper neighborhood size is tricky, and is usually solved by ad hoc methods such as selecting a fixed number of nearest neighbors or all the points lying within a fixed radius. Such fixed neighborhood sizes may not work well for all query points, depending on local density of the point distribution. Local methods also suffer from the problem that the resulting interpolant is not continuous. Meyer showed that while kriging produces smooth continuous surfaces, it has zero order continuity along its borders. Thus, at interface boundaries where the neighborhood changes, the interpolant behaves discontinuously. Therefore, it is important to consider and solve the global system for each interpolant. However, solving such large dense systems for each query point is impractical. Recently a more principled approach to approximating kriging has been proposed based on a technique called covariance tapering. The problems arise from the fact that the covariance functions that are used in kriging have global support.
Our implementations combine, utilize, and enhance a number of different approaches that have been introduced in literature for solving large linear systems for interpolation of scattered data points. For very large systems, exact methods such as Gaussian elimination are impractical since they require O(n^3) time and O(n^2) storage. As Billings et al. suggested, we use an iterative approach. In particular, we use the SYMMLQ method, for solving the large but sparse ordinary kriging systems that result from tapering. The main technical issue that needs to be overcome in our algorithmic solution is that the points' covariance matrix for kriging should be symmetric positive definite. The goal of tapering is to obtain a sparse approximate representation of the covariance matrix while maintaining its positive definiteness. Furrer et al. used tapering to obtain a sparse linear system of the form Ax = b, where A is the tapered symmetric positive definite covariance matrix. Thus, Cholesky factorization could be used to solve their linear systems. They implemented an efficient sparse Cholesky decomposition method. They also showed that if these tapers are used for a limited class of covariance models, the solution of the system converges to the solution of the original system. Matrix A in the ordinary kriging system, while symmetric, is not positive definite. Thus, their approach is not applicable to the ordinary kriging system. Therefore, we use tapering only to obtain a sparse linear system. Then, we use SYMMLQ to solve the ordinary kriging system. We show that solving large kriging systems becomes practical via tapering and iterative methods, and results in lower estimation errors compared to traditional local approaches, and significant memory savings compared to the original global system. We also developed a more efficient variant of the sparse SYMMLQ method for large ordinary kriging systems.
This approach adaptively finds the correct local neighborhood for each query point in the interpolation process.
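Two claims in this abstract are easy to check on a small example: tapering makes the covariance block sparse, and the ordinary kriging matrix is symmetric but not positive definite. The sketch below assembles the usual augmented matrix [[C, 1], [1^T, 0]] for a 4 x 4 grid. The covariance model (exponential multiplied by a spherical taper), the parameter values, and all function names are assumptions chosen for illustration. No solver is included; in practice a method for symmetric indefinite systems, such as the SYMMLQ method named above, would be applied to this matrix.

```python
import math


def tapered_cov(h, range_=1.0, taper=1.5):
    """Exponential covariance times a spherical taper that vanishes beyond `taper`."""
    if h >= taper:
        return 0.0
    r = h / taper
    spherical = 1.0 - 1.5 * r + 0.5 * r ** 3
    return math.exp(-h / range_) * spherical


def kriging_matrix(points, cov):
    """Assemble the (n+1) x (n+1) ordinary kriging matrix [[C, 1], [1^T, 0]]."""
    n = len(points)
    a = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i, p in enumerate(points):
        for j, q in enumerate(points):
            a[i][j] = cov(math.dist(p, q))
        a[i][n] = a[n][i] = 1.0   # unbiasedness constraint (Lagrange multiplier)
    return a


def quad_form(a, x):
    """Evaluate x^T A x for a dense list-of-lists matrix."""
    return sum(x[i] * a[i][j] * x[j] for i in range(len(x)) for j in range(len(x)))


points = [(float(i), float(j)) for i in range(4) for j in range(4)]  # 4 x 4 grid
A = kriging_matrix(points, tapered_cov)
n = len(points)
zeros = sum(1 for i in range(n) for j in range(n) if A[i][j] == 0.0)
symmetric = all(A[i][j] == A[j][i] for i in range(n + 1) for j in range(n + 1))
x_pos = [1.0] + [0.0] * n         # picks out C[0][0], which is positive
x_neg = [1.0 / n] * n + [-1.0]    # the constraint row drives the form negative
print(zeros, symmetric, quad_form(A, x_pos), quad_form(A, x_neg))
```

Because the quadratic form takes both signs, Cholesky factorization cannot be applied to the augmented matrix, which is consistent with the abstract's reason for choosing an iterative solver that tolerates indefiniteness.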
A Hybrid Shared-Memory Parallel Max-Tree Algorithm for Extreme Dynamic-Range Images.
Moschini, Ugo; Meijster, Arnold; Wilkinson, Michael H F
2018-03-01
Max-trees, or component trees, are graph structures that represent the connected components of an image in a hierarchical way. Nowadays, many application fields rely on images with high-dynamic range or floating point values. Efficient sequential algorithms exist to build trees and compute attributes for images of any bit depth. However, we show that the current parallel algorithms perform poorly already with integers at bit depths higher than 16 bits per pixel. We propose a parallel method combining the two worlds of flooding and merging max-tree algorithms. First, a pilot max-tree of a quantized version of the image is built in parallel using a flooding method. Later, this structure is used in a parallel leaf-to-root approach to compute efficiently the final max-tree and to drive the merging of the sub-trees computed by the threads. We present an analysis of the performance both on simulated and actual 2D images and 3D volumes. Execution times are about better than the fastest sequential algorithm and speed-up goes up to on 64 threads.
Reducing Earth Topography Resolution for SMAP Mission Ground Tracks Using K-Means Clustering
NASA Technical Reports Server (NTRS)
Rizvi, Farheen
2013-01-01
The K-means clustering algorithm is used to reduce Earth topography resolution for the SMAP mission ground tracks. As SMAP propagates in orbit, knowledge of the radar antenna footprints on Earth is required for the antenna misalignment calibration. Each antenna footprint contains a latitude and longitude location pair on the Earth surface. There are 400 pairs in one data set for the calibration model. It is computationally expensive to calculate the corresponding Earth elevation for these data pairs. Thus, the antenna footprint resolution is reduced. Similar topographical data pairs are grouped together with the K-means clustering algorithm. The resolution is reduced to the mean of each topographical cluster, called the cluster centroid. The corresponding Earth elevation for each cluster centroid is assigned to the entire group. Results show that 400 data points are reduced to 60 while still maintaining algorithm performance and computational efficiency. In this work, sensitivity analysis is also performed to show the trade-off between algorithm performance and computational efficiency as the number of cluster centroids and algorithm iterations are increased.
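The reduction from 400 footprint locations to 60 elevation lookups can be sketched with plain Lloyd's K-means. Everything specific below is an assumption for illustration: the footprints are random points in a made-up latitude/longitude box, distance is treated as Euclidean in degrees, and `elevation` is a stand-in for the expensive topography lookup rather than any real SMAP interface.

```python
import math
import random


def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))


def kmeans(points, k, iterations, rng):
    """Plain Lloyd's algorithm; returns (centroids, label for each point)."""
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, [nearest(p, centroids) for p in points]


def elevation(lat, lon):
    """Stand-in for an expensive topography lookup (hypothetical)."""
    return 1000.0 * math.sin(math.radians(lat)) * math.cos(math.radians(lon))


rng = random.Random(1)
footprints = [(rng.uniform(30.0, 40.0), rng.uniform(-120.0, -110.0))
              for _ in range(400)]
centroids, labels = kmeans(footprints, k=60, iterations=10, rng=rng)
centroid_elev = [elevation(lat, lon) for lat, lon in centroids]  # 60 lookups
assigned = [centroid_elev[lab] for lab in labels]                # 400 values
print(len(centroid_elev), len(assigned))
```

Raising `k` or `iterations` trades more computation for a closer match to the per-footprint elevations, which is the kind of sensitivity the abstract says was studied.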
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Youshan, E-mail: ysliu@mail.iggcas.ac.cn; Teng, Jiwen, E-mail: jwteng@mail.iggcas.ac.cn; Xu, Tao, E-mail: xutao@mail.iggcas.ac.cn
2017-05-01
The mass-lumped method avoids the cost of inverting the mass matrix and simultaneously maintains spatial accuracy by adopting additional interior integration points, known as cubature points. To date, such points are only known analytically in tensor domains, such as quadrilateral or hexahedral elements. Thus, the diagonal-mass-matrix spectral element method (SEM) in non-tensor domains always relies on numerically computed interpolation points or quadrature points. However, only the cubature points for degrees 1 to 6 are known, which is the reason that we have developed a p-norm-based optimization algorithm to obtain higher-order cubature points. In this way, we obtain and tabulate new cubature points with all positive integration weights for degrees 7 to 9. The dispersion analysis illustrates that the dispersion relation determined from the new optimized cubature points is comparable to that of the mass and stiffness matrices obtained by exact integration. Simultaneously, the Lebesgue constant for the new optimized cubature points indicates its surprisingly good interpolation properties. As a result, such points provide both good interpolation properties and integration accuracy. The Courant–Friedrichs–Lewy (CFL) numbers are tabulated for the conventional Fekete-based triangular spectral element (TSEM), the TSEM with exact integration, and the optimized cubature-based TSEM (OTSEM). A complementary study demonstrates the spectral convergence of the OTSEM. A numerical example conducted on a half-space model demonstrates that the OTSEM improves the accuracy by approximately one order of magnitude compared to the conventional Fekete-based TSEM. In particular, the accuracy of the 7th-order OTSEM is even higher than that of the 14th-order Fekete-based TSEM. Furthermore, the OTSEM produces a result that can compete in accuracy with the quadrilateral SEM (QSEM). The high accuracy of the OTSEM is also tested with a non-flat topography model.
In terms of computational efficiency, the OTSEM is more efficient than the Fekete-based TSEM, although it is slightly costlier than the QSEM when a comparable numerical accuracy is required. - Highlights: • Higher-order cubature points for degrees 7 to 9 are developed. • The effects of quadrature rule on the mass and stiffness matrices has been conducted. • The cubature points have always positive integration weights. • Freeing from the inversion of a wide bandwidth mass matrix. • The accuracy of the TSEM has been improved in about one order of magnitude.
NASA Astrophysics Data System (ADS)
Costantini, Mario; Malvarosa, Fabio; Minati, Federico
2010-03-01
Phase unwrapping and integration of finite differences are key problems in several technical fields. In SAR interferometry and differential and persistent scatterers interferometry, digital elevation models and displacement measurements can be obtained after unambiguously determining the phase values and reconstructing the mean velocities and elevations of the observed targets, which can be performed by integrating differential estimates of these quantities (finite differences between neighboring points). In this paper we propose a general formulation for robust and efficient integration of finite differences and phase unwrapping, which includes standard techniques as sub-cases. The proposed approach allows obtaining more reliable and accurate solutions by exploiting redundant differential estimates (not only between nearest neighboring points) and multi-dimensional information (e.g. multi-temporal, multi-frequency, multi-baseline observations), or external data (e.g. GPS measurements). The proposed approach requires the solution of linear or quadratic programming problems, for which computationally efficient algorithms exist. The validation tests obtained on real SAR data confirm the validity of the method, which was integrated in our production chain and successfully used also in massive productions.
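The simplest instance of unwrapping by integrating finite differences is the one-dimensional case with nearest-neighbour differences only. The sketch below shows that elementary case, which is valid when the true phase changes by less than pi between samples. It is not the linear or quadratic programming formulation proposed in the abstract and uses no redundant differences; the function names and the test ramp are assumptions for illustration.

```python
import math


def wrap(phi):
    """Wrap a phase value into the interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi


def unwrap_1d(wrapped):
    """Integrate wrapped finite differences between neighbouring samples."""
    result = [wrapped[0]]
    for prev, cur in zip(wrapped, wrapped[1:]):
        result.append(result[-1] + wrap(cur - prev))
    return result


true_phase = [0.4 * i for i in range(40)]   # ramp spanning several cycles
wrapped = [wrap(p) for p in true_phase]
recovered = unwrap_1d(wrapped)
max_err = max(abs(r - t) for r, t in zip(recovered, true_phase))
print(max_err)
```

In two or more dimensions the wrapped differences are generally inconsistent around closed loops in the presence of noise, which is where an optimization formulation over redundant differences becomes useful.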
Simulation of the Francis-99 Hydro Turbine During Steady and Transient Operation
NASA Astrophysics Data System (ADS)
Dewan, Yuvraj; Custer, Chad; Ivashchenko, Artem
2017-01-01
Numerical simulations of the Francis-99 hydroturbine, correlated with experimental measurements, are presented. Steady operation of the hydroturbine is analyzed at three operating conditions: the best efficiency point (BEP), high load (HL), and part load (PL). It is shown that global quantities such as net head, discharge and efficiency are well predicted. Additionally, time-averaged velocity predictions compare well with PIV measurements obtained in the draft tube immediately downstream of the runner. Differences in vortex rope structure between operating points are discussed. Unsteady operation of the hydroturbine from BEP to HL and from BEP to PL is modeled. It is shown that the simulation methods used to model steady operation produce predictions that correlate well with experiment for transient operation. Time-domain unsteady simulation is used for both steady and unsteady operation. The full-fidelity geometry including all components is meshed using an unstructured polyhedral mesh with body-fitted prism layers. Guide vane rotation for transient operation is imposed using fully-conservative, computationally efficient mesh morphing. The commercial solver STAR-CCM+ is used for all portions of the analysis including meshing, solving and post-processing.
Practical low-cost visual communication using binary images for deaf sign language.
Manoranjan, M D; Robinson, J A
2000-03-01
Deaf sign language transmitted by video requires a temporal resolution of 8 to 10 frames/s for effective communication. Conventional videoconferencing applications, when operated over low bandwidth telephone lines, provide very low temporal resolution of pictures, of the order of less than a frame per second, resulting in jerky movement of objects. This paper presents a practical solution for sign language communication, offering adequate temporal resolution of images using moving binary sketches or cartoons, implemented on standard personal computer hardware with low-cost cameras and communicating over telephone lines. To extract cartoon points an efficient feature extraction algorithm adaptive to the global statistics of the image is proposed. To improve the subjective quality of the binary images, irreversible preprocessing techniques, such as isolated point removal and predictive filtering, are used. A simple, efficient and fast recursive temporal prefiltering scheme, using histograms of successive frames, reduces the additive and multiplicative noise from low-cost cameras. An efficient three-dimensional (3-D) compression scheme codes the binary sketches. Subjective tests performed on the system confirm that it can be used for sign language communication over telephone lines.
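One of the preprocessing steps named above, isolated point removal, can be sketched in a few lines. This is an illustrative version that assumes an 8-connected neighborhood and a list-of-lists binary image; the abstract does not specify the neighborhood or the data layout, and `remove_isolated_points` is a hypothetical name.

```python
def remove_isolated_points(image):
    """Clear every set pixel that has no set pixel among its 8 neighbors."""
    height, width = len(image), len(image[0])
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            has_neighbor = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbor:
                cleaned[y][x] = 0
    return cleaned


# Hypothetical 4x5 binary sketch: one stray pixel and one connected stroke.
sketch = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
]
cleaned = remove_isolated_points(sketch)
```

The step is irreversible, as the abstract notes: once the stray pixel is cleared, the original frame cannot be recovered from the cleaned one.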
A method for improved accuracy in three dimensions for determining wheel/rail contact points
NASA Astrophysics Data System (ADS)
Yang, Xinwen; Gu, Shaojie; Zhou, Shunhua; Zhou, Yu; Lian, Songliang
2015-11-01
Searching for the contact points between wheels and rails is important because these are the points at which the contact forces are exerted. In order to obtain an accurate contact point and an in-depth description of the wheel/rail contact behaviours on a curved track or in a turnout, a method with improved accuracy in three dimensions is proposed to determine the contact points and the contact patches between the wheel and the rail when considering the effect of the yaw angle and the roll angle on the motion of the wheel set. The proposed method, which does not require curve fitting of the wheel and rail profiles, can accurately, directly, and comprehensively determine the contact interface distances between the wheel and the rail. The range iteration algorithm is used to improve the computation efficiency and reduce the calculation required. The present computation method is applied to the analysis of the contact between China (CHN) 75 kg/m rails and wheel sets with the wearing type tread of China's freight cars. In addition, it can be shown that the results of the proposed method are consistent with those of Kalker's program CONTACT, and the maximum deviation in the wheel/rail contact patch area between these two methods is approximately 5%. The proposed method can also be used to investigate static wheel/rail contact. Some wheel/rail contact points and contact patch distributions are discussed and assessed, including non-worn and worn wheel and rail profiles.
Archer, Charles J.; Faraj, Ahmad A.; Inglett, Todd A.; Ratterman, Joseph D.
2012-10-23
Methods, apparatus, and products are disclosed for providing nearest neighbor point-to-point communications among compute nodes of an operational group in a global combining network of a parallel computer, each compute node connected to each adjacent compute node in the global combining network through a link, that include: identifying each link in the global combining network for each compute node of the operational group; designating one of a plurality of point-to-point class routing identifiers for each link such that no compute node in the operational group is connected to two adjacent compute nodes in the operational group with links designated for the same class routing identifiers; and configuring each compute node of the operational group for point-to-point communications with each adjacent compute node in the global combining network through the link between that compute node and that adjacent compute node using that link's designated class routing identifier.
A Fast Algorithm for Lattice Hyperonic Potentials
NASA Astrophysics Data System (ADS)
Nemura, Hidekatsu; Aoki, Sinya; Doi, Takumi; Gongyo, Shinya; Hatsuda, Tetsuo; Ikeda, Yoichi; Inoue, Takashi; Iritani, Takumi; Ishii, Noriyoshi; Miyamoto, Takaya; Murano, Keiko; Sasaki, Kenji
We describe an efficient algorithm to compute a large number of baryon-baryon interactions from NN to ΞΞ by means of the HAL QCD method, which lays the groundwork for the nearly physical point lattice QCD calculation with volume (96a)^4 ≈ (8.2 fm)^4. Preliminary results of the ΛN potential calculated with quark masses corresponding to (m_π, m_K) ≈ (146, 525) MeV are presented.
Descriptive and Computer Aided Drawing Perspective on an Unfolded Polyhedral Projection Surface
NASA Astrophysics Data System (ADS)
Dzwierzynska, Jolanta
2017-10-01
The aim of this study is to develop a method of direct and practical mapping of perspective on an unfolded prism polyhedral projection surface. The considered perspective representation is a rectilinear central projection onto a surface composed of several flat elements. In the paper two descriptive methods of drawing perspective are presented: direct and indirect. The graphical mapping of the effects of the representation is realized directly on the unfolded flat projection surface. This is possible because of the projective and graphical connection between points displayed on the polyhedral background and their counterparts on the unfolded flat surface. To significantly improve the construction of lines, analytical algorithms are formulated. They draw a perspective image of a line segment passing through two different points determined by their coordinates in a spatial coordinate system with axes x, y, z. Compared to other perspective construction methods for computer vision and computer aided design, which use information about points, our algorithms utilize data about lines, which occur very often in architectural forms. The possibility of drawing lines in the considered perspective enables drawing an edge perspective image of an architectural object. Changeable base elements of perspective, such as the horizon height and the station point location, enable drawing perspective images from different viewing positions. The analytical algorithms for drawing perspective images are formulated in Mathcad software; however, they can be implemented in the majority of computer graphical packages, which can make drawing perspective more efficient and easier. The representation presented in the paper and the way of its direct mapping on the flat unfolded projection surface can find application in the presentation of architectural space in advertisement and art.
Model reduction method using variable-separation for stochastic saddle point problems
NASA Astrophysics Data System (ADS)
Jiang, Lijian; Li, Qiuqi
2018-02-01
In this paper, we consider a variable-separation (VS) method to solve the stochastic saddle point (SSP) problems. The VS method is applied to obtain the solution in tensor product structure for stochastic partial differential equations (SPDEs) in a mixed formulation. The aim of such a technique is to construct a reduced basis approximation of the solution of the SSP problems. The VS method attempts to get a low rank separated representation of the solution for SSP in a systematic enrichment manner. No iteration is performed at each enrichment step. In order to satisfy the inf-sup condition in the mixed formulation, we enrich the separated terms for the primal system variable at each enrichment step. For the SSP problems by regularization or penalty, we propose a more efficient variable-separation (VS) method, i.e., the variable-separation by penalty method. This can avoid further enrichment of the separated terms in the original mixed formulation. The computation of the variable-separation method decomposes into offline phase and online phase. Sparse low rank tensor approximation method is used to significantly improve the online computation efficiency when the number of separated terms is large. For the applications of SSP problems, we present three numerical examples to illustrate the performance of the proposed methods.
An efficient distribution method for nonlinear transport problems in stochastic porous media
NASA Astrophysics Data System (ADS)
Ibrahima, F.; Tchelepi, H.; Meyer, D. W.
2015-12-01
Because geophysical data are inexorably sparse and incomplete, stochastic treatments of simulated responses are convenient to explore possible scenarios and assess risks in subsurface problems. In particular, understanding how uncertainties propagate in porous media with nonlinear two-phase flow is essential, yet challenging, in reservoir simulation and hydrology. We give a computationally efficient and numerically accurate method to estimate the one-point probability density (PDF) and cumulative distribution functions (CDF) of the water saturation for the stochastic Buckley-Leverett problem when the probability distributions of the permeability and porosity fields are available. The method draws inspiration from the streamline approach and expresses the distributions of interest essentially in terms of an analytically derived mapping and the distribution of the time of flight. In a large class of applications the latter can be estimated at low computational costs (even via conventional Monte Carlo). Once the water saturation distribution is determined, any one-point statistics thereof can be obtained, especially its average and standard deviation. Moreover, rarely available in other approaches, yet crucial information such as the probability of rare events and saturation quantiles (e.g. P10, P50 and P90) can be derived from the method. We provide various examples and comparisons with Monte Carlo simulations to illustrate the performance of the method.
SymPix: A Spherical Grid for Efficient Sampling of Rotationally Invariant Operators
NASA Astrophysics Data System (ADS)
Seljebotn, D. S.; Eriksen, H. K.
2016-02-01
We present SymPix, a special-purpose spherical grid optimized for efficiently sampling rotationally invariant linear operators. This grid is conceptually similar to the Gauss-Legendre (GL) grid, aligning sample points with iso-latitude rings located on Legendre polynomial zeros. Unlike the GL grid, however, the number of grid points per ring varies as a function of latitude, avoiding expensive oversampling near the poles and ensuring nearly equal sky area per grid point. The ratio between the number of grid points in two neighboring rings is required to be a low-order rational number (3, 2, 1, 4/3, 5/4, or 6/5) to maintain a high degree of symmetries. Our main motivation for this grid is to solve linear systems using multi-grid methods, and to construct efficient preconditioners through pixel-space sampling of the linear operator in question. As a benchmark and representative example, we compute a preconditioner for a linear system that involves the operator $\widehat{\boldsymbol{D}} + \widehat{\boldsymbol{B}}^{T} \boldsymbol{N}^{-1} \widehat{\boldsymbol{B}}$, where $\widehat{\boldsymbol{B}}$ and $\widehat{\boldsymbol{D}}$ may be described as both local and rotationally invariant operators, and $\boldsymbol{N}$ is diagonal in the pixel domain. For a bandwidth limit of $\ell_{\max} = 3000$, we find that our new SymPix implementation yields average speed-ups of 360 and 23 for $\widehat{\boldsymbol{B}}^{T} \boldsymbol{N}^{-1} \widehat{\boldsymbol{B}}$ and $\widehat{\boldsymbol{D}}$, respectively, compared with the previous state-of-the-art implementation.
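The ring placement shared with the Gauss-Legendre grid can be sketched as follows: ring latitudes sit at the zeros of a Legendre polynomial, found here by Newton iteration. The per-ring pixel count proportional to sin(colatitude) is only an assumed stand-in for the equal-area idea; it does not implement the rational-ratio rule between neighboring rings described above, and all function names are hypothetical.

```python
import math


def legendre_and_derivative(n, x):
    """Return P_n(x) and P_n'(x) via the three-term recurrence (n >= 1)."""
    p_prev, p = 1.0, x  # P_0 and P_1
    for k in range(2, n + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    dp = n * (x * p - p_prev) / (x * x - 1.0)
    return p, dp


def legendre_zeros(n):
    """Zeros of P_n in descending order, refined by Newton's method."""
    zeros = []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # standard initial guess
        for _ in range(100):
            p, dp = legendre_and_derivative(n, x)
            step = p / dp
            x -= step
            if abs(step) < 1e-15:
                break
        zeros.append(x)
    return zeros


def pixels_per_ring(colatitudes, equator_pixels):
    """Assumed equal-area heuristic: pixel count scales with sin(colatitude)."""
    return [max(1, round(equator_pixels * math.sin(t))) for t in colatitudes]


zeros = legendre_zeros(8)
colatitudes = [math.acos(z) for z in zeros]
counts = pixels_per_ring(colatitudes, 16)
```

Rings near the poles receive fewer pixels than rings near the equator, which is the oversampling that a fixed-count GL grid does not avoid.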
A new template matching method based on contour information
NASA Astrophysics Data System (ADS)
Cai, Huiying; Zhu, Feng; Wu, Qingxiao; Li, Sicong
2014-11-01
Template matching is a significant approach in machine vision due to its effectiveness and robustness. However, most template matching methods are so time consuming that they cannot be used in many real time applications. Closed contour matching is a popular kind of template matching. This paper presents a new closed contour template matching method which is suitable for two dimensional objects. A coarse-to-fine searching strategy is used to improve the matching efficiency, and a partial computation elimination scheme is proposed to further speed up the searching process. The method consists of offline model construction and online matching. In the process of model construction, triples and a distance image are obtained from the template image. A certain number of triples, each composed of three points, are created from the contour information that is extracted from the template image. The rule for selecting the three points is that they divide the template contour equally into three parts. The distance image is obtained by a distance transform. Each point on the distance image represents the nearest distance between the current point and the points on the template contour. During the process of matching, triples of the searching image are created with the same rule as the triples of the model. Using the similarity between triangles, which is invariant to rotation, translation and scaling, the triples corresponding to the triples of the model are found. Then we can obtain the initial RST (rotation, translation and scaling) parameters mapping the searching contour to the template contour. In order to speed up the searching process, the points on the searching contour are sampled to reduce the number of triples. To verify the RST parameters, the searching contour is projected into the distance image, and the mean distance can be computed rapidly by simple operations of addition and multiplication.
In the fine searching process, the initial RST parameters are discretized to obtain the final accurate pose of the object. Experimental results show that the proposed method is reasonable and efficient, and can be used in many real time applications.
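The verification step described above (project the search contour into the template's distance image and average the looked-up distances) can be sketched as follows. This is an illustration under stated assumptions: the distance image is built by brute force rather than by a fast distance transform, projected points are rounded to the nearest pixel, and the names `distance_image`, `apply_rst` and `mean_distance` are hypothetical.

```python
import math


def distance_image(width, height, contour):
    """Brute-force distance image: nearest distance to any contour point."""
    return [
        [min(math.hypot(x - cx, y - cy) for cx, cy in contour) for x in range(width)]
        for y in range(height)
    ]


def apply_rst(points, angle, scale, tx, ty):
    """Rotate, scale, then translate a list of (x, y) points."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y) + tx, scale * (s * x + c * y) + ty) for x, y in points]


def mean_distance(dist, points):
    """Average distance-image value under the projected points."""
    height, width = len(dist), len(dist[0])
    total = 0.0
    for x, y in points:
        xi = min(max(int(round(x)), 0), width - 1)
        yi = min(max(int(round(y)), 0), height - 1)
        total += dist[yi][xi]
    return total / len(points)


# Hypothetical template: the perimeter of a 5x5 square in a 12x12 image.
template = [(x, y) for x in range(2, 7) for y in range(2, 7) if x in (2, 6) or y in (2, 6)]
dist = distance_image(12, 12, template)
search = [(x + 3, y + 2) for x, y in template]  # same shape, shifted
score_good = mean_distance(dist, apply_rst(search, 0.0, 1.0, -3.0, -2.0))
score_bad = mean_distance(dist, apply_rst(search, 0.0, 1.0, 0.0, 0.0))
```

A correct set of RST parameters drives the mean distance toward zero, so candidate parameters can be ranked with one table lookup per contour point.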
PL-VIO: Tightly-Coupled Monocular Visual–Inertial Odometry Using Point and Line Features
Zhao, Ji; Guo, Yue; He, Wenhao; Yuan, Kui
2018-01-01
To address the problem of estimating camera trajectory and to build a structural three-dimensional (3D) map based on inertial measurements and visual observations, this paper proposes point–line visual–inertial odometry (PL-VIO), a tightly-coupled monocular visual–inertial odometry system exploiting both point and line features. Compared with point features, lines provide significantly more geometrical structure information on the environment. To obtain both computation simplicity and representational compactness of a 3D spatial line, Plücker coordinates and orthonormal representation for the line are employed. To tightly and efficiently fuse the information from inertial measurement units (IMUs) and visual sensors, we optimize the states by minimizing a cost function which combines the pre-integrated IMU error term together with the point and line re-projection error terms in a sliding window optimization framework. The experiments evaluated on public datasets demonstrate that the PL-VIO method that combines point and line features outperforms several state-of-the-art VIO systems which use point features only. PMID:29642648
Processing mechanics of alternate twist ply (ATP) yarn technology
NASA Astrophysics Data System (ADS)
Elkhamy, Donia Said
Ply yarns are important in many textile manufacturing processes and various applications. The primary process used for producing ply yarns is cabling. The speed of cabling is limited to about 35 m/min. With the world's increasing demand for ply yarn, cabling is incompatible with today's demand-activated manufacturing strategies. The Alternate Twist Ply (ATP) yarn technology is a relatively new process for producing ply yarns with improved productivity and flexibility. This technology involves self plying of twisted singles yarn to produce ply yarn. The ATP process can run more than ten times faster than cabling. Implementing the ATP process to produce ply yarns raises two major quality issues: uniform twist profile and yarn twist efficiency. The goal of this thesis is to address these issues through process modeling based on understanding the physics and processing mechanics of the ATP yarn system. In our study we determine the main parameters that control the yarn twist profile. Process modeling of the yarn twist across different process zones was done. A computational model was designed to predict the process parameters required to achieve a square wave twist profile. Twist efficiency, a measure of yarn torsional stability and bulk, is determined by the ratio of ply yarn twist to singles yarn twist. Response Surface Methodology was used to develop the processing window that can reproduce ATP yarns with high twist efficiency. Equilibrium conditions of tensions and torques acting on the yarns at the self ply point were analyzed to determine the pathway for achieving higher twist efficiency. Mechanistic modeling relating equilibrium conditions to the twist efficiency was developed. A static tester was designed to zoom into the self ply zone of the ATP yarn. A computer controlled, prototypic ATP machine was constructed and used to confirm the mechanistic model results. Optimum parameters achieving maximum twist efficiency were determined in this study.
The successful results of this work have led to the filing of a US patent disclosing the method for producing ATP yarns with high yarn twist efficiency using a high convergence angle at the self ply point together with applying ply torque.
NASA Technical Reports Server (NTRS)
Assanis, D. N.; Ekchian, J. A.; Heywood, J. B.; Replogle, K. K.
1984-01-01
Reductions in heat loss at appropriate points in the diesel engine which result in substantially increased exhaust enthalpy were shown. The concepts for this increased enthalpy are the turbocharged, turbocompounded diesel engine cycle. A computer simulation of the heavy duty turbocharged turbo-compounded diesel engine system was undertaken. This allows the definition of the tradeoffs which are associated with the introduction of ceramic materials in various parts of the total engine system, and the study of system optimization. The basic assumptions and the mathematical relationships used in the simulation of the model engine are described.
A fast discrete S-transform for biomedical signal processing.
Brown, Robert A; Frayne, Richard
2008-01-01
Determining the frequency content of a signal is a basic operation in signal and image processing. The S-transform provides both the true frequency and globally referenced phase measurements characteristic of the Fourier transform and also generates local spectra, as does the wavelet transform. Due to this combination, the S-transform has been successfully demonstrated in a variety of biomedical signal and image processing tasks. However, the computational demands of the S-transform have limited its application in medicine to this point in time. This abstract introduces the fast S-transform, a more efficient discrete implementation of the classic S-transform with dramatically reduced computational requirements.
Economical and accurate protocol for calculating hydrogen-bond-acceptor strengths.
El Kerdawy, Ahmed; Tautermann, Christofer S; Clark, Timothy; Fox, Thomas
2013-12-23
A series of density functional/basis set combinations and second-order Møller-Plesset calculations have been used to test their ability to reproduce the trends observed experimentally for the strengths of hydrogen-bond acceptors in order to identify computationally efficient techniques for routine use in the computational drug-design process. The effects of functionals, basis sets, counterpoise corrections, and constraints on the optimized geometries were tested and analyzed, and recommendations (M06-2X/cc-pVDZ and X3LYP/cc-pVDZ with single-point counterpoise corrections or X3LYP/aug-cc-pVDZ without counterpoise) were made for suitable moderately high-throughput techniques.
Application of square-root filtering for spacecraft attitude control
NASA Technical Reports Server (NTRS)
Sorensen, J. A.; Schmidt, S. F.; Goka, T.
1978-01-01
Suitable digital algorithms are developed and tested for providing on-board precision attitude estimation and pointing control for potential use in the Landsat-D spacecraft. These algorithms provide pointing accuracy of better than 0.01 deg. To obtain the necessary precision with efficient software, a six state-variable square-root Kalman filter combines two star tracker measurements to update attitude estimates obtained from processing three gyro outputs. The validity of the estimation and control algorithms is established, and the sensitivity of their performance to various error sources and software parameters is investigated by detailed digital simulation. Spacecraft computer memory, cycle time, and accuracy requirements are estimated.
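A square-root filter propagates a factor S of the covariance (P = S Sᵀ) instead of P itself, which helps preserve precision on short-word-length flight computers. The sketch below shows one scalar-measurement update in the Potter form; the abstract does not say which square-root formulation was used, so the algorithm choice, the two-state example and all names are assumptions made for illustration.

```python
import math


def potter_update(S, x, h, r, z):
    """One scalar measurement update on the covariance square root S.

    S: n x n factor with P = S S^T, x: state estimate, h: measurement row,
    r: measurement noise variance, z: measured value.
    """
    n = len(x)
    f = [sum(S[i][j] * h[i] for i in range(n)) for j in range(n)]  # f = S^T h^T
    alpha = 1.0 / (sum(v * v for v in f) + r)
    gamma = alpha / (1.0 + math.sqrt(alpha * r))
    Sf = [sum(S[i][j] * f[j] for j in range(n)) for i in range(n)]
    gain = [alpha * v for v in Sf]  # Kalman gain K = alpha * S f
    innovation = z - sum(h[i] * x[i] for i in range(n))
    x_new = [x[i] + gain[i] * innovation for i in range(n)]
    S_new = [[S[i][j] - gamma * Sf[i] * f[j] for j in range(n)] for i in range(n)]
    return S_new, x_new


def covariance(S):
    """Reconstruct P = S S^T from its factor."""
    n = len(S)
    return [[sum(S[i][k] * S[j][k] for k in range(n)) for j in range(n)] for i in range(n)]


# Hypothetical two-state example with a direct measurement of the first state.
S0 = [[2.0, 0.0], [1.0, 1.0]]
S1, x1 = potter_update(S0, [0.0, 0.0], [1.0, 0.0], 0.5, 1.0)
P1 = covariance(S1)
```

The reconstructed P1 agrees with the conventional update P − K h P, but the filter never forms P during the update, so the updated covariance stays symmetric and non-negative definite by construction.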
ADRC for spacecraft attitude and position synchronization in libration point orbits
NASA Astrophysics Data System (ADS)
Gao, Chen; Yuan, Jianping; Zhao, Yakun
2018-04-01
This paper addresses the problem of spacecraft attitude and position synchronization in libration point orbits between a leader and a follower. Using dual quaternions, the dimensionless relative coupled dynamical model is derived with computational efficiency and accuracy in mind. Then a model-independent dimensionless cascade pose-feedback active disturbance rejection controller is designed for spacecraft attitude and position tracking control problems considering parameter uncertainties and external disturbances. Numerical simulations for the final approach phase in spacecraft rendezvous and docking and for formation flying are performed, and the results show high-precision tracking and satisfactory convergence rates under bounded control torque and force, which validates the proposed approach.
Efficient Geometric Sound Propagation Using Visibility Culling
NASA Astrophysics Data System (ADS)
Chandak, Anish
2011-07-01
Simulating propagation of sound can improve the sense of realism in interactive applications such as video games and can lead to better designs in engineering applications such as architectural acoustics. In this thesis, we present geometric sound propagation techniques which are faster than prior methods and map well to upcoming parallel multi-core CPUs. We model specular reflections by using the image-source method and model finite-edge diffraction by using the well-known Biot-Tolstoy-Medwin (BTM) model. We accelerate the computation of specular reflections by applying novel visibility algorithms, FastV and AD-Frustum, which compute visibility from a point. We accelerate finite-edge diffraction modeling by applying a novel visibility algorithm which computes visibility from a region. Our visibility algorithms are based on frustum tracing and exploit recent advances in fast ray-hierarchy intersections, data-parallel computations, and scalable, multi-core algorithms. The AD-Frustum algorithm adapts its computation to the scene complexity and allows small errors in computing specular reflection paths for higher computational efficiency. FastV and our visibility algorithm from a region are general, object-space, conservative visibility algorithms that together significantly reduce the number of image sources compared to other techniques while preserving the same accuracy. Our geometric propagation algorithms are an order of magnitude faster than prior approaches for modeling specular reflections and two to ten times faster for modeling finite-edge diffraction. Our algorithms are interactive, scale almost linearly on multi-core CPUs, and can handle large, complex, and dynamic scenes. We also compare the accuracy of our sound propagation algorithms with other methods. Once sound propagation is performed, it is desirable to listen to the propagated sound in interactive and engineering applications. 
We can generate smooth, artifact-free output audio signals by applying efficient audio-processing algorithms. We also present the first efficient audio-processing algorithm for scenarios with simultaneously moving source and moving receiver (MS-MR) which incurs less than 25% overhead compared to static source and moving receiver (SS-MR) or moving source and static receiver (MS-SR) scenario.
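The image-source method mentioned above can be sketched for a single first-order specular reflection: the source is mirrored across the wall plane, and the reflected path has the same length as the straight line from the image to the receiver. This minimal version assumes an infinite planar wall and omits the visibility tests (finite wall extent, occluders) that the thesis accelerates; the function names are hypothetical.

```python
import math


def mirror_point(point, plane_point, unit_normal):
    """Mirror a 3-D point across the plane through plane_point with unit_normal."""
    d = sum((p - q) * n for p, q, n in zip(point, plane_point, unit_normal))
    return tuple(p - 2.0 * d * n for p, n in zip(point, unit_normal))


def first_order_reflection(source, receiver, plane_point, unit_normal):
    """Return (image source, reflection point on the plane, path length)."""
    image = mirror_point(source, plane_point, unit_normal)
    direction = tuple(r - i for r, i in zip(receiver, image))
    length = math.sqrt(sum(c * c for c in direction))
    # Parameter t where the image-to-receiver line crosses the plane.
    numerator = sum((q - i) * n for q, i, n in zip(plane_point, image, unit_normal))
    denominator = sum(c * n for c, n in zip(direction, unit_normal))
    t = numerator / denominator
    hit = tuple(i + t * c for i, c in zip(image, direction))
    return image, hit, length


# Hypothetical scene: a wall in the plane x = 0, source and receiver at x > 0.
image, hit, length = first_order_reflection(
    (1.0, 2.0, 0.0), (3.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
)
```

Higher-order reflections repeat the mirroring across further walls, so the number of candidate image sources grows quickly, which is why culling invisible ones matters.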
NASA Astrophysics Data System (ADS)
Rampidis, I.; Nikolopoulos, A.; Koukouzas, N.; Grammelis, P.; Kakaras, E.
2007-09-01
This work presents an accurate and efficient pure 3-D CFD model for simulating the hydrodynamics of a pilot-scale CFB. The accuracy of the model was investigated as a function of the numerical parameters, in order to derive an optimum model setup with respect to computational cost. The need for an in-depth examination of hydrodynamics emerges from the trend to scale up CFBCs. This scale-up brings forward numerous design problems and uncertainties, which can be successfully elucidated by CFD techniques. Deriving guidelines for setting up a computationally efficient model is important as the scale of CFBs grows fast, while computational power is limited. However, the question of optimum efficiency has not been investigated thoroughly in the literature, as authors were more concerned with the accuracy and validity of their models. The objective of this work is to investigate the parameters that influence the efficiency and accuracy of CFB computational fluid dynamics models, find the optimum set of these parameters and thus establish this technique as a competitive method for the simulation and design of industrial, large scale beds, where the computational cost is otherwise prohibitive. During the tests that were performed in this work, the influence of the turbulence modeling approach, time and space density and discretization schemes was investigated on a 1.2 MWth CFB test rig. Using Fourier analysis, dominant frequencies were extracted in order to estimate the adequate time period for the averaging of all instantaneous values. The agreement with the experimental measurements was very good. The basic differences between the predictions that arose from the various model setups were pointed out and analyzed.
The results showed that a model with high order space discretization schemes when applied on a coarse grid and averaging of the instantaneous scalar values for a 20 sec period, adequately described the transient hydrodynamic behaviour of a pilot CFB while the computational cost was kept low. Flow patterns inside the bed such as the core-annulus flow and the transportation of clusters were at least qualitatively captured.
On the efficient and reliable numerical solution of rate-and-state friction problems
NASA Astrophysics Data System (ADS)
Pipping, Elias; Kornhuber, Ralf; Rosenau, Matthias; Oncken, Onno
2016-03-01
We present a mathematically consistent numerical algorithm for the simulation of earthquake rupture with rate-and-state friction. Its main features are adaptive time stepping, a novel algebraic solution algorithm involving nonlinear multigrid and a fixed point iteration for the rate-and-state decoupling. The algorithm is applied to a laboratory scale subduction zone which allows us to compare our simulations with experimental results. Using physical parameters from the experiment, we find a good fit of recurrence time of slip events as well as their rupture width and peak slip. Computations in 3-D confirm efficiency and robustness of our algorithm.
NASA Technical Reports Server (NTRS)
Dunbar, P. M.; Hauser, J. R.
1976-01-01
Various mechanisms which limit the conversion efficiency of silicon solar cells were studied. The effects of changes in solar cell geometry such as layer thickness on performance were examined. The effects of various antireflecting layers were also examined. It was found that any single film antireflecting layer results in a significant surface loss of photons. The use of surface texturing techniques or low loss antireflecting layers can enhance by several percentage points the conversion efficiency of silicon cells. The basic differences between n(+)-p-p(+) and p(+)-n-n(+) cells are treated. A significant part of the study was devoted to the importance of surface region lifetime and heavy doping effects on efficiency. Heavy doping bandgap reduction effects are enhanced by low surface layer lifetimes, and conversely, the reduction in solar cell efficiency due to low surface layer lifetime is further enhanced by heavy doping effects. A series of computer studies is reported which seeks to determine the best cell structure and doping levels for maximum efficiency.
Multi-GPU three dimensional Stokes solver for simulating glacier flow
NASA Astrophysics Data System (ADS)
Licul, Aleksandar; Herman, Frédéric; Podladchikov, Yuri; Räss, Ludovic; Omlin, Samuel
2016-04-01
Here we present a three-dimensional Stokes solver that we have recently developed for GPUs and applied to glacier flow. We numerically solve the Stokes momentum balance equations together with the incompressibility equation, while also taking into account the strong nonlinearities of ice rheology. We have developed a fully three-dimensional numerical MATLAB application based on an iterative finite difference scheme with preconditioning of residuals. Differential equations are discretized on a regular staggered grid. We have ported it to C-CUDA to run it on GPUs in parallel, using MPI. We demonstrate the accuracy and efficiency of our model with a manufactured analytical solution test for three-dimensional Stokes ice sheet models (Leng et al., 2013) and by comparison with other well-established ice sheet models on the diagnostic ISMIP-HOM benchmark experiments (Pattyn et al., 2008). The results show that our model is capable of accurately and efficiently solving the Stokes system of equations in a variety of test scenarios, while preserving good parallel efficiency on up to 80 GPUs. For example, in 3D test scenarios with 250000 grid points our solver converges in around 3 minutes for single precision computations and around 10 minutes for double precision computations. We have also optimized the code to run efficiently on our newly acquired state-of-the-art GPU cluster octopus. This allows us to solve our problem on more than 20 million grid points, by just increasing the number of GPUs used, while keeping the computation time the same. In future work we will apply our solver to real world applications and implement the free surface evolution capabilities. REFERENCES Leng, W., Ju, L., Gunzburger, M. & Price, S., 2013. Manufactured solutions and the verification of three-dimensional Stokes ice-sheet models. The Cryosphere 7, 19-29.
Pattyn, F., Perichon, L., Aschwanden, A., Breuer, B., de Smedt, B., Gagliardini, O., Gudmundsson,G.H., Hindmarsh, R.C.A., Hubbard, A., Johnson, J.V., Kleiner, T., Konovalov,Y., Martin, C., Payne, A.J., Pollard, D., Price, S., Rckamp, M., Saito, F., Souk, O.,Sugiyama, S. & Zwinger, T., 2008. Benchmark experiments for higher-order and full-stokes ice sheet models (ismiphom). The Cryosphere 2, 95-108.
NASA Astrophysics Data System (ADS)
Liu, Y.; Pau, G. S. H.; Finsterle, S.
2015-12-01
Parameter inversion involves inferring model parameter values from sparse observations of some observables. To infer the posterior probability distributions of the parameters, Markov chain Monte Carlo (MCMC) methods are typically used. However, the large number of forward simulations needed and limited computational resources restrict the complexity of the hydrological model we can use in these methods. In view of this, we studied the implicit sampling (IS) method, an efficient importance sampling technique that generates samples in the high-probability region of the posterior distribution and thus reduces the number of forward simulations that we need to run. For a pilot-point inversion of a heterogeneous permeability field based on a synthetic ponded infiltration experiment simulated with TOUGH2 (a subsurface modeling code), we showed that IS with a linear map provides an accurate Bayesian description of the parameterized permeability field at the pilot points with just approximately 500 forward simulations. We further studied the use of surrogate models to improve the computational efficiency of parameter inversion. We implemented two reduced-order models (ROMs) for the TOUGH2 forward model. One is based on polynomial chaos expansion (PCE), whose coefficients are obtained using the sparse Bayesian learning technique to mitigate the "curse of dimensionality" of the PCE terms. The other model is Gaussian process regression (GPR), for which different covariance, likelihood and inference models are considered. Preliminary results indicate that ROMs constructed based on the prior parameter space perform poorly. It is thus impractical to replace this hydrological model with a ROM directly in an MCMC method. However, the IS method can work with a ROM constructed for parameters in the close vicinity of the maximum a posteriori probability (MAP) estimate.
We will discuss the accuracy and computational efficiency of using ROMs in the implicit sampling procedure for the hydrological problem considered. This work was supported, in part, by the U.S. Dept. of Energy under Contract No. DE-AC02-05CH11231
Gaussian polarizable-ion tight binding.
Boleininger, Max; Guilbert, Anne Ay; Horsfield, Andrew P
2016-10-14
To interpret ultrafast dynamics experiments on large molecules, computer simulation is required due to the complex response to the laser field. We present a method capable of efficiently computing the static electronic response of large systems to external electric fields. This is achieved by extending the density-functional tight binding method to include larger basis sets and by multipole expansion of the charge density into electrostatically interacting Gaussian distributions. Polarizabilities for a range of hydrocarbon molecules are computed for a multipole expansion up to quadrupole order, giving excellent agreement with experimental values, with average errors similar to those from density functional theory, but at a small fraction of the cost. We apply the model in conjunction with the polarizable-point-dipoles model to estimate the internal fields in amorphous poly(3-hexylthiophene-2,5-diyl).
Gaussian polarizable-ion tight binding
NASA Astrophysics Data System (ADS)
Boleininger, Max; Guilbert, Anne AY; Horsfield, Andrew P.
2016-10-01
To interpret ultrafast dynamics experiments on large molecules, computer simulation is required due to the complex response to the laser field. We present a method capable of efficiently computing the static electronic response of large systems to external electric fields. This is achieved by extending the density-functional tight binding method to include larger basis sets and by multipole expansion of the charge density into electrostatically interacting Gaussian distributions. Polarizabilities for a range of hydrocarbon molecules are computed for a multipole expansion up to quadrupole order, giving excellent agreement with experimental values, with average errors similar to those from density functional theory, but at a small fraction of the cost. We apply the model in conjunction with the polarizable-point-dipoles model to estimate the internal fields in amorphous poly(3-hexylthiophene-2,5-diyl).
Study of the stability of a SEIRS model for computer worm propagation
NASA Astrophysics Data System (ADS)
Hernández Guillén, J. D.; Martín del Rey, A.; Hernández Encinas, L.
2017-08-01
Nowadays, malware is the most important threat to information security. In this sense, several mathematical models to simulate malware spreading have appeared. They are compartmental models where the population of devices is classified into different compartments: susceptible, exposed, infectious, recovered, etc. The main goal of this work is to propose an improved SEIRS (Susceptible-Exposed-Infectious-Recovered-Susceptible) mathematical model to simulate computer worm propagation. It is a continuous model whose dynamic is ruled by means of a system of ordinary differential equations. It considers more realistic parameters related to the propagation; in fact, a modified incidence rate has been used. Moreover, the equilibrium points are computed and their local and global stability analyses are studied. From the explicit expression of the basic reproductive number, efficient control measures are also obtained.
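As a rough illustration of the compartmental structure described in this abstract, the sketch below integrates a generic SEIRS system with forward-Euler steps. The parameter names (`beta`, `sigma`, `gamma`, `omega`), their values, and the bilinear incidence term are assumptions for illustration only; the abstract does not state the modified incidence rate or the parameters used by the authors.

```python
def seirs_step(state, beta, sigma, gamma, omega, dt):
    """Advance a generic SEIRS model by one forward-Euler step.

    state = (S, E, I, R) as fractions of the device population.
    beta: transmission rate, sigma: E->I rate, gamma: I->R rate,
    omega: R->S rate (loss of immunity). All values are illustrative.
    """
    s, e, i, r = state
    new_infections = beta * s * i      # bilinear incidence (assumed, not the paper's)
    ds = -new_infections + omega * r
    de = new_infections - sigma * e
    di = sigma * e - gamma * i
    dr = gamma * i - omega * r
    return (s + dt * ds, e + dt * de, i + dt * di, r + dt * dr)


def simulate(state, steps, dt=0.01, beta=0.5, sigma=0.2, gamma=0.1, omega=0.05):
    """Integrate the model for a fixed number of steps and return the final state."""
    for _ in range(steps):
        state = seirs_step(state, beta, sigma, gamma, omega, dt)
    return state


def basic_reproduction_number(beta, gamma):
    """R0 for this simplified model without births or deaths."""
    return beta / gamma
```

Because the four derivatives sum to zero, the total population fraction stays constant, which is a convenient sanity check on any implementation of this kind.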
A fast and automatic mosaic method for high-resolution satellite images
NASA Astrophysics Data System (ADS)
Chen, Hongshun; He, Hui; Xiao, Hongyu; Huang, Jing
2015-12-01
We propose a fast and fully automatic mosaic method for high-resolution satellite images. First, the overlapping rectangle is computed from the geographical locations of the reference and mosaic images, and feature points on both images are extracted by a scale-invariant feature transform (SIFT) algorithm from the overlapping region only. Then, the RANSAC method is used to match the feature points of both images. Finally, the two images are fused into a seamless panoramic image by simple linear weighted fusion or another method. The proposed method is implemented in C++ based on OpenCV and GDAL, and tested on Worldview-2 multispectral images with a spatial resolution of 2 meters. Results show that the proposed method can detect feature points efficiently and mosaic images automatically.
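The final fusion step can be sketched as follows. This is only a minimal pure-Python illustration of linear weighted blending across one row of an overlap region, not the authors' C++/OpenCV implementation; the function name `linear_blend_row` and the assumption that both rows are already registered to the same pixel grid are hypothetical.

```python
def linear_blend_row(ref_row, mos_row):
    """Blend two equally long rows of pixel values taken from the overlap region.

    The reference image's weight falls linearly from 1 to 0 across the
    overlap, while the mosaic image's weight rises from 0 to 1.
    """
    n = len(ref_row)
    if n != len(mos_row):
        raise ValueError("rows must cover the same overlap width")
    if n == 1:
        return [0.5 * (ref_row[0] + mos_row[0])]
    blended = []
    for x in range(n):
        w = x / (n - 1)   # 0 at the reference side, 1 at the mosaic side
        blended.append((1.0 - w) * ref_row[x] + w * mos_row[x])
    return blended
```

For example, blending a constant row of 100 with a constant row of 200 over three pixels gives a ramp from 100 to 200, which is what removes the visible seam.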
Sellers, Michael S; Lísal, Martin; Brennan, John K
2016-03-21
We present an extension of various free-energy methodologies to determine the chemical potential of the solid and liquid phases of a fully-flexible molecule using classical simulation. The methods are applied to the Smith-Bharadwaj atomistic potential representation of cyclotrimethylene trinitramine (RDX), a well-studied energetic material, to accurately determine the solid and liquid phase Gibbs free energies, and the melting point (Tm). We outline an efficient technique to find the absolute chemical potential and melting point of a fully-flexible molecule using one set of simulations to compute the solid absolute chemical potential and one set of simulations to compute the solid-liquid free energy difference. With this combination, only a handful of simulations are needed, whereby the absolute quantities of the chemical potentials are obtained, for use in other property calculations, such as the characterization of crystal polymorphs or the determination of the entropy. Using the LAMMPS molecular simulator, the Frenkel and Ladd and pseudo-supercritical path techniques are adapted to generate 3rd order fits of the solid and liquid chemical potentials. Results yield the thermodynamic melting point Tm = 488.75 K at 1.0 atm. We also validate these calculations and compare this melting point to one obtained from a typical superheated simulation technique.
Chen, Haibin; Zhong, Zichun; Liao, Yuliang; Pompoš, Arnold; Hrycushko, Brian; Albuquerque, Kevin; Zhen, Xin; Zhou, Linghong; Gu, Xuejun
2016-02-07
GEC-ESTRO guidelines for high dose rate cervical brachytherapy advocate the reporting of the D2cc (the minimum dose received by the maximally exposed 2cc volume) to organs at risk. Due to large interfractional organ motion, reporting of accurate cumulative D2cc over a multifractional course is a non-trivial task requiring deformable image registration and deformable dose summation. To efficiently and accurately describe the point-to-point correspondence of the bladder wall over all treatment fractions while preserving local topologies, we propose a novel graphic processing unit (GPU)-based non-rigid point matching algorithm. This is achieved by introducing local anatomic information into the iterative update of correspondence matrix computation in the 'thin plate splines-robust point matching' (TPS-RPM) scheme. The performance of the GPU-based TPS-RPM with local topology preservation algorithm (TPS-RPM-LTP) was evaluated using four numerically simulated synthetic bladders having known deformations, a custom-made porcine bladder phantom embedded with twenty one fiducial markers, and 29 fractional computed tomography (CT) images from seven cervical cancer patients. Results show that TPS-RPM-LTP achieved excellent geometric accuracy with landmark residual distance error (RDE) of 0.7 ± 0.3 mm for the numerical synthetic data with different scales of bladder deformation and structure complexity, and 3.7 ± 1.8 mm and 1.6 ± 0.8 mm for the porcine bladder phantom with large and small deformation, respectively. The RDE accuracy of the urethral orifice landmarks in patient bladders was 3.7 ± 2.1 mm. When compared to the original TPS-RPM, the TPS-RPM-LTP improved landmark matching by reducing landmark RDE by 50 ± 19%, 37 ± 11% and 28 ± 11% for the synthetic, porcine phantom and the patient bladders, respectively. This was achieved with a computational time of less than 15 s in all cases with GPU acceleration. 
The efficiency and accuracy shown with the TPS-RPM-LTP indicate that it is a practical and promising tool for bladder dose summation in adaptive cervical cancer brachytherapy.
NASA Astrophysics Data System (ADS)
Chen, Haibin; Zhong, Zichun; Liao, Yuliang; Pompoš, Arnold; Hrycushko, Brian; Albuquerque, Kevin; Zhen, Xin; Zhou, Linghong; Gu, Xuejun
2016-02-01
GEC-ESTRO guidelines for high dose rate cervical brachytherapy advocate the reporting of the D2cc (the minimum dose received by the maximally exposed 2cc volume) to organs at risk. Due to large interfractional organ motion, reporting of accurate cumulative D2cc over a multifractional course is a non-trivial task requiring deformable image registration and deformable dose summation. To efficiently and accurately describe the point-to-point correspondence of the bladder wall over all treatment fractions while preserving local topologies, we propose a novel graphic processing unit (GPU)-based non-rigid point matching algorithm. This is achieved by introducing local anatomic information into the iterative update of correspondence matrix computation in the ‘thin plate splines-robust point matching’ (TPS-RPM) scheme. The performance of the GPU-based TPS-RPM with local topology preservation algorithm (TPS-RPM-LTP) was evaluated using four numerically simulated synthetic bladders having known deformations, a custom-made porcine bladder phantom embedded with twenty one fiducial markers, and 29 fractional computed tomography (CT) images from seven cervical cancer patients. Results show that TPS-RPM-LTP achieved excellent geometric accuracy with landmark residual distance error (RDE) of 0.7 ± 0.3 mm for the numerical synthetic data with different scales of bladder deformation and structure complexity, and 3.7 ± 1.8 mm and 1.6 ± 0.8 mm for the porcine bladder phantom with large and small deformation, respectively. The RDE accuracy of the urethral orifice landmarks in patient bladders was 3.7 ± 2.1 mm. When compared to the original TPS-RPM, the TPS-RPM-LTP improved landmark matching by reducing landmark RDE by 50 ± 19%, 37 ± 11% and 28 ± 11% for the synthetic, porcine phantom and the patient bladders, respectively. This was achieved with a computational time of less than 15 s in all cases with GPU acceleration. 
The efficiency and accuracy shown with the TPS-RPM-LTP indicate that it is a practical and promising tool for bladder dose summation in adaptive cervical cancer brachytherapy.
Accelerated computer generated holography using sparse bases in the STFT domain.
Blinder, David; Schelkens, Peter
2018-01-22
Computer-generated holography at high resolutions is a computationally intensive task. Efficient algorithms are needed to generate holograms at acceptable speeds, especially for real-time and interactive applications such as holographic displays. We propose a novel technique to generate holograms using a sparse basis representation in the short-time Fourier space combined with a wavefront-recording plane placed in the middle of the 3D object. By computing the point spread functions in the transform domain, we update only a small subset of the precomputed largest-magnitude coefficients to significantly accelerate the algorithm over conventional look-up table methods. We implement the algorithm on a GPU, and report a speedup factor of over 30. We show that this transform is superior to wavelet-based approaches, and show quantitative and qualitative improvements over the state-of-the-art WASABI method; we report accuracy gains of 2 dB PSNR, as well as improved view preservation.
3D reconstruction of wooden member of ancient architecture from point clouds
NASA Astrophysics Data System (ADS)
Zhang, Ruiju; Wang, Yanmin; Li, Deren; Zhao, Jun; Song, Daixue
2006-10-01
This paper presents a 3D reconstruction method for modeling wooden members of ancient architecture from point clouds, based on an improved deformable model. Three steps are taken to recover the shape of a wooden member. First, the Hessian matrix is used to compute the axis of the wooden member. Second, an initial model of the wooden member is built from contours orthogonal to its axis. Third, an accurate model is obtained through the coupling between the initial model and the point clouds of the wooden member, according to the theory of the improved deformable model. Every step and algorithm is studied and described in the paper. Using point clouds captured from the Forbidden City of China, a shaft member and a beam member are taken as examples to test the proposed method. Results show the efficiency and robustness of the proposed method for modeling the wooden members of ancient architecture.
Increasing Efficiency at the NTF by Optimizing Model AoA Positioning
NASA Technical Reports Server (NTRS)
Crawford, Bradley L.; Spells, Courtney
2006-01-01
The National Transonic Facility (NTF) at NASA Langley Research Center (LaRC) is a national resource for aeronautical research and development. The government, military and private industries rely on the capability of this facility for realistic flight data. Reducing operating costs and keeping the NTF affordable is essential for aeronautics research. The NTF is undertaking an effort to reduce the time between data points during a pitch polar. This reduction is being driven by the operating costs of a cryogenic facility. If the time per data point can be reduced, substantial cost savings can be realized from a reduction in liquid nitrogen (LN2) consumption. It is known that angle-of-attack (AoA) positioning is the longest lead-time item between points. In January 2005 a test was conducted at the NTF to determine the cause of the long lead-time so that an effort could be made to improve efficiency. The AoA signal at the NTF originates from onboard instrumentation, then travels through a number of different systems including the signal conditioner, digital voltmeter, and the data system where the AoA is calculated. It is then fed into a closed loop control system that sets the model position. Each process along this path adds to the time per data point, affecting the efficiency of the data taking process. Due to the nature of the closed loop feedback AoA control and the signal path, it takes approximately 18 seconds to take one pitch pause point with a typical AoA increment. Options are being investigated to reduce the time delay between points by modifying the signal path. These options include reduced signal filtering, using analog channels instead of a digital voltmeter (DVM), re-routing the signal directly to the AoA control computer, and implementing new control algorithms. Each of these has the potential to reduce the positioning time, and together the savings could be significant.
These timesaving efforts are essential but must be weighed against possible loss of data quality. For example, a reduction in filtering can introduce noise into the signal, and using analog channels could result in some loss of accuracy. Data quality assessments need to be performed concurrently with timesaving techniques, since data quality parameters are essential in maintaining facility integrity. This paper will highlight timesaving efforts being undertaken or studied at the NTF. It will outline the instrumentation and computer systems involved in setting the model pitch attitude, then suggest changes to the process and discuss how these system changes would affect the time between data points. It also discusses the issue of data quality and how the potential efficiency changes in the system could affect it. Lastly, it will discuss the possibility of using an open loop control system and give some pros and cons of this method.
NASA Technical Reports Server (NTRS)
Nobbs, Steven G.
1995-01-01
An overview of the performance seeking control (PSC) algorithm and details of the important components of the algorithm are given. The onboard propulsion system models, the linear programming optimization, and engine control interface are described. The PSC algorithm receives input from various computers on the aircraft including the digital flight computer, digital engine control, and electronic inlet control. The PSC algorithm contains compact models of the propulsion system including the inlet, engine, and nozzle. The models compute propulsion system parameters, such as inlet drag and fan stall margin, which are not directly measurable in flight. The compact models also compute sensitivities of the propulsion system parameters to change in control variables. The engine model consists of a linear steady state variable model (SSVM) and a nonlinear model. The SSVM is updated with efficiency factors calculated in the engine model update logic, or Kalman filter. The efficiency factors are used to adjust the SSVM to match the actual engine. The propulsion system models are mathematically integrated to form an overall propulsion system model. The propulsion system model is then optimized using a linear programming optimization scheme. The goal of the optimization is determined from the selected PSC mode of operation. The resulting trims are used to compute a new operating point about which the optimization process is repeated. This process is continued until an overall (global) optimum is reached before applying the trims to the controllers.
Heat-driven liquid metal cooling device for the thermal management of a computer chip
NASA Astrophysics Data System (ADS)
Ma, Kun-Quan; Liu, Jing
2007-08-01
The tremendous heat generated in a computer chip or very large scale integrated circuit raises many challenging issues to be solved. Recently, liquid metal with a low melting point was established as the most conductive coolant for efficiently cooling a computer chip. Here, by making full use of the two merits of liquid metal, i.e. superior heat transfer performance and the ability to be driven electromagnetically, we demonstrate for the first time a liquid-cooling concept for the thermal management of a computer chip that uses waste heat to power a thermoelectric generator (TEG) and thus drive the flow of the liquid metal. Such a device consumes no net external energy, which makes it a self-supporting and completely silent liquid-cooling module. Experiments on devices driven by one- or two-stage TEGs indicate that a dramatic temperature drop on the simulated chip has been realized without the aid of any fans. The higher the heat load, the larger the temperature decrease caused by the cooling device. Further, the two TEGs will generate a larger current if a copper plate is sandwiched between them to enhance heat dissipation there. This new method is expected to be significant in the future thermal management of desktop or notebook computers, where both efficient cooling and extremely low energy consumption are of major concern.
NASA Astrophysics Data System (ADS)
Kenway, Gaetan K. W.
This thesis presents new tools and techniques developed to address the challenging problem of high-fidelity aerostructural optimization with respect to large numbers of design variables. A new mesh-movement scheme is developed that is both computationally efficient and sufficiently robust to accommodate large geometric design changes and aerostructural deformations. A fully coupled Newton-Krylov method is presented that accelerates the convergence of aerostructural systems, provides a 20% performance improvement over the traditional nonlinear block Gauss-Seidel approach, and can handle more flexible structures. A coupled adjoint method is used that efficiently computes derivatives for a gradient-based optimization algorithm. The implementation uses only machine accurate derivative techniques and is verified to yield fully consistent derivatives by comparing against the complex step method. The fully-coupled large-scale coupled adjoint solution method is shown to have 30% better performance than the segregated approach. The parallel scalability of the coupled adjoint technique is demonstrated on an Euler Computational Fluid Dynamics (CFD) model with more than 80 million state variables coupled to a detailed structural finite-element model of the wing with more than 1 million degrees of freedom. Multi-point high-fidelity aerostructural optimizations of a long-range, wide-body, transonic transport aircraft configuration are performed using the developed techniques. The aerostructural analysis employs Euler CFD with a 2 million cell mesh and a structural finite element model with 300 000 degrees of freedom (DOF). Two design optimization problems are solved: one where takeoff gross weight (TOGW) is minimized, and another where fuel burn is minimized. Each optimization uses a multi-point formulation with 5 cruise conditions and 2 maneuver conditions. The optimization problems have 476 design variables, and optimal results are obtained within 36 hours of wall time using 435 processors.
The TOGW minimization results in a 4.2% reduction in TOGW with a 6.6% fuel burn reduction, while the fuel burn optimization results in an 11.2% fuel burn reduction with no change to the takeoff gross weight.
A method to calculate the gamma ray detection efficiency of a cylindrical NaI (Tl) crystal
NASA Astrophysics Data System (ADS)
Ahmadi, S.; Ashrafi, S.; Yazdansetad, F.
2018-05-01
Given the wide range of applications of NaI(Tl) detectors in the industrial and medical sectors, computing the detection efficiency at different distances from a radioactive source, especially for calibration purposes, is a subject of radiation detection studies. In this work, a cylindrical NaI(Tl) scintillator measuring 2 in in both radius and height was used, and by changing the radial, axial, and diagonal positions of an isotropic 137Cs point source relative to the detector, the solid angles and the interaction probabilities of gamma photons with the detector's sensitive area have been calculated. The calculations give the geometric and intrinsic efficiencies as functions of the detector's dimensions and the position of the source. The calculation model is in good agreement with experiment and with MCNPX simulation.
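The solid-angle part of such a calculation can be illustrated for the simplest geometry, a point source on the detector axis. The sketch below uses the standard closed-form solid angle of a circular disc seen from an on-axis point; it covers only the axial case and only the front face, and the function names and the 5.08 cm radius (2 in, as stated in the abstract) are illustrative assumptions rather than the authors' model.

```python
import math


def on_axis_solid_angle(distance_cm, radius_cm):
    """Solid angle (steradians) subtended by a circular detector face
    at a point source located on the detector axis."""
    if distance_cm < 0 or radius_cm <= 0:
        raise ValueError("distance must be >= 0 and radius must be > 0")
    return 2.0 * math.pi * (1.0 - distance_cm / math.hypot(distance_cm, radius_cm))


def geometric_efficiency(distance_cm, radius_cm):
    """Fraction of isotropically emitted photons heading toward the detector face."""
    return on_axis_solid_angle(distance_cm, radius_cm) / (4.0 * math.pi)


# Illustrative use: a 2 in (5.08 cm) radius face, source 10 cm away on the axis.
efficiency_at_10_cm = geometric_efficiency(10.0, 5.08)
```

With the source touching the face the geometric efficiency is one half, and at large distances it approaches R²/(4d²); the intrinsic efficiency would additionally require the photon interaction probability along each path through the crystal.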
2D-RBUC for efficient parallel compression of residuals
NASA Astrophysics Data System (ADS)
Đurđević, Đorđe M.; Tartalja, Igor I.
2018-02-01
In this paper, we present a method for lossless compression of residuals with efficient SIMD-parallel decompression. The residuals originate from lossy or near-lossless compression of height fields, which are commonly used to represent models of terrains. The algorithm is founded on the existing RBUC method for compression of non-uniform data sources. We have adapted the method to capture the 2D spatial locality of height fields, and developed the data decompression algorithm for modern GPU architectures already present even in home computers. In combination with the point-level SIMD-parallel lossless/lossy height field compression method HFPaC, characterized by fast progressive decompression and a seamlessly reconstructed surface, the newly proposed method trades a small efficiency degradation for a non-negligible compression ratio benefit (measured up to 91%).
Simulation of a Geiger-Mode Imaging LADAR System for Performance Assessment
Kim, Seongjoon; Lee, Impyeong; Kwon, Yong Joon
2013-01-01
As LADAR systems applications gradually become more diverse, new types of systems are being developed. When developing new systems, simulation studies are an essential prerequisite. A simulator enables performance predictions and optimal system parameters at the design level, as well as providing sample data for developing and validating application algorithms. The purpose of the study is to propose a method for simulating a Geiger-mode imaging LADAR system. We develop simulation software to assess system performance and generate sample data for the applications. The simulation is based on three aspects of modeling—the geometry, radiometry and detection. The geometric model computes the ranges to the reflection points of the laser pulses. The radiometric model generates the return signals, including the noises. The detection model determines the flight times of the laser pulses based on the nature of the Geiger-mode detector. We generated sample data using the simulator with the system parameters and analyzed the detection performance by comparing the simulated points to the reference points. The proportion of the outliers in the simulated points reached 25.53%, indicating the need for efficient outlier elimination algorithms. In addition, the false alarm rate and dropout rate of the designed system were computed as 1.76% and 1.06%, respectively. PMID:23823970
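Two of the quantities mentioned in this abstract can be illustrated with a short sketch: converting a round-trip flight time into a range, and computing the proportion of simulated points that deviate from their reference points. The function names, the pairing of points by index, and the `tolerance_m` threshold are hypothetical; the abstract does not say how the authors define an outlier.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_from_flight_time(flight_time_s):
    """Range to the reflection point from the round-trip flight time of a pulse."""
    if flight_time_s < 0:
        raise ValueError("flight time must be non-negative")
    return 0.5 * SPEED_OF_LIGHT_M_PER_S * flight_time_s


def outlier_fraction(simulated_ranges_m, reference_ranges_m, tolerance_m):
    """Fraction of simulated ranges farther than tolerance_m from the paired reference."""
    if not simulated_ranges_m or len(simulated_ranges_m) != len(reference_ranges_m):
        raise ValueError("need two non-empty sequences of equal length")
    outliers = sum(
        1
        for sim, ref in zip(simulated_ranges_m, reference_ranges_m)
        if abs(sim - ref) > tolerance_m
    )
    return outliers / len(simulated_ranges_m)
```

A pulse returning after one microsecond corresponds to a range of roughly 150 m, since the light covers the distance twice.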
Development of a probabilistic analysis methodology for structural reliability estimation
NASA Technical Reports Server (NTRS)
Torng, T. Y.; Wu, Y.-T.
1991-01-01
A novel probabilistic analysis method for the assessment of structural reliability is presented, which combines fast convolution with an efficient structural reliability analysis. After identifying the most important point of a limit state, the method establishes a quadratic performance function. It then transforms the quadratic function into a linear one and applies fast convolution. The method is applicable to problems requiring computer-intensive structural analysis. Five illustrative examples of the method's application are given.
C.A.D. and ergonomic workstations conception
NASA Astrophysics Data System (ADS)
Keravel, Francine
1986-07-01
Computer-aided design (CAD) can be used to design workstations. Ergonomic data can complement this approach and help ensure a coherent, reliable design. Representations of complex machine shapes, anthropometric data, and environmental factors make it possible to identify the limits between the human operator and the new technological situation. Users' working capacity, safety, comfort, and human efficiency could also be included. Such a program, integrated with an expert system, would give a complete assessment of a workstation design.
Development of a High-Throughput Microwave Imaging System for Concealed Weapons Detection
2016-07-15
hardware. Index Terms—Microwave imaging, multistatic radar, Fast Fourier Transform (FFT). I. INTRODUCTION Near-field microwave imaging is a non-ionizing...configuration, but its computational demands are extreme. Fast Fourier Transform (FFT) imaging has long been used to efficiently construct images sampled with...Simulated image of 25 point scatterers imaged at range 1.5m, with array layout depicted in Fig. 3. Left: image formed with Equation (5) ( Fourier
Assessment of Li/SOCL2 Battery Technology; Reserve, Thin-Cell Design. Volume 3
1990-06-01
power density and efficiency of an operating electrochemical system. The method is general - the examples to illustrate the selected points pertain to... System: Design, Manufacturing and QC Considerations), S. Szpak, P. A. Mosier-Boss, and J. J. Smith, 34th International Power Sources Symposium, Cherry...I) the computer time required to evaluate the integral in Eqn. Ill, and (iii the lack of generality in the attainable lineshapes. However, since this
Extending the BEAGLE library to a multi-FPGA platform.
Jin, Zheming; Bakos, Jason D
2013-01-19
Maximum Likelihood (ML)-based phylogenetic inference using Felsenstein's pruning algorithm is a standard method for estimating the evolutionary relationships amongst a set of species based on DNA sequence data, and is used in popular applications such as RAxML, PHYLIP, GARLI, BEAST, and MrBayes. The Phylogenetic Likelihood Function (PLF) and its associated scaling and normalization steps comprise the computational kernel for these tools. These computations are data intensive but contain fine grain parallelism that can be exploited by coprocessor architectures such as FPGAs and GPUs. A general purpose API called BEAGLE has recently been developed that includes optimized implementations of Felsenstein's pruning algorithm for various data parallel architectures. In this paper, we extend the BEAGLE API to a multiple Field Programmable Gate Array (FPGA)-based platform called the Convey HC-1. The core calculation of our implementation, which includes both the phylogenetic likelihood function (PLF) and the tree likelihood calculation, has an arithmetic intensity of 130 floating-point operations per 64 bytes of I/O, or 2.03 ops/byte. Its performance can thus be calculated as a function of the host platform's peak memory bandwidth and the implementation's memory efficiency, as 2.03 × peak bandwidth × memory efficiency. Our FPGA-based platform has a peak bandwidth of 76.8 GB/s and our implementation achieves a memory efficiency of approximately 50%, which gives an average throughput of 78 Gflops. This represents a ~40X speedup when compared with BEAGLE's CPU implementation on a dual Xeon 5520 and 3X speedup versus BEAGLE's GPU implementation on a Tesla T10 GPU for very large data sizes. The power consumption is 92 W, yielding a power efficiency of 1.7 Gflops per Watt. The use of data parallel architectures to achieve high performance for likelihood-based phylogenetic inference requires high memory bandwidth and a design methodology that emphasizes high memory efficiency. 
To achieve this objective, we integrated 32 pipelined processing elements (PEs) across four FPGAs. For the design of each PE, we developed a specialized synthesis tool to generate a floating-point pipeline with resource and throughput constraints that match the target platform. We have found that using low-latency floating-point operators can significantly reduce FPGA area while still meeting the timing requirements of the target platform. We found that this design methodology can achieve performance that exceeds that of a GPU-based coprocessor.
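The bandwidth-bound throughput estimate quoted in this abstract can be reproduced with a few lines of arithmetic. The sketch below only restates the abstract's own formula (arithmetic intensity × peak bandwidth × memory efficiency) using its stated figures; the function names are illustrative.

```python
def arithmetic_intensity(flops, bytes_moved):
    """Floating-point operations performed per byte of I/O."""
    return flops / bytes_moved


def predicted_throughput_gflops(intensity, peak_bandwidth_gb_per_s, memory_efficiency):
    """Throughput of a memory-bound kernel: intensity x bandwidth x efficiency."""
    return intensity * peak_bandwidth_gb_per_s * memory_efficiency


# Figures taken from the abstract: 130 flops per 64 bytes, 76.8 GB/s, ~50% efficiency.
intensity = arithmetic_intensity(130, 64)
throughput = predicted_throughput_gflops(intensity, 76.8, 0.50)
```

With these inputs the intensity is about 2.03 ops/byte and the predicted throughput is 78 Gflops, matching the value reported in the abstract.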
NASA Astrophysics Data System (ADS)
Shahamatnia, Ehsan; Dorotovič, Ivan; Fonseca, Jose M.; Ribeiro, Rita A.
2016-03-01
Developing specialized software tools is essential to support studies of solar activity evolution. With new space missions such as Solar Dynamics Observatory (SDO), solar images are being produced in unprecedented volumes. To capitalize on that huge data availability, the scientific community needs a new generation of software tools for automatic and efficient data processing. In this paper a prototype of a modular framework for solar feature detection, characterization, and tracking is presented. To develop an efficient system capable of automatic solar feature tracking and measuring, a hybrid approach combining specialized image processing, evolutionary optimization, and soft computing algorithms is being followed. The specialized hybrid algorithm for tracking solar features allows automatic feature tracking while gathering characterization details about the tracked features. The hybrid algorithm takes advantage of the snake model, a specialized image processing algorithm widely used in applications such as boundary delineation, image segmentation, and object tracking. Further, it exploits the flexibility and efficiency of Particle Swarm Optimization (PSO), a stochastic population-based optimization algorithm. PSO has been used successfully in a wide range of applications including combinatorial optimization, control, clustering, robotics, scheduling, image processing, and video analysis. The proposed tool, denoted the PSO-Snake model, has already been tested successfully in other works for tracking sunspots and coronal bright points. In this work, we discuss the application of the PSO-Snake algorithm for calculating the sidereal rotational angular velocity of the solar corona. To validate the results we compare them with published manual results obtained by an expert.
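For readers unfamiliar with PSO, the following is a minimal sketch of a plain particle swarm minimizer. It is not the PSO-Snake model from the abstract: there is no snake energy and no image data, the objective is a toy quadratic, and every name and parameter value (inertia 0.7, acceleration coefficients 1.5) is an assumption chosen for illustration.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in pos]
    p_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=p_val.__getitem__)
    g_best, g_val = p_best[g][:], p_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (p_best[i][d] - pos[i][d])
                             + c_global * r2 * (g_best[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < p_val[i]:
                p_val[i], p_best[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_best = val, pos[i][:]
    return g_best, g_val


def shifted_sphere(p):
    # Toy objective with its minimum at (1, -2).
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best_pos, best_val = pso_minimize(shifted_sphere, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_pos, best_val)
```

In a snake-based tracker the objective would presumably be a contour energy evaluated on the image, but that coupling is specific to the cited work and is not reproduced here.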
NASA Astrophysics Data System (ADS)
Ibrahima, Fayadhoi; Meyer, Daniel; Tchelepi, Hamdi
2016-04-01
Because geophysical data are inexorably sparse and incomplete, stochastic treatments of simulated responses are crucial to explore possible scenarios and assess risks in subsurface problems. In particular, nonlinear two-phase flows in porous media are essential, yet challenging, in reservoir simulation and hydrology. Adding highly heterogeneous and uncertain input, such as the permeability and porosity fields, transforms the estimation of the flow response into a tough stochastic problem for which computationally expensive Monte Carlo (MC) simulations remain the preferred option. We propose an alternative approach to evaluate the probability distribution of the (water) saturation for the stochastic Buckley-Leverett problem when the probability distributions of the permeability and porosity fields are available. We give a computationally efficient and numerically accurate method to estimate the one-point probability density (PDF) and cumulative distribution functions (CDF) of the (water) saturation. The distribution method draws inspiration from a Lagrangian approach of the stochastic transport problem and expresses the saturation PDF and CDF essentially in terms of a deterministic mapping and the distribution and statistics of scalar random fields. In a large class of applications these random fields can be estimated at low computational costs (few MC runs), thus making the distribution method attractive. Even though the method relies on a key assumption of fixed streamlines, we show that it performs well for high input variances, which is the case of interest. Once the saturation distribution is determined, any one-point statistics thereof can be obtained, especially the saturation average and standard deviation. Moreover, the probability of rare events and saturation quantiles (e.g. P10, P50 and P90) can be efficiently derived from the distribution method. 
These statistics can then be used for risk assessment, as well as data assimilation and uncertainty reduction in the prior knowledge of input distributions. We provide various examples and comparisons with MC simulations to illustrate the performance of the method.
The Pointing Self-calibration Algorithm for Aperture Synthesis Radio Telescopes
NASA Astrophysics Data System (ADS)
Bhatnagar, S.; Cornwell, T. J.
2017-11-01
This paper is concerned with algorithms for calibration of direction-dependent effects (DDE) in aperture synthesis radio telescopes (ASRT). After correction of direction-independent effects (DIE) using self-calibration, imaging performance can be limited by the imprecise knowledge of the forward gain of the elements in the array. In general, the forward gain pattern is directionally dependent and varies with time for a number of reasons. Some factors, such as rotation of the primary beam with Parallactic Angle for Azimuth-Elevation mount antennas, are known a priori. Some, such as antenna pointing errors and structural deformation/projection effects for aperture-array elements, cannot be measured a priori. Thus, in addition to algorithms to correct for DD effects known a priori, algorithms to solve for DD gains are required for high dynamic range imaging. Here, we discuss a mathematical framework for antenna-based DDE calibration algorithms and show that this framework leads to computationally efficient optimal algorithms that scale well in a parallel computing environment. As an example of an antenna-based DD calibration algorithm, we demonstrate the Pointing SelfCal (PSC) algorithm to solve for the antenna pointing errors. Our analysis shows that the sensitivity of modern ASRT is sufficient to solve for antenna pointing errors and other DD effects. We also discuss the use of the PSC algorithm in real-time calibration systems and extensions for antenna Shape SelfCal algorithm for real-time tracking and corrections for pointing offsets and changes in antenna shape.
Gridless, pattern-driven point cloud completion and extension
NASA Astrophysics Data System (ADS)
Gravey, Mathieu; Mariethoz, Gregoire
2016-04-01
While satellites offer Earth observation with a wide coverage, other remote sensing techniques such as terrestrial LiDAR can acquire very high-resolution data on an area that is limited in extent and often discontinuous due to shadow effects. Here we propose a numerical approach to merge these two types of information, thereby reconstructing high-resolution data on a continuous large area. It is based on a pattern matching process that completes the areas where only low-resolution data is available, using bootstrapped high-resolution patterns. Currently, the most common approach to pattern matching is to interpolate the point data on a grid. While this approach is computationally efficient, it presents major drawbacks for point cloud processing because a significant part of the information is lost in the point-to-grid resampling, and because a prohibitive amount of memory is needed to store large grids. To address these issues, we propose a gridless method that compares point cloud subsets without the need to use a grid. On-the-fly interpolation involves a heavy computational load, which is met by using a highly optimized GPU implementation and a hierarchical pattern searching strategy. The method is illustrated using data from the Val d'Arolla, Swiss Alps, where high-resolution terrestrial LiDAR data are fused with lower-resolution Landsat and WorldView-3 acquisitions, such that the density of points is homogenized (data completion) and extended to a larger area (data extension).
THE SCREENING AND RANKING ALGORITHM FOR CHANGE-POINTS DETECTION IN MULTIPLE SAMPLES
Song, Chi; Min, Xiaoyi; Zhang, Heping
2016-01-01
The chromosome copy number variation (CNV) is the deviation of genomic regions from their normal copy number states, which may associate with many human diseases. Current genetic studies usually collect hundreds to thousands of samples to study the association between CNV and diseases. CNVs can be called by detecting the change-points in mean for sequences of array-based intensity measurements. Although multiple samples are of interest, the majority of the available CNV calling methods are single sample based. Only a few multiple sample methods have been proposed using scan statistics that are computationally intensive and designed toward either common or rare change-points detection. In this paper, we propose a novel multiple sample method by adaptively combining the scan statistic of the screening and ranking algorithm (SaRa), which is computationally efficient and is able to detect both common and rare change-points. We prove that asymptotically this method can find the true change-points with almost certainty and show in theory that multiple sample methods are superior to single sample methods when shared change-points are of interest. Additionally, we report extensive simulation studies to examine the performance of our proposed method. Finally, using our proposed method as well as two competing approaches, we attempt to detect CNVs in the data from the Primary Open-Angle Glaucoma Genes and Environment study, and conclude that our method is faster and requires less information while our ability to detect the CNVs is comparable or better. PMID:28090239
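To make the idea of detecting change-points in the mean concrete, here is a minimal single-sequence sketch of a local-difference screening step in the spirit of a screening and ranking procedure. It is not the multiple-sample method proposed in the abstract: the window size `h`, the fixed `threshold`, and both function names are assumptions for illustration only.

```python
def local_diff(y, h):
    """|mean of the h points from x onward - mean of the h points before x|."""
    n = len(y)
    d = [0.0] * n
    for x in range(h, n - h + 1):
        left = sum(y[x - h:x]) / h
        right = sum(y[x:x + h]) / h
        d[x] = abs(right - left)
    return d


def screen_change_points(y, h, threshold):
    """Keep positions whose statistic is a local maximum within +/- h
    and exceeds the threshold."""
    d = local_diff(y, h)
    hits = []
    for x in range(h, len(y) - h + 1):
        window = d[max(0, x - h):x + h + 1]
        if d[x] >= threshold and d[x] == max(window):
            hits.append(x)
    return hits


# Noise-free toy sequence with one jump in the mean at index 20.
intensities = [0.0] * 20 + [2.0] * 20
found = screen_change_points(intensities, h=5, threshold=1.0)
print(found)
```

Each position is scored using only the `2h` points around it, so the screening cost grows linearly with the sequence length. On real, noisy intensity data the threshold would need to be calibrated rather than fixed by hand.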
NASA Astrophysics Data System (ADS)
Aviat, Félix; Lagardère, Louis; Piquemal, Jean-Philip
2017-10-01
In a recent paper [F. Aviat et al., J. Chem. Theory Comput. 13, 180-190 (2017)], we proposed the Truncated Conjugate Gradient (TCG) approach to compute the polarization energy and forces in polarizable molecular simulations. The method consists in truncating the conjugate gradient algorithm at a fixed predetermined order leading to a fixed computational cost and can thus be considered "non-iterative." This gives the possibility to derive analytical forces avoiding the usual energy conservation (i.e., drifts) issues occurring with iterative approaches. A key point concerns the evaluation of the analytical gradients, which is more complex than that with a usual solver. In this paper, after reviewing the present state of the art of polarization solvers, we detail a viable strategy for the efficient implementation of the TCG calculation. The complete cost of the approach is then measured as it is tested using a multi-time step scheme and compared to timings using usual iterative approaches. We show that the TCG methods are more efficient than traditional techniques, making it a method of choice for future long molecular dynamics simulations using polarizable force fields where energy conservation matters. We detail the various steps required for the implementation of the complete method by software developers.
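The core idea, stopping conjugate gradient after a fixed number of steps so that the cost is known in advance, can be sketched in a few lines. This is a generic illustration on a small symmetric positive-definite system, not the authors' polarization solver; the matrix, the right-hand side, the zero initial guess, and all function names are assumptions.

```python
def matvec(a, v):
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def truncated_cg(a, b, order):
    """Run exactly `order` conjugate gradient steps on a x = b,
    starting from x = 0, with no convergence test."""
    x = [0.0] * len(b)
    r = b[:]          # residual b - a x for the zero initial guess
    p = r[:]          # first search direction
    rr = dot(r, r)
    for _ in range(order):
        if rr == 0.0:  # already exact; avoid dividing by zero
            break
        ap = matvec(a, p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def residual_norm(a, b, x):
    ax = matvec(a, x)
    return sum((bi - axi) ** 2 for bi, axi in zip(b, ax)) ** 0.5


system = [[4.0, 1.0], [1.0, 3.0]]
rhs = [1.0, 2.0]
x_order1 = truncated_cg(system, rhs, order=1)
x_order2 = truncated_cg(system, rhs, order=2)
print(residual_norm(system, rhs, x_order1), residual_norm(system, rhs, x_order2))
```

Because the number of steps is fixed, the result is a closed-form function of the inputs, which is what makes analytical differentiation possible in principle; the gradient expressions themselves are the subject of the cited paper and are not attempted here.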
Sensitivity analysis and approximation methods for general eigenvalue problems
NASA Technical Reports Server (NTRS)
Murthy, D. V.; Haftka, R. T.
1986-01-01
Optimization of dynamic systems involving complex non-hermitian matrices is often computationally expensive. Major contributors to the computational expense are the sensitivity analysis and reanalysis of a modified design. The present work seeks to alleviate this computational burden by identifying efficient sensitivity analysis and approximate reanalysis methods. For the algebraic eigenvalue problem involving non-hermitian matrices, algorithms for sensitivity analysis and approximate reanalysis are classified, compared and evaluated for efficiency and accuracy. Proper eigenvector normalization is discussed. An improved method for calculating derivatives of eigenvectors is proposed based on a more rational normalization condition and taking advantage of matrix sparsity. Important numerical aspects of this method are also discussed. To alleviate the problem of reanalysis, various approximation methods for eigenvalues are proposed and evaluated. Linear and quadratic approximations are based directly on the Taylor series. Several approximation methods are developed based on the generalized Rayleigh quotient for the eigenvalue problem. Approximation methods based on trace theorem give high accuracy without needing any derivatives. Operation counts for the computation of the approximations are given. General recommendations are made for the selection of appropriate approximation technique as a function of the matrix size, number of design variables, number of eigenvalues of interest and the number of design points at which approximation is sought.
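As a small illustration of a Rayleigh-quotient-type eigenvalue approximation for a non-symmetric matrix, the sketch below reuses the left and right eigenvectors of an unmodified 2x2 matrix to estimate an eigenvalue of a modified one. The matrices, the perturbation, and the function name are hypothetical, and this is one common form of the generalized Rayleigh quotient rather than necessarily the exact variants evaluated in the report.

```python
def generalized_rayleigh_quotient(matrix, left_vec, right_vec):
    """(y^T A x) / (y^T x) for left vector y and right vector x."""
    ax = [sum(aij * xj for aij, xj in zip(row, right_vec)) for row in matrix]
    numerator = sum(yi * axi for yi, axi in zip(left_vec, ax))
    denominator = sum(yi * xi for yi, xi in zip(left_vec, right_vec))
    return numerator / denominator


# Baseline design: A = [[2, 1], [0, 3]] has eigenvalue 2 with
# right eigenvector x = (1, 0) and left eigenvector y = (1, -1).
right_vec = [1.0, 0.0]
left_vec = [1.0, -1.0]

# Modified design: a small, hypothetical perturbation of A.
modified = [[2.0 + 0.1, 1.0],
            [0.0 + 0.05, 3.0]]

approx = generalized_rayleigh_quotient(modified, left_vec, right_vec)

# Exact eigenvalue of the modified 2x2 matrix from its characteristic polynomial.
trace = modified[0][0] + modified[1][1]
det = modified[0][0] * modified[1][1] - modified[0][1] * modified[1][0]
exact = (trace - (trace * trace - 4.0 * det) ** 0.5) / 2.0

print(approx, exact)
```

The estimate needs only one matrix-vector product and two dot products with the stored eigenvectors, which is the kind of saving that motivates approximate reanalysis when many modified designs must be evaluated.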
A Semi-Vectorization Algorithm to Synthesis of Gravitational Anomaly Quantities on the Earth
NASA Astrophysics Data System (ADS)
Abdollahzadeh, M.; Eshagh, M.; Najafi Alamdari, M.
2009-04-01
The Earth's gravitational potential can be expressed by the well-known spherical harmonic expansion. The computational time of summing up this expansion is an important practical issue which can be reduced by an efficient numerical algorithm. This paper proposes such a method for block-wise synthesis of the anomaly quantities on the Earth's surface using vectorization. Full vectorization means transforming the summations into simple matrix and vector products. It is not practical for matrices with large dimensions. Here a semi-vectorization algorithm is proposed to avoid working with large vectors and matrices. It speeds up the computations by using one loop for the summation either on degrees or on orders. The former is a good option for synthesizing the anomaly quantities on the Earth's surface considering a digital elevation model (DEM). This approach is more efficient than the two-step method, which computes the quantities on the reference ellipsoid and continues them upward to the Earth's surface. The algorithm has been coded in MATLAB and synthesizes a global 5′ × 5′ grid (corresponding to about 9 million points) of gravity anomaly or geoid height using a geopotential model to degree 360 in 10000 seconds on an ordinary computer with 2 GB of RAM.
Finite difference elastic wave modeling with an irregular free surface using ADER scheme
NASA Astrophysics Data System (ADS)
Almuhaidib, Abdulaziz M.; Nafi Toksöz, M.
2015-06-01
In numerical modeling of seismic wave propagation in the earth, we encounter two important issues: the free surface and the topography of the surface (i.e. irregularities). In this study, we develop a 2D finite difference solver for the elastic wave equation that combines a 4th-order ADER scheme (Arbitrary high-order accuracy using DERivatives), which is widely used in aeroacoustics, with the characteristic variable method at the free surface boundary. The idea is to treat the free surface boundary explicitly by using ghost values of the solution for points beyond the free surface to impose the physical boundary condition. The method is based on the velocity-stress formulation. The ultimate goal is to develop a numerical solver for the elastic wave equation that is stable, accurate and computationally efficient. The solver treats smooth arbitrary-shaped boundaries as simple plane boundaries. The computational cost added by treating the topography is negligible compared to flat free surface because only a small number of grid points near the boundary need to be computed. In the presence of topography, using 10 grid points per shortest shear-wavelength, the solver yields accurate results. Benchmark numerical tests using several complex models that are solved by our method and other independent accurate methods show an excellent agreement, confirming the validity of the method for modeling elastic waves with an irregular free surface.
NASA Technical Reports Server (NTRS)
Bardina, J. E.
1994-01-01
A new computationally efficient 3-D compressible Reynolds-averaged implicit Navier-Stokes method with advanced two-equation turbulence models for high speed flows is presented. All convective terms are modeled using an entropy satisfying higher-order Total Variation Diminishing (TVD) scheme based on implicit upwind flux-difference split approximations and arithmetic averaging procedure of primitive variables. This method combines the best features of data management and computational efficiency of space marching procedures with the generality and stability of time dependent Navier-Stokes procedures to solve flows with mixed supersonic and subsonic zones, including streamwise separated flows. Its robust stability derives from a combination of conservative implicit upwind flux-difference splitting with Roe's property U to provide accurate shock capturing capability that non-conservative schemes do not guarantee, alternating symmetric Gauss-Seidel 'method of planes' relaxation procedure coupled with a three-dimensional two-factor diagonal-dominant approximate factorization scheme, TVD flux limiters of higher-order flux differences satisfying realizability, and well-posed characteristic-based implicit boundary-point approximations consistent with the local characteristics domain of dependence. The efficiency of the method is highly increased with Newton-Raphson acceleration, which allows convergence in essentially one forward sweep for supersonic flows. The method is verified by comparing with experiment and other Navier-Stokes methods. Here, results of adiabatic and cooled flat plate flows, compression corner flow, and 3-D hypersonic shock-wave/turbulent boundary layer interaction flows are presented. The robust 3-D method achieves a better computational efficiency of at least one order of magnitude over the CNS Navier-Stokes code. 
It provides cost-effective aerodynamic predictions in agreement with experiment, and the capability of predicting complex flow structures in complex geometries with good accuracy.
Fast global image smoothing based on weighted least squares.
Min, Dongbo; Choi, Sunghwan; Lu, Jiangbo; Ham, Bumsub; Sohn, Kwanghoon; Do, Minh N
2014-12-01
This paper presents an efficient technique for performing a spatially inhomogeneous edge-preserving image smoothing, called fast global smoother. Focusing on sparse Laplacian matrices consisting of a data term and a prior term (typically defined using four or eight neighbors for 2D image), our approach efficiently solves such global objective functions. In particular, we approximate the solution of the memory- and computation-intensive large linear system, defined over a d-dimensional spatial domain, by solving a sequence of 1D subsystems. Our separable implementation enables applying a linear-time tridiagonal matrix algorithm to solve d three-point Laplacian matrices iteratively. Our approach combines the best of two paradigms, i.e., efficient edge-preserving filters and optimization-based smoothing. Our method has a comparable runtime to the fast edge-preserving filters, but its global optimization formulation overcomes many limitations of the local filtering approaches. Our method also achieves results of the same high quality as the state-of-the-art optimization-based techniques, but runs ∼10-30 times faster. Besides, considering the flexibility in defining an objective function, we further propose generalized fast algorithms that perform Lγ norm smoothing (0 < γ < 2) and support an aggregated (robust) data term for handling imprecise data constraints. We demonstrate the effectiveness and efficiency of our techniques in a range of image processing and computer graphics applications.
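The 1D building block mentioned in this abstract, a three-point (tridiagonal) weighted least squares system solved in linear time, can be sketched as follows. This is an illustrative reading, not the authors' implementation: the system is assumed to be (I + lam * L) u = f with L a 1D weighted Laplacian, the edge weights are supplied by the caller rather than computed from a guidance image, and the function name is hypothetical.

```python
def smooth_1d(signal, edge_weights, lam):
    """Solve (I + lam * L) u = signal with the tridiagonal (Thomas) algorithm.

    edge_weights[i] couples samples i and i + 1; a weight near 0 marks an
    edge across which no smoothing should occur.
    """
    n = len(signal)
    assert len(edge_weights) == n - 1
    lower = [0.0] + [-lam * w for w in edge_weights]   # couples i with i - 1
    upper = [-lam * w for w in edge_weights] + [0.0]   # couples i with i + 1
    diag = [1.0 - lower[i] - upper[i] for i in range(n)]

    # Forward elimination.
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = signal[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (signal[i] - lower[i] * d_prime[i - 1]) / denom

    # Back substitution.
    out = [0.0] * n
    out[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        out[i] = d_prime[i] - c_prime[i] * out[i + 1]
    return out


# A step edge with zero weight across the jump is left untouched,
# while a noisy flat region with unit weights is pulled toward its mean.
step = smooth_1d([0.0, 0.0, 10.0, 10.0], [1.0, 0.0, 1.0], lam=5.0)
noisy = smooth_1d([1.0, 3.0, 1.0, 3.0], [1.0, 1.0, 1.0], lam=5.0)
print(step)
print(noisy)
```

For a 2D image, a separable scheme of the kind the abstract describes would apply such a solver along each row and then each column, repeating for a few iterations; that outer loop is omitted here.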
Zhang, Hanming; Wang, Linyuan; Yan, Bin; Li, Lei; Cai, Ailong; Hu, Guoen
2016-01-01
Total generalized variation (TGV)-based computed tomography (CT) image reconstruction, which utilizes high-order image derivatives, is superior to total variation-based methods in terms of the preservation of edge information and the suppression of unfavorable staircase effects. However, conventional TGV regularization employs l1-based form, which is not the most direct method for maximizing sparsity prior. In this study, we propose a total generalized p-variation (TGpV) regularization model to improve the sparsity exploitation of TGV and offer efficient solutions to few-view CT image reconstruction problems. To solve the nonconvex optimization problem of the TGpV minimization model, we then present an efficient iterative algorithm based on the alternating minimization of augmented Lagrangian function. All of the resulting subproblems decoupled by variable splitting admit explicit solutions by applying alternating minimization method and generalized p-shrinkage mapping. In addition, approximate solutions that can be easily performed and quickly calculated through fast Fourier transform are derived using the proximal point method to reduce the cost of inner subproblems. The accuracy and efficiency of the simulated and real data are qualitatively and quantitatively evaluated to validate the efficiency and feasibility of the proposed method. Overall, the proposed method exhibits reasonable performance and outperforms the original TGV-based method when applied to few-view problems.
Scaling predictive modeling in drug development with cloud computing.
Moghadam, Behrooz Torabi; Alvarsson, Jonathan; Holm, Marcus; Eklund, Martin; Carlsson, Lars; Spjuth, Ola
2015-01-26
Growing data sets, with the increased analysis time they entail, are hampering predictive modeling in drug discovery. Model building can be carried out on high-performance computer clusters, but these can be expensive to purchase and maintain. We have evaluated ligand-based modeling on cloud computing resources where computations are parallelized and run on the Amazon Elastic Cloud. We trained models on open data sets of varying sizes for the end points logP and Ames mutagenicity and compared them with model building parallelized on a traditional high-performance computing cluster. We show that while high-performance computing results in faster model building, the use of cloud computing resources is feasible for large data sets and scales well within cloud instances. An additional advantage of cloud computing is that the costs of predictive models can be easily quantified, and a choice can be made between speed and economy. The easy access to computational resources with no up-front investments makes cloud computing an attractive alternative for scientists, especially for those without access to a supercomputer, and our study shows that it enables cost-efficient modeling of large data sets on demand within reasonable time.
NASA Astrophysics Data System (ADS)
Yu, Jian; Yin, Qian; Guo, Ping; Luo, A.-li
2014-09-01
This paper presents an efficient method for the extraction of astronomical spectra from two-dimensional (2D) multifibre spectrographs based on the regularized least-squares QR-factorization (LSQR) algorithm. We address two issues: we propose a modified Gaussian point spread function (PSF) for modelling the 2D PSF from multi-emission-line gas-discharge lamp images (arc images), and we develop an efficient deconvolution method to extract spectra in real circumstances. The proposed modified 2D Gaussian PSF model can fit various types of 2D PSFs, including different radial distortion angles and ellipticities. We adopt the regularized LSQR algorithm to solve the sparse linear equations constructed from the sparse convolution matrix, which we designate the deconvolution spectrum extraction method. Furthermore, we implement a parallelized LSQR algorithm based on graphics processing unit programming in the Compute Unified Device Architecture to accelerate the computational processing. Experimental results illustrate that the proposed extraction method can greatly reduce the computational cost and memory use of the deconvolution method and, consequently, increase its efficiency and practicability. In addition, the proposed extraction method has a stronger noise tolerance than other methods, such as the boxcar (aperture) extraction and profile extraction methods. Finally, we present an analysis of the sensitivity of the extraction results to the radius and full width at half-maximum of the 2D PSF.
Spiking Neurons for Analysis of Patterns
NASA Technical Reports Server (NTRS)
Huntsberger, Terrance
2008-01-01
Artificial neural networks comprising spiking neurons of a novel type have been conceived as improved pattern-analysis and pattern-recognition computational systems. These neurons are represented by a mathematical model denoted the state-variable model (SVM), which among other things, exploits a computational parallelism inherent in spiking-neuron geometry. Networks of SVM neurons offer advantages of speed and computational efficiency, relative to traditional artificial neural networks. The SVM also overcomes some of the limitations of prior spiking-neuron models. There are numerous potential pattern-recognition, tracking, and data-reduction (data preprocessing) applications for these SVM neural networks on Earth and in exploration of remote planets. Spiking neurons imitate biological neurons more closely than do the neurons of traditional artificial neural networks. A spiking neuron includes a central cell body (soma) surrounded by a tree-like interconnection network (dendrites). Spiking neurons are so named because they generate trains of output pulses (spikes) in response to inputs received from sensors or from other neurons. They gain their speed advantage over traditional neural networks by using the timing of individual spikes for computation, whereas traditional artificial neurons use averages of activity levels over time. Moreover, spiking neurons use the delays inherent in dendritic processing in order to efficiently encode the information content of incoming signals. Because traditional artificial neurons fail to capture this encoding, they have less processing capability, and so it is necessary to use more gates when implementing traditional artificial neurons in electronic circuitry. Such higher-order functions as dynamic tasking are effected by use of pools (collections) of spiking neurons interconnected by spike-transmitting fibers. 
The SVM includes adaptive thresholds and submodels of transport of ions (in imitation of such transport in biological neurons). These features enable the neurons to adapt their responses to high-rate inputs from sensors, and to adapt their firing thresholds to mitigate noise or effects of potential sensor failure. The mathematical derivation of the SVM starts from a prior model, known in the art as the point soma model, which captures all of the salient properties of neuronal response while keeping the computational cost low. The point-soma latency time is modified to be an exponentially decaying function of the strength of the applied potential. Choosing computational efficiency over biological fidelity, the dendrites surrounding a neuron are represented by simplified compartmental submodels and there are no dendritic spines. Updates to the dendritic potential, calcium-ion concentrations and conductances, and potassium-ion conductances are done by use of equations similar to those of the point soma. Diffusion processes in dendrites are modeled by averaging among nearest-neighbor compartments. Inputs to each of the dendritic compartments come from sensors. Alternatively or in addition, when an affected neuron is part of a pool, inputs can come from other spiking neurons. At present, SVM neural networks are implemented by computational simulation, using algorithms that encode the SVM and its submodels. However, it should be possible to implement these neural networks in hardware: The differential equations for the dendritic and cellular processes in the SVM model of spiking neurons map to equivalent circuits that can be implemented directly in analog very-large-scale integrated (VLSI) circuits.
A Hadoop-Based Algorithm of Generating DEM Grid from Point Cloud Data
NASA Astrophysics Data System (ADS)
Jian, X.; Xiao, X.; Chengfang, H.; Zhizhong, Z.; Zhaohui, W.; Dengzhong, Z.
2015-04-01
Airborne LiDAR technology has proven to be one of the most powerful tools for obtaining high-density, high-accuracy, and highly detailed surface information on terrain and surface objects within a short time, from which a high-quality Digital Elevation Model (DEM) can be extracted. Point cloud data generated from the pre-processed data must first be classified by segmentation algorithms to distinguish terrain points from disorganized points; the selected points are then interpolated to produce DEM data. Because of the high point density, the whole procedure takes a long time and huge computing resources, a problem that a number of studies have focused on. Hadoop is a distributed system infrastructure developed by the Apache Foundation, which contains a highly fault-tolerant distributed file system (HDFS) with a high transmission rate and a parallel programming model (Map/Reduce). Such a framework is appropriate for improving the efficiency of DEM generation algorithms. Point cloud data of Dongting Lake acquired by a Riegl LMS-Q680i laser scanner were used as the original data to generate a DEM with a Hadoop-based algorithm implemented in Linux, followed by a traditional procedure programmed in C++ as the comparative experiment. The algorithms' efficiency, coding complexity, and performance-cost ratio were then compared. The results demonstrate that the algorithm's speed depends on the size of the point set and the density of the DEM grid; the non-Hadoop implementation can achieve high performance when memory is big enough, whereas the multiple Hadoop implementation can achieve a higher performance-cost ratio when the point set is very large.
Efficient Redundancy Techniques in Cloud and Desktop Grid Systems using MAP/G/c-type Queues
NASA Astrophysics Data System (ADS)
Chakravarthy, Srinivas R.; Rumyantsev, Alexander
2018-03-01
Cloud computing is continuing to prove its flexibility and versatility in helping industries and businesses as well as academia as a way of providing needed computing capacity. As an important alternative to cloud computing, desktop grids allow the idle computer resources of an enterprise/community to be utilized by means of a distributed computing system, providing a more secure and controllable environment with lower operational expenses. Further, both cloud computing and desktop grids are meant to optimize limited resources and at the same time to decrease the expected latency for users. The crucial parameter for optimization in both cloud computing and desktop grids is the level of redundancy (replication) for service requests/workunits. In this paper we study the optimal replication policies by considering three variations of Fork-Join systems in the context of a multi-server queueing system with a versatile point process for the arrivals. For services we consider phase type distributions as well as shifted exponential and Weibull distributions. We use both analytical and simulation approaches in our analysis and report some interesting qualitative results.
Particle trajectory computation on a 3-dimensional engine inlet. Final Report Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Kim, J. J.
1986-01-01
A 3-dimensional particle trajectory computer code was developed to compute the distribution of water droplet impingement efficiency on a 3-dimensional engine inlet. The computed results provide the essential droplet impingement data required for the engine inlet anti-icing system design and analysis. The droplet trajectories are obtained by solving the trajectory equation using the fourth order Runge-Kutta and Adams predictor-corrector schemes. A compressible 3-D full potential flow code is employed to obtain a cylindrical grid definition of the flowfield on and about the engine inlet. The inlet surface is defined mathematically through a system of bi-cubic parametric patches in order to compute the droplet impingement points accurately. Analysis results of the 3-D trajectory code obtained for an axisymmetric droplet impingement problem are in good agreement with NACA experimental data. Experimental data are not yet available for the engine inlet impingement problem analyzed. Applicability of the method to solid particle impingement problems, such as engine sand ingestion, is also demonstrated.
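As an illustration of the fourth-order Runge-Kutta scheme named above, here is a minimal sketch on a toy one-dimensional droplet model. The model (a droplet velocity relaxing toward a constant air speed `u_air` with time constant `tau`) and all names are assumptions for illustration only; the thesis solves a full 3-D trajectory equation in a computed flowfield.

```python
import math

def rk4_step(f, t, y, h):
    """Advance the state list y by one classical RK4 step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

U_AIR = 1.0   # assumed constant air speed
TAU = 0.5     # assumed droplet response time

def droplet_rhs(t, state):
    """Toy model: dx/dt = v, dv/dt = (u_air - v) / tau."""
    x, v = state
    return [v, (U_AIR - v) / TAU]

state = [0.0, 0.0]        # droplet starts at rest at x = 0
h = 0.01
for step in range(100):   # integrate from t = 0 to t = 1
    state = rk4_step(droplet_rhs, step * h, state, h)

# Closed-form velocity of the toy model at t = 1, for comparison.
exact_v = U_AIR * (1.0 - math.exp(-1.0 / TAU))
```

For this toy problem the numerical velocity can be compared against `exact_v`; an Adams predictor-corrector scheme would reuse earlier derivative evaluations instead of the four stages per step used here.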
Object recognition and localization from 3D point clouds by maximum-likelihood estimation
NASA Astrophysics Data System (ADS)
Dantanarayana, Harshana G.; Huntley, Jonathan M.
2017-08-01
We present an algorithm based on maximum-likelihood analysis for the automated recognition of objects, and estimation of their pose, from 3D point clouds. Surfaces segmented from depth images are used as the features, unlike 'interest point'-based algorithms which normally discard such data. Compared to the 6D Hough transform, it has negligible memory requirements, and is computationally efficient compared to iterative closest point algorithms. The same method is applicable to both the initial recognition/pose estimation problem as well as subsequent pose refinement through appropriate choice of the dispersion of the probability density functions. This single unified approach therefore avoids the usual requirement for different algorithms for these two tasks. In addition to the theoretical description, a simple 2 degrees of freedom (d.f.) example is given, followed by a full 6 d.f. analysis of 3D point cloud data from a cluttered scene acquired by a projected fringe-based scanner, which demonstrated an RMS alignment error as low as 0.3 mm.
Gibbs sampling on large lattice with GMRF
NASA Astrophysics Data System (ADS)
Marcotte, Denis; Allard, Denis
2018-02-01
Gibbs sampling is routinely used to sample truncated Gaussian distributions. These distributions naturally occur when associating latent Gaussian fields to category fields obtained by discrete simulation methods like multipoint, sequential indicator simulation and object-based simulation. The latent Gaussians are often used in data assimilation and history matching algorithms. When Gibbs sampling is applied on a large lattice, the computing cost can become prohibitive. The usual practice of using local neighborhoods is unsatisfying, as it can diverge and does not exactly reproduce the desired covariance. A better approach is to use Gaussian Markov Random Fields (GMRF), which make it possible to compute the conditional distributions at any point without having to compute and invert the full covariance matrix. As the GMRF is locally defined, it allows simultaneous updating of all points that do not share neighbors (coding sets). We propose a new simultaneous Gibbs updating strategy on coding sets that can be efficiently computed by convolution and applied with an acceptance/rejection method in the truncated case. We study empirically the speed of convergence and the effects of the choice of boundary conditions, of the correlation range and of the GMRF smoothness. We show that the convergence is slower in the Gaussian case on the torus than for the finite case studied in the literature. However, in the truncated Gaussian case, we show that short scale correlation is quickly restored and the conditioning categories at each lattice point imprint the long scale correlation. Hence our approach makes it realistic to apply Gibbs sampling on large 2D or 3D lattices with the desired GMRF covariance.
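The coding-set idea can be sketched for the simplest case: a first-order GMRF on a square torus, where the two colours of a checkerboard form the coding sets. The precision parameters `kappa` (diagonal) and `beta` (coupling to each of the four neighbours) are assumptions for illustration, and this untruncated sketch omits the acceptance/rejection step the abstract mentions; it is not the authors' implementation.

```python
import random

def checkerboard_gibbs_sweep(x, kappa, beta, rng):
    """One Gibbs sweep over an n-by-n torus (n even), one colour at a time.

    Assumed precision matrix: Q_ii = kappa, Q_ij = -beta for the four
    nearest neighbours (needs kappa > 4 * beta). The conditional law of a
    site is then N(beta / kappa * neighbour_sum, 1 / kappa).
    """
    n = len(x)
    if n % 2:
        raise ValueError("checkerboard coding sets need an even lattice size")
    sd = (1.0 / kappa) ** 0.5
    for colour in (0, 1):
        updates = {}
        for i in range(n):
            for j in range(n):
                if (i + j) % 2 != colour:
                    continue
                # Neighbour sum with periodic wrap; on a full lattice this
                # is a convolution with the 4-neighbour kernel.
                s = (x[(i - 1) % n][j] + x[(i + 1) % n][j]
                     + x[i][(j - 1) % n] + x[i][(j + 1) % n])
                updates[(i, j)] = rng.gauss(beta / kappa * s, sd)
        # Sites of one colour have no neighbours of the same colour, so
        # they can all be replaced at once.
        for (i, j), value in updates.items():
            x[i][j] = value
    return x

rng = random.Random(0)
field = [[0.0] * 4 for _ in range(4)]
for _ in range(10):
    checkerboard_gibbs_sweep(field, kappa=5.0, beta=1.0, rng=rng)
```

Higher-order neighbourhoods need more than two coding sets, since sites that share a neighbourhood must not be updated together.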
Guided wave radiation from a point source in the proximity of a pipe bend
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brath, A. J.; Nagy, P. B.; Simonetti, F.
Throughout the oil and gas industry corrosion and erosion damage monitoring play a central role in managing asset integrity. Recently, the use of guided wave technology in conjunction with tomography techniques has provided the possibility of obtaining point-by-point maps of wall thickness loss over the entire volume of a pipeline section between two ring arrays of ultrasonic transducers. However, current research has focused on straight pipes while little work has been done on pipe bends which are also the most susceptible to developing damage. Tomography of the bend is challenging due to the complexity and computational cost of the 3-D elastic model required to accurately describe guided wave propagation. To overcome this limitation, we introduce a 2-D anisotropic inhomogeneous acoustic model which represents a generalization of the conventional unwrapping used for straight pipes. The shortest-path ray-tracing method is then applied to the 2-D model to compute ray paths and predict the arrival times of the fundamental flexural mode, A0, excited by a point source on the straight section of pipe entering the bend and detected on the opposite side. Good agreement is found between predictions and experiments performed on an 8” diameter (D) pipe with 1.5 D bend radius. The 2-D model also reveals the existence of an acoustic lensing effect which leads to a focusing phenomenon also confirmed by the experiments. The computational efficiency of the 2-D model makes it ideally suited for tomography algorithms.
New Primary Dew-Point Generators at HMI/FSB-LPM in the Range from -70 °C to +60 °C
NASA Astrophysics Data System (ADS)
Zvizdic, Davor; Heinonen, Martti; Sestan, Danijel
2012-09-01
To extend the dew-point range and to improve the uncertainties of the humidity scale realization at HMI/FSB-LPM, new primary low- and high-range dew-point generators were developed and implemented in cooperation with MIKES, in 2009 through EUROMET Project No. 912. The low-range saturator is designed for primary realization of the dew-point temperature scale from -70 °C to +5 °C, while the high-range saturator covers the range from 1 °C to 60 °C. The system is designed as a single-pressure, single-pass dew-point generator. MIKES designed and constructed both the saturators to be implemented in dew-point calibration systems at LPM. The LPM took care of purchasing and adapting liquid baths, of implementing the temperature and pressure measurement equipment appropriate for use in the systems, and of developing the gas preparation and flow control systems as well as the computer-based automated data acquisition. The principle and the design of the generator are described in detail and schematically depicted. The tests were performed at MIKES to investigate how close both the saturators are to an ideal saturator. Results of the tests show that both the saturators are efficient enough for a primary realization of the dew-point temperature scale from -70 °C to +60 °C, in the specified flow-rate ranges. The estimated standard uncertainties due to the non-ideal saturation efficiency are between 0.02 °C and 0.05 °C.
NASA Astrophysics Data System (ADS)
Bui-Thanh, T.; Girolami, M.
2014-11-01
We consider the Riemann manifold Hamiltonian Monte Carlo (RMHMC) method for solving statistical inverse problems governed by partial differential equations (PDEs). The Bayesian framework is employed to cast the inverse problem into the task of statistical inference whose solution is the posterior distribution in infinite dimensional parameter space conditional upon observation data and Gaussian prior measure. We discretize both the likelihood and the prior using the H1-conforming finite element method together with a matrix transfer technique. The power of the RMHMC method is that it exploits the geometric structure induced by the PDE constraints of the underlying inverse problem. Consequently, each RMHMC posterior sample is almost uncorrelated/independent from the others providing statistically efficient Markov chain simulation. However this statistical efficiency comes at a computational cost. This motivates us to consider computationally more efficient strategies for RMHMC. At the heart of our construction is the fact that for Gaussian error structures the Fisher information matrix coincides with the Gauss-Newton Hessian. We exploit this fact in considering a computationally simplified RMHMC method combining state-of-the-art adjoint techniques and the superiority of the RMHMC method. Specifically, we first form the Gauss-Newton Hessian at the maximum a posteriori point and then use it as a fixed constant metric tensor throughout RMHMC simulation. This eliminates the need for the computationally costly differential geometric Christoffel symbols, which in turn greatly reduces computational effort at a corresponding loss of sampling efficiency. We further reduce the cost of forming the Fisher information matrix by using a low rank approximation via a randomized singular value decomposition technique. This is efficient since a small number of Hessian-vector products are required. 
The Hessian-vector product in turn requires only two extra PDE solves using the adjoint technique. Various numerical results up to 1025 parameters are presented to demonstrate the ability of the RMHMC method in exploring the geometric structure of the problem to propose (almost) uncorrelated/independent samples that are far away from each other, and yet the acceptance rate is almost unity. The results also suggest that for the PDE models considered the proposed fixed metric RMHMC can attain almost as high a quality performance as the original RMHMC, i.e. generating (almost) uncorrelated/independent samples, while being two orders of magnitude less computationally expensive.
Lungu, Claudiu N; Diudea, Mircea V
2018-01-01
Lipid II, a peptidoglycan, is a precursor in bacterial cell synthesis. It has both hydrophilic and lipophilic properties. The molecule translocates a bacterial membrane to deliver and incorporate "building blocks" from disaccharide-pentapeptide into the peptidoglycan wall. Lipid II is a valid antibiotic target. A receptor binding pocket may be occupied by a ligand in various plausible conformations, among which only a few are energetically related to a biological activity in the physiological efficiency domain. This paper reports the mapping of the conformational space of Lipid II in its interaction with Teixobactin and other Lipid II ligands. In order to study computationally the complex between Lipid II and ligands, a docking study was first carried out. The docking site was retrieved from the literature. After docking, 5 ligand conformations and a further 5 complexes (denoted 00 to 04) for each molecule were taken into account. For each structure, conformational studies were performed. Statistical analysis, conformational analysis and molecular dynamics based clustering were used to predict the potency of these compounds. A score for potency prediction was developed. Applying Lipid II classification according to Lipid II conformational energy, a conformation of Teixobactin proved to be energetically favorable, followed by Oritravicin, Dalbavycin, Telvanicin, Teicoplamin and Vancomycin, respectively. Scoring of molecules according to cluster band and PCA produced the same result. Molecules classified according to standard deviations showed Dalbavycin as the most favorable conformation, followed by Teicoplamin, Telvanicin, Teixobactin, Oritravicin and Vancomycin, respectively. The total score showing the best energetic efficiency of complex formation shows Teixobactin to have the best conformation (a score of 15 points), followed by Dalbavycin (14 points), Oritravicin (12 points), Telvanicin (10 points), Teicoplamin (9 points) and Vancomycin (3 points). 
Statistical analysis of conformations can be used to predict the efficiency of ligand - target interaction and consecutively to find insight regarding ligand potency and postulate about favorable conformation of ligand and binding site. In this study it was shown that Teixobactin is more efficient in binding with Lipid II compared to Vancomycin, results confirmed by experimental data reported in literature. Copyright © Bentham Science Publishers.
Gas dynamic design of the pipe line compressor with 90% efficiency. Model test approval
NASA Astrophysics Data System (ADS)
Galerkin, Y.; Rekstin, A.; Soldatova, K.
2015-08-01
Gas dynamic design of a 32 MW pipeline compressor was carried out for PAO SMPO (Sumy, Ukraine). The technical specification requires a compressor efficiency of 90%. The customer proposed a favorable scheme: a single-stage design with an overhung impeller and axial inlet. The authors used the standard optimization methodology for 2D impellers. An original methodology of internal scroll profiling was used to minimize efficiency losses. The radically improved 5th version of the Universal Modeling Method computer programs was used for precise calculation of the expected performance. The customer carried out model tests at 1:2 scale. The tests confirmed the calculated parameters at the design point (maximum efficiency of 90%) and over the whole range of flow rates. As far as the authors know, no other compressor has achieved such efficiency. The principles and methods of the gas-dynamic design are presented below. The data for the 32 MW compressor were presented by the customer in their report at the 16th International Compressor Conference (September 2014, Saint Petersburg) and later transferred to the authors.
GPU computing of compressible flow problems by a meshless method with space-filling curves
NASA Astrophysics Data System (ADS)
Ma, Z. H.; Wang, H.; Pu, S. H.
2014-04-01
A graphic processing unit (GPU) implementation of a meshless method for solving compressible flow problems is presented in this paper. Least-square fit is used to discretize the spatial derivatives of Euler equations and an upwind scheme is applied to estimate the flux terms. The compute unified device architecture (CUDA) C programming model is employed to efficiently and flexibly port the meshless solver from CPU to GPU. Considering the data locality of randomly distributed points, space-filling curves are adopted to re-number the points in order to improve the memory performance. Detailed evaluations are firstly carried out to assess the accuracy and conservation property of the underlying numerical method. Then the GPU accelerated flow solver is used to solve external steady flows over aerodynamic configurations. Representative results are validated through extensive comparisons with the experimental, finite volume or other available reference solutions. Performance analysis reveals that the running time cost of simulations is significantly reduced while impressive (more than an order of magnitude) speedups are achieved.
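The abstract does not say which space-filling curve is used, so the sketch below assumes a Morton (Z-order) curve, one common choice, to show how scattered 2-D points can be renumbered so that points close in space tend to be close in memory. The function names and the 16-bit quantization are illustrative assumptions.

```python
def interleave_bits(ix, iy, bits=16):
    """Morton code: x bits go to even positions, y bits to odd positions."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (2 * b)
        code |= ((iy >> b) & 1) << (2 * b + 1)
    return code

def morton_order(points, bits=16):
    """Return the permutation of point indices sorted along the Z-order curve."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    scale = (1 << bits) - 1

    def quantize(v, lo, hi):
        # Map a coordinate into the integer range [0, 2**bits - 1].
        return 0 if hi == lo else int((v - lo) / (hi - lo) * scale)

    keys = [interleave_bits(quantize(x, min(xs), max(xs)),
                            quantize(y, min(ys), max(ys)), bits)
            for x, y in points]
    return sorted(range(len(points)), key=keys.__getitem__)

# Four corners of the unit square, deliberately listed out of order.
cloud = [(1.0, 1.0), (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
order = morton_order(cloud)
renumbered = [cloud[i] for i in order]
```

A Hilbert curve usually preserves locality somewhat better than Z-order at the cost of a more involved index computation; either can serve as the renumbering key.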
A Novel Stimulus Artifact Removal Technique for High-Rate Electrical Stimulation
Heffer, Leon F; Fallon, James B
2008-01-01
Electrical stimulus artifact corrupting electrophysiological recordings often makes the subsequent analysis of the underlying neural response difficult. This is particularly evident when investigating short-latency neural activity in response to high-rate electrical stimulation. We developed and evaluated an off-line technique for the removal of stimulus artifact from electrophysiological recordings. Pulsatile electrical stimulation was presented at rates of up to 5000 pulses/s during extracellular recordings of guinea pig auditory nerve fibers. Stimulus artifact was removed by replacing the sample points at each stimulus artifact event with values interpolated along a straight line, computed from neighbouring sample points. This technique required only that artifact events be identifiable and that the artifact duration remained less than both the inter-stimulus interval and the time course of the action potential. We have demonstrated that this computationally efficient sample-and-interpolate technique removes the stimulus artifact with minimal distortion of the action potential waveform. We suggest that this technique may have potential applications in a range of electrophysiological recording systems. PMID:18339428
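The sample-and-interpolate step described above can be sketched in a few lines. The function name, the argument layout (artifact start indices plus a fixed artifact length in samples) and the choice of the immediately adjacent clean samples as interpolation endpoints are assumptions for illustration, not details taken from the paper.

```python
def remove_artifacts(signal, artifact_starts, artifact_len):
    """Replace each artifact window with a straight line between the
    clean samples just before and just after it."""
    cleaned = list(signal)
    n = len(cleaned)
    for start in artifact_starts:
        left = start - 1                # last clean sample before the artifact
        right = start + artifact_len    # first clean sample after the artifact
        if left < 0 or right >= n:
            continue                    # no clean neighbour on one side; skip
        y0, y1 = cleaned[left], cleaned[right]
        span = right - left
        for i in range(start, right):
            frac = (i - left) / span
            cleaned[i] = y0 + frac * (y1 - y0)
    return cleaned

# A slow ramp with a two-sample artifact at indices 2 and 3.
recording = [0.0, 1.0, 50.0, -40.0, 4.0, 5.0]
cleaned = remove_artifacts(recording, artifact_starts=[2], artifact_len=2)
```

The cost is linear in the number of artifact samples, which is consistent with the abstract's description of the technique as computationally efficient.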
NASA Technical Reports Server (NTRS)
Solarna, David; Moser, Gabriele; Le Moigne-Stewart, Jacqueline; Serpico, Sebastiano B.
2017-01-01
Because of the large variety of sensors and spacecraft collecting data, planetary science needs to integrate various multi-sensor and multi-temporal images. These multiple data represent a precious asset, as they allow the study of targets' spectral responses and of changes in the surface structure; because of their variety, they also require accurate and robust registration. A new crater detection algorithm, used to extract features that will be integrated in an image registration framework, is presented. A marked point process-based method has been developed to model the spatial distribution of elliptical objects (i.e. the craters) and a birth-death Markov chain Monte Carlo method, coupled with a region-based scheme aiming at computational efficiency, is used to find the optimal configuration fitting the image. The extracted features are exploited, together with a newly defined fitness function based on a modified Hausdorff distance, by an image registration algorithm whose architecture has been designed to minimize the computational time.
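The abstract does not define its modified Hausdorff distance, so the sketch below assumes the widely used variant in which the maximum over points is replaced by a mean (often attributed to Dubuisson and Jain). The authors' fitness function may differ; the names here are illustrative.

```python
import math

def directed_mean_min(a, b):
    """Mean, over points in a, of the distance to the nearest point in b."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)

def modified_hausdorff(a, b):
    """Symmetric modified Hausdorff distance between two point sets."""
    return max(directed_mean_min(a, b), directed_mean_min(b, a))

# Two made-up sets of crater centres; the second has one unmatched crater.
craters_ref = [(0.0, 0.0), (1.0, 0.0)]
craters_img = [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)]
mhd = modified_hausdorff(craters_ref, craters_img)
```

Averaging instead of taking the worst case makes the measure less sensitive to a single spurious detection, which is one plausible reason to prefer it for matching extracted crater features.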
Efficient frequent pattern mining algorithm based on node sets in cloud computing environment
NASA Astrophysics Data System (ADS)
Billa, V. N. Vinay Kumar; Lakshmanna, K.; Rajesh, K.; Reddy, M. Praveen Kumar; Nagaraja, G.; Sudheer, K.
2017-11-01
The ultimate goal of data mining is to uncover hidden information that is useful for decision making from the large databases collected by an organization. Data mining involves many tasks. Mining frequent itemsets is one of the most important tasks for transactional databases. These databases hold data at very large scale, and mining them consumes physical memory and time in proportion to the size of the database. A frequent pattern mining algorithm is considered efficient only if it consumes little memory and time when mining the frequent itemsets from a given large database. With these points in mind, in this thesis we propose a system that mines frequent itemsets in a way that is optimized for memory and time, using cloud computing to parallelize the process and providing the application as a service. The complete framework uses a proven efficient algorithm, the FIN algorithm, which works on Nodesets and a POC (pre-order coding) tree. To evaluate the performance of the system, we conduct experiments comparing the efficiency of the same algorithm applied in a standalone manner and in a cloud computing environment on a real data set, the traffic accidents data set. The results show that the memory consumption and execution time of the proposed system are much lower than those of the standalone system.
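To make the notion of a frequent itemset concrete, here is a brute-force support counter. It is deliberately not the FIN algorithm (no Nodesets or POC tree); it enumerates every small subset of every transaction, which is exactly the memory and time cost that specialised algorithms are designed to avoid. The function name and the `max_size` cap are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=3):
    """Count every itemset of up to max_size items and keep those that
    occur in at least min_support transactions."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))   # canonical order, no duplicates
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return {itemset: c for itemset, c in counts.items() if c >= min_support}

# Five made-up transactions over the items a, b and c.
baskets = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
frequent = frequent_itemsets(baskets, min_support=3)
```

Since each transaction is counted independently, the counting loop is also the part that a parallel or cloud deployment could split across workers before merging the counters.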
Seismic imaging using finite-differences and parallel computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ober, C.C.
1997-12-31
A key to reducing the risks and costs associated with oil and gas exploration is the fast, accurate imaging of complex geologies, such as salt domes in the Gulf of Mexico and overthrust regions in US onshore regions. Prestack depth migration generally yields the most accurate images, and one approach to this is to solve the scalar wave equation using finite differences. As part of an ongoing ACTI project funded by the US Department of Energy, a finite difference, 3-D prestack, depth migration code has been developed. The goal of this work is to demonstrate that massively parallel computers can be used efficiently for seismic imaging, and that sufficient computing power exists (or soon will exist) to make finite difference, prestack, depth migration practical for oil and gas exploration. Several problems had to be addressed to get an efficient code for the Intel Paragon. These include efficient I/O, efficient parallel tridiagonal solves, and high single-node performance. Furthermore, to provide portable code the author has been restricted to the use of high-level programming languages (C and Fortran) and interprocessor communications using MPI. He has been using the SUNMOS operating system, which has affected many of his programming decisions. He will present images created from two verification datasets (the Marmousi Model and the SEG/EAEG 3D Salt Model). Also, he will show recent images from real datasets, and point out locations of improved imaging. Finally, he will discuss areas of current research which will hopefully improve the image quality and reduce computational costs.
The computational complexity of elliptic curve integer sub-decomposition (ISD) method
NASA Astrophysics Data System (ADS)
Ajeena, Ruma Kareem K.; Kamarulhaili, Hailiza
2014-07-01
The idea of the GLV method of Gallant, Lambert and Vanstone (Crypto 2001) is considered a foundation stone on which to build a new procedure for computing the elliptic curve scalar multiplication. This procedure, integer sub-decomposition (ISD), computes any multiple kP of an elliptic curve point P of large prime order n using two low-degree endomorphisms ψ1 and ψ2 of the elliptic curve E over the prime field Fp. The sub-decomposition of the values k1 and k2, which are not bounded by ±C√n, gives new integers k11, k12, k21 and k22 which are bounded by ±C√n and can be computed by solving the closest vector problem in a lattice. The ISD method increases the percentage of successful scalar multiplication computations, which improves the computational efficiency in comparison with the general method for computing scalar multiplication in elliptic curves over prime fields. This paper presents the mechanism of the ISD method and sheds light mainly on the computational complexity of the ISD approach, which is determined by computing the cost of operations. These operations include elliptic curve operations and finite field operations.
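For orientation, the baseline that GLV-style methods speed up is plain double-and-add scalar multiplication, sketched below. This is not the ISD method: it uses no endomorphisms and no decomposition of k. The curve y^2 = x^3 + 2x + 2 over F_17 with base point (5, 1) of order 19 is a small textbook example chosen here as an assumption for illustration; real curves use primes of hundreds of bits.

```python
P_MOD = 17    # toy prime field F_17 (illustrative, not from the paper)
A_COEF = 2    # curve: y^2 = x^3 + 2x + 2

def ec_add(p, q):
    """Add two affine points; None stands for the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    x1, y1 = p
    x2, y2 = q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                                  # p + (-p) = infinity
    if p == q:
        lam = (3 * x1 * x1 + A_COEF) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    lam %= P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def scalar_mult(k, point):
    """Right-to-left double-and-add: one doubling per bit of k and one
    addition per set bit."""
    result, addend = None, point
    while k:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result

G = (5, 1)
doubled = scalar_mult(2, G)
```

The number of doublings grows with the bit length of k, which is why replacing one long scalar by several scalars of roughly half the length, combined through cheap endomorphisms, reduces the operation count that the paper's complexity analysis tallies.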
An adaptive Gaussian process-based iterative ensemble smoother for data assimilation
NASA Astrophysics Data System (ADS)
Ju, Lei; Zhang, Jiangjiang; Meng, Long; Wu, Laosheng; Zeng, Lingzao
2018-05-01
Accurate characterization of subsurface hydraulic conductivity is vital for modeling of subsurface flow and transport. The iterative ensemble smoother (IES) has been proposed to estimate the heterogeneous parameter field. As a Monte Carlo-based method, IES requires a relatively large ensemble size to guarantee its performance. To improve the computational efficiency, we propose an adaptive Gaussian process (GP)-based iterative ensemble smoother (GPIES) in this study. At each iteration, the GP surrogate is adaptively refined by adding a few new base points chosen from the updated parameter realizations. Then the sensitivity information between model parameters and measurements is calculated from a large number of realizations generated by the GP surrogate with virtually no computational cost. Since the original model evaluations are only required for base points, whose number is much smaller than the ensemble size, the computational cost is significantly reduced. The applicability of GPIES in estimating heterogeneous conductivity is evaluated by the saturated and unsaturated flow problems, respectively. Without sacrificing estimation accuracy, GPIES achieves about an order of magnitude of speed-up compared with the standard IES. Although subsurface flow problems are considered in this study, the proposed method can be equally applied to other hydrological models.
NASA Technical Reports Server (NTRS)
Hanebutte, Ulf R.; Joslin, Ronald D.; Zubair, Mohammad
1994-01-01
The implementation and the performance of a parallel spatial direct numerical simulation (PSDNS) code are reported for the IBM SP1 supercomputer. The spatially evolving disturbances that are associated with laminar-to-turbulent transition in three-dimensional boundary-layer flows are computed with the PSDNS code. By remapping the distributed data structure during the course of the calculation, optimized serial library routines can be utilized that substantially increase the computational performance. Although the remapping incurs a high communication penalty, the parallel efficiency of the code remains above 40% for all performed calculations. By using appropriate compile options and optimized library routines, the serial code achieves 52-56 Mflops on a single node of the SP1 (45% of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a 'real world' simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP for the same simulation. The scalability information provides estimated computational costs that match the actual costs relative to changes in the number of grid points.
Output-Sensitive Construction of Reeb Graphs.
Doraiswamy, H; Natarajan, V
2012-01-01
The Reeb graph of a scalar function represents the evolution of the topology of its level sets. This paper describes a near-optimal output-sensitive algorithm for computing the Reeb graph of scalar functions defined over manifolds or non-manifolds in any dimension. Key to the simplicity and efficiency of the algorithm is an alternate definition of the Reeb graph that considers equivalence classes of level sets instead of individual level sets. The algorithm works in two steps. The first step locates all critical points of the function in the domain. Critical points correspond to nodes in the Reeb graph. Arcs connecting the nodes are computed in the second step by a simple search procedure that works on a small subset of the domain that corresponds to a pair of critical points. The paper also describes a scheme for controlled simplification of the Reeb graph and two different graph layout schemes that help in the effective presentation of Reeb graphs for visual analysis of scalar fields. Finally, the Reeb graph is employed in four different applications: surface segmentation, spatially-aware transfer function design, visualization of interval volumes, and interactive exploration of time-varying data.
Mathematical design of a novel input/instruction device using a moving acoustic emitter
NASA Astrophysics Data System (ADS)
Wang, Xianchao; Guo, Yukun; Li, Jingzhi; Liu, Hongyu
2017-10-01
This paper is concerned with the mathematical design of a novel input/instruction device using a moving emitter. The emitter acts as a point source and can be installed on a digital pen or worn on the finger of the human being who desires to interact/communicate with the computer. The input/instruction can be recognized by identifying the moving trajectory of the emitter performed by the human being from the collected wave field data. The identification process is modelled as an inverse source problem where one intends to identify the trajectory of a moving point source. There are several salient features of our study which distinguish our result from the existing ones in the literature. First, the point source is moving in an inhomogeneous background medium, which models the human body. Second, the dynamical wave field data are collected in a limited aperture. Third, the reconstruction method is independent of the background medium, and it is totally direct without any matrix inversion. Hence, it is efficient and robust with respect to the measurement noise. Both theoretical justifications and computational experiments are presented to verify our novel findings.
A New Privacy-Preserving Handover Authentication Scheme for Wireless Networks
Wang, Changji; Yuan, Yuan; Wu, Jiayuan
2017-01-01
Handover authentication is a critical issue in wireless networks, which is being used to ensure mobile nodes wander over multiple access points securely and seamlessly. A variety of handover authentication schemes for wireless networks have been proposed in the literature. Unfortunately, existing handover authentication schemes are vulnerable to a few security attacks, or incur high communication and computation costs. Recently, He et al. proposed a handover authentication scheme PairHand and claimed it can resist various attacks without rigorous security proofs. In this paper, we show that PairHand does not meet forward secrecy and strong anonymity. More seriously, it is vulnerable to key compromise attack, where an adversary can recover the private key of any mobile node. Then, we propose a new efficient and provably secure handover authentication scheme for wireless networks based on elliptic curve cryptography. Compared with existing schemes, our proposed scheme can resist key compromise attack, and achieves forward secrecy and strong anonymity. Moreover, it is more efficient in terms of computation and communication. PMID:28632171
Method for computationally efficient design of dielectric laser accelerator structures
Hughes, Tyler; Veronis, Georgios; Wootton, Kent P.; ...
2017-06-22
Here, dielectric microstructures have generated much interest in recent years as a means of accelerating charged particles when powered by solid state lasers. The acceleration gradient (or particle energy gain per unit length) is an important figure of merit. To design structures with high acceleration gradients, we explore the adjoint variable method, a highly efficient technique used to compute the sensitivity of an objective with respect to a large number of parameters. With this formalism, the sensitivity of the acceleration gradient of a dielectric structure with respect to its entire spatial permittivity distribution is calculated by the use of only two full-field electromagnetic simulations, the original and ‘adjoint’. The adjoint simulation corresponds physically to the reciprocal situation of a point charge moving through the accelerator gap and radiating. Using this formalism, we perform numerical optimizations aimed at maximizing acceleration gradients, which generate fabricable structures of greatly improved performance in comparison to previously examined geometries.
Minimized state complexity of quantum-encoded cryptic processes
NASA Astrophysics Data System (ADS)
Riechers, Paul M.; Mahoney, John R.; Aghamohammadi, Cina; Crutchfield, James P.
2016-05-01
The predictive information required for proper trajectory sampling of a stochastic process can be more efficiently transmitted via a quantum channel than a classical one. This recent discovery allows quantum information processing to drastically reduce the memory necessary to simulate complex classical stochastic processes. It also points to a new perspective on the intrinsic complexity that nature must employ in generating the processes we observe. The quantum advantage increases with codeword length: the length of process sequences used in constructing the quantum communication scheme. In analogy with the classical complexity measure, statistical complexity, we use this reduced communication cost as an entropic measure of state complexity in the quantum representation. Previously difficult to compute, the quantum advantage is expressed here in closed form using spectral decomposition. This allows for efficient numerical computation of the quantum-reduced state complexity at all encoding lengths, including infinite. Additionally, it makes clear how finite-codeword reduction in state complexity is controlled by the classical process's cryptic order, and it allows asymptotic analysis of infinite-cryptic-order processes.
NASA Astrophysics Data System (ADS)
Kozynchenko, Alexander I.; Kozynchenko, Sergey A.
2017-03-01
This paper considers the problem of improving the efficiency of the particle-particle-particle-mesh (P3M) algorithm in computing inter-particle electrostatic forces. The particle-mesh (PM) part of the algorithm is modified so that the space field equation is solved by direct summation of potentials over the ensemble of particles lying not too close to a reference particle. For this purpose, a specific matrix "pattern" is introduced to describe the spatial field distribution of a single point charge; the "pattern" contains pre-calculated potential values. This approach reduces the arithmetic performed in the innermost of the nested loops to an addition and an assignment, and therefore decreases the running time substantially. A simulation model developed in C++ substantiates this view, showing decent accuracy, acceptable for particle beam calculations, together with improved speed.
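The pre-calculated "pattern" idea described in this abstract can be illustrated with a small sketch. The 2D grid, the unit constants, the exclusion of the zero offset, and the names `build_pattern` and `accumulate_potential` are assumptions made for illustration; this is not the authors' C++ model.

```python
import math

def build_pattern(radius):
    """Pre-compute the potential 1/r of a unit point charge at every
    integer grid offset (dx, dy) within +/- radius. The zero offset is
    left at 0.0, i.e. self-interaction is excluded in this sketch."""
    size = 2 * radius + 1
    pattern = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            dx, dy = i - radius, j - radius
            if dx or dy:
                pattern[i][j] = 1.0 / math.hypot(dx, dy)
    return pattern

def accumulate_potential(n, charges, pattern, radius):
    """Sum the potentials of point charges sitting on nodes of an n x n grid.
    charges is a list of (ix, iy, q). The innermost loop only looks up a
    stored value, scales it by q, and adds it to the grid."""
    phi = [[0.0] * n for _ in range(n)]
    for cx, cy, q in charges:
        for i in range(max(0, cx - radius), min(n, cx + radius + 1)):
            row = pattern[i - cx + radius]
            for j in range(max(0, cy - radius), min(n, cy + radius + 1)):
                phi[i][j] += q * row[j - cy + radius]
    return phi
```

If all particles carry the same charge, the pattern could be pre-scaled once so that only the addition remains in the inner loop; that variation is likewise an assumption, not something stated in the abstract.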
Peng, Hanchuan; Tang, Jianyong; Xiao, Hang; Bria, Alessandro; Zhou, Jianlong; Butler, Victoria; Zhou, Zhi; Gonzalez-Bellido, Paloma T; Oh, Seung W; Chen, Jichao; Mitra, Ananya; Tsien, Richard W; Zeng, Hongkui; Ascoli, Giorgio A; Iannello, Giulio; Hawrylycz, Michael; Myers, Eugene; Long, Fuhui
2014-07-11
Three-dimensional (3D) bioimaging, visualization and data analysis are in strong need of powerful 3D exploration techniques. We develop virtual finger (VF) to generate 3D curves, points and regions-of-interest in the 3D space of a volumetric image with a single finger operation, such as a computer mouse stroke, or click or zoom from the 2D-projection plane of an image as visualized with a computer. VF provides efficient methods for acquisition, visualization and analysis of 3D images for roundworm, fruitfly, dragonfly, mouse, rat and human. Specifically, VF enables instant 3D optical zoom-in imaging, 3D free-form optical microsurgery, and 3D visualization and annotation of terabytes of whole-brain image volumes. VF also leads to orders of magnitude better efficiency of automated 3D reconstruction of neurons and similar biostructures over our previous systems. We use VF to generate from images of 1,107 Drosophila GAL4 lines a projectome of a Drosophila brain.
A Novel Latin Hypercube Algorithm via Translational Propagation
Pan, Guang; Ye, Pengcheng
2014-01-01
Metamodels have been widely used in engineering design to facilitate analysis and optimization of complex systems that involve computationally expensive simulation programs. The accuracy of metamodels is directly related to the experimental designs used. Optimal Latin hypercube designs are frequently used and have been shown to have good space-filling and projective properties. However, the high cost of constructing them limits their use. In this paper, a methodology for creating novel Latin hypercube designs via translational propagation and a successive local enumeration algorithm (TPSLE) is developed without using formal optimization. The TPSLE algorithm is based on the idea that a near-optimal Latin hypercube design can be constructed from a simple initial block of a few points, generated by the SLE algorithm, that serves as a building block. The TPSLE algorithm thus offers a balanced trade-off between efficiency and sampling performance. The proposed algorithm is compared to two existing algorithms and is found to be much more efficient in terms of computation time while having acceptable space-filling and projective properties. PMID:25276844
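For readers unfamiliar with the Latin hypercube property itself, the sketch below generates a plain random Latin hypercube design. It is not the TPSLE construction from the abstract; the function name `latin_hypercube` and its parameters are illustrative assumptions.

```python
import random

def latin_hypercube(n_points, n_dims, rng=None):
    """Return n_points samples in the unit hypercube such that, in every
    dimension, each of the n_points equal-width strata holds one sample."""
    rng = rng or random.Random()
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_points))
        rng.shuffle(strata)  # independent random permutation per dimension
        # place one point uniformly at random inside each stratum
        columns.append([(s + rng.random()) / n_points for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_points)]
```

A random design like this satisfies the one-point-per-stratum (projective) property but makes no attempt to optimize space filling, which is the expensive part that methods such as the one in the abstract aim to make cheaper.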
NASA Astrophysics Data System (ADS)
Poirier, Vincent
Mesh deformation schemes play an important role in numerical aerodynamic optimization. As the aerodynamic shape changes, the computational mesh must adapt to conform to the deformed geometry. In this work, an extension to an existing fast and robust Radial Basis Function (RBF) mesh movement scheme is presented. Using a reduced set of surface points to define the mesh deformation increases the efficiency of the RBF method; however, at the cost of introducing errors into the parameterization by not recovering the exact displacement of all surface points. A secondary mesh movement is implemented, within an adjoint-based optimization framework, to eliminate these errors. The proposed scheme is tested within a 3D Euler flow by reducing the pressure drag while maintaining lift of a wing-body configured Boeing-747 and an Onera-M6 wing. As well, an inverse pressure design is executed on the Onera-M6 wing and an inverse span loading case is presented for a wing-body configured DLR-F6 aircraft.
The computer-generated treatment plan... Create the nucleus to successful systems.
Bernhardt, Christene
2004-12-01
A well-managed, highly efficient practice relies on the comprehensive information provided through effective treatment planning. Computer-generated treatment plans are successful only if the 14 key points of information are included within the plan. Major systems such as scheduling, financial agreements, and insurance processing fail if adequate information is not provided through the treatment plan. Successful interactions with patients at the time of the consultation rely heavily on having adequate information at your fingertips. The treatment plan is truly the foundation to all communications that must occur during the patient's experience, and ensures that every team member has clear and easy access to the status of each patient as treatment unfolds.
Window-based method for approximating the Hausdorff in three-dimensional range imagery
Koch, Mark W [Albuquerque, NM
2009-06-02
One approach to pattern recognition is to use a template from a database of objects and match it to a probe image containing the unknown. Accordingly, the Hausdorff distance can be used to measure the similarity of two sets of points. In particular, the Hausdorff can measure the goodness of a match in the presence of occlusion, clutter, and noise. However, existing 3D algorithms for calculating the Hausdorff are computationally intensive, making them impractical for pattern recognition that requires scanning of large databases. The present invention is directed to a new method that can efficiently, in time and memory, compute the Hausdorff for 3D range imagery. The method uses a window-based approach.
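As a point of reference for the quantity being approximated, the Hausdorff distance between two finite point sets can be computed by brute force as sketched below. This is the standard textbook definition, not the window-based method of the record above; the function names are illustrative.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point of a to its nearest neighbour in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between point sets a and b."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))
```

The brute-force form compares every point of one set with every point of the other, so its cost grows with the product of the set sizes. That cost is the reason scanning large databases with it is impractical.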
Implementation of the Sun Position Calculation in the PDC-1 Control Microprocessor
NASA Technical Reports Server (NTRS)
Stallkamp, J. A.
1984-01-01
The several computational approaches to providing the local azimuth and elevation angles of the Sun as a function of local time and then the utilization of the most appropriate method in the PDC-1 microprocessor are presented. The full algorithm, the FORTRAN form, is felt to be very useful in any kind or size of computer. It was used in the PDC-1 unit to generate efficient code for the microprocessor with its floating point arithmetic chip. The balance of the presentation consists of a brief discussion of the tracking requirements for PDC-1, the planetary motion equations from the first to the final version, and the local azimuth-elevation geometry.
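The local azimuth-elevation geometry mentioned at the end of this abstract can be sketched with the standard spherical-astronomy relations. The sketch assumes the solar declination and hour angle are already known; it does not reproduce the report's FORTRAN algorithm or its planetary motion equations, and the function name is hypothetical.

```python
import math

def sun_azimuth_elevation(lat_deg, decl_deg, hour_angle_deg):
    """Return (azimuth, elevation) in degrees.
    Azimuth is measured clockwise from north; the hour angle is zero at
    local solar noon and positive in the afternoon."""
    lat, dec, ha = (math.radians(v) for v in (lat_deg, decl_deg, hour_angle_deg))
    # Components of the Sun's direction in the local east/north/up frame
    east = -math.cos(dec) * math.sin(ha)
    north = math.cos(lat) * math.sin(dec) - math.sin(lat) * math.cos(dec) * math.cos(ha)
    up = math.sin(lat) * math.sin(dec) + math.cos(lat) * math.cos(dec) * math.cos(ha)
    elevation = math.degrees(math.asin(max(-1.0, min(1.0, up))))
    azimuth = math.degrees(math.atan2(east, north)) % 360.0
    return azimuth, elevation
```

Atmospheric refraction and the conversion from clock time to hour angle are deliberately left out of this sketch.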
Semi-physical Simulation Platform of a Parafoil Nonlinear Dynamic System
NASA Astrophysics Data System (ADS)
Gao, Hai-Tao; Yang, Sheng-Bo; Zhu, Er-Lin; Sun, Qing-Lin; Chen, Zeng-Qiang; Kang, Xiao-Feng
2013-11-01
To address problems in the simulation and experimental study of a parafoil nonlinear dynamic system, such as limited methods, high cost and low efficiency, we present a semi-physical simulation platform. It is designed by connecting some physical components to a computer, which remedies the defect that a pure computer simulation is completely divorced from the real environment. The main components of the platform and its functions, as well as simulation flows, are introduced. The feasibility and validity are verified through a simulation experiment. The experimental results show that the platform has significance for improving the quality of the parafoil fixed-point airdrop system, shortening the development cycle and saving cost.
Photogrammetric 3d Building Reconstruction from Thermal Images
NASA Astrophysics Data System (ADS)
Maset, E.; Fusiello, A.; Crosilla, F.; Toldo, R.; Zorzetto, D.
2017-08-01
This paper addresses the problem of 3D building reconstruction from thermal infrared (TIR) images. We show that a commercial Computer Vision software can be used to automatically orient sequences of TIR images taken from an Unmanned Aerial Vehicle (UAV) and to generate 3D point clouds, without requiring any GNSS/INS data about position and attitude of the images nor camera calibration parameters. Moreover, we propose a procedure based on Iterative Closest Point (ICP) algorithm to create a model that combines high resolution and geometric accuracy of RGB images with the thermal information deriving from TIR images. The process can be carried out entirely by the aforesaid software in a simple and efficient way.
Wang, Ke; Nirmalathas, Ampalavanapillai; Lim, Christina; Skafidas, Efstratios; Alameh, Kamal
2013-07-01
In this paper, we propose and experimentally demonstrate a free-space based high-speed reconfigurable card-to-card optical interconnect architecture with broadcast capability, which is required for control functionalities and efficient parallel computing applications. Experimental results show that 10 Gb/s data can be broadcast to all receiving channels for up to 30 cm with a worst-case receiver sensitivity better than -12.20 dBm. In addition, arbitrary multicasting with the same architecture is also investigated. 10 Gb/s reconfigurable point-to-point link and multicast channels are simultaneously demonstrated with a measured receiver sensitivity power penalty of ~1.3 dB due to crosstalk.
Physical realization of topological quantum walks on IBM-Q and beyond
NASA Astrophysics Data System (ADS)
Balu, Radhakrishnan; Castillo, Daniel; Siopsis, George
2018-07-01
We discuss an efficient physical realization of topological quantum walks on a one-dimensional finite lattice with periodic boundary conditions (circle). The N-point lattice is realized with log_2 N qubits, and the quantum circuit utilizes a number of quantum gates that are polynomial in the number of qubits. In a certain scaling limit, we show that a large number of steps are implemented with a number of quantum gates which are independent of the number of steps. We ran the quantum algorithm on the IBM-Q five-qubit quantum computer, thus experimentally demonstrating topological features, such as boundary bound states, on a one-dimensional lattice with N = 4 points.
Scalable domain decomposition solvers for stochastic PDEs in high performance computing
Desai, Ajit; Khalil, Mohammad; Pettit, Chris; ...
2017-09-21
Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. And though these algorithms exhibit excellent scalabilities, significant algorithmic and implementational challenges exist to extend them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.
Scalable domain decomposition solvers for stochastic PDEs in high performance computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Desai, Ajit; Khalil, Mohammad; Pettit, Chris
Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. And though these algorithms exhibit excellent scalabilities, significant algorithmic and implementational challenges exist to extend them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.
Mantle convection on modern supercomputers
NASA Astrophysics Data System (ADS)
Weismüller, Jens; Gmeiner, Björn; Mohr, Marcus; Waluga, Christian; Wohlmuth, Barbara; Rüde, Ulrich; Bunge, Hans-Peter
2015-04-01
Mantle convection is the cause of plate tectonics, the formation of mountains and oceans, and the main driving mechanism behind earthquakes. The convection process is modeled by a system of partial differential equations describing the conservation of mass, momentum and energy. Characteristic of mantle flow is the vast disparity of length scales from global to microscopic, turning mantle convection simulations into a challenging application for high-performance computing. As system size and technical complexity of the simulations continue to increase, design and implementation of simulation models for next generation large-scale architectures demand an interdisciplinary co-design. Here we report on recent advances of the TERRA-NEO project, which is part of the high visibility SPPEXA program, and a joint effort of four research groups in computer sciences, mathematics and geophysical application under the leadership of FAU Erlangen. TERRA-NEO develops algorithms for future HPC infrastructures, focusing on high computational efficiency and resilience in next generation mantle convection models. We present software that can resolve the Earth's mantle with up to 10^12 grid points and scales efficiently to massively parallel hardware with more than 50,000 processors. We use our simulations to explore the dynamic regime of mantle convection, assessing the impact of small scale processes on global mantle flow.
Efficient free energy calculations of quantum systems through computer simulations
NASA Astrophysics Data System (ADS)
Antonelli, Alex; Ramirez, Rafael; Herrero, Carlos; Hernandez, Eduardo
2009-03-01
In general, the classical limit is assumed in computer simulation calculations of free energy. This approximation, however, is not justifiable for a class of systems in which quantum contributions to the free energy cannot be neglected. The inclusion of quantum effects is important for the determination of reliable phase diagrams of these systems. In this work, we present a new methodology to compute the free energy of many-body quantum systems [1]. This methodology results from the combination of the path integral formulation of statistical mechanics and efficient non-equilibrium methods to estimate free energy, namely, the adiabatic switching and reversible scaling methods. A quantum Einstein crystal is used as a model to show the accuracy and reliability of the methodology. This new method is applied to the calculation of solid-liquid coexistence properties of neon. Our findings indicate that quantum contributions to properties such as melting point, latent heat of fusion, entropy of fusion, and slope of the melting line can be up to 10% of the values calculated using the classical approximation. [1] R. M. Ramirez, C. P. Herrero, A. Antonelli, and E. R. Hernández, Journal of Chemical Physics 129, 064110 (2008)
Computer-assisted map projection research
Snyder, John Parr
1985-01-01
Computers have opened up areas of map projection research which were previously too complicated to utilize, for example, using a least-squares fit to a very large number of points. One application has been in the efficient transfer of data between maps on different projections. While the transfer of moderate amounts of data is satisfactorily accomplished using the analytical map projection formulas, polynomials are more efficient for massive transfers. Suitable coefficients for the polynomials may be determined more easily for general cases using least squares instead of Taylor series. A second area of research is in the determination of a map projection fitting an unlabeled map, so that accurate data transfer can take place. The computer can test one projection after another, and include iteration where required. A third area is in the use of least squares to fit a map projection with optimum parameters to the region being mapped, so that distortion is minimized. This can be accomplished for standard conformal, equal-area, or other types of projections. Even less distortion can result if complex transformations of conformal projections are utilized. This bulletin describes several recent applications of these principles, as well as historical usage and background.
Nguyen, Tuan-Anh; Nakib, Amir; Nguyen, Huy-Nam
2016-06-01
The non-local means denoising filter has been established as a gold standard for image denoising in general, and in medical imaging in particular, because of its efficiency. However, its computation time has limited its use in real-world applications, especially in medical imaging. In this paper, a distributed version on a parallel hybrid architecture is proposed to address the computation time problem, and a new method to compute the filter coefficients is also proposed, in which we focus on the implementation and on enhancing the filter parameters by taking the neighborhood of the current voxel into account more accurately. In terms of implementation, our key contribution consists in reducing the number of shared memory accesses. The proposed method was tested on the brain-web database for different levels of noise. Performance and sensitivity were quantified in terms of speedup, peak signal-to-noise ratio, execution time, and the number of floating point operations. The obtained results demonstrate the efficiency of the proposed method. Moreover, the implementation is compared to that of other techniques recently published in the literature. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
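To make the filter under discussion concrete, here is a minimal serial sketch of non-local means on a 1D signal. It shows only the basic weighting scheme; the parallel hybrid implementation and the modified coefficient computation of the abstract are not reproduced, and the function name, parameters and default values are assumptions.

```python
import math

def nlm_1d(signal, patch_radius=1, search_radius=3, h=0.5):
    """Denoise a 1D signal: each sample becomes a weighted mean of the
    samples in its search window, weighted by patch similarity."""
    n = len(signal)

    def patch(i):
        # patch around sample i, with indices clamped at the borders
        return [signal[min(max(i + o, 0), n - 1)]
                for o in range(-patch_radius, patch_radius + 1)]

    out = []
    for i in range(n):
        ref = patch(i)
        weight_sum = 0.0
        acc = 0.0
        for j in range(max(0, i - search_radius), min(n, i + search_radius + 1)):
            # mean squared difference between the two patches
            d2 = sum((a - b) ** 2 for a, b in zip(ref, patch(j))) / len(ref)
            w = math.exp(-d2 / (h * h))
            weight_sum += w
            acc += w * signal[j]
        out.append(acc / weight_sum)
    return out
```

The nested loops over samples, search window and patch are what make the filter expensive on 3D volumes, which is the cost the abstract's parallel version targets.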
Improving GPU-accelerated adaptive IDW interpolation algorithm using fast kNN search.
Mei, Gang; Xu, Nengxiong; Xu, Liangliang
2016-01-01
This paper presents an efficient parallel Adaptive Inverse Distance Weighting (AIDW) interpolation algorithm on modern Graphics Processing Unit (GPU). The presented algorithm is an improvement of our previous GPU-accelerated AIDW algorithm by adopting fast k-nearest neighbors (kNN) search. In AIDW, it needs to find several nearest neighboring data points for each interpolated point to adaptively determine the power parameter; and then the desired prediction value of the interpolated point is obtained by weighted interpolating using the power parameter. In this work, we develop a fast kNN search approach based on the space-partitioning data structure, even grid, to improve the previous GPU-accelerated AIDW algorithm. The improved algorithm is composed of the stages of kNN search and weighted interpolating. To evaluate the performance of the improved algorithm, we perform five groups of experimental tests. The experimental results indicate: (1) the improved algorithm can achieve a speedup of up to 1017 over the corresponding serial algorithm; (2) the improved algorithm is at least two times faster than our previous GPU-accelerated AIDW algorithm; and (3) the utilization of fast kNN search can significantly improve the computational efficiency of the entire GPU-accelerated AIDW algorithm.
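The two stages named in this abstract, kNN search followed by weighted interpolation, can be sketched serially as below. This is a simplification under stated assumptions: the neighbours are found by a brute-force sort rather than an even-grid search, the power parameter is fixed rather than adapted to local point density, and nothing runs on a GPU. The name `idw_knn` is illustrative.

```python
import math

def idw_knn(points, values, query, k=3, power=2.0):
    """Inverse distance weighting over the k nearest data points."""
    # Stage 1: kNN search (brute force: sort all data points by distance)
    nearest = sorted((math.dist(p, query), v) for p, v in zip(points, values))[:k]
    if nearest[0][0] == 0.0:
        return nearest[0][1]  # query coincides with a data point
    # Stage 2: weighted interpolation with weights 1 / d**power
    weights = [d ** -power for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
```

In the adaptive variant, `power` would be chosen per query point from the spatial pattern of its neighbours, which is why the neighbour search dominates the cost and is worth accelerating.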
NASA Astrophysics Data System (ADS)
Weinmann, Martin; Jutzi, Boris; Hinz, Stefan; Mallet, Clément
2015-07-01
3D scene analysis in terms of automatically assigning 3D points a respective semantic label has become a topic of great importance in photogrammetry, remote sensing, computer vision and robotics. In this paper, we address the issue of how to increase the distinctiveness of geometric features and select the most relevant ones among these for 3D scene analysis. We present a new, fully automated and versatile framework composed of four components: (i) neighborhood selection, (ii) feature extraction, (iii) feature selection and (iv) classification. For each component, we consider a variety of approaches which allow applicability in terms of simplicity, efficiency and reproducibility, so that end-users can easily apply the different components and do not require expert knowledge in the respective domains. In a detailed evaluation involving 7 neighborhood definitions, 21 geometric features, 7 approaches for feature selection, 10 classifiers and 2 benchmark datasets, we demonstrate that the selection of optimal neighborhoods for individual 3D points significantly improves the results of 3D scene analysis. Additionally, we show that the selection of adequate feature subsets may even further increase the quality of the derived results while significantly reducing both processing time and memory consumption.
Glocker, Ben; Paragios, Nikos; Komodakis, Nikos; Tziritas, Georgios; Navab, Nassir
2007-01-01
In this paper we propose a novel non-rigid volume registration based on discrete labeling and linear programming. The proposed framework reformulates registration as a minimal path extraction in a weighted graph. The space of solutions is represented using a set of labels which are assigned to predefined displacements. The graph topology corresponds to a regular grid superimposed onto the volume. Links between neighboring control points introduce smoothness, while links between the graph nodes and the labels (end-nodes) measure the cost induced to the objective function through the selection of a particular deformation for a given control point once projected to the entire volume domain. Higher-order polynomials are used to express the volume deformation from those of the control points. Efficient linear programming that can guarantee the optimal solution up to a (user-defined) bound is considered to recover the optimal registration parameters. Therefore, the method is gradient free, can encode various similarity metrics (simple changes on the graph construction), can guarantee a globally sub-optimal solution and is computationally tractable. Experimental validation using simulated data with known deformation, as well as manually segmented data, demonstrates the strong potential of our approach.
1987-10-01
19 treated in interaction with each other and the hardware and software design. The authors point out some of the inadequacies in HP technologies and...life cycle costs recognition performance on secondary tasks effort/efficiency number of wins (gaming tasks) number of instructors needed amount of...student interacts with this material in real time via a terminal and display system. The computer performs many functions, such as diagnose student
A Prototype Decision Support System for the Location of Military Water Points.
1980-06-01
create an environment which is conducive to an efficient man/machine decision making system. This could be accomplished by designing the operating...Figure 12. Flowchart of Program COMPUTE 50 Procedure This Decision Support System was designed to be interactive. That is, it requests data from the user...Pg. 82-114, 1974. 24. Geoffrion, A.M. and G.W. Graves, "Multicommodity Distribution System Design by Benders Partition", Management Science, Vol. 20, Pg
Motion compensation for ultra wide band SAR
NASA Technical Reports Server (NTRS)
Madsen, S.
2001-01-01
This paper describes an algorithm that combines wavenumber domain processing with a procedure that enables motion compensation to be applied as a function of target range and azimuth angle. First, data are processed with nominal motion compensation applied, partially focusing the image, then the motion compensation of individual subpatches is refined. The results show that the proposed algorithm is effective in compensating for deviations from a straight flight path, from both a performance and a computational efficiency point of view.
Combining Image Processing with Signal Processing to Improve Transmitter Geolocation Estimation
2014-03-27
transmitter by searching a grid of possible transmitter locations within the image region. At each evaluated grid point, theoretical TDOA values are computed...requires converting the image to a grayscale intensity image. This allows efficient manipulation of data and ease of comparison among pixel values. The...cluster of redundant y values along the top edge of an ideal rectangle. The same is true for the bottom edge, as well as for the x values along the
An Efficient Downlink Scheduling Strategy Using Normal Graphs for Multiuser MIMO Wireless Systems
NASA Astrophysics Data System (ADS)
Chen, Jung-Chieh; Wu, Cheng-Hsuan; Lee, Yao-Nan; Wen, Chao-Kai
Inspired by the success of the low-density parity-check (LDPC) codes in the field of error-control coding, in this paper we propose transforming the downlink multiuser multiple-input multiple-output scheduling problem into an LDPC-like problem using the normal graph. Based on the normal graph framework, soft information, which indicates the probability that each user will be scheduled to transmit packets at the access point through a specified angle-frequency sub-channel, is exchanged among the local processors to iteratively optimize the multiuser transmission schedule. Computer simulations show that the proposed algorithm can efficiently schedule simultaneous multiuser transmission which then increases the overall channel utilization and reduces the average packet delay.
An Efficient Authenticated Key Transfer Scheme in Client-Server Networks
NASA Astrophysics Data System (ADS)
Shi, Runhua; Zhang, Shun
2017-10-01
In this paper, we present a novel authenticated key transfer scheme for client-server networks, which achieves two security goals: remote user authentication and session key establishment between the remote user and the server. In particular, the proposed scheme provides two entirely different kinds of authentication, identity-based authentication and anonymous authentication, while the remote user holds only a single private key. Furthermore, our scheme needs only one round of messages from the remote user to the server, so it is very efficient in communication complexity. In addition, the most time-consuming computation in our scheme is elliptic curve scalar point multiplication, so it is feasible even for mobile devices.
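The dominant operation named in this abstract, elliptic curve scalar point multiplication, can be sketched with the double-and-add method on a toy curve. The curve y^2 = x^3 + 2x + 2 over GF(17) with base point (5, 1) is a small textbook example chosen here for illustration; it is far too small to be secure and is not a parameter of the scheme above. The function names are likewise assumptions.

```python
P_MOD, A = 17, 2  # toy curve y^2 = x^3 + 2x + 2 over GF(17), illustration only

def ec_add(p, q):
    """Add two curve points; None represents the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p + (-p) is the point at infinity
    if p == q:
        # tangent slope for point doubling
        s = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        # chord slope for adding distinct points
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    y3 = (s * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def scalar_mult(k, point):
    """Compute k * point by double-and-add, scanning k from its lowest bit."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result
```

Double-and-add needs a number of point operations proportional to the bit length of k, which is why this step tends to dominate the running time on constrained devices. A production implementation would also need constant-time arithmetic, which this sketch does not attempt. The modular inverse via `pow(x, -1, m)` requires Python 3.8 or later.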
Nguyen, N; Milanfar, P; Golub, G
2001-01-01
In many image restoration/resolution enhancement applications, the blurring process, i.e., point spread function (PSF) of the imaging system, is not known or is known only to within a set of parameters. We estimate these PSF parameters for this ill-posed class of inverse problem from raw data, along with the regularization parameters required to stabilize the solution, using the generalized cross-validation method (GCV). We propose efficient approximation techniques based on the Lanczos algorithm and Gauss quadrature theory, reducing the computational complexity of the GCV. Data-driven PSF and regularization parameter estimation experiments with synthetic and real image sequences are presented to demonstrate the effectiveness and robustness of our method.
Active electromagnetic invisibility cloaking and radiation force cancellation
NASA Astrophysics Data System (ADS)
Mitri, F. G.
2018-03-01
This investigation shows that an active emitting electromagnetic (EM) Dirichlet source (i.e., with axial polarization of the electric field) in a homogeneous non-dissipative/non-absorptive medium placed near a perfectly conducting boundary can render total invisibility (i.e., zero extinction cross-section or efficiency) in addition to a radiation force cancellation on its surface. Based upon the Poynting theorem, the mathematical expressions for the extinction, radiation and amplification cross-sections (or efficiencies) are derived using the partial-wave series expansion method in cylindrical coordinates. Moreover, the analysis is extended to compute the self-induced EM radiation force on the active source, resulting from the waves reflected by the boundary. The numerical results predict the generation of a zero extinction efficiency, achieving total invisibility, in addition to a radiation force cancellation, both of which depend on the source size, the distance from the boundary and the associated EM mode order of the active source. Furthermore, an attractive EM pushing force on the active source directed toward the boundary or a repulsive pulling one pointing away from it can arise accordingly. The numerical predictions and computational results find potential applications in the design and development of EM cloaking devices, invisibility and stealth technologies.
Efficient visibility-driven medical image visualisation via adaptive binned visibility histogram.
Jung, Younhyun; Kim, Jinman; Kumar, Ashnil; Feng, David Dagan; Fulham, Michael
2016-07-01
'Visibility' is a fundamental optical property that represents the proportion of the voxels in a volume that users can observe during interactive volume rendering. The manipulation of this 'visibility' improves the volume rendering processes, for instance by ensuring the visibility of regions of interest (ROIs) or by guiding the identification of an optimal rendering view-point. The construction of visibility histograms (VHs), which represent the distribution of the visibility of all voxels in the rendered volume, enables users to explore the volume with real-time feedback about occlusion patterns among spatially related structures during volume rendering manipulations. Volume rendered medical images have been a primary beneficiary of VH given the need to ensure that specific ROIs are visible relative to the surrounding structures, e.g. the visualisation of tumours that may otherwise be occluded by neighbouring structures. VH construction and its subsequent manipulations, however, are computationally expensive due to the histogram binning of the visibilities. This limits the real-time application of VH to medical images that have large intensity ranges and volume dimensions and require a large number of histogram bins. In this study, we introduce an efficient adaptive binned visibility histogram (AB-VH) in which a smaller number of histogram bins are used to represent the visibility distribution of the full VH. We adaptively bin medical images by using a cluster analysis algorithm that groups the voxels according to their intensity similarities into a smaller subset of bins while preserving the distribution of the intensity range of the original images. We increase efficiency by exploiting the parallel computation and multiple render targets (MRT) extension of modern graphical processing units (GPUs), which enables efficient computation of the histogram.
We show the application of our method to single-modality computed tomography (CT), magnetic resonance (MR) imaging and multi-modality positron emission tomography-CT (PET-CT). In our experiments, the AB-VH markedly improved the computational efficiency for the VH construction and thus improved the subsequent VH-driven volume manipulations. This efficiency was achieved without major visual degradation of the VH and with only minor numerical differences between the AB-VH and its full-bin counterpart. We applied several variants of the K-means clustering algorithm with varying Ks (the number of clusters) and found that higher values of K resulted in better performance at a lower computational gain. The AB-VH also had an improved performance when compared to the conventional method of down-sampling of the histogram bins (equal binning) for volume rendering visualisation. Copyright © 2016 Elsevier Ltd. All rights reserved.
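The adaptive binning idea can be sketched on the CPU as one-dimensional K-means over voxel intensities, followed by accumulating per-voxel visibility into the resulting bins. This is only a minimal illustration under assumptions: the quantile-based initialisation, the function names (`nearest`, `adaptive_bins`, `binned_visibility_histogram`) and the toy data are hypothetical, and the GPU/MRT implementation described in the abstract is not reproduced.

```python
def nearest(value, centers):
    """Index of the cluster centre closest to an intensity value."""
    return min(range(len(centers)), key=lambda j: abs(value - centers[j]))


def adaptive_bins(intensities, k, iters=50):
    """1D K-means (Lloyd's algorithm) on intensities; returns k bin centres."""
    ordered = sorted(intensities)
    n = len(ordered)
    # Assumed initialisation: centres at evenly spaced quantiles of the data.
    centers = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(iters):
        labels = [nearest(v, centers) for v in intensities]
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(intensities, labels) if lab == j]
            # Keep the old centre if a cluster happens to be empty.
            updated.append(sum(members) / len(members) if members else centers[j])
        if updated == centers:  # assignments have stabilised
            break
        centers = updated
    return centers


def binned_visibility_histogram(intensities, visibilities, centers):
    """Accumulate each voxel's visibility into its adaptive intensity bin."""
    hist = [0.0] * len(centers)
    for v, vis in zip(intensities, visibilities):
        hist[nearest(v, centers)] += vis
    return hist


# Hypothetical toy volume: three intensity groups with per-voxel visibility.
intensities = [0, 1, 2, 100, 101, 102, 1000, 1001]
visibilities = [0.1, 0.2, 0.3, 0.05, 0.05, 0.1, 0.1, 0.1]
centers = adaptive_bins(intensities, k=3)
hist = binned_visibility_histogram(intensities, visibilities, centers)
```

With equal-width binning, three bins over the range 0 to 1001 would merge the two lower intensity groups into one bin, whereas the clustered centres follow the actual intensity distribution.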
GPU-Accelerated Voxelwise Hepatic Perfusion Quantification
Wang, H; Cao, Y
2012-01-01
Voxelwise quantification of hepatic perfusion parameters from dynamic contrast enhanced (DCE) imaging greatly contributes to assessment of liver function in response to radiation therapy. However, the efficiency of the estimation of hepatic perfusion parameters voxel-by-voxel in the whole liver using a dual-input single-compartment model requires substantial improvement for routine clinical applications. In this paper, we utilize the parallel computation power of a graphics processing unit (GPU) to accelerate the computation, while maintaining the same accuracy as the conventional method. Using CUDA-GPU, the hepatic perfusion computations over multiple voxels are run across the GPU blocks concurrently but independently. At each voxel, non-linear least squares fitting of the time series of the liver DCE data to the compartmental model is distributed to multiple threads in a block, and the computations of different time points are performed simultaneously and synchronously. An efficient fast Fourier transform in a block is also developed for the convolution computation in the model. The GPU computations of the voxel-by-voxel hepatic perfusion images are compared with those by the CPU using the simulated DCE data and the experimental DCE MR images from patients. The computation speed is improved by 30 times using an NVIDIA Tesla C2050 GPU compared to a 2.67 GHz Intel Xeon CPU processor. To obtain liver perfusion maps with 626400 voxels in a patient’s liver, it takes 0.9 min with the GPU-accelerated voxelwise computation, compared to 110 min with the CPU, while the perfusion parameters from the two methods differ by less than 10^-6. The method will be useful for generating liver perfusion images in clinical settings. PMID:22892645
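The FFT-based convolution step can be illustrated with a small serial sketch. This is not the per-block CUDA implementation described above: it is a plain recursive radix-2 FFT in Python, checked against a direct convolution. The function names and the toy sequences (standing in for an input curve and an impulse response) are hypothetical.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def fft_convolve(x, h):
    """Linear convolution via zero-padded FFTs."""
    size = len(x) + len(h) - 1
    n = 1
    while n < size:  # pad to a power of two to avoid circular wrap-around
        n *= 2
    fx = fft([complex(v) for v in x] + [0j] * (n - len(x)))
    fh = fft([complex(v) for v in h] + [0j] * (n - len(h)))
    y = fft([p * q for p, q in zip(fx, fh)], invert=True)
    return [v.real / n for v in y[:size]]  # 1/n scaling of the inverse transform


def direct_convolve(x, h):
    """Reference O(len(x) * len(h)) convolution."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            y[i + j] += xv * hv
    return y


# Hypothetical toy sequences.
x = [1.0, 2.0, 3.0]
h = [0.0, 1.0, 0.5]
fast = fft_convolve(x, h)
slow = direct_convolve(x, h)
```

The zero padding to at least `len(x) + len(h) - 1` samples is what makes the circular convolution computed by the FFT agree with the linear convolution.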
NASA Technical Reports Server (NTRS)
Kwak, Dochan
2005-01-01
Over the past 30 years, numerical methods and simulation tools for fluid dynamic problems have advanced as a new discipline, namely, computational fluid dynamics (CFD). Although a wide spectrum of flow regimes are encountered in many areas of science and engineering, simulation of compressible flow has been the major driver for developing computational algorithms and tools. This is probably due to a large demand for predicting the aerodynamic performance characteristics of flight vehicles, such as commercial, military, and space vehicles. As flow analysis is required to be more accurate and computationally efficient for both commercial and mission-oriented applications (such as those encountered in meteorology, aerospace vehicle development, general fluid engineering and biofluid analysis) CFD tools for engineering become increasingly important for predicting safety, performance and cost. This paper presents the author's perspective on the maturity of CFD, especially from an aerospace engineering point of view.
Realizing universal Majorana fermionic quantum computation
NASA Astrophysics Data System (ADS)
Wu, Ya-Jie; He, Jing; Kou, Su-Peng
2014-08-01
Majorana fermionic quantum computation (MFQC) was proposed by S. B. Bravyi and A. Yu. Kitaev [Ann. Phys. (NY) 298, 210 (2002), 10.1006/aphy.2002.6254], who indicated that a (nontopological) fault-tolerant quantum computer built from Majorana fermions may be more efficient than that built from distinguishable two-state systems. However, until now scientists have not known how to realize an MFQC in a physical system. In this paper we propose a possible realization of MFQC. We find that the end of a line defect of a p-wave superconductor or superfluid in a honeycomb lattice traps a Majorana zero mode, which becomes the starting point of MFQC. Then we show how to manipulate Majorana fermions to perform universal MFQC, which possesses possibilities for high-level local controllability through individually addressing the quantum states of individual constituent elements by using timely cold-atom technology.
Advanced scoring method of eco-efficiency in European cities.
Moutinho, Victor; Madaleno, Mara; Robaina, Margarita; Villar, José
2018-01-01
This paper analyzes a set of selected German and French cities' performance in terms of the relative behavior of their eco-efficiencies, computed as the ratio of their gross domestic product (GDP) over their CO2 emissions. For this analysis, eco-efficiency scores of the selected cities are computed using the data envelopment analysis (DEA) technique, taking the eco-efficiencies as outputs, and the inputs being the energy consumption, the population density, the labor productivity, the resource productivity, and the patents per inhabitant. Once DEA results are analyzed, the Malmquist productivity indexes (MPI) are used to assess the time evolution of the technical efficiency, technological efficiency, and productivity of the cities over the window periods 2000 to 2005 and 2005 to 2008. Some of the main conclusions are that (1) most of the analyzed cities seem to have suboptimal scales, which is one of the causes of their inefficiency; (2) there is evidence that high GDP over CO2 emissions does not imply high eco-efficiency scores, meaning that DEA-like approaches are useful to complement more simplistic ranking procedures, pointing out potential inefficiencies at the input levels; (3) efficiencies performed worse during the period 2000-2005 than during the period 2005-2008, suggesting the possibility of corrective actions taken during or at the end of the first period but impacting only on the second period, probably due to an increasing environmental awareness of policymakers and governors; and (4) MPI analysis shows a positive technological evolution of all cities, according to the general technological evolution of the reference cities, reflecting a generalized convergence of most cities to their technological frontier and therefore an evolution in the right direction.
Lardeux, Frédéric; Torrico, Gino; Aliaga, Claudia
2016-07-04
In ELISAs, sera of individuals infected by Trypanosoma cruzi show absorbance values above a cut-off value. The cut-off is generally computed by means of formulas that need absorbance readings of negative (and sometimes positive) controls, which are included in the titer plates amongst the unknown samples. When no controls are available, other techniques should be employed, such as change-point analysis. The method was applied to Bolivian dog sera processed by ELISA to diagnose T. cruzi infection. In each titer plate, the change-point analysis estimated a step point which correctly discriminated among known positive and known negative sera, unlike some of the six usual cut-off formulas tested. To analyse the ELISA results, the change-point method was as good as the usual cut-off formula of the form "mean + 3 standard deviations of negative controls". Change-point analysis is therefore an efficient alternative method to analyse ELISA absorbance values when no controls are available.
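The two kinds of cut-off can be contrasted in a short sketch. The abstract does not say which change-point algorithm was used, so the version below is an assumption: it sorts the plate's absorbance values and places a single step where the summed squared deviations of the two segments are smallest. The control-based formula uses the sample standard deviation, also an assumption. All names and readings are hypothetical.

```python
from statistics import mean, stdev


def sse(values):
    """Sum of squared deviations from the segment mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def change_point_cutoff(absorbances):
    """Cut-off from a single least-squares step in the sorted absorbances."""
    xs = sorted(absorbances)
    # k is the size of the low (presumed negative) segment.
    best_k = min(range(1, len(xs)), key=lambda k: sse(xs[:k]) + sse(xs[k:]))
    # Place the cut-off midway between the two segments.
    return (xs[best_k - 1] + xs[best_k]) / 2


def control_cutoff(negative_controls):
    """Usual formula: mean + 3 standard deviations of the negative controls."""
    return mean(negative_controls) + 3 * stdev(negative_controls)


# Hypothetical plate readings (optical density), for illustration only.
plate = [0.08, 0.10, 0.09, 0.11, 0.12, 0.95, 1.10, 1.02, 0.10, 0.88]
cutoff = change_point_cutoff(plate)
positives = [v for v in plate if v > cutoff]
```

On these toy readings the step falls between 0.12 and 0.88, so the four high readings are flagged without any control wells being needed.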
Research on facial expression simulation based on depth image
NASA Astrophysics Data System (ADS)
Ding, Sha-sha; Duan, Jin; Zhao, Yi-wu; Xiao, Bo; Wang, Hao
2017-11-01
Nowadays, facial expression simulation is widely used in film and television special effects, human-computer interaction and many other fields. Facial expressions are captured with a Kinect camera. An AAM algorithm based on statistical information is employed to detect and track faces. A 2D regression algorithm is applied to align the feature points: facial feature points are detected automatically, while the feature points of the 3D cartoon model are marked manually. The aligned feature points are mapped by keyframe techniques. To improve the animation effect, non-feature points are interpolated based on empirical models. The mapping and interpolation are completed under the constraint of Bézier curves. Thus the feature points on the cartoon face model can be driven when the facial expression varies. In this way, real-time cartoon face expression simulation is achieved. The experimental results show that the proposed method can accurately simulate the facial expression. Finally, our method is compared with the previous method. Measured data show that the implementation efficiency is greatly improved by our method.