Orthogonal Vector Projection Algorithm for Spectral Unmixing
Song, Mei-ping; Xu, Xing-wei; Chang, Chein-I; An, Ju-bai; Yao, Li
2015-12-01
Spectral unmixing is an important part of hyperspectral technology and is essential for material quantity analysis in hyperspectral imagery. Most linear unmixing algorithms require matrix multiplication together with matrix inversion or determinant computation. These operations are difficult to program and especially hard to realize in hardware, and their computational cost grows significantly as the number of endmembers increases. Here, based on the traditional Orthogonal Subspace Projection algorithm, a new method called Orthogonal Vector Projection is proposed using the orthogonality principle. It simplifies the process by avoiding matrix multiplication and inversion. It first computes the final orthogonal vector for each endmember spectrum via the Gram-Schmidt process. These orthogonal vectors are then used as projection vectors for the pixel signature. The unconstrained abundance is obtained directly by projecting the signature onto the projection vectors and computing the ratio of the projected vector length to the orthogonal vector length. Compared with the Orthogonal Subspace Projection and Least Squares Error algorithms, this method needs no matrix inversion, which is computationally costly and hard to implement in hardware. It completes the orthogonalization process through repeated vector operations, making it well suited to both parallel computation and hardware implementation. The soundness of the algorithm is shown through its relationship with the Orthogonal Subspace Projection and Least Squares Error algorithms, and its computational complexity, the lowest of the three, is compared with theirs. Finally, experimental results on synthetic and real images provide further evidence of the method's effectiveness.
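The projection step described in this abstract lends itself to a short sketch. The following NumPy illustration (not the authors' code) runs Gram-Schmidt with the target endmember last, so the final orthogonal vector is perpendicular to every other endmember, and the unconstrained abundance is the ratio of projected lengths; the array names and test values are assumptions for illustration only.

```python
import numpy as np

def orthogonal_vector_projection(M, x):
    """Unconstrained abundances by orthogonal vector projection.

    M : (L, p) array whose columns are endmember spectra.
    x : (L,) pixel signature.
    """
    L, p = M.shape
    a = np.zeros(p)
    for i in range(p):
        order = [j for j in range(p) if j != i] + [i]   # target endmember last
        basis = []
        for j in order:
            v = M[:, j].astype(float)
            for b in basis:                              # Gram-Schmidt step
                v -= (b @ v) / (b @ b) * b
            basis.append(v)
        q = basis[-1]                     # orthogonal to all other endmembers
        a[i] = (q @ x) / (q @ M[:, i])    # ratio of projected lengths
    return a

# tiny consistency check: exact recovery when x lies in the endmember subspace
rng = np.random.default_rng(0)
M = rng.random((50, 3))
x = M @ np.array([0.2, 0.5, 0.3])
print(orthogonal_vector_projection(M, x))   # ~[0.2, 0.5, 0.3]
```

Because each orthogonal vector is perpendicular to the other endmembers, this ratio reproduces the unconstrained least-squares abundance without forming or inverting any matrix, which is the relationship with the Least Squares Error algorithm claimed above.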
NASA Technical Reports Server (NTRS)
Rutishauser, David
2006-01-01
The motivation for this work comes from an observation that amidst the push for Massively Parallel (MP) solutions to high-end computing problems such as numerical physical simulations, large amounts of legacy code exist that are highly optimized for vector supercomputers. Because re-hosting legacy code often requires a complete re-write of the original code, which can be a very long and expensive effort, this work examines the potential to exploit reconfigurable computing machines in place of a vector supercomputer to implement an essentially unmodified legacy source code. Custom and reconfigurable computing resources could be used to emulate an original application's target platform to the extent required to achieve high performance. To arrive at an architecture that delivers the desired performance subject to limited resources involves solving a multi-variable optimization problem with constraints. Prior research in the area of reconfigurable computing has demonstrated that designing an optimum hardware implementation of a given application under hardware resource constraints is an NP-complete problem. The premise of the approach is that the general issue of applying reconfigurable computing resources to the implementation of an application, maximizing the performance of the computation subject to physical resource constraints, can be made a tractable problem by assuming a computational paradigm, such as vector processing. This research contributes a formulation of the problem and a methodology to design a reconfigurable vector processing implementation of a given application that satisfies a performance metric. A generic, parametric, architectural framework for vector processing implemented in reconfigurable logic is developed as a target for a scheduling/mapping algorithm that maps an input computation to a given instance of the architecture. This algorithm is integrated with an optimization framework to arrive at a specification of the architecture parameters that attempts to minimize execution time, while staying within resource constraints. The flexibility of using a custom reconfigurable implementation is exploited in a unique manner to leverage the lessons learned in vector supercomputer development. The vector processing framework is tailored to the application, with variable parameters that are fixed in traditional vector processing. Benchmark data that demonstrates the functionality and utility of the approach is presented. The benchmark data includes an identified bottleneck in a real case study example vector code, the NASA Langley Terminal Area Simulation System (TASS) application.
GPU Accelerated Vector Median Filter
NASA Technical Reports Server (NTRS)
Aras, Rifat; Shen, Yuzhong
2011-01-01
Noise reduction is an important step for most image processing tasks. For three-channel color images, a widely used technique is the vector median filter, in which the color values of pixels are treated as 3-component vectors. Vector median filters are computationally expensive: for a window size of n x n, each of the n^2 vectors has to be compared with the other n^2 - 1 vectors in terms of distance. General-purpose computation on graphics processing units (GPUs) is the paradigm of utilizing high-performance many-core GPU architectures for computation tasks that are normally handled by CPUs. In this work, NVIDIA's Compute Unified Device Architecture (CUDA) paradigm is used to accelerate vector median filtering, which, to the best of our knowledge, has not been done before. The performance of the GPU-accelerated vector median filter is compared to that of CPU- and MPI-based versions for different image and window sizes. Initial findings of the study showed a 100x performance improvement of the vector median filter implementation on GPUs over CPU implementations, and further speed-up is expected after more extensive optimization of the GPU algorithm.
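As a point of reference for the filter being accelerated, here is a naive CPU sketch of the vector median definition used above (plain NumPy, not the paper's CUDA kernel); the window size and test image are arbitrary choices.

```python
import numpy as np

def vector_median_filter(img, w=3):
    """Vector median filter for an HxWx3 color image (naive CPU reference).

    For every w x w window, the output pixel is the window vector whose sum of
    Euclidean distances to all other vectors in the window is minimal.
    """
    r = w // 2
    padded = np.pad(img.astype(float), ((r, r), (r, r), (0, 0)), mode="edge")
    out = np.empty_like(img, dtype=float)
    H, W = img.shape[:2]
    for y in range(H):
        for x in range(W):
            win = padded[y:y + w, x:x + w].reshape(-1, 3)        # w*w vectors
            d = np.linalg.norm(win[:, None, :] - win[None, :, :], axis=2)
            out[y, x] = win[d.sum(axis=1).argmin()]              # vector median
    return out.astype(img.dtype)

noisy = (np.random.default_rng(1).random((32, 32, 3)) * 255).astype(np.uint8)
print(vector_median_filter(noisy).shape)   # (32, 32, 3)
```

The pairwise distance matrix inside each window is exactly the n^2 x n^2 comparison cost noted in the abstract, which is why the filter parallelizes so well on a GPU.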
Dynamic reduction of dimensions of a document vector in a document search and retrieval system
Jiao, Yu; Potok, Thomas E.
2011-05-03
The method and system of the invention involve processing each new document (20) coming into the system into a document vector (16), and creating a document vector with reduced dimensionality (17) for comparison with the data model (15) without recomputing the data model (15). These operations are carried out by a first computer (11) while a second computer (12) updates the data model (18), which can be comprised of an initial large group of documents (19) and is premised on computing an initial data model (13, 14, 15) to provide a reference point for determining document vectors from documents processed from the data stream (20).
Optical computing and image processing using photorefractive gallium arsenide
NASA Technical Reports Server (NTRS)
Cheng, Li-Jen; Liu, Duncan T. H.
1990-01-01
Recent experimental results on matrix-vector multiplication and multiple four-wave mixing using GaAs are presented. Attention is given to a simple concept of using two overlapping holograms in GaAs to do two matrix-vector multiplication processes operating in parallel with a common input vector. This concept can be used to construct high-speed, high-capacity, reconfigurable interconnection and multiplexing modules, important for optical computing and neural-network applications.
NASA Astrophysics Data System (ADS)
Murni, Bustamam, A.; Ernastuti, Handhika, T.; Kerami, D.
2017-07-01
Matrix-vector multiplication in real-world problems often involves large matrices of arbitrary size. Therefore, parallelization is needed to speed up a calculation process that usually takes a long time. The graph partitioning techniques discussed in previous studies cannot be used for parallelized matrix-vector multiplication with matrices of arbitrary size, because graph partitioning assumes a square, symmetric matrix. Hypergraph partitioning techniques overcome this shortcoming of graph partitioning. This paper addresses the efficient parallelization of matrix-vector multiplication through hypergraph partitioning techniques using CUDA GPU-based parallel computing. CUDA (compute unified device architecture) is a parallel computing platform and programming model that was created by NVIDIA and implemented on the GPU (graphics processing unit).
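To make the work-partitioning idea concrete, here is a minimal sketch that splits a matrix-vector product across workers by row blocks. It is only an illustration under an even-split assumption; an actual hypergraph partitioner would assign rows to balance nonzeros and minimize communication volume, which this sketch does not attempt.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def partitioned_matvec(A, x, parts=4):
    """Dense y = A @ x computed over row blocks, one block per worker.

    Rows are split evenly here; a (hyper)graph partitioner would choose the
    split to balance work and minimize data exchanged between partitions.
    """
    blocks = np.array_split(np.arange(A.shape[0]), parts)
    y = np.empty(A.shape[0])

    def work(rows):
        y[rows] = A[rows] @ x          # each worker fills its own row block

    with ThreadPoolExecutor(max_workers=parts) as pool:
        list(pool.map(work, blocks))
    return y

A = np.random.default_rng(2).random((1000, 800))
x = np.random.default_rng(3).random(800)
assert np.allclose(partitioned_matvec(A, x), A @ x)
```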
An Algorithm for Converting Static Earth Sensor Measurements into Earth Observation Vectors
NASA Technical Reports Server (NTRS)
Harman, R.; Hashmall, Joseph A.; Sedlak, Joseph
2004-01-01
An algorithm has been developed that converts penetration angles reported by Static Earth Sensors (SESs) into Earth observation vectors. This algorithm allows compensation for variation in horizon height, including that caused by Earth oblateness. It also allows pitch and roll to be computed using any number (greater than one) of simultaneous sensor penetration angles, simplifying processing during periods of Sun and Moon interference. The algorithm computes body-frame unit vectors through each SES cluster. It also computes GCI vectors from the spacecraft to the position on the Earth's limb where each cluster detects the limb. These body-frame vectors are used as sensor observation vectors, and the GCI vectors are used as reference vectors in an attitude solution. The attitude, with the unobservable yaw discarded, is iteratively refined to provide the Earth observation vector solution.
NASA Technical Reports Server (NTRS)
1975-01-01
The NASA structural analysis (NASTRAN) computer program is operational on three series of third-generation computers. The problems and difficulties involved in adapting NASTRAN to a fourth-generation computer, namely the Control Data STAR-100, are discussed. The salient features which distinguish the Control Data STAR-100 from third-generation computers are hardware vector processing capability and virtual memory. A feasible method is presented for transferring NASTRAN to the Control Data STAR-100 system while retaining much of the machine-independent code. Basic matrix operations are identified for optimization through vector processing.
NASA Technical Reports Server (NTRS)
Mangalgiri, P. D.; Prabhakaran, R.
1986-01-01
An algorithm for vectorized computation of stiffness matrices of an 8-noded isoparametric hexahedron element for geometric nonlinear analysis was developed. This was used in conjunction with the earlier 2-D program GAMNAS to develop the new program NAS3D for geometric nonlinear analysis. A conventional, modified Newton-Raphson process is used for the nonlinear analysis. New schemes for the computation of stiffness and strain energy release rates are presented. The organization of the program is explained and some results on four sample problems are given. The study of CPU times showed that savings by a factor of 11 to 13 were achieved when vectorized computation was used for the stiffness instead of the conventional scalar one. Finally, the scheme of inputting data is explained.
GaAs Supercomputing: Architecture, Language, And Algorithms For Image Processing
NASA Astrophysics Data System (ADS)
Johl, John T.; Baker, Nick C.
1988-10-01
The application of high-speed GaAs processors in a parallel system matches the demanding computational requirements of image processing. The architecture of the McDonnell Douglas Astronautics Company (MDAC) vector processor is described along with the algorithms and language translator. Most image and signal processing algorithms can utilize parallel processing and show a significant performance improvement over sequential versions. The parallelization performed by this system is within each vector instruction. Since each vector has many elements, each requiring some computation, useful concurrent arithmetic operations can easily be performed. Balancing the memory bandwidth with the computation rate of the processors is an important design consideration for high efficiency and utilization. The architecture features a bus-based execution unit consisting of four to eight 32-bit GaAs RISC microprocessors running at a 200 MHz clock rate for a peak performance of 1.6 BOPS. The execution unit is connected to a vector memory with three buses capable of transferring two input words and one output word every 10 nsec. The address generators inside the vector memory perform different vector addressing modes and feed the data to the execution unit. The functions discussed in this paper include basic MATRIX OPERATIONS, 2-D SPATIAL CONVOLUTION, HISTOGRAM, and FFT. For each of these algorithms, assembly language programs were run on a behavioral model of the system to obtain performance figures.
A low-cost vector processor boosting compute-intensive image processing operations
NASA Technical Reports Server (NTRS)
Adorf, Hans-Martin
1992-01-01
Low-cost vector processing (VP) is within reach of everyone seriously engaged in scientific computing. The advent of affordable add-on VP-boards for standard workstations, complemented by mathematical/statistical libraries, is beginning to impact compute-intensive tasks such as image processing. A case in point is the restoration of distorted images from the Hubble Space Telescope. A low-cost implementation is presented of the standard Tarasko-Richardson-Lucy restoration algorithm on an Intel i860-based VP-board which is seamlessly interfaced to a commercial, interactive image processing system. First experience is reported (including some benchmarks for standalone FFTs) and some conclusions are drawn.
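For readers unfamiliar with the restoration algorithm named above, the following is a minimal Richardson-Lucy sketch in NumPy/SciPy (assuming SciPy's fftconvolve is available); it is a generic reference implementation, not the paper's i860 vector-board code, and the synthetic test data are invented for illustration.

```python
import numpy as np
from scipy.signal import fftconvolve

def richardson_lucy(blurred, psf, iterations=30):
    """Richardson-Lucy deconvolution via the standard multiplicative update."""
    estimate = np.full_like(blurred, blurred.mean(), dtype=float)
    psf_mirror = psf[::-1, ::-1]
    for _ in range(iterations):
        reblurred = fftconvolve(estimate, psf, mode="same")
        ratio = blurred / np.maximum(reblurred, 1e-12)   # avoid division by zero
        estimate *= fftconvolve(ratio, psf_mirror, mode="same")
    return estimate

# usage sketch: deconvolve a synthetically blurred image
rng = np.random.default_rng(0)
truth = rng.random((64, 64))
h = np.hanning(9)
psf = np.outer(h, h); psf /= psf.sum()
blurred = fftconvolve(truth, psf, mode="same")
restored = richardson_lucy(blurred, psf, iterations=50)
```

Each iteration is dominated by two convolutions (here FFT-based), which is why the original work also benchmarked standalone FFTs on the vector board.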
Cheng, Jerome; Hipp, Jason; Monaco, James; Lucas, David R; Madabhushi, Anant; Balis, Ulysses J
2011-01-01
Spatially invariant vector quantization (SIVQ) is a texture and color-based image matching algorithm that queries the image space through the use of ring vectors. In prior studies, the selection of one or more optimal vectors for a particular feature of interest required a manual process, with the user initially stochastically selecting candidate vectors and subsequently testing them upon other regions of the image to verify the vector's sensitivity and specificity properties (typically by reviewing a resultant heat map). In carrying out the prior efforts, the SIVQ algorithm was noted to exhibit highly scalable computational properties, where each region of analysis can take place independently of others, making a compelling case for the exploration of its deployment on high-throughput computing platforms, with the hypothesis that such an exercise will result in performance gains that scale linearly with increasing processor count. An automated process was developed for the selection of optimal ring vectors to serve as the predicate matching operator in defining histopathological features of interest. Briefly, candidate vectors were generated from every possible coordinate origin within a user-defined vector selection area (VSA) and subsequently compared against user-identified positive and negative "ground truth" regions on the same image. Each vector from the VSA was assessed for its goodness-of-fit to both the positive and negative areas via the use of the receiver operating characteristic (ROC) transfer function, with each assessment resulting in an associated area-under-the-curve (AUC) figure of merit. Use of the above-mentioned automated vector selection process was demonstrated in two use cases: first, to identify malignant colonic epithelium, and second, to identify soft tissue sarcoma. For both examples, a very satisfactory optimized vector was identified, as defined by the AUC metric. Finally, as an additional effort directed towards attaining high-throughput capability for the SIVQ algorithm, we demonstrated its successful incorporation with the MATrix LABoratory (MATLAB™) application interface. The SIVQ algorithm is suitable for automated vector selection settings and high-throughput computation.
Timothy G. Wade; James D. Wickham; Maliha S. Nash; Anne C. Neale; Kurt H. Riitters; K. Bruce Jones
2003-01-01
GIS-based measurements that combine native raster and native vector data are commonly used in environmental assessments. Most of these measurements can be calculated using either raster or vector data formats and processing methods. Raster processes are more commonly used because they can be significantly faster computationally...
Vectorized Monte Carlo methods for reactor lattice analysis
NASA Technical Reports Server (NTRS)
Brown, F. B.
1984-01-01
Some of the new computational methods and equivalent mathematical representations of physics models used in the MCV code, a vectorized continuous-energy Monte Carlo code for use on the CYBER-205 computer, are discussed. While the principal application of MCV is the neutronics analysis of repeating reactor lattices, the new methods used in MCV should be generally useful for vectorizing Monte Carlo for other applications. For background, a brief overview of the vector processing features of the CYBER-205 is included, followed by a discussion of the fundamentals of Monte Carlo vectorization. The physics models used in the MCV vectorized Monte Carlo code are then summarized. The new methods used in scattering analysis are presented along with details of several key, highly specialized computational routines. Finally, speedups relative to CDC-7600 scalar Monte Carlo are discussed.
The vectorization of a ray tracing program for image generation
NASA Technical Reports Server (NTRS)
Plunkett, D. J.; Cychosz, J. M.; Bailey, M. J.
1984-01-01
Ray tracing is a widely used method for producing realistic computer-generated images. Ray tracing involves firing an imaginary ray from a view point, through a point on an image plane, into a three-dimensional scene. The intersections of the ray with the objects in the scene determine what is visible at the point on the image plane. This process must be repeated many times, once for each point (commonly called a pixel) in the image plane. A typical image contains more than a million pixels, making this process computationally expensive. A traditional ray tracing program processes one ray at a time. In such a serial approach, as much as ninety percent of the execution time is spent computing the intersections of a ray with the surfaces in the scene. With the CYBER 205, many rays can be intersected with all the bodies in the scene with a single series of vector operations. Vectorization of this intersection process results in large decreases in computation time. The CADLAB's interest in ray tracing stems from the need to produce realistic images of mechanical parts. A high quality image of a part during the design process can increase the productivity of the designer by helping him visualize the results of his work. To be useful in the design process, these images must be produced in a reasonable amount of time. This discussion will explain how the ray tracing process was vectorized and gives examples of the images obtained.
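The intersection step that dominates the serial run time is the natural target for vectorization. The sketch below intersects a batch of rays with a batch of spheres in one set of array operations; it is a NumPy analogue of the idea, not the CYBER 205 code, and the geometry in the usage lines is made up.

```python
import numpy as np

def ray_sphere_hits(origins, dirs, centers, radii):
    """Intersect many rays with many spheres at once.

    origins, dirs : (R, 3) ray origins and unit directions.
    centers, radii: (S, 3) and (S,) sphere parameters.
    Returns an (R, S) array of nearest positive hit distances (inf on a miss).
    """
    oc = origins[:, None, :] - centers[None, :, :]          # (R, S, 3)
    b = np.einsum("rsk,rk->rs", oc, dirs)                   # dot(oc, dir)
    c = np.einsum("rsk,rsk->rs", oc, oc) - radii[None, :] ** 2
    disc = b * b - c
    t = -b - np.sqrt(np.where(disc >= 0, disc, np.nan))     # nearest root
    return np.where((disc >= 0) & (t > 0), t, np.inf)

origins = np.zeros((4, 3))
dirs = np.tile([0.0, 0.0, 1.0], (4, 1))
centers = np.array([[0.0, 0.0, 5.0]])
radii = np.array([1.0])
print(ray_sphere_hits(origins, dirs, centers, radii))        # ~4.0 for each ray
```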
Vectorization with SIMD extensions speeds up reconstruction in electron tomography.
Agulleiro, J I; Garzón, E M; García, I; Fernández, J J
2010-06-01
Electron tomography allows structural studies of cellular structures at molecular detail. Large 3D reconstructions are needed to meet the resolution requirements. The processing time to compute these large volumes may be considerable, so high performance computing techniques have traditionally been used. This work presents a vector approach to tomographic reconstruction that relies on the exploitation of the SIMD extensions available in modern processors in combination with other single-processor optimization techniques. This approach succeeds in producing full resolution tomograms with an important reduction in processing time, as evaluated with the most common reconstruction algorithms, namely WBP and SIRT. The main advantage stems from the fact that this approach runs on standard computers without the need for specialized hardware, which facilitates the development, use and management of programs. Future trends in processor design open excellent opportunities for vector processing with the processor's SIMD extensions in the field of 3D electron microscopy.
NASA Astrophysics Data System (ADS)
Imamura, Seigo; Ono, Kenji; Yokokawa, Mitsuo
2016-07-01
Ensemble computing, which is an instance of capacity computing, is an effective computing scenario for exascale parallel supercomputers. In ensemble computing, there are multiple linear systems associated with a common coefficient matrix. We improve the performance of iterative solvers for multiple right-hand-side vectors by solving them at the same time, that is, by working with matrix-matrix products instead of repeated matrix-vector products. We implemented several iterative methods and compared their performance. The maximum performance on Sparc VIIIfx was 7.6 times higher than that of a naïve implementation. Finally, to deal with the different convergence behavior of the linear systems, we introduced a control method that eliminates the calculation of already converged vectors.
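The sketch below illustrates the two ideas in the last two sentences with a simple stationary iteration: all right-hand sides are updated together as a matrix, and columns whose residual is already small are frozen. Jacobi is used only because it is compact; the paper's solvers and the convergence-control details are not reproduced here.

```python
import numpy as np

def jacobi_multi_rhs(A, B, tol=1e-10, max_iter=500):
    """Jacobi iteration applied to all right-hand sides (columns of B) at once.

    Columns whose residual has dropped below tol are frozen, mimicking a
    control method that skips already converged vectors.
    """
    D = np.diag(A)
    R = A - np.diag(D)
    X = np.zeros_like(B, dtype=float)
    active = np.ones(B.shape[1], dtype=bool)
    for _ in range(max_iter):
        if not active.any():
            break
        X[:, active] = (B[:, active] - R @ X[:, active]) / D[:, None]
        residual = np.linalg.norm(B - A @ X, axis=0)
        active = residual > tol
    return X

# diagonally dominant test system with 8 right-hand sides
rng = np.random.default_rng(4)
A = rng.random((50, 50)) + 50 * np.eye(50)
B = rng.random((50, 8))
X = jacobi_multi_rhs(A, B)
print(np.max(np.abs(A @ X - B)))   # small residual for all columns
```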
Optoelectronic Inner-Product Neural Associative Memory
NASA Technical Reports Server (NTRS)
Liu, Hua-Kuang
1993-01-01
Optoelectronic apparatus acts as an artificial neural network performing associative recall of binary images. The recall process is an iterative one involving optical computation of inner products between a binary input vector and one or more reference binary vectors in memory. The inner-product method requires far less memory space than the matrix-vector method.
Real time display Fourier-domain OCT using multi-thread parallel computing with data vectorization
NASA Astrophysics Data System (ADS)
Eom, Tae Joong; Kim, Hoon Seop; Kim, Chul Min; Lee, Yeung Lak; Choi, Eun-Seo
2011-03-01
We demonstrate a real-time display of processed OCT images using multi-thread parallel computing with a quad-core CPU of a personal computer. The data of each A-line are treated as one vector to maximize the data transfer rate between the CPU cores and the image data stored in RAM. A display rate of 29.9 frames/sec for processed OCT data (4096 FFT size x 500 A-scans) is achieved in our system using a wavelength-swept source with a 52-kHz sweep frequency. The data processing times for the OCT image and for a Doppler OCT image with a four-fold average are 23.8 msec and 91.4 msec, respectively.
Spiking Neural P Systems With Rules on Synapses Working in Maximum Spiking Strategy.
Tao Song; Linqiang Pan
2015-06-01
Spiking neural P systems (called SN P systems for short) are a class of parallel and distributed neural-like computation models inspired by the way neurons process information and communicate with each other by means of impulses or spikes. In this work, we introduce a new variant of SN P systems, called SN P systems with rules on synapses working in maximum spiking strategy, and investigate the computation power of the systems as both number and vector generators. Specifically, we prove that: i) if no limit is imposed on the number of spikes in any neuron during any computation, such systems can generate the sets of Turing-computable natural numbers and the sets of vectors of positive integers computed by k-output register machines; ii) if an upper bound is imposed on the number of spikes in each neuron during any computation, such systems characterize the semi-linear sets of natural numbers as number generating devices; as vector generating devices, such systems can only characterize the family of sets of vectors computed by sequential monotonic counter machines, which is strictly included in the family of semi-linear sets of vectors. This gives a positive answer to the problem formulated in Song et al., Theor. Comput. Sci., vol. 529, pp. 82-95, 2014.
Monitoring by Use of Clusters of Sensor-Data Vectors
NASA Technical Reports Server (NTRS)
Iverson, David L.
2007-01-01
The inductive monitoring system (IMS) is a system of computer hardware and software for automated monitoring of the performance, operational condition, physical integrity, and other aspects of the health of a complex engineering system (e.g., an industrial process line or a spacecraft). The input to the IMS consists of streams of digitized readings from sensors in the monitored system. The IMS determines the type and amount of any deviation of the monitored system from a nominal or normal ("healthy") condition on the basis of a comparison between (1) vectors constructed from the incoming sensor data and (2) corresponding vectors in a database of nominal or normal behavior. The term inductive reflects the use of a process reminiscent of traditional mathematical induction to learn about normal operation and build the nominal-condition database. The IMS offers two major advantages over prior computational monitoring systems: The computational burden of the IMS is significantly smaller, and there is no need for abnormal-condition sensor data for training the IMS to recognize abnormal conditions. The figure schematically depicts the relationships among the computational processes effected by the IMS. Training sensor data are gathered during normal operation of the monitored system, detailed computational simulation of operation of the monitored system, or both. The training data are formed into vectors that are used to generate the database. The vectors in the database are clustered into regions that represent normal or nominal operation. Once the database has been generated, the IMS compares the vectors of incoming sensor data with vectors representative of the clusters. The monitored system is deemed to be operating normally or abnormally, depending on whether the vector of incoming sensor data is or is not, respectively, sufficiently close to one of the clusters. For this purpose, a distance between two vectors is calculated by a suitable metric (e.g., Euclidean distance) and "sufficiently close" signifies lying at a distance less than a specified threshold value. It must be emphasized that although the IMS is intended to detect off-nominal or abnormal performance or health, it is not necessarily capable of performing a thorough or detailed diagnosis. Limited diagnostic information may be available under some circumstances. For example, the distance of a vector of incoming sensor data from the nearest cluster could serve as an indication of the severity of a malfunction. The identity of the nearest cluster may be a clue as to the identity of the malfunctioning component or subsystem. It is possible to decrease the IMS computation time by use of a combination of cluster-indexing and -retrieval methods. For example, in one method, the distances between each cluster and two or more reference vectors can be used for the purpose of indexing and retrieval. The clusters are sorted into a list according to these distance values, typically in ascending order of distance. When a set of input data arrives and is to be tested, the data are first arranged as an ordered set (that is, a vector). The distances from the input vector to the reference points are computed. The search of clusters from the list can then be limited to those clusters lying within a certain distance range from the input vector; the computation time is reduced by not searching the clusters at a greater distance.
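The core decision rule described above reduces to a nearest-cluster distance test against a threshold. The sketch below uses k-means centroids as a stand-in for the IMS cluster database (the actual IMS builds hyper-rectangular clusters by a different procedure); the cluster count, threshold, and test vectors are assumptions.

```python
import numpy as np

def build_clusters(training_vectors, k=16, iters=50, seed=0):
    """Tiny k-means stand-in for a nominal-condition cluster database."""
    rng = np.random.default_rng(seed)
    centers = training_vectors[rng.choice(len(training_vectors), k, replace=False)].astype(float)
    for _ in range(iters):
        d = np.linalg.norm(training_vectors[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = training_vectors[labels == j].mean(axis=0)
    return centers

def monitor(sample, centers, threshold):
    """Return (is_nominal, distance to nearest cluster) for one sensor vector."""
    dist = float(np.min(np.linalg.norm(centers - sample, axis=1)))
    return bool(dist < threshold), dist

nominal = np.random.default_rng(1).normal(0.0, 1.0, size=(500, 6))   # training data
centers = build_clusters(nominal)
print(monitor(nominal[0], centers, threshold=3.0))       # small distance: likely nominal
print(monitor(np.full(6, 8.0), centers, threshold=3.0))  # large distance: flagged abnormal
```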
Virtual head rotation reveals a process of route reconstruction from human vestibular signals
Day, Brian L; Fitzpatrick, Richard C
2005-01-01
The vestibular organs can feed perceptual processes that build a picture of our route as we move about in the world. However, raw vestibular signals do not define the path taken because, during travel, the head can undergo accelerations unrelated to the route and also be orientated in any direction to vary the signal. This study investigated the computational process by which the brain transforms raw vestibular signals for the purpose of route reconstruction. We electrically stimulated the vestibular nerves of human subjects to evoke a virtual head rotation fixed in skull co-ordinates and measure its perceptual effect. The virtual head rotation caused subjects to perceive an illusory whole-body rotation that was a cyclic function of head-pitch angle. They perceived whole-body yaw rotation in one direction with the head pitched forwards, the opposite direction with the head pitched backwards, and no rotation with the head in an intermediate position. A model based on vector operations and the anatomy and firing properties of semicircular canals precisely predicted these perceptions. In effect, a neural process computes the vector dot product between the craniocentric vestibular vector of head rotation and the gravitational unit vector. This computation yields the signal of body rotation in the horizontal plane that feeds our perception of the route travelled. PMID:16002439
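The dot-product computation identified in this study is small enough to state directly. The sketch below evaluates it for a virtual rotation about a fixed skull axis at several head-pitch angles, reproducing the sign reversal described above; the axis conventions and numbers are illustrative assumptions, not the study's parameters.

```python
import numpy as np

def perceived_yaw(omega_head, gravity_unit_head):
    """Body-yaw signal as the dot product of the craniocentric rotation vector
    with the gravitational unit vector, both expressed in head coordinates."""
    return float(np.dot(omega_head, gravity_unit_head))

omega = np.array([0.2, 0.0, 0.0])       # virtual rotation about an assumed fixed skull axis
for pitch in (+0.5, 0.0, -0.5):         # head pitched forward, level, backward (radians)
    g_head = np.array([np.sin(pitch), 0.0, np.cos(pitch)])
    print(pitch, perceived_yaw(omega, g_head))   # positive, zero, negative yaw signal
```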
Vector computer memory bank contention
NASA Technical Reports Server (NTRS)
Bailey, D. H.
1985-01-01
A number of vector supercomputers feature very large memories. Unfortunately the large capacity memory chips that are used in these computers are much slower than the fast central processing unit (CPU) circuitry. As a result, memory bank reservation times (in CPU ticks) are much longer than on previous generations of computers. A consequence of these long reservation times is that memory bank contention is sharply increased, resulting in significantly lowered performance rates. The phenomenon of memory bank contention in vector computers is analyzed using both a Markov chain model and a Monte Carlo simulation program. The results of this analysis indicate that future generations of supercomputers must either employ much faster memory chips or else feature very large numbers of independent memory banks.
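A highly simplified single-stream version of the contention effect analyzed here can be simulated in a few lines. This sketch is not the paper's Markov chain or Monte Carlo model; the bank count, reservation time, and strides are arbitrary, and it only shows how reuse of a still-reserved bank inflates the average ticks per access.

```python
import numpy as np

def simulate_bank_contention(n_banks=16, reservation=8, n_accesses=100_000,
                             stride=1, seed=0):
    """Average ticks per access for a strided stream with bank reservation times.

    One word is requested per CPU tick; a request to a bank that is still
    reserved stalls the stream until that bank frees up.
    """
    rng = np.random.default_rng(seed)
    free_at = np.zeros(n_banks)            # tick at which each bank becomes free
    t = 0.0
    addr = int(rng.integers(0, n_banks))   # random starting address
    for _ in range(n_accesses):
        bank = addr % n_banks
        t = max(t + 1, free_at[bank] + 1)  # stall if the bank is still reserved
        free_at[bank] = t + reservation - 1
        addr += stride
    return t / n_accesses                  # 1.0 means no contention

for s in (1, 2, 8, 16):
    print(f"stride {s:2d}: {simulate_bank_contention(stride=s):.2f} ticks/access")
```

With 16 banks and an 8-tick reservation, stride-1 access sees no contention, while strides that map onto few banks approach one full reservation time per access, which is the performance loss the abstract describes.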
Vector computer memory bank contention
NASA Technical Reports Server (NTRS)
Bailey, David H.
1987-01-01
A number of vector supercomputers feature very large memories. Unfortunately the large capacity memory chips that are used in these computers are much slower than the fast central processing unit (CPU) circuitry. As a result, memory bank reservation times (in CPU ticks) are much longer than on previous generations of computers. A consequence of these long reservation times is that memory bank contention is sharply increased, resulting in significantly lowered performance rates. The phenomenon of memory bank contention in vector computers is analyzed using both a Markov chain model and a Monte Carlo simulation program. The results of this analysis indicate that future generations of supercomputers must either employ much faster memory chips or else feature very large numbers of independent memory banks.
A Neurocomputational Model of Goal-Directed Navigation in Insect-Inspired Artificial Agents
Goldschmidt, Dennis; Manoonpong, Poramate; Dasgupta, Sakyasingha
2017-01-01
Despite their small size, insect brains are able to produce robust and efficient navigation in complex environments. Specifically in social insects, such as ants and bees, these navigational capabilities are guided by orientation directing vectors generated by a process called path integration. During this process, they integrate compass and odometric cues to estimate their current location as a vector, called the home vector for guiding them back home on a straight path. They further acquire and retrieve path integration-based vector memories globally to the nest or based on visual landmarks. Although existing computational models reproduced similar behaviors, a neurocomputational model of vector navigation including the acquisition of vector representations has not been described before. Here we present a model of neural mechanisms in a modular closed-loop control—enabling vector navigation in artificial agents. The model consists of a path integration mechanism, reward-modulated global learning, random search, and action selection. The path integration mechanism integrates compass and odometric cues to compute a vectorial representation of the agent's current location as neural activity patterns in circular arrays. A reward-modulated learning rule enables the acquisition of vector memories by associating the local food reward with the path integration state. A motor output is computed based on the combination of vector memories and random exploration. In simulation, we show that the neural mechanisms enable robust homing and localization, even in the presence of external sensory noise. The proposed learning rules lead to goal-directed navigation and route formation performed under realistic conditions. Consequently, we provide a novel approach for vector learning and navigation in a simulated, situated agent linking behavioral observations to their possible underlying neural substrates. PMID:28446872
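The path-integration idea at the center of this model can be summarized in a few lines: compass and odometric cues are accumulated into a net displacement, and the home vector is its negation. The sketch below is a Cartesian stand-in for illustration; the paper encodes the same quantity as neural activity patterns in circular arrays, which is not reproduced here.

```python
import numpy as np

def path_integrate(headings, step_lengths):
    """Accumulate compass + odometry cues into a home vector.

    headings     : heading angle (radians) for each step, from a compass cue.
    step_lengths : distance covered in each step, from an odometric cue.
    Returns the vector pointing from the current location back to the nest.
    """
    steps = np.stack([np.cos(headings), np.sin(headings)], axis=1) * step_lengths[:, None]
    position = steps.sum(axis=0)
    return -position                      # home vector = negative net displacement

rng = np.random.default_rng(5)
headings = rng.uniform(0, 2 * np.pi, size=200)
lengths = rng.uniform(0.5, 1.5, size=200)
home = path_integrate(headings, lengths)
print(home, np.linalg.norm(home))        # direction and distance back to the nest
```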
Realistic Covariance Prediction for the Earth Science Constellation
NASA Technical Reports Server (NTRS)
Duncan, Matthew; Long, Anne
2006-01-01
Routine satellite operations for the Earth Science Constellation (ESC) include collision risk assessment between members of the constellation and other orbiting space objects. One component of the risk assessment process is computing the collision probability between two space objects. The collision probability is computed using Monte Carlo techniques as well as by numerically integrating relative state probability density functions. Each algorithm takes as inputs state vector and state vector uncertainty information for both objects. The state vector uncertainty information is expressed in terms of a covariance matrix. The collision probability computation is only as good as the inputs. Therefore, to obtain a collision calculation that is a useful decision-making metric, realistic covariance matrices must be used as inputs to the calculation. This paper describes the process used by the NASA/Goddard Space Flight Center's Earth Science Mission Operations Project to generate realistic covariance predictions for three of the Earth Science Constellation satellites: Aqua, Aura and Terra.
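The Monte Carlo component of the risk assessment described above can be illustrated with a toy calculation: sample both states from their covariances and count how often the separation falls below a combined hard-body radius. This is a simplified sketch; the operational process also integrates relative-state probability density functions and uses full 6-D states, and all numbers below are invented.

```python
import numpy as np

def mc_collision_probability(x1, P1, x2, P2, hard_body_radius, n=200_000, seed=0):
    """Monte Carlo collision probability from two position vectors and covariances.

    x1, x2 : mean position vectors (e.g., at closest approach), km.
    P1, P2 : corresponding 3x3 position covariance matrices, km^2.
    """
    rng = np.random.default_rng(seed)
    s1 = rng.multivariate_normal(x1, P1, size=n)
    s2 = rng.multivariate_normal(x2, P2, size=n)
    miss = np.linalg.norm(s1 - s2, axis=1)
    return float(np.mean(miss < hard_body_radius))

x1 = np.array([7000.0, 0.0, 0.0])
x2 = x1 + np.array([0.05, 0.0, 0.0])          # 50 m nominal miss distance
P = np.diag([0.02, 0.02, 0.02]) ** 2          # 20 m (1-sigma) per axis
print(mc_collision_probability(x1, P, x2, P, hard_body_radius=0.02))
```

The point made in the abstract follows directly: the answer is only as good as P1 and P2, so realistic covariance prediction is what makes the probability usable as a decision metric.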
NASA Technical Reports Server (NTRS)
Noor, Ahmed K.; Peters, Jeanne M.
1989-01-01
A computational procedure is presented for the nonlinear dynamic analysis of unsymmetric structures on vector multiprocessor systems. The procedure is based on a novel hierarchical partitioning strategy in which the response of the unsymmetric structure is approximated by a combination of symmetric and antisymmetric response vectors (modes), each obtained by using only a fraction of the degrees of freedom of the original finite element model. The three key elements of the procedure which result in a high degree of concurrency throughout the solution process are: (1) a mixed (or primitive variable) formulation with independent shape functions for the different fields; (2) operator splitting or restructuring of the discrete equations at each time step to delineate the symmetric and antisymmetric vectors constituting the response; and (3) a two-level iterative process for generating the response of the structure. An assessment is made of the effectiveness of the procedure on the CRAY X-MP/4 computers.
Guidelines for developing vectorizable computer programs
NASA Technical Reports Server (NTRS)
Miner, E. W.
1982-01-01
Some fundamental principles for developing computer programs which are compatible with array-oriented computers are presented. The emphasis is on basic techniques for structuring computer codes which are applicable in FORTRAN and do not require a special programming language or exact a significant penalty on a scalar computer. Researchers who are using numerical techniques to solve problems in engineering can apply these basic principles and thus develop transportable computer programs (in FORTRAN) which contain much vectorizable code. The vector architecture of the ASC is discussed so that the requirements of array processing can be better appreciated. The "vectorization" of a finite-difference viscous shock-layer code is used as an example to illustrate the benefits and some of the difficulties involved. Increases in computing speed with vectorization are illustrated with results from the viscous shock-layer code and from a finite-element shock tube code. The applicability of these principles was substantiated through running programs on other computers with array-associated computing characteristics, such as the Hewlett-Packard (H-P) 1000-F.
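The structuring principle in these guidelines is that loops whose iterations are independent vectorize, while true recurrences do not. The report's context is FORTRAN on the ASC; the sketch below only illustrates the same distinction in NumPy terms, with arbitrary array sizes.

```python
import numpy as np
import time

n = 200_000
a = np.random.default_rng(6).random(n)
b = np.random.default_rng(7).random(n)

# scalar-style loop: every iteration is independent, so it is vectorizable
t0 = time.perf_counter()
c_loop = np.empty(n)
for i in range(n):
    c_loop[i] = 2.0 * a[i] + b[i]
t_loop = time.perf_counter() - t0

# the same computation expressed as one array (vector) operation
t0 = time.perf_counter()
c_vec = 2.0 * a + b
t_vec = time.perf_counter() - t0

assert np.allclose(c_loop, c_vec)
print(f"loop {t_loop:.3f}s  vectorized {t_vec:.3f}s")

# by contrast, a recurrence such as c[i] = c[i-1] + a[i] depends on the
# previous result and must be restructured (e.g., as a prefix sum) to vectorize
```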
Solution of a large hydrodynamic problem using the STAR-100 computer
NASA Technical Reports Server (NTRS)
Weilmuenster, K. J.; Howser, L. M.
1976-01-01
A representative hydrodynamics problem, the shock initiated flow over a flat plate, was used for exploring data organizations and program structures needed to exploit the STAR-100 vector processing computer. A brief description of the problem is followed by a discussion of how each portion of the computational process was vectorized. Finally, timings of different portions of the program are compared with equivalent operations on serial machines. The speed up of the STAR-100 over the CDC 6600 program is shown to increase as the problem size increases. All computations were carried out on a CDC 6600 and a CDC STAR 100, with code written in FORTRAN for the 6600 and in STAR FORTRAN for the STAR 100.
Peuquet, D.J.
1981-01-01
Current graphic devices suitable for high-speed computer input and output of cartographic data are tending more and more to be raster-oriented, such as the rotating drum scanner and the color raster display. However, the majority of commonly used manipulative techniques in computer-assisted cartography and automated spatial data handling continue to require that the data be in vector format. This situation has recently precipitated the requirement for very fast techniques for converting digital cartographic data from raster to vector format for processing, and then back into raster format for plotting. The current article is part 1 of a 2 part paper concerned with examining the state-of-the-art in these conversion techniques. -from Author
Method and apparatus for optimized processing of sparse matrices
Taylor, Valerie E.
1993-01-01
A computer architecture for processing a sparse matrix is disclosed. The apparatus stores a value-row vector corresponding to nonzero values of a sparse matrix. Each of the nonzero values is located at a defined row and column position in the matrix. The value-row vector includes a first vector including nonzero values and delimiting characters indicating a transition from one column to another. The value-row vector also includes a second vector which defines row position values in the matrix corresponding to the nonzero values in the first vector and column position values in the matrix corresponding to the column position of the nonzero values in the first vector. The architecture also includes a circuit for detecting a special character within the value-row vector. Matrix-vector multiplication is executed on the value-row vector. This multiplication is performed by multiplying an index value of the first vector value by a column value from a second matrix to form a matrix-vector product which is added to a previous matrix-vector product.
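The following is one possible software reading of the value-row encoding described in this patent abstract: nonzero values are streamed column by column with a special delimiting character marking each column transition, and a parallel stream carries the row positions. The sentinel choice and the details of the matrix-vector loop are assumptions for illustration, not the patented hardware design.

```python
import numpy as np

SENTINEL = None   # stands in for the patent's special delimiting character

def build_value_row(A):
    """Column-ordered (value, row) streams with a delimiter at each column change."""
    values, rows = [], []
    for j in range(A.shape[1]):
        for i in np.nonzero(A[:, j])[0]:
            values.append(A[i, j])
            rows.append(int(i))
        values.append(SENTINEL)
        rows.append(-1)                    # column transition marker
    return values, rows

def value_row_matvec(values, rows, x, n_rows):
    """y = A @ x driven purely by the value-row stream."""
    y = np.zeros(n_rows)
    col = 0
    for v, r in zip(values, rows):
        if v is SENTINEL:                  # detected the special character
            col += 1
        else:
            y[r] += v * x[col]             # multiply value by the matching x entry
    return y

A = np.array([[4.0, 0.0, 0.0], [0.0, 0.0, 5.0], [1.0, 2.0, 0.0]])
vals, rws = build_value_row(A)
print(value_row_matvec(vals, rws, np.array([1.0, 2.0, 3.0]), 3))   # [4. 15. 5.]
```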
Super and parallel computers and their impact on civil engineering
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kamat, M.P.
1986-01-01
This book presents the papers given at a conference on the use of supercomputers in civil engineering. Topics considered at the conference included solving nonlinear equations on a hypercube, a custom architectured parallel processing system, distributed data processing, algorithms, computer architecture, parallel processing, vector processing, computerized simulation, and cost benefit analysis.
Large Electroweak Corrections to Vector-Boson Scattering at the Large Hadron Collider.
Biedermann, Benedikt; Denner, Ansgar; Pellen, Mathieu
2017-06-30
For the first time full next-to-leading-order electroweak corrections to off-shell vector-boson scattering are presented. The computation features the complete matrix elements, including all nonresonant and off-shell contributions, to the electroweak process pp→μ^{+}ν_{μ}e^{+}ν_{e}jj and is fully differential. We find surprisingly large corrections, reaching -16% for the fiducial cross section, as an intrinsic feature of the vector-boson-scattering processes. We elucidate the origin of these large electroweak corrections upon using the double-pole approximation and the effective vector-boson approximation along with leading-logarithmic corrections.
NASA Astrophysics Data System (ADS)
Mori, Kensaku; Suenaga, Yasuhito; Toriwaki, Jun-ichiro
2003-05-01
This paper describes a software-based fast volume rendering (VolR) method on a PC platform using multimedia instructions, such as SIMD instructions, which are currently available in PC CPUs. The method achieves fast rendering speed through highly optimized software rather than an improved rendering algorithm. In volume rendering using a ray casting method, the system requires fast execution of the following processes: (a) interpolation of voxel or color values at sample points, (b) computation of normal vectors (gray-level gradient vectors), (c) calculation of shaded values obtained by dot products of normal vectors and light source direction vectors, (d) memory access to a huge area, and (e) efficient ray skipping at translucent regions. The proposed software implements these fundamental processes in volume rendering by using special instruction sets for multimedia processing. The proposed software can generate virtual endoscopic images of a 3-D volume of 512x512x489 voxel size by volume rendering with perspective projection, specular reflection, and on-the-fly normal vector computation on a conventional PC without any special hardware, at thirteen frames per second. Semi-translucent display is also possible.
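Steps (b) and (c) above, gradient-based normals and the shading dot product, are easy to sketch. The version below computes dense per-voxel diffuse shading with central differences; the paper instead evaluates gradients on the fly at ray sample points and adds specular reflection, so this is only a NumPy illustration with an arbitrary test volume.

```python
import numpy as np

def gradient_normals(volume):
    """Per-voxel unit normals from central-difference gray-level gradients."""
    gx, gy, gz = np.gradient(volume.astype(float))
    g = np.stack([gx, gy, gz], axis=-1)
    norm = np.linalg.norm(g, axis=-1, keepdims=True)
    return g / np.maximum(norm, 1e-12)

def diffuse_shade(volume, light_dir):
    """Lambertian term N . L for every voxel, clamped at zero."""
    n = gradient_normals(volume)
    l = np.asarray(light_dir, float)
    l /= np.linalg.norm(l)
    return np.clip(np.einsum("xyzk,k->xyz", n, l), 0.0, None)

vol = np.random.default_rng(8).random((32, 32, 32))
print(diffuse_shade(vol, [0.0, 0.0, 1.0]).shape)   # (32, 32, 32)
```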
A parallel Jacobson-Oksman optimization algorithm. [parallel processing (computers)
NASA Technical Reports Server (NTRS)
Straeter, T. A.; Markos, A. T.
1975-01-01
A gradient-dependent optimization technique which exploits the vector-streaming or parallel-computing capabilities of some modern computers is presented. The algorithm, derived by assuming that the function to be minimized is homogeneous, is a modification of the Jacobson-Oksman serial minimization method. In addition to describing the algorithm, conditions ensuring the convergence of the iterates of the algorithm and the results of numerical experiments on a group of sample test functions are presented. The results of these experiments indicate that this algorithm will solve optimization problems in less computing time than conventional serial methods on machines having vector-streaming or parallel-computing capabilities.
Properties of Vector Preisach Models
NASA Technical Reports Server (NTRS)
Kahler, Gary R.; Patel, Umesh D.; Torre, Edward Della
2004-01-01
This paper discusses rotational anisotropy and rotational accommodation of magnetic particle tape. These effects have a performance impact during the reading and writing of the recording process. We introduce the reduced vector model as the basis for the computations. Rotational magnetization models must accurately compute the anisotropic characteristics of ellipsoidally magnetizable media. An ellipticity factor is derived for these media that computes the two-dimensional magnetization trajectory for all applied fields. An orientation correction must be applied to the computed rotational magnetization. For isotropic materials, an orientation correction has been developed and presented. For anisotropic materials, an orientation correction is introduced.
A k-Vector Approach to Sampling, Interpolation, and Approximation
NASA Astrophysics Data System (ADS)
Mortari, Daniele; Rogers, Jonathan
2013-12-01
The k-vector search technique is a method designed to perform extremely fast range searching of large databases at computational cost independent of the size of the database. k-vector search algorithms have historically found application in satellite star-tracker navigation systems which index very large star catalogues repeatedly in the process of attitude estimation. Recently, the k-vector search algorithm has been applied to numerous other problem areas including non-uniform random variate sampling, interpolation of 1-D or 2-D tables, nonlinear function inversion, and solution of systems of nonlinear equations. This paper presents algorithms in which the k-vector search technique is used to solve each of these problems in a computationally-efficient manner. In instances where these tasks must be performed repeatedly on a static (or nearly-static) data set, the proposed k-vector-based algorithms offer an extremely fast solution technique that outperforms standard methods.
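A simplified flavor of the k-vector idea can be sketched as follows: sort the data once, build an index vector against a straight-line "ruler" over the value range, and answer range queries by two constant-time lookups plus a small edge trim. The construction details below (the epsilon padding and the two searchsorted index arrays) are simplifications of the published technique, and the test data are arbitrary.

```python
import numpy as np

class KVector:
    """Simplified k-vector range search over a static 1-D data set."""

    def __init__(self, data):
        self.sorted = np.sort(np.asarray(data, float))
        n = len(self.sorted)
        eps = 1e-9 * (self.sorted[-1] - self.sorted[0] + 1.0)
        self.q = self.sorted[0] - eps
        self.m = (self.sorted[-1] + eps - self.q) / (n - 1)
        ruler = self.q + self.m * np.arange(n)                 # straight-line ruler
        self.k_lo = np.searchsorted(self.sorted, ruler, side="left")
        self.k_hi = np.searchsorted(self.sorted, ruler, side="right")

    def range(self, lo, hi):
        """Return all stored values v with lo <= v <= hi (cost independent of n)."""
        n = len(self.sorted)
        i_lo = int(np.clip(np.floor((lo - self.q) / self.m), 0, n - 1))
        i_hi = int(np.clip(np.ceil((hi - self.q) / self.m), 0, n - 1))
        block = self.sorted[self.k_lo[i_lo]:self.k_hi[i_hi]]   # small candidate block
        return block[(block >= lo) & (block <= hi)]            # trim the two ends

kv = KVector(np.random.default_rng(9).random(10_000))
print(kv.range(0.25, 0.251).size)   # roughly 10 hits for uniform data
```

The preprocessing is done once for the static database (a star catalogue, a lookup table), after which each query touches only the candidate block, which is the property exploited by the applications listed above.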
Hypercluster - Parallel processing for computational mechanics
NASA Technical Reports Server (NTRS)
Blech, Richard A.
1988-01-01
An account is given of the development status, performance capabilities and implications for further development of NASA-Lewis' testbed 'hypercluster' parallel computer network, in which multiple processors communicate through a shared memory. Processors have local as well as shared memory; the hypercluster is expanded in the same manner as the hypercube, with processor clusters replacing the normal single processor node. The NASA-Lewis machine has three nodes with a vector personality and one node with a scalar personality. Each of the vector nodes uses four board-level vector processors, while the scalar node uses four general-purpose microcomputer boards.
Signal processing applications of massively parallel charge domain computing devices
NASA Technical Reports Server (NTRS)
Fijany, Amir (Inventor); Barhen, Jacob (Inventor); Toomarian, Nikzad (Inventor)
1999-01-01
The present invention is embodied in a charge coupled device (CCD)/charge injection device (CID) architecture capable of performing a Fourier transform by simultaneous matrix vector multiplication (MVM) operations in respective plural CCD/CID arrays in parallel in O(1) steps. For example, in one embodiment, a first CCD/CID array stores charge packets representing a first matrix operator based upon permutations of a Hartley transform and computes the Fourier transform of an incoming vector. A second CCD/CID array stores charge packets representing a second matrix operator based upon different permutations of a Hartley transform and computes the Fourier transform of an incoming vector. The incoming vector is applied to the inputs of the two CCD/CID arrays simultaneously, and the real and imaginary parts of the Fourier transform are produced simultaneously in the time required to perform a single MVM operation in a CCD/CID array.
Multiscale vector fields for image pattern recognition
NASA Technical Reports Server (NTRS)
Low, Kah-Chan; Coggins, James M.
1990-01-01
A uniform processing framework for low-level vision computing in which a bank of spatial filters maps the image intensity structure at each pixel into an abstract feature space is proposed. Some properties of the filters and the feature space are described. Local orientation is measured by a vector sum in the feature space as follows: each filter's preferred orientation along with the strength of the filter's output determine the orientation and the length of a vector in the feature space; the vectors for all filters are summed to yield a resultant vector for a particular pixel and scale. The orientation of the resultant vector indicates the local orientation, and the magnitude of the vector indicates the strength of the local orientation preference. Limitations of the vector sum method are discussed. Investigations show that the processing framework provides a useful, redundant representation of image structure across orientation and scale.
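The vector-sum step described above can be written compactly. One caveat: the sketch doubles the filter angles before summing, a standard device so that orientations 180 degrees apart reinforce rather than cancel; the abstract does not state how the original work handled this, so treat that detail and the example responses as assumptions.

```python
import numpy as np

def orientation_vector_sum(filter_orientations, filter_responses):
    """Combine oriented-filter outputs into a local orientation estimate.

    Each filter contributes a vector at angle 2*theta with length equal to its
    response; the resultant's angle gives the dominant local orientation and
    its magnitude the strength of the orientation preference.
    """
    angles = 2.0 * np.asarray(filter_orientations, float)
    resp = np.asarray(filter_responses, float)
    vx = np.sum(resp * np.cos(angles))
    vy = np.sum(resp * np.sin(angles))
    orientation = 0.5 * np.arctan2(vy, vx)     # map back to the 0..pi range
    strength = np.hypot(vx, vy)
    return orientation, strength

# four filters at 0, 45, 90, 135 degrees; the 45-degree filter responds most
thetas = np.deg2rad([0, 45, 90, 135])
orientation, strength = orientation_vector_sum(thetas, [0.2, 1.0, 0.3, 0.1])
print(np.rad2deg(orientation), strength)       # close to 45 degrees
```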
Rational calculation accuracy in acousto-optical matrix-vector processor
NASA Astrophysics Data System (ADS)
Oparin, V. V.; Tigin, Dmitry V.
1994-01-01
The high speed of parallel computations for a comparatively small-size processor and acceptable power consumption makes the usage of acousto-optic matrix-vector multiplier (AOMVM) attractive for processing of large amounts of information in real time. The limited accuracy of computations is an essential disadvantage of such a processor. The reduced accuracy requirements allow for considerable simplification of the AOMVM architecture and the reduction of the demands on its components.
Controlling Flexible Robot Arms Using High Speed Dynamics Process
NASA Technical Reports Server (NTRS)
Jain, Abhinandan (Inventor)
1996-01-01
A robot manipulator controller for a flexible manipulator arm having plural bodies connected at respective movable hinges and flexible in plural deformation modes corresponding to respective modal spatial influence vectors relating deformations of plural spaced nodes of respective bodies to the plural deformation modes, operates by computing articulated body quantities for each of the bodies from respective modal spatial influence vectors, obtaining specified body forces for each of the bodies, and computing modal deformation accelerations of the nodes and hinge accelerations of the hinges from the specified body forces, from the articulated body quantities and from the modal spatial influence vectors. In one embodiment of the invention, the controller further operates by comparing the accelerations thus computed to desired manipulator motion to determine a motion discrepancy, and correcting the specified body forces so as to reduce the motion discrepancy. The manipulator bodies and hinges are characterized by respective vectors of deformation and hinge configuration variables, and computing modal deformation accelerations and hinge accelerations is carried out for each one of the bodies beginning with the outermost body by computing a residual body force from a residual body force of a previous body and from the vector of deformation and hinge configuration variables, computing a resultant hinge acceleration from the body force, the residual body force and the articulated hinge inertia, and revising the residual body force modal body acceleration.
Computation of output feedback gains for linear stochastic systems using the Zangwill-Powell Method
NASA Technical Reports Server (NTRS)
Kaufman, H.
1975-01-01
Because conventional optimal linear regulator theory results in a controller which requires the capability of measuring and/or estimating the entire state vector, it is of interest to consider procedures for computing controls which are restricted to be linear feedback functions of a lower dimensional output vector and which take into account the presence of measurement noise and process uncertainty. To this effect a stochastic linear model has been developed that accounts for process parameter and initial uncertainty, measurement noise, and a restricted number of measurable outputs. Optimization with respect to the corresponding output feedback gains was then performed for both finite and infinite time performance indices without gradient computation by using Zangwill's modification of a procedure originally proposed by Powell. Results using a seventh order process show the proposed procedures to be very effective.
Computation of output feedback gains for linear stochastic systems using the Zangwill-Powell method
NASA Technical Reports Server (NTRS)
Kaufman, H.
1977-01-01
Because conventional optimal linear regulator theory results in a controller which requires the capability of measuring and/or estimating the entire state vector, it is of interest to consider procedures for computing controls which are restricted to be linear feedback functions of a lower dimensional output vector and which take into account the presence of measurement noise and process uncertainty. To this effect a stochastic linear model has been developed that accounts for process parameter and initial uncertainty, measurement noise, and a restricted number of measurable outputs. Optimization with respect to the corresponding output feedback gains was then performed for both finite and infinite time performance indices without gradient computation by using Zangwill's modification of a procedure originally proposed by Powell.
Application of high-performance computing to numerical simulation of human movement
NASA Technical Reports Server (NTRS)
Anderson, F. C.; Ziegler, J. M.; Pandy, M. G.; Whalen, R. T.
1995-01-01
We have examined the feasibility of using massively-parallel and vector-processing supercomputers to solve large-scale optimization problems for human movement. Specifically, we compared the computational expense of determining the optimal controls for the single support phase of gait using a conventional serial machine (SGI Iris 4D25), a MIMD parallel machine (Intel iPSC/860), and a parallel-vector-processing machine (Cray Y-MP 8/864). With the human body modeled as a 14 degree-of-freedom linkage actuated by 46 musculotendinous units, computation of the optimal controls for gait could take up to 3 months of CPU time on the Iris. Both the Cray and the Intel are able to reduce this time to practical levels. The optimal solution for gait can be found with about 77 hours of CPU on the Cray and with about 88 hours of CPU on the Intel. Although the overall speeds of the Cray and the Intel were found to be similar, the unique capabilities of each machine are better suited to different portions of the computational algorithm used. The Intel was best suited to computing the derivatives of the performance criterion and the constraints whereas the Cray was best suited to parameter optimization of the controls. These results suggest that the ideal computer architecture for solving very large-scale optimal control problems is a hybrid system in which a vector-processing machine is integrated into the communication network of a MIMD parallel machine.
Computation of transonic potential flow about 3 dimensional inlets, ducts, and bodies
NASA Technical Reports Server (NTRS)
Reyhner, T. A.
1982-01-01
An analysis was developed and a computer code, P465 Version A, written for the prediction of transonic potential flow about three dimensional objects including inlet, duct, and body geometries. Finite differences and line relaxation are used to solve the complete potential flow equation. The coordinate system used for the calculations is independent of body geometry. Cylindrical coordinates are used for the computer code. The analysis is programmed in extended FORTRAN 4 for the CYBER 203 vector computer. The programming of the analysis is oriented toward taking advantage of the vector processing capabilities of this computer. Comparisons of computed results with experimental measurements are presented to verify the analysis. Descriptions of program input and output formats are also presented.
NASA Technical Reports Server (NTRS)
Nguyen, D. T.; Al-Nasra, M.; Zhang, Y.; Baddourah, M. A.; Agarwal, T. K.; Storaasli, O. O.; Carmona, E. A.
1991-01-01
Several parallel-vector computational improvements to the unconstrained optimization procedure are described which speed up the structural analysis-synthesis process. A fast parallel-vector Choleski-based equation solver, pvsolve, is incorporated into the well-known SAP-4 general-purpose finite-element code. The new code, denoted PV-SAP, is tested for static structural analysis. Initial results on a four processor CRAY 2 show that using pvsolve reduces the equation solution time by a factor of 14-16 over the original SAP-4 code. In addition, parallel-vector procedures for the Golden Block Search technique and the BFGS method are developed and tested for nonlinear unconstrained optimization. A parallel version of an iterative solver and the pvsolve direct solver are incorporated into the BFGS method. Preliminary results on nonlinear unconstrained optimization test problems, using pvsolve in the analysis, show excellent parallel-vector performance indicating that these parallel-vector algorithms can be used in a new generation of finite-element based structural design/analysis-synthesis codes.
NASA Technical Reports Server (NTRS)
Muellerschoen, R. J.
1988-01-01
A unified method to permute vector stored Upper triangular Diagonal factorized covariance and vector stored upper triangular Square Root Information arrays is presented. The method involves cyclic permutation of the rows and columns of the arrays and retriangularization with fast (slow) Givens rotations (reflections). Minimal computation is performed, and a one dimensional scratch array is required. To make the method efficient for large arrays on a virtual memory machine, computations are arranged so as to avoid expensive paging faults. This method is potentially important for processing large volumes of radio metric data in the Deep Space Network.
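For intuition, a minimal dense illustration of the underlying idea is sketched below: the columns of an upper-triangular square-root factor are permuted and triangularity is restored by retriangularization. The array size and permutation are arbitrary assumptions, and a library QR stands in for the in-place Givens/reflection sweeps applied to vector-stored arrays in the paper.

```python
# Hedged sketch: cyclic permutation of an upper-triangular factor followed by retriangularization.
# Works on a small dense array; the paper's method operates in place on vector-stored arrays.
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(1)
n = 5
R = np.triu(rng.normal(size=(n, n)))     # upper-triangular square-root information factor
perm = np.r_[1:n, 0]                     # cyclic permutation of the parameter ordering

R_perm = R[:, perm]                      # permuted columns are no longer triangular
_, R_new = qr(R_perm, mode="economic")   # retriangularize (plays the role of Givens sweeps)

# The permuted factor and its retriangularized form carry the same information matrix.
assert np.allclose(R_perm.T @ R_perm, R_new.T @ R_new)
```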
Brian hears: online auditory processing using vectorization over channels.
Fontaine, Bertrand; Goodman, Dan F M; Benichoux, Victor; Brette, Romain
2011-01-01
The human cochlea includes about 3000 inner hair cells which filter sounds at frequencies between 20 Hz and 20 kHz. This massively parallel frequency analysis is reflected in models of auditory processing, which are often based on banks of filters. However, existing implementations do not exploit this parallelism. Here we propose algorithms to simulate these models by vectorizing computation over frequency channels, which are implemented in "Brian Hears," a library for the spiking neural network simulator package "Brian." This approach allows us to use high-level programming languages such as Python, because with vectorized operations, the computational cost of interpretation represents a small fraction of the total cost. This makes it possible to define and simulate complex models in a simple way, while all previous implementations were model-specific. In addition, we show that these algorithms can be naturally parallelized using graphics processing units, yielding substantial speed improvements. We demonstrate these algorithms with several state-of-the-art cochlear models, and show that they compare favorably with existing, less flexible, implementations.
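The vectorization idea can be sketched in a few lines: instead of filtering each cochlear channel in its own loop, a single time-step loop updates all channels at once with array operations. The first-order filter below is only a toy stand-in for the gammatone-style filterbanks used in real cochlear models.

```python
# Hedged sketch: vectorizing a bank of per-channel recursive filters over channels.
# A real cochlear model uses gammatone-style filters; this one-pole filter is a stand-in.
import numpy as np

fs = 44100.0
n_channels, n_samples = 3000, 2205
cf = np.logspace(np.log10(20.0), np.log10(20000.0), n_channels)  # one centre frequency per channel
a = np.exp(-2.0 * np.pi * cf / fs)                                # per-channel filter coefficient

sound = np.random.default_rng(2).normal(size=n_samples)
state = np.zeros(n_channels)
output = np.empty((n_channels, n_samples))

for t in range(n_samples):                       # single loop over time...
    state = a * state + (1.0 - a) * sound[t]     # ...updates every channel in one vector operation
    output[:, t] = state
```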
CFD Research, Parallel Computation and Aerodynamic Optimization
NASA Technical Reports Server (NTRS)
Ryan, James S.
1995-01-01
During the last five years, CFD has matured substantially. Pure CFD research remains to be done, but much of the focus has shifted to integration of CFD into the design process. The work under these cooperative agreements reflects this trend. The recent work, and work which is planned, is designed to enhance the competitiveness of the US aerospace industry. CFD and optimization approaches are being developed and tested, so that the industry can better choose which methods to adopt in their design processes. The range of computer architectures has been dramatically broadened, as the assumption that only huge vector supercomputers could be useful has faded. Today, researchers and industry can trade off time, cost, and availability, choosing vector supercomputers, scalable parallel architectures, networked workstations, or heterogenous combinations of these to complete required computations efficiently.
CFD Extraction Tool for TecPlot From DPLR Solutions
NASA Technical Reports Server (NTRS)
Norman, David
2013-01-01
This invention is a TecPlot macro of a computer program in the TecPlot programming language that processes data from DPLR solutions in TecPlot format. DPLR (Data-Parallel Line Relaxation) is a NASA computational fluid dynamics (CFD) code, and TecPlot is a commercial CFD post-processing tool. The TecPlot data is in SI units (same as DPLR output). The invention converts the SI units into British units. The macro modifies the TecPlot data with unit conversions, and adds some extra calculations. After unit conversions, the macro cuts a slice, and adds vectors on the current plot for output format. The macro can also process surface solutions. Existing solutions use manual conversion and superposition. The conversion is complicated because it must be applied to a range of inter-related scalars and vectors to describe a 2D or 3D flow field. It processes the CFD solution to create superposition/comparison of scalars and vectors. The existing manual solution is cumbersome, open to errors, slow, and cannot be inserted into an automated process. This invention is quick and easy to use, and can be inserted into an automated data-processing algorithm.
NASA Astrophysics Data System (ADS)
Kepner, J. V.; Janka, R. S.; Lebak, J.; Richards, M. A.
1999-12-01
The Vector/Signal/Image Processing Library (VSIPL) is a DARPA-initiated effort made up of industry, government and academic representatives who have defined an industry standard API for vector, signal, and image processing primitives for real-time signal processing on high performance systems. VSIPL supports a wide range of data types (int, float, complex, ...) and layouts (vectors, matrices and tensors) and is ideal for astronomical data processing. The VSIPL API is intended to serve as an open, vendor-neutral, industry standard interface. The object-based VSIPL API abstracts the memory architecture of the underlying machine by using the concept of memory blocks and views. Early experiments with VSIPL code conversions have been carried out by the High Performance Computing Program team at UCSD. Commercially, several major vendors of signal processors are actively developing implementations. VSIPL has also been explicitly required as part of a recent Rome Labs teraflop procurement. This poster presents the VSIPL API, its functionality and the status of various implementations.
NASA Technical Reports Server (NTRS)
Smith, R. E.; Pitts, J. I.; Lambiotte, J. J., Jr.
1978-01-01
The computer program FLO-22 for analyzing inviscid transonic flow past 3-D swept-wing configurations was modified to use vector operations and run on the STAR-100 computer. The vectorized version described herein was called FLO-22-V1. Vector operations were incorporated into Successive Line Over-Relaxation in the transformed horizontal direction. Vector relational operations and control vectors were used to implement upwind differencing at supersonic points. A high speed of computation and extended grid domain were characteristics of FLO-22-V1. The new program was not the optimal vectorization of Successive Line Over-Relaxation applied to transonic flow; however, it proved that vector operations can readily be implemented to increase the computation rate of the algorithm.
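A toy NumPy sketch of the idea of using relational operations and control vectors to switch differencing stencils is given below; the one-dimensional model problem and switching rule are assumptions for illustration and are far simpler than FLO-22's transonic solver.

```python
# Hedged sketch: vectorized switching between central and upwind differences via a control vector.
# A 1-D model problem stands in for the transonic potential solver.
import numpy as np

x = np.linspace(0.0, 1.0, 101)
phi = np.sin(2.0 * np.pi * x)                 # potential-like field (illustrative)
u = np.cos(2.0 * np.pi * x)                   # local "velocity" used to flag supersonic points
dx = x[1] - x[0]

supersonic = u > 0.5                          # relational operation builds the control vector
central = (np.roll(phi, -1) - np.roll(phi, 1)) / (2.0 * dx)
upwind = (phi - np.roll(phi, 1)) / dx

dphi_dx = np.where(supersonic, upwind, central)   # one vector operation selects the stencil
```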
Multidirectional Scanning Model, MUSCLE, to Vectorize Raster Images with Straight Lines
Karas, Ismail Rakip; Bayram, Bulent; Batuk, Fatmagul; Akay, Abdullah Emin; Baz, Ibrahim
2008-01-01
This paper presents a new model, MUSCLE (Multidirectional Scanning for Line Extraction), for automatic vectorization of raster images with straight lines. The algorithm of the model implements the line thinning and the simple neighborhood methods to perform vectorization. The model allows users to define specified criteria which are crucial for acquiring the vectorization process. In this model, various raster images can be vectorized such as township plans, maps, architectural drawings, and machine plans. The algorithm of the model was developed by implementing an appropriate computer programming and tested on a basic application. Results, verified by using two well known vectorization programs (WinTopo and Scan2CAD), indicated that the model can successfully vectorize the specified raster data quickly and accurately. PMID:27879843
Reduced state feedback gain computation. [optimization and control theory for aircraft control
NASA Technical Reports Server (NTRS)
Kaufman, H.
1976-01-01
Because application of conventional optimal linear regulator theory to flight controller design requires the capability of measuring and/or estimating the entire state vector, it is of interest to consider procedures for computing controls which are restricted to be linear feedback functions of a lower dimensional output vector and which take into account the presence of measurement noise and process uncertainty. Therefore, a stochastic linear model that was developed is presented which accounts for aircraft parameter and initial uncertainty, measurement noise, turbulence, pilot command and a restricted number of measurable outputs. Optimization with respect to the corresponding output feedback gains was performed for both finite and infinite time performance indices without gradient computation by using Zangwill's modification of a procedure originally proposed by Powell. Results using a seventh order process show the proposed procedures to be very effective.
1988-03-01
PACKAGE BODY TLCSC P661 (CATALOG #P106-0). This package contains the CAMP parts required to do the waypoint steering portion of navigation. ... 3.3.4.1.6 PROCESSING. The following describes the processing performed by this part: package body WaypointSteering is ... package body Steering_Vector_Operations is separate; package body Steering_Vector_Operations_with_Arcsin is separate; procedure Compute_Turn_Angle_and_Direction (UnitNormal ...
Peuquet, D.J.
1981-01-01
Current graphic devices suitable for high-speed computer input and output of cartographic data are tending more and more to be raster-oriented, such as the rotating drum scanner and the color raster display. However, the majority of commonly used manipulative techniques in computer-assisted cartography and automated spatial data handling continue to require that the data be in vector format. The current article is the second part of a two-part paper that examines the state of the art in these conversion techniques. - from Author
Dual-scale topology optoelectronic processor.
Marsden, G C; Krishnamoorthy, A V; Esener, S C; Lee, S H
1991-12-15
The dual-scale topology optoelectronic processor (D-STOP) is a parallel optoelectronic architecture for matrix algebraic processing. The architecture can be used for matrix-vector multiplication and two types of vector outer product. The computations are performed electronically, which allows multiplication and summation concepts in linear algebra to be generalized to various nonlinear or symbolic operations. This generalization permits the application of D-STOP to many computational problems. The architecture uses a minimum number of optical transmitters, which thereby reduces fabrication requirements while maintaining area-efficient electronics. The necessary optical interconnections are space invariant, minimizing space-bandwidth requirements.
ERIC Educational Resources Information Center
Mikula, Brendon D.; Heckler, Andrew F.
2017-01-01
We propose a framework for improving accuracy, fluency, and retention of basic skills essential for solving problems relevant to STEM introductory courses, and implement the framework for the case of basic vector math skills over several semesters in an introductory physics course. Using an iterative development process, the framework begins with…
Optimized scalable network switch
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Takken, Todd E [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY
2007-12-04
In a massively parallel computing system having a plurality of nodes configured in m multi-dimensions, each node including a computing device, a method for routing packets towards their destination nodes is provided which includes generating at least one of a 2m plurality of compact bit vectors containing information derived from downstream nodes. A multilevel arbitration process in which downstream information stored in the compact vectors, such as link status information and fullness of downstream buffers, is used to determine a preferred direction and virtual channel for packet transmission. Preferred direction ranges are encoded and virtual channels are selected by examining the plurality of compact bit vectors. This dynamic routing method eliminates the necessity of routing tables, thus enhancing scalability of the switch.
Acceleration of GPU-based Krylov solvers via data transfer reduction
Anzt, Hartwig; Tomov, Stanimire; Luszczek, Piotr; ...
2015-04-08
Krylov subspace iterative solvers are often the method of choice when solving large sparse linear systems. At the same time, hardware accelerators such as graphics processing units continue to offer significant floating point performance gains for matrix and vector computations through easy-to-use libraries of computational kernels. However, as these libraries are usually composed of a well optimized but limited set of linear algebra operations, applications that use them often fail to reduce certain data communications, and hence fail to leverage the full potential of the accelerator. In this study, we target the acceleration of Krylov subspace iterative methods for graphics processing units, and in particular the Biconjugate Gradient Stabilized solver. We show that significant improvement can be achieved by reformulating the method to reduce data communications through application-specific kernels instead of using the generic BLAS kernels, e.g. as provided by NVIDIA's cuBLAS library, and by designing a graphics-processing-unit-specific sparse matrix-vector product kernel that is able to more efficiently use the graphics processing unit's computing power. Furthermore, we derive a model estimating the performance improvement, and use experimental data to validate the expected runtime savings. Finally, considering that the derived implementation achieves significantly higher performance, we assert that similar optimizations addressing algorithm structure, as well as sparse matrix-vector kernels, are crucial for the subsequent development of high-performance graphics-processing-unit-accelerated Krylov subspace iterative methods.
Design of a mixer for the thrust-vectoring system on the high-alpha research vehicle
NASA Technical Reports Server (NTRS)
Pahle, Joseph W.; Bundick, W. Thomas; Yeager, Jessie C.; Beissner, Fred L., Jr.
1996-01-01
One of the advanced control concepts being investigated on the High-Alpha Research Vehicle (HARV) is multi-axis thrust vectoring using an experimental thrust-vectoring (TV) system consisting of three hydraulically actuated vanes per engine. A mixer is used to translate the pitch-, roll-, and yaw-TV commands into the appropriate TV-vane commands for distribution to the vane actuators. A computer-aided optimization process was developed to perform the inversion of the thrust-vectoring effectiveness data for use by the mixer in performing this command translation. Using this process a new mixer was designed for the HARV and evaluated in simulation and flight. An important element of the Mixer is the priority logic, which determines priority among the pitch-, roll-, and yaw-TV commands.
Semiautomated skeletonization of the pulmonary arterial tree in micro-CT images
NASA Astrophysics Data System (ADS)
Hanger, Christopher C.; Haworth, Steven T.; Molthen, Robert C.; Dawson, Christopher A.
2001-05-01
We present a simple and robust approach that utilizes planar images at different angular rotations combined with unfiltered back-projection to locate the central axes of the pulmonary arterial tree. Three-dimensional points are selected interactively by the user. The computer calculates a sub-volume unfiltered back-projection orthogonal to the vector connecting the two points and centered on the first point. Because more x-rays are absorbed at the thickest portion of the vessel, in the unfiltered back-projection, the darkest pixel is assumed to be the center of the vessel. The computer replaces this point with the newly computer-calculated point. A second back-projection is calculated around the original point orthogonal to a vector connecting the newly-calculated first point and user-determined second point. The darkest pixel within the reconstruction is determined. The computer then replaces the second point with the XYZ coordinates of the darkest pixel within this second reconstruction. Following a vector based on a moving average of previously determined 3-dimensional points along the vessel's axis, the computer continues this skeletonization process until stopped by the user. The computer estimates the vessel diameter along the set of previously determined points using a method similar to the full width-half max algorithm. On all subsequent vessels, the process works the same way except that at each point, distances between the current point and all previously determined points along different vessels are determined. If the difference is less than the previously estimated diameter, the vessels are assumed to branch. This user/computer interaction continues until the vascular tree has been skeletonized.
A border-ownership model based on computational electromagnetism.
Zainal, Zaem Arif; Satoh, Shunji
2018-03-01
The mathematical relation between a vector electric field and its corresponding scalar potential field is useful to formulate computational problems of lower/middle-order visual processing, specifically related to the assignment of borders to the side of the object: so-called border ownership (BO). BO coding is a key process for extracting the objects from the background, allowing one to organize a cluttered scene. We propose that the problem is solvable simultaneously by application of a theorem of electromagnetism, i.e., that conservative vector fields have zero rotation, or "curl." We hypothesize that (i) the BO signal is definable as a vector electric field with arrowheads pointing to the inner side of perceived objects, and (ii) its corresponding scalar field carries information related to perceived order in depth of occluding/occluded objects. A simple model was developed based on this computational theory. Model results qualitatively agree with object-side selectivity of BO-coding neurons, and with perceptions of object order. The model update rule can be reproduced as a plausible neural network that presents new interpretations of existing physiological results. Results of this study also suggest that T-junction detectors are unnecessary to calculate depth order.
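As a quick numerical illustration of the zero-curl property the model relies on (not the model itself), the sketch below builds a conservative 2-D vector field as the gradient of a scalar potential and checks that its discrete curl is negligible; the potential is an arbitrary assumption.

```python
# Hedged sketch: a conservative vector field (gradient of a potential) has numerically zero curl.
import numpy as np

y, x = np.meshgrid(np.linspace(-1, 1, 200), np.linspace(-1, 1, 200), indexing="ij")
potential = np.exp(-(x**2 + y**2))             # assumed scalar potential field

gy, gx = np.gradient(potential, edge_order=2)   # vector field = grad(potential), (d/dy, d/dx)
curl_z = np.gradient(gy, edge_order=2)[1] - np.gradient(gx, edge_order=2)[0]

# d(gy)/dx - d(gx)/dy; interior values are ~0 up to discretization error
print("max |curl| in the interior:", np.abs(curl_z[5:-5, 5:-5]).max())
```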
Model-based VQ for image data archival, retrieval and distribution
NASA Technical Reports Server (NTRS)
Manohar, Mareboyana; Tilton, James C.
1995-01-01
An ideal image compression technique for image data archival, retrieval and distribution would be one with the asymmetrical computational requirements of Vector Quantization (VQ), but without the complications arising from VQ codebooks. Codebook generation and maintenance are stumbling blocks which have limited the use of VQ as a practical image compression algorithm. Model-based VQ (MVQ), a variant of VQ described here, has the computational properties of VQ but does not require explicit codebooks. The codebooks are internally generated using mean removed error and Human Visual System (HVS) models. The error model assumed is the Laplacian distribution with mean, lambda, computed from a sample of the input image. A Laplacian distribution with mean, lambda, is generated with a uniform random number generator. These random numbers are grouped into vectors. These vectors are further conditioned to make them perceptually meaningful by filtering the DCT coefficients from each vector. The DCT coefficients are filtered by multiplying by a weight matrix that is found to be optimal for human perception. The inverse DCT is performed to produce the conditioned vectors for the codebook. The only image dependent parameter used in the generation of the codebook is the mean, lambda, which is included in the coded file to repeat the codebook generation process for decoding.
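A rough sketch of this codebook-generation idea follows; the Laplacian parameter, codebook dimensions, and perceptual weight vector are placeholders invented for the example, not the values used by MVQ.

```python
# Hedged sketch: building a perceptually conditioned codebook from Laplacian noise,
# in the spirit of model-based VQ (parameter values here are illustrative only).
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(3)
lam = 4.0                                    # would be estimated from the input image
codebook_size, vec_len = 256, 16             # assumed codebook dimensions (e.g. 4x4 blocks)

# Draw Laplacian-distributed samples via inverse-transform sampling of uniforms.
u = np.clip(rng.uniform(-0.5, 0.5, size=(codebook_size, vec_len)), -0.4999, 0.4999)
laplacian = -lam * np.sign(u) * np.log1p(-2.0 * np.abs(u))

# Condition the vectors perceptually: weight DCT coefficients, then invert the DCT.
weights = 1.0 / (1.0 + np.arange(vec_len))   # stand-in for an HVS weighting matrix
codebook = idct(dct(laplacian, axis=1, norm="ortho") * weights, axis=1, norm="ortho")
```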
Increasing the computational efficiency of digital cross correlation by a vectorization method
NASA Astrophysics Data System (ADS)
Chang, Ching-Yuan; Ma, Chien-Ching
2017-08-01
This study presents a vectorization method for use in MATLAB programming aimed at increasing the computational efficiency of digital cross correlation in sound and images, resulting in a speedup of 6.387 and 36.044 times compared with performance values obtained from looped expressions. This work bridges the gap between matrix operations and loop iteration, preserving flexibility and efficiency in program testing. This paper uses numerical simulation to verify the speedup of the proposed vectorization method as well as experiments to measure the quantitative transient displacement response subjected to dynamic impact loading. The experiment involved the use of a high speed camera as well as a fiber optic system to measure the transient displacement in a cantilever beam under impact from a steel ball. Experimental measurement data obtained from the two methods are in excellent agreement in both the time and frequency domain, with discrepancies of only 0.68%. Numerical and experiment results demonstrate the efficacy of the proposed vectorization method with regard to computational speed in signal processing and high precision in the correlation algorithm. We also present the source code with which to build MATLAB-executable functions on Windows as well as Linux platforms, and provide a series of examples to demonstrate the application of the proposed vectorization method.
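The kind of loop-to-vector rewrite the paper measures can be sketched as follows (a generic NumPy example, not the authors' MATLAB code); the signal sizes are arbitrary assumptions.

```python
# Hedged sketch: looped vs. vectorized digital cross correlation (NumPy stand-in for MATLAB).
import numpy as np

rng = np.random.default_rng(4)
signal = rng.normal(size=2000)
template = rng.normal(size=200)
n_lags = signal.size - template.size + 1

# Looped expression: one inner product per lag.
looped = np.array([signal[k:k + template.size] @ template for k in range(n_lags)])

# Vectorized expression: all lags as rows of a strided view, one matrix-vector product.
windows = np.lib.stride_tricks.sliding_window_view(signal, template.size)
vectorized = windows @ template

assert np.allclose(looped, vectorized)
```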
Algorithms for solving large sparse systems of simultaneous linear equations on vector processors
NASA Technical Reports Server (NTRS)
David, R. E.
1984-01-01
Very efficient algorithms for solving large sparse systems of simultaneous linear equations have been developed for serial processing computers. These involve a reordering of matrix rows and columns in order to obtain a near triangular pattern of nonzero elements. Then an LU factorization is developed to represent the matrix inverse in terms of a sequence of elementary Gaussian eliminations, or pivots. In this paper it is shown how these algorithms are adapted for efficient implementation on vector processors. Results obtained on the CYBER 200 Model 205 are presented for a series of large test problems which show the comparative advantages of the triangularization and vector processing algorithms.
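As a present-day stand-in for that workflow, the hedged sketch below factors a random sparse system with a fill-reducing column ordering via SciPy's SuperLU interface; the matrix and its sparsity pattern are invented for illustration, and the library routine is a modern substitute for the CYBER-era codes discussed here.

```python
# Hedged sketch: reorder-and-factor solution of a large random sparse linear system.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

rng = np.random.default_rng(5)
n = 2000
A = sp.random(n, n, density=0.002, random_state=rng, format="csc")
A = A + sp.eye(n, format="csc") * n            # make the system comfortably nonsingular
b = rng.normal(size=n)

lu = splu(A.tocsc(), permc_spec="COLAMD")      # fill-reducing column permutation, then LU
x = lu.solve(b)
print("residual norm:", np.linalg.norm(A @ x - b))
```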
2014-05-01
fusion, space and astrophysical plasmas, but still the general picture can be presented quite well with the fluid approach [6, 7]. The microscopic ... purpose computing CPU for algorithms where processing of large blocks of data is done in parallel. The reason for that is the GPU's highly effective ... parallel structure. Most of the image and video processing computations involve heavy matrix and vector operations over large amounts of data and ...
NASA Technical Reports Server (NTRS)
Kenner, B. G.; Lincoln, N. R.
1979-01-01
The manual is intended to show the revisions and additions to the current STAR FORTRAN. The changes are made to incorporate an FMP (Flow Model Processor) for use in the Numerical Aerodynamic Simulation Facility (NASF) for the purpose of simulating fluid flow over three-dimensional bodies in wind tunnel environments and in free space. The FORTRAN programming language for the STAR-100 computer contains both CDC and unique STAR extensions to the standard FORTRAN. Several of the STAR FORTRAN extensions to standard FORTRAN allow the FORTRAN user to exploit the vector processing capabilities of the STAR computer. In STAR FORTRAN, vectors can be expressed with an explicit notation, functions are provided that return vector results, and special call statements enable access to any machine instruction.
Real-Time Symbol Extraction From Grey-Level Images
NASA Astrophysics Data System (ADS)
Massen, R.; Simnacher, M.; Rosch, J.; Herre, E.; Wuhrer, H. W.
1988-04-01
A VME-bus image pipeline processor for extracting vectorized contours from grey-level images in real time is presented. This 3-giga-operation-per-second processor uses large kernel convolvers and new non-linear neighbourhood processing algorithms to compute true 1-pixel-wide, noise-free contours without thresholding, even from grey-level images with quite varying edge sharpness. The local edge orientation is used as an additional cue to compute a list of vectors describing the closed and open contours in real time and to dump a CAD-like symbolic image description into a symbol memory at pixel clock rate.
NASA Astrophysics Data System (ADS)
Li, Tao
2018-06-01
The complexity of the aluminum electrolysis process makes the temperature of aluminum reduction cells hard to measure directly, yet temperature is central to controlling aluminum production. To address this problem, and drawing on practice data from an aluminum plant, this paper presents a soft-sensing model of temperature for the aluminum electrolysis process based on Improved Twin Support Vector Regression (ITSVR). ITSVR eliminates the slow learning speed of Support Vector Regression (SVR) and the over-fitting risk of Twin Support Vector Regression (TSVR) by introducing a regularization term into the objective function of TSVR, which ensures the structural risk minimization principle and lower computational complexity. Finally, the model, with several other process parameters as auxiliary variables, predicts the temperature by ITSVR. The simulation results show that the soft-sensing model based on ITSVR is less time-consuming and generalizes better.
QCD next-to-leading-order predictions matched to parton showers for vector-like quark models.
Fuks, Benjamin; Shao, Hua-Sheng
2017-01-01
Vector-like quarks feature in a wealth of beyond-the-Standard-Model theories and are consequently an important goal of many LHC searches for new physics. Those searches, as well as most related phenomenological studies, however, rely on predictions evaluated at the leading-order accuracy in QCD and consider well-defined simplified benchmark scenarios. Adopting an effective bottom-up approach, we compute next-to-leading-order predictions for vector-like-quark pair production and single production in association with jets, with a weak or with a Higgs boson in a general new physics setup. We additionally compute vector-like-quark contributions to the production of a pair of Standard Model bosons at the same level of accuracy. For all processes under consideration, we focus both on total cross sections and on differential distributions, most of these calculations being performed for the first time in our field. As a result, our work paves the way to precise extraction of experimental limits on vector-like quarks thanks to an accurate control of the shapes of the relevant observables, and emphasises the extra handles that could be provided by novel vector-like-quark probes never envisaged so far.
Associative Pattern Recognition In Analog VLSI Circuits
NASA Technical Reports Server (NTRS)
Tawel, Raoul
1995-01-01
Winner-take-all circuit selects best-match stored pattern. Prototype cascadable very-large-scale integrated (VLSI) circuit chips built and tested to demonstrate concept of electronic associative pattern recognition. Based on low-power, sub-threshold analog complementary metal-oxide-semiconductor (CMOS) VLSI circuitry, each chip can store 128 sets (vectors) of 16 analog values (vector components), vectors representing known patterns as diverse as spectra, histograms, graphs, or brightnesses of pixels in images. Chips exploit parallel nature of vector quantization architecture to implement highly parallel processing in relatively simple computational cells. Through collective action, cells classify input pattern in fraction of microsecond while consuming power of few microwatts.
Computational Investigation of Fluidic Counterflow Thrust Vectoring
NASA Technical Reports Server (NTRS)
Hunter, Craig A.; Deere, Karen A.
1999-01-01
A computational study of fluidic counterflow thrust vectoring has been conducted. Two-dimensional numerical simulations were run using the computational fluid dynamics code PAB3D with two-equation turbulence closure and linear Reynolds stress modeling. For validation, computational results were compared to experimental data obtained at the NASA Langley Jet Exit Test Facility. In general, computational results were in good agreement with experimental performance data, indicating that efficient thrust vectoring can be obtained with low secondary flow requirements (less than 1% of the primary flow). An examination of the computational flowfield has revealed new details about the generation of a countercurrent shear layer, its relation to secondary suction, and its role in thrust vectoring. In addition to providing new information about the physics of counterflow thrust vectoring, this work appears to be the first documented attempt to simulate the counterflow thrust vectoring problem using computational fluid dynamics.
Transferring ecosystem simulation codes to supercomputers
NASA Technical Reports Server (NTRS)
Skiles, J. W.; Schulbach, C. H.
1995-01-01
Many ecosystem simulation computer codes have been developed in the last twenty-five years. This development took place initially on main-frame computers, then mini-computers, and more recently, on micro-computers and workstations. Supercomputing platforms (both parallel and distributed systems) have been largely unused, however, because of the perceived difficulty in accessing and using the machines. Also, significant differences in the system architectures of sequential, scalar computers and parallel and/or vector supercomputers must be considered. We have transferred a grassland simulation model (developed on a VAX) to a Cray Y-MP/C90. We describe porting the model to the Cray and the changes we made to exploit the parallelism in the application and improve code execution. The Cray executed the model 30 times faster than the VAX and 10 times faster than a Unix workstation. We achieved an additional speedup of 30 percent by using the compiler's vectorizing and 'in-lining' capabilities. The code runs at only about 5 percent of the Cray's peak speed because it ineffectively uses the vector and parallel processing capabilities of the Cray. We expect that by restructuring the code, it could execute an additional six to ten times faster.
GPU-accelerated adjoint algorithmic differentiation
NASA Astrophysics Data System (ADS)
Gremse, Felix; Höfter, Andreas; Razik, Lukas; Kiessling, Fabian; Naumann, Uwe
2016-03-01
Many scientific problems such as classifier training or medical image reconstruction can be expressed as minimization of differentiable real-valued cost functions and solved with iterative gradient-based methods. Adjoint algorithmic differentiation (AAD) enables automated computation of gradients of such cost functions implemented as computer programs. To backpropagate adjoint derivatives, excessive memory is potentially required to store the intermediate partial derivatives on a dedicated data structure, referred to as the "tape". Parallelization is difficult because threads need to synchronize their accesses during taping and backpropagation. This situation is aggravated for many-core architectures, such as Graphics Processing Units (GPUs), because of the large number of light-weight threads and the limited memory size in general as well as per thread. We show how these limitations can be mediated if the cost function is expressed using GPU-accelerated vector and matrix operations which are recognized as intrinsic functions by our AAD software. We compare this approach with naive and vectorized implementations for CPUs. We use four increasingly complex cost functions to evaluate the performance with respect to memory consumption and gradient computation times. Using vectorization, CPU and GPU memory consumption could be substantially reduced compared to the naive reference implementation, in some cases even by an order of complexity. The vectorization allowed usage of optimized parallel libraries during forward and reverse passes which resulted in high speedups for the vectorized CPU version compared to the naive reference implementation. The GPU version achieved an additional speedup of 7.5 ± 4.4, showing that the processing power of GPUs can be utilized for AAD using this concept. Furthermore, we show how this software can be systematically extended for more complex problems such as nonlinear absorption reconstruction for fluorescence-mediated tomography.
Controlling flexible robot arms using a high speed dynamics process
NASA Technical Reports Server (NTRS)
Jain, Abhinandan (Inventor); Rodriguez, Guillermo (Inventor)
1992-01-01
Described here is a robot controller for a flexible manipulator arm having plural bodies connected at respective movable hinges, and flexible in plural deformation modes. It is operated by computing articulated body quantities for each of the bodies from the respective modal spatial influence vectors, obtaining specified body forces for each of the bodies, and computing modal deformation accelerations of the nodes and hinge accelerations of the hinges from the specified body forces, from the articulated body quantities and from the modal spatial influence vectors. In one embodiment of the invention, the controller further operates by comparing the accelerations thus computed to desired manipulator motion to determine a motion discrepancy, and correcting the specified body forces so as to reduce the motion discrepancy. The manipulator bodies and hinges are characterized by respective vectors of deformation and hinge configuration variables. Computing modal deformation accelerations and hinge accelerations is carried out for each of the bodies, beginning with the outermost body by computing a residual body force from a residual body force of a previous body, computing a resultant hinge acceleration from the body force, and then, for each one of the bodies beginning with the innermost body, computing a modal body acceleration from a modal body acceleration of a previous body, computing a modal deformation acceleration and hinge acceleration from the resulting hinge acceleration and from the modal body acceleration.
Improved parallel data partitioning by nested dissection with applications to information retrieval.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wolf, Michael M.; Chevalier, Cedric; Boman, Erik Gunnar
The computational work in many information retrieval and analysis algorithms is based on sparse linear algebra. Sparse matrix-vector multiplication is a common kernel in many of these computations. Thus, an important related combinatorial problem in parallel computing is how to distribute the matrix and the vectors among processors so as to minimize the communication cost. We focus on minimizing the total communication volume while keeping the computation balanced across processes. In [1], the first two authors presented a new 2D partitioning method, the nested dissection partitioning algorithm. In this paper, we improve on that algorithm and show that it is a good option for data partitioning in information retrieval. We also show partitioning time can be substantially reduced by using the SCOTCH software, and quality improves in some cases, too.
Process for computing geometric perturbations for probabilistic analysis
Fitch, Simeon H. K. [Charlottesville, VA; Riha, David S [San Antonio, TX; Thacker, Ben H [San Antonio, TX
2012-04-10
A method for computing geometric perturbations for probabilistic analysis. The probabilistic analysis is based on finite element modeling, in which uncertainties in the modeled system are represented by changes in the nominal geometry of the model, referred to as "perturbations". These changes are accomplished using displacement vectors, which are computed for each node of a region of interest and are based on mean-value coordinate calculations.
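A bare-bones sketch of that perturbation step is shown below: each node of a region of interest is moved along a precomputed displacement vector scaled by a random amplitude. All arrays and the scaling distribution are assumptions for illustration, not the patented procedure itself.

```python
# Hedged sketch: perturbing nominal finite-element node coordinates along displacement vectors.
import numpy as np

rng = np.random.default_rng(6)
nodes = rng.uniform(0.0, 1.0, size=(500, 3))          # nominal node coordinates (illustrative)
directions = rng.normal(size=(500, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)  # unit displacement vectors

# One realization of the geometric uncertainty: scale every displacement by a random amplitude.
amplitude = rng.normal(loc=0.0, scale=0.01)
perturbed = nodes + amplitude * directions
```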
Reconstruction of Sea State One
1988-02-01
In this section only a general overview of the wave computer system will be offered; a more comprehensive treatment of this subject is available in the Appendix. The system comprises: (1) Sync Strip and Threshold Processing Card, (2) Pulse Generation Logic Card, (3) X Vector Logic Card, (4) Y Vector Logic Card, (5) Blanking Interval ... output by this comparator when the threshold is crossed, which shall be referred to as threshold crossing (THC). (2) PULSE GENERATION LOGIC CARD ...
Clifford support vector machines for classification, regression, and recurrence.
Bayro-Corrochano, Eduardo Jose; Arana-Daniel, Nancy
2010-11-01
This paper introduces the Clifford support vector machines (CSVM) as a generalization of the real and complex-valued support vector machines using the Clifford geometric algebra. In this framework, we handle the design of kernels involving the Clifford or geometric product. In this approach, one redefines the optimization variables as multivectors. This allows us to have a multivector as output. Therefore, we can represent multiple classes according to the dimension of the geometric algebra in which we work. We show that one can apply CSVM for classification and regression and also to build a recurrent CSVM. The CSVM is an attractive approach for the multiple input multiple output processing of high-dimensional geometric entities. We carried out comparisons between CSVM and the current approaches to solve multiclass classification and regression. We also study the performance of the recurrent CSVM with experiments involving time series. The authors believe that this paper can be of great use for researchers and practitioners interested in multiclass hypercomplex computing, particularly for applications in complex and quaternion signal and image processing, satellite control, neurocomputation, pattern recognition, computer vision, augmented virtual reality, robotics, and humanoids.
NASA Technical Reports Server (NTRS)
Weilmuenster, K. J.; Hamilton, H. H., II
1981-01-01
A computational technique for computing the three-dimensional inviscid flow over blunt bodies having large regions of embedded subsonic flow is detailed. Results, which were obtained using the CDC Cyber 203 vector processing computer, are presented for several analytic shapes with some comparison to experimental data. Finally, windward surface pressure computations over the first third of the Space Shuttle vehicle are compared with experimental data for angles of attack between 25 and 45 degrees.
NASA Astrophysics Data System (ADS)
Svatos, Adam Ladislav
This thesis describes the author's contributions to three separate projects. The bus of the NORSAT-2 satellite was developed by the Space Flight Laboratory (SFL) for the Norwegian Space Centre (NSC) and Space Norway. The author's contributions to the mission were performing unit tests for the components of all the spacecraft subsystems as well as designing and assembling the flatsat from flight spares. Gedex's Vector Gravimeter for Asteroids (VEGA) is an accelerometer for spacecraft. The author's contributions to this payload were modifying the instrument computer board schematic, designing the printed circuit board, developing and applying test software, and performing thermal acceptance testing of two instrument computer boards. The SFL's cylindrical Hall effect thruster combines the cylindrical configuration for a Hall thruster and uses permanent magnets to achieve miniaturization and low power consumption, respectively. The author's contributions were to design, build, and test an engineering model power processing unit.
Strategies for vectorizing the sparse matrix vector product on the CRAY XMP, CRAY 2, and CYBER 205
NASA Technical Reports Server (NTRS)
Bauschlicher, Charles W., Jr.; Partridge, Harry
1987-01-01
Large, randomly sparse matrix vector products are important in a number of applications in computational chemistry, such as matrix diagonalization and the solution of simultaneous equations. Vectorization of this process is considered for the CRAY XMP, CRAY 2, and CYBER 205, using a matrix of dimension of 20,000 with from 1 percent to 6 percent nonzeros. Efficient scatter/gather capabilities add coding flexibility and yield significant improvements in performance. For the CYBER 205, it is shown that minor changes in the IO can reduce the CPU time by a factor of 50. Similar changes in the CRAY codes make a far smaller improvement.
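The gather-based formulation at the heart of such kernels can be sketched with NumPy fancy indexing, which plays the role of the hardware gather/scatter instructions; the matrix size and sparsity below are arbitrary assumptions, far smaller than the 20,000-dimensional matrices in the paper.

```python
# Hedged sketch: sparse matrix-vector product written as gather + multiply + scatter-add.
import numpy as np

rng = np.random.default_rng(7)
n, nnz = 1000, 20000
rows = rng.integers(0, n, size=nnz)
cols = rng.integers(0, n, size=nnz)
vals = rng.normal(size=nnz)
x = rng.normal(size=n)

gathered = x[cols]                      # gather the needed vector elements
products = vals * gathered              # fully vectorizable elementwise multiply
y = np.zeros(n)
np.add.at(y, rows, products)            # scatter-add the partial products into the result

# Reference check against a dense product.
A = np.zeros((n, n)); np.add.at(A, (rows, cols), vals)
assert np.allclose(y, A @ x)
```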
Programming the Navier-Stokes computer: An abstract machine model and a visual editor
NASA Technical Reports Server (NTRS)
Middleton, David; Crockett, Tom; Tomboulian, Sherry
1988-01-01
The Navier-Stokes computer is a parallel computer designed to solve Computational Fluid Dynamics problems. Each processor contains several floating point units which can be configured under program control to implement a vector pipeline with several inputs and outputs. Since the development of an effective compiler for this computer appears to be very difficult, machine level programming seems necessary and support tools for this process have been studied. These support tools are organized into a graphical program editor. A programming process is described by which appropriate computations may be efficiently implemented on the Navier-Stokes computer. The graphical editor would support this programming process, verifying various programmer choices for correctness and deducing values such as pipeline delays and network configurations. Step by step details are provided and demonstrated with two example programs.
Process for structural geologic analysis of topography and point data
Eliason, Jay R.; Eliason, Valerie L. C.
1987-01-01
A quantitative method of geologic structural analysis of digital terrain data is described for implementation on a computer. Assuming selected valley segments are controlled by the underlying geologic structure, topographic lows in the terrain data, defining valley bottoms, are detected, filtered and accumulated into a series of line segments defining contiguous valleys. The line segments are then vectorized to produce vector segments, defining valley segments, which may be indicative of the underlying geologic structure. Coplanar analysis is performed on vector segment pairs to determine which vectors produce planes which represent underlying geologic structure. Point data such as fracture phenomena which can be related to fracture planes in 3-dimensional space can be analyzed to define common plane orientation and locations. The vectors, points, and planes are displayed in various formats for interpretation.
Dynamic visual attention: motion direction versus motion magnitude
NASA Astrophysics Data System (ADS)
Bur, A.; Wurtz, P.; Müri, R. M.; Hügli, H.
2008-02-01
Defined as an attentive process in the context of visual sequences, dynamic visual attention refers to the selection of the most informative parts of video sequence. This paper investigates the contribution of motion in dynamic visual attention, and specifically compares computer models designed with the motion component expressed either as the speed magnitude or as the speed vector. Several computer models, including static features (color, intensity and orientation) and motion features (magnitude and vector) are considered. Qualitative and quantitative evaluations are performed by comparing the computer model output with human saliency maps obtained experimentally from eye movement recordings. The model suitability is evaluated in various situations (synthetic and real sequences, acquired with fixed and moving camera perspective), showing advantages and inconveniences of each method as well as preferred domain of application.
Dissociable cognitive mechanisms underlying human path integration.
Wiener, Jan M; Berthoz, Alain; Wolbers, Thomas
2011-01-01
Path integration is a fundamental mechanism of spatial navigation. In non-human species, it is assumed to be an online process in which a homing vector is updated continuously during an outward journey. In contrast, human path integration has been conceptualized as a configural process in which travelers store working memory representations of path segments, with the computation of a homing vector only occurring when required. To resolve this apparent discrepancy, we tested whether humans can employ different path integration strategies in the same task. Using a triangle completion paradigm, participants were instructed either to continuously update the start position during locomotion (continuous strategy) or to remember the shape of the outbound path and to calculate home vectors on basis of this representation (configural strategy). While overall homing accuracy was superior in the configural condition, participants were quicker to respond during continuous updating, strongly suggesting that homing vectors were computed online. Corroborating these findings, we observed reliable differences in head orientation during the outbound path: when participants applied the continuous updating strategy, the head deviated significantly from straight ahead in direction of the start place, which can be interpreted as a continuous motor expression of the homing vector. Head orientation-a novel online measure for path integration-can thus inform about the underlying updating mechanism already during locomotion. In addition to demonstrating that humans can employ different cognitive strategies during path integration, our two-systems view helps to resolve recent controversies regarding the role of the medial temporal lobe in human path integration.
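The contrast between the two strategies can be made concrete with a tiny sketch: continuous updating accumulates a homing vector at every step, whereas the configural strategy stores the segments and only sums them when homing is required. The outbound path below is an invented example.

```python
# Hedged sketch: continuous vs. configural computation of a homing vector on an outbound path.
import numpy as np

segments = np.array([[2.0, 0.0], [0.0, 3.0], [-1.0, 1.0]])   # invented outbound path segments

# Continuous strategy: the homing vector is updated online after every segment.
homing = np.zeros(2)
for step in segments:
    homing -= step                       # always points from the current position back to the start

# Configural strategy: store the segments, compute the homing vector only when needed.
homing_configural = -segments.sum(axis=0)

assert np.allclose(homing, homing_configural)
```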
Personal Computer Transport Analysis Program
NASA Technical Reports Server (NTRS)
DiStefano, Frank, III; Wobick, Craig; Chapman, Kirt; McCloud, Peter
2012-01-01
The Personal Computer Transport Analysis Program (PCTAP) is C++ software used for analysis of thermal fluid systems. The program predicts thermal fluid system and component transients. The output consists of temperatures, flow rates, pressures, delta pressures, tank quantities, and gas quantities in the air, along with air scrubbing component performance. PCTAP s solution process assumes that the tubes in the system are well insulated so that only the heat transfer between fluid and tube wall and between adjacent tubes is modeled. The system described in the model file is broken down into its individual components; i.e., tubes, cold plates, heat exchangers, etc. A solution vector is built from the components and a flow is then simulated with fluid being transferred from one component to the next. The solution vector of components in the model file is built at the initiation of the run. This solution vector is simply a list of components in the order of their inlet dependency on other components. The component parameters are updated in the order in which they appear in the list at every time step. Once the solution vectors have been determined, PCTAP cycles through the components in the solution vector, executing their outlet function for each time-step increment.
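The ordering described here is essentially a dependency (topological) sort of the components; a minimal sketch with an invented component network is shown below (the component names and their connectivity are assumptions, not PCTAP's actual model format).

```python
# Hedged sketch: building a solution vector as a list of components ordered by inlet dependency.
from graphlib import TopologicalSorter

# Invented example network: each component lists the components feeding its inlet.
inlet_dependencies = {
    "tank": [],
    "pump": ["tank"],
    "cold_plate": ["pump"],
    "heat_exchanger": ["cold_plate"],
    "scrubber": ["heat_exchanger"],
}

solution_vector = list(TopologicalSorter(inlet_dependencies).static_order())
print(solution_vector)        # components appear only after everything upstream of their inlet

for component in solution_vector:   # each time step would update parameters in this order
    pass                            # e.g. update_component(component) in a real solver
```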
Parallel Visualization Co-Processing of Overnight CFD Propulsion Applications
NASA Technical Reports Server (NTRS)
Edwards, David E.; Haimes, Robert
1999-01-01
An interactive visualization system pV3 is being developed for the investigation of advanced computational methodologies employing visualization and parallel processing for the extraction of information contained in large-scale transient engineering simulations. Visual techniques for extracting information from the data in terms of cutting planes, iso-surfaces, particle tracing and vector fields are included in this system. This paper discusses improvements to the pV3 system developed under NASA's Affordable High Performance Computing project.
Partitioning Rectangular and Structurally Nonsymmetric Sparse Matrices for Parallel Processing
DOE Office of Scientific and Technical Information (OSTI.GOV)
B. Hendrickson; T.G. Kolda
1998-09-01
A common operation in scientific computing is the multiplication of a sparse, rectangular or structurally nonsymmetric matrix and a vector. In many applications the matrix- transpose-vector product is also required. This paper addresses the efficient parallelization of these operations. We show that the problem can be expressed in terms of partitioning bipartite graphs. We then introduce several algorithms for this partitioning problem and compare their performance on a set of test matrices.
NASA Technical Reports Server (NTRS)
Kemp, William B., Jr.
1990-01-01
Guidelines are presented for use of the computer program PANCOR to assess the interference due to tunnel walls and model support in a slotted wind tunnel test section at subsonic speeds. Input data requirements are described in detail and program output and general program usage are described. The program is written for effective automatic vectorization on a CDC CYBER 200 class vector processing system.
NASA Technical Reports Server (NTRS)
Kumar, A.
1984-01-01
A computer program NASCRIN has been developed for analyzing two-dimensional flow fields in high-speed inlets. It solves the two-dimensional Euler or Navier-Stokes equations in conservation form by an explicit, two-step finite-difference method. An explicit-implicit method can also be used at the user's discretion for viscous flow calculations. For turbulent flow, an algebraic, two-layer eddy-viscosity model is used. The code is operational on the CDC CYBER 203 computer system and is highly vectorized to take full advantage of the vector-processing capability of the system. It is highly user oriented and is structured in such a way that for most supersonic flow problems, the user has to make only a few changes. Although the code is primarily written for supersonic internal flow, it can be used with suitable changes in the boundary conditions for a variety of other problems.
Spinozzi, Giulio; Calabria, Andrea; Brasca, Stefano; Beretta, Stefano; Merelli, Ivan; Milanesi, Luciano; Montini, Eugenio
2017-11-25
Bioinformatics tools designed to identify lentiviral or retroviral vector insertion sites in the genome of host cells are used to address the safety and long-term efficacy of hematopoietic stem cell gene therapy applications and to study the clonal dynamics of hematopoietic reconstitution. The increasing number of gene therapy clinical trials, combined with the increasing amount of Next Generation Sequencing data aimed at identifying integration sites, requires both highly accurate and efficient computational software able to correctly process "big data" in a reasonable computational time. Here we present VISPA2 (Vector Integration Site Parallel Analysis, version 2), the latest optimized computational pipeline for integration site identification and analysis with the following features: (1) the sequence analysis for integration site processing is fully compliant with paired-end reads and includes a sequence quality filter before and after the alignment on the target genome; (2) a heuristic algorithm that reduces false positive integration sites at the nucleotide level, limiting the impact of Polymerase Chain Reaction or trimming/alignment artifacts; (3) a classification and annotation module for integration sites; (4) a user-friendly web interface as a researcher front-end to perform integration site analyses without computational skills; (5) the time speedup of all steps through parallelization (Hadoop free). We tested VISPA2 performance using simulated and real datasets of lentiviral vector integration sites, previously obtained from patients enrolled in a hematopoietic stem cell gene therapy clinical trial, and compared the results with other preexisting tools for integration site analysis. On the computational side, VISPA2 showed a > 6-fold speedup and improved precision and recall metrics (1 and 0.97, respectively) compared to previously developed computational pipelines. These performances indicate that VISPA2 is a fast, reliable and user-friendly tool for integration site analysis, which allows gene therapy integration data to be handled in a cost- and time-effective fashion. Moreover, the web access of VISPA2 ( http://openserver.itb.cnr.it/vispa/ ) ensures accessibility and ease of use of a complex analytical tool for researchers. We released the source code of VISPA2 in a public repository ( https://bitbucket.org/andreacalabria/vispa2 ).
NASA Astrophysics Data System (ADS)
Xie, Ya-Ping; Chen, Xurong
2018-05-01
Photoproduction of vector mesons is computed with the dipole model in proton-proton ultraperipheral collisions (UPCs) at the CERN Large Hadron Collider (LHC). The dipole model framework is employed in the calculations of vector meson production in diffractive processes. Parameters of the bCGC model are refitted with the latest inclusive deep inelastic scattering experimental data. Employing the bCGC model and the boosted Gaussian light-cone wave function for vector mesons, we obtain predictions of the rapidity distributions of J/ψ and ψ(2s) mesons in proton-proton ultraperipheral collisions at the LHC. The predictions give a good description of the LHCb experimental data. Predictions for ϕ and ω mesons are also evaluated in this paper.
Quantum speed limit time in a magnetic resonance
NASA Astrophysics Data System (ADS)
Ivanchenko, E. A.
2017-12-01
A visualization for the dynamics of a qudit spin vector in a time-dependent magnetic field is realized by mapping a solution for the spin vector onto a three-dimensional spherical curve (vector hodograph). The obtained results clearly display the quantum interference of precessional and nutational effects on the spin vector in magnetic resonance. For any spin, lower bounds of the quantum speed limit time (QSL) are found. It is shown that the lower bound decreases when multilevel spin systems are used. Under certain conditions the nonzero minimal time necessary to reach a state orthogonal to the initial one is attained at spin S = 2. Estimates of the products of two and three standard deviations of the spin components are presented. We discuss the dynamics of the mutual uncertainty, conditional uncertainty and conditional variance in terms of spin standard deviations. The study can find practical applications in magnetic resonance, 3D visualization of computational data, and in the design of optimized information-processing devices for quantum computation and communication.
NASA Technical Reports Server (NTRS)
Decker, A. J.; Fite, E. B.; Thorp, S. A.; Mehmed, O.
1998-01-01
The responses of artificial neural networks to experimental and model-generated inputs are compared for detection of damage in twisted fan blades using electronic holography. The training-set inputs, for this work, are experimentally generated characteristic patterns of the vibrating blades. The outputs are damage-flag indicators or second derivatives of the sensitivity-vector-projected displacement vectors from a finite element model. Artificial neural networks have been trained in the past with computational-model-generated training sets. This approach avoids the difficult inverse calculations traditionally used to compare interference fringes with the models. But the high modeling standards are hard to achieve, even with fan-blade finite-element models.
Rate determination from vector observations
NASA Technical Reports Server (NTRS)
Weiss, Jerold L.
1993-01-01
Vector observations are a common class of attitude data provided by a wide variety of attitude sensors. Attitude determination from vector observations is a well-understood process, and numerous algorithms such as the TRIAD algorithm exist. These algorithms require measurement of the line of sight (LOS) vector to reference objects and knowledge of the LOS directions in some predetermined reference frame. Once attitude is determined, it is a simple matter to synthesize vehicle rate using some form of lead-lag filter and then use it for vehicle stabilization. Many situations arise, however, in which rate knowledge is required but knowledge of the nominal LOS directions is not available. This paper presents two methods for determining spacecraft angular rates from vector observations without a priori knowledge of the vector directions. The first approach uses an extended Kalman filter with a spacecraft dynamic model and a kinematic model representing the motion of the observed LOS vectors. The second approach uses a 'differential' TRIAD algorithm to compute the incremental direction cosine matrix, from which vehicle rate is then derived.
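As background for the 'differential' variant, a minimal sketch of the classical TRIAD construction is shown below (the rotation used to fabricate the body-frame vectors is an illustrative assumption): two vector observations in the body frame and their reference-frame counterparts are combined into orthonormal triads whose product gives the direction cosine matrix.

```python
# Sketch of the classical TRIAD attitude solution mentioned above (the 'differential'
# TRIAD of the paper applies the same construction to successive measurement sets).
import numpy as np

def triad(b1, b2, r1, r2):
    """Attitude matrix A such that b ~ A r, from two vector observations."""
    def frame(v1, v2):
        t1 = v1 / np.linalg.norm(v1)
        t2 = np.cross(v1, v2)
        t2 /= np.linalg.norm(t2)
        t3 = np.cross(t1, t2)
        return np.column_stack((t1, t2, t3))
    return frame(b1, b2) @ frame(r1, r2).T

# Example: body vectors are the reference vectors rotated 30 deg about z.
c, s = np.cos(np.radians(30)), np.sin(np.radians(30))
A_true = np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]])
r1, r2 = np.array([1.0, 0, 0]), np.array([0, 1.0, 0.2])
print(triad(A_true @ r1, A_true @ r2, r1, r2))   # recovers A_true
```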
SAPNEW: Parallel finite element code for thin shell structures on the Alliant FX/80
NASA Astrophysics Data System (ADS)
Kamat, Manohar P.; Watson, Brian C.
1992-02-01
The results of a research activity aimed at providing a finite element capability for analyzing turbo-machinery bladed-disk assemblies in a vector/parallel processing environment are summarized. Analysis of aircraft turbofan engines is very computationally intensive. The performance limit of modern-day computers with a single processing unit was estimated at 3 billion floating-point operations per second (3 gigaflops). In view of this limit of a sequential unit, performance rates higher than 3 gigaflops can be achieved only through vectorization and/or parallelization, as on the Alliant FX/80. Accordingly, the efforts of this critically needed research were geared towards developing and evaluating parallel finite element methods for static and vibration analysis. A special purpose code, named with the acronym SAPNEW, performs static and eigen analysis of multi-degree-of-freedom blade models built up from flat thin shell elements.
Gschwind, Michael K
2013-04-16
Mechanisms for generating and executing programs for a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA) are provided. A computer program product comprising a computer recordable medium having a computer readable program recorded thereon is provided. The computer readable program, when executed on a computing device, causes the computing device to receive one or more instructions and execute the one or more instructions using logic in an execution unit of the computing device. The logic implements a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA), based on data stored in a vector register file of the computing device. The vector register file is configured to store both scalar and floating point values as vectors having a plurality of vector elements.
2017-02-01
In this technical note, a number of different measures implemented as functions in both MATLAB and Python are used to quantify similarity/distance between vector-based data. The measures described are widely used and may have an important role when computing the distance and similarity of large datasets and when considering high-throughput processes.
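A few of the measures such a note typically covers can be sketched with NumPy/SciPy; the exact set of functions in the note is not reproduced here, so the selection below is an assumption.

```python
# Illustrative similarity/distance measures between two vectors.
import numpy as np
from scipy.spatial import distance

u = np.array([1.0, 2.0, 3.0, 4.0])
v = np.array([2.0, 2.0, 1.0, 5.0])

print("Euclidean :", distance.euclidean(u, v))
print("Manhattan :", distance.cityblock(u, v))
print("Cosine sim:", 1.0 - distance.cosine(u, v))
print("Chebyshev :", distance.chebyshev(u, v))
```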
NASA Astrophysics Data System (ADS)
Ledwon, Aleksandra; Bieda, Robert; Kawczyk-Krupka, Aleksandra; Polanski, Andrzej; Wojciechowski, Konrad; Latos, Wojciech; Sieron-Stoltny, Karolina; Sieron, Aleksander
2008-02-01
Background: Fluorescence diagnostics uses the ability of tissues to fluoresce after exposure to a specific wavelength of light. The change in fluorescence between normal tissue and tissue progressing to cancer makes it possible to see early cancer and precancerous lesions often missed by white light. Aim: To improve, by computer image processing, the sensitivity of fluorescence images obtained during examination of skin, oral cavity, vulva and cervix lesions, and during endoscopy, cystoscopy and bronchoscopy using Xillix ONCOLIFE. Methods: The image function f(x,y): R^2 -> R^3 was transformed from the original RGB color space to a space in which a vector of 46 values refers to every point labeled by defined xy-coordinates, f(x,y): R^2 -> R^46. By means of a Fisher discriminator, the vector of attributes of each point analyzed in the image was reduced according to two classes defined as pathologic areas (foreground) and healthy areas (background). As a result, the four highest Fisher coefficients allowing the greatest separation between points of pathologic (foreground) and healthy (background) areas were chosen. In this way a new function f(x,y): R^2 -> R^4 was created in which each point (x,y) corresponds to the vector Y, H, a*, c_II. In the second step, using Gaussian Mixtures and Expectation-Maximisation, an appropriate classifier was constructed. This classifier enables determination of the probability that a selected pixel of the analyzed image is a pathologically changed point (foreground) or a healthy one (background). The obtained map of the probability distribution was presented by means of pseudocolors. Results: Image processing techniques improve the sensitivity, quality and sharpness of the original fluorescence images. Conclusion: Computer image processing enables better visualization of suspected areas examined by means of fluorescence diagnostics.
Autonomous Environment-Monitoring Networks
NASA Technical Reports Server (NTRS)
Hand, Charles
2004-01-01
Autonomous environment-monitoring networks (AEMNs) are artificial neural networks that are specialized for recognizing familiarity and, conversely, novelty. Like a biological neural network, an AEMN receives a constant stream of inputs. For purposes of computational implementation, the inputs are vector representations of the information of interest. As long as the most recent input vector is similar to the previous input vectors, no action is taken. Action is taken only when a novel vector is encountered. Whether a given input vector is regarded as novel depends on the previous vectors; hence, the same input vector could be regarded as familiar or novel, depending on the context of previous input vectors. AEMNs have been proposed as means to enable exploratory robots on remote planets to recognize novel features that could merit closer scientific attention. AEMNs could also be useful for processing data from medical instrumentation for automated monitoring or diagnosis. The primary substructure of an AEMN is called a spindle. In its simplest form, a spindle consists of a central vector (C), a scalar (r), and algorithms for changing C and r. The vector C is constructed from all the vectors in a given continuous stream of inputs, such that it is minimally distant from those vectors. The scalar r is the distance between C and the most remote vector in the same set. The construction of a spindle involves four vital parameters: setup size, spindle-population size, and the radii of two novelty boundaries. The setup size is the number of vectors that are taken into account before computing C. The spindle-population size is the total number of input vectors used in constructing the spindle counting both those that arrive before and those that arrive after the computation of C. The novelty-boundary radii are distances from C that partition the neighborhood around C into three concentric regions (see Figure 1). During construction of the spindle, the changing spindle radius is denoted by h. It is the final value of h, reached before beginning construction on the next spindle, that is denoted by r. During construction of a spindle, if a new vector falls between C and the inner boundary, the vector is regarded as completely familiar and no action is taken. If the new vector falls into the region between the inner and outer boundaries, it is considered unusual enough to warrant the adjustment of C and r by use of the aforementioned algorithms, but not unusual enough to be considered novel. If a vector falls outside the outer boundary, it is considered novel, in which case one of several appropriate responses could be initiation of construction of a new spindle.
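A hypothetical sketch of the spindle logic follows; the boundary scaling and the adjustment rule for C and r are illustrative assumptions rather than the AEMN algorithms themselves.

```python
# Hypothetical sketch of the spindle novelty logic described above.
import numpy as np

class Spindle:
    def __init__(self, setup_vectors, inner=1.0, outer=2.0):
        self.C = np.mean(setup_vectors, axis=0)          # central vector
        self.r = max(np.linalg.norm(v - self.C) for v in setup_vectors)
        self.inner, self.outer = inner, outer            # novelty-boundary radii (as multiples of r)

    def observe(self, v):
        d = np.linalg.norm(v - self.C)
        if d <= self.inner * self.r:
            return "familiar"                            # no action taken
        if d <= self.outer * self.r:
            self.C += 0.1 * (v - self.C)                 # adjust C toward the unusual vector
            self.r = max(self.r, np.linalg.norm(v - self.C))
            return "unusual"
        return "novel"                                   # would trigger construction of a new spindle

rng = np.random.default_rng(0)
s = Spindle(rng.normal(0.0, 1.0, size=(50, 3)))
print(s.observe(rng.normal(0.0, 1.0, size=3)))   # likely familiar or unusual
print(s.observe(np.array([20.0, 20.0, 20.0])))   # novel
```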
HMM for hyperspectral spectrum representation and classification with endmember entropy vectors
NASA Astrophysics Data System (ADS)
Arabi, Samir Y. W.; Fernandes, David; Pizarro, Marco A.
2015-10-01
Hyperspectral images, owing to their good spectral resolution, are extensively used for classification, but their high number of bands requires higher bandwidth for data transmission, higher data storage capability and higher computational capability in processing systems. This work presents a new methodology for hyperspectral data classification that can work with a reduced number of spectral bands and achieve good results, comparable with processing methods that require all hyperspectral bands. The proposed method for hyperspectral spectra classification is based on the Hidden Markov Model (HMM) associated with each Endmember (EM) of a scene and the conditional probabilities that each EM belongs to each other EM. The EM conditional probabilities are transformed into an EM entropy vector, and those vectors are used as reference vectors for the classes in the scene. The conditional probabilities of a spectrum to be classified are also transformed into a spectrum entropy vector, which is assigned to a given class by the minimum ED (Euclidean Distance) between it and the EM entropy vectors. The methodology was tested with good results using AVIRIS spectra of a scene with 13 EMs, considering the full 209 bands and reduced sets of 128, 64 and 32 spectral bands. For the test area it is shown that only 32 spectral bands can be used instead of the original 209 bands, without significant loss in the classification process.
Design and experimental verification for optical module of optical vector-matrix multiplier.
Zhu, Weiwei; Zhang, Lei; Lu, Yangyang; Zhou, Ping; Yang, Lin
2013-06-20
Optical computing is a new method to implement signal processing functions. The multiplication between a vector and a matrix is an important arithmetic algorithm in the signal processing domain. The optical vector-matrix multiplier (OVMM) is an optoelectronic system to carry out this operation, which consists of an electronic module and an optical module. In this paper, we propose an optical module for OVMM. To eliminate the cross talk and make full use of the optical elements, an elaborately designed structure that involves spherical lenses and cylindrical lenses is utilized in this optical system. The optical design software package ZEMAX is used to optimize the parameters and simulate the whole system. Finally, experimental data is obtained through experiments to evaluate the overall performance of the system. The results of both simulation and experiment indicate that the system constructed can implement the multiplication between a matrix with dimensions of 16 by 16 and a vector with a dimension of 16 successfully.
High-performance computing — an overview
NASA Astrophysics Data System (ADS)
Marksteiner, Peter
1996-08-01
An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.
Characterization of a 300-GHz Transmission System for Digital Communications
NASA Astrophysics Data System (ADS)
Hudlička, Martin; Salhi, Mohammed; Kleine-Ostmann, Thomas; Schrader, Thorsten
2017-08-01
The paper presents the characterization of a 300-GHz transmission system for modern digital communications. The quality of the modulated signal at the output of the system (error vector magnitude, EVM) is measured using a vector signal analyzer. A method using a digital real-time oscilloscope and consecutive mathematical processing in a computer is shown for analysis of signals with bandwidths exceeding that of state-of-the-art vector signal analyzers. The uncertainty of EVM measured using the real-time oscilloscope is open to analysis. Behaviour of the 300-GHz transmission system is studied with respect to various modulation schemes and different signal symbol rates.
Economical Implementation of a Filter Engine in an FPGA
NASA Technical Reports Server (NTRS)
Kowalski, James E.
2009-01-01
A logic design has been conceived for a field-programmable gate array (FPGA) that would implement a complex system of multiple digital state-space filters. The main innovative aspect of this design lies in providing for reuse of parts of the FPGA hardware to perform different parts of the filter computations at different times, in such a manner as to enable the timely performance of all required computations in the face of limitations on available FPGA hardware resources. The implementation of the digital state-space filter involves matrix vector multiplications, which, in the absence of the present innovation, would ordinarily necessitate some multiplexing of vector elements and/or routing of data flows along multiple paths. The design concept calls for implementing vector registers as shift registers to simplify operand access to multipliers and accumulators, obviating both multiplexing and routing of data along multiple paths. Each vector register would be reused for different parts of a calculation. Outputs would always be drawn from the same register, and inputs would always be loaded into the same register. A simple state machine would control each filter. The output of a given filter would be passed to the next filter, accompanied by a "valid" signal, which would start the state machine of the next filter. Multiple filter modules would share a multiplication/accumulation arithmetic unit. The filter computations would be timed by use of a clock having a frequency high enough, relative to the input and output data rate, to provide enough cycles for matrix and vector arithmetic operations. This design concept could prove beneficial in numerous applications in which digital filters are used and/or vectors are multiplied by coefficient matrices. Examples of such applications include general signal processing, filtering of signals in control systems, processing of geophysical measurements, and medical imaging. For these and other applications, it could be advantageous to combine compact FPGA digital filter implementations with other application-specific logic implementations on single integrated-circuit chips. An FPGA could readily be tailored to implement a variety of filters because the filter coefficients would be loaded into memory at startup.
Selection vector filter framework
NASA Astrophysics Data System (ADS)
Lukac, Rastislav; Plataniotis, Konstantinos N.; Smolka, Bogdan; Venetsanopoulos, Anastasios N.
2003-10-01
We provide a unified framework of nonlinear vector techniques outputting the lowest ranked vector. The proposed framework constitutes a generalized filter class for multichannel signal processing. A new class of nonlinear selection filters is based on robust order-statistic theory and the minimization of the weighted distance function to other input samples. The proposed method can be designed to perform a variety of filtering operations, including previously developed techniques such as the vector median, basic vector directional filter, directional distance filter, weighted vector median filters and weighted directional filters. A wide range of filtering operations is guaranteed by the filter structure with two independent weight vectors for the angular and distance domains of the vector space. In order to adapt the filter parameters to varying signal and noise statistics, we also provide generalized optimization algorithms that take advantage of weighted median filters and the relationship between the standard median filter and the vector median filter. Thus, we can deal with both statistical and deterministic aspects of the filter design process. It will be shown that the proposed method has the required properties, such as the capability of modelling the underlying system in the application at hand, robustness with respect to errors in the model of the underlying system, the availability of a training procedure and, finally, the simplicity of filter representation, analysis, design and implementation. Simulation studies also indicate that the new filters are computationally attractive and have excellent performance in environments corrupted by bit errors and impulsive noise.
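The best-known member of this filter class, the vector median filter, can be sketched in a few lines: the output is the input vector in the processing window that minimizes the sum of distances to all other window vectors (the generalized weighted and directional variants described above are not shown).

```python
# Minimal sketch of a vector median filter over one processing window.
import numpy as np

def vector_median(window):
    """window: array of shape (k, channels); returns the vector median."""
    dists = np.linalg.norm(window[:, None, :] - window[None, :, :], axis=-1)
    return window[np.argmin(dists.sum(axis=1))]

window = np.array([[250,  10,  10],   # impulsive outlier
                   [100, 120, 118],
                   [102, 119, 121],
                   [ 98, 121, 119]], dtype=float)
print(vector_median(window))   # returns one of the consistent color vectors
```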
Comparison of algorithms for computing the two-dimensional discrete Hartley transform
NASA Technical Reports Server (NTRS)
Reichenbach, Stephen E.; Burton, John C.; Miller, Keith W.
1989-01-01
Three methods have been described for computing the two-dimensional discrete Hartley transform. Two of these employ a separable transform; the third, the vector-radix algorithm, does not require separability. In-place computation of the vector-radix method is described. Operation counts and execution times indicate that the vector-radix method is fastest.
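For reference, the quantity all three algorithms compute can be checked against the 2D DFT through the identity H = Re(F) - Im(F); the sketch below verifies this relation numerically (it is a correctness check, not one of the paper's fast algorithms).

```python
# The 2D discrete Hartley transform, via the DFT identity and via the direct definition.
import numpy as np

def dht2(f):
    F = np.fft.fft2(f)
    return F.real - F.imag

def dht2_direct(f):
    M, N = f.shape
    x, y = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    H = np.empty_like(f, dtype=float)
    for u in range(M):
        for v in range(N):
            arg = 2 * np.pi * (u * x / M + v * y / N)
            H[u, v] = np.sum(f * (np.cos(arg) + np.sin(arg)))   # cas kernel
    return H

f = np.random.default_rng(1).random((8, 8))
print(np.allclose(dht2(f), dht2_direct(f)))   # True
```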
AZTEC. Parallel Iterative method Software for Solving Linear Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hutchinson, S.; Shadid, J.; Tuminaro, R.
1995-07-01
AZTEC is an iterative library that greatly simplifies the parallelization process when solving the linear systems of equations Ax = b, where A is a user-supplied n x n sparse matrix, b is a user-supplied vector of length n, and x is a vector of length n to be computed. AZTEC is intended as a software tool for users who want to avoid cumbersome parallel programming details but who have large sparse linear systems that require an efficiently utilized parallel processing system. A collection of data transformation tools is provided that allows easy creation of distributed sparse unstructured matrices for parallel solution.
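A serial stand-in for the problem AZTEC targets, solving Ax = b for a sparse symmetric positive definite A with a Krylov iteration, might look as follows; SciPy's conjugate gradient is used purely for illustration, and AZTEC's own solvers and distributed data structures are not shown.

```python
# Illustrative serial solve of a sparse linear system Ax = b with conjugate gradients.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

n = 10000
# 1D Poisson matrix: symmetric positive definite, tridiagonal.
A = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

x, info = cg(A, b)                       # info == 0 indicates convergence
print(info, np.linalg.norm(A @ x - b))   # residual norm
```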
Magnetic and gravity anomalies in the Americas
NASA Technical Reports Server (NTRS)
Braile, L. W.; Hinze, W. J.; Vonfrese, R. R. B. (Principal Investigator)
1981-01-01
The cleaning and magnetic tape storage of spherical Earth processing programs are reported. These programs include: NVERTSM which inverts total or vector magnetic anomaly data on a distribution of point dipoles in spherical coordinates; SMFLD which utilizes output from NVERTSM to compute total or vector magnetic anomaly fields for a distribution of point dipoles in spherical coordinates; NVERTG; and GFLD. Abstracts are presented for papers dealing with the mapping and modeling of magnetic and gravity anomalies, and with the verification of crustal components in satellite data.
Some Problems and Solutions in Transferring Ecosystem Simulation Codes to Supercomputers
NASA Technical Reports Server (NTRS)
Skiles, J. W.; Schulbach, C. H.
1994-01-01
Many computer codes for the simulation of ecological systems have been developed in the last twenty-five years. This development took place initially on main-frame computers, then mini-computers, and more recently, on micro-computers and workstations. Recent recognition of ecosystem science as a High Performance Computing and Communications Program Grand Challenge area emphasizes supercomputers (both parallel and distributed systems) as the next set of tools for ecological simulation. Transferring ecosystem simulation codes to such systems is not a matter of simply compiling and executing existing code on the supercomputer since there are significant differences in the system architectures of sequential, scalar computers and parallel and/or vector supercomputers. To more appropriately match the application to the architecture (necessary to achieve reasonable performance), the parallelism (if it exists) of the original application must be exploited. We discuss our work in transferring a general grassland simulation model (developed on a VAX in the FORTRAN computer programming language) to a Cray Y-MP. We show the Cray shared-memory vector-architecture, and discuss our rationale for selecting the Cray. We describe porting the model to the Cray and executing and verifying a baseline version, and we discuss the changes we made to exploit the parallelism in the application and to improve code execution. As a result, the Cray executed the model 30 times faster than the VAX 11/785 and 10 times faster than a Sun 4 workstation. We achieved an additional speed-up of approximately 30 percent over the original Cray run by using the compiler's vectorizing capabilities and the machine's ability to put subroutines and functions "in-line" in the code. With the modifications, the code still runs at only about 5% of the Cray's peak speed because it makes ineffective use of the vector processing capabilities of the Cray. We conclude with a discussion and future plans.
Discrimination of malignant lymphomas and leukemia using Radon transform based-higher order spectra
NASA Astrophysics Data System (ADS)
Luo, Yi; Celenk, Mehmet; Bejai, Prashanth
2006-03-01
A new algorithm that can be used to automatically recognize and classify malignant lymphomas and leukemia is proposed in this paper. The algorithm utilizes morphological watersheds to obtain the boundaries of cells from cell images and isolate them from the surrounding background. The areas of cells are extracted from cell images after background subtraction. The Radon transform and higher-order spectra (HOS) analysis are utilized as image processing tools to generate class feature vectors for different cell types and to extract the feature vectors of test cells. The test cells' feature vectors are then compared with the known class feature vectors for a possible match by computing the Euclidean distances. The cell in question is classified as belonging to one of the existing cell classes in the least-Euclidean-distance sense.
Estimation of the laser cutting operating cost by support vector regression methodology
NASA Astrophysics Data System (ADS)
Jović, Srđan; Radović, Aleksandar; Šarkoćević, Živče; Petković, Dalibor; Alizamir, Meysam
2016-09-01
Laser cutting is a popular manufacturing process utilized to cut various types of materials economically. The operating cost is affected by laser power, cutting speed, assist gas pressure, nozzle diameter and focus point position, as well as the workpiece material. In this article, the process factors investigated were laser power, cutting speed, air pressure and focal point position. The aim of this work is to relate the operating cost to the process parameters mentioned above. CO2 laser cutting of stainless steel of medical grade AISI316L has been investigated. The main goal was to analyze the operating cost through the laser power, cutting speed, air pressure, focal point position and material thickness. Since estimating the laser operating cost is a complex, non-linear task, soft computing optimization algorithms can be used. The intelligent soft-computing scheme support vector regression (SVR) was implemented. The performance of the proposed estimator was confirmed with simulation results. The SVR results are then compared with artificial neural networks and genetic programming. According to the results, a greater improvement in estimation accuracy can be achieved through SVR compared to the other soft computing methodologies. The new optimization methods benefit from the soft computing capabilities of global optimization and multiobjective optimization rather than choosing a starting point by trial and error and combining multiple criteria into a single criterion.
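A hedged sketch of the regression setup is given below using scikit-learn's SVR on synthetic data standing in for the laser-cutting measurements; the feature ranges, the synthetic cost function and the hyperparameters are assumptions, not values from the article.

```python
# Illustrative SVR fit on synthetic stand-in data for operating-cost estimation.
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Columns: laser power, cutting speed, air pressure, focal position, thickness (assumed ranges).
X = rng.uniform([1.0, 0.5, 4.0, -2.0, 1.0], [4.0, 5.0, 12.0, 2.0, 10.0], size=(200, 5))
cost = 0.8 * X[:, 0] + 2.0 / X[:, 1] + 0.1 * X[:, 4] + rng.normal(0, 0.05, 200)  # synthetic cost

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.01))
model.fit(X, cost)
print(model.predict(X[:3]))
```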
Processing EOS MLS Level-2 Data
NASA Technical Reports Server (NTRS)
Snyder, W. Van; Wu, Dong; Read, William; Jiang, Jonathan; Wagner, Paul; Livesey, Nathaniel; Schwartz, Michael; Filipiak, Mark; Pumphrey, Hugh; Shippony, Zvi
2006-01-01
A computer program performs level-2 processing of thermal-microwave-radiance data from observations of the limb of the Earth by the Earth Observing System (EOS) Microwave Limb Sounder (MLS). The purpose of the processing is to estimate the composition and temperature of the atmosphere versus altitude from about 8 to 90 km. "Level-2" as used here is a specialist's term signifying both vertical profiles of geophysical parameters along the measurement track of the instrument and the processing performed by this or other software to generate such profiles. Designed to be flexible, the program is controlled via a configuration file that defines all aspects of processing, including the contents of state and measurement vectors, configurations of forward models, measurement and calibration data to be read, and the manner of inverting the models to obtain the desired estimates. The program can operate in a parallel form in which one instance of the program acts as a master, coordinating the work of multiple slave instances on a cluster of computers, each slave operating on a portion of the data. Optionally, the configuration file can instruct the software to produce files of simulated radiances based on state vectors formed from sets of geophysical data-product files taken as input.
Method and system for determining precursors of health abnormalities from processing medical records
None, None
2013-06-25
Medical reports are converted to document vectors in computing apparatus and sampled by applying a maximum variation sampling function including a fitness function to the document vectors to reduce a number of medical records being processed and to increase the diversity of the medical records being processed. Linguistic phrases are extracted from the medical records and converted to s-grams. A Haar wavelet function is applied to the s-grams over the preselected time interval; and the coefficient results of the Haar wavelet function are examined for patterns representing the likelihood of health abnormalities. This confirms certain s-grams as precursors of the health abnormality and a parameter can be calculated in relation to the occurrence of such a health abnormality.
Computer Simulation of Diffraction Patterns.
ERIC Educational Resources Information Center
Dodd, N. A.
1983-01-01
Describes an Apple computer program (listing available from author) which simulates Fraunhofer and Fresnel diffraction using vector addition techniques (vector chaining) and allows user to experiment with different shaped multiple apertures. Graphics output include vector resultants, phase difference, diffraction patterns, and the Cornu spiral…
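The phasor-addition (vector chaining) idea can be sketched for a single slit: the amplitude at each observation angle is the resultant of many small phasors, one per element of the aperture. The slit width, wavelength and source count below are illustrative assumptions, and this is a NumPy sketch rather than the Apple program described above.

```python
# Fraunhofer single-slit pattern by chaining phasors tip-to-tail.
import numpy as np

wavelength = 500e-9          # m
slit_width = 50e-6           # m
n_sources = 200              # sub-sources across the slit
y = np.linspace(-slit_width / 2, slit_width / 2, n_sources)

angles = np.radians(np.linspace(-2, 2, 1001))
intensity = []
for theta in angles:
    phases = 2 * np.pi * y * np.sin(theta) / wavelength
    resultant = np.sum(np.exp(1j * phases))          # chain the phasors
    intensity.append(abs(resultant) ** 2)
intensity = np.array(intensity) / max(intensity)     # normalized diffraction pattern
print(intensity[::200])
```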
System for Automated Calibration of Vector Modulators
NASA Technical Reports Server (NTRS)
Lux, James; Boas, Amy; Li, Samuel
2009-01-01
Vector modulators are used to impose baseband modulation on RF signals, but non-ideal behavior limits the overall performance. The non-ideal behavior of the vector modulator is compensated using data collected with the use of an automated test system driven by a LabVIEW program that systematically applies thousands of control-signal values to the device under test and collects RF measurement data. The technology innovation automates several steps in the process. First, an automated test system, using computer controlled digital-to-analog converters (DACs) and a computer-controlled vector network analyzer (VNA) systematically can apply different I and Q signals (which represent the complex number by which the RF signal is multiplied) to the vector modulator under test (VMUT), while measuring the RF performance specifically, gain and phase. The automated test system uses the LabVIEW software to control the test equipment, collect the data, and write it to a file. The input to the Lab - VIEW program is either user-input for systematic variation, or is provided in a file containing specific test values that should be fed to the VMUT. The output file contains both the control signals and the measured data. The second step is to post-process the file to determine the correction functions as needed. The result of the entire process is a tabular representation, which allows translation of a desired I/Q value to the required analog control signals to produce a particular RF behavior. In some applications, corrected performance is needed only for a limited range. If the vector modulator is being used as a phase shifter, there is only a need to correct I and Q values that represent points on a circle, not the entire plane. This innovation has been used to calibrate 2-GHz MMIC (monolithic microwave integrated circuit) vector modulators in the High EIRP Cluster Array project (EIRP is high effective isotropic radiated power). These calibrations were then used to create correction tables to allow the commanding of the phase shift in each of four channels used as a phased array for beam steering of a Ka-band (32-GHz) signal. The system also was the basis of a breadboard electronic beam steering system. In this breadboard, the goal was not to make systematic measurements of the properties of a vector modulator, but to drive the breadboard with a series of test patterns varying in phase and amplitude. This is essentially the same calibration process, but with the difference that the data collection process is oriented toward collecting breadboard performance, rather than the measurement of output from a network analyzer.
Statistical properties of color-signal spaces.
Lenz, Reiner; Bui, Thanh Hai
2005-05-01
In applications of principal component analysis (PCA) it has often been observed that the eigenvector with the largest eigenvalue has only nonnegative entries when the vectors of the underlying stochastic process have only nonnegative values. This has been used to show that the coordinate vectors in PCA are all located in a cone. We prove that the nonnegativity of the first eigenvector follows from the Perron-Frobenius (and Krein-Rutman) theory. Experiments also show that for stochastic processes with nonnegative signals the mean vector is often very similar to the first eigenvector. This is not true in general, but we first give a heuristic explanation of why we can expect such a similarity. We then derive a connection between the dominance of the first eigenvalue and the similarity between the mean and the first eigenvector, and show how to check the relative size of the first eigenvalue without actually computing it. In the last part of the paper we discuss the implications of the theoretical results for multispectral color processing.
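A quick numerical illustration of the observation is possible with uncentered PCA on nonnegative synthetic data (an assumption standing in for real color signals): the entrywise-nonnegative second-moment matrix has a nonnegative leading eigenvector, which is close in direction to the mean.

```python
# Leading eigenvector of the (uncentered) second-moment matrix vs. the mean vector.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(5000, 31))          # nonnegative stand-in "spectra"
mean = X.mean(axis=0)

S = X.T @ X / len(X)                                 # second-moment matrix, entrywise nonnegative
eigvals, eigvecs = np.linalg.eigh(S)
v1 = eigvecs[:, -1]                                  # eigenvector of the largest eigenvalue
v1 = v1 if v1.sum() >= 0 else -v1                    # fix the arbitrary sign

cos_sim = mean @ v1 / (np.linalg.norm(mean) * np.linalg.norm(v1))
print(np.all(v1 >= -1e-12), cos_sim)                 # nonnegative entries, high similarity
```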
NASA Astrophysics Data System (ADS)
Mills, R. T.
2014-12-01
As the high performance computing (HPC) community pushes towards the exascale horizon, the importance and prevalence of fine-grained parallelism in new computer architectures is increasing. This is perhaps most apparent in the proliferation of so-called "accelerators" such as the Intel Xeon Phi or NVIDIA GPGPUs, but the trend also holds for CPUs, where serial performance has grown slowly and effective use of hardware threads and vector units are becoming increasingly important to realizing high performance. This has significant implications for weather, climate, and Earth system modeling codes, many of which display impressive scalability across MPI ranks but take relatively little advantage of threading and vector processing. In addition to increasing parallelism, next generation codes will also need to address increasingly deep hierarchies for data movement: NUMA/cache levels, on node vs. off node, local vs. wide neighborhoods on the interconnect, and even in the I/O system. We will discuss some approaches (grounded in experiences with the Intel Xeon Phi architecture) for restructuring Earth science codes to maximize concurrency across multiple levels (vectors, threads, MPI ranks), and also discuss some novel approaches for minimizing expensive data movement/communication.
NASA Technical Reports Server (NTRS)
Habiby, Sarry F.
1987-01-01
The design and implementation of a digital (numerical) optical matrix-vector multiplier are presented. The objective is to demonstrate the operation of an optical processor designed to minimize computation time in performing a practical computing application. This is done by using the large array of processing elements in a Hughes liquid crystal light valve, and relying on the residue arithmetic representation, a holographic optical memory, and position coded optical look-up tables. In the design, all operations are performed in effectively one light valve response time regardless of matrix size. The features of the design allowing fast computation include the residue arithmetic representation, the mapping approach to computation, and the holographic memory. In addition, other features of the work include a practical light valve configuration for efficient polarization control, a model for recording multiple exposures in silver halides with equal reconstruction efficiency, and using light from an optical fiber for a reference beam source in constructing the hologram. The design can be extended to implement larger matrix arrays without increasing computation time.
Decentralized Dimensionality Reduction for Distributed Tensor Data Across Sensor Networks.
Liang, Junli; Yu, Guoyang; Chen, Badong; Zhao, Minghua
2016-11-01
This paper develops a novel decentralized dimensionality reduction algorithm for the distributed tensor data across sensor networks. The main contributions of this paper are as follows. First, conventional centralized methods, which utilize entire data to simultaneously determine all the vectors of the projection matrix along each tensor mode, are not suitable for the network environment. Here, we relax the simultaneous processing manner into the one-vector-by-one-vector (OVBOV) manner, i.e., determining the projection vectors (PVs) related to each tensor mode one by one. Second, we prove that in the OVBOV manner each PV can be determined without modifying any tensor data, which simplifies corresponding computations. Third, we cast the decentralized PV determination problem as a set of subproblems with consensus constraints, so that it can be solved in the network environment only by local computations and information communications among neighboring nodes. Fourth, we introduce the null space and transform the PV determination problem with complex orthogonality constraints into an equivalent hidden convex one without any orthogonality constraint, which can be solved by the Lagrange multiplier method. Finally, experimental results are given to show that the proposed algorithm is an effective dimensionality reduction scheme for the distributed tensor data across the sensor networks.
Field applications of stand-off sensing using visible/NIR multivariate optical computing
NASA Astrophysics Data System (ADS)
Eastwood, DeLyle; Soyemi, Olusola O.; Karunamuni, Jeevanandra; Zhang, Lixia; Li, Hongli; Myrick, Michael L.
2001-02-01
A novel multivariate visible/NIR optical computing approach applicable to standoff sensing will be demonstrated with porphyrin mixtures as examples. The ultimate goal is to develop environmental or counter-terrorism sensors for chemicals such as organophosphorus (OP) pesticides or chemical warfare simulants in the near infrared spectral region. The mathematical operation that characterizes prediction of properties via regression from optical spectra is a calculation of inner products between the spectrum and the pre-determined regression vector. The result is scaled appropriately and offset to correspond to the basis from which the regression vector is derived. The process involves collecting spectroscopic data and synthesizing a multivariate vector using a pattern recognition method. Then, an interference coating is designed that reproduces the pattern of the multivariate vector in its transmission or reflection spectrum, and appropriate interference filters are fabricated. High and low refractive index materials such as Nb2O5 and SiO2 are excellent choices for the visible and near infrared regions. The proof of concept has now been established for this system in the visible and will later be extended to chemicals such as OP compounds in the near and mid-infrared.
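The core operation named above, prediction as a scaled inner product of the measured spectrum with a pre-determined regression vector plus an offset, is only a dot product; the sketch below uses an illustrative regression vector and spectrum, not the paper's filter designs.

```python
# Prediction as a scaled, offset inner product of spectrum and regression vector.
import numpy as np

wavelengths = np.linspace(400, 1000, 301)                  # nm
regression_vector = np.sin(wavelengths / 80.0)             # stand-in for the designed pattern
scale, offset = 0.01, 2.5                                  # illustrative calibration constants

spectrum = np.exp(-((wavelengths - 650) / 60.0) ** 2)      # stand-in measured spectrum
prediction = scale * np.dot(spectrum, regression_vector) + offset
print(prediction)
```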
Vectorized image segmentation via trixel agglomeration
Prasad, Lakshman [Los Alamos, NM; Skourikhine, Alexei N [Los Alamos, NM
2006-10-24
A computer implemented method transforms an image comprised of pixels into a vectorized image specified by a plurality of polygons that can be subsequently used to aid in image processing and understanding. The pixelated image is processed to extract edge pixels that separate different colors and a constrained Delaunay triangulation of the edge pixels forms a plurality of triangles having edges that cover the pixelated image. A color for each one of the plurality of triangles is determined from the color pixels within each triangle. A filter is formed with a set of grouping rules related to features of the pixelated image and applied to the plurality of triangle edges to merge adjacent triangles consistent with the filter into polygons having a plurality of vertices. The pixelated image may be then reformed into an array of the polygons, that can be represented collectively and efficiently by standard vector image.
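The first steps of the claim, taking edge-pixel coordinates and triangulating them, can be sketched with SciPy; note that SciPy provides an unconstrained Delaunay triangulation, whereas the method uses a constrained one, and the edge pixels below are random stand-ins.

```python
# Delaunay triangulation over a set of edge-pixel coordinates (trixel construction step).
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(0)
edge_pixels = rng.random((500, 2)) * 255.0   # stand-in edge-pixel (x, y) coordinates

tri = Delaunay(edge_pixels)
print(tri.simplices.shape)                   # (n_triangles, 3): vertex indices of each trixel
```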
Vectorized Rebinning Algorithm for Fast Data Down-Sampling
NASA Technical Reports Server (NTRS)
Dean, Bruce; Aronstein, David; Smith, Jeffrey
2013-01-01
A vectorized rebinning (down-sampling) algorithm, applicable to N-dimensional data sets, has been developed that offers a significant reduction in computer run time when compared to conventional rebinning algorithms. For clarity, a two-dimensional version of the algorithm is discussed to illustrate some specific details of the algorithm content, and using the language of image processing, 2D data will be referred to as "images," and each value in an image as a "pixel." The new approach is fully vectorized, i.e., the down-sampling procedure is done as a single step over all image rows, and then as a single step over all image columns. Data rebinning (or down-sampling) is a procedure that uses a discretely sampled N-dimensional data set to create a representation of the same data, but with fewer discrete samples. Such data down-sampling is fundamental to digital signal processing, e.g., for data compression applications.
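A minimal NumPy version of the idea, one reshape/mean over all rows and then over all columns with no per-pixel loops, might look like the following sketch; block-averaging is assumed as the rebinning rule, and the flight algorithm's exact conventions are not reproduced.

```python
# Vectorized 2D rebinning (down-sampling) by integer block averaging.
import numpy as np

def rebin(image, block):
    """Down-sample `image` by integer factors `block = (by, bx)` using block averages."""
    by, bx = block
    ny, nx = image.shape
    assert ny % by == 0 and nx % bx == 0
    return image.reshape(ny // by, by, nx // bx, bx).mean(axis=(1, 3))

img = np.arange(16.0).reshape(4, 4)
print(rebin(img, (2, 2)))
# [[ 2.5  4.5]
#  [10.5 12.5]]
```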
Extending the length and time scales of Gram-Schmidt Lyapunov vector computations
NASA Astrophysics Data System (ADS)
Costa, Anthony B.; Green, Jason R.
2013-08-01
Lyapunov vectors have found growing interest recently due to their ability to characterize systems out of thermodynamic equilibrium. The computation of orthogonal Gram-Schmidt vectors requires multiplication and QR decomposition of large matrices, which grow as N^2 (with the particle count N). This expense has limited such calculations to relatively small systems and short time scales. Here, we detail two implementations of an algorithm for computing Gram-Schmidt vectors. The first is a distributed-memory message-passing method using Scalapack. The second uses the newly released MAGMA library for GPUs. We compare the performance of both codes for Lennard-Jones fluids from N=100 to 1300 between Intel Nehalem/Infiniband DDR and NVIDIA C2050 architectures. To our best knowledge, these are the largest systems for which the Gram-Schmidt Lyapunov vectors have been computed, and the first time their calculation has been GPU-accelerated. We conclude that Lyapunov vector calculations can be significantly extended in length and time by leveraging the power of GPU-accelerated linear algebra.
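The repeated QR (Gram-Schmidt) step that dominates the cost can be sketched for a toy linear system; the Jacobian below is an arbitrary stand-in, so the code only illustrates the evolve/re-orthonormalize/accumulate loop, not a Lennard-Jones simulation.

```python
# Schematic of the evolve / QR-reorthonormalize / accumulate loop for Lyapunov exponents.
import numpy as np

rng = np.random.default_rng(0)
dim, steps, dt = 6, 5000, 0.01
J = np.diag([0.9, 0.5, 0.1, -0.1, -0.5, -0.9]) + 0.05 * rng.standard_normal((dim, dim))

Q = np.linalg.qr(rng.standard_normal((dim, dim)))[0]    # initial orthonormal tangent vectors
log_r = np.zeros(dim)
for _ in range(steps):
    Q = (np.eye(dim) + dt * J) @ Q                      # evolve tangent vectors one step
    Q, R = np.linalg.qr(Q)                              # Gram-Schmidt reorthonormalization
    log_r += np.log(np.abs(np.diag(R)))
print(log_r / (steps * dt))                             # Lyapunov-exponent estimates
```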
Adaptation of a program for nonlinear finite element analysis to the CDC STAR 100 computer
NASA Technical Reports Server (NTRS)
Pifko, A. B.; Ogilvie, P. L.
1978-01-01
The conversion of a nonlinear finite element program to the CDC STAR 100 pipeline computer is discussed. The program, called DYCAST, was developed for the crash simulation of structures. Initial results with the STAR 100 computer indicated that significant gains in computation time are possible for operations on global arrays. However, for element-level computations that do not lend themselves easily to long vector processing, the STAR 100 was slower than comparable scalar computers. On this basis it is concluded that in order for pipeline computers to impact the economic feasibility of large nonlinear analyses, it is absolutely essential that algorithms be devised to improve the efficiency of element-level computations.
Multi-color incomplete Cholesky conjugate gradient methods for vector computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Poole, E.L.
1986-01-01
This research is concerned with the solution on vector computers of linear systems of equations Ax = b, where A is a large, sparse symmetric positive definite matrix with non-zero elements lying only along a few diagonals of the matrix. The system is solved using the incomplete Cholesky conjugate gradient method (ICCG). Multi-color orderings of the unknowns in the linear system are used to obtain p-color matrices for which a no-fill block ICCG method is implemented on the CYBER 205 with O(N/p)-length vector operations in both the decomposition of A and, more importantly, in the forward and back solves necessary at each iteration of the method (N is the number of unknowns and p is a small constant). A p-colored matrix is a matrix that can be partitioned into a p x p block matrix where the diagonal blocks are diagonal matrices. The matrix is stored by diagonals, and matrix multiplication by diagonals is used to carry out the decomposition of A and the forward and back solves. Additionally, if the vectors across adjacent blocks line up, then some of the overhead associated with vector startups can be eliminated in the matrix-vector multiplication necessary at each conjugate gradient iteration. Necessary and sufficient conditions are given to determine which multi-color orderings of the unknowns correspond to p-color matrices, and a process is indicated for choosing multi-color orderings.
A vectorization of the Hess McDonnell Douglas potential flow program NUED for the STAR-100 computer
NASA Technical Reports Server (NTRS)
Boney, L. R.; Smith, R. E., Jr.
1979-01-01
The computer program NUED for analyzing potential flow about arbitrary three dimensional lifting bodies using the panel method was modified to use vector operations and run on the STAR-100 computer. A high speed of computation and ability to approximate the body surface with a large number of panels are characteristics of NUEDV. The new program shows that vector operations can be readily implemented in programs of this type to increase the computational speed on the STAR-100 computer. The virtual memory architecture of the STAR-100 facilitates the use of large numbers of panels to approximate the body surface.
The International Conference on Vector and Parallel Computing (2nd)
1989-01-17
The proceedings include papers such as "Computation of the SVD of Bidiagonal Matrices" and "Lattice QCD as a Large Scale Scientific Computation"; the latter describes code vectorized for the IBM 3090 Vector Facility, with elapsed times further reduced on the 3090, and reports Lattice QCD benchmarked on a large number of computers, including the Cray X-MP and Cray 2.
Early error detection: an action-research experience teaching vector calculus
NASA Astrophysics Data System (ADS)
Magdalena Añino, María; Merino, Gabriela; Miyara, Alberto; Perassi, Marisol; Ravera, Emiliano; Pita, Gustavo; Waigandt, Diana
2014-04-01
This paper describes an action-research experience carried out with second year students at the School of Engineering of the National University of Entre Ríos, Argentina. Vector calculus students played an active role in their own learning process. They were required to present weekly reports, in both oral and written forms, on the topics studied, instead of merely sitting and watching as the teacher solved problems on the blackboard. The students were also asked to perform computer assignments, and their learning process was continuously monitored. Among many benefits, this methodology has allowed students and teachers to identify errors and misconceptions that might have gone unnoticed under a more passive approach.
Computational Study of Fluidic Thrust Vectoring using Separation Control in a Nozzle
NASA Technical Reports Server (NTRS)
Deere, Karen; Berrier, Bobby L.; Flamm, Jeffrey D.; Johnson, Stuart K.
2003-01-01
A computational investigation of a two-dimensional nozzle was completed to assess the use of fluidic injection to manipulate flow separation and cause thrust vectoring of the primary jet thrust. The nozzle was designed with a recessed cavity to enhance the throat shifting method of fluidic thrust vectoring. The structured-grid computational fluid dynamics code PAB3D was used to guide the design and analyze over 60 configurations. Nozzle design variables included cavity convergence angle, cavity length, fluidic injection angle, upstream minimum height, aft deck angle, and aft deck shape. All simulations were computed with a static freestream Mach number of 0.05, a nozzle pressure ratio of 3.858, and a fluidic injection flow rate equal to 6 percent of the primary flow rate. Results indicate that the recessed cavity enhances the throat shifting method of fluidic thrust vectoring and allows for greater thrust-vector angles without compromising thrust efficiency.
NASA Astrophysics Data System (ADS)
Lee, Byungjin; Lee, Young Jae; Sung, Sangkyung
2018-05-01
A novel attitude determination method that is computationally efficient and implementable on low-cost sensor and embedded platforms is investigated. A recent result on attitude reference system design is adapted to further develop a three-dimensional attitude determination algorithm based on relative velocity incremental measurements. For this, velocity incremental vectors, computed respectively from INS and GPS with different update rates, are compared to generate the filter measurement for attitude estimation. In the quaternion-based Kalman filter configuration, an Euler-like attitude perturbation angle is introduced to reduce the filter states and simplify the propagation processes. Furthermore, assuming a small-angle approximation between attitude update periods, it is shown that the reduced-order filter greatly simplifies the propagation processes. For performance verification, both simulation and experimental studies were completed. A low-cost MEMS IMU and GPS receiver were employed for system integration, and comparison with the true trajectory or a high-grade navigation system demonstrates the performance of the proposed algorithm.
Abad-Franch, Fernando; Valença-Barbosa, Carolina; Sarquis, Otília; Lima, Marli M.
2014-01-01
Background Vector-borne diseases are major public health concerns worldwide. For many of them, vector control is still key to primary prevention, with control actions planned and evaluated using vector occurrence records. Yet vectors can be difficult to detect, and vector occurrence indices will be biased whenever spurious detection/non-detection records arise during surveys. Here, we investigate the process of Chagas disease vector detection, assessing the performance of the surveillance method used in most control programs – active triatomine-bug searches by trained health agents. Methodology/Principal Findings Control agents conducted triplicate vector searches in 414 man-made ecotopes of two rural localities. Ecotope-specific ‘detection histories’ (vectors or their traces detected or not in each individual search) were analyzed using ordinary methods that disregard detection failures and multiple detection-state site-occupancy models that accommodate false-negative and false-positive detections. Mean (±SE) vector-search sensitivity was ∼0.283±0.057. Vector-detection odds increased as bug colonies grew denser, and were lower in houses than in most peridomestic structures, particularly woodpiles. False-positive detections (non-vector fecal streaks misidentified as signs of vector presence) occurred with probability ∼0.011±0.008. The model-averaged estimate of infestation (44.5±6.4%) was ∼2.4–3.9 times higher than naïve indices computed assuming perfect detection after single vector searches (11.4–18.8%); about 106–137 infestation foci went undetected during such standard searches. Conclusions/Significance We illustrate a relatively straightforward approach to addressing vector detection uncertainty under realistic field survey conditions. Standard vector searches had low sensitivity except in certain singular circumstances. Our findings suggest that many infestation foci may go undetected during routine surveys, especially when vector density is low. Undetected foci can cause control failures and induce bias in entomological indices; this may confound disease risk assessment and mislead program managers into flawed decision making. By helping correct bias in naïve indices, the approach we illustrate has potential to critically strengthen vector-borne disease control-surveillance systems. PMID:25233352
Parallel processing in finite element structural analysis
NASA Technical Reports Server (NTRS)
Noor, Ahmed K.
1987-01-01
A brief review is made of the fundamental concepts and basic issues of parallel processing. Discussion focuses on parallel numerical algorithms, performance evaluation of machines and algorithms, and parallelism in finite element computations. A computational strategy is proposed for maximizing the degree of parallelism at different levels of the finite element analysis process including: 1) formulation level (through the use of mixed finite element models); 2) analysis level (through additive decomposition of the different arrays in the governing equations into the contributions to a symmetrized response plus correction terms); 3) numerical algorithm level (through the use of operator splitting techniques and application of iterative processes); and 4) implementation level (through the effective combination of vectorization, multitasking and microtasking, whenever available).
NASA Astrophysics Data System (ADS)
Anagnostopoulos, Christos Nikolaos; Vovoli, Eftichia
An emotion recognition framework based on sound processing could improve services in human-computer interaction. Various quantitative speech features obtained from sound processing of acted speech were tested as to whether they are sufficient to discriminate between seven emotions. Multilayered perceptrons were trained to classify gender and emotions on the basis of a 24-input vector, which provides information about the prosody of the speaker over the entire sentence using statistics of sound features. Several experiments were performed and the results are presented analytically. Emotion recognition was successful when speakers and utterances were “known” to the classifier. However, severe misclassifications occurred in the utterance-independent framework. Nevertheless, the proposed feature vector achieved promising results for utterance-independent recognition of high- and low-arousal emotions.
CUDAICA: GPU Optimization of Infomax-ICA EEG Analysis
Raimondo, Federico; Kamienkowski, Juan E.; Sigman, Mariano; Fernandez Slezak, Diego
2012-01-01
In recent years, Independent Component Analysis (ICA) has become a standard to identify relevant dimensions of the data in neuroscience. ICA is a very reliable method to analyze data but it is computationally very costly, which makes its use for online analysis of data, as in brain-computer interfaces, almost completely prohibitive. We show a roughly 25-fold speedup of ICA at almost no cost (a commodity video card). EEG data, which consist of repetitions of many independent signals across multiple channels, are very well suited to processing on the vector processors included in graphical units. We profiled the implementation of this algorithm and detected two main types of operations responsible for the processing bottleneck, taking almost 80% of computing time: vector-matrix and matrix-matrix multiplications. Replacing calls to basic linear algebra functions with the standard CUBLAS routines provided by GPU manufacturers did not increase performance, due to CUDA kernel launch overhead. Instead, we developed a GPU-based solution that, compared with the original BLAS and CUBLAS versions, obtains a 25x increase of performance for the ICA calculation. PMID:22811699
Extending the length and time scales of Gram–Schmidt Lyapunov vector computations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Costa, Anthony B., E-mail: acosta@northwestern.edu; Green, Jason R., E-mail: jason.green@umb.edu; Department of Chemistry, University of Massachusetts Boston, Boston, MA 02125
Lyapunov vectors have found growing interest recently due to their ability to characterize systems out of thermodynamic equilibrium. The computation of orthogonal Gram–Schmidt vectors requires multiplication and QR decomposition of large matrices, which grow as N² with the particle count N. This expense has limited such calculations to relatively small systems and short time scales. Here, we detail two implementations of an algorithm for computing Gram–Schmidt vectors. The first is a distributed-memory message-passing method using ScaLAPACK. The second uses the newly-released MAGMA library for GPUs. We compare the performance of both codes for Lennard–Jones fluids from N=100 to 1300 between Intel Nehalem/Infiniband DDR and NVIDIA C2050 architectures. To our best knowledge, these are the largest systems for which the Gram–Schmidt Lyapunov vectors have been computed, and the first time their calculation has been GPU-accelerated. We conclude that Lyapunov vector calculations can be significantly extended in length and time by leveraging the power of GPU-accelerated linear algebra.
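For readers unfamiliar with the method, the core of a Gram–Schmidt Lyapunov vector calculation is the repeated re-orthonormalization of the evolved tangent vectors. The sketch below shows that step in NumPy; the function name, the toy 6-D system, and the random Jacobian are illustrative only and are not taken from the paper.

```python
import numpy as np

def gram_schmidt_step(tangent_vectors):
    """Re-orthonormalize a set of tangent vectors (columns) via QR.

    Returns the orthonormal basis and the log-norms whose time averages
    give the Lyapunov exponents. A minimal sketch; production codes
    (ScaLAPACK/MAGMA) distribute or GPU-accelerate this QR step."""
    q, r = np.linalg.qr(tangent_vectors)
    log_norms = np.log(np.abs(np.diag(r)))
    # Fix signs so the diagonal of R is positive (a conventional choice).
    q *= np.sign(np.diag(r))
    return q, log_norms

# Usage: evolve the tangent vectors with the system Jacobian, then reorthonormalize.
rng = np.random.default_rng(0)
v = rng.standard_normal((6, 6))                       # 6 tangent vectors, toy 6-D system
jacobian = np.eye(6) + 0.01 * rng.standard_normal((6, 6))
v, logs = gram_schmidt_step(jacobian @ v)
```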
On the Impact of Widening Vector Registers on Sequence Alignment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Daily, Jeffrey A.; Kalyanaraman, Anantharaman; Krishnamoorthy, Sriram
2016-09-22
Vector extensions, such as SSE, have been part of the x86 since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. In this paper, we demonstrate that the trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based on striped data layouts. We present a practically efficient SIMD implementation of a parallel scan based sequence alignment algorithm that can better exploit wider SIMD units. We conduct comprehensive workload and use case analyses to characterize the relative behavior of the striped and scan approaches and identify the best choice of algorithm based on input length and SIMD width.
Virus Database and Online Inquiry System Based on Natural Vectors.
Dong, Rui; Zheng, Hui; Tian, Kun; Yau, Shek-Chung; Mao, Weiguang; Yu, Wenping; Yin, Changchuan; Yu, Chenglong; He, Rong Lucy; Yang, Jie; Yau, Stephen St
2017-01-01
We construct a virus database called VirusDB (http://yaulab.math.tsinghua.edu.cn/VirusDB/) and an online inquiry system to serve people who are interested in viral classification and prediction. The database stores all viral genomes, their corresponding natural vectors, and the classification information of the single/multiple-segmented viral reference sequences downloaded from the National Center for Biotechnology Information. The online inquiry system serves the purpose of computing natural vectors and their distances based on submitted genomes, providing an online interface for accessing and using the database for viral classification and prediction, and back-end processes for automatic and manual updating of database content to synchronize with GenBank. Submitted genome data in FASTA format are processed, and the prediction results with the 5 closest neighbors and their classifications are returned by email. Considering the one-to-one correspondence between sequence and natural vector, its time efficiency, and its high accuracy, the natural vector approach is a significant advance over alignment methods, which makes VirusDB a useful database for further research.
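As an illustration of the natural vector idea, the sketch below computes a 12-dimensional natural vector (per-base count, mean position, and a normalized second central moment) for a short DNA string. The normalization shown follows one published convention and is an assumption; it is not necessarily the exact formula used by VirusDB.

```python
import numpy as np

def natural_vector(seq):
    """12-dimensional natural vector of a DNA sequence: for each base,
    the count, the mean position, and a normalized second central moment.
    A minimal sketch; the exact moment normalization varies across papers."""
    seq = seq.upper()
    n = len(seq)
    nv = []
    for base in "ACGT":
        pos = np.array([i + 1 for i, c in enumerate(seq) if c == base], dtype=float)
        n_k = len(pos)
        mu_k = pos.mean() if n_k else 0.0
        d2_k = ((pos - mu_k) ** 2).sum() / (n_k * n) if n_k else 0.0
        nv.extend([n_k, mu_k, d2_k])
    return np.array(nv)

# Classification then reduces to nearest-neighbor search on these vectors,
# e.g. under Euclidean distance.
print(natural_vector("ACGTACGTGG"))
```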
A Computational Study of a New Dual Throat Fluidic Thrust Vectoring Nozzle Concept
NASA Technical Reports Server (NTRS)
Deere, Karen A.; Berrier, Bobby L.; Flamm, Jeffrey D.; Johnson, Stuart K.
2005-01-01
A computational investigation of a two-dimensional nozzle was completed to assess the use of fluidic injection to manipulate flow separation and cause thrust vectoring of the primary jet thrust. The nozzle was designed with a recessed cavity to enhance the throat shifting method of fluidic thrust vectoring. Several design cycles with the structured-grid, computational fluid dynamics code PAB3D and with experiments in the NASA Langley Research Center Jet Exit Test Facility have been completed to guide the nozzle design and analyze performance. This paper presents computational results on potential design improvements for the best experimental configuration tested to date. Nozzle design variables included cavity divergence angle, cavity convergence angle, and upstream throat height. Pulsed fluidic injection was also investigated for its ability to decrease mass flow requirements. Internal nozzle performance (wind-off conditions) and thrust vector angles were computed for several configurations over a range of nozzle pressure ratios from 2 to 7, with the fluidic injection flow rate equal to 3 percent of the primary flow rate. Computational results indicate that increasing cavity divergence angle beyond 10 degrees is detrimental to thrust vectoring efficiency, while increasing cavity convergence angle from 20 to 30 degrees improves thrust vectoring efficiency at nozzle pressure ratios greater than 2, albeit at the expense of discharge coefficient. Pulsed injection was no more efficient than steady injection for the Dual Throat Nozzle concept.
Fault Diagnosis for Rotating Machinery: A Method based on Image Processing
Lu, Chen; Wang, Yang; Ragulskis, Minvydas; Cheng, Yujie
2016-01-01
Rotating machinery is one of the most typical types of mechanical equipment and plays a significant role in industrial applications. Condition monitoring and fault diagnosis of rotating machinery have gained wide attention for their significance in preventing catastrophic accidents and guaranteeing sufficient maintenance. With the development of science and technology, fault diagnosis methods based on multiple disciplines are becoming the focus in the field of fault diagnosis of rotating machinery. This paper presents a multi-discipline method based on image processing for fault diagnosis of rotating machinery. Different from traditional analysis methods in one-dimensional space, this study employs computing methods from the field of image processing to realize automatic feature extraction and fault diagnosis in a two-dimensional space. The proposed method mainly includes the following steps. First, the vibration signal is transformed into a bi-spectrum contour map utilizing bi-spectrum technology, which provides a basis for the following image-based feature extraction. Then, an emerging approach in the field of image processing for feature extraction, speeded-up robust features, is employed to automatically extract fault features from the transformed bi-spectrum contour map and finally form a high-dimensional feature vector. To reduce the dimensionality of the feature vector, thus highlighting the main fault features and reducing subsequent computing resources, t-Distributed Stochastic Neighbor Embedding is adopted. At last, a probabilistic neural network is introduced for fault identification. Two typical types of rotating machinery, an axial piston hydraulic pump and a self-priming centrifugal pump, are selected to demonstrate the effectiveness of the proposed method. Results show that the proposed method based on image processing achieves a high accuracy, thus providing a highly effective means of fault diagnosis for rotating machinery. PMID:27711246
User's Guide for Monthly Vector Wind Profile Model
NASA Technical Reports Server (NTRS)
Adelfang, S. I.
1999-01-01
The background, theoretical concepts, and methodology for construction of vector wind profiles based on a statistical model are presented. The derived monthly vector wind profiles are to be applied by the launch vehicle design community for establishing realistic estimates of critical vehicle design parameter dispersions related to wind profile dispersions. During initial studies a number of months are used to establish the model profiles that produce the largest monthly dispersions of ascent vehicle aerodynamic load indicators. The largest monthly dispersions for wind, which occur during the winter high-wind months, are used for establishing the design reference dispersions for the aerodynamic load indicators. This document includes a description of the computational process for the vector wind model including specification of input data, parameter settings, and output data formats. Sample output data listings are provided to aid the user in the verification of test output.
Hypergraph-Based Combinatorial Optimization of Matrix-Vector Multiplication
ERIC Educational Resources Information Center
Wolf, Michael Maclean
2009-01-01
Combinatorial scientific computing plays an important enabling role in computational science, particularly in high performance scientific computing. In this thesis, we will describe our work on optimizing matrix-vector multiplication using combinatorial techniques. Our research has focused on two different problems in combinatorial scientific…
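As background for the matrix-vector products being optimized, a plain compressed-sparse-row (CSR) kernel looks roughly like the sketch below; the thesis concerns reordering and partitioning such products to reduce communication, not the kernel itself. The function name and the 3x3 example are illustrative only.

```python
import numpy as np

def csr_matvec(data, indices, indptr, x):
    """y = A @ x for a sparse matrix A stored in compressed sparse row form."""
    y = np.zeros(len(indptr) - 1)
    for row in range(len(y)):
        start, end = indptr[row], indptr[row + 1]
        y[row] = np.dot(data[start:end], x[indices[start:end]])
    return y

# A 3x3 example: [[4, 0, 1], [0, 3, 0], [2, 0, 5]]
data    = np.array([4.0, 1.0, 3.0, 2.0, 5.0])
indices = np.array([0, 2, 1, 0, 2])
indptr  = np.array([0, 2, 3, 5])
print(csr_matvec(data, indices, indptr, np.array([1.0, 2.0, 3.0])))  # [ 7.  6. 17.]
```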
NASA Astrophysics Data System (ADS)
Mikula, Brendon D.; Heckler, Andrew F.
2017-06-01
We propose a framework for improving accuracy, fluency, and retention of basic skills essential for solving problems relevant to STEM introductory courses, and implement the framework for the case of basic vector math skills over several semesters in an introductory physics course. Using an iterative development process, the framework begins with a careful identification of target skills and the study of specific student difficulties with these skills. It then employs computer-based instruction, immediate feedback, mastery grading, and well-researched principles from cognitive psychology such as interleaved training sequences and distributed practice. We implemented this with more than 1500 students over 2 semesters. Students completed the mastery practice for an average of about 13 min/week, for a total of about 2-3 h for the whole semester. Results reveal large (>1 SD) pretest to post-test gains in accuracy in vector skills, even compared to a control group, and these gains were retained at least 2 months after practice. We also find evidence of improved fluency, student satisfaction, and that awarding regular course credit results in higher participation and higher learning gains than awarding extra credit. In all, we find that simple computer-based mastery practice is an effective and efficient way to improve a set of basic and essential skills for introductory physics.
NASA Technical Reports Server (NTRS)
1979-01-01
The objective of the current program was to modify a discrete vortex wake method to efficiently compute the aerodynamic forces and moments on high fineness ratio bodies (f approximately 10.0). The approach is to increase computational efficiency by structuring the program to take advantage of new computer vector software and by developing new algorithms when vector software cannot be used efficiently. An efficient program was written and substantial savings achieved. Several test cases were run for fineness ratios up to f = 16.0 and angles of attack up to 50 degrees.
Estimating normal mixture parameters from the distribution of a reduced feature vector
NASA Technical Reports Server (NTRS)
Guseman, L. F.; Peters, B. C., Jr.; Swasdee, M.
1976-01-01
A FORTRAN computer program was written and tested. The measurements consisted of 1000 randomly chosen vectors representing 1, 2, 3, 7, and 10 subclasses in equal portions. In the first experiment, the vectors are computed from the input means and covariances. In the second experiment, the vectors are 16 channel measurements. The starting covariances were constructed as if there were no correlation between separate passes. The biases obtained from each run are listed.
Monte Carlo simulation of Ising models by multispin coding on a vector computer
NASA Astrophysics Data System (ADS)
Wansleben, Stephan; Zabolitzky, John G.; Kalle, Claus
1984-11-01
Rebbi's efficient multispin coding algorithm for Ising models is combined with the use of the vector computer CDC Cyber 205. A speed of 21.2 million updates per second is reached. This is comparable to that obtained by special-purpose computers.
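To illustrate the multispin-coding idea, the sketch below packs 64 Ising spins into the bits of one integer and evaluates the nearest-neighbor energy of a 1-D ring with an XOR and a population count. This is a simplified illustration of the bit-packing trick, not Rebbi's algorithm or the Cyber 205 code; the variable names are my own.

```python
import numpy as np

N = 64                     # one spin per bit of a 64-bit word
MASK = (1 << N) - 1

def ring_energy(config, J=1.0):
    """Energy of a 1-D Ising ring with 64 spins packed one-per-bit.

    XOR with the rotated word marks disagreeing neighbor pairs, and a
    population count converts that to the energy; production multispin
    codes update many lattice sites or replicas per word with bitwise
    Metropolis logic."""
    rotated = ((config << 1) | (config >> (N - 1))) & MASK
    n_disagree = bin(config ^ rotated).count("1")
    # s_i * s_{i+1} is +1 for agreeing bits and -1 for disagreeing ones.
    return -J * (N - 2 * n_disagree)

rng = np.random.default_rng(1)
bits = rng.integers(0, 2, size=N)                 # a random spin configuration
config = int("".join(map(str, bits)), 2)
print(ring_energy(config))
```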
NASA Technical Reports Server (NTRS)
Rarig, P. L.
1980-01-01
A program to calculate upwelling infrared radiation was modified to operate efficiently on the STAR-100. The modified software processes specific test cases significantly faster than the initial STAR-100 code. For example, a midlatitude summer atmospheric model is executed in less than 2% of the time originally required on the STAR-100. Furthermore, the optimized program performs extra operations to save the calculated absorption coefficients. Some of the advantages and pitfalls of virtual memory and vector processing are discussed along with strategies used to avoid loss of accuracy and computing power. Results from the vectorized code, in terms of speed, cost, and relative error with respect to serial code solutions are encouraging.
Closed-form integrator for the quaternion (euler angle) kinematics equations
NASA Technical Reports Server (NTRS)
Whitmore, Stephen A. (Inventor)
2000-01-01
The invention is embodied in a method of integrating kinematics equations for updating a set of vehicle attitude angles of a vehicle using 3-dimensional angular velocities of the vehicle, which includes computing an integrating factor matrix from quantities corresponding to the 3-dimensional angular velocities, computing a total integrated angular rate from the quantities corresponding to the 3-dimensional angular velocities, computing a state transition matrix as a sum of (a) a first complementary function of the total integrated angular rate and (b) the integrating factor matrix multiplied by a second complementary function of the total integrated angular rate, and updating the set of vehicle attitude angles using the state transition matrix. Preferably, the method further includes computing a quaternion vector from the quantities corresponding to the 3-dimensional angular velocities, in which case the updating of the set of vehicle attitude angles using the state transition matrix is carried out by (a) updating the quaternion vector by multiplying the quaternion vector by the state transition matrix to produce an updated quaternion vector and (b) computing an updated set of vehicle attitude angles from the updated quaternion vector. The first and second trigonometric functions are complementary, such as a sine and a cosine. The quantities corresponding to the 3-dimensional angular velocities include respective averages of the 3-dimensional angular velocities over plural time frames. The updating of the quaternion vector preserves the norm of the vector, whereby the updated set of vehicle attitude angles is virtually error-free.
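A minimal sketch of the closed-form update described above, for a constant (frame-averaged) body rate over one step, is shown below. The state transition matrix is a cosine of the total integrated angular rate times the identity plus a sine times the integrating factor matrix, so the quaternion norm is preserved exactly. Function names and the small-angle guard are illustrative; this is not the patented implementation itself.

```python
import numpy as np

def omega_matrix(w):
    """4x4 quaternion-rate matrix for body rate w = (wx, wy, wz)."""
    wx, wy, wz = w
    return np.array([[0.0, -wx, -wy, -wz],
                     [wx,  0.0,  wz, -wy],
                     [wy, -wz,  0.0,  wx],
                     [wz,  wy, -wx,  0.0]])

def propagate_quaternion(q, w_avg, dt):
    """Closed-form quaternion update over one step dt:
    Phi = cos(phi/2)*I + (sin(phi/2)/|w|)*Omega(w), with phi = |w|*dt."""
    norm_w = np.linalg.norm(w_avg)
    phi = norm_w * dt
    if phi < 1e-12:
        return q
    Phi = np.cos(phi / 2.0) * np.eye(4) + \
          (np.sin(phi / 2.0) / norm_w) * omega_matrix(w_avg)
    return Phi @ q

q = np.array([1.0, 0.0, 0.0, 0.0])                      # identity attitude
q = propagate_quaternion(q, np.array([0.0, 0.0, 0.1]), 0.01)
print(q, np.linalg.norm(q))                             # norm stays 1 to machine precision
```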
An Intelligent Pictorial Information System
NASA Astrophysics Data System (ADS)
Lee, Edward T.; Chang, B.
1987-05-01
In examining the history of computer application, we discover that early computer systems were developed primarily for applications related to scientific computation, as in weather prediction, aerospace applications, and nuclear physics applications. At this stage, the computer system served as a big calculator to perform, in the main, manipulation of numbers. Then it was found that computer systems could also be used for business applications, information storage and retrieval, word processing, and report generation. The history of computer application is summarized in Table I. The complexity of pictures makes picture processing much more difficult than number and alphanumerical processing. Therefore, new techniques, new algorithms, and above all, new pictorial knowledge [1] are needed to overcome the limitations of existing computer systems. New frontiers in designing computer systems are the ways to handle the representation, [2,3] classification, manipulation, processing, storage, and retrieval of pictures. In particular, the ways to deal with similarity measures and the meaning of the word "approximate" and the phrase "approximate reasoning" are an important and indispensable part of an intelligent pictorial information system. [4,5] The main objective of this paper is to investigate the mathematical foundation for the effective organization and efficient retrieval of pictures in similarity-directed pictorial databases, [6] based on similarity retrieval techniques [7] and fuzzy languages [8]. The main advantage of this approach is that similar pictures are stored logically close to each other by using quantitative similarity measures. Thus, for answering queries, the amount of picture data that needs to be searched can be reduced and the retrieval time can be improved. In addition, in a pictorial database, very often it is desired to find pictures (or feature vectors, histograms, etc.) that are most similar to or most dissimilar [9] to a test picture (or feature vector). Using similarity measures, one can not only store similar pictures logically or physically close to each other in order to improve retrieval or updating efficiency, one can also use such similarity measures to answer fuzzy queries involving nonexact retrieval conditions. In this paper, similarity-directed pictorial databases involving geometric figures, chromosome images, [10] leukocyte images, cardiomyopathy images, and satellite images [11] are presented as illustrative examples.
VEST: Abstract Vector Calculus Simplification in Mathematica
DOE Office of Scientific and Technical Information (OSTI.GOV)
J. Squire, J. Burby and H. Qin
2013-03-12
We present a new package, VEST (Vector Einstein Summation Tools), that performs abstract vector calculus computations in Mathematica. Through the use of index notation, VEST is able to reduce scalar and vector expressions of a very general type using a systematic canonicalization procedure. In addition, utilizing properties of the Levi-Civita symbol, the program can derive types of multi-term vector identities that are not recognized by canonicalization, subsequently applying these to simplify large expressions. In a companion paper [1], we employ VEST in the automation of the calculation of Lagrangians for the single particle guiding center system in plasma physics, a computation which illustrates its ability to handle very large expressions. VEST has been designed to be simple and intuitive to use, both for basic checking of work and more involved computations.
VEST: Abstract vector calculus simplification in Mathematica
NASA Astrophysics Data System (ADS)
Squire, J.; Burby, J.; Qin, H.
2014-01-01
We present a new package, VEST (Vector Einstein Summation Tools), that performs abstract vector calculus computations in Mathematica. Through the use of index notation, VEST is able to reduce three-dimensional scalar and vector expressions of a very general type to a well defined standard form. In addition, utilizing properties of the Levi-Civita symbol, the program can derive types of multi-term vector identities that are not recognized by reduction, subsequently applying these to simplify large expressions. In a companion paper Burby et al. (2013) [12], we employ VEST in the automation of the calculation of high-order Lagrangians for the single particle guiding center system in plasma physics, a computation which illustrates its ability to handle very large expressions. VEST has been designed to be simple and intuitive to use, both for basic checking of work and more involved computations.
Gridded Calibration of Ensemble Wind Vector Forecasts Using Ensemble Model Output Statistics
NASA Astrophysics Data System (ADS)
Lazarus, S. M.; Holman, B. P.; Splitt, M. E.
2017-12-01
A computationally efficient method is developed that performs gridded post-processing of ensemble wind vector forecasts. An expansive set of idealized WRF model simulations is generated to provide physically consistent high-resolution winds over a coastal domain characterized by an intricate land/water mask. Ensemble model output statistics (EMOS) is used to calibrate the ensemble wind vector forecasts at observation locations. The local EMOS predictive parameters (mean and variance) are then spread throughout the grid utilizing flow-dependent statistical relationships extracted from the downscaled WRF winds. Using data withdrawal and 28 east central Florida stations, the method is applied to one year of 24 h wind forecasts from the Global Ensemble Forecast System (GEFS). Compared to the raw GEFS, the approach improves both the deterministic and probabilistic forecast skill. Analysis of multivariate rank histograms indicates the post-processed forecasts are calibrated. Two downscaling case studies are presented, a quiescent easterly flow event and a frontal passage. Strengths and weaknesses of the approach are presented and discussed.
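For orientation, a Gaussian EMOS predictive distribution for one wind component can be written as in the sketch below, assuming the coefficients have already been fitted (typically by minimizing the CRPS over a training period). The coefficient values shown are placeholders, not the paper's settings.

```python
import numpy as np
from scipy import stats

def emos_predictive(ens_members, a, b, c, d):
    """Gaussian EMOS predictive distribution for one wind-vector component:
    the mean is an affine function of the ensemble mean, the variance an
    affine function of the ensemble variance. A minimal sketch."""
    m = a + b * np.mean(ens_members)
    v = c + d * np.var(ens_members)
    return stats.norm(loc=m, scale=np.sqrt(v))

dist = emos_predictive(np.array([4.1, 5.3, 3.8, 4.9]), a=0.2, b=1.0, c=0.5, d=1.2)
print(dist.mean(), dist.interval(0.9))   # calibrated mean and 90% interval
```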
Low-rate image coding using vector quantization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Makur, A.
1990-01-01
This thesis deals with the development and analysis of a computationally simple vector quantization image compression system for coding monochrome images at low bit rate. Vector quantization has been known to be an effective compression scheme when a low bit rate is desirable, but the intensive computation required in a vector quantization encoder has been a handicap in using it for low rate image coding. The present work shows that, without substantially increasing the coder complexity, it is indeed possible to achieve acceptable picture quality while attaining a high compression ratio. Several modifications to the conventional vector quantization coder are proposed in the thesis. These modifications are shown to offer better subjective quality when compared to the basic coder. Distributed blocks are used instead of spatial blocks to construct the input vectors. A class of input-dependent weighted distortion functions is used to incorporate psychovisual characteristics in the distortion measure. Computationally simple filtering techniques are applied to further improve the decoded image quality. Finally, unique designs of the vector quantization coder using electronic neural networks are described, so that the coding delay is reduced considerably.
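As a reference point for the computation being reduced, a baseline full-search vector quantization encoder is sketched below with plain squared-error distortion; the codebook size and block size are illustrative, and a weighted distortion measure of the kind described in the thesis would replace the plain squared error.

```python
import numpy as np

def vq_encode(blocks, codebook):
    """Map each input vector (image block) to the index of its nearest
    codeword under squared-error distortion (full search)."""
    # distances[i, j] = ||blocks[i] - codebook[j]||^2
    d = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

rng = np.random.default_rng(0)
codebook = rng.random((256, 16))        # 256 codewords of 4x4 (16-pixel) blocks
blocks = rng.random((1000, 16))         # 1000 image blocks to encode
indices = vq_encode(blocks, codebook)   # 8 bits per 16 pixels -> 0.5 bpp
```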
NASA Technical Reports Server (NTRS)
Botts, Michael E.; Phillips, Ron J.; Parker, John V.; Wright, Patrick D.
1992-01-01
Five scientists at MSFC/ESAD have EOS SCF investigator status. Each SCF has unique tasks which require the establishment of a computing facility dedicated to accomplishing those tasks. A SCF Working Group was established at ESAD with the charter of defining the computing requirements of the individual SCFs and recommending options for meeting these requirements. The primary goal of the working group was to determine which computing needs can be satisfied using either shared resources or separate but compatible resources, and which needs require unique individual resources. The requirements investigated included CPU-intensive vector and scalar processing, visualization, data storage, connectivity, and I/O peripherals. A review of computer industry directions and a market survey of computing hardware provided information regarding important industry standards and candidate computing platforms. It was determined that the total SCF computing requirements might be most effectively met using a hierarchy consisting of shared and individual resources. This hierarchy is composed of five major system types: (1) a supercomputer class vector processor; (2) a high-end scalar multiprocessor workstation; (3) a file server; (4) a few medium- to high-end visualization workstations; and (5) several low- to medium-range personal graphics workstations. Specific recommendations for meeting the needs of each of these types are presented.
Compute Server Performance Results
NASA Technical Reports Server (NTRS)
Stockdale, I. E.; Barton, John; Woodrow, Thomas (Technical Monitor)
1994-01-01
Parallel-vector supercomputers have been the workhorses of high performance computing. As expectations of future computing needs have risen faster than projected vector supercomputer performance, much work has been done investigating the feasibility of using Massively Parallel Processor systems as supercomputers. An even more recent development is the availability of high performance workstations which have the potential, when clustered together, to replace parallel-vector systems. We present a systematic comparison of floating point performance and price-performance for various compute server systems. A suite of highly vectorized programs was run on systems including traditional vector systems such as the Cray C90, and RISC workstations such as the IBM RS/6000 590 and the SGI R8000. The C90 system delivers 460 million floating point operations per second (FLOPS), the highest single processor rate of any vendor. However, if the price-performance ratio (PPR) is considered to be most important, then the IBM and SGI processors are superior to the C90 processors. Even without code tuning, the IBM and SGI PPRs of 260 and 220 FLOPS per dollar exceed the C90 PPR of 160 FLOPS per dollar when running our highly vectorized suite.
Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A
2013-11-05
Mechanisms for performing matrix multiplication operations with data pre-conditioning in a high performance computing architecture are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A load and splat operation is performed to load an element of a second vector operand and replicating the element to each of a plurality of elements of a second target vector register. A multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product of the matrix multiplication operation is accumulated with other partial products of the matrix multiplication operation.
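The data flow described in the claim can be emulated in a few lines, as in the sketch below: a column of the first operand is loaded as a vector, one element of the second operand is "splatted" (replicated) across a vector, and a multiply-add accumulates the partial product. This illustrates the arithmetic only, not the hardware mechanisms of the patent.

```python
import numpy as np

def matmul_splat(A, B):
    """Matrix product built from vector-load, load-and-splat, and
    multiply-add primitives, accumulating one output column at a time."""
    m, k = A.shape
    _, n = B.shape
    C = np.zeros((m, n))
    for j in range(n):
        acc = np.zeros(m)
        for p in range(k):
            a_col = A[:, p]          # vector load of a column of A
            b_splat = B[p, j]        # scalar of B replicated across the vector
            acc += a_col * b_splat   # multiply-add accumulating the partial product
        C[:, j] = acc
    return C

A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
assert np.allclose(matmul_splat(A, B), A @ B)
```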
STAMPS: Software Tool for Automated MRI Post-processing on a supercomputer.
Bigler, Don C; Aksu, Yaman; Miller, David J; Yang, Qing X
2009-08-01
This paper describes a Software Tool for Automated MRI Post-processing (STAMP) of multiple types of brain MRIs on a workstation and for parallel processing on a supercomputer (STAMPS). This software tool enables the automation of nonlinear registration for a large image set and for multiple MR image types. The tool uses standard brain MRI post-processing tools (such as SPM, FSL, and HAMMER) for multiple MR image types in a pipeline fashion. It also contains novel MRI post-processing features. The STAMP image outputs can be used to perform brain analysis using Statistical Parametric Mapping (SPM) or single-/multi-image modality brain analysis using Support Vector Machines (SVMs). Since STAMPS is PBS-based, the supercomputer may be a multi-node computer cluster or one of the latest multi-core computers.
Stochastic subset selection for learning with kernel machines.
Rhinelander, Jason; Liu, Xiaoping P
2012-06-01
Kernel machines have gained much popularity in applications of machine learning. Support vector machines (SVMs) are a subset of kernel machines and generalize well for classification, regression, and anomaly detection tasks. The training procedure for traditional SVMs involves solving a quadratic programming (QP) problem. The QP problem scales super linearly in computational effort with the number of training samples and is often used for the offline batch processing of data. Kernel machines operate by retaining a subset of observed data during training. The data vectors contained within this subset are referred to as support vectors (SVs). The work presented in this paper introduces a subset selection method for the use of kernel machines in online, changing environments. Our algorithm works by using a stochastic indexing technique when selecting a subset of SVs when computing the kernel expansion. The work described here is novel because it separates the selection of kernel basis functions from the training algorithm used. The subset selection algorithm presented here can be used in conjunction with any online training technique. It is important for online kernel machines to be computationally efficient due to the real-time requirements of online environments. Our algorithm is an important contribution because it scales linearly with the number of training samples and is compatible with current training techniques. Our algorithm outperforms standard techniques in terms of computational efficiency and provides increased recognition accuracy in our experiments. We provide results from experiments using both simulated and real-world data sets to verify our algorithm.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hutchinson, S.A.; Shadid, J.N.; Tuminaro, R.S.
1995-10-01
Aztec is an iterative library that greatly simplifies the parallelization process when solving the linear systems of equations Ax = b where A is a user supplied n x n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. Aztec is intended as a software tool for users who want to avoid cumbersome parallel programming details but who have large sparse linear systems which require an efficiently utilized parallel processing system. A collection of data transformation tools are provided that allow for easy creation of distributed sparse unstructured matrices for parallel solution. Once the distributed matrix is created, computation can be performed on any of the parallel machines running Aztec: nCUBE 2, IBM SP2 and Intel Paragon, MPI platforms as well as standard serial and vector platforms. Aztec includes a number of Krylov iterative methods such as conjugate gradient (CG), generalized minimum residual (GMRES) and stabilized biconjugate gradient (BICGSTAB) to solve systems of equations. These Krylov methods are used in conjunction with various preconditioners such as polynomial or domain decomposition methods using LU or incomplete LU factorizations within subdomains. Although the matrix A can be general, the package has been designed for matrices arising from the approximation of partial differential equations (PDEs). In particular, the Aztec package is oriented toward systems arising from PDE applications.
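As an illustration of the kind of solver Aztec provides, the sketch below implements conjugate gradient with a Jacobi (diagonal) preconditioner on a small dense NumPy matrix. Aztec's contribution is the distributed sparse-matrix machinery around such a loop, which this sketch omits; the function name and tolerances are illustrative.

```python
import numpy as np

def jacobi_pcg(A, b, tol=1e-10, max_iter=200):
    """Conjugate gradient with a Jacobi preconditioner for symmetric
    positive-definite A. A serial NumPy sketch of one Krylov method /
    preconditioner pairing."""
    x = np.zeros_like(b)
    M_inv = 1.0 / np.diag(A)            # Jacobi preconditioner
    r = b - A @ x
    z = M_inv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
print(jacobi_pcg(A, np.array([1.0, 2.0])))   # approx [0.0909, 0.6364]
```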
PIC codes for plasma accelerators on emerging computer architectures (GPUS, Multicore/Manycore CPUS)
NASA Astrophysics Data System (ADS)
Vincenti, Henri
2016-03-01
The advent of exascale computers will enable 3D simulations of new laser-plasma interaction regimes that were previously out of reach of current Petascale computers. However, the paradigm used to write current PIC codes will have to change in order to fully exploit the potentialities of these new computing architectures. Indeed, achieving Exascale computing facilities in the next decade will be a great challenge in terms of energy consumption and will imply hardware developments directly impacting our way of implementing PIC codes. As data movement (from die to network) is by far the most energy-consuming part of an algorithm, future computers will tend to increase memory locality at the hardware level and reduce energy consumption related to data movement by using more and more cores on each compute node (''fat nodes'') with a reduced clock speed to allow for efficient cooling. To compensate for the frequency decrease, CPU vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD register length is expected to double every four years. GPUs also have a reduced clock speed per core and can process Multiple Instructions on Multiple Data (MIMD). At the software level, Particle-In-Cell (PIC) codes will thus have to achieve both good memory locality and vectorization (for Multicore/Manycore CPUs) to fully take advantage of these upcoming architectures. In this talk, we present the portable solutions we implemented in our high performance skeleton PIC code PICSAR to achieve both good memory locality and cache reuse as well as good vectorization on SIMD architectures. We also present the portable solutions used to parallelize the Pseudo-spectral quasi-cylindrical code FBPIC on GPUs using the Numba Python compiler.
Cloud tracing: Visualization of the mixing of fluid elements in convection-diffusion systems
NASA Technical Reports Server (NTRS)
Ma, Kwan-Liu; Smith, Philip J.
1993-01-01
This paper describes a highly interactive method for computer visualization of the basic physical process of dispersion and mixing of fluid elements in convection-diffusion systems. It is based on transforming the vector field from a traditionally Eulerian reference frame into a Lagrangian reference frame. Fluid elements are traced through the vector field for the mean path as well as the statistical dispersion of the fluid elements about the mean position by using added scalar information about the root mean square value of the vector field and its Lagrangian time scale. In this way, clouds of fluid elements are traced and are not just mean paths. We have used this method to visualize the simulation of an industrial incinerator to help identify mechanisms for poor mixing.
Lanczos eigensolution method for high-performance computers
NASA Technical Reports Server (NTRS)
Bostic, Susan W.
1991-01-01
The theory, computational analysis, and applications are presented of a Lanczos algorithm on high performance computers. The computationally intensive steps of the algorithm are identified as: the matrix factorization, the forward/backward equation solution, and the matrix-vector multiplies. These computational steps are optimized to exploit the vector and parallel capabilities of high performance computers. The savings in computational time from applying optimization techniques such as: variable band and sparse data storage and access, loop unrolling, use of local memory, and compiler directives are presented. Two large scale structural analysis applications are described: the buckling of a composite blade stiffened panel with a cutout, and the vibration analysis of a high speed civil transport. The sequential computational time for the panel problem executed on a CONVEX computer of 181.6 seconds was decreased to 14.1 seconds with the optimized vector algorithm. The best computational time of 23 seconds for the transport problem with 17,000 degrees of freedom was on the Cray Y-MP using an average of 3.63 processors.
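For context, the computational kernel of the method is the Lanczos recurrence sketched below; the matrix-vector product inside the loop is the step that vectorizes and parallelizes well on the machines discussed. The example matrix, step count, and function name are illustrative, not taken from the paper.

```python
import numpy as np

def lanczos(A, m, rng=np.random.default_rng(0)):
    """m-step Lanczos tridiagonalization of a symmetric matrix A.

    The eigenvalues of the small tridiagonal T approximate extreme
    eigenvalues of A (e.g. buckling loads, vibration frequencies).
    A dense NumPy sketch without reorthogonalization."""
    n = A.shape[0]
    Q = np.zeros((n, m + 1))
    alpha, beta = np.zeros(m), np.zeros(m + 1)
    q = rng.standard_normal(n)
    Q[:, 0] = q / np.linalg.norm(q)
    for j in range(m):
        w = A @ Q[:, j] - beta[j] * Q[:, j - 1]   # the dominant matrix-vector product
        alpha[j] = w @ Q[:, j]
        w -= alpha[j] * Q[:, j]
        beta[j + 1] = np.linalg.norm(w)
        Q[:, j + 1] = w / beta[j + 1]
    T = np.diag(alpha) + np.diag(beta[1:m], 1) + np.diag(beta[1:m], -1)
    return np.linalg.eigvalsh(T)

A = np.diag(np.arange(1.0, 101.0))       # toy matrix with known spectrum 1..100
print(lanczos(A, 20)[-1])                # largest Ritz value, close to 100
```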
Use of CYBER 203 and CYBER 205 computers for three-dimensional transonic flow calculations
NASA Technical Reports Server (NTRS)
Melson, N. D.; Keller, J. D.
1983-01-01
Experiences are discussed for modifying two three-dimensional transonic flow computer programs (FLO 22 and FLO 27) for use on the CDC CYBER 203 computer system. Both programs were originally written for use on serial machines. Several methods were attempted to optimize the execution of the two programs on the vector machine: leaving the program in a scalar form (i.e., serial computation) with compiler software used to optimize and vectorize the program, vectorizing parts of the existing algorithm in the program, and incorporating a vectorizable algorithm (ZEBRA I or ZEBRA II) in the program. Comparison runs of the programs were made on CDC CYBER 175. CYBER 203, and two pipe CDC CYBER 205 computer systems.
NASA Astrophysics Data System (ADS)
Pavlichin, Dmitri S.; Mabuchi, Hideo
2014-06-01
Nanoscale integrated photonic devices and circuits offer a path to ultra-low power computation at the few-photon level. Here we propose an optical circuit that performs a ubiquitous operation: the controlled, random-access readout of a collection of stored memory phases or, equivalently, the computation of the inner product of a vector of phases with a binary "selector" vector, where the arithmetic is done modulo 2π and the result is encoded in the phase of a coherent field. This circuit, a collection of cascaded interferometers driven by a coherent input field, demonstrates the use of coherence as a computational resource, and the use of recently-developed mathematical tools for modeling optical circuits with many coupled parts. The construction extends in a straightforward way to the computation of matrix-vector and matrix-matrix products, and, with the inclusion of an optical feedback loop, to the computation of a "weighted" readout of stored memory phases. We note some applications of these circuits for error correction and for computing tasks requiring fast vector inner products, e.g. statistical classification and some machine learning algorithms.
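The readout operation itself is ordinary modular arithmetic, as the short sketch below makes explicit; it models only the computed quantity, not the photonic circuit that produces it.

```python
import numpy as np

def phase_inner_product(phases, selector):
    """Inner product of stored phases with a binary selector vector,
    taken modulo 2*pi, i.e. the phase encoded on the output field."""
    return np.mod(np.dot(selector, phases), 2.0 * np.pi)

phases = np.array([0.3, 1.7, 2.9, 5.5])       # stored memory phases (radians)
selector = np.array([1, 0, 1, 1])             # binary random-access selector
print(phase_inner_product(phases, selector))  # (0.3 + 2.9 + 5.5) mod 2*pi ~ 2.417
```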
Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Carter, Jonathan; Shalf, John; Skinner, David; Ethier, Stephane; Biswas, Rupak; Djomehri, Jahed; VanderWijngaart, Rob
2003-01-01
The growing gap between sustained and peak performance for scientific applications has become a well-known problem in high performance computing. The recent development of parallel vector systems offers the potential to bridge this gap for a significant number of computational science codes and deliver a substantial increase in computing capabilities. This paper examines the intranode performance of the NEC SX6 vector processor and the cache-based IBM Power3/4 superscalar architectures across a number of key scientific computing areas. First, we present the performance of a microbenchmark suite that examines a full spectrum of low-level machine characteristics. Next, we study the behavior of the NAS Parallel Benchmarks using some simple optimizations. Finally, we evaluate the performance of several numerical codes from key scientific computing domains. Overall results demonstrate that the SX6 achieves high performance on a large fraction of our application suite and in many cases significantly outperforms the RISC-based architectures. However, certain classes of applications are not easily amenable to vectorization and would likely require extensive reengineering of both algorithm and implementation to utilize the SX6 effectively.
Flux vector splitting of the inviscid equations with application to finite difference methods
NASA Technical Reports Server (NTRS)
Steger, J. L.; Warming, R. F.
1979-01-01
The conservation-law form of the inviscid gasdynamic equations has the remarkable property that the nonlinear flux vectors are homogeneous functions of degree one. This property readily permits the splitting of flux vectors into subvectors by similarity transformations so that each subvector has associated with it a specified eigenvalue spectrum. As a consequence of flux vector splitting, new explicit and implicit dissipative finite-difference schemes are developed for first-order hyperbolic systems of equations. Appropriate one-sided spatial differences for each split flux vector are used throughout the computational field even if the flow is locally subsonic. The results of some preliminary numerical computations are included.
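A minimal sketch of the splitting for a system whose flux satisfies F = A U (as the homogeneity property guarantees for the inviscid equations) is given below. The eigenvalues are split into non-negative and non-positive parts, giving split fluxes that are each differenced one-sidedly (backward for F+, forward for F-). The 2x2 constant-coefficient example is illustrative only; a real Euler solver would use the analytic Steger-Warming expressions rather than a numerical eigendecomposition.

```python
import numpy as np

def split_flux(A, U):
    """Flux vector splitting for a hyperbolic system with flux F = A U.
    lambda_plus = (lambda + |lambda|)/2 carries right-running waves,
    lambda_minus = (lambda - |lambda|)/2 carries left-running waves."""
    lam, R = np.linalg.eig(A)
    Rinv = np.linalg.inv(R)
    lam_plus = 0.5 * (lam + np.abs(lam))
    lam_minus = 0.5 * (lam - np.abs(lam))
    F_plus = R @ np.diag(lam_plus) @ Rinv @ U
    F_minus = R @ np.diag(lam_minus) @ Rinv @ U
    return F_plus, F_minus

# A 2x2 system with one right-running and one left-running wave.
A = np.array([[0.0, 1.0], [1.0, 0.0]])        # eigenvalues +1 and -1
U = np.array([2.0, 0.5])
Fp, Fm = split_flux(A, U)
assert np.allclose(Fp + Fm, A @ U)            # the split fluxes sum to the full flux
```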
EOS MLS Level 2 Data Processing Software Version 3
NASA Technical Reports Server (NTRS)
Livesey, Nathaniel J.; VanSnyder, Livesey W.; Read, William G.; Schwartz, Michael J.; Lambert, Alyn; Santee, Michelle L.; Nguyen, Honghanh T.; Froidevaux, Lucien; wang, Shuhui; Manney, Gloria L.;
2011-01-01
This software accepts the EOS MLS calibrated measurements of microwave radiance products and operational meteorological data, and produces a set of estimates of atmospheric temperature and composition. This version has been designed to be as flexible as possible. The software is controlled by a Level 2 Configuration File that controls all aspects of the software: defining the contents of state and measurement vectors, defining the configurations of the various forward models available, reading appropriate a priori spectroscopic and calibration data, performing retrievals, post-processing results, computing diagnostics, and outputting results in appropriate files. In production mode, the software operates in a parallel form, with one instance of the program acting as a master, coordinating the work of multiple slave instances on a cluster of computers, each computing the results for individual chunks of data. In addition to performing conventional retrieval calculations and producing geophysical products, the Level 2 Configuration File can instruct the software to produce files of simulated radiances based on a state vector formed from a set of geophysical product files taken as input. Combining both the retrieval and simulation tasks in a single piece of software makes it far easier to ensure that identical forward model algorithms and parameters are used in both tasks. This also dramatically reduces the complexity of the code maintenance effort.
Parallelization of the Physical-Space Statistical Analysis System (PSAS)
NASA Technical Reports Server (NTRS)
Larson, J. W.; Guo, J.; Lyster, P. M.
1999-01-01
Atmospheric data assimilation is a method of combining observations with model forecasts to produce a more accurate description of the atmosphere than the observations or forecast alone can provide. Data assimilation plays an increasingly important role in the study of climate and atmospheric chemistry. The NASA Data Assimilation Office (DAO) has developed the Goddard Earth Observing System Data Assimilation System (GEOS DAS) to create assimilated datasets. The core computational components of the GEOS DAS include the GEOS General Circulation Model (GCM) and the Physical-space Statistical Analysis System (PSAS). The need for timely validation of scientific enhancements to the data assimilation system poses computational demands that are best met by distributed parallel software. PSAS is implemented in Fortran 90 using object-based design principles. The analysis portions of the code solve two equations. The first of these is the "innovation" equation, which is solved on the unstructured observation grid using a preconditioned conjugate gradient (CG) method. The "analysis" equation is a transformation from the observation grid back to a structured grid, and is solved by a direct matrix-vector multiplication. Use of a factored-operator formulation reduces the computational complexity of both the CG solver and the matrix-vector multiplication, rendering the matrix-vector multiplications as a successive product of operators on a vector. Sparsity is introduced to these operators by partitioning the observations using an icosahedral decomposition scheme. PSAS builds a large (approx. 128MB) run-time database of parameters used in the calculation of these operators. Implementing a message passing parallel computing paradigm into an existing yet developing computational system as complex as PSAS is nontrivial. One of the technical challenges is balancing the requirements for computational reproducibility with the need for high performance. The problem of computational reproducibility is well known in the parallel computing community. It is a requirement that the parallel code perform calculations in a fashion that will yield identical results on different configurations of processing elements on the same platform. In some cases this problem can be solved by sacrificing performance. Meeting this requirement and still achieving high performance is very difficult. Topics to be discussed include: current PSAS design and parallelization strategy; reproducibility issues; load balance vs. database memory demands, possible solutions to these problems.
NASA Technical Reports Server (NTRS)
Gilbertsen, Noreen D.; Belytschko, Ted
1990-01-01
The implementation of a nonlinear explicit program on a vectorized, concurrent computer with shared memory is described and studied. The conflict between vectorization and concurrency is described and some guidelines are given for optimal block sizes. Several example problems are summarized to illustrate the types of speed-ups which can be achieved by reprogramming as compared to compiler optimization.
A new parallel-vector finite element analysis software on distributed-memory computers
NASA Technical Reports Server (NTRS)
Qin, Jiangning; Nguyen, Duc T.
1993-01-01
A new parallel-vector finite element analysis software package MPFEA (Massively Parallel-vector Finite Element Analysis) is developed for large-scale structural analysis on massively parallel computers with distributed-memory. MPFEA is designed for parallel generation and assembly of the global finite element stiffness matrices as well as parallel solution of the simultaneous linear equations, since these are often the major time-consuming parts of a finite element analysis. Block-skyline storage scheme along with vector-unrolling techniques are used to enhance the vector performance. Communications among processors are carried out concurrently with arithmetic operations to reduce the total execution time. Numerical results on the Intel iPSC/860 computers (such as the Intel Gamma with 128 processors and the Intel Touchstone Delta with 512 processors) are presented, including an aircraft structure and some very large truss structures, to demonstrate the efficiency and accuracy of MPFEA.
Sorting on STAR. [CDC computer algorithm timing comparison
NASA Technical Reports Server (NTRS)
Stone, H. S.
1978-01-01
Timing comparisons are given for three sorting algorithms written for the CDC STAR computer. One algorithm is Hoare's (1962) Quicksort, which is the fastest or nearly the fastest sorting algorithm for most computers. A second algorithm is a vector version of Quicksort that takes advantage of the STAR's vector operations. The third algorithm is an adaptation of Batcher's (1968) sorting algorithm, which makes especially good use of vector operations but has a complexity of N(log N)-squared as compared with a complexity of N log N for the Quicksort algorithms. In spite of its worse complexity, Batcher's sorting algorithm is competitive with the serial version of Quicksort for vectors up to the largest that can be treated by STAR. Vector Quicksort outperforms the other two algorithms and is generally preferred. These results indicate that unusual instruction sets can introduce biases in program execution time that counter results predicted by worst-case asymptotic complexity analysis.
Some Applications Of Semigroups And Computer Algebra In Discrete Structures
NASA Astrophysics Data System (ADS)
Bijev, G.
2009-11-01
An algebraic approach to the pseudoinverse generalization problem in Boolean vector spaces is used. A map (p) is defined, which is similar to an orthogonal projection in linear vector spaces. Some other important maps with properties similar to those of the generalized inverses (pseudoinverses) of linear transformations and of the matrices corresponding to them are also defined and investigated. Let Ax = b be an equation with matrix A and vectors x and b Boolean. Stochastic experiments for solving the equation, which involve the maps defined and use computer algebra methods, have been performed. As a result, the Hamming distance between the vectors Ax, with x = p(b), and b is equal to or close to the least possible. We also share our experience in using computer algebra systems for teaching discrete mathematics and linear algebra and for research. Some examples of computations with binary relations using Maple are given.
Adly, Amr A.; Abd-El-Hafiz, Salwa K.
2012-01-01
Incorporation of hysteresis models in electromagnetic analysis approaches is indispensable to accurate field computation in complex magnetic media. Throughout those computations, vector nature and computational efficiency of such models become especially crucial when sophisticated geometries requiring massive sub-region discretization are involved. Recently, an efficient vector Preisach-type hysteresis model constructed from only two scalar models having orthogonally coupled elementary operators has been proposed. This paper presents a novel Hopfield neural network approach for the implementation of Stoner–Wohlfarth-like operators that could lead to a significant enhancement in the computational efficiency of the aforementioned model. Advantages of this approach stem from the non-rectangular nature of these operators that substantially minimizes the number of operators needed to achieve an accurate vector hysteresis model. Details of the proposed approach, its identification and experimental testing are presented in the paper. PMID:25685446
The Design of a Templated C++ Small Vector Class for Numerical Computing
NASA Technical Reports Server (NTRS)
Moran, Patrick J.
2000-01-01
We describe the design and implementation of a templated C++ class for vectors. The vector class is templated both for vector length and vector component type; the vector length is fixed at template instantiation time. The vector implementation is such that for a vector of N components of type T, the total number of bytes required by the vector is equal to N * sizeof(T), where sizeof is the built-in C operator. The property of having a size no bigger than that required by the components themselves is key in many numerical computing applications, where one may allocate very large arrays of small, fixed-length vectors. In addition to the design trade-offs motivating our fixed-length vector design choice, we review some of the C++ template features essential to an efficient, succinct implementation. In particular, we highlight some of the standard C++ features, such as partial template specialization, that are not supported by all compilers currently. This report provides an inventory listing the relevant support currently provided by some key compilers, as well as test code one can use to verify compiler capabilities.
Rotations with Rodrigues' Vector
ERIC Educational Resources Information Center
Pina, E.
2011-01-01
The rotational dynamics was studied from the point of view of Rodrigues' vector. This vector is defined here by its connection with other forms of parametrization of the rotation matrix. The rotation matrix was expressed in terms of this vector. The angular velocity was computed using the components of Rodrigues' vector as coordinates. It appears…
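As a brief illustration of the parametrization the abstract refers to, the following sketch assumes the common convention in which the Rodrigues (Gibbs) vector is b = tan(theta/2) * n for a rotation by angle theta about the unit axis n; the rotation matrix is then I + 2/(1 + |b|^2) ([b]x + [b]x^2). Function names are illustrative.

```python
import numpy as np

def skew(v):
    """Cross-product (skew-symmetric) matrix of a 3-vector."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def rotation_from_rodrigues(b):
    """Rotation matrix from the Rodrigues (Gibbs) vector b = tan(theta/2) * axis."""
    b = np.asarray(b, dtype=float)
    B = skew(b)
    return np.eye(3) + 2.0 / (1.0 + b @ b) * (B + B @ B)

# 90-degree rotation about z: b = tan(45 deg) * [0, 0, 1] = [0, 0, 1]
R = rotation_from_rodrigues([0.0, 0.0, 1.0])
print(np.round(R, 6))
print(np.round(R @ np.array([1.0, 0.0, 0.0]), 6))   # should map x-hat to y-hat
```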
Pairwise Sequence Alignment Library
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jeff Daily, PNNL
2015-05-20
Vector extensions, such as SSE, have been part of the x86 CPU since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. The trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based on striped data layouts. Therefore, a novel SIMD implementation of a parallel scan-based sequence alignment algorithm that can better exploit wider SIMD units was implemented as part of the Parallel Sequence Alignment Library (parasail). Parasail features: Reference implementations of all known vectorized sequence alignment approaches. Implementations of Smith Waterman (SW), semi-global (SG), and Needleman Wunsch (NW) sequence alignment algorithms. Implementations across all modern CPU instruction sets including AVX2 and KNC. Language interfaces for C/C++ and Python.
A Fast Reduced Kernel Extreme Learning Machine.
Deng, Wan-Yu; Ong, Yew-Soon; Zheng, Qing-Hua
2016-04-01
In this paper, we present a fast and accurate kernel-based supervised algorithm referred to as the Reduced Kernel Extreme Learning Machine (RKELM). In contrast to the work on Support Vector Machine (SVM) or Least Square SVM (LS-SVM), which identifies the support vectors or weight vectors iteratively, the proposed RKELM randomly selects a subset of the available data samples as support vectors (or mapping samples). By avoiding the iterative steps of SVM, significant cost savings in the training process can be readily attained, especially on big datasets. RKELM is established based on a rigorous proof of universal learning involving the reduced kernel-based SLFN (single-hidden-layer feedforward network). In particular, we prove that RKELM can approximate any nonlinear function accurately under the condition of support-vector sufficiency. Experimental results on a wide variety of real world small instance size and large instance size applications in the context of binary classification, multi-class problems and regression are then reported to show that RKELM can perform at a level of generalization performance competitive with the SVM/LS-SVM at only a fraction of the computational effort incurred. Copyright © 2015 Elsevier Ltd. All rights reserved.
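A minimal sketch of the reduced-kernel idea described above (random selection of mapping samples followed by a regularized least-squares solve); it is not the authors' exact formulation, and the function names, the RBF kernel choice, and the toy data are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(X, C, gamma=1.0):
    """Gaussian kernel between rows of X and rows of the mapping-sample set C."""
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def rkelm_fit(X, y, n_support=50, gamma=1.0, reg=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=n_support, replace=False)   # random "support vectors"
    C = X[idx]
    K = rbf_kernel(X, C, gamma)                                # n x m hidden-layer output
    beta = np.linalg.solve(K.T @ K + reg * np.eye(n_support), K.T @ y)
    return C, beta

def rkelm_predict(X, C, beta, gamma=1.0):
    return rbf_kernel(X, C, gamma) @ beta

# toy regression example
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(500)
C, beta = rkelm_fit(X, y, n_support=40, gamma=2.0)
print(np.abs(rkelm_predict(X, C, beta, gamma=2.0) - y).mean())
```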
An implementation of the QMR method based on coupled two-term recurrences
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Nachtigal, Noeel M.
1992-01-01
The authors have proposed a new Krylov subspace iteration, the quasi-minimal residual algorithm (QMR), for solving non-Hermitian linear systems. In the original implementation of the QMR method, the Lanczos process with look-ahead is used to generate basis vectors for the underlying Krylov subspaces. In the Lanczos algorithm, these basis vectors are computed by means of three-term recurrences. It has been observed that, in finite precision arithmetic, vector iterations based on three-term recursions are usually less robust than mathematically equivalent coupled two-term vector recurrences. This paper presents a look-ahead algorithm that constructs the Lanczos basis vectors by means of coupled two-term recursions. Implementation details are given, and the look-ahead strategy is described. A new implementation of the QMR method, based on this coupled two-term algorithm, is described. A simplified version of the QMR algorithm without look-ahead is also presented, and the special case of QMR for complex symmetric linear systems is considered. Results of numerical experiments comparing the original and the new implementations of the QMR method are reported.
Progressive Classification Using Support Vector Machines
NASA Technical Reports Server (NTRS)
Wagstaff, Kiri; Kocurek, Michael
2009-01-01
An algorithm for progressive classification of data, analogous to progressive rendering of images, makes it possible to compromise between speed and accuracy. This algorithm uses support vector machines (SVMs) to classify data. An SVM is a machine learning algorithm that builds a mathematical model of the desired classification concept by identifying the critical data points, called support vectors. Coarse approximations to the concept require only a few support vectors, while precise, highly accurate models require far more support vectors. Once the model has been constructed, the SVM can be applied to new observations. The cost of classifying a new observation is proportional to the number of support vectors in the model. When computational resources are limited, an SVM of the appropriate complexity can be produced. However, if the constraints are not known when the model is constructed, or if they can change over time, a method for adaptively responding to the current resource constraints is required. This capability is particularly relevant for spacecraft (or any other real-time systems) that perform onboard data analysis. The new algorithm enables the fast, interactive application of an SVM classifier to a new set of data. The classification process achieved by this algorithm is characterized as progressive because a coarse approximation to the true classification is generated rapidly and thereafter iteratively refined. The algorithm uses two SVMs: (1) a fast, approximate one and (2) slow, highly accurate one. New data are initially classified by the fast SVM, producing a baseline approximate classification. For each classified data point, the algorithm calculates a confidence index that indicates the likelihood that it was classified correctly in the first pass. Next, the data points are sorted by their confidence indices and progressively reclassified by the slower, more accurate SVM, starting with the items most likely to be incorrectly classified. The user can halt this reclassification process at any point, thereby obtaining the best possible result for a given amount of computation time. Alternatively, the results can be displayed as they are generated, providing the user with real-time feedback about the current accuracy of classification.
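A minimal sketch of the two-SVM progressive scheme described above, assuming scikit-learn's LinearSVC as the fast classifier, SVC with an RBF kernel as the slow one, and the distance to the decision boundary as the confidence index; the synthetic dataset, budget, and variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC, SVC

# one synthetic dataset, split into training data and "new observations"
X, y = make_classification(n_samples=2500, n_features=20, random_state=0)
X_train, y_train, X_new = X[:2000], y[:2000], X[2000:]

fast_svm = LinearSVC(dual=False).fit(X_train, y_train)    # fast, approximate classifier
slow_svm = SVC(kernel="rbf").fit(X_train, y_train)        # slow, more accurate classifier

labels = fast_svm.predict(X_new)                           # baseline approximate classification
confidence = np.abs(fast_svm.decision_function(X_new))     # distance to the separating plane

# refine the least-confident points first; stop whenever the time budget runs out
budget = 100
for i in np.argsort(confidence)[:budget]:
    labels[i] = slow_svm.predict(X_new[i:i + 1])[0]
```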
NASA Astrophysics Data System (ADS)
Ghaemi, Z.; Farnaghi, M.; Alimohammadi, A.
2015-12-01
The critical impact of air pollution on human health and the environment on the one hand, and the complexity of pollutant concentration behavior on the other, have led scientists to look for advanced techniques for monitoring and predicting urban air quality. Additionally, recent developments in data measurement techniques have led to the collection of various types of data about air quality. Such data is extremely voluminous and, to be useful, must be processed at high velocity. Due to the complexity of big data analysis, especially for dynamic applications, online forecasting of pollutant concentration trends within a reasonable processing time is still an open problem. The purpose of this paper is to present an online forecasting approach based on Support Vector Machine (SVM) to predict the air quality one day in advance. In order to overcome the computational requirements for large-scale data analysis, distributed computing based on the Hadoop platform has been employed to leverage the processing power of multiple processing units. The MapReduce programming model is adopted for massively parallel processing in this study. Based on the online algorithm and Hadoop framework, an online forecasting system is designed to predict the air pollution of Tehran for the next 24 hours. The results have been assessed on the basis of processing time and efficiency. Quite accurate predictions of air pollutant indicator levels within an acceptable processing time prove that the presented approach is very suitable to tackle large-scale air pollution prediction problems.
LCD motion blur reduction: a signal processing approach.
Har-Noy, Shay; Nguyen, Truong Q
2008-02-01
Liquid crystal displays (LCDs) have shown great promise in the consumer market for their use as both computer and television displays. Despite their many advantages, the inherent sample-and-hold nature of LCD image formation results in a phenomenon known as motion blur. In this work, we develop a method for motion blur reduction using the Richardson-Lucy deconvolution algorithm in concert with motion vector information from the scene. We further refine our approach by introducing a perceptual significance metric that allows us to weight the amount of processing performed on different regions in the image. In addition, we analyze the role of motion vector errors in the quality of our resulting image. Perceptual tests indicate that our algorithm reduces the amount of perceivable motion blur in LCDs.
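A minimal sketch of plain Richardson-Lucy deconvolution with a motion-blur point-spread function derived from a hypothetical horizontal motion vector; it omits the paper's perceptual weighting and motion-vector error analysis.

```python
import numpy as np
from scipy.signal import fftconvolve

def richardson_lucy(blurred, psf, iterations=30):
    """Basic Richardson-Lucy deconvolution (no perceptual weighting)."""
    estimate = np.full_like(blurred, 0.5)
    psf_mirror = psf[::-1, ::-1]
    for _ in range(iterations):
        reblurred = fftconvolve(estimate, psf, mode="same")
        ratio = blurred / (reblurred + 1e-12)
        estimate *= fftconvolve(ratio, psf_mirror, mode="same")
    return estimate

# hypothetical motion PSF: a horizontal box filter whose length comes from the motion vector
motion_length = 9
psf = np.full((1, motion_length), 1.0 / motion_length)
image = np.random.default_rng(0).random((64, 64))
blurred = fftconvolve(image, psf, mode="same")
restored = richardson_lucy(blurred, psf, iterations=25)
```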
Vectorization, threading, and cache-blocking considerations for hydrocodes on emerging architectures
Fung, J.; Aulwes, R. T.; Bement, M. T.; ...
2015-07-14
This work reports on considerations for improving computational performance in preparation for current and expected changes to computer architecture. The algorithms studied will include increasingly complex prototypes for radiation hydrodynamics codes, such as gradient routines and diffusion matrix assembly (e.g., in [1-6]). The meshes considered for the algorithms are structured or unstructured meshes. The considerations applied for performance improvements are meant to be general in terms of architecture (not specifically graphical processing units (GPUs) or multi-core machines, for example) and include techniques for vectorization, threading, tiling, and cache blocking. Out of a survey of optimization techniques on applications such as diffusion and hydrodynamics, we make general recommendations with a view toward making these techniques conceptually accessible to the applications code developer. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.
A fully vectorized numerical solution of the incompressible Navier-Stokes equations. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Patel, N.
1983-01-01
A vectorizable algorithm is presented for the implicit finite difference solution of the incompressible Navier-Stokes equations in general curvilinear coordinates. The unsteady Reynolds-averaged Navier-Stokes equations solved are in two-dimensional, non-conservative primitive-variable form. A two-layer algebraic eddy viscosity turbulence model is used to incorporate the effects of turbulence. Two momentum equations and a Poisson pressure equation, which is obtained by taking the divergence of the momentum equations and satisfying the continuity equation, are solved simultaneously at each time step. An elliptic grid generation approach is used to generate a boundary conforming coordinate system about an airfoil. The governing equations are expressed in terms of the curvilinear coordinates and are solved on a uniform rectangular computational domain. A checkerboard SOR, which can effectively utilize the computer architectural concept of vector processing, is used for iterative solution of the governing equations.
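A minimal sketch of the checkerboard (red-black) SOR idea on a model Poisson problem: points of each color depend only on points of the other color, so each half-sweep is a single vectorizable array update. The model problem, grid size, and relaxation factor are illustrative, not those of the cited Navier-Stokes solver.

```python
import numpy as np

def checkerboard_sor(f, h, omega=1.7, sweeps=300):
    """Red-black (checkerboard) SOR for the 2-D Poisson equation -lap(u) = f,
    with u = 0 on the boundary. Each color update is a fully vectorizable operation."""
    u = np.zeros_like(f)
    ii, jj = np.meshgrid(np.arange(f.shape[0]), np.arange(f.shape[1]), indexing="ij")
    interior = (ii > 0) & (ii < f.shape[0] - 1) & (jj > 0) & (jj < f.shape[1] - 1)
    for _ in range(sweeps):
        for color in (0, 1):
            mask = interior & ((ii + jj) % 2 == color)
            gs = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0)
                         + np.roll(u, 1, 1) + np.roll(u, -1, 1) + h * h * f)
            u[mask] = (1 - omega) * u[mask] + omega * gs[mask]
    return u

n = 65
h = 1.0 / (n - 1)
f = np.ones((n, n))
u = checkerboard_sor(f, h)
print(u[n // 2, n // 2])   # approximate value of the solution at the cavity center
```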
Stochastic determination of matrix determinants
NASA Astrophysics Data System (ADS)
Dorn, Sebastian; Enßlin, Torsten A.
2015-07-01
Matrix determinants play an important role in data analysis, in particular when Gaussian processes are involved. Due to currently exploding data volumes, linear operations—matrices—acting on the data are often not accessible directly but are only represented indirectly in form of a computer routine. Such a routine implements the transformation a data vector undergoes under matrix multiplication. While efficient probing routines to estimate a matrix's diagonal or trace, based solely on such computationally affordable matrix-vector multiplications, are well known and frequently used in signal inference, there is no stochastic estimate for its determinant. We introduce a probing method for the logarithm of a determinant of a linear operator. Our method rests upon a reformulation of the log-determinant by an integral representation and the transformation of the involved terms into stochastic expressions. This stochastic determinant determination enables large-size applications in Bayesian inference, in particular evidence calculations, model comparison, and posterior determination.
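A rough sketch of the kind of estimator described above: the log-determinant is written as an integral of a trace, and the trace is replaced by a Hutchinson probing average that needs only matrix-vector products and linear solves. The quadrature rule, probe count, and symmetric positive-definite test matrix are illustrative, and the dense solves stand in for whatever implicit routine represents the operator.

```python
import numpy as np

def stochastic_logdet(A, n_probes=64, n_nodes=20, seed=0):
    """Probing estimate of log det(A) for SPD A, via the integral representation
    log det(A) = int_0^1 tr[(A - I) (I + t (A - I))^{-1}] dt,
    with the trace replaced by a Hutchinson average over random +/-1 vectors."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    B = A - np.eye(n)
    t, w = np.polynomial.legendre.leggauss(n_nodes)   # Gauss-Legendre nodes/weights
    t, w = 0.5 * (t + 1.0), 0.5 * w                   # map from [-1, 1] to [0, 1]
    total = 0.0
    for ti, wi in zip(t, w):
        M = np.eye(n) + ti * B
        for _ in range(n_probes):
            z = rng.choice([-1.0, 1.0], size=n)
            total += wi * (z @ B @ np.linalg.solve(M, z)) / n_probes
    return total

rng = np.random.default_rng(1)
Q = rng.standard_normal((50, 50))
A = Q @ Q.T + 50 * np.eye(50)                         # well-conditioned SPD test matrix
print(stochastic_logdet(A), np.linalg.slogdet(A)[1])  # estimate vs exact log-determinant
```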
Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback
Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi
2016-01-01
Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery. PMID:27861505
Automated metastatic brain lesion detection: a computer aided diagnostic and clinical research tool
NASA Astrophysics Data System (ADS)
Devine, Jeremy; Sahgal, Arjun; Karam, Irene; Martel, Anne L.
2016-03-01
The accurate localization of brain metastases in magnetic resonance (MR) images is crucial for patients undergoing stereotactic radiosurgery (SRS) to ensure that all neoplastic foci are targeted. Computer automated tumor localization and analysis can improve both of these tasks by eliminating inter and intra-observer variations during the MR image reading process. Lesion localization is accomplished using adaptive thresholding to extract enhancing objects. Each enhancing object is represented as a vector of features which includes information on object size, symmetry, position, shape, and context. These vectors are then used to train a random forest classifier. We trained and tested the image analysis pipeline on 3D axial contrast-enhanced MR images with the intention of localizing the brain metastases. In our cross validation study and at the most effective algorithm operating point, we were able to identify 90% of the lesions at a precision rate of 60%.
Hu, Kai; Gui, Zhipeng; Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi
2016-01-01
Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery.
Stochastic determination of matrix determinants.
Dorn, Sebastian; Ensslin, Torsten A
2015-07-01
Matrix determinants play an important role in data analysis, in particular when Gaussian processes are involved. Due to currently exploding data volumes, linear operations-matrices-acting on the data are often not accessible directly but are only represented indirectly in form of a computer routine. Such a routine implements the transformation a data vector undergoes under matrix multiplication. While efficient probing routines to estimate a matrix's diagonal or trace, based solely on such computationally affordable matrix-vector multiplications, are well known and frequently used in signal inference, there is no stochastic estimate for its determinant. We introduce a probing method for the logarithm of a determinant of a linear operator. Our method rests upon a reformulation of the log-determinant by an integral representation and the transformation of the involved terms into stochastic expressions. This stochastic determinant determination enables large-size applications in Bayesian inference, in particular evidence calculations, model comparison, and posterior determination.
NASA Technical Reports Server (NTRS)
Pongratz, M.
1972-01-01
Results from a Nike-Tomahawk sounding rocket flight launched from Fort Churchill are presented. The rocket was launched into a breakup aurora at magnetic local midnight on 21 March 1968. The rocket was instrumented to measure electrons with an electrostatic analyzer electron spectrometer which made 29 measurements in the energy interval 0.5 keV to 30 keV. Complete energy spectra were obtained at a rate of 10/sec. Pitch angle information is presented via three computed averages per rocket spin. The dumped electron average corresponds to averages over electrons moving nearly parallel to the B vector. The mirroring electron average corresponds to averages over electrons moving nearly perpendicular to the B vector. The average was also computed over the entire downward hemisphere (the precipitated electron average). The observations were obtained in an altitude range of 10 km at 230 km altitude.
Computation of Surface Integrals of Curl Vector Fields
ERIC Educational Resources Information Center
Hu, Chenglie
2007-01-01
This article presents a way of computing a surface integral when the vector field of the integrand is a curl field. Presented in some advanced calculus textbooks such as [1], the technique, as the author experienced, is simple and applicable. The computation is based on Stokes' theorem in 3-space calculus, and thus provides not only a means to…
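For reference, the identity underlying the technique is Stokes' theorem, which converts the surface integral of a curl field into a line integral around the boundary curve of the surface:

```latex
\iint_{S} (\nabla \times \mathbf{F}) \cdot d\mathbf{S}
  \;=\; \oint_{\partial S} \mathbf{F} \cdot d\mathbf{r}
```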
Rotman Lens Sidewall Design and Optimization with Hybrid Hardware/Software Based Programming
2015-01-09
conventional MoM and stored in memory. The components of Zfar are computed as needed through a fast matrix-vector multiplication (MVM), which ... V vector. Iterative methods, e.g. BiCGSTAB, are employed for solving the linear equation. The matrix-vector multiplications (MVMs), which dominate most of the computation in the solving phase, consist of calculating near and far MVMs. The far MVM comprises aggregation, translation, and
A Semi-Vectorization Algorithm to Synthesis of Gravitational Anomaly Quantities on the Earth
NASA Astrophysics Data System (ADS)
Abdollahzadeh, M.; Eshagh, M.; Najafi Alamdari, M.
2009-04-01
The Earth's gravitational potential can be expressed by the well-known spherical harmonic expansion. The computational time of summing up this expansion is an important practical issue which can be reduced by an efficient numerical algorithm. This paper proposes such a method for block-wise synthesizing the anomaly quantities on the Earth surface using vectorization. Full vectorization means transformation of the summations into simple matrix and vector products, but this is not practical for matrices with large dimensions. Here a semi-vectorization algorithm is proposed to avoid working with large vectors and matrices. It speeds up the computations by using one loop for the summation either on degrees or on orders. The former is a good option to synthesize the anomaly quantities on the Earth surface considering a digital elevation model (DEM). This approach is more efficient than the two-step method which computes the quantities on the reference ellipsoid and continues them upward to the Earth surface. The algorithm has been coded in MATLAB, which synthesizes a global grid of 5' x 5' (corresponding to about 9 million points) of gravity anomaly or geoid height using a geopotential model to degree 360 in 10,000 seconds on an ordinary computer with 2 GB of RAM.
Vectorization on the star computer of several numerical methods for a fluid flow problem
NASA Technical Reports Server (NTRS)
Lambiotte, J. J., Jr.; Howser, L. M.
1974-01-01
Some numerical methods are reexamined in light of the new class of computers which use vector streaming to achieve high computation rates. A study has been made of the effect on the relative efficiency of several numerical methods applied to a particular fluid flow problem when they are implemented on a vector computer. The method of Brailovskaya, the alternating direction implicit method, a fully implicit method, and a new method called partial implicitization have been applied to the problem of determining the steady state solution of the two-dimensional flow of a viscous incompressible fluid in a square cavity driven by a sliding wall. Results are obtained for three mesh sizes and a comparison is made of the methods for serial computation.
An efficient sparse matrix multiplication scheme for the CYBER 205 computer
NASA Technical Reports Server (NTRS)
Lambiotte, Jules J., Jr.
1988-01-01
This paper describes the development of an efficient algorithm for computing the product of a matrix and vector on a CYBER 205 vector computer. The desire to provide software which allows the user to choose between the often conflicting goals of minimizing central processing unit (CPU) time or storage requirements has led to a diagonal-based algorithm in which one of four types of storage is selected for each diagonal. The candidate storage types employed were chosen to be efficient on the CYBER 205 for diagonals which have nonzero structure which is dense, moderately sparse, very sparse and short, or very sparse and long; however, for many densities, no diagonal type is most efficient with respect to both resource requirements, and a trade-off must be made. For each diagonal, an initialization subroutine estimates the CPU time and storage required for each storage type based on results from previously performed numerical experimentation. These requirements are adjusted by weights provided by the user which reflect the relative importance the user places on the two resources. The adjusted resource requirements are then compared to select the most efficient storage and computational scheme.
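A minimal sketch of a diagonal-based matrix-vector product of the kind described above (one padded array per stored diagonal, so each diagonal contributes a long vector operation); the storage convention and the tridiagonal test case are illustrative, not the CYBER 205 code's four storage types.

```python
import numpy as np

def dia_matvec(offsets, diagonals, x):
    """y = A @ x where A is stored by diagonals: diagonals[k] holds the entries of
    the diagonal with offset offsets[k] (positive = superdiagonal), zero-padded to length n."""
    n = x.size
    y = np.zeros(n)
    for off, d in zip(offsets, diagonals):
        if off >= 0:
            y[:n - off] += d[:n - off] * x[off:]
        else:
            y[-off:] += d[:n + off] * x[:n + off]
    return y

# small tridiagonal test: A = tridiag(-1, 2, -1)
n = 6
offsets = [-1, 0, 1]
diagonals = [np.full(n, -1.0), np.full(n, 2.0), np.full(n, -1.0)]
x = np.arange(1.0, n + 1)
print(dia_matvec(offsets, diagonals, x))
```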
NASA Technical Reports Server (NTRS)
Nguyen, Duc T.
1990-01-01
Practical engineering applications can often be formulated in the form of a constrained optimization problem. There are several solution algorithms for solving a constrained optimization problem. One approach is to convert a constrained problem into a series of unconstrained problems. Furthermore, unconstrained solution algorithms can be used as part of the constrained solution algorithms. Structural optimization is an iterative process: one starts with an initial design, and a finite element structural analysis is then performed to calculate the response of the system (such as displacements, stresses, eigenvalues, etc.). Based upon the sensitivity information on the objective and constraint functions, an optimizer such as ADS or IDESIGN can be used to find the new, improved design. For the structural analysis phase, the equation solver for the system of simultaneous, linear equations plays a key role since it is needed for either static, or eigenvalue, or dynamic analysis. For practical, large-scale structural analysis-synthesis applications, computational time can be excessively large. Thus, it is necessary to have a new structural analysis-synthesis code which employs new solution algorithms to exploit both parallel and vector capabilities offered by modern, high performance computers such as the Convex, Cray-2 and Cray-YMP computers. The objective of this research project is, therefore, to incorporate the latest development in the parallel-vector equation solver, PVSOLVE, into the widely popular finite-element production code SAP-4. Furthermore, several nonlinear unconstrained optimization subroutines have also been developed and tested under a parallel computer environment. The unconstrained optimization subroutines are not only useful in their own right, but they can also be incorporated into a more popular constrained optimization code, such as ADS.
Solar physics applications of computer graphics and image processing
NASA Technical Reports Server (NTRS)
Altschuler, M. D.
1985-01-01
Computer graphics devices coupled with computers and carefully developed software provide new opportunities to achieve insight into the geometry and time evolution of scalar, vector, and tensor fields and to extract more information quickly and cheaply from the same image data. Two or more different fields which overlay in space can be calculated from the data (and the physics), then displayed from any perspective, and compared visually. The maximum regions of one field can be compared with the gradients of another. Time changing fields can also be compared. Images can be added, subtracted, transformed, noise filtered, frequency filtered, contrast enhanced, color coded, enlarged, compressed, parameterized, and histogrammed, in whole or section by section. Today it is possible to process multiple digital images to reveal spatial and temporal correlations and cross correlations. Data from different observatories taken at different times can be processed, interpolated, and transformed to a common coordinate system.
A Real-Time Phase Vector Display for EEG Monitoring
NASA Technical Reports Server (NTRS)
Finger, Herbert J.; Anliker, James E.; Rimmer, Tamara
1973-01-01
A real-time, computer-based, phase vector display system has been developed which will output a vector whose phase is equal to the delay between a trigger and the peak of a function which is quasi-coherent with respect to the trigger. The system also contains a sliding averager which enables the operator to average successive trials before calculating the phase vector. Data collection, averaging and display generation are performed on a LINC-8 computer. Output displays appear on several X-Y CRT display units and on a kymograph camera/oscilloscope unit which is used to generate photographs of time-varying phase vectors or contourograms of time-varying averages of input functions.
NASA Technical Reports Server (NTRS)
Lakeotes, Christopher D.
1990-01-01
DEVECT (CYBER-205 Devectorizer) is a CYBER-205 FORTRAN source-language preprocessor that reduces vector statements to standard FORTRAN. In addition, DEVECT has many other standard and optional features simplifying conversion of vector-processor programs for the CYBER 200 to other computers. Written in FORTRAN IV.
Efficient solution of parabolic equations by Krylov approximation methods
NASA Technical Reports Server (NTRS)
Gallopoulos, E.; Saad, Y.
1990-01-01
Numerical techniques for solving parabolic equations by the method of lines are addressed. The main motivation for the proposed approach is the possibility of exploiting a high degree of parallelism in a simple manner. The basic idea of the method is to approximate the action of the evolution operator on a given state vector by means of a projection process onto a Krylov subspace. Thus, the resulting approximation consists of applying an evolution operator of a very small dimension to a known vector which is, in turn, computed accurately by exploiting well-known rational approximations to the exponential. Because the rational approximation is only applied to a small matrix, the only operations required with the original large matrix are matrix-by-vector multiplications, and as a result the algorithm can easily be parallelized and vectorized. Some relevant approximation and stability issues are discussed. We present some numerical experiments with the method and compare its performance with a few explicit and implicit algorithms.
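A minimal sketch of the approach: an Arnoldi process builds a small Krylov basis using only matrix-vector products with the large matrix, and the exponential is applied exactly to the small projected matrix. The heat-equation test problem, subspace size, and function names are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import expm

def krylov_expm_action(A, v, tau, m=30):
    """Approximate exp(tau*A) @ v from an m-step Arnoldi projection; only
    matrix-vector products with the large matrix A are needed."""
    n = v.size
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(v)
    V[:, 0] = v / beta
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):                 # modified Gram-Schmidt orthogonalization
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:                # happy breakdown
            m = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(m)
    e1[0] = 1.0
    return beta * V[:, :m] @ (expm(tau * H[:m, :m]) @ e1)

# model problem: 1-D heat equation, A = second-difference operator on 100 interior points
n = 100
A = ((n + 1) ** 2) * (np.diag(-2.0 * np.ones(n))
                      + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1))
x = np.linspace(0.0, 1.0, n + 2)[1:-1]
v = np.exp(-100.0 * (x - 0.5) ** 2)
approx = krylov_expm_action(A, v, tau=1e-3, m=30)
exact = expm(1e-3 * A) @ v
print(np.linalg.norm(approx - exact) / np.linalg.norm(exact))   # small relative error
```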
NASA Technical Reports Server (NTRS)
Bommier, V.
1986-01-01
The Hanle effect is the modification of the linear polarization parameters of a spectral line due to the effect of the magnetic field. It has been successfully applied to magnetic field vector diagnostics in solar prominences. The magnetic field vector is determined by comparing the measured polarization to the polarization computed, taking into account all the polarizing and depolarizing processes in line formation and the depolarizing effect of the magnetic field. The method was applied to simultaneous polarization measurements in the Helium D3 line and in the hydrogen beta line in 14 prominences. Four polarization parameters are measured, which lead to the determination of the three coordinates of the magnetic field vector and the electron density, owing to the sensitivity of the hydrogen beta line to the non-negligible effect of depolarizing collisions with electrons and protons of the medium. A mean electron density of 1.3 x 10^10 per cubic centimeter is derived for the 14 prominences.
A selective-update affine projection algorithm with selective input vectors
NASA Astrophysics Data System (ADS)
Kong, NamWoong; Shin, JaeWook; Park, PooGyeon
2011-10-01
This paper proposes an affine projection algorithm (APA) with selective input vectors, which is based on the concept of selective update in order to reduce estimation errors and computations. The algorithm consists of two procedures: input-vector-selection and state-decision. The input-vector-selection procedure determines the number of input vectors by checking with the mean square error (MSE) whether the input vectors have enough information for an update. The state-decision procedure determines the current state of the adaptive filter by using the state-decision criterion. While the adaptive filter is in the transient state, the algorithm updates the filter coefficients with the selected input vectors. On the other hand, as soon as the adaptive filter reaches the steady state, the update procedure is not performed. Through these two procedures, the proposed algorithm achieves small steady-state estimation errors, low computational complexity and low update complexity for colored input signals.
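A minimal sketch of a basic affine projection update (without the selective-update and state-decision logic described above), applied to a toy system-identification problem; the step size, regularization, projection order, and variable names are illustrative.

```python
import numpy as np

def apa_update(w, X, d, mu=0.5, delta=1e-4):
    """One step of the basic affine projection algorithm (APA).
    X: the K most recent input vectors as columns (L x K); d: the K desired samples."""
    e = d - X.T @ w                                        # a-priori errors for the K regressors
    w = w + mu * X @ np.linalg.solve(X.T @ X + delta * np.eye(X.shape[1]), e)
    return w, e

# toy system identification: recover an FIR filter h from its (noiseless) output
rng = np.random.default_rng(0)
L, K, n_samples = 16, 4, 4000
h = rng.standard_normal(L)
x = rng.standard_normal(n_samples)
w = np.zeros(L)
for n in range(L + K, n_samples):
    X = np.column_stack([x[n - k - L + 1:n - k + 1][::-1] for k in range(K)])   # L x K
    d = np.array([x[n - k - L + 1:n - k + 1][::-1] @ h for k in range(K)])
    w, _ = apa_update(w, X, d, mu=0.5)
print(np.linalg.norm(w - h))   # should be small after convergence
```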
Potential Application of a Graphical Processing Unit to Parallel Computations in the NUBEAM Code
NASA Astrophysics Data System (ADS)
Payne, J.; McCune, D.; Prater, R.
2010-11-01
NUBEAM is a comprehensive computational Monte Carlo based model for neutral beam injection (NBI) in tokamaks. NUBEAM computes NBI-relevant profiles in tokamak plasmas by tracking the deposition and the slowing of fast ions. At the core of NUBEAM are vector calculations used to track fast ions. These calculations have recently been parallelized to run on MPI clusters. However, cost and interlink bandwidth limit the ability to fully parallelize NUBEAM on an MPI cluster. Recent implementation of double precision capabilities for Graphical Processing Units (GPUs) presents a cost effective and high performance alternative or complement to MPI computation. Commercially available graphics cards can achieve up to 672 GFLOPS double precision and can handle hundreds of thousands of threads. The ability to execute at least one thread per particle simultaneously could significantly reduce the execution time and the statistical noise of NUBEAM. Progress on implementation on a GPU will be presented.
The Unified Floating Point Vector Coprocessor for Reconfigurable Hardware
NASA Astrophysics Data System (ADS)
Kathiara, Jainik
There has been an increased interest recently in using embedded cores on FPGAs. Many of the applications that make use of these cores have floating point operations. Due to the complexity and expense of floating point hardware, these algorithms are usually converted to fixed point operations or implemented using floating-point emulation in software. As the technology advances, more and more homogeneous computational resources and fixed function embedded blocks are added to FPGAs and hence implementation of floating point hardware becomes a feasible option. In this research we have implemented a high performance, autonomous floating point vector Coprocessor (FPVC) that works independently within an embedded processor system. We have presented a unified approach to vector and scalar computation, using a single register file for both scalar operands and vector elements. The Hybrid vector/SIMD computational model of FPVC results in greater overall performance for most applications along with improved peak performance compared to other approaches. By parameterizing vector length and the number of vector lanes, we can design an application specific FPVC and take optimal advantage of the FPGA fabric. For this research we have also initiated designing a software library for various computational kernels, each of which adapts FPVC's configuration and provide maximal performance. The kernels implemented are from the area of linear algebra and include matrix multiplication and QR and Cholesky decomposition. We have demonstrated the operation of FPVC on a Xilinx Virtex 5 using the embedded PowerPC.
SAPNEW: Parallel finite element code for thin shell structures on the Alliant FX-80
NASA Astrophysics Data System (ADS)
Kamat, Manohar P.; Watson, Brian C.
1992-11-01
The finite element method has proven to be an invaluable tool for analysis and design of complex, high performance systems, such as bladed-disk assemblies in aircraft turbofan engines. However, as the problem size increases, the computation time required by conventional computers can be prohibitively high. Parallel processing computers provide the means to overcome these computation time limits. This report summarizes the results of a research activity aimed at providing a finite element capability for analyzing turbomachinery bladed-disk assemblies in a vector/parallel processing environment. A special purpose code, named with the acronym SAPNEW, has been developed to perform static and eigen analysis of multi-degree-of-freedom blade models built-up from flat thin shell elements. SAPNEW provides a stand-alone capability for static and eigen analysis on the Alliant FX/80, a parallel processing computer. A preprocessor, named with the acronym NTOS, has been developed to accept NASTRAN input decks and convert them to the SAPNEW format to make SAPNEW more readily used by researchers at NASA Lewis Research Center.
Vector Observation-Aided/Attitude-Rate Estimation Using Global Positioning System Signals
NASA Technical Reports Server (NTRS)
Oshman, Yaakov; Markley, F. Landis
1997-01-01
A sequential filtering algorithm is presented for attitude and attitude-rate estimation from Global Positioning System (GPS) differential carrier phase measurements. A third-order, minimal-parameter method for solving the attitude matrix kinematic equation is used to parameterize the filter's state, which renders the resulting estimator computationally efficient. Borrowing from tracking theory concepts, the angular acceleration is modeled as an exponentially autocorrelated stochastic process, thus avoiding the use of the uncertain spacecraft dynamic model. The new formulation facilitates the use of aiding vector observations in a unified filtering algorithm, which can enhance the method's robustness and accuracy. Numerical examples are used to demonstrate the performance of the method.
The Lenz Vector and Orbital Analog Computers
ERIC Educational Resources Information Center
Harter, W. G.
1976-01-01
Describes a single geometrical diagram based on the Lenz vector which shows the qualitative and quantitative features of all three types of Coulomb orbits. Explains the use of a simple analog computer with an overhead projector to demonstrate many of these effects. (Author/CP)
NASA Technical Reports Server (NTRS)
Pratt, D. T.
1984-01-01
An interactive computer code for simulation of a high-intensity turbulent combustor as a single point inhomogeneous stirred reactor was developed from an existing batch processing computer code CDPSR. The interactive CDPSR code was used as a guide for interpretation and direction of DOE-sponsored companion experiments utilizing Xenon tracer with optical laser diagnostic techniques to experimentally determine the appropriate mixing frequency, and for validation of CDPSR as a mixing-chemistry model for a laboratory jet-stirred reactor. The coalescence-dispersion model for finite rate mixing was incorporated into an existing interactive code AVCO-MARK I, to enable simulation of a combustor as a modular array of stirred flow and plug flow elements, each having a prescribed finite mixing frequency, or axial distribution of mixing frequency, as appropriate. The speed and reliability of the batch kinetics integrator code CREKID were further increased by rewriting it in vectorized form for execution on a vector or parallel processor, and by incorporating numerical techniques which enhance execution speed by permitting specification of a very low accuracy tolerance.
Implementation and analysis of a Navier-Stokes algorithm on parallel computers
NASA Technical Reports Server (NTRS)
Fatoohi, Raad A.; Grosch, Chester E.
1988-01-01
The results of the implementation of a Navier-Stokes algorithm on three parallel/vector computers are presented. The object of this research is to determine how well, or poorly, a single numerical algorithm would map onto three different architectures. The algorithm is a compact difference scheme for the solution of the incompressible, two-dimensional, time-dependent Navier-Stokes equations. The computers were chosen so as to encompass a variety of architectures. They are the following: the MPP, an SIMD machine with 16K bit serial processors; Flex/32, an MIMD machine with 20 processors; and Cray/2. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. The basic comparison is among SIMD instruction parallelism on the MPP, MIMD process parallelism on the Flex/32, and vectorization of a serial code on the Cray/2. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally, conclusions are presented.
Missile signal processing common computer architecture for rapid technology upgrade
NASA Astrophysics Data System (ADS)
Rabinkin, Daniel V.; Rutledge, Edward; Monticciolo, Paul
2004-10-01
Interceptor missiles process IR images to locate an intended target and guide the interceptor towards it. Signal processing requirements have increased as the sensor bandwidth increases and interceptors operate against more sophisticated targets. A typical interceptor signal processing chain is comprised of two parts. Front-end video processing operates on all pixels of the image and performs such operations as non-uniformity correction (NUC), image stabilization, frame integration and detection. Back-end target processing, which tracks and classifies targets detected in the image, performs such algorithms as Kalman tracking, spectral feature extraction and target discrimination. In the past, video processing was implemented using ASIC components or FPGAs because computation requirements exceeded the throughput of general-purpose processors. Target processing was performed using hybrid architectures that included ASICs, DSPs and general-purpose processors. The resulting systems tended to be function-specific, and required custom software development. They were developed using non-integrated toolsets and test equipment was developed along with the processor platform. The lifespan of a system utilizing the signal processing platform often spans decades, while the specialized nature of processor hardware and software makes it difficult and costly to upgrade. As a result, the signal processing systems often run on outdated technology, algorithms are difficult to update, and system effectiveness is impaired by the inability to rapidly respond to new threats. A new design approach is made possible by three developments: Moore's Law-driven improvement in computational throughput; a newly introduced vector computing capability in general-purpose processors; and a modern set of open interface software standards. Today's multiprocessor commercial-off-the-shelf (COTS) platforms have sufficient throughput to support interceptor signal processing requirements. This application may be programmed under existing real-time operating systems using parallel processing software libraries, resulting in highly portable code that can be rapidly migrated to new platforms as processor technology evolves. Use of standardized development tools and 3rd party software upgrades are enabled as well as rapid upgrade of processing components as improved algorithms are developed. The resulting weapon system will have a superior processing capability over a custom approach at the time of deployment as a result of shorter development cycles and use of newer technology. The signal processing computer may be upgraded over the lifecycle of the weapon system, and can migrate between weapon system variants enabled by modification simplicity. This paper presents a reference design using the new approach that utilizes an Altivec PowerPC parallel COTS platform. It uses a VxWorks-based real-time operating system (RTOS), and application code developed using an efficient parallel vector library (PVL). A quantification of computing requirements and demonstration of interceptor algorithm operating on this real-time platform are provided.
NASA Technical Reports Server (NTRS)
Wang, R.; Demerdash, N. A.
1991-01-01
A method of combined use of magnetic vector potential based finite-element (FE) formulations and magnetic scalar potential (MSP) based formulations for computation of three-dimensional magnetostatic fields is introduced. In this method, the curl-component of the magnetic field intensity is computed by a reduced magnetic vector potential. This field intensity forms the basis of a forcing function for a global magnetic scalar potential solution over the entire volume of the region. This method allows one to include iron portions sandwiched in between conductors within partitioned current-carrying subregions. The method is most suited for large-scale global-type 3-D magnetostatic field computations in electrical devices, and in particular rotating electric machinery.
Feature generation using genetic programming with application to fault classification.
Guo, Hong; Jack, Lindsay B; Nandi, Asoke K
2005-02-01
One of the major challenges in pattern recognition problems is the feature extraction process which derives new features from existing features, or directly from raw data, in order to reduce the cost of computation during the classification process while improving classifier efficiency. Most current feature extraction techniques transform the original pattern vector into a new vector with increased discrimination capability but lower dimensionality. This is conducted within a predefined feature space, and thus, has limited searching power. Genetic programming (GP) can generate new features from the original dataset without prior knowledge of the probabilistic distribution. In this paper, a GP-based approach is developed for feature extraction from raw vibration data recorded from a rotating machine with six different conditions. The created features are then used as the inputs to a neural classifier for the identification of six bearing conditions. Experimental results demonstrate the ability of GP to automatically discover the different bearing conditions using features expressed in the form of nonlinear functions. Furthermore, four sets of results--using GP-extracted features with artificial neural networks (ANN) and support vector machines (SVM), as well as traditional features with ANN and SVM--have been obtained. This GP-based approach is used for bearing fault classification for the first time and exhibits superior searching power over other techniques. Additionally, it significantly reduces the time for computation compared with a genetic algorithm (GA), and therefore offers a more practical realization of the solution.
NASA Technical Reports Server (NTRS)
Babrauckas, Theresa
2000-01-01
The Affordable High Performance Computing (AHPC) project demonstrated that high-performance computing based on a distributed network of computer workstations is a cost-effective alternative to vector supercomputers for running CPU and memory intensive design and analysis tools. The AHPC project created an integrated system called a Network Supercomputer. By connecting computer workstations through a network and utilizing the workstations when they are idle, the resulting distributed-workstation environment has the same performance and reliability levels as the Cray C90 vector supercomputer at less than 25 percent of the C90 cost. In fact, the cost comparison between a Cray C90 supercomputer and Sun workstations showed that the number of distributed networked workstations equivalent to a C90 costs approximately 8 percent of the C90.
Improved dense trajectories for action recognition based on random projection and Fisher vectors
NASA Astrophysics Data System (ADS)
Ai, Shihui; Lu, Tongwei; Xiong, Yudian
2018-03-01
As an important application of intelligent monitoring systems, action recognition in video has become a very important research area of computer vision. In order to improve the accuracy of action recognition in video with improved dense trajectories, an advanced encoding method is introduced. Improved dense trajectories are combined with Fisher Vectors and Random Projection. The method reduces the dimensionality of the trajectory descriptors by projecting the high-dimensional descriptors into a low-dimensional subspace with Random Projection before a Gaussian mixture model is defined and analyzed. A GMM-FV hybrid model is then introduced to encode the trajectory feature vectors and reduce their dimension. The computational complexity is reduced because Random Projection shortens the Fisher coding vector. Finally, a linear SVM classifier is used to predict labels. We tested the algorithm on the UCF101 and KTH datasets. Compared with several existing algorithms, the results show that the method not only reduces the computational complexity but also improves the accuracy of action recognition.
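A minimal sketch of the encoding pipeline described above (random projection of trajectory descriptors, a diagonal-covariance GMM, a simplified Fisher vector using only the gradients with respect to the means, and a linear SVM); the descriptor dimensions, component counts, and fake data are illustrative, and the full Fisher vector also includes weight and variance gradients plus normalization.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.random_projection import GaussianRandomProjection
from sklearn.svm import LinearSVC

def fisher_vector_means(gmm, X):
    """Simplified Fisher vector: gradients with respect to the GMM means only."""
    q = gmm.predict_proba(X)                                                  # T x K soft assignments
    diff = (X[:, None, :] - gmm.means_[None, :, :]) / np.sqrt(gmm.covariances_)[None, :, :]
    fv = (q[:, :, None] * diff).sum(axis=0) / (X.shape[0] * np.sqrt(gmm.weights_)[:, None])
    return fv.ravel()

rng = np.random.default_rng(0)
descriptors = [rng.standard_normal((200, 426)) for _ in range(40)]            # fake trajectory descriptors
labels = rng.integers(0, 2, size=40)

proj = GaussianRandomProjection(n_components=64, random_state=0).fit(descriptors[0])
reduced = [proj.transform(d) for d in descriptors]                            # dimensionality reduction

gmm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0)
gmm.fit(np.vstack(reduced))

features = np.array([fisher_vector_means(gmm, d) for d in reduced])
clf = LinearSVC(dual=False).fit(features, labels)
print(clf.score(features, labels))
```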
Point-based warping with optimized weighting factors of displacement vectors
NASA Astrophysics Data System (ADS)
Pielot, Ranier; Scholz, Michael; Obermayer, Klaus; Gundelfinger, Eckart D.; Hess, Andreas
2000-06-01
The accurate comparison of inter-individual 3D image brain datasets requires non-affine transformation techniques (warping) to reduce geometric variations. Constrained by the biological prerequisites we use in this study a landmark-based warping method with weighted sums of displacement vectors, which is enhanced by an optimization process. Furthermore, we investigate fast automatic procedures for determining landmarks to improve the practicability of 3D warping. This combined approach was tested on 3D autoradiographs of Gerbil brains. The autoradiographs were obtained after injecting a non-metabolized radioactive glucose derivative into the Gerbil thereby visualizing neuronal activity in the brain. Afterwards the brain was processed with standard autoradiographical methods. The landmark-generator computes corresponding reference points simultaneously within a given number of datasets by Monte-Carlo-techniques. The warping function is a distance weighted exponential function with a landmark- specific weighting factor. These weighting factors are optimized by a computational evolution strategy. The warping quality is quantified by several coefficients (correlation coefficient, overlap-index, and registration error). The described approach combines a highly suitable procedure to automatically detect landmarks in autoradiographical brain images and an enhanced point-based warping technique, optimizing the local weighting factors. This optimization process significantly improves the similarity between the warped and the target dataset.
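A minimal sketch of landmark-based warping with distance-weighted displacement vectors; the Gaussian distance weighting, the per-landmark weighting factors (left at 1 here rather than optimized by an evolution strategy), and the 2-D toy landmarks are illustrative assumptions.

```python
import numpy as np

def warp_points(points, src_landmarks, dst_landmarks, sigma=20.0, weights=None):
    """Displace points by a distance-weighted sum of landmark displacement vectors;
    'weights' are per-landmark factors that an optimizer could tune."""
    disp = dst_landmarks - src_landmarks                       # displacement vectors
    if weights is None:
        weights = np.ones(len(src_landmarks))
    d2 = ((points[:, None, :] - src_landmarks[None, :, :]) ** 2).sum(-1)
    w = weights[None, :] * np.exp(-d2 / (2.0 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True) + 1e-12
    return points + w @ disp

# hypothetical 2-D example with three landmarks
src = np.array([[10.0, 10.0], [50.0, 40.0], [80.0, 20.0]])
dst = src + np.array([[2.0, 0.0], [0.0, -3.0], [-1.0, 1.0]])
grid = np.stack(np.meshgrid(np.arange(0, 100, 10.0), np.arange(0, 60, 10.0)), -1).reshape(-1, 2)
warped = warp_points(grid, src, dst)
print(warped[:3])
```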
Morphological evidence for parallel processing of information in rat macula.
Ross, M D
1988-01-01
Study of montages, tracings and reconstructions prepared from a series of 570 consecutive ultrathin sections shows that rat maculas are morphologically organized for parallel processing of linear acceleratory information. Type II cells of one terminal field distribute information to neighboring terminals as well. The findings are examined in light of physiological data which indicate that macular receptor fields have a preferred directional vector, and are interpreted by analogy to a computer technology known as an information network.
Plaza-Leiva, Victoria; Gomez-Ruiz, Jose Antonio; Mandow, Anthony; García-Cerezo, Alfonso
2017-03-15
Improving the effectiveness of spatial shape features classification from 3D lidar data is very relevant because it is largely used as a fundamental step towards higher level scene understanding challenges of autonomous vehicles and terrestrial robots. In this sense, computing neighborhood for points in dense scans becomes a costly process for both training and classification. This paper proposes a new general framework for implementing and comparing different supervised learning classifiers with a simple voxel-based neighborhood computation where points in each non-overlapping voxel in a regular grid are assigned to the same class by considering features within a support region defined by the voxel itself. The contribution provides offline training and online classification procedures as well as five alternative feature vector definitions based on principal component analysis for scatter, tubular and planar shapes. Moreover, the feasibility of this approach is evaluated by implementing a neural network (NN) method previously proposed by the authors as well as three other supervised learning classifiers found in scene processing methods: support vector machines (SVM), Gaussian processes (GP), and Gaussian mixture models (GMM). A comparative performance analysis is presented using real point clouds from both natural and urban environments and two different 3D rangefinders (a tilting Hokuyo UTM-30LX and a Riegl). Classification performance metrics and processing time measurements confirm the benefits of the NN classifier and the feasibility of voxel-based neighborhood.
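A minimal sketch of the voxel-based neighborhood idea: points falling in the same voxel of a regular grid share a support region, and eigenvalue-based shape features (one common definition, not necessarily the paper's five feature-vector variants) describe whether that region is scatter-like, tubular, or planar. The voxel size, thresholds, and names are illustrative.

```python
import numpy as np

def voxel_shape_features(points, voxel_size=0.5):
    """Group points into a regular voxel grid and compute, per voxel, simple
    shape features from the eigenvalues of the local covariance matrix."""
    keys = np.floor(points / voxel_size).astype(int)
    features = {}
    for key in map(tuple, np.unique(keys, axis=0)):
        pts = points[(keys == key).all(axis=1)]
        if len(pts) < 3:
            continue
        evals = np.sort(np.linalg.eigvalsh(np.cov(pts.T)))[::-1]   # l1 >= l2 >= l3
        l1, l2, l3 = np.maximum(evals, 1e-12)
        features[key] = {
            "linearity": (l1 - l2) / l1,     # high for tubular shapes
            "planarity": (l2 - l3) / l1,     # high for planar shapes
            "sphericity": l3 / l1,           # high for scatter
        }
    return features

cloud = np.random.default_rng(0).random((5000, 3)) * 5.0
feats = voxel_shape_features(cloud)
print(len(feats), next(iter(feats.values())))
```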
Bleul, Christiane; Baumann-Klausener, Franziska; Labhart, Thomas; Dickinson, Michael H.
2016-01-01
Many insects exploit skylight polarization as a compass cue for orientation and navigation. In the fruit fly, Drosophila melanogaster, photoreceptors R7 and R8 in the dorsal rim area (DRA) of the compound eye are specialized to detect the electric vector (e-vector) of linearly polarized light. These photoreceptors are arranged in stacked pairs with identical fields of view and spectral sensitivities, but mutually orthogonal microvillar orientations. As in larger flies, we found that the microvillar orientation of the distal photoreceptor R7 changes in a fan-like fashion along the DRA. This anatomical arrangement suggests that the DRA constitutes a detector for skylight polarization, in which different e-vectors maximally excite different positions in the array. To test our hypothesis, we measured responses to polarized light of varying e-vector angles in the terminals of R7/8 cells using genetically encoded calcium indicators. Our data confirm a progression of preferred e-vector angles from anterior to posterior in the DRA, and a strict orthogonality between the e-vector preferences of paired R7/8 cells. We observed decreased activity in photoreceptors in response to flashes of light polarized orthogonally to their preferred e-vector angle, suggesting reciprocal inhibition between photoreceptors in the same medullar column, which may serve to increase polarization contrast. Together, our results indicate that the polarization-vision system relies on a spatial map of preferred e-vector angles at the earliest stage of sensory processing. SIGNIFICANCE STATEMENT The fly's visual system is an influential model system for studying neural computation, and much is known about its anatomy, physiology, and development. The circuits underlying motion processing have received the most attention, but researchers are increasingly investigating other functions, such as color perception and object recognition. In this work, we investigate the early neural processing of a somewhat exotic sense, called polarization vision. Because skylight is polarized in an orientation that is rigidly determined by the position of the sun, this cue provides compass information. Behavioral experiments have shown that many species use the polarization pattern in the sky to direct locomotion. Here we describe the input stage of the fly's polarization-vision system. PMID:27170135
Nonlinear Fluid Computations in a Distributed Environment
NASA Technical Reports Server (NTRS)
Atwood, Christopher A.; Smith, Merritt H.
1995-01-01
The performance of a loosely and tightly-coupled workstation cluster is compared against a conventional vector supercomputer for the solution the Reynolds- averaged Navier-Stokes equations. The application geometries include a transonic airfoil, a tiltrotor wing/fuselage, and a wing/body/empennage/nacelle transport. Decomposition is of the manager-worker type, with solution of one grid zone per worker process coupled using the PVM message passing library. Task allocation is determined by grid size and processor speed, subject to available memory penalties. Each fluid zone is computed using an implicit diagonal scheme in an overset mesh framework, while relative body motion is accomplished using an additional worker process to re-establish grid communication.
System balance analysis for vector computers
NASA Technical Reports Server (NTRS)
Knight, J. C.; Poole, W. G., Jr.; Voight, R. G.
1975-01-01
The availability of vector processors capable of sustaining computing rates of 10^8 arithmetic results per second raised the question of whether peripheral storage devices representing current technology can keep such processors supplied with data. By examining the solution of a large banded linear system on these computers, it was found that even under ideal conditions, the processors will frequently be waiting for problem data.
NASA Technical Reports Server (NTRS)
Deere, Karen A.; Flamm, Jeffrey D.; Berrier, Bobby L.; Johnson, Stuart K.
2007-01-01
A computational investigation of an axisymmetric Dual Throat Nozzle concept has been conducted. This fluidic thrust-vectoring nozzle was designed with a recessed cavity to enhance the throat shifting technique for improved thrust vectoring. The structured-grid, unsteady Reynolds-averaged Navier-Stokes flow solver PAB3D was used to guide the nozzle design and analyze performance. Nozzle design variables included extent of circumferential injection, cavity divergence angle, cavity length, and cavity convergence angle. Internal nozzle performance (wind-off conditions) and thrust vector angles were computed for several configurations over a range of nozzle pressure ratios from 1.89 to 10, with the fluidic injection flow rate equal to zero and up to 4 percent of the primary flow rate. The effect of a variable expansion ratio on nozzle performance over a range of freestream Mach numbers up to 2 was investigated. Results indicated that 60° circumferential injection was a good compromise between large thrust vector angles and efficient internal nozzle performance. A cavity divergence angle greater than 10° was detrimental to thrust vector angle. Shortening the cavity length improved internal nozzle performance with a small penalty to thrust vector angle. Contrary to expectations, a variable expansion ratio did not improve thrust efficiency at the flight conditions investigated.
Parallel-vector unsymmetric Eigen-Solver on high performance computers
NASA Technical Reports Server (NTRS)
Nguyen, Duc T.; Jiangning, Qin
1993-01-01
The popular QR algorithm for solving all eigenvalues of an unsymmetric matrix is reviewed. Among the basic components of the QR algorithm, it was concluded from this study that the reduction of an unsymmetric matrix to Hessenberg form (before applying the QR algorithm itself) can be done effectively by exploiting the vector speed and multiple processors offered by modern high-performance computers. Numerical examples of several test cases have indicated that the proposed parallel-vector algorithm for converting a given unsymmetric matrix to Hessenberg form offers computational advantages over the existing algorithm. The time saving obtained by the proposed method increases as the problem size increases.
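As an illustrative aside (not the paper's parallel-vector Fortran implementation), the following minimal Python/SciPy sketch shows the Hessenberg reduction step and confirms that it preserves the eigenvalues that the subsequent QR iteration computes.

import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))          # unsymmetric test matrix

H, Q = hessenberg(A, calc_q=True)        # A = Q @ H @ Q.T with H upper Hessenberg
assert np.allclose(Q @ H @ Q.T, A)

# The QR iteration applied to H yields the same eigenvalues as A.
assert np.allclose(np.sort_complex(np.linalg.eigvals(H)),
                   np.sort_complex(np.linalg.eigvals(A)))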
Visualization of Pulsar Search Data
NASA Astrophysics Data System (ADS)
Foster, R. S.; Wolszczan, A.
1993-05-01
The search for periodic signals from rotating neutron stars, or pulsars, has been a computationally taxing problem for astronomers for more than twenty-five years. Over this time interval, increases in computational capability have allowed ever more sensitive searches covering a larger parameter space. The volume of input data and the general presence of radio frequency interference typically produce numerous spurious signals. Visualization of the search output and enhanced real-time processing of significant candidate events allow the pulsar searcher to optimally process the data and search for new radio pulsars. The pulsar search algorithm and visualization system presented in this paper currently run on serial RISC-based workstations, a traditional vector-based supercomputer, and a massively parallel computer. The serial software algorithm and its modifications for massively parallel computing are described. Four successive searches for millisecond-period radio pulsars using the Arecibo telescope at 430 MHz have resulted in the successful detection of new long-period and millisecond-period radio pulsars.
Zhao, Zhiqiang; Chen, Jun; Zhang, Zhaojun; Zhang, Dong H; Wang, Xiao-Gang; Carrington, Tucker; Gatti, Fabien
2018-02-21
Quantum mechanical calculations of ro-vibrational energies of CH 4 , CHD 3 , CH 3 D, and CH 3 F were made with two different numerical approaches. Both use polyspherical coordinates. The computed energy levels agree, confirming the accuracy of the methods. In the first approach, for all the molecules, the coordinates are defined using three Radau vectors for the CH 3 subsystem and a Jacobi vector between the remaining atom and the centre of mass of CH 3 . Euler angles specifying the orientation of a frame attached to CH 3 with respect to a frame attached to the Jacobi vector are used as vibrational coordinates. A direct product potential-optimized discrete variable vibrational basis is used to build a Hamiltonian matrix. Ro-vibrational energies are computed using a re-started Arnoldi eigensolver. In the second approach, the coordinates are the spherical coordinates associated with four Radau vectors or three Radau vectors and a Jacobi vector, and the frame is an Eckart frame. Vibrational basis functions are products of contracted stretch and bend functions, and eigenvalues are computed with the Lanczos algorithm. For CH 4 , CHD 3 , and CH 3 D, we report the first J > 0 energy levels computed on the Wang-Carrington potential energy surface [X.-G. Wang and T. Carrington, J. Chem. Phys. 141(15), 154106 (2014)]. For CH 3 F, the potential energy surface of Zhao et al. [J. Chem. Phys. 144, 204302 (2016)] was used. All the results are in good agreement with experimental data.
NASA Technical Reports Server (NTRS)
Nosenchuck, D. M.; Littman, M. G.
1986-01-01
The Navier-Stokes computer (NSC) has been developed for solving problems in fluid mechanics involving complex flow simulations that require more speed and capacity than provided by current and proposed Class VI supercomputers. The machine is a parallel processing supercomputer with several new architectural elements which can be programmed to address a wide range of problems meeting the following criteria: (1) the problem is numerically intensive, and (2) the code makes use of long vectors. A simulation of two-dimensional nonsteady viscous flows is presented to illustrate the architecture, programming, and some of the capabilities of the NSC.
Vectorized Jiles-Atherton hysteresis model
NASA Astrophysics Data System (ADS)
Szymański, Grzegorz; Waszak, Michał
2004-01-01
This paper deals with vector hysteresis modeling. A vector model consisting of individual Jiles-Atherton components placed along the principal axes is proposed. The cross-axis coupling ensures general vector model properties. Minor loops are obtained using a scaling method. The model is intended for efficient finite element method computations defined in terms of the magnetic vector potential. Numerical efficiency is ensured by a differential susceptibility approach.
NASA Technical Reports Server (NTRS)
Jaggi, S.
1993-01-01
A study is conducted to investigate the effects and advantages of data compression techniques on multispectral imagery data acquired by NASA's airborne scanners at the Stennis Space Center. The first technique used was vector quantization. The vector is defined in the multispectral imagery context as an array of pixels from the same location in each channel. The error incurred in substituting the reconstructed images for the original set is compared for different compression ratios. Also, the eigenvalues of the covariance matrix obtained from the reconstructed data set are compared with the eigenvalues of the original set. The effects of varying the size of the vector codebook on the quality of the compression and on subsequent classification are also presented. The output data from the vector quantization algorithm were further compressed by a lossless technique called Difference-mapped Shift-extended Huffman coding. The overall compression for 7 channels of data acquired by the Calibrated Airborne Multispectral Scanner (CAMS) was 195:1 (0.41 bpp) with an RMS error of 15.8 pixels, and 18:1 (0.447 bpp) with an RMS error of 3.6 pixels. The algorithms were implemented in software and interfaced, with the help of dedicated image processing boards, to an 80386 PC-compatible computer. Modules were developed for the tasks of image compression and image analysis. Also, supporting software to perform image processing for visual display and interpretation of the compressed/classified images was developed.
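As an illustrative aside, the sketch below shows the basic vector-quantization step described above in Python: each multi-channel pixel vector is replaced by the index of its nearest codebook entry. The codebook here is a random subset of the data rather than a trained one, and the sketch is not the CAMS processing chain.

import numpy as np

rng = np.random.default_rng(1)
pixels = rng.integers(0, 256, size=(1000, 7)).astype(float)   # 1000 pixels, 7 channels

# Hypothetical codebook of 64 entries (in practice trained, e.g. by LBG/k-means).
codebook = pixels[rng.choice(len(pixels), 64, replace=False)]

# Assign every pixel vector to its nearest codebook vector.
d2 = ((pixels[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
indices = d2.argmin(axis=1)                    # compressed representation
reconstructed = codebook[indices]

rms_error = np.sqrt(((pixels - reconstructed) ** 2).mean())
print(f"RMS reconstruction error: {rms_error:.1f}")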
Attractor reconstruction for non-linear systems: a methodological note
Nichols, J.M.; Nichols, J.D.
2001-01-01
Attractor reconstruction is an important step in the process of making predictions for non-linear time-series and in the computation of certain invariant quantities used to characterize the dynamics of such series. The utility of computed predictions and invariant quantities is dependent on the accuracy of attractor reconstruction, which in turn is determined by the methods used in the reconstruction process. This paper suggests methods by which the delay and embedding dimension may be selected for a typical delay coordinate reconstruction. A comparison is drawn between the use of the autocorrelation function and mutual information in quantifying the delay. In addition, a false nearest neighbor (FNN) approach is used in minimizing the number of delay vectors needed. Results highlight the need for an accurate reconstruction in the computation of the Lyapunov spectrum and in prediction algorithms.
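The following minimal Python sketch illustrates the delay-coordinate reconstruction step; the delay and embedding dimension are simply assumed here rather than selected by mutual information or the false-nearest-neighbor criterion discussed above.

import numpy as np

def delay_embed(x, dim, tau):
    """Return delay vectors [x(t), x(t+tau), ..., x(t+(dim-1)*tau)]."""
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

t = np.linspace(0, 60, 3000)
x = np.sin(t) + 0.5 * np.sin(2.3 * t)     # stand-in for a measured time series

vectors = delay_embed(x, dim=3, tau=25)   # delay and dimension assumed, not optimized
print(vectors.shape)                      # (2950, 3)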
A Parallel Vector Machine for the PM Programming Language
NASA Astrophysics Data System (ADS)
Bellerby, Tim
2016-04-01
PM is a new programming language which aims to make the writing of computational geoscience models on parallel hardware accessible to scientists who are not themselves expert parallel programmers. It is based around the concept of communicating operators: language constructs that enable variables local to a single invocation of a parallelised loop to be viewed as if they were arrays spanning the entire loop domain. This mechanism enables different loop invocations (which may or may not be executing on different processors) to exchange information in a manner that extends the successful Communicating Sequential Processes idiom from single messages to collective communication. Communicating operators avoid the additional synchronisation mechanisms, such as atomic variables, required when programming using the Partitioned Global Address Space (PGAS) paradigm. Using a single loop invocation as the fundamental unit of concurrency enables PM to uniformly represent different levels of parallelism from vector operations through shared memory systems to distributed grids. This paper describes an implementation of PM based on a vectorised virtual machine. On a single processor node, concurrent operations are implemented using masked vector operations. Virtual machine instructions operate on vectors of values and may be unmasked, masked using a Boolean field, or masked using an array of active vector cell locations. Conditional structures (such as if-then-else or while statement implementations) calculate and apply masks to the operations they control. A shift in mask representation from Boolean to location-list occurs when active locations become sufficiently sparse. Parallel loops unfold data structures (or vectors of data structures for nested loops) into vectors of values that may additionally be distributed over multiple computational nodes and then split into micro-threads compatible with the size of the local cache. Inter-node communication is accomplished using standard OpenMP and MPI. Performance analyses of the PM vector machine, demonstrating its scaling properties with respect to domain size and the number of processor nodes will be presented for a range of hardware configurations. The PM software and language definition are being made available under unrestrictive MIT and Creative Commons Attribution licenses respectively: www.pm-lang.org.
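As a rough illustration of the masking idea (not the PM virtual machine itself), the NumPy sketch below executes an if-then-else over all vector lanes using a Boolean mask; a location-list mask would store the active indices instead once they become sparse.

import numpy as np

x = np.array([-3.0, 1.5, -0.5, 4.0, 2.0])
mask = x > 0                               # per-lane "if" condition

result = np.empty_like(x)
result[mask] = np.sqrt(x[mask])            # then-branch applied to active lanes
result[~mask] = 0.0                        # else-branch applied to the complement

# Equivalent single masked select; a location-list representation would store
# np.flatnonzero(mask) instead once the active lanes become sparse.
assert np.allclose(result, np.where(mask, np.sqrt(np.abs(x)), 0.0))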
Computer-Generated Diagrams for the Classroom.
ERIC Educational Resources Information Center
Carle, Mark A.; Greenslade, Thomas B., Jr.
1986-01-01
Describes 10 computer programs used to draw diagrams usually drawn on chalkboards, such as addition of three vectors, vector components, range of a projectile, Lissajous figures, beats, isotherms, Snell's law, waves passing through a lens, magnetic field due to Helmholtz coils, and three curves. Several programming tips are included. (JN)
Computational Approaches to Image Understanding.
1981-10-01
representing points, edges, surfaces, and volumes to facilitate display. The geometry of perspective and parallel (or orthographic) projection has ... of making the image-forming process explicit. This in turn leads to a concern with geometry, such as the properties of the gradient, stereographic, and ... dual spaces. Combining geometry and smoothness leads naturally to multi-variate vector analysis, and to differential geometry. For the most part, a
Development of iterative techniques for the solution of unsteady compressible viscous flows
NASA Technical Reports Server (NTRS)
Hixon, Duane; Sankar, L. N.
1993-01-01
During the past two decades, there has been significant progress in the field of numerical simulation of unsteady compressible viscous flows. At present, a variety of solution techniques exist, such as transonic small disturbance (TSD) analyses, transonic full potential equation-based methods, unsteady Euler solvers, and unsteady Navier-Stokes solvers. These advances have been made possible by developments in three areas: (1) improved numerical algorithms; (2) automation of body-fitted grid generation schemes; and (3) advanced computer architectures with vector processing and massively parallel processing features. In this work, the GMRES scheme has been considered as a candidate for acceleration of a Newton iteration time marching scheme for unsteady 2-D and 3-D compressible viscous flow calculations; preliminary calculations indicate that this will provide up to a 65 percent reduction in computer time requirements over the existing class of explicit and implicit time marching schemes. The proposed method has been tested on structured grids, but is flexible enough for extension to unstructured grids. The described scheme has been tested only on the current generation of vector processor architectures of the Cray Y/MP class, but should be suitable for adaptation to massively parallel machines.
Heading-vector navigation based on head-direction cells and path integration.
Kubie, John L; Fenton, André A
2009-05-01
Insect navigation is guided by heading vectors that are computed by path integration. Mammalian navigation models, on the other hand, are typically based on map-like place representations provided by hippocampal place cells. Such models compute optimal routes as a continuous series of locations that connect the current location to a goal. We propose a "heading-vector" model in which head-direction cells or their derivatives serve both as key elements in constructing the optimal route and as the straight-line guidance during route execution. The model is based on a memory structure termed the "shortcut matrix," which is constructed during the initial exploration of an environment when a set of shortcut vectors between sequential pairs of visited waypoint locations is stored. A mechanism is proposed for calculating and storing these vectors that relies on a hypothesized cell type termed an "accumulating head-direction cell." Following exploration, shortcut vectors connecting all pairs of waypoint locations are computed by vector arithmetic and stored in the shortcut matrix. On re-entry, when local view or place representations query the shortcut matrix with a current waypoint and goal, a shortcut trajectory is retrieved. Since the trajectory direction is in head-direction compass coordinates, navigation is accomplished by tracking the firing of head-direction cells that are tuned to the heading angle. Section 1 of the manuscript describes the properties of accumulating head-direction cells. It then shows how accumulating head-direction cells can store local vectors and perform vector arithmetic to perform path-integration-based homing. Section 2 describes the construction and use of the shortcut matrix for computing direct paths between any pair of locations that have been registered in the shortcut matrix. In the discussion, we analyze the advantages of heading-based navigation over map-based navigation. Finally, we survey behavioral evidence that nonhippocampal, heading-based navigation is used in small mammals and humans. Copyright 2008 Wiley-Liss, Inc.
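A minimal Python sketch of the shortcut-vector bookkeeping is given below; the waypoint names and coordinates are hypothetical, and the accumulating head-direction cell mechanism itself is not modeled.

import numpy as np

waypoints = {                              # hypothetical waypoint coordinates
    "nest": np.array([0.0, 0.0]),
    "A":    np.array([3.0, 1.0]),
    "B":    np.array([2.0, 4.0]),
}

# Shortcut vectors recorded between sequentially visited waypoints.
shortcut = {
    ("nest", "A"): waypoints["A"] - waypoints["nest"],
    ("A", "B"):    waypoints["B"] - waypoints["A"],
}

# Vector arithmetic fills in the unvisited pair (nest -> B).
shortcut[("nest", "B")] = shortcut[("nest", "A")] + shortcut[("A", "B")]

v = shortcut[("nest", "B")]
heading = np.degrees(np.arctan2(v[1], v[0]))   # compass direction to track during execution
print(f"shortcut nest->B: {v}, heading {heading:.1f} deg")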
Parallel-vector out-of-core equation solver for computational mechanics
NASA Technical Reports Server (NTRS)
Qin, J.; Agarwal, T. K.; Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.
1993-01-01
A parallel/vector out-of-core equation solver is developed for shared-memory computers, such as the Cray Y-MP machine. The input/output (I/O) time is reduced by using the asynchronous BUFFER IN and BUFFER OUT statements, which can be executed simultaneously with the CPU instructions. The parallel and vector capability provided by the supercomputers is also exploited to enhance performance. Numerical applications in large-scale structural analysis are given to demonstrate the efficiency of the present out-of-core solver.
Vectors a Fortran 90 module for 3-dimensional vector and dyadic arithmetic
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brock, B.C.
1998-02-01
A major advance contained in the new Fortran 90 language standard is the ability to define new data types and the operators associated with them. Writing computer code to implement computations with real and complex three-dimensional vectors and dyadics is greatly simplified if the equations can be implemented directly, without the need to code the vector arithmetic explicitly. The Fortran 90 module described here defines new data types for real and complex 3-dimensional vectors and dyadics, along with the common operations needed to work with these objects. Routines to allow convenient initialization and output of the new types are also included. In keeping with the philosophy of data abstraction, the details of the implementation of the data types are maintained private, and the functions and operators are made generic to simplify the combining of real, complex, single- and double-precision vectors and dyadics.
Field Computation and Nonpropositional Knowledge.
1987-09-01
field computer. It is based on a generalization of Taylor's theorem to continuous-dimensional vector spaces. ... generalization of Taylor's theorem to continuous-dimensional vector spaces. A number of field computations are illustrated, including several transforma... paradigm. The "old" AI has been quite successful in performing a number of difficult tasks, such as theorem proving, chess playing, medical diagnosis and
The Helioseismic and Magnetic Imager (HMI) Vector Magnetic Field Pipeline: Overview and Performance
NASA Astrophysics Data System (ADS)
Hoeksema, J. Todd; Liu, Yang; Hayashi, Keiji; Sun, Xudong; Schou, Jesper; Couvidat, Sebastien; Norton, Aimee; Bobra, Monica; Centeno, Rebecca; Leka, K. D.; Barnes, Graham; Turmon, Michael
2014-09-01
The Helioseismic and Magnetic Imager (HMI) began near-continuous full-disk solar measurements on 1 May 2010 from the Solar Dynamics Observatory (SDO). An automated processing pipeline keeps pace with observations to produce observable quantities, including the photospheric vector magnetic field, from sequences of filtergrams. The basic vector-field frame list cadence is 135 seconds, but to reduce noise the filtergrams are combined to derive data products every 720 seconds. The primary 720 s observables were released in mid-2010, including Stokes polarization parameters measured at six wavelengths, as well as intensity, Doppler velocity, and the line-of-sight magnetic field. More advanced products, including the full vector magnetic field, are now available. Automatically identified HMI Active Region Patches (HARPs) track the location and shape of magnetic regions throughout their lifetime. The vector field is computed using the Very Fast Inversion of the Stokes Vector (VFISV) code optimized for the HMI pipeline; the remaining 180∘ azimuth ambiguity is resolved with the Minimum Energy (ME0) code. The Milne-Eddington inversion is performed on all full-disk HMI observations. The disambiguation, until recently run only on HARP regions, is now implemented for the full disk. Vector and scalar quantities in the patches are used to derive active region indices potentially useful for forecasting; the data maps and indices are collected in the SHARP data series, hmi.sharp_720s. Definitive SHARP processing is completed only after the region rotates off the visible disk; quick-look products are produced in near real time. Patches are provided in both CCD and heliographic coordinates. HMI provides continuous coverage of the vector field, but has modest spatial, spectral, and temporal resolution. Coupled with limitations of the analysis and interpretation techniques, effects of the orbital velocity, and instrument performance, the resulting measurements have a certain dynamic range and sensitivity and are subject to systematic errors and uncertainties that are characterized in this report.
A computer simulation model of Wolbachia invasion for disease vector population modification.
Guevara-Souza, Mauricio; Vallejo, Edgar E
2015-10-05
Wolbachia invasion has been proved to be a promising alternative for controlling vector-borne diseases, particularly Dengue fever. Creating computer models that can provide insight into how vector population modification can be achieved under different conditions would be most valuable for assessing the efficacy of control strategies for this disease. In this paper, we present a computer model that simulates the behavior of native mosquito populations after the introduction of mosquitoes infected with the Wolbachia bacteria. We studied how different factors such as fecundity, fitness cost of infection, migration rates, number of populations, population size, and number of introduced infected mosquitoes affect the spread of the Wolbachia bacteria among native mosquito populations. Two main scenarios of the island model are presented in this paper, with infected mosquitoes introduced into the largest source population and peripheral populations. Overall, the results are promising; Wolbachia infection spreads among native populations and the computer model is capable of reproducing the results obtained by mathematical models and field experiments. Computer models can be very useful for gaining insight into how Wolbachia invasion works and are a promising alternative for complementing experimental and mathematical approaches for vector-borne disease control.
Comparison of SOM point densities based on different criteria.
Kohonen, T
1999-11-15
Point densities of model (codebook) vectors in self-organizing maps (SOMs) are evaluated in this article. For a few one-dimensional SOMs with finite grid lengths and a given probability density function of the input, the numerically exact point densities have been computed. The point density derived from the SOM algorithm turned out to be different from that minimizing the SOM distortion measure, showing that the model vectors produced by the basic SOM algorithm in general do not exactly coincide with the optimum of the distortion measure. A new computing technique based on the calculus of variations has been introduced. It was applied to the computation of point densities derived from the distortion measure for both the classical vector quantization and the SOM with general but equal dimensionality of the input vectors and the grid, respectively. The power laws in the continuum limit obtained in these cases were found to be identical.
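For orientation, the sketch below implements the basic one-dimensional SOM update rule in Python (the paper's point-density and distortion-measure analysis is analytical and is not reproduced here).

import numpy as np

rng = np.random.default_rng(2)
grid_len = 20
w = np.sort(rng.uniform(0, 1, grid_len))        # 1-D codebook spread over the input range

def som_step(w, x, lr=0.05, radius=2.0):
    c = np.argmin(np.abs(w - x))                # best-matching unit
    h = np.exp(-((np.arange(len(w)) - c) ** 2) / (2 * radius ** 2))   # neighborhood function
    return w + lr * h * (x - w)

for _ in range(20000):
    w = som_step(w, rng.uniform(0, 1))          # inputs drawn from a uniform density

print(np.round(w, 3))                           # spacing of model vectors reflects their point density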
Chhabra, Lovely; Sareen, Pooja; Gandagule, Amit; Spodick, David H
2012-03-01
Verticalization of the frontal P vector in patients older than 45 years is virtually diagnostic of pulmonary emphysema (sensitivity, 96%; specificity, 87%). We investigated the correlation of P vector and the computed tomographic visual score of emphysema (VSE) in patients with established diagnosis of chronic obstructive pulmonary disease/emphysema. High-resolution computed tomographic scans of 26 patients with emphysema (age, >45 years) were reviewed to assess the type and extent of emphysema using the subjective visual scoring. Electrocardiograms were independently reviewed to determine the frontal P vector. The P vector and VSE were compared for statistical correlation. Both P vector and VSE were also directly compared with the forced expiratory volume at 1 second. The VSE and the orientation of the P vector (ÂP) had an overall significant positive correlation (r = +0.68; P = .0001) in all patients, but the correlation was very strong in patients with predominant lower-lobe emphysema (r = +0.88; P = .0004). Forced expiratory volume at 1 second and ÂP had almost a linear inverse correlation in predominant lower-lobe emphysema (r = -0.92; P < .0001). Orientation of the P vector positively correlates with visually scored emphysema. Both ÂP and VSE are strong reflectors of qualitative lung function in patients with predominant lower-lobe emphysema. A combination of more vertical ÂP and predominant lower-lobe emphysema reflects severe obstructive lung dysfunction. Copyright © 2012 Elsevier Inc. All rights reserved.
Spatiotemporal Domain Decomposition for Massive Parallel Computation of Space-Time Kernel Density
NASA Astrophysics Data System (ADS)
Hohl, A.; Delmelle, E. M.; Tang, W.
2015-07-01
Accelerated processing capabilities are deemed critical when conducting analysis on spatiotemporal datasets of increasing size, diversity and availability. High-performance parallel computing offers the capacity to solve computationally demanding problems in a limited timeframe, but likewise poses the challenge of preventing processing inefficiency due to workload imbalance between computing resources. Therefore, when designing new algorithms capable of implementing parallel strategies, careful spatiotemporal domain decomposition is necessary to account for heterogeneity in the data. In this study, we perform octree-based adaptive decomposition of the spatiotemporal domain for parallel computation of space-time kernel density. In order to avoid edge effects near subdomain boundaries, we establish spatiotemporal buffers to include adjacent data points that are within the spatial and temporal kernel bandwidths. Then, we quantify the computational intensity of each subdomain to balance workloads among processors. We illustrate the benefits of our methodology using a space-time epidemiological dataset of Dengue fever, an infectious vector-borne disease that poses a severe threat to communities in tropical climates. Our parallel implementation of kernel density reaches substantial speedup compared to sequential processing, and achieves high levels of workload balance among processors due to great accuracy in quantifying computational intensity. Our approach is portable to other space-time analytical tests.
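A minimal Python sketch of the space-time kernel density evaluation at a single point is shown below; the octree decomposition, spatiotemporal buffers, and parallel scheduling described above are not included, and the kernel normalization constants are omitted.

import numpy as np

rng = np.random.default_rng(3)
events = np.column_stack([rng.uniform(0, 10, 500),     # x
                          rng.uniform(0, 10, 500),     # y
                          rng.uniform(0, 30, 500)])    # t, e.g. day of symptom onset

def stkde(point, events, hs=1.0, ht=3.0):
    """Product Epanechnikov kernel, spatial bandwidth hs and temporal bandwidth ht."""
    dx2 = ((events[:, :2] - point[:2]) ** 2).sum(axis=1) / hs ** 2
    dt2 = (events[:, 2] - point[2]) ** 2 / ht ** 2
    ks = np.where(dx2 < 1, 1 - dx2, 0.0)
    kt = np.where(dt2 < 1, 1 - dt2, 0.0)
    return (ks * kt).sum() / (len(events) * hs ** 2 * ht)   # normalization constants omitted

print(stkde(np.array([5.0, 5.0, 15.0]), events))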
Optimization of large matrix calculations for execution on the Cray X-MP vector supercomputer
NASA Technical Reports Server (NTRS)
Hornfeck, William A.
1988-01-01
A considerable volume of large computational computer code was developed for NASA over the past twenty-five years. This code represents algorithms developed for machines of an earlier generation. With the emergence of the vector supercomputer as a viable, commercially available machine, an opportunity exists to evaluate optimization strategies to improve the efficiency of existing software. This opportunity arises primarily from architectural differences between the latest generation of large-scale machines and the earlier, mostly uniprocessor, machines. A software package being used by NASA to perform computations on large matrices is described, and a strategy for conversion to the Cray X-MP vector supercomputer is also described.
Computational Investigation of the Aerodynamic Effects on Fluidic Thrust Vectoring
NASA Technical Reports Server (NTRS)
Deere, K. A.
2000-01-01
A computational investigation of the aerodynamic effects on fluidic thrust vectoring has been conducted. Three-dimensional simulations of a two-dimensional, convergent-divergent (2DCD) nozzle with fluidic injection for pitch vector control were run with the computational fluid dynamics code PAB using turbulence closure and linear Reynolds stress modeling. Simulations were computed with static freestream conditions (M=0.05) and at Mach numbers from M=0.3 to 1.2, with scheduled nozzle pressure ratios (from 3.6 to 7.2) and secondary to primary total pressure ratios of p(sub t,s)/p(sub t,p)=0.6 and 1.0. Results indicate that the freestream flow decreases vectoring performance and thrust efficiency compared with static (wind-off) conditions. The aerodynamic penalty to thrust vector angle ranged from 1.5 degrees at a nozzle pressure ratio of 6 with M=0.9 freestream conditions to 2.9 degrees at a nozzle pressure ratio of 5.2 with M=0.7 freestream conditions, compared to the same nozzle pressure ratios with static freestream conditions. The aerodynamic penalty to thrust ratio decreased from 4 percent to 0.8 percent as nozzle pressure ratio increased from 3.6 to 7.2. As expected, the freestream flow had little influence on discharge coefficient.
Evaluation of a Multicore-Optimized Implementation for Tomographic Reconstruction
Agulleiro, Jose-Ignacio; Fernández, José Jesús
2012-01-01
Tomography allows elucidation of the three-dimensional structure of an object from a set of projection images. In life sciences, electron microscope tomography is providing invaluable information about the cell structure at a resolution of a few nanometres. Here, large images are required to combine wide fields of view with high resolution requirements. The computational complexity of the algorithms along with the large image size then turns tomographic reconstruction into a computationally demanding problem. Traditionally, high-performance computing techniques have been applied to cope with such demands on supercomputers, distributed systems and computer clusters. In the last few years, the trend has turned towards graphics processing units (GPUs). Here we present a detailed description and a thorough evaluation of an alternative approach that relies on exploitation of the power available in modern multicore computers. The combination of single-core code optimization, vector processing, multithreading and efficient disk I/O operations succeeds in providing fast tomographic reconstructions on standard computers. The approach turns out to be competitive with the fastest GPU-based solutions thus far. PMID:23139768
Geometric and computer-aided spline hob modeling
NASA Astrophysics Data System (ADS)
Brailov, I. G.; Myasoedova, T. M.; Panchuk, K. L.; Krysova, I. V.; Rogoza, YU A.
2018-03-01
The paper considers acquiring the spline hob geometric model. The objective of the research is the development of a mathematical model of spline hob for spline shaft machining. The structure of the spline hob is described taking into consideration the motion in parameters of the machine tool system of cutting edge positioning and orientation. Computer-aided study is performed with the use of CAD and on the basis of 3D modeling methods. Vector representation of cutting edge geometry is accepted as the principal method of spline hob mathematical model development. The paper defines the correlations described by parametric vector functions representing helical cutting edges designed for spline shaft machining with consideration for helical movement in two dimensions. An application for acquiring the 3D model of spline hob is developed on the basis of AutoLISP for AutoCAD environment. The application presents the opportunity for the use of the acquired model for milling process imitation. An example of evaluation, analytical representation and computer modeling of the proposed geometrical model is reviewed. In the mentioned example, a calculation of key spline hob parameters assuring the capability of hobbing a spline shaft of standard design is performed. The polygonal and solid spline hob 3D models are acquired by the use of imitational computer modeling.
A compositional reservoir simulator on distributed memory parallel computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rame, M.; Delshad, M.
1995-12-31
This paper presents the application of distributed memory parallel computers to field scale reservoir simulations using a parallel version of UTCHEM, The University of Texas Chemical Flooding Simulator. The model is a general purpose highly vectorized chemical compositional simulator that can simulate a wide range of displacement processes at both field and laboratory scales. The original simulator was modified to run on both distributed memory parallel machines (Intel iPSC/960 and Delta, Connection Machine 5, Kendall Square 1 and 2, and CRAY T3D) and a cluster of workstations. A domain decomposition approach has been taken towards parallelization of the code. A portion of the discrete reservoir model is assigned to each processor by a set-up routine that attempts a data layout as even as possible from the load-balance standpoint. Each of these subdomains is extended so that data can be shared between adjacent processors for stencil computation. The added routines that make parallel execution possible are written in a modular fashion that makes the porting to new parallel platforms straightforward. Results of the distributed memory computing performance of the parallel simulator are presented for field scale applications such as tracer flood and polymer flood. A comparison of the wall-clock times for the same problems on a vector supercomputer is also presented.
Parallel processors and nonlinear structural dynamics algorithms and software
NASA Technical Reports Server (NTRS)
Belytschko, Ted
1990-01-01
Techniques are discussed for the implementation and improvement of vectorization and concurrency in nonlinear explicit structural finite element codes. In explicit integration methods, the computation of the element internal force vector consumes the bulk of the computer time. The program can be efficiently vectorized by subdividing the elements into blocks and executing all computations in vector mode. The structuring of elements into blocks also provides a convenient way to implement concurrency by creating tasks which can be assigned to available processors for evaluation. The techniques were implemented in a 3-D nonlinear program with one-point quadrature shell elements. Concurrency and vectorization were first implemented in a single time step version of the program. Techniques were developed to minimize processor idle time and to select the optimal vector length. A comparison of run times between the program executed in scalar, serial mode and the fully vectorized code executed concurrently using eight processors shows speed-ups of over 25. Conjugate gradient methods for solving nonlinear algebraic equations are also readily adapted to a parallel environment. A new technique for improving convergence properties of conjugate gradients in nonlinear problems is developed in conjunction with other techniques such as diagonal scaling. A significant reduction in the number of iterations required for convergence is shown for a statically loaded rigid bar suspended by three equally spaced springs.
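The block-wise vectorization idea can be sketched as follows in Python, using simple two-node spring elements as a stand-in for the one-point-quadrature shell elements of the paper; the block size and element data are illustrative.

import numpy as np

n_nodes, n_elems, block = 1001, 1000, 256
conn = np.column_stack([np.arange(n_elems), np.arange(1, n_elems + 1)])   # element connectivity
k = np.full(n_elems, 100.0)                                               # element stiffnesses
u = np.linspace(0.0, 0.01, n_nodes)                                       # nodal displacements

f_int = np.zeros(n_nodes)
for start in range(0, n_elems, block):            # loop over element blocks
    e = conn[start:start + block]
    stretch = u[e[:, 1]] - u[e[:, 0]]             # vectorized over the whole block
    f_elem = k[start:start + block] * stretch
    np.add.at(f_int, e[:, 0], -f_elem)            # scatter-add into the global force vector
    np.add.at(f_int, e[:, 1], f_elem)

print(f_int[:5])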
Mohammadi, Amrollah; Ahmadian, Alireza; Rabbani, Shahram; Fattahi, Ehsan; Shirani, Shapour
2017-12-01
Finite element models for estimation of intraoperative brain shift suffer from huge computational cost. In these models, image registration and finite element analysis are two time-consuming processes. The proposed method is an improved version of our previously developed Finite Element Drift (FED) registration algorithm. In this work the registration process is combined with the finite element analysis. In the Combined FED (CFED), the deformation of whole brain mesh is iteratively calculated by geometrical extension of a local load vector which is computed by FED. While the processing time of the FED-based method including registration and finite element analysis was about 70 s, the computation time of the CFED was about 3.2 s. The computational cost of CFED is almost 50% less than similar state of the art brain shift estimators based on finite element models. The proposed combination of registration and structural analysis can make the calculation of brain deformation much faster. Copyright © 2016 John Wiley & Sons, Ltd.
An Alternative Method for Computing Mean and Covariance Matrix of Some Multivariate Distributions
ERIC Educational Resources Information Center
Radhakrishnan, R.; Choudhury, Askar
2009-01-01
Computing the mean and covariance matrix of some multivariate distributions, in particular, multivariate normal distribution and Wishart distribution are considered in this article. It involves a matrix transformation of the normal random vector into a random vector whose components are independent normal random variables, and then integrating…
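A minimal Python sketch of the transformation idea for the multivariate normal case: writing X = mu + L Z with Z a vector of independent standard normals gives E[X] = mu and Cov[X] = L L^T, which the sample estimates below confirm. This is only an illustration of the approach, not the article's derivation.

import numpy as np

mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 0.5]])
L = np.linalg.cholesky(Sigma)                 # Sigma = L @ L.T

rng = np.random.default_rng(4)
Z = rng.standard_normal((3, 200000))          # independent N(0, 1) components
X = mu[:, None] + L @ Z                       # transformed multivariate normal sample

print(np.round(X.mean(axis=1), 2))            # approximately mu
print(np.round(np.cov(X), 2))                 # approximately Sigma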
NASA Technical Reports Server (NTRS)
Deffenbaugh, F. D.; Vitz, J. F.
1979-01-01
The user's manual for the Discrete Vortex Cross flow Evaluator (DIVORCE) computer program is presented. DIVORCE was developed in FORTRAN 4 for the CDC 6600 and CDC 7600 machines. Optimal calls to a NASA vector subroutine package are provided for use with the CDC 7600.
Vectorization of transport and diffusion computations on the CDC Cyber 205
DOE Office of Scientific and Technical Information (OSTI.GOV)
Abu-Shumays, I.K.
1986-01-01
The development and testing of alternative numerical methods and computational algorithms specifically designed for the vectorization of transport and diffusion computations on a Control Data Corporation (CDC) Cyber 205 vector computer are described. Two solution methods for the discrete ordinates approximation to the transport equation are summarized and compared. Factors of 4 to 7 reduction in run times for certain large transport problems were achieved on a Cyber 205 as compared with run times on a CDC-7600. The solution of tridiagonal systems of linear equations, central to several efficient numerical methods for multidimensional diffusion computations and essential for fluid flow and other physics and engineering problems, is also dealt with. Among the methods tested, a combined odd-even cyclic reduction and modified Cholesky factorization algorithm for solving linear symmetric positive definite tridiagonal systems is found to be the most effective for these systems on a Cyber 205. For large tridiagonal systems, computation with this algorithm is an order of magnitude faster on a Cyber 205 than computation with the best algorithm for tridiagonal systems on a CDC-7600.
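For reference, the sketch below solves a small symmetric positive definite tridiagonal system with a serial Thomas algorithm and checks it against SciPy's banded solver; the vectorized odd-even cyclic reduction/modified Cholesky algorithm evaluated on the Cyber 205 is not reproduced here.

import numpy as np
from scipy.linalg import solveh_banded

n = 8
diag = np.full(n, 4.0)
off = np.full(n - 1, -1.0)
b = np.arange(1.0, n + 1)

def thomas(diag, off, b):
    """Serial forward-elimination/back-substitution for a symmetric tridiagonal system."""
    d, x = diag.copy(), b.copy()
    for i in range(1, len(d)):
        m = off[i - 1] / d[i - 1]
        d[i] -= m * off[i - 1]
        x[i] -= m * x[i - 1]
    x[-1] /= d[-1]
    for i in range(len(d) - 2, -1, -1):
        x[i] = (x[i] - off[i] * x[i + 1]) / d[i]
    return x

ab = np.vstack([np.concatenate([[0.0], off]), diag])   # upper banded storage for solveh_banded
assert np.allclose(thomas(diag, off, b), solveh_banded(ab, b))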
Hirano, Toshiyuki; Sato, Fumitoshi
2014-07-28
We used grid-free modified Cholesky decomposition (CD) to develop a density-functional-theory (DFT)-based method for calculating the canonical molecular orbitals (CMOs) of large molecules. Our method can be used to calculate standard CMOs, analytically compute exchange-correlation terms, and maximise the capacity of next-generation supercomputers. Cholesky vectors were first analytically downscaled using low-rank pivoted CD and CD with adaptive metric (CDAM). The obtained Cholesky vectors were distributed and stored on each computer node in a parallel computer, and the Coulomb, Fock exchange, and pure exchange-correlation terms were calculated by multiplying the Cholesky vectors without evaluating molecular integrals in self-consistent field iterations. Our method enables DFT and massively distributed memory parallel computers to be used in order to very efficiently calculate the CMOs of large molecules.
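The low-rank pivoted Cholesky step can be illustrated with the short Python sketch below; it is a generic textbook version, not the grid-free CD/CDAM implementation or its parallel distribution.

import numpy as np

def pivoted_cholesky(A, tol=1e-10):
    """Return L (n x k) with A ~= L @ L.T for a symmetric positive semidefinite A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    d = np.diag(A).copy()
    L = np.zeros((n, n))
    for k in range(n):
        p = np.argmax(d)
        if d[p] <= tol:                       # remaining residual is negligible
            return L[:, :k]
        L[:, k] = (A[:, p] - L @ L[p, :]) / np.sqrt(d[p])
        d -= L[:, k] ** 2
    return L

rng = np.random.default_rng(5)
B = rng.standard_normal((50, 5))
A = B @ B.T                                   # rank-5 Gram matrix of size 50

L = pivoted_cholesky(A)
print(L.shape)                                # (50, 5): the rank is revealed automatically
assert np.allclose(L @ L.T, A)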
A vectorized Lanczos eigensolver for high-performance computers
NASA Technical Reports Server (NTRS)
Bostic, Susan W.
1990-01-01
The computational strategies used to implement a Lanczos-based-method eigensolver on the latest generation of supercomputers are described. Several examples of structural vibration and buckling problems are presented that show the effects of using optimization techniques to increase the vectorization of the computational steps. The data storage and access schemes and the tools and strategies that best exploit the computer resources are presented. The method is implemented on the Convex C220, the Cray 2, and the Cray Y-MP computers. Results show that very good computation rates are achieved for the most computationally intensive steps of the Lanczos algorithm and that the Lanczos algorithm is many times faster than other methods extensively used in the past.
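A minimal Python sketch of the basic Lanczos iteration for a symmetric matrix follows; it omits reorthogonalization, restarting, shift-invert, and the vectorization strategies that are the subject of the paper.

import numpy as np

rng = np.random.default_rng(6)
M = rng.standard_normal((200, 200))
A = (M + M.T) / 2                             # symmetric test matrix

def lanczos(A, m):
    n = A.shape[0]
    Q = np.zeros((n, m + 1))
    alpha, beta = np.zeros(m), np.zeros(m)
    q0 = rng.standard_normal(n)
    Q[:, 0] = q0 / np.linalg.norm(q0)
    for j in range(m):
        w = A @ Q[:, j] - (beta[j - 1] * Q[:, j - 1] if j > 0 else 0.0)
        alpha[j] = Q[:, j] @ w
        w -= alpha[j] * Q[:, j]
        beta[j] = np.linalg.norm(w)
        Q[:, j + 1] = w / beta[j]
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return np.linalg.eigvalsh(T)              # Ritz values of the tridiagonal matrix

ritz = lanczos(A, 60)
exact = np.linalg.eigvalsh(A)
print(ritz[-1], exact[-1])                    # largest Ritz value approximates the largest eigenvalue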
Calculation of biochemical net reactions and pathways by using matrix operations.
Alberty, R A
1996-01-01
Pathways for net biochemical reactions can be calculated by using a computer program that solves systems of linear equations. The coefficients in the linear equations are the stoichiometric numbers in the biochemical equations for the system. The solution of the system of linear equations is a vector of the stoichiometric numbers of the reactions in the pathway for the net reaction; this is referred to as the pathway vector. The pathway vector gives the number of times the various reactions have to occur to produce the desired net reaction. Net reactions may involve unknown numbers of ATP, ADP, and Pi molecules. The numbers of ATP, ADP, and Pi in a desired net reaction can be calculated in a two-step process. In the first step, the pathway is calculated by solving the system of linear equations for an abbreviated stoichiometric number matrix without ATP, ADP, Pi, NADred, and NADox. In the second step, the stoichiometric numbers in the desired net reaction, which includes ATP, ADP, Pi, NADred, and NADox, are obtained by multiplying the full stoichiometric number matrix by the calculated pathway vector. PMID:8804633
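The two-step procedure can be illustrated with a tiny hypothetical reaction system in Python; the reactions and stoichiometric numbers below are invented for illustration and are not taken from the article.

import numpy as np

# Step 1: abbreviated stoichiometric number matrix (rows A, B, C; ATP/ADP omitted)
# for two hypothetical reactions, R1: A -> B and R2: B -> 2 C.
N_abbrev = np.array([[-1,  0],
                     [ 1, -1],
                     [ 0,  2]], dtype=float)
net = np.array([-1, 0, 2], dtype=float)           # desired net reaction: A -> 2 C

pathway, *_ = np.linalg.lstsq(N_abbrev, net, rcond=None)
print(pathway)                                    # [1. 1.]: R1 once, R2 once

# Step 2: the full matrix (rows A, B, C, ATP, ADP) times the pathway vector
# gives the net reaction including the ATP/ADP stoichiometric numbers.
N_full = np.array([[-1,  0],
                   [ 1, -1],
                   [ 0,  2],
                   [-1,  2],
                   [ 1, -2]], dtype=float)
print(N_full @ pathway)                           # [-1.  0.  2.  1. -1.]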
Cao, Youfang; Wang, Lianjie; Xu, Kexue; Kou, Chunhai; Zhang, Yulei; Wei, Guifang; He, Junjian; Wang, Yunfang; Zhao, Liping
2005-07-26
A new algorithm for assessing similarity between primer and template has been developed based on the hypothesis that annealing of primer to template is an information transfer process. Primer sequence is converted to a vector of the full potential hydrogen numbers (3 for G or C, 2 for A or T), while template sequence is converted to a vector of the actual hydrogen bond numbers formed after primer annealing. The former is considered as source information and the latter destination information. An information coefficient is calculated as a measure for fidelity of this information transfer process and thus a measure of similarity between primer and potential annealing site on template. Successful prediction of PCR products from whole genomic sequences with a computer program based on the algorithm demonstrated the potential of this new algorithm in areas like in silico PCR and gene finding.
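A rough Python sketch of the hydrogen-bond vectors is given below. The primer (source) vector holds the full potential bond counts and the template (destination) vector holds the bonds actually formed; the final similarity ratio used here is an assumption for illustration, not the paper's exact information-coefficient formula.

BONDS = {"G": 3, "C": 3, "A": 2, "T": 2}
PAIR  = {"A": "T", "T": "A", "G": "C", "C": "G"}

def bond_vectors(primer, template_aligned):
    """template_aligned lists the template base facing each primer base."""
    source = [BONDS[b] for b in primer]                       # full potential hydrogen bonds
    dest = [BONDS[b] if PAIR[b] == t else 0
            for b, t in zip(primer, template_aligned)]        # bonds actually formed
    return source, dest

src, dst = bond_vectors("GATCGC", "CTAGCA")    # one mismatch at the 3' end
print(src, dst)                                # [3, 2, 2, 3, 3, 3] [3, 2, 2, 3, 3, 0]
score = sum(dst) / sum(src)                    # assumed similarity measure in [0, 1]
print(f"similarity: {score:.2f}")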
Three-dimensional computational aerodynamics in the 1980's
NASA Technical Reports Server (NTRS)
Lomax, H.
1978-01-01
The future requirements for constructing codes that can be used to compute three-dimensional flows about aerodynamic shapes should be assessed in light of the constraints imposed by future computer architectures and the reality of usable algorithms that can provide practical three-dimensional simulations. On the hardware side, vector processing is inevitable in order to meet the CPU speeds required. To cope with three-dimensional geometries, massive data bases with fetch/store conflicts and transposition problems are inevitable. On the software side, codes must be prepared that: (1) can be adapted to complex geometries, (2) can (at the very least) predict the location of laminar and turbulent boundary layer separation, and (3) will converge rapidly to sufficiently accurate solutions.
Calibration of Predictor Models Using Multiple Validation Experiments
NASA Technical Reports Server (NTRS)
Crespo, Luis G.; Kenny, Sean P.; Giesy, Daniel P.
2015-01-01
This paper presents a framework for calibrating computational models using data from several and possibly dissimilar validation experiments. The offset between model predictions and observations, which might be caused by measurement noise, model-form uncertainty, and numerical error, drives the process by which uncertainty in the model's parameters is characterized. The resulting description of uncertainty along with the computational model constitute a predictor model. Two types of predictor models are studied: Interval Predictor Models (IPMs) and Random Predictor Models (RPMs). IPMs use sets to characterize uncertainty, whereas RPMs use random vectors. The propagation of a set through a model makes the response an interval-valued function of the state, whereas the propagation of a random vector yields a random process. Optimization-based strategies for calculating both types of predictor models are proposed. Whereas the formulations used to calculate IPMs target solutions leading to the interval-valued function of minimal spread containing all observations, those for RPMs seek to maximize the models' ability to reproduce the distribution of observations. Regarding RPMs, we choose a structure for the random vector (i.e., the assignment of probability to points in the parameter space) solely dependent on the prediction error. As such, the probabilistic description of uncertainty is not a subjective assignment of belief, nor is it expected to asymptotically converge to a fixed value, but instead it casts the model's ability to reproduce the experimental data. This framework enables evaluating the spread and distribution of the predicted response of target applications depending on the same parameters beyond the validation domain.
General MoM Solutions for Large Arrays
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fasenfest, B; Capolino, F; Wilton, D R
2003-07-22
This paper focuses on a numerical procedure that addresses the difficulties of dealing with large, finite arrays while preserving the generality and robustness of full-wave methods. We present a fast method based on approximating interactions between sufficiently separated array elements via a relatively coarse interpolation of the Green's function on a uniform grid commensurate with the array's periodicity. The interaction between the basis and testing functions is reduced to a three-stage process. The first stage is a projection of standard (e.g., RWG) subdomain bases onto a set of interpolation functions that interpolate the Green's function on the array face. This projection, which is used in a matrix/vector product for each array cell in an iterative solution process, need only be carried out once for a single cell and results in a low-rank matrix. An intermediate stage matrix/vector product computation involving the uniformly sampled Green's function is of convolutional form in the lateral (transverse) directions so that a 2D FFT may be used. The final stage is a third matrix/vector product computation involving a matrix resulting from projecting testing functions onto the Green's function interpolation functions; the low-rank matrix is either identical to (using Galerkin's method) or similar to that for the bases projection. An effective MoM solution scheme is developed for large arrays using a modification of the AIM (Adaptive Integral Method) method. The method permits the analysis of arrays with arbitrary contours and nonplanar elements. Both fill and solve times within the MoM method are improved with respect to more standard MoM solvers.
Fast parallel tandem mass spectral library searching using GPU hardware acceleration.
Baumgardner, Lydia Ashleigh; Shanmugam, Avinash Kumar; Lam, Henry; Eng, Jimmy K; Martin, Daniel B
2011-06-03
Mass spectrometry-based proteomics is a maturing discipline of biologic research that is experiencing substantial growth. Instrumentation has steadily improved over time with the advent of faster and more sensitive instruments collecting ever larger data files. Consequently, the computational process of matching a peptide fragmentation pattern to its sequence, traditionally accomplished by sequence database searching and more recently also by spectral library searching, has become a bottleneck in many mass spectrometry experiments. In both of these methods, the main rate-limiting step is the comparison of an acquired spectrum with all potential matches from a spectral library or sequence database. This is a highly parallelizable process because the core computational element can be represented as a simple but arithmetically intense multiplication of two vectors. In this paper, we present a proof of concept project taking advantage of the massively parallel computing available on graphics processing units (GPUs) to distribute and accelerate the process of spectral assignment using spectral library searching. This program, which we have named FastPaSS (for Fast Parallelized Spectral Searching), is implemented in CUDA (Compute Unified Device Architecture) from NVIDIA, which allows direct access to the processors in an NVIDIA GPU. Our efforts demonstrate the feasibility of GPU computing for spectral assignment, through implementation of the validated spectral searching algorithm SpectraST in the CUDA environment.
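The core scoring step can be sketched in plain NumPy as below: spectra are binned into fixed-length intensity vectors and compared by a normalized dot product. This is a CPU illustration of the arithmetic, not the CUDA implementation in FastPaSS, and the peak lists are invented.

import numpy as np

def binned(peaks, bin_width=1.0, max_mz=2000.0):
    """Bin a peak list of (m/z, intensity) pairs into a fixed-length, unit-norm vector."""
    vec = np.zeros(int(max_mz / bin_width))
    for mz, intensity in peaks:
        vec[int(mz / bin_width)] += intensity
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

query = binned([(300.2, 40.0), (401.3, 100.0), (502.4, 25.0)])
library = np.vstack([
    binned([(300.1, 35.0), (401.3, 90.0), (502.5, 30.0)]),    # plausible match
    binned([(210.0, 80.0), (655.8, 60.0)]),                   # unrelated entry
])

scores = library @ query          # one dot product per library spectrum
print(scores, scores.argmax())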
User's guide to STIPPAN: A panel method program for slotted tunnel interference prediction
NASA Technical Reports Server (NTRS)
Kemp, W. B., Jr.
1985-01-01
Guidelines are presented for use of the computer program STIPPAN to simulate the subsonic flow in a slotted wind tunnel test section with a known model disturbance. Input data requirements are defined in detail and other aspects of the program usage are discussed in more general terms. The program is written for use in a CDC CYBER 200 class vector processing system.
Visualization of Morse connection graphs for topologically rich 2D vector fields.
Szymczak, Andrzej; Sipeki, Levente
2013-12-01
Recent advances in vector field topology make it possible to compute its multi-scale graph representations for autonomous 2D vector fields in a robust and efficient manner. One of these representations is a Morse Connection Graph (MCG), a directed graph whose nodes correspond to Morse sets, generalizing stationary points and periodic trajectories, and whose arcs correspond to trajectories connecting them. While being useful for simple vector fields, the MCG can be hard to comprehend for topologically rich vector fields containing a large number of features. This paper describes a visual representation of the MCG, inspired by previous work on graph visualization. Our approach aims to preserve the spatial relationships between the MCG arcs and nodes and highlight the coherent behavior of connecting trajectories. Using simulations of ocean flow, we show that it can provide useful information on the flow structure. This paper focuses specifically on MCGs computed for piecewise constant (PC) vector fields. In particular, we describe extensions of the PC framework that make it more flexible and better suited for analysis of data on complex-shaped domains with a boundary. We also describe a topology simplification scheme that makes our MCG visualizations less ambiguous. Despite the focus on the PC framework, our approach could also be applied to graph representations or topological skeletons computed using different methods.
A Semisupervised Support Vector Machines Algorithm for BCI Systems
Qin, Jianzhao; Li, Yuanqing; Sun, Wei
2007-01-01
As an emerging technology, brain-computer interfaces (BCIs) bring us new communication interfaces which translate brain activities into control signals for devices like computers, robots, and so forth. In this study, we propose a semisupervised support vector machine (SVM) algorithm for brain-computer interface (BCI) systems, aiming at reducing the time-consuming training process. In this algorithm, we apply a semisupervised SVM for translating the features extracted from the electrical recordings of brain into control signals. This SVM classifier is built from a small labeled data set and a large unlabeled data set. Meanwhile, to reduce the time for training semisupervised SVM, we propose a batch-mode incremental learning method, which can also be easily applied to the online BCI systems. Additionally, it is suggested in many studies that common spatial pattern (CSP) is very effective in discriminating two different brain states. However, CSP needs a sufficient labeled data set. In order to overcome the drawback of CSP, we suggest a two-stage feature extraction method for the semisupervised learning algorithm. We apply our algorithm to two BCI experimental data sets. The offline data analysis results demonstrate the effectiveness of our algorithm. PMID:18368141
Least-Squares Support Vector Machine Approach to Viral Replication Origin Prediction
Cruz-Cano, Raul; Chew, David S.H.; Kwok-Pui, Choi; Ming-Ying, Leung
2010-01-01
Replication of their DNA genomes is a central step in the reproduction of many viruses. Procedures to find replication origins, which are initiation sites of the DNA replication process, are therefore of great importance for controlling the growth and spread of such viruses. Existing computational methods for viral replication origin prediction have mostly been tested within the family of herpesviruses. This paper proposes a new approach by least-squares support vector machines (LS-SVMs) and tests its performance not only on the herpes family but also on a collection of caudoviruses coming from three viral families under the order of caudovirales. The LS-SVM approach provides sensitivities and positive predictive values superior or comparable to those given by the previous methods. When suitably combined with previous methods, the LS-SVM approach further improves the prediction accuracy for the herpesvirus replication origins. Furthermore, by recursive feature elimination, the LS-SVM has also helped find the most significant features of the data sets. The results suggest that the LS-SVMs will be a highly useful addition to the set of computational tools for viral replication origin prediction and illustrate the value of optimization-based computing techniques in biomedical applications. PMID:20729987
Plaza-Leiva, Victoria; Gomez-Ruiz, Jose Antonio; Mandow, Anthony; García-Cerezo, Alfonso
2017-01-01
Improving the effectiveness of spatial shape feature classification from 3D lidar data is very relevant because it is widely used as a fundamental step towards higher-level scene understanding challenges of autonomous vehicles and terrestrial robots. In this sense, computing neighborhoods for points in dense scans becomes a costly process for both training and classification. This paper proposes a new general framework for implementing and comparing different supervised learning classifiers with a simple voxel-based neighborhood computation where points in each non-overlapping voxel in a regular grid are assigned to the same class by considering features within a support region defined by the voxel itself. The contribution provides offline training and online classification procedures as well as five alternative feature vector definitions based on principal component analysis for scatter, tubular and planar shapes. Moreover, the feasibility of this approach is evaluated by implementing a neural network (NN) method previously proposed by the authors as well as three other supervised learning classifiers found in scene processing methods: support vector machines (SVM), Gaussian processes (GP), and Gaussian mixture models (GMM). A comparative performance analysis is presented using real point clouds from both natural and urban environments and two different 3D rangefinders (a tilting Hokuyo UTM-30LX and a Riegl). Classification performance metrics and processing time measurements confirm the benefits of the NN classifier and the feasibility of voxel-based neighborhoods. PMID:28294963
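A minimal sketch of the voxel-based neighborhood idea follows (not the paper's full framework, and only one of many possible feature definitions): points are grouped into the voxels of a regular grid, and shape descriptors for scatter, tubular and planar structure are derived from the principal components of each voxel's points. The voxel size and feature names below are assumptions.

```python
# Hedged sketch: eigenvalue-based shape features computed per voxel of a regular grid.
import numpy as np

def voxelize(cloud, voxel_size=0.2):
    """Group an (N, 3) point cloud into non-overlapping voxels keyed by grid index."""
    keys = np.floor(cloud / voxel_size).astype(int)
    groups = {}
    for key, point in zip(map(tuple, keys), cloud):
        groups.setdefault(key, []).append(point)
    return {k: np.asarray(v) for k, v in groups.items()}

def voxel_shape_features(points):
    """points: (n, 3) coordinates in one voxel (assumes n >= 3 for a stable covariance)."""
    lam = np.sort(np.linalg.eigvalsh(np.cov(points.T)))[::-1]   # eigenvalues, largest first
    lam = np.maximum(lam, 1e-12)
    linearity = (lam[0] - lam[1]) / lam[0]     # high for tubular shapes
    planarity = (lam[1] - lam[2]) / lam[0]     # high for planar shapes
    sphericity = lam[2] / lam[0]               # high for scattered shapes
    return np.array([linearity, planarity, sphericity])
```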
Dinehart, R.L.; Burau, J.R.
2005-01-01
A strategy of repeated surveys by acoustic Doppler current profiler (ADCP) was applied in a tidal river to map velocity vectors and suspended-sediment indicators. The Sacramento River at the junction with the Delta Cross Channel at Walnut Grove, California, was surveyed over several tidal cycles in the fall of 2000 and 2001 with a vessel-mounted ADCP. Velocity profiles were recorded along flow-defining survey paths, with surveys repeated every 27 min through a diurnal tidal cycle. Velocity vectors along each survey path were interpolated to a three-dimensional Cartesian grid that conformed to local bathymetry. A separate array of vectors was interpolated onto a grid from each survey. By displaying interpolated vector grids sequentially with computer animation, flow dynamics of the reach could be studied in three dimensions as flow responded to the tidal cycle. Velocity streamtraces in the grid showed the upwelling of flow from the bottom of the Sacramento River channel into the Delta Cross Channel. The sequential display of vector grids showed that water in the canal briefly returned into the Sacramento River after peak flood tides, which had not been known previously. In addition to velocity vectors, ADCP data were processed to derive channel bathymetry and a spatial indicator for suspended-sediment concentration. Individual beam distances to bed, recorded by the ADCP, were transformed to yield bathymetry accurate enough to resolve small bedforms within the study reach. While recording velocity, ADCPs also record the intensity of acoustic backscatter from particles suspended in the flow. Sequential surveys of backscatter intensity were interpolated to grids and animated to indicate the spatial movement of suspended sediment through the study reach. Calculation of backscatter flux through cross-sectional grids provided a first step for computation of suspended-sediment discharge, the second step being a calibrated relation between backscatter intensity and sediment concentration. Spatial analyses of ADCP data showed that a strategy of repeated surveys and flow-field interpolation has the potential to simplify computation of flow and sediment discharge through complex waterways. The use of trade, product, industry, or firm names in this report is for descriptive purposes only and does not constitute endorsement of products by the US Government. © 2005 Elsevier B.V. All rights reserved.
Automated Creation of Labeled Pointcloud Datasets in Support of Machine-Learning Based Perception
2017-12-01
computationally intensive 3D vector math and took more than ten seconds to segment a single LIDAR frame from the HDL-32e with the Dell XPS15 9650’s Intel...Core i7 CPU. Depth Clustering avoids the computationally intensive 3D vector math of Euclidean Clustering-based DON segmentation and, instead
Assessing first-order emulator inference for physical parameters in nonlinear mechanistic models
Hooten, Mevin B.; Leeds, William B.; Fiechter, Jerome; Wikle, Christopher K.
2011-01-01
We present an approach for estimating physical parameters in nonlinear models that relies on an approximation to the mechanistic model itself for computational efficiency. The proposed methodology is validated and applied in two different modeling scenarios: (a) Simulation and (b) lower trophic level ocean ecosystem model. The approach we develop relies on the ability to predict right singular vectors (resulting from a decomposition of computer model experimental output) based on the computer model input and an experimental set of parameters. Critically, we model the right singular vectors in terms of the model parameters via a nonlinear statistical model. Specifically, we focus our attention on first-order models of these right singular vectors rather than the second-order (covariance) structure.
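As a rough illustration of the emulator idea only (not the authors' statistical model), one can decompose an ensemble of model runs with an SVD and regress the run-specific coefficients of the leading right singular vectors on the input parameters; here a random forest stands in for the nonlinear first-order model, and all array shapes and settings are assumptions.

```python
# Hedged sketch: SVD-based emulation of an expensive computer model from an ensemble of runs.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def fit_emulator(theta, Y, k=3):
    """theta: (m, p) parameter settings for m model runs; Y: (n, m) outputs, one column per run."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    U, s, V = U[:, :k], s[:k], Vt[:k].T            # rows of V: coefficients of each run
    models = [RandomForestRegressor(n_estimators=200, random_state=0).fit(theta, V[:, j])
              for j in range(k)]

    def emulate(theta_new):
        v = np.array([m.predict(np.atleast_2d(theta_new))[0] for m in models])
        return U @ (s * v)                         # approximate output for new parameters

    return emulate
```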
Application of Krylov exponential propagation to fluid dynamics equations
NASA Technical Reports Server (NTRS)
Saad, Youcef; Semeraro, David
1991-01-01
An application of matrix exponentiation via Krylov subspace projection to the solution of fluid dynamics problems is presented. The main idea is to approximate the operation exp(A)v by means of a projection-like process onto a Krylov subspace. This reduces the computation to an exponential matrix-vector product of much smaller size. Time integration schemes can then be devised to exploit this basic computational kernel. The motivation of this approach is to provide time-integration schemes that are essentially of an explicit nature but which have good stability properties.
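The basic kernel can be sketched as follows: a textbook Arnoldi projection that approximates exp(tau*A)v by exponentiating a small Hessenberg matrix. SciPy and the subspace dimension are illustrative choices; the paper's time-integration schemes and error control are not shown.

```python
# Hedged sketch: Krylov (Arnoldi) approximation of exp(tau*A) @ v.
import numpy as np
from scipy.linalg import expm

def krylov_expv(A, v, tau=1.0, m=30):
    n = len(v)
    m = min(m, n)
    V = np.zeros((n, m + 1))        # orthonormal Krylov basis
    H = np.zeros((m + 1, m))        # upper Hessenberg projection of A
    beta = np.linalg.norm(v)
    V[:, 0] = v / beta
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):      # modified Gram-Schmidt orthogonalization
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:     # happy breakdown: exact invariant subspace found
            m = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(m)
    e1[0] = 1.0
    return beta * V[:, :m] @ (expm(tau * H[:m, :m]) @ e1)
```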
Fundamental organometallic reactions: Applications on the CYBER 205
NASA Technical Reports Server (NTRS)
Rappe, A. K.
1984-01-01
Two of the most challenging problems of Organometallic chemistry (loosely defined) are pollution control with the large space velocities needed and nitrogen fixation, a process so capably done by nature and so relatively poorly done by man (industry). For a computational chemist these problems are on the fringe of what is possible with conventional computers (large models needed and accurate energetics required). A summary of the algorithmic modification needed to address these problems on a vector processor such as the CYBER 205 and a sketch of findings to date on deNOx catalysis and nitrogen fixation are presented.
Orientation of doubly rotated quartz plates.
Sherman, J R
1989-01-01
A derivation from classical spherical trigonometry of equations to compute the orientation of doubly rotated quartz blanks from Bragg X-ray data is discussed. Such equations are usually derived by compact and efficient vector methods, which are reviewed briefly; they are solved by generating a quadratic equation with numerical coefficients. Two methods exist for performing the computation from measurements against two planes: a direct solution by a quadratic equation and a process of convergent iteration. Both have a spurious solution. Measurement against three lattice planes yields a set of three linear equations whose solution is unambiguous.
Evaluation of laser cutting process with auxiliary gas pressure by soft computing approach
NASA Astrophysics Data System (ADS)
Lazov, Lyubomir; Nikolić, Vlastimir; Jovic, Srdjan; Milovančević, Miloš; Deneva, Heristina; Teirumenieka, Erika; Arsic, Nebojsa
2018-06-01
Evaluation of the optimal laser cutting parameters is very important for high cut quality. Laser cutting is a highly nonlinear process with many parameters, which is the main challenge in the optimization. Data mining is one of the most versatile methodologies that can be used for laser cutting process optimization. A support vector regression (SVR) procedure is implemented here, since it is a versatile and robust technique for highly nonlinear data regression. The goal of this study was to determine the optimal laser cutting parameters that ensure robust conditions for minimizing the average surface roughness. Three cutting parameters, the cutting speed, the laser power, and the assist gas pressure, were used in the investigation. A TruLaser 1030 technological system was used as the laser, and nitrogen was used as the assist gas in the laser cutting process. As the data mining method, the support vector regression procedure was used. Prediction accuracy was very high according to the coefficient of determination (R2) and the root mean square error (RMSE): R2 = 0.9975 and RMSE = 0.0337. Therefore the data mining approach could be used effectively to determine the optimal conditions for the laser cutting process.
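A minimal sketch of the SVR workflow described above, using synthetic placeholder data in place of the paper's measurements; the parameter ranges, kernel, and hyperparameters below are assumptions.

```python
# Hedged sketch: fit an SVR surface of average roughness vs. cutting speed, laser power
# and assist-gas pressure, then report R2 and RMSE. The data are synthetic placeholders.
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import r2_score, mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform([1.0, 500.0, 8.0], [6.0, 3000.0, 16.0], size=(60, 3))  # speed, power, pressure
y = 0.5 + 0.1 * X[:, 0] - 1e-4 * X[:, 1] + 0.02 * X[:, 2] + rng.normal(0, 0.02, 60)

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.01))
model.fit(X, y)
pred = model.predict(X)
print("R2   =", r2_score(y, pred))
print("RMSE =", np.sqrt(mean_squared_error(y, pred)))
```

The fitted surface can then be evaluated over a grid of speed, power, and pressure values to locate the settings with the lowest predicted roughness.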
A recursive technique for adaptive vector quantization
NASA Technical Reports Server (NTRS)
Lindsay, Robert A.
1989-01-01
Vector Quantization (VQ) is fast becoming an accepted, if not preferred, method for image compression. The VQ performs well when compressing all types of imagery including Video, Electro-Optical (EO), Infrared (IR), Synthetic Aperture Radar (SAR), Multi-Spectral (MS), and digital map data. The only requirement is to change the codebook to switch the compressor from one image sensor to another. There are several approaches for designing codebooks for a vector quantizer. Adaptive Vector Quantization is a procedure that simultaneously designs codebooks as the data are being encoded or quantized. This is done by computing the centroid as a recursive moving average, where the centroids move after every vector is encoded; for a fixed set of vectors, this recursive calculation yields the same centroid as a conventional batch calculation. This method of centroid calculation can be easily combined with VQ encoding techniques. The quantizer changes after every encoded vector by recursively updating the centroid of minimum distance, which is the one selected by the encoder. Since the quantizer changes definition or state after every encoded vector, the decoder must also receive updates to the codebook. This is done as side information by multiplexing bits into the compressed source data.
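A minimal sketch of the recursive centroid update follows; codebook size, initialization, and the side-information encoding for the decoder are left out, and the prior weights are an assumption.

```python
# Hedged sketch of adaptive VQ: after each vector is encoded, only the winning codeword
# moves toward it via a recursive moving average; the decoder is assumed to receive the
# same update as side information.
import numpy as np

def adaptive_vq_encode(vectors, codebook):
    """vectors: (T, d) source vectors; codebook: (K, d) initial codewords (updated in place)."""
    counts = np.ones(len(codebook))          # prior weight of each codeword
    indices = []
    for x in vectors:
        k = np.argmin(((codebook - x) ** 2).sum(axis=1))    # nearest codeword
        indices.append(k)
        counts[k] += 1
        codebook[k] += (x - codebook[k]) / counts[k]        # recursive moving average
    return np.array(indices), codebook
```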
Thai Language Sentence Similarity Computation Based on Syntactic Structure and Semantic Vector
NASA Astrophysics Data System (ADS)
Wang, Hongbin; Feng, Yinhan; Cheng, Liang
2018-03-01
Sentence similarity computation plays an increasingly important role in text mining, Web page retrieval, machine translation, speech recognition and question answering systems. Thai is a resource-scarce language; unlike Chinese, it lacks lexical resources such as HowNet and CiLin, so Thai sentence similarity research faces particular challenges. To address this problem, this paper proposes a novel method for computing the similarity of Thai sentences based on syntactic structure and semantic vectors. The method first uses Part-of-Speech (POS) dependencies to calculate the syntactic structure similarity of two sentences, and then uses word vectors to calculate their semantic similarity. Finally, the two scores are combined to give the overall similarity of the two Thai sentences. The proposed method thus considers both semantics and sentence syntactic structure. Experimental results show that the method is feasible for Thai sentence similarity computation.
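A rough sketch of the two-part combination follows. A POS tagger, trained Thai word vectors, and the weighting are assumed inputs, and SequenceMatcher merely stands in for the paper's POS-dependency score; it assumes every sentence has at least one in-vocabulary token.

```python
# Hedged sketch: combine a syntactic score (POS sequence similarity) with a semantic
# score (cosine of averaged word vectors) using a weight alpha.
import numpy as np
from difflib import SequenceMatcher

def semantic_sim(tokens_a, tokens_b, word_vectors):
    va = np.mean([word_vectors[t] for t in tokens_a if t in word_vectors], axis=0)
    vb = np.mean([word_vectors[t] for t in tokens_b if t in word_vectors], axis=0)
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

def syntactic_sim(pos_a, pos_b):
    # similarity of the two POS tag sequences, a stand-in for the dependency-based score
    return SequenceMatcher(None, pos_a, pos_b).ratio()

def sentence_sim(tokens_a, pos_a, tokens_b, pos_b, word_vectors, alpha=0.5):
    return (alpha * syntactic_sim(pos_a, pos_b)
            + (1.0 - alpha) * semantic_sim(tokens_a, tokens_b, word_vectors))
```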
Song, Kai; Wang, Qi; Liu, Qi; Zhang, Hongquan; Cheng, Yingguo
2011-01-01
This paper describes the design and implementation of a wireless electronic nose (WEN) system which can online detect the combustible gases methane and hydrogen (CH4/H2) and estimate their concentrations, either singly or in mixtures. The system is composed of two wireless sensor nodes—a slave node and a master node. The former comprises a Fe2O3 gas sensing array for the combustible gas detection, a digital signal processor (DSP) system for real-time sampling and processing of the sensor array data, and a wireless transceiver unit (WTU) by which the detection results can be transmitted to the master node connected with a computer. A type of Fe2O3 gas sensor insensitive to humidity is developed for resistance to environmental influences. A threshold-based least-squares support vector regression (LS-SVR) estimator is implemented on the DSP for classification and concentration measurements. Experimental results confirm that LS-SVR produces higher accuracy compared with artificial neural networks (ANNs) and a faster convergence rate than the standard support vector regression (SVR). The designed WEN system effectively achieves gas mixture analysis in a real-time process. PMID:22346587
Visualizing turbulent mixing of gases and particles
NASA Technical Reports Server (NTRS)
Ma, Kwan-Liu; Smith, Philip J.; Jain, Sandeep
1995-01-01
A physical model and interactive computer graphics techniques have been developed for the visualization of the basic physical process of stochastic dispersion and mixing from steady-state CFD calculations. The mixing of massless particles and inertial particles is visualized by transforming the vector field from a traditionally Eulerian reference frame into a Lagrangian reference frame. Groups of particles are traced through the vector field for the mean path as well as their statistical dispersion about the mean position by using added scalar information about the root mean square value of the vector field and its Lagrangian time scale. In this way, clouds of particles in a turbulent environment are traced, not just mean paths. In combustion simulations of many industrial processes, good mixing is required to achieve a sufficient degree of combustion efficiency. The ability to visualize this multiphase mixing can not only help identify poor mixing but also explain the mechanism for poor mixing. The information gained from the visualization can be used to improve the overall combustion efficiency in utility boilers or propulsion devices. We have used this technique to visualize steady-state simulations of the combustion performance in several furnace designs.
Ice Shape Characterization Using Self-Organizing Maps
NASA Technical Reports Server (NTRS)
McClain, Stephen T.; Tino, Peter; Kreeger, Richard E.
2011-01-01
A method for characterizing ice shapes using a self-organizing map (SOM) technique is presented. Self-organizing maps are neural-network techniques for representing noisy, multi-dimensional data aligned along a lower-dimensional and possibly nonlinear manifold. For a large set of noisy data, each element of a finite set of codebook vectors is iteratively moved in the direction of the data closest to the winner codebook vector. Through successive iterations, the codebook vectors begin to align with the trends of the higher-dimensional data. In information processing, the intent of SOM methods is to transmit the codebook vectors, which contain far fewer elements and require much less memory or bandwidth than the original noisy data set. When applied to airfoil ice accretion shapes, the properties of the codebook vectors and the statistical nature of the SOM methods allow for a quantitative comparison of experimentally measured mean or average ice shapes to ice shapes predicted using computer codes such as LEWICE. The nature of the codebook vectors also enables grid generation and surface roughness descriptions for use with the discrete-element roughness approach. In the present study, SOM characterizations are applied to a rime ice shape, a glaze ice shape at an angle of attack, a bi-modal glaze ice shape, and a multi-horn glaze ice shape. Improvements and future explorations will be discussed.
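A minimal sketch of the SOM iteration described above: a standard one-dimensional map with a shrinking Gaussian neighborhood. Map size, learning-rate schedule, and seed are assumptions; the grid generation and LEWICE comparison are not shown.

```python
# Hedged sketch: train a 1-D SOM whose codebook vectors align with noisy shape data.
import numpy as np

def train_som(data, n_codes=20, n_iter=5000, lr0=0.5, sigma0=3.0, seed=0):
    """data: (N, d) noisy shape coordinates; returns (n_codes, d) codebook vectors."""
    rng = np.random.default_rng(seed)
    codes = data[rng.choice(len(data), n_codes, replace=False)].astype(float)
    positions = np.arange(n_codes)                       # positions on the 1-D map
    for t in range(n_iter):
        x = data[rng.integers(len(data))]                # one random sample
        winner = np.argmin(((codes - x) ** 2).sum(axis=1))
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                          # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5              # shrinking neighborhood width
        h = np.exp(-((positions - winner) ** 2) / (2.0 * sigma ** 2))
        codes += lr * h[:, None] * (x - codes)           # pull winner and neighbors toward x
    return codes
```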
Solution of partial differential equations on vector and parallel computers
NASA Technical Reports Server (NTRS)
Ortega, J. M.; Voigt, R. G.
1985-01-01
The present status of numerical methods for partial differential equations on vector and parallel computers was reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.
Models for discrete-time self-similar vector processes with application to network traffic
NASA Astrophysics Data System (ADS)
Lee, Seungsin; Rao, Raghuveer M.; Narasimha, Rajesh
2003-07-01
The paper defines self-similarity for vector processes by employing the discrete-time continuous-dilation operation which has successfully been used previously by the authors to define 1-D discrete-time stochastic self-similar processes. To define self-similarity of vector processes, it is required to consider the cross-correlation functions between different 1-D processes as well as the autocorrelation function of each constituent 1-D process in it. System models to synthesize self-similar vector processes are constructed based on the definition. With these systems, it is possible to generate self-similar vector processes from white noise inputs. An important aspect of the proposed models is that they can be used to synthesize various types of self-similar vector processes by choosing proper parameters. Additionally, the paper presents evidence of vector self-similarity in two-channel wireless LAN data and applies the aforementioned systems to simulate the corresponding network traffic traces.
Weir, Peter T; Henze, Miriam J; Bleul, Christiane; Baumann-Klausener, Franziska; Labhart, Thomas; Dickinson, Michael H
2016-05-11
Many insects exploit skylight polarization as a compass cue for orientation and navigation. In the fruit fly, Drosophila melanogaster, photoreceptors R7 and R8 in the dorsal rim area (DRA) of the compound eye are specialized to detect the electric vector (e-vector) of linearly polarized light. These photoreceptors are arranged in stacked pairs with identical fields of view and spectral sensitivities, but mutually orthogonal microvillar orientations. As in larger flies, we found that the microvillar orientation of the distal photoreceptor R7 changes in a fan-like fashion along the DRA. This anatomical arrangement suggests that the DRA constitutes a detector for skylight polarization, in which different e-vectors maximally excite different positions in the array. To test our hypothesis, we measured responses to polarized light of varying e-vector angles in the terminals of R7/8 cells using genetically encoded calcium indicators. Our data confirm a progression of preferred e-vector angles from anterior to posterior in the DRA, and a strict orthogonality between the e-vector preferences of paired R7/8 cells. We observed decreased activity in photoreceptors in response to flashes of light polarized orthogonally to their preferred e-vector angle, suggesting reciprocal inhibition between photoreceptors in the same medullar column, which may serve to increase polarization contrast. Together, our results indicate that the polarization-vision system relies on a spatial map of preferred e-vector angles at the earliest stage of sensory processing. The fly's visual system is an influential model system for studying neural computation, and much is known about its anatomy, physiology, and development. The circuits underlying motion processing have received the most attention, but researchers are increasingly investigating other functions, such as color perception and object recognition. In this work, we investigate the early neural processing of a somewhat exotic sense, called polarization vision. Because skylight is polarized in an orientation that is rigidly determined by the position of the sun, this cue provides compass information. Behavioral experiments have shown that many species use the polarization pattern in the sky to direct locomotion. Here we describe the input stage of the fly's polarization-vision system. Copyright © 2016 the authors 0270-6474/16/365397-08$15.00/0.
An index-based algorithm for fast on-line query processing of latent semantic analysis
Zhang, Mingxi; Li, Pohan; Wang, Wei
2017-01-01
Latent Semantic Analysis (LSA) is widely used for finding documents whose semantics are similar to a query of keywords. Although LSA yields promising similarity results, the existing LSA algorithms involve many unnecessary operations in similarity computation and candidate checking during on-line query processing, which is expensive in terms of time cost and cannot efficiently respond to query requests, especially when the dataset becomes large. In this paper, we study the efficiency problem of on-line query processing for LSA towards efficiently searching for the documents similar to a given query. We rewrite the similarity equation of LSA in terms of an intermediate value called partial similarity that is stored in a designed index called the partial index. To reduce the search space, we give an approximate form of the similarity equation, and then develop an efficient algorithm for building the partial index, which skips the partial similarities lower than a given threshold θ. Based on the partial index, we develop an efficient algorithm called ILSA for supporting fast on-line query processing. The given query is transformed into a pseudo-document vector, and the similarities between the query and candidate documents are computed by accumulating the partial similarities obtained from the index nodes corresponding to non-zero entries in the pseudo-document vector. Compared to the LSA algorithm, ILSA reduces the time cost of on-line query processing by pruning the candidate documents that are not promising and skipping the operations that make little contribution to similarity scores. Extensive experiments comparing against LSA demonstrate the efficiency and effectiveness of our proposed algorithm. PMID:28520747
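A rough sketch of the partial-index idea under stated assumptions: the similarity is taken to be a plain dot product in the reduced space, so per-dimension contributions ("partial similarities") can be stored in an index, small ones skipped with a threshold, and scores accumulated only over the non-zero entries of the query's pseudo-document vector. ILSA's actual index layout, approximation, and pruning rules are not reproduced.

```python
# Hedged sketch: thresholded per-dimension index for dot-product similarities in LSA space.
import numpy as np
from collections import defaultdict

def build_partial_index(doc_vectors, theta=0.05):
    """doc_vectors: (n_docs, k) documents in the reduced LSA space."""
    index = defaultdict(list)              # dimension -> [(doc_id, partial value), ...]
    for doc_id, vec in enumerate(doc_vectors):
        for dim, value in enumerate(vec):
            if abs(value) >= theta:        # skip partial similarities below the threshold
                index[dim].append((doc_id, value))
    return index

def query(index, pseudo_doc, top=10):
    scores = defaultdict(float)
    for dim, q in enumerate(pseudo_doc):
        if q == 0.0:
            continue
        for doc_id, value in index[dim]:
            scores[doc_id] += q * value    # accumulate partial similarities
    return sorted(scores.items(), key=lambda kv: -kv[1])[:top]
```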
Chaibub Neto, Elias
2015-01-01
In this paper we propose a vectorized implementation of the non-parametric bootstrap for statistics based on sample moments. Basically, we adopt the multinomial sampling formulation of the non-parametric bootstrap, and compute bootstrap replications of sample moment statistics by simply weighting the observed data according to multinomial counts instead of evaluating the statistic on a resampled version of the observed data. Using this formulation we can generate a matrix of bootstrap weights and compute the entire vector of bootstrap replications with a few matrix multiplications. Vectorization is particularly important for matrix-oriented programming languages such as R, where matrix/vector calculations tend to be faster than scalar operations implemented in a loop. We illustrate the application of the vectorized implementation in real and simulated data sets, when bootstrapping Pearson’s sample correlation coefficient, and compared its performance against two state-of-the-art R implementations of the non-parametric bootstrap, as well as a straightforward one based on a for loop. Our investigations spanned varying sample sizes and number of bootstrap replications. The vectorized bootstrap compared favorably against the state-of-the-art implementations in all cases tested, and was remarkably/considerably faster for small/moderate sample sizes. The same results were observed in the comparison with the straightforward implementation, except for large sample sizes, where the vectorized bootstrap was slightly slower than the straightforward implementation due to increased time expenditures in the generation of weight matrices via multinomial sampling. PMID:26125965
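The multinomial-weights formulation can be sketched in a few lines, shown here in Python/NumPy rather than R purely for illustration; the statistic is Pearson's correlation as in the paper, while the number of replications and the seed are arbitrary.

```python
# Hedged sketch: vectorized bootstrap of Pearson's correlation via multinomial weights.
# Instead of resampling the data, draw a (B, n) matrix of multinomial counts and compute
# each replication as a weighted moment statistic, which vectorizes as matrix products.
import numpy as np

def vectorized_bootstrap_corr(x, y, B=2000, seed=0):
    n = len(x)
    rng = np.random.default_rng(seed)
    W = rng.multinomial(n, np.full(n, 1.0 / n), size=B) / n      # bootstrap weights, rows sum to 1
    mx, my = W @ x, W @ y                                        # weighted means
    cxy = W @ (x * y) - mx * my                                  # weighted covariance
    sx = np.sqrt(W @ (x * x) - mx ** 2)
    sy = np.sqrt(W @ (y * y) - my ** 2)
    return cxy / (sx * sy)                                       # B bootstrap replications
```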
NASA Astrophysics Data System (ADS)
Yihaa Roodhiyah, Lisa’; Tjong, Tiffany; Nurhasan; Sutarno, D.
2018-04-01
In previous research, the linear matrices arising from the vector finite element method in two-dimensional (2-D) magnetotelluric (MT) response modeling were solved with a non-sparse direct solver in TE mode. That approach had weaknesses that needed to be addressed, in particular insufficient accuracy at low frequencies (10^-3 Hz to 10^-5 Hz) and a high computational cost on dense meshes. In this work, a sparse direct solver is used instead of a non-sparse direct solver to overcome these weaknesses. A sparse direct solver is well suited to the linear matrices of the vector finite element method because the matrices are symmetric and sparse. The sparse direct solver was validated against analytical solutions for a homogeneous half-space model and a vertical contact model. The validation shows that the sparse direct solver is more stable than the non-sparse direct solver for the linear problems of the vector finite element method, especially at low frequencies. As a result, accurate 2-D MT response modeling at low frequencies (10^-3 Hz to 10^-5 Hz) has been achieved with efficient array memory allocation and reduced computation time.
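As a small illustration of why a sparse direct solver pays off for symmetric sparse systems of this kind (a generic SciPy example, not the authors' MT code):

```python
# Hedged sketch: solve the same sparse symmetric system with a sparse and a dense direct
# solver; for FEM-like matrices the sparse factorization uses far less memory and time.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

n = 2000
main = 2.0 * np.ones(n)
off = -1.0 * np.ones(n - 1)
A = sp.diags([off, main, off], offsets=[-1, 0, 1], format="csc")   # sparse SPD test matrix
b = np.ones(n)

x_sparse = spsolve(A, b)                    # sparse direct solve
x_dense = np.linalg.solve(A.toarray(), b)   # dense solve of the same system
print(np.allclose(x_sparse, x_dense))       # both give the same solution
```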
Global Magnetohydrodynamic Simulation Using High Performance FORTRAN on Parallel Computers
NASA Astrophysics Data System (ADS)
Ogino, T.
High Performance Fortran (HPF) is one of the modern and common techniques for achieving high-performance parallel computation. We have translated a 3-dimensional magnetohydrodynamic (MHD) simulation code of the Earth's magnetosphere from VPP Fortran to HPF/JA on the Fujitsu VPP5000/56 vector-parallel supercomputer; the MHD code was fully vectorized and fully parallelized in VPP Fortran. The overall performance and capability of the HPF MHD code were shown to be almost comparable to those of the VPP Fortran version. A 3-dimensional global MHD simulation of the Earth's magnetosphere was performed at a speed of over 400 Gflops, with an efficiency of 76.5% on the VPP5000/56 in vector and parallel computation, which permitted comparison with catalog values. We have concluded that fluid and MHD codes that are fully vectorized and fully parallelized in VPP Fortran can be translated with relative ease to HPF/JA, and a code in HPF/JA may be expected to perform comparably to the same code written in VPP Fortran.
Wang, Zhi-Long; Zhou, Zhi-Guo; Chen, Ying; Li, Xiao-Ting; Sun, Ying-Shi
The aim of this study was to diagnose lymph node metastasis of esophageal cancer by a support vector machine model based on computed tomography. A total of 131 esophageal cancer patients with preoperative chemotherapy and radical surgery were included. Various indicators (tumor thickness, tumor length, tumor CT value, total number of lymph nodes, and long axis and short axis sizes of the largest lymph node) on CT images before and after neoadjuvant chemotherapy were recorded. A support vector machine model based on these CT indicators was built to predict lymph node metastasis. The support vector machine model diagnosed lymph node metastasis better than the preoperative short axis size of the largest lymph node on CT; the areas under the receiver operating characteristic curves were 0.887 and 0.705, respectively. The support vector machine model of CT images can help diagnose lymph node metastasis in esophageal cancer with preoperative chemotherapy.
Method and System for Temporal Filtering in Video Compression Systems
NASA Technical Reports Server (NTRS)
Lu, Ligang; He, Drake; Jagmohan, Ashish; Sheinin, Vadim
2011-01-01
Three related innovations combine improved non-linear motion estimation, video coding, and video compression. The first system comprises a method in which side information is generated using an adaptive, non-linear motion model. This method enables extrapolating and interpolating a visual signal, including determining the first motion vector between the first pixel position in a first image to a second pixel position in a second image; determining a second motion vector between the second pixel position in the second image and a third pixel position in a third image; determining a third motion vector between the first pixel position in the first image and the second pixel position in the second image, the second pixel position in the second image, and the third pixel position in the third image using a non-linear model; and determining a position of the fourth pixel in a fourth image based upon the third motion vector. For the video compression element, the video encoder has low computational complexity and high compression efficiency. The disclosed system comprises a video encoder and a decoder. The encoder converts the source frame into a space-frequency representation, estimates the conditional statistics of at least one vector of space-frequency coefficients with similar frequencies, and is conditioned on previously encoded data. It estimates an encoding rate based on the conditional statistics and applies a Slepian-Wolf code with the computed encoding rate. The method for decoding includes generating a side-information vector of frequency coefficients based on previously decoded source data and encoder statistics and previous reconstructions of the source frequency vector. It also performs Slepian-Wolf decoding of a source frequency vector based on the generated side-information and the Slepian-Wolf code bits. The video coding element includes receiving a first reference frame having a first pixel value at a first pixel position, a second reference frame having a second pixel value at a second pixel position, and a third reference frame having a third pixel value at a third pixel position. It determines a first motion vector between the first pixel position and the second pixel position, a second motion vector between the second pixel position and the third pixel position, and a fourth pixel value for a fourth frame based upon a linear or nonlinear combination of the first pixel value, the second pixel value, and the third pixel value. A stationary filtering process determines the estimated pixel values. The parameters of the filter may be predetermined constants.
NASA Astrophysics Data System (ADS)
Bruni, Marco; Thomas, Daniel B.; Wands, David
2014-02-01
We present the first calculation of an intrinsically relativistic quantity, the leading-order correction to Newtonian theory, in fully nonlinear cosmological large-scale structure studies. Traditionally, nonlinear structure formation in standard ΛCDM cosmology is studied using N-body simulations, based on Newtonian gravitational dynamics on an expanding background. When one derives the Newtonian regime in a way that is a consistent approximation to the Einstein equations, the first relativistic correction to the usual Newtonian scalar potential is a gravitomagnetic vector potential, giving rise to frame dragging. At leading order, this vector potential does not affect the matter dynamics, thus it can be computed from Newtonian N-body simulations. We explain how we compute the vector potential from simulations in ΛCDM and examine its magnitude relative to the scalar potential, finding that the power spectrum of the vector potential is of the order 10-5 times the scalar power spectrum over the range of nonlinear scales we consider. On these scales the vector potential is up to two orders of magnitudes larger than the value predicted by second-order perturbation theory extrapolated to the same scales. We also discuss some possible observable effects and future developments.
Lee, David; Park, Sang-Hoon; Lee, Sang-Goog
2017-10-07
In this paper, we propose a set of wavelet-based combined feature vectors and a Gaussian mixture model (GMM)-supervector to enhance training speed and classification accuracy in motor imagery brain-computer interfaces. The proposed method is configured as follows: first, wavelet transforms are applied to extract the feature vectors for identification of motor imagery electroencephalography (EEG) and principal component analyses are used to reduce the dimensionality of the feature vectors and linearly combine them. Subsequently, the GMM universal background model is trained by the expectation-maximization (EM) algorithm to purify the training data and reduce its size. Finally, a purified and reduced GMM-supervector is used to train the support vector machine classifier. The performance of the proposed method was evaluated for three different motor imagery datasets in terms of accuracy, kappa, mutual information, and computation time, and compared with the state-of-the-art algorithms. The results from the study indicate that the proposed method achieves high accuracy with a small amount of training data compared with the state-of-the-art algorithms in motor imagery EEG classification.
Supercomputer implementation of finite element algorithms for high speed compressible flows
NASA Technical Reports Server (NTRS)
Thornton, E. A.; Ramakrishnan, R.
1986-01-01
Prediction of compressible flow phenomena using the finite element method is of recent origin and considerable interest. Two shock capturing finite element formulations for high speed compressible flows are described. A Taylor-Galerkin formulation uses a Taylor series expansion in time coupled with a Galerkin weighted residual statement. The Taylor-Galerkin algorithms use explicit artificial dissipation, and the performance of three dissipation models are compared. A Petrov-Galerkin algorithm has as its basis the concepts of streamline upwinding. Vectorization strategies are developed to implement the finite element formulations on the NASA Langley VPS-32. The vectorization scheme results in finite element programs that use vectors of length of the order of the number of nodes or elements. The use of the vectorization procedure speeds up processing rates by over two orders of magnitude. The Taylor-Galerkin and Petrov-Galerkin algorithms are evaluated for 2D inviscid flows on criteria such as solution accuracy, shock resolution, computational speed and storage requirements. The convergence rates for both algorithms are enhanced by local time-stepping schemes. Extension of the vectorization procedure for predicting 2D viscous and 3D inviscid flows are demonstrated. Conclusions are drawn regarding the applicability of the finite element procedures for realistic problems that require hundreds of thousands of nodes.
NASA Astrophysics Data System (ADS)
Přibil, Jiří; Přibilová, Anna; Ďuračkoá, Daniela
2014-01-01
The paper describes our experiment with using the Gaussian mixture models (GMM) for classification of speech uttered by a person wearing orthodontic appliances. For the GMM classification, the input feature vectors comprise the basic and the complementary spectral properties as well as the supra-segmental parameters. Dependence of classification correctness on the number of the parameters in the input feature vector and on the computation complexity is also evaluated. In addition, an influence of the initial setting of the parameters for GMM training process was analyzed. Obtained recognition results are compared visually in the form of graphs as well as numerically in the form of tables and confusion matrices for tested sentences uttered using three configurations of orthodontic appliances.
NASA Astrophysics Data System (ADS)
Carvalho, F.; Gonçalves, V. P.; Navarra, F. S.; Spiering, D.
2018-04-01
Exclusive vector meson photoproduction associated with a leading baryon (B = n, Δ+, Δ0) in pp and pA collisions at RHIC and LHC energies is investigated using the color dipole formalism and taking into account nonlinear effects in the QCD dynamics. In particular, we compute the cross sections for ρ, ϕ and J/Ψ production together with a Δ and compare the predictions with those obtained for a leading neutron. Our results show that the V + Δ cross section is almost 30% of the V + n one. Our results also show that a future experimental analysis of these processes is, in principle, feasible and can be useful to study leading particle production.
A combined direct/inverse three-dimensional transonic wing design method for vector computers
NASA Technical Reports Server (NTRS)
Weed, R. A.; Carlson, L. A.; Anderson, W. K.
1984-01-01
A three-dimensional transonic-wing design algorithm for vector computers is developed, and the results of sample computations are presented graphically. The method incorporates the direct/inverse scheme of Carlson (1975), a Cartesian grid system with boundary conditions applied at a mean plane, and a potential-flow solver based on the conservative form of the full potential equation and using the ZEBRA II vectorizable solution algorithm of South et al. (1980). The accuracy and consistency of the method with regard to direct and inverse analysis and trailing-edge closure are verified in the test computations.
CSM research: Methods and application studies
NASA Technical Reports Server (NTRS)
Knight, Norman F., Jr.
1989-01-01
Computational mechanics is that discipline of applied science and engineering devoted to the study of physical phenomena by means of computational methods based on mathematical modeling and simulation, utilizing digital computers. The discipline combines theoretical and applied mechanics, approximation theory, numerical analysis, and computer science. Computational mechanics has had a major impact on engineering analysis and design. When applied to structural mechanics, the discipline is referred to herein as computational structural mechanics. Complex structures being considered by NASA for the 1990's include composite primary aircraft structures and the space station. These structures will be much more difficult to analyze than today's structures and necessitate a major upgrade in computerized structural analysis technology. NASA has initiated a research activity in structural analysis called Computational Structural Mechanics (CSM). The broad objective of the CSM activity is to develop advanced structural analysis technology that will exploit modern and emerging computers, such as those with vector and/or parallel processing capabilities. Here, the current research directions for the Methods and Application Studies Team of the Langley CSM activity are described.
A Metric to Quantify Shared Visual Attention in Two-Person Teams
NASA Technical Reports Server (NTRS)
Gontar, Patrick; Mulligan, Jeffrey B.
2015-01-01
Introduction: Critical tasks in high-risk environments are often performed by teams, the members of which must work together efficiently. In some situations, the team members may have to work together to solve a particular problem, while in others it may be better for them to divide the work into separate tasks that can be completed in parallel. We hypothesize that these two team strategies can be differentiated on the basis of shared visual attention, measured by gaze tracking. 2) Methods: Gaze recordings were obtained for two-person flight crews flying a high-fidelity simulator (Gontar, Hoermann, 2014). Gaze was categorized with respect to 12 areas of interest (AOIs). We used these data to construct time series of 12 dimensional vectors, with each vector component representing one of the AOIs. At each time step, each vector component was set to 0, except for the one corresponding to the currently fixated AOI, which was set to 1. This time series could then be averaged in time, with the averaging window time (t) as a variable parameter. For example, when we average with a t of one minute, each vector component represents the proportion of time that the corresponding AOI was fixated within the corresponding one minute interval. We then computed the Pearson product-moment correlation coefficient between the gaze proportion vectors for each of the two crew members, at each point in time, resulting in a signal representing the time-varying correlation between gaze behaviors. We determined criteria for concluding correlated gaze behavior using two methods: first, a permutation test was applied to the subjects' data. When one crew member's gaze proportion vector is correlated with a random time sample from the other crewmember's data, a distribution of correlation values is obtained that differs markedly from the distribution obtained from temporally aligned samples. In addition to validating that the gaze tracker was functioning reasonably well, this also allows us to compute probabilities of coordinated behavior for each value of the correlation. As an alternative, we also tabulated distributions of correlation coefficients for synthetic data sets, in which the behavior was modeled as a first-order Markov process, and compared correlation distributions for identical processes with those for disparate processes, allowing us to choose criteria and estimate error rates. 3) Discussion: Our method of gaze correlation is able to measure shared visual attention, and can distinguish between activities involving different instruments. We plan to analyze whether pilots strategies of sharing visual attention can predict performance. Possible measurements of performance include expert ratings from instructors, fuel consumption, total task time, and failure rate. While developed for two-person crews, our approach can be applied to larger groups, using intra-class correlation coefficients instead of the Pearson product-moment correlation.
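A rough sketch of the correlation metric described above: the 12 AOIs and the sliding-window averaging follow the description, while the permutation test is implemented as random circular time shifts, which is one reasonable reading of the procedure rather than the authors' exact code.

```python
# Hedged sketch: windowed gaze-proportion vectors per crew member and their time-varying
# Pearson correlation, with a shift-based permutation null distribution.
import numpy as np

def gaze_proportions(aoi_sequence, n_aoi=12, window=60):
    """aoi_sequence: integer AOI index (0..n_aoi-1) at each time step."""
    onehot = np.eye(n_aoi)[aoi_sequence]                     # (T, n_aoi) indicator vectors
    kernel = np.ones(window) / window
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, onehot)

def shared_attention(aoi_a, aoi_b, window=60):
    Pa = gaze_proportions(aoi_a, window=window)
    Pb = gaze_proportions(aoi_b, window=window)
    a = Pa - Pa.mean(axis=1, keepdims=True)
    b = Pb - Pb.mean(axis=1, keepdims=True)
    num = (a * b).sum(axis=1)
    den = np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1)) + 1e-12
    return num / den                                          # correlation at each time step

def permutation_null(aoi_a, aoi_b, n_perm=200, window=60, seed=0):
    rng = np.random.default_rng(seed)
    null = []
    for _ in range(n_perm):
        shift = rng.integers(1, len(aoi_b))                   # random circular time shift
        null.append(shared_attention(aoi_a, np.roll(aoi_b, shift), window=window))
    return np.concatenate(null)
```

A criterion for concluding coordinated behavior can then be taken as a high quantile of the permutation null distribution.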
NASA Astrophysics Data System (ADS)
Khan, F.; Enzmann, F.; Kersten, M.
2015-12-01
In X-ray computed microtomography (μXCT) image processing is the most important operation prior to image analysis. Such processing mainly involves artefact reduction and image segmentation. We propose a new two-stage post-reconstruction procedure of an image of a geological rock core obtained by polychromatic cone-beam μXCT technology. In the first stage, the beam-hardening (BH) is removed applying a best-fit quadratic surface algorithm to a given image data set (reconstructed slice), which minimizes the BH offsets of the attenuation data points from that surface. The final BH-corrected image is extracted from the residual data, or the difference between the surface elevation values and the original grey-scale values. For the second stage, we propose using a least square support vector machine (a non-linear classifier algorithm) to segment the BH-corrected data as a pixel-based multi-classification task. A combination of the two approaches was used to classify a complex multi-mineral rock sample. The Matlab code for this approach is provided in the Appendix. A minor drawback is that the proposed segmentation algorithm may become computationally demanding in the case of a high dimensional training data set.
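A minimal sketch of the first stage only: a straightforward least-squares quadratic surface fit to one reconstructed slice, with the residual kept as the corrected image. The paper's exact fitting procedure and the second-stage LS-SVM segmentation are not reproduced.

```python
# Hedged sketch: remove beam hardening from a CT slice by subtracting a best-fit
# quadratic surface fitted to the grey values by ordinary least squares.
import numpy as np

def remove_beam_hardening(slice_img):
    """slice_img: 2-D array of grey values from one reconstructed slice."""
    ny, nx = slice_img.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    x = xx.ravel().astype(float)
    y = yy.ravel().astype(float)
    z = slice_img.ravel().astype(float)
    # design matrix for z = a + b*x + c*y + d*x^2 + e*y^2 + f*x*y
    A = np.column_stack([np.ones_like(x), x, y, x ** 2, y ** 2, x * y])
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    surface = (A @ coeffs).reshape(ny, nx)
    return slice_img - surface                      # residual image with BH offsets removed
```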
NASA Technical Reports Server (NTRS)
Himer, J. T.
1992-01-01
Fortran has largely enjoyed prominence for the past few decades as the computer programming language of choice for numerically intensive scientific, engineering, and process control applications. Fortran's well understood static language syntax has allowed parsers and compiler optimization technologies to generate some of the most efficient and fastest run-time executables, particularly on high-end scalar and vector supercomputers. Computing architectures and paradigms have changed considerably since the last ANSI/ISO Fortran release in 1978, and while FORTRAN 77 has more than survived, its aged features provide only partial functionality for today's demanding computing environments. The simple block procedural languages have been necessarily evolving, or giving way, to specialized supercomputing, network resource, and object-oriented paradigms. To address these new computing demands, ANSI has worked for the last 12 years, with three international public reviews, to deliver Fortran 90. Fortran 90 has superseded and replaced ISO FORTRAN 77 internationally as the sole Fortran standard; in the US, Fortran 90 is expected to be adopted as the ANSI standard this summer, coexisting with ANSI FORTRAN 77 until at least 1996. The development path and current state of Fortran will be briefly described, highlighting the many new Fortran 90 syntactic and semantic additions which support (among others): free form source; array syntax; new control structures; modules and interfaces; pointers; derived data types; dynamic memory; enhanced I/O; operator overloading; data abstraction; user optional arguments; new intrinsics for array, bit manipulation, and system inquiry; and enhanced portability through better generic control of underlying system arithmetic models. Examples from dynamical astronomy and signal and image processing will attempt to illustrate Fortran 90's applicability to today's general scalar, vector, and parallel scientific and engineering requirements and object-oriented programming paradigms. Time permitting, current work proceeding on the future development of Fortran 2000 and collateral standards will be introduced.
Support Vector Machine-Based Endmember Extraction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Filippi, Anthony M; Archibald, Richard K
Introduced in this paper is the utilization of Support Vector Machines (SVMs) to automatically perform endmember extraction from hyperspectral data. The strengths of SVM are exploited to provide a fast and accurate calculated representation of high-dimensional data sets that may consist of multiple distributions. Once this representation is computed, the number of distributions can be determined without prior knowledge. For each distribution, an optimal transform can be determined that preserves informational content while reducing the data dimensionality, and hence, the computational cost. Finally, endmember extraction for the whole data set is accomplished. Results indicate that this Support Vector Machine-Based Endmember Extraction (SVM-BEE) algorithm has the capability of autonomously determining endmembers from multiple clusters with computational speed and accuracy, while maintaining a robust tolerance to noise.
Vectorized schemes for conical potential flow using the artificial density method
NASA Technical Reports Server (NTRS)
Bradley, P. F.; Dwoyer, D. L.; South, J. C., Jr.; Keen, J. M.
1984-01-01
A method is developed to determine solutions to the full-potential equation for steady supersonic conical flow using the artificial density method. Various update schemes used generally for transonic potential solutions are investigated. The schemes are compared for speed and robustness. All versions of the computer code have been vectorized and are currently running on the CYBER-203 computer. The update schemes are vectorized, where possible, either fully (explicit schemes) or partially (implicit schemes). Since each version of the code differs only by the update scheme and elements other than the update scheme are completely vectorizable, comparisons of computational effort and convergence rate among schemes are a measure of the specific scheme's performance. Results are presented for circular and elliptical cones at angle of attack for subcritical and supercritical crossflows.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Avila, Gustavo, E-mail: Gustavo-Avila@telefonica.net; Carrington, Tucker, E-mail: Tucker.Carrington@queensu.ca
In this paper, we improve the collocation method for computing vibrational spectra that was presented in Avila and Carrington, Jr. [J. Chem. Phys. 139, 134114 (2013)]. Using an iterative eigensolver, energy levels and wavefunctions are determined from values of the potential on a Smolyak grid. The kinetic energy matrix-vector product is evaluated by transforming a vector labelled with (nondirect product) grid indices to a vector labelled by (nondirect product) basis indices. Both the transformation and application of the kinetic energy operator (KEO) scale favorably. Collocation facilitates dealing with complicated KEOs because it obviates the need to calculate integrals of coordinate-dependent coefficients of differential operators. The ideas are tested by computing energy levels of HONO using a KEO in bond coordinates.
NASA Technical Reports Server (NTRS)
Greatorex, Scott (Editor); Beckman, Mark
1996-01-01
Several future, and some current missions, use an on-board computer (OBC) force model that is very limited. The OBC geopotential force model typically includes only the J(2), J(3), J(4), C(2,2) and S(2,2) terms to model non-spherical Earth gravitational effects. The Tropical Rainfall Measuring Mission (TRMM), Wide-field Infrared Explorer (WIRE), Transition Region and Coronal Explorer (TRACE), Submillimeter Wave Astronomy Satellite (SWAS), and X-ray Timing Explorer (XTE) all plan to use this geopotential force model on-board. The Solar, Anomalous, and Magnetospheric Particle Explorer (SAMPEX) is already flying this geopotential force model. Past analysis has shown that one of the leading sources of error in the OBC propagated ephemeris is the omission of the higher order geopotential terms. However, these same analyses have shown a wide range of accuracies for the OBC ephemerides. Analysis was performed using EUVE state vectors that showed the EUVE four-day OBC propagated ephemerides varied in accuracy from 200 m to 45 km depending on the initial vector used to start the propagation. The vectors used in the study were from a single EUVE orbit at one-minute intervals in the ephemeris. Since each vector propagated practically the same path as the others, the differences seen had to be due to differences in the initial state vector only. An algorithm was developed that will optimize the epoch of the uploaded state vector. Proper selection can reduce the previous errors of anywhere from 200 m to 45 km to generally less than 1 km over four days of propagation. This would enable flight projects to minimize state vector uploads to the spacecraft. Additionally, this method is superior to other methods in that no additional orbit estimates need be done. The definitive ephemeris generated on the ground can be used as long as the proper epoch is chosen. This algorithm can be easily coded in software that would pick the epoch within a specified time range that would minimize the OBC propagation error. This technique should greatly improve the accuracy of the OBC propagation on-board future spacecraft such as TRMM, WIRE, SWAS, and XTE without increasing complexity in the ground processing.
NASA Technical Reports Server (NTRS)
Smith, Jason T.; Welsh, Sam J.; Farinetti, Antonio L.; Wegner, Tim; Blakeslee, James; Deboeck, Toni F.; Dyer, Daniel; Corley, Bryan M.; Ollivierre, Jarmaine; Kramer, Leonard;
2010-01-01
A Spacecraft Position Optimal Tracking (SPOT) program was developed to process Global Positioning System (GPS) data, sent via telemetry from a spacecraft, to generate accurate navigation estimates of the vehicle position and velocity (state vector) using a Kalman filter. This program uses the GPS onboard receiver measurements to sequentially calculate the vehicle state vectors and provide this information to ground flight controllers. It is the first real-time ground-based shuttle navigation application using onboard sensors. The program is compact, portable, self-contained, and can run on a variety of UNIX or Linux computers. The program has a modular object-oriented design that supports application-specific plugins such as data corruption remediation pre-processing and remote graphics display. The Kalman filter is extensible to additional sensor types or force models. The Kalman filter design is also strong against data dropouts because it uses physical models from state and covariance propagation in the absence of data. The design of this program separates the functionalities of SPOT into six different executable processes. This allows for the individual processes to be connected in an a la carte manner, making the feature set and executable complexity of SPOT adaptable to the needs of the user. Also, these processes need not be executed on the same workstation. This allows for communications between SPOT processes executing on the same Local Area Network (LAN). Thus, SPOT can be executed in a distributed sense with the capability for a team of flight controllers to efficiently share the same trajectory information currently being computed by the program. SPOT is used in the Mission Control Center (MCC) for Space Shuttle Program (SSP) and International Space Station Program (ISSP) operations, and can also be used as a post-flight analysis tool. It is primarily used for situational awareness, and for contingency situations.
Smisc - A collection of miscellaneous functions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Landon Sego, PNNL
2015-08-31
A collection of functions for statistical computing and data manipulation. These include routines for rapidly aggregating heterogeneous matrices, manipulating file names, loading R objects, sourcing multiple R files, formatting datetimes, multi-core parallel computing, stream editing, specialized plotting, etc. The package index includes:
Smisc-package - A collection of miscellaneous functions
allMissing - Identifies missing rows or columns in a data frame or matrix
as.numericSilent - Silent wrapper for coercing a vector to numeric
comboList - Produces all possible combinations of a set of linear model predictors
cumMax - Computes the maximum of the vector up to the current index
cumsumNA - Computes the cumulative sum of a vector without propagating NAs
d2binom - Probability functions for the sum of two independent binomials
dataIn - A flexible way to import data into R
dbb - The Beta-Binomial Distribution
df2list - Row-wise conversion of a data frame to a list
dfplapply - Parallelized single row processing of a data frame
dframeEquiv - Examines the equivalence of two dataframes or matrices
dkbinom - Probability functions for the sum of k independent binomials
factor2character - Converts all factor variables in a dataframe to character variables
findDepMat - Identify linearly dependent rows or columns in a matrix
formatDT - Converts date or datetime strings into alternate formats
getExtension - Filename manipulations: remove the extension or path, extract the extension or path
getPath - Filename manipulations: remove the extension or path, extract the extension or path
grabLast - Filename manipulations: remove the extension or path, extract the extension or path
ifelse1 - Non-vectorized version of ifelse
integ - Simple numerical integration routine
interactionPlot - Two-way Interaction Plot with Error Bar
linearMap - Linear mapping of a numerical vector or scalar
list2df - Convert a list to a data frame
loadObject - Loads and returns the object(s) in an ".Rdata" file
more - Display the contents of a file to the R terminal
movAvg2 - Calculate the moving average using a 2-sided window
openDevice - Opens a graphics device based on the filename extension
p2binom - Probability functions for the sum of two independent binomials
padZero - Pad a vector of numbers with zeros
parseJob - Parses a collection of elements into (almost) equal sized groups
pbb - The Beta-Binomial Distribution
pcbinom - A continuous version of the binomial cdf
pkbinom - Probability functions for the sum of k independent binomials
plapply - Simple parallelization of lapply
plotFun - Plot one or more functions on a single plot
PowerData - An example of power data
pvar - Prints the name and value of one or more objects
qbb - The Beta-Binomial Distribution
rbb - The Beta-Binomial Distribution
And numerous others (space limits reporting).
Computational model of a vector-mediated epidemic
NASA Astrophysics Data System (ADS)
Dickman, Adriana Gomes; Dickman, Ronald
2015-05-01
We discuss a lattice model of vector-mediated transmission of a disease to illustrate how simulations can be applied in epidemiology. The population consists of two species, human hosts and vectors, which contract the disease from one another. Hosts are sedentary, while vectors (mosquitoes) diffuse in space. Examples of such diseases are malaria, dengue fever, and Pierce's disease in vineyards. The model exhibits a phase transition between an absorbing (infection free) phase and an active one as parameters such as infection rates and vector density are varied.
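A toy Monte Carlo sketch of a host-vector lattice model in the spirit described above, not the authors' model: sedentary hosts occupy lattice sites, vectors random-walk on the lattice, and infection passes between the species at shared sites. All rates, the lattice size, and the omission of vector recovery are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

L = 50                      # lattice side (assumed size)
n_vec = 400                 # number of vectors/mosquitoes (assumed)
p_inf = 0.3                 # transmission probability per contact (assumed)
p_rec = 0.05                # host recovery probability per step (assumed)

host_inf = np.zeros((L, L), dtype=bool)          # sedentary hosts, one per site
host_inf[L // 2, L // 2] = True                  # seed infection
vec_pos = rng.integers(0, L, size=(n_vec, 2))    # vector positions
vec_inf = np.zeros(n_vec, dtype=bool)            # vector recovery omitted in this toy model

for step in range(200):
    # Vectors diffuse: random nearest-neighbour hop with periodic boundaries.
    moves = rng.integers(0, 4, size=n_vec)
    hops = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])[moves]
    vec_pos = (vec_pos + hops) % L

    # Cross-species transmission at shared sites.
    at_inf_host = host_inf[vec_pos[:, 0], vec_pos[:, 1]]
    vec_inf |= at_inf_host & (rng.random(n_vec) < p_inf)      # host -> vector
    bite = vec_inf & (rng.random(n_vec) < p_inf)              # vector -> host
    host_inf[vec_pos[bite, 0], vec_pos[bite, 1]] = True

    # Host recovery; near the transition the infection may die out (absorbing phase).
    host_inf &= rng.random((L, L)) >= p_rec

    if not host_inf.any() and not vec_inf.any():
        break

print("infected hosts:", host_inf.sum(), "infected vectors:", vec_inf.sum())
```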
NASA Astrophysics Data System (ADS)
Liu, GaiYun; Chao, Daniel Yuh
2015-08-01
To date, research on the supervisor design for flexible manufacturing systems focuses on speeding up the computation of optimal (maximally permissive) liveness-enforcing controllers. Recent deadlock prevention policies for systems of simple sequential processes with resources (S3PR) reduce the computation burden by considering only the minimal portion of all first-met bad markings (FBMs). Maximal permissiveness is ensured by not forbidding any live state. This paper proposes a method to further reduce the size of minimal set of FBMs to efficiently solve integer linear programming problems while maintaining maximal permissiveness using a vector-covering approach. This paper improves the previous work and achieves the simplest structure with the minimal number of monitors.
Arana-Daniel, Nancy; Gallegos, Alberto A; López-Franco, Carlos; Alanís, Alma Y; Morales, Jacob; López-Franco, Adriana
2016-01-01
With the increasing power of computers, the amount of data that can be processed in small periods of time has grown exponentially, as has the importance of classifying large-scale data efficiently. Support vector machines have shown good results classifying large amounts of high-dimensional data, such as data generated by protein structure prediction, spam recognition, medical diagnosis, optical character recognition and text classification, etc. Most state of the art approaches for large-scale learning use traditional optimization methods, such as quadratic programming or gradient descent, which makes the use of evolutionary algorithms for training support vector machines an area to be explored. The present paper proposes an approach that is simple to implement based on evolutionary algorithms and Kernel-Adatron for solving large-scale classification problems, focusing on protein structure prediction. The functional properties of proteins depend upon their three-dimensional structures. Knowing the structures of proteins is crucial for biology and can lead to improvements in areas such as medicine, agriculture and biofuels.
NASA Astrophysics Data System (ADS)
Liu, Tianyu; Du, Xining; Ji, Wei; Xu, X. George; Brown, Forrest B.
2014-06-01
For nuclear reactor analysis such as the neutron eigenvalue calculations, the time-consuming Monte Carlo (MC) simulations can be accelerated by using graphics processing units (GPUs). However, traditional MC methods are often history-based, and their performance on GPUs is affected significantly by the thread divergence problem. In this paper we describe the development of a newly designed event-based vectorized MC algorithm for solving the neutron eigenvalue problem. The code was implemented using NVIDIA's Compute Unified Device Architecture (CUDA), and tested on an NVIDIA Tesla M2090 GPU card. We found that although the vectorized MC algorithm greatly reduces the occurrence of thread divergence, thus enhancing the warp execution efficiency, the overall simulation speed is roughly ten times slower than the history-based MC code on GPUs. Profiling results suggest that the slow speed is probably due to the memory access latency caused by the large amount of global memory transactions. Possible solutions to improve the code efficiency are discussed.
NASA Technical Reports Server (NTRS)
Jones, D. W.
1971-01-01
The navigation and guidance process for the Jupiter, Saturn, and Uranus planetary encounter phases of the 1977 Grand Tour interior mission was simulated. Reference approach navigation accuracies were defined and the relative information content of the various observation types was evaluated. Reference encounter guidance requirements were defined, sensitivities to assumed simulation model parameters were determined, and the adequacy of the linear estimation theory was assessed. A linear sequential estimator was used to provide an estimate of the augmented state vector, consisting of the six state variables of position and velocity plus the three components of a planet position bias. The guidance process was simulated using a nonspherical model of the execution errors. Computation algorithms which simulate the navigation and guidance process were derived from theory and implemented in two research-oriented computer programs, written in FORTRAN.
Computer graphics application in the engineering design integration system
NASA Technical Reports Server (NTRS)
Glatt, C. R.; Abel, R. W.; Hirsch, G. N.; Alford, G. E.; Colquitt, W. N.; Stewart, W. A.
1975-01-01
The computer graphics aspect of the Engineering Design Integration (EDIN) system and its application to design problems were discussed. Three basic types of computer graphics may be used with the EDIN system for the evaluation of aerospace vehicle preliminary designs: offline graphics systems using vellum-inking or photographic processes, online graphics systems characterized by direct-coupled low-cost storage tube terminals with limited interactive capabilities, and a minicomputer-based refresh terminal offering highly interactive capabilities. The offline systems are characterized by high quality (resolution better than 0.254 mm) and slow turnaround (one to four days). The online systems are characterized by low cost, instant visualization of the computer results, slow line speed (300 baud), poor hard copy, and the early limitations on vector graphic input capabilities. The recent acquisition of the Adage 330 Graphic Display system has greatly enhanced the potential for interactive computer-aided design.
Method and system of filtering and recommending documents
Patton, Robert M.; Potok, Thomas E.
2016-02-09
Disclosed is a method and system for discovering documents using a computer and providing a small set of the most relevant documents to the attention of a human observer. Using the method, the computer obtains a seed document from the user and generates a seed document vector using term frequency-inverse corpus frequency weighting. A keyword index for a plurality of source documents can be compared with the weighted terms of the seed document vector. The comparison is then filtered to reduce the number of documents, which define an initial subset of the source documents. Initial subset vectors are generated and compared to the seed document vector to obtain a similarity value for each comparison. Based on the similarity value, the method then recommends one or more of the source documents.
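A rough Python sketch of the weight-and-compare flow described in the claim, not the patented implementation: a seed-document vector is built with a term frequency times inverse corpus frequency weighting (the exact TF-ICF form is assumed), a keyword filter selects an initial subset of source documents, and cosine similarity ranks the subset. The mini-corpus, vocabulary handling, and thresholds are hypothetical.

```python
import numpy as np
from collections import Counter

def tf_icf_vector(doc_terms, corpus_term_counts, n_docs, vocab):
    """Weight terms by term frequency times inverse corpus frequency (assumed form)."""
    tf = Counter(doc_terms)
    vec = np.zeros(len(vocab))
    for i, term in enumerate(vocab):
        if tf[term]:
            icf = np.log(n_docs / (1 + corpus_term_counts.get(term, 0)))
            vec[i] = tf[term] * icf
    return vec

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# Hypothetical mini-corpus; in practice a keyword index over many source documents.
corpus = {"doc1": "kalman filter state vector".split(),
          "doc2": "vector quantization image coding".split(),
          "doc3": "gps kalman navigation filter".split()}
seed = "kalman filter navigation".split()

vocab = sorted({t for d in corpus.values() for t in d} | set(seed))
counts = Counter(t for d in corpus.values() for t in set(d))
seed_vec = tf_icf_vector(seed, counts, len(corpus), vocab)

# Filter to an initial subset (here: documents sharing any seed keyword), then rank by similarity.
subset = {k: v for k, v in corpus.items() if set(v) & set(seed)}
scores = {k: cosine(seed_vec, tf_icf_vector(v, counts, len(corpus), vocab)) for k, v in subset.items()}
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```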
NASA Technical Reports Server (NTRS)
Muellerschoen, R. J.
1988-01-01
A unified method to permute vector-stored upper-triangular diagonal factorized covariance (UD) and vector stored upper-triangular square-root information filter (SRIF) arrays is presented. The method involves cyclical permutation of the rows and columns of the arrays and retriangularization with appropriate square-root-free fast Givens rotations or elementary slow Givens reflections. A minimal amount of computation is performed and only one scratch vector of size N is required, where N is the column dimension of the arrays. To make the method efficient for large SRIF arrays on a virtual memory machine, three additional scratch vectors each of size N are used to avoid expensive paging faults. The method discussed is compared with the methods and routines of Bierman's Estimation Subroutine Library (ESL).
Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A
2014-02-11
Mechanisms for performing a complex matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the complex matrix multiplication operation to a first target vector register. The first vector operand comprises a real and imaginary part of a first complex vector value. A complex load and splat operation is performed to load a second complex vector value of a second vector operand and replicate the second complex vector value within a second target vector register. The second complex vector value has a real and imaginary part. A cross multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the complex matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored in a result vector register.
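A scalar NumPy emulation of the arithmetic described in the claim, not the SIMD instructions themselves: one vector register holds interleaved (real, imaginary) pairs, a "load and splat" replicates one complex value of the second operand across a register, and a cross multiply-add forms the complex partial product that is then accumulated. The data values are hypothetical.

```python
import numpy as np

# First vector operand: interleaved (real, imag) pairs of complex values, as loaded
# into a vector register. Hypothetical data.
a = np.array([1.0, 2.0, 3.0, -1.0])           # (a0.re, a0.im, a1.re, a1.im)

# "Load and splat": one complex value of the second operand replicated across the register.
b_re, b_im = 0.5, -2.0
b_splat = np.array([b_re, b_im, b_re, b_im])

# Cross multiply-add producing the real and imaginary parts of the partial products:
# (a.re*b.re - a.im*b.im, a.re*b.im + a.im*b.re) for each complex element.
re = a[0::2] * b_splat[0::2] - a[1::2] * b_splat[1::2]
im = a[0::2] * b_splat[1::2] + a[1::2] * b_splat[0::2]

acc = np.zeros_like(a)                         # accumulator for partial products
acc[0::2] += re
acc[1::2] += im

# Check against ordinary complex arithmetic.
expected = (a[0::2] + 1j * a[1::2]) * (b_re + 1j * b_im)
assert np.allclose(acc[0::2] + 1j * acc[1::2], expected)
```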
Fast parallel tandem mass spectral library searching using GPU hardware acceleration
Baumgardner, Lydia Ashleigh; Shanmugam, Avinash Kumar; Lam, Henry; Eng, Jimmy K.; Martin, Daniel B.
2011-01-01
Mass spectrometry-based proteomics is a maturing discipline of biologic research that is experiencing substantial growth. Instrumentation has steadily improved over time with the advent of faster and more sensitive instruments collecting ever larger data files. Consequently, the computational process of matching a peptide fragmentation pattern to its sequence, traditionally accomplished by sequence database searching and more recently also by spectral library searching, has become a bottleneck in many mass spectrometry experiments. In both of these methods, the main rate limiting step is the comparison of an acquired spectrum with all potential matches from a spectral library or sequence database. This is a highly parallelizable process because the core computational element can be represented as a simple but arithmetically intense multiplication of two vectors. In this paper we present a proof of concept project taking advantage of the massively parallel computing available on graphics processing units (GPUs) to distribute and accelerate the process of spectral assignment using spectral library searching. This program, which we have named FastPaSS (for Fast Parallelized Spectral Searching) is implemented in CUDA (Compute Unified Device Architecture) from NVIDIA which allows direct access to the processors in an NVIDIA GPU. Our efforts demonstrate the feasibility of GPU computing for spectral assignment, through implementation of the validated spectral searching algorithm SpectraST in the CUDA environment. PMID:21545112
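The rate-limiting step described above, scoring an acquired spectrum against every library entry, reduces to a large batch of vector dot products. The following NumPy sketch shows that core operation on the CPU; FastPaSS itself runs it in CUDA on the GPU, where each library spectrum can be scored by an independent thread block. The dimensions and random data are placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

n_bins = 2000        # binned m/z axis (assumed)
n_library = 50000    # candidate library spectra (assumed)

library = rng.random((n_library, n_bins)).astype(np.float32)
query = rng.random(n_bins).astype(np.float32)

# Normalize so the dot product is a cosine-style similarity score.
library /= np.linalg.norm(library, axis=1, keepdims=True)
query /= np.linalg.norm(query)

# The arithmetically intense, highly parallelizable step: one matrix-vector product.
scores = library @ query
best = np.argsort(scores)[-5:][::-1]
print("top candidate indices:", best, "scores:", scores[best])
```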
Vector Disparity Sensor with Vergence Control for Active Vision Systems
Barranco, Francisco; Diaz, Javier; Gibaldi, Agostino; Sabatini, Silvio P.; Ros, Eduardo
2012-01-01
This paper presents an architecture for computing vector disparity for active vision systems as used on robotics applications. The control of the vergence angle of a binocular system allows us to efficiently explore dynamic environments, but requires a generalization of the disparity computation with respect to a static camera setup, where the disparity is strictly 1-D after the image rectification. The interaction between vision and motor control allows us to develop an active sensor that achieves high accuracy of the disparity computation around the fixation point, and fast reaction time for the vergence control. In this contribution, we address the development of a real-time architecture for vector disparity computation using an FPGA device. We implement the disparity unit and the control module for vergence, version, and tilt to determine the fixation point. In addition, two on-chip different alternatives for the vector disparity engines are discussed based on the luminance (gradient-based) and phase information of the binocular images. The multiscale versions of these engines are able to estimate the vector disparity up to 32 fps on VGA resolution images with very good accuracy as shown using benchmark sequences with known ground-truth. The performances in terms of frame-rate, resource utilization, and accuracy of the presented approaches are discussed. On the basis of these results, our study indicates that the gradient-based approach leads to the best trade-off choice for the integration with the active vision system. PMID:22438737
Symbolic Computation Using Cellular Automata-Based Hyperdimensional Computing.
Yilmaz, Ozgur
2015-12-01
This letter introduces a novel framework of reservoir computing that is capable of both connectionist machine intelligence and symbolic computation. A cellular automaton is used as the reservoir of dynamical systems. Input is randomly projected onto the initial conditions of automaton cells, and nonlinear computation is performed on the input via application of a rule in the automaton for a period of time. The evolution of the automaton creates a space-time volume of the automaton state space, and it is used as the reservoir. The proposed framework is shown to be capable of long-term memory, and it requires orders of magnitude less computation compared to echo state networks. As the focus of the letter, we suggest that binary reservoir feature vectors can be combined using Boolean operations as in hyperdimensional computing, paving a direct way for concept building and symbolic processing. To demonstrate the capability of the proposed system, we make analogies directly on image data by asking, What is the automobile of air?
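A minimal sketch of the hyperdimensional-computing style of symbolic manipulation the letter points to, using random binary hypervectors rather than cellular-automaton reservoir states: XOR binds role-filler pairs, bitwise majority bundles them into records, and the "automobile of air" analogy is answered by mapping between records and unbinding. The concept vectors and encoding are illustrative assumptions, not the letter's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(2)
D = 10000                                          # hyperdimensional width (assumed)

def rand_hv():
    return rng.integers(0, 2, D, dtype=np.uint8)

def bind(a, b):                                    # XOR binding (self-inverse)
    return a ^ b

def bundle(a, b):                                  # bitwise majority of two vectors, random tie-break
    tie = rng.integers(0, 2, D, dtype=np.uint8)
    return np.where(a == b, a, tie).astype(np.uint8)

def sim(a, b):                                     # normalized Hamming agreement
    return 1.0 - np.count_nonzero(a ^ b) / D

# Toy atomic concept vectors; in the letter these would be binary reservoir feature vectors.
DOMAIN, VEHICLE = rand_hv(), rand_hv()
LAND, AIR, CAR, PLANE = rand_hv(), rand_hv(), rand_hv(), rand_hv()

land_record = bundle(bind(DOMAIN, LAND), bind(VEHICLE, CAR))      # "on land one drives a car"
air_record = bundle(bind(DOMAIN, AIR), bind(VEHICLE, PLANE))      # "in the air one flies a plane"

# Analogy query "what is the automobile of air?": map between records, then unbind CAR.
answer = bind(bind(land_record, air_record), CAR)
for name, hv in [("plane", PLANE), ("car", CAR), ("air", AIR), ("land", LAND)]:
    print(name, round(sim(answer, hv), 3))         # PLANE should score clearly highest
```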
Use of Colour and Interactive Animation in Learning 3D Vectors
ERIC Educational Resources Information Center
Iskander, Wejdan; Curtis, Sharon
2005-01-01
This study investigated the effects of two computer-implemented techniques (colour and interactive animation) on learning 3D vectors. The participants were 43 female Saudi Arabian high school students. They were pre-tested on 3D vectors using a paper questionnaire that consisted of calculation and visualization types of questions. The students…
Vectors in Use in a 3D Juggling Game Simulation
ERIC Educational Resources Information Center
Kynigos, Chronis; Latsi, Maria
2006-01-01
The new representations enabled by the educational computer game the "Juggler" can place vectors in a central role both for controlling and measuring the behaviours of objects in a virtual environment simulating motion in three-dimensional spaces. The mathematical meanings constructed by 13 year-old students in relation to vectors as…
Evaluation of the SPAR thermal analyzer on the CYBER-203 computer
NASA Technical Reports Server (NTRS)
Robinson, J. C.; Riley, K. M.; Haftka, R. T.
1982-01-01
The use of the CYBER 203 vector computer for thermal analysis is investigated. Strengths of the CYBER 203 include the ability to perform, in vector mode using a 64-bit word, 50 million floating point operations per second (MFLOPS) for addition and subtraction, 25 MFLOPS for multiplication, and 12.5 MFLOPS for division. The speed of scalar operation is comparable to that of a CDC 7600 and is some 2 to 3 times faster than Langley's CYBER 175s. The CYBER 203 has 1,048,576 64-bit words of real memory with an 80 nanosecond (nsec) access time. Memory is bit addressable and provides single error correction, double error detection (SECDED) capability. The virtual memory capability handles data in either 512 or 65,536 word pages. The machine has 256 registers with a 40 nsec access time. The weaknesses of the CYBER 203 include the amount of vector operation overhead and some data storage limitations. In vector operations there is considerable start-up time before the first result is produced, so that vector calculation is slower than scalar operation for short vectors.
Design of 2D time-varying vector fields.
Chen, Guoning; Kwatra, Vivek; Wei, Li-Yi; Hansen, Charles D; Zhang, Eugene
2012-10-01
Design of time-varying vector fields, i.e., vector fields that can change over time, has a wide variety of important applications in computer graphics. Existing vector field design techniques do not address time-varying vector fields. In this paper, we present a framework for the design of time-varying vector fields, both for planar domains as well as manifold surfaces. Our system supports the creation and modification of various time-varying vector fields with desired spatial and temporal characteristics through several design metaphors, including streamlines, pathlines, singularity paths, and bifurcations. These design metaphors are integrated into an element-based design to generate the time-varying vector fields via a sequence of basis field summations or spatial constrained optimizations at the sampled times. The key-frame design and field deformation are also introduced to support other user design scenarios. Accordingly, a spatial-temporal constrained optimization and the time-varying transformation are employed to generate the desired fields for these two design scenarios, respectively. We apply the time-varying vector fields generated using our design system to a number of important computer graphics applications that require controllable dynamic effects, such as evolving surface appearance, dynamic scene design, steerable crowd movement, and painterly animation. Many of these are difficult or impossible to achieve via prior simulation-based methods. In these applications, the time-varying vector fields have been applied as either orientation fields or advection fields to control the instantaneous appearance or evolving trajectories of the dynamic effects.
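A toy Python sketch of the element-based idea in the abstract: basis field elements (a source and a vortex) are summed with time-varying weights to form an unsteady 2D vector field, and a pathline is traced through it. The element shapes, weights, and the simple forward-Euler advection are assumptions for illustration, not the paper's design system.

```python
import numpy as np

def basis_source(p, center):
    """Radial (source) basis element centered at `center`."""
    d = p - center
    return d / (d @ d + 1e-6)

def basis_vortex(p, center):
    """Rotational (vortex) basis element centered at `center`."""
    d = p - center
    return np.array([-d[1], d[0]]) / (d @ d + 1e-6)

def field(p, t):
    """Time-varying field: basis summation with time-dependent weights (hypothetical design)."""
    w_src = np.cos(0.5 * t)                    # element strengths animated over time
    w_vtx = 1.0 + 0.5 * np.sin(t)
    return w_src * basis_source(p, np.array([0.3, 0.3])) + \
           w_vtx * basis_vortex(p, np.array([0.7, 0.6]))

# Trace a pathline of the unsteady field with forward Euler (a particle advected over time).
p, dt = np.array([0.1, 0.5]), 0.01
path = [p.copy()]
for k in range(500):
    p = p + dt * field(p, k * dt)
    path.append(p.copy())
print("pathline endpoint:", path[-1])
```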
Visualization of x-ray computer tomography using computer-generated holography
NASA Astrophysics Data System (ADS)
Daibo, Masahiro; Tayama, Norio
1998-09-01
The theory for converting x-ray projection data directly into a hologram, by combining computer tomography (CT) with the computer-generated hologram (CGH), is proposed. The purpose of this study is to offer the theory for realizing an all-electronic, high-speed, see-through 3D visualization system for application to medical diagnosis and non-destructive testing. First, the CT is expressed using the pseudo-inverse matrix, which is obtained by the singular value decomposition. The CGH is expressed in matrix form. Next, the 'projection to hologram conversion' (PTHC) matrix is calculated by multiplying the phase matrix of the CGH with the pseudo-inverse matrix of the CT. Finally, the projection vector is converted directly to the hologram vector by multiplying the PTHC matrix with the projection vector. By incorporating holographic analog computation into CT reconstruction, the amount of calculation is drastically reduced. We demonstrate a CT cross section reconstructed in 3D space with a He-Ne laser from real x-ray projection data acquired by x-ray television equipment, using our direct conversion technique.
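A minimal NumPy sketch of the matrix chain described above. The CT system matrix, its SVD-based pseudo-inverse, and a stand-in random phase matrix for the CGH kernel are all toy placeholders; only the structure hologram = (phase matrix x pseudo-inverse) x projection is taken from the abstract.

```python
import numpy as np

rng = np.random.default_rng(3)

n_pix = 64        # reconstructed cross-section pixels (assumed)
n_proj = 96       # projection-data samples (assumed)
n_holo = 128      # hologram samples (assumed)

# CT system matrix A maps the image to projection data: p = A @ x.
A = rng.random((n_proj, n_pix))

# Pseudo-inverse of the CT system, computed via singular value decomposition.
A_pinv = np.linalg.pinv(A)

# CGH in matrix form: a phase matrix H maps the image to hologram samples
# (a toy random-phase matrix stands in for the real CGH propagation kernel).
H = np.exp(1j * 2 * np.pi * rng.random((n_holo, n_pix)))

# Projection-to-hologram conversion (PTHC) matrix, precomputed once.
PTHC = H @ A_pinv

# Direct conversion: projection vector -> hologram vector, no explicit CT reconstruction step.
x_true = rng.random(n_pix)
projection = A @ x_true
hologram = PTHC @ projection
print(hologram.shape)
```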
Three-dimensional vector modeling and restoration of flat finite wave tank radiometric measurements
NASA Technical Reports Server (NTRS)
Truman, W. M.; Balanis, C. A.
1977-01-01
The three-dimensional vector interaction between a microwave radiometer and a wave tank was modeled. Computer programs for predicting the response of the radiometer to the brightness temperature characteristics of the surroundings were developed along with a computer program that can invert (restore) the radiometer measurements. It is shown that the computer programs can be used to simulate the viewing of large bodies of water, and is applicable to radiometer measurements received from satellites monitoring the ocean. The water temperature, salinity, and wind speed can be determined.
Shuttle program: Ground tracking data program document shuttle OFT launch/landing
NASA Technical Reports Server (NTRS)
Lear, W. M.
1977-01-01
The equations for processing ground tracking data during a space shuttle ascent or entry, or any non-free-flight phase of a shuttle mission, are given. The resulting computer program processes data from up to three stations simultaneously: C-band station number 1, C-band station number 2, and an S-band station. The C-band data consist of range, azimuth, and elevation angle measurements. The S-band data consist of range, two angles, and integrated Doppler data in the form of cycle counts. A nineteen-element state vector is used in a Kalman filter to process the measurements. The acceleration components of the shuttle are taken to be independent exponentially-correlated random variables. Nine elements of the state vector are the measurement bias errors associated with range and two angles for each tracking station. The biases are all modeled as exponentially-correlated random variables with a typical time constant of 108 seconds. All time constants are taken to be the same for all nine state variables. This simplifies the logic in propagating the state error covariance matrix ahead in time.
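For illustration, a standard discrete-time propagation of an exponentially-correlated random variable (a first-order Gauss-Markov process), the model used above for the acceleration and bias states. The time constant is the value quoted in the abstract; the steady-state standard deviation and step size are assumptions, and this is not the shuttle program's code.

```python
import numpy as np

rng = np.random.default_rng(4)

tau = 108.0          # correlation time constant quoted in the abstract (seconds)
sigma = 1.0          # steady-state standard deviation of the bias (assumed)
dt = 1.0             # filter step (assumed)

phi = np.exp(-dt / tau)                       # state transition factor
q = sigma**2 * (1.0 - phi**2)                 # discrete process-noise variance

# Propagate one exponentially-correlated state (e.g., a range-bias error) over time.
x = 0.0
history = []
for _ in range(1000):
    x = phi * x + rng.normal(0.0, np.sqrt(q))
    history.append(x)

print("sample standard deviation (should approach sigma):", np.std(history))
```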
Multiscale analysis of information dynamics for linear multivariate processes.
Faes, Luca; Montalto, Alessandro; Stramaglia, Sebastiano; Nollo, Giandomenico; Marinazzo, Daniele
2016-08-01
In the study of complex physical and physiological systems represented by multivariate time series, an issue of great interest is the description of the system dynamics over a range of different temporal scales. While information-theoretic approaches to the multiscale analysis of complex dynamics are being increasingly used, the theoretical properties of the applied measures are poorly understood. This study introduces for the first time a framework for the analytical computation of information dynamics for linear multivariate stochastic processes explored at different time scales. After showing that the multiscale processing of a vector autoregressive (VAR) process introduces a moving average (MA) component, we describe how to represent the resulting VARMA process using state-space (SS) models and how to exploit the SS model parameters to compute analytical measures of information storage and information transfer for the original and rescaled processes. The framework is then used to quantify multiscale information dynamics for simulated unidirectionally and bidirectionally coupled VAR processes, showing that rescaling may lead to insightful patterns of information storage and transfer but also to potentially misleading behaviors.
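A toy numerical sketch of the rescaling step discussed above: a bivariate VAR(1) process is simulated, then averaged over non-overlapping windows and decimated, the low-pass operation that introduces the MA component. The coefficients, scale, and the crude lag-1 cross-correlation diagnostic are illustrative assumptions; the paper's analytical state-space computations are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulate a toy bivariate VAR(1) process x_t = A x_{t-1} + e_t.
A = np.array([[0.5, 0.2],
              [0.0, 0.6]])          # unidirectional coupling x2 -> x1 (assumed)
n, tau = 20000, 4                   # series length and time scale (assumed)

x = np.zeros((n, 2))
for t in range(1, n):
    x[t] = A @ x[t - 1] + rng.normal(size=2)

# Multiscale rescaling: average over non-overlapping windows of length tau and decimate.
m = n // tau
x_scaled = x[:m * tau].reshape(m, tau, 2).mean(axis=1)

# Lag-1 cross-correlation before and after rescaling (a crude proxy for transfer patterns).
def lag1_xcorr(y):
    return np.corrcoef(y[1:, 0], y[:-1, 1])[0, 1]

print("original:", round(lag1_xcorr(x), 3), "rescaled:", round(lag1_xcorr(x_scaled), 3))
```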
Acceleration of Radiance for Lighting Simulation by Using Parallel Computing with OpenCL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zuo, Wangda; McNeil, Andrew; Wetter, Michael
2011-09-06
We report on the acceleration of annual daylighting simulations for fenestration systems in the Radiance ray-tracing program. The algorithm was optimized to reduce both the redundant data input/output operations and the floating-point operations. To further accelerate the simulation speed, the calculation for matrix multiplications was implemented using parallel computing on a graphics processing unit. We used OpenCL, which is a cross-platform parallel programming language. Numerical experiments show that the combination of the above measures can speed up the annual daylighting simulations 101.7 times or 28.6 times when the sky vector has 146 or 2306 elements, respectively.
A high performance linear equation solver on the VPP500 parallel supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nakanishi, Makoto; Ina, Hiroshi; Miura, Kenichi
1994-12-31
This paper describes the implementation of two high performance linear equation solvers developed for the Fujitsu VPP500, a distributed memory parallel supercomputer system. The solvers take advantage of the key architectural features of the VPP500: (1) scalability for an arbitrary number of processors up to 222 processors, (2) flexible data transfer among processors provided by a crossbar interconnection network, (3) vector processing capability on each processor, and (4) overlapped computation and transfer. The general linear equation solver based on the blocked LU decomposition method achieves 120.0 GFLOPS performance with 100 processors in the LINPACK Highly Parallel Computing benchmark.
NASA Astrophysics Data System (ADS)
Yan, Feng-Gang; Cao, Bin; Rong, Jia-Jia; Shen, Yi; Jin, Ming
2016-12-01
A new technique is proposed to reduce the computational complexity of the multiple signal classification (MUSIC) algorithm for direction-of-arrival (DOA) estimation using a uniform linear array (ULA). The steering vector of the ULA is reconstructed as the Kronecker product of two other steering vectors, and a new cost function with spatial aliasing is derived. Thanks to the estimation ambiguity of this spatial aliasing, mirror angles mathematically related to the true DOAs are generated, based on which the full spectral search involved in the MUSIC algorithm is compressed into a limited angular sector. Further complexity analysis and performance studies are conducted by computer simulations, which demonstrate that the proposed estimator requires an extremely reduced computational burden while showing accuracy similar to the standard MUSIC.
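For context, a NumPy sketch of the standard full-spectral-search MUSIC estimator for a ULA, the baseline whose search the proposed method compresses; the reduced-search algorithm itself is not reproduced. Array geometry, source angles, and noise levels are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(6)

M, d = 8, 0.5                                 # sensors and element spacing in wavelengths
true_doas = np.deg2rad([-20.0, 35.0])
snapshots, noise_std = 200, 0.1

def steering(theta):
    return np.exp(-2j * np.pi * d * np.arange(M) * np.sin(theta))

# Simulate array snapshots: X = A S + noise.
A = np.column_stack([steering(t) for t in true_doas])
S = rng.normal(size=(2, snapshots)) + 1j * rng.normal(size=(2, snapshots))
X = A @ S + noise_std * (rng.normal(size=(M, snapshots)) + 1j * rng.normal(size=(M, snapshots)))

# Sample covariance and its noise subspace.
R = X @ X.conj().T / snapshots
w, V = np.linalg.eigh(R)                      # eigenvalues in ascending order
En = V[:, : M - len(true_doas)]               # noise-subspace eigenvectors

# Full spectral search over candidate angles (the step the proposed method compresses).
grid = np.deg2rad(np.linspace(-90, 90, 1801))
p_music = np.array([1.0 / np.linalg.norm(En.conj().T @ steering(t)) ** 2 for t in grid])

# Pick the two largest local maxima of the pseudospectrum.
is_peak = (p_music[1:-1] > p_music[:-2]) & (p_music[1:-1] > p_music[2:])
peak_idx = np.where(is_peak)[0] + 1
top2 = peak_idx[np.argsort(p_music[peak_idx])[-2:]]
print("estimated DOAs (deg):", np.sort(np.rad2deg(grid[top2])).round(1))
```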
Hu, Wenjun; Chung, Fu-Lai; Wang, Shitong
2012-03-01
Although pattern classification has been extensively studied in the past decades, how to effectively solve the corresponding training on large datasets is a problem that still requires particular attention. Many kernelized classification methods, such as SVM and SVDD, can be formulated as the corresponding quadratic programming (QP) problems, but computing the associated kernel matrices requires O(n^2) (or even up to O(n^3)) computational complexity, where n is the size of the training patterns, which heavily limits the applicability of these methods for large datasets. In this paper, a new classification method called the maximum vector-angular margin classifier (MAMC) is first proposed based on the vector-angular margin to find an optimal vector c in the pattern feature space, and all the testing patterns can be classified in terms of the maximum vector-angular margin ρ between the vector c and all the training data points. Accordingly, it is proved that the kernelized MAMC can be equivalently formulated as the kernelized Minimum Enclosing Ball (MEB), which leads to a distinctive merit of MAMC, i.e., it has the flexibility of controlling the sum of support vectors like ν-SVC and may be extended to a maximum vector-angular margin core vector machine (MAMCVM) by connecting the core vector machine (CVM) method with MAMC such that the corresponding fast training on large datasets can be effectively achieved. Experimental results on artificial and real datasets are provided to validate the power of the proposed methods.
Protein sequence comparison based on K-string dictionary.
Yu, Chenglong; He, Rong L; Yau, Stephen S-T
2013-10-25
The current K-string-based protein sequence comparisons require large amounts of computer memory because the dimension of the protein vector representation grows exponentially with K. In this paper, we propose a novel concept, the "K-string dictionary", to solve this high-dimensional problem. It allows us to use a much lower-dimensional K-string-based frequency or probability vector to represent a protein, and thus significantly reduces the computer memory requirements for implementation. Furthermore, based on this new concept, we use Singular Value Decomposition to analyze real protein datasets, and the improved protein vector representation allows us to obtain accurate gene trees.
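A short Python sketch of the dictionary idea: instead of the full 20^K-dimensional K-string space, only the K-strings that actually occur in the dataset index the frequency vectors, which are then analyzed with SVD. The toy sequences are hypothetical, and this is only an illustration of the representation, not the paper's pipeline.

```python
import numpy as np

AMINO = "ACDEFGHIKLMNPQRSTVWY"
K = 3

def kmer_counts(seq, k=K):
    """Count overlapping K-strings (k-mers) in a protein sequence."""
    counts = {}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        counts[kmer] = counts.get(kmer, 0) + 1
    return counts

# Toy protein fragments (hypothetical).
seqs = ["MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
        "MKTAYIAKQRQISFVKSHFARQLEERLGLIEVQ",
        "GSHMSDNEDNFDGDDFDDVEEDEGLDDLENAEE"]
counts = [kmer_counts(s) for s in seqs]

# The full K-string representation needs 20**K dimensions; the "K-string dictionary"
# keeps only the strings that actually occur, giving a much smaller vector.
full_dim = len(AMINO) ** K
dictionary = sorted(set().union(*counts))
X = np.array([[c.get(k, 0) for k in dictionary] for c in counts], dtype=float)
X /= X.sum(axis=1, keepdims=True)             # frequency (probability) vectors

print(f"full dimension {full_dim} vs dictionary dimension {len(dictionary)}")

# SVD of the small frequency matrix yields a refined low-rank representation.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
print("singular values:", s.round(3))
```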
Estimating Thruster Impulses From IMU and Doppler Data
NASA Technical Reports Server (NTRS)
Lisano, Michael E.; Kruizinga, Gerhard L.
2009-01-01
A computer program implements a thrust impulse measurement (TIM) filter, which processes data on changes in velocity and attitude of a spacecraft to estimate the small impulsive forces and torques exerted by the thrusters of the spacecraft reaction control system (RCS). The velocity-change data are obtained from line-of-sight velocity data from Doppler measurements made from the Earth. The attitude-change data are telemetered from an inertial measurement unit (IMU) aboard the spacecraft. The TIM filter estimates the three-axis thrust vector for each RCS thruster, thereby enabling reduction of cumulative navigation error attributable to inaccurate prediction of thrust vectors. The filter has been augmented with a simple mathematical model to compensate for large temperature fluctuations in the spacecraft thruster catalyst bed in order to estimate thrust more accurately at deadbanding cold-firing levels. Also, rigorous consider-covariance estimation is applied in the TIM to account for the expected uncertainty in the moment of inertia and the location of the center of gravity of the spacecraft. The TIM filter was built with, and depends upon, a sigma-point consider-filter algorithm implemented in a Python-language computer program.
NASA Astrophysics Data System (ADS)
Liu, Shaoyong; Gu, Hanming; Tang, Yongjie; Bingkai, Han; Wang, Huazhong; Liu, Dingjin
2018-04-01
Angle-domain common image-point gathers (ADCIGs) can alleviate the limitations of common image-point gathers in the offset domain, and have been widely used for velocity inversion and amplitude variation with angle (AVA) analysis. We propose an effective algorithm for generating ADCIGs in transversely isotropic (TI) media based on the gradient of traveltime by Kirchhoff pre-stack depth migration (KPSDM), as the dynamic programming method for computing the traveltime in TI media does not suffer from the limitations of shadow zones and traveltime interpolation. Meanwhile, we present a specific implementation strategy for ADCIG extraction via KPSDM. Three major steps are included in the presented strategy: (1) traveltime computation using a dynamic programming approach in TI media; (2) slowness vector calculation from the gradient of the previously computed traveltime table; (3) construction of illumination vectors and subsurface angles in the migration process. Numerical examples are included to demonstrate the effectiveness of our approach, which shows its potential for subsequent tomographic velocity inversion and AVA analysis.
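A toy NumPy sketch of step (2) above: the slowness vector is taken as the spatial gradient of a traveltime table, from which a propagation angle follows. A homogeneous, isotropic traveltime field stands in for the dynamic-programming traveltime computed in TI media, and the velocity, grid, and source position are assumptions.

```python
import numpy as np

# Toy traveltime table T(x, z) for a homogeneous medium with velocity v (assumed),
# standing in for the dynamic-programming traveltime computed in TI media.
v = 2000.0                                   # m/s
dx = dz = 10.0                               # grid spacing in meters
x = np.arange(0.0, 2000.0, dx)
z = np.arange(0.0, 1000.0, dz)
X, Z = np.meshgrid(x, z, indexing="ij")
src = (1000.0, 0.0)                          # source at the surface
T = np.hypot(X - src[0], Z - src[1]) / v

# Step (2): slowness vector as the gradient of the traveltime table.
dTdx, dTdz = np.gradient(T, dx, dz)
slowness = np.stack([dTdx, dTdz], axis=-1)   # shape (nx, nz, 2)

# Propagation (illumination) direction and angle from vertical at each image point.
angle = np.degrees(np.arctan2(dTdx, dTdz))
print("squared slowness magnitude ~ 1/v^2:",
      float(slowness[50, 50] @ slowness[50, 50]), "vs", (1 / v) ** 2)
print("angle at a sample point (deg):", round(float(angle[150, 50]), 1))
```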
Jung, Inuk; Jo, Kyuri; Kang, Hyejin; Ahn, Hongryul; Yu, Youngjae; Kim, Sun
2017-12-01
Identifying biologically meaningful gene expression patterns from time series gene expression data is important for understanding the underlying biological mechanisms. To identify significantly perturbed gene sets between different phenotypes, analysis of time series transcriptome data requires consideration of time and sample dimensions. Thus, the analysis of such time series data seeks gene sets that exhibit similar or different expression patterns between two or more sample conditions, constituting three-dimensional data, i.e. gene-time-condition. Computational complexity for analyzing such data is very high, compared to the already difficult NP-hard two-dimensional biclustering algorithms. Because of this challenge, traditional time series clustering algorithms are designed to capture co-expressed genes with similar expression patterns in two sample conditions. We present a triclustering algorithm, TimesVector, specifically designed for clustering three-dimensional time series data to capture distinctively similar or different gene expression patterns between two or more sample conditions. TimesVector identifies clusters with distinctive expression patterns in three steps: (i) dimension reduction and clustering of time-condition concatenated vectors, (ii) post-processing clusters for detecting similar and distinct expression patterns and (iii) rescuing genes from unclassified clusters. Using four sets of time series gene expression data, generated by both microarray and high-throughput sequencing platforms, we demonstrated that TimesVector successfully detected biologically meaningful clusters of high quality. TimesVector improved the clustering quality compared to existing triclustering tools, and only TimesVector successfully detected clusters with differential expression patterns across conditions. The TimesVector software is available at http://biohealth.snu.ac.kr/software/TimesVector/. Contact: sunkim.bioinfo@snu.ac.kr. Supplementary data are available at Bioinformatics online.
Spatial data analytics on heterogeneous multi- and many-core parallel architectures using python
Laura, Jason R.; Rey, Sergio J.
2017-01-01
Parallel vector spatial analysis concerns the application of parallel computational methods to facilitate vector-based spatial analysis. The history of parallel computation in spatial analysis is reviewed, and this work is placed into the broader context of high-performance computing (HPC) and parallelization research. The rise of cyberinfrastructure and its manifestation in spatial analysis as CyberGIScience is seen as a main driver of renewed interest in parallel computation in the spatial sciences. Key problems in spatial analysis that have been the focus of parallel computing are covered. Chief among these are spatial optimization problems, computational geometric problems including polygonization and spatial contiguity detection, the use of Markov chain Monte Carlo simulation in spatial statistics, and parallel implementations of spatial econometric methods. Future directions for research on parallelization in computational spatial analysis are outlined.
Varol, Altan; Basa, Selçuk
2009-06-01
Maxillary distraction osteogenesis is a challenging procedure when it is performed with internal submerged distractors because accurate distraction vectors must be set. Five patients with severe maxillary retrognathy were planned with Mimics 10.01 CMF and Simplant 10.01 software. Distraction vectors and rods of distractors were arranged in a 3D environment and on STL models. All patients were operated on under general anaesthesia and complete Le Fort I downfracture was performed. All distractions were performed according to the oriented vectors. All patients achieved stable occlusion and a satisfactory aesthetic outcome at the end of the treatment period. Preoperative bending of internal maxillary distractors prevents significant loss of operation time. 3D computer-aided surgical simulation and model surgery provide accurate orientation of distraction vectors for premaxillary and internal trans-sinusoidal maxillary distraction. The combination of virtual surgical simulation and stereolithographic model surgery can be validated as an effective method of preoperative planning for complicated maxillofacial surgery cases.
Computational mechanics analysis tools for parallel-vector supercomputers
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.; Nguyen, Duc T.; Baddourah, Majdi; Qin, Jiangning
1993-01-01
Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigensolution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization search analysis and domain decomposition. The source code for many of these algorithms is available.
A Discriminant Distance Based Composite Vector Selection Method for Odor Classification
Choi, Sang-Il; Jeong, Gu-Min
2014-01-01
We present a composite vector selection method for an effective electronic nose system that performs well even in noisy environments. Each composite vector generated from an electronic nose data sample is evaluated by computing the discriminant distance. By quantitatively measuring the amount of discriminative information in each composite vector, composite vectors containing informative variables can be distinguished, and the final composite features for odor classification are extracted using the selected composite vectors. Using only the informative composite vectors, rather than all the generated composite vectors, also helps to extract better composite features. Experimental results with different volatile organic compound data show that the proposed system has good classification performance even in a noisy environment compared to other methods. PMID:24747735
Killing-Yano tensors in spaces admitting a hypersurface orthogonal Killing vector
NASA Astrophysics Data System (ADS)
Garfinkle, David; Glass, E. N.
2013-03-01
Methods are presented for finding Killing-Yano tensors, conformal Killing-Yano tensors, and conformal Killing vectors in spacetimes with a hypersurface orthogonal Killing vector. These methods are similar to a method developed by the authors for finding Killing tensors. In all cases one decomposes both the tensor and the equation it satisfies into pieces along the Killing vector and pieces orthogonal to the Killing vector. Solving the separate equations that result from this decomposition requires less computing than integrating the original equation. In each case, examples are given to illustrate the method.
An adaptive vector quantization scheme
NASA Technical Reports Server (NTRS)
Cheung, K.-M.
1990-01-01
Vector quantization is known to be an effective compression scheme to achieve a low bit rate so as to minimize communication channel bandwidth and also to reduce digital memory storage while maintaining the necessary fidelity of the data. However, the large number of computations required in vector quantizers has been a handicap in using vector quantization for low-rate source coding. An adaptive vector quantization algorithm is introduced that is inherently suitable for simple hardware implementation because it has a simple architecture. It allows fast encoding and decoding because it requires only addition and subtraction operations.
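For illustration, a minimal Python sketch of vector-quantization encoding with a sum-of-absolute-differences distortion, which needs only additions and subtractions, in the spirit of the simple-hardware motivation above. This is not the adaptive algorithm of the report; the codebook and block are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(7)

dim, n_codes = 4, 16
codebook = rng.random((n_codes, dim))        # hypothetical codebook

def encode(block):
    """Nearest-codeword search with sum-of-absolute-differences distortion
    (only additions and subtractions, no multiplications)."""
    sad = np.abs(codebook - block).sum(axis=1)
    return int(np.argmin(sad))

def decode(index):
    return codebook[index]

block = rng.random(dim)                      # one source vector (e.g., a small image block)
idx = encode(block)
print("index:", idx, "distortion:", float(np.abs(decode(idx) - block).sum()))
```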
Soto-Quiros, Pablo
2015-01-01
This paper presents a parallel implementation of a kind of discrete Fourier transform (DFT): the vector-valued DFT. The vector-valued DFT is a novel tool to analyze the spectra of vector-valued discrete-time signals. This parallel implementation is developed in terms of a mathematical framework based on a set of block matrix operations. These block matrix operations contribute to the analysis, design, and implementation of parallel algorithms in multicore processors. In this work, an implementation and experimental investigation of the mathematical framework are performed using MATLAB with the Parallel Computing Toolbox. We found that there is an advantage to using multicore processors and a parallel computing environment to reduce the high execution time. Additionally, the speedup increases as the number of logical processors and the length of the signal increase.
Combinatorial vector fields and the valley structure of fitness landscapes.
Stadler, Bärbel M R; Stadler, Peter F
2010-12-01
Adaptive (downhill) walks are a computationally convenient way of analyzing the geometric structure of fitness landscapes. Their inherently stochastic nature has limited their mathematical analysis, however. Here we develop a framework that interprets adaptive walks as deterministic trajectories in combinatorial vector fields and in return associate these combinatorial vector fields with weights that measure their steepness across the landscape. We show that the combinatorial vector fields and their weights have a product structure that is governed by the neutrality of the landscape. This product structure makes practical computations feasible. The framework presented here also provides an alternative, and mathematically more convenient, way of defining notions of valleys, saddle points, and barriers in landscape. As an application, we propose a refined approximation for transition rates between macrostates that are associated with the valleys of the landscape.
Development of software for the MSFC solar vector magnetograph
NASA Technical Reports Server (NTRS)
Kineke, Jack
1996-01-01
The Marshall Space Flight Center Solar Vector Magnetograph is a special purpose telescope used to measure the vector magnetic field in active areas on the surface of the sun. This instrument measures the linear and circular polarization intensities (the Stokes vectors Q, U and V) produced by the Zeeman effect on a specific spectral line due to the solar magnetic field from which the longitudinal and transverse components of the magnetic field may be determined. Beginning in 1990 as a Summer Faculty Fellow in project JOVE and continuing under NASA Grant NAG8-1042, the author has been developing computer software to perform these computations, first using a DEC MicroVAX system equipped with a high speed array processor, and more recently using a DEC AXP/OSF system. This summer's work is a continuation of this development.
Using a multifrontal sparse solver in a high performance, finite element code
NASA Technical Reports Server (NTRS)
King, Scott D.; Lucas, Robert; Raefsky, Arthur
1990-01-01
We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the individual element forms and the solution of the global stiffness matrix, both of which are vectorized in high performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple-Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix, and full matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result is an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP.
Image segmentation using hidden Markov Gauss mixture models.
Pyun, Kyungsuk; Lim, Johan; Won, Chee Sun; Gray, Robert M
2007-07-01
Image segmentation is an important tool in image processing and can serve as an efficient front end to sophisticated algorithms and thereby simplify subsequent processing. We develop a multiclass image segmentation method using hidden Markov Gauss mixture models (HMGMMs) and provide examples of segmentation of aerial images and textures. HMGMMs incorporate supervised learning, fitting the observation probability distribution given each class by a Gauss mixture estimated using vector quantization with a minimum discrimination information (MDI) distortion. We formulate the image segmentation problem using a maximum a posteriori criterion and find the hidden states that maximize the posterior density given the observation. We estimate both the hidden Markov parameters and hidden states using a stochastic expectation-maximization algorithm. Our results demonstrate that HMGMM provides better classification in terms of Bayes risk and spatial homogeneity of the classified objects than do several popular methods, including classification and regression trees, learning vector quantization, causal hidden Markov models (HMMs), and multiresolution HMMs. The computational load of HMGMM is similar to that of the causal HMM.
Nonlinear optimization method of ship floating condition calculation in wave based on vector
NASA Astrophysics Data System (ADS)
Ding, Ning; Yu, Jian-xing
2014-08-01
Ship floating condition in regular waves is calculated. New equations controlling any ship's floating condition are proposed using vector operations. The resulting formulation is a nonlinear optimization problem which can be solved using the penalty function method with constant coefficients, and the solving process is accelerated by dichotomy. During the solving process, the ship's displacement and buoyancy center are calculated by integration over the ship surface according to the waterline. The ship surface is described using an accumulative chord length theory in order to determine the displacement, the buoyancy center, and the waterline. The draught forming the waterline at each station can be found by calculating the intersection of the ship surface and the wave surface. The results of an example indicate that this method is exact and efficient. It can calculate the ship floating condition in regular waves as well as simplify the calculation and improve the computational efficiency and precision of the results.
PS3 CELL Development for Scientific Computation and Research
NASA Astrophysics Data System (ADS)
Christiansen, M.; Sevre, E.; Wang, S. M.; Yuen, D. A.; Liu, S.; Lyness, M. D.; Broten, M.
2007-12-01
The Cell processor is one of the most powerful processors on the market, and researchers in the earth sciences may find its parallel architecture to be very useful. A Cell processor, with 7 cores, can easily be obtained for experimentation by purchasing a PlayStation 3 (PS3) and installing Linux and the IBM SDK. Each core of the PS3 is capable of 25 GFLOPS, giving a potential limit of 150 GFLOPS when using all 6 SPUs (synergistic processing units) with vectorized algorithms. We have used the Cell's computational power to create a program which takes simulated tsunami datasets, parses them, and returns a colorized height field image using ray casting techniques. As expected, the time required to create an image is inversely proportional to the number of SPUs used. We believe that this trend will continue when multiple PS3s are chained using OpenMP functionality and are in the process of researching this. By using the Cell to visualize tsunami data, we have found that its greatest feature is its power. This aligns well with the needs of the scientific community, where the limiting factor is time. Any algorithm, such as the heat equation, that can be subdivided into multiple parts can take advantage of the PS3 Cell's ability to split the computations across the 6 SPUs, reducing the required run time to one sixth. Further vectorization of the code can allow for 4 simultaneous floating point operations by using the SIMD (single instruction multiple data) capabilities of the SPU, increasing efficiency 24 times.
ERIC Educational Resources Information Center
Yaacob, Yuzita; Wester, Michael; Steinberg, Stanly
2010-01-01
This paper presents a prototype of a computer learning assistant ILMEV (Interactive Learning-Mathematica Enhanced Vector calculus) package with the purpose of helping students to understand the theory and applications of integration in vector calculus. The main problem for students using Mathematica is to convert a textbook description of a…
Calibration Test Set for a Phase-Comparison Digital Tracker
NASA Technical Reports Server (NTRS)
Boas, Amy; Li, Samuel; McMaster, Robert
2007-01-01
An apparatus that generates four signals at a frequency of 7.1 GHz having precisely controlled relative phases and equal amplitudes has been designed and built. This apparatus is intended mainly for use in computer-controlled automated calibration and testing of a phase-comparison digital tracker (PCDT) that measures the relative phases of replicas of the same X-band signal received by four antenna elements in an array. (The relative direction of incidence of the signal on the array is then computed from the relative phases.) The present apparatus can also be used to generate precisely phased signals for steering a beam transmitted from a phased antenna array. The apparatus includes a 7.1-GHz signal generator, the output of which is fed to a four-way splitter. Each of the four splitter outputs is attenuated by 10 dB and fed as input to a vector modulator, wherein DC bias voltages are used to control the in-phase (I) and quadrature (Q) signal components. The bias voltages are generated by digital-to-analog-converter circuits on a control board that receives its digital control input from a computer running a LabVIEW program. The outputs of the vector modulators are further attenuated by 10 dB, then presented at high-grade radio-frequency connectors. The attenuation reduces the effects of changing mismatch and reflections. The apparatus was calibrated in a process in which the bias voltages were first stepped through all possible IQ settings. Then, in a reverse interpolation performed using MATLAB software, a lookup table containing 3,600 IQ settings, representing equal amplitude and phase increments of 0.1°, was created for each vector modulator. During operation of the apparatus, these lookup tables are used in calibrating the PCDT.
An efficient and portable SIMD algorithm for charge/current deposition in Particle-In-Cell codes
NASA Astrophysics Data System (ADS)
Vincenti, H.; Lobet, M.; Lehe, R.; Sasanka, R.; Vay, J.-L.
2017-01-01
In current computer architectures, data movement (from die to network) is by far the most energy-consuming part of an algorithm (≈20 pJ/word on-die to ≈10,000 pJ/word on the network). To increase memory locality at the hardware level and reduce energy consumption related to data movement, future exascale computers tend to use many-core processors on each compute node that will have a reduced clock speed to allow for efficient cooling. To compensate for the frequency decrease, machine vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD register length is expected to double every four years. As a consequence, Particle-In-Cell (PIC) codes will have to achieve good vectorization to fully take advantage of these upcoming architectures. In this paper, we present a new algorithm that allows for efficient and portable SIMD vectorization of current/charge deposition routines that are, along with the field gathering routines, among the most time consuming parts of the PIC algorithm. Our new algorithm uses a particular data structure that takes into account memory alignment constraints and avoids gather/scatter instructions that can significantly affect vectorization performance on current CPUs. The new algorithm was successfully implemented in the 3D skeleton PIC code PICSAR and tested on Haswell Xeon processors (AVX2, 256-bit wide data registers). Results show a factor of 2 to 2.5 speed-up in double precision for particle shape factors of orders 1-3. The new algorithm can be applied as is on future KNL (Knights Landing) architectures that will include AVX-512 instruction sets with 512-bit register lengths (8 doubles/16 singles).
NASA Astrophysics Data System (ADS)
Heidari, Morteza; Zargari Khuzani, Abolfazl; Danala, Gopichandh; Mirniaharikandehei, Seyedehnafiseh; Qian, Wei; Zheng, Bin
2018-03-01
Both conventional and deep machine learning have been used to develop decision-support tools applied in medical imaging informatics. In order to take advantage of both conventional and deep learning approaches, this study aims to investigate the feasibility of applying a locality preserving projection (LPP) based feature regeneration algorithm to build a new machine learning classifier model to predict short-term breast cancer risk. First, a computer-aided image processing scheme was used to segment and quantify breast fibro-glandular tissue volume. Next, an initial set of 44 image features related to the bilateral mammographic tissue density asymmetry was computed. Then, an LPP-based feature combination method was applied to regenerate a new operational feature vector using a maximal variance approach. Last, a k-nearest neighborhood (KNN) algorithm based machine learning classifier using the LPP-generated new feature vectors was developed to predict breast cancer risk. A testing dataset involving negative mammograms acquired from 500 women was used. Among them, 250 were positive and 250 remained negative in the next subsequent mammography screening. Applied to this dataset, the LPP-generated feature vector reduced the number of features from 44 to 4. Using a leave-one-case-out validation method, the area under the ROC curve produced by the KNN classifier significantly increased from 0.62 to 0.68 (p < 0.05) and the odds ratio was 4.60 with a 95% confidence interval of [3.16, 6.70]. The study demonstrated that this new LPP-based feature regeneration approach enabled production of an optimal feature vector and yielded improved performance in assisting prediction of the risk of women having breast cancer detected in the next subsequent mammography screening.
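A very rough sketch of the project-then-classify pipeline above. The paper's LPP-based feature regeneration is not reproduced; a PCA projection is used here as a stand-in linear map from 44 features to 4, followed by a plain k-nearest-neighbour vote. All data, dimensions, and parameters are hypothetical, so the printed accuracy is meaningless beyond demonstrating the mechanics.

```python
import numpy as np

rng = np.random.default_rng(8)

# Hypothetical data: 500 cases, 44 mammographic density-asymmetry features.
X = rng.normal(size=(500, 44))
y = rng.integers(0, 2, 500)                  # 1 = cancer detected at next screening

# Stand-in linear projection to 4 features (PCA used here instead of the paper's LPP).
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
W = Vt[:4].T                                 # 44 -> 4 projection matrix
Z = Xc @ W

def knn_predict(train_z, train_y, z, k=5):
    """Plain k-nearest-neighbour majority vote in the projected feature space."""
    d = np.linalg.norm(train_z - z, axis=1)
    nearest = np.argsort(d)[:k]
    return int(train_y[nearest].sum() * 2 >= k)

# Leave-one-case-out style check on a few cases (labels here are random, so accuracy ~0.5).
correct = sum(knn_predict(np.delete(Z, i, 0), np.delete(y, i), Z[i]) == y[i] for i in range(50))
print("toy accuracy on 50 held-out cases:", correct / 50)
```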
Chen, Zhenyu; Li, Jianping; Wei, Liwei
2007-10-01
Recently, gene expression profiling using microarray techniques has been shown to be a promising tool to improve the diagnosis and treatment of cancer. Gene expression data contain a high level of noise and an overwhelming number of genes relative to the number of available samples. This poses a great challenge for machine learning and statistical techniques. Support vector machines (SVM) have been successfully used to classify gene expression data of cancer tissue. In the medical field, it is crucial to deliver a transparent decision process to the user. How to explain the computed solutions and present the extracted knowledge becomes a main obstacle for SVM. A multiple kernel support vector machine (MK-SVM) scheme, consisting of feature selection, rule extraction and prediction modeling, is proposed to improve the explanation capacity of SVM. In this scheme, we show that the feature selection problem can be translated into an ordinary multiple-parameter learning problem. A shrinkage approach, 1-norm based linear programming, is proposed to obtain the sparse parameters and the corresponding selected features. We propose a novel rule extraction approach using the information provided by the separating hyperplane and support vectors to improve the generalization capacity and comprehensibility of rules and reduce the computational complexity. Two public gene expression datasets, the leukemia dataset and the colon tumor dataset, are used to demonstrate the performance of this approach. Using the small number of selected genes, MK-SVM achieves encouraging classification accuracy: more than 90% for both datasets. Moreover, very simple rules with linguistic labels are extracted. The rule sets have high diagnostic power because of their good classification performance.
NASA Technical Reports Server (NTRS)
Swisshelm, Julie M.
1989-01-01
An explicit flow solver, applicable to the hierarchy of model equations ranging from Euler to full Navier-Stokes, is combined with several techniques designed to reduce computational expense. The computational domain consists of local grid refinements embedded in a global coarse mesh, where the locations of these refinements are defined by the physics of the flow. Flow characteristics are also used to determine which set of model equations is appropriate for solution in each region, thereby reducing not only the number of grid points at which the solution must be obtained, but also the computational effort required to get that solution. Acceleration to steady-state is achieved by applying multigrid on each of the subgrids, regardless of the particular model equations being solved. Since each of these components is explicit, advantage can readily be taken of the vector- and parallel-processing capabilities of machines such as the Cray X-MP and Cray-2.
Design consideration in constructing high performance embedded Knowledge-Based Systems (KBS)
NASA Technical Reports Server (NTRS)
Dalton, Shelly D.; Daley, Philip C.
1988-01-01
As the hardware trends for artificial intelligence (AI) involve more and more complexity, the process of optimizing the computer system design for a particular problem will also increase in complexity. Space applications of knowledge based systems (KBS) will often require an ability to perform both numerically intensive vector computations and real time symbolic computations. Although parallel machines can theoretically achieve the speeds necessary for most of these problems, if the application itself is not highly parallel, the machine's power cannot be utilized. A scheme is presented which will provide the computer systems engineer with a tool for analyzing machines with various configurations of array, symbolic, scalar, and multiprocessors. High speed networks and interconnections make customized, distributed, intelligent systems feasible for the application of AI in space. The method presented can be used to optimize such AI system configurations and to make comparisons between existing computer systems. It is an open question whether or not, for a given mission requirement, a suitable computer system design can be constructed for any amount of money.
NASA Technical Reports Server (NTRS)
Kumar, A.; Rudy, D. H.; Drummond, J. P.; Harris, J. E.
1982-01-01
Several two- and three-dimensional external and internal flow problems solved on the STAR-100 and CYBER-203 vector processing computers are described. The flow field was described by the full Navier-Stokes equations, which were then solved by explicit finite-difference algorithms. Problem results and computer system requirements are presented. Program organization and data base structure for three-dimensional computer codes, which will eliminate or reduce page faulting, are discussed. Storage requirements for three-dimensional codes are reduced by calculating transformation metric data at each step. As a result, the number of in-core grid points was increased by 50% to 150,000, with a 10% increase in execution time. An assessment of current and future machine requirements shows that even on the CYBER-205 computer only a few problems can be solved realistically. Estimates reveal that the present situation is more storage limited than compute rate limited, but advancements in both storage and speed are essential to realistically calculate three-dimensional flow.
3D Model Generation From the Engineering Drawing
NASA Astrophysics Data System (ADS)
Vaský, Jozef; Eliáš, Michal; Bezák, Pavol; Červeňanská, Zuzana; Izakovič, Ladislav
2010-01-01
The contribution deals with the transformation of engineering drawings in paper form into a 3D computer representation. A 3D computer model can be further processed in a CAD/CAM system, it can be modified and archived, and a technical drawing can then be generated from it as well. The transformation process from the paper form to digital data is a complex and difficult one, particularly owing to the different types of drawings, the forms of displayed objects, and the errors and deviations from technical standards that are encountered. The algorithm for generating a 3D model from an orthogonal vector input representing a simplified technical drawing of a rotational part is described in this contribution. The algorithm was experimentally implemented as an ObjectARX application in the AutoCAD system, and a test sample representing the rotational part was used for verification.
Predicting Error Bars for QSAR Models
NASA Astrophysics Data System (ADS)
Schroeter, Timon; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert
2007-09-01
Unfavorable physicochemical properties often cause drug failures. It is therefore important to take lipophilicity and water solubility into account early on in lead discovery. This study presents log D7 models built using Gaussian Process regression, Support Vector Machines, decision trees and ridge regression algorithms, based on 14556 drug discovery compounds of Bayer Schering Pharma. A blind test was conducted using 7013 new measurements from the most recent months. We also present independent evaluations using public data. Apart from accuracy, we discuss the quality of error bars that can be computed by Gaussian Process models, and by ensemble and distance-based techniques for the other modelling approaches.
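A minimal sketch of per-compound error bars from a Gaussian Process regressor is shown below; the descriptors, targets, and kernel are synthetic stand-ins, not the Bayer Schering Pharma data or the authors' models.

```python
# Minimal illustration of per-compound error bars from a Gaussian Process
# regressor; descriptors and targets are synthetic stand-ins, not log D7 data.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 8))                        # 8 hypothetical descriptors
y = X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.normal(size=200)

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True).fit(X, y)
mean, std = gp.predict(rng.normal(size=(5, 8)), return_std=True)
print(np.round(mean, 2), np.round(std, 2))           # std acts as the error bar
```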
Acceleration of planes segmentation using normals from previous frame
NASA Astrophysics Data System (ADS)
Gritsenko, Pavel; Gritsenko, Igor; Seidakhmet, Askar; Abduraimov, Azizbek
2017-12-01
One of the major problems in the integration of robots is making them able to function in a human environment. In terms of computer vision, the major feature of human-made rooms is the presence of planes [1, 2, 20, 21, 23]. In this article, we present an algorithm designed to increase the speed of plane segmentation. The algorithm uses information about the location of a plane and its normal vector to speed up the segmentation process in the next frame. In conjunction with it, we address such aspects of ICP SLAM as performance and map representation.
NASA Astrophysics Data System (ADS)
Mitri, Farid G.
2018-01-01
Generalized solutions of vector Airy light-sheets, adjustable per their derivative order m, are introduced stemming from the Lorenz gauge condition and Maxwell's equations using the angular spectrum decomposition method. The Cartesian components of the incident radiated electric, magnetic and time-averaged Poynting vector fields in free space (excluding evanescent waves) are determined and computed with particular emphasis on the derivative order of the Airy light-sheet and the polarization of the magnetic vector potential forming the beam. Negative transverse time-averaged Poynting vector components can arise, while the longitudinal counterparts are always positive. Moreover, the analysis is extended to compute the optical radiation force and spin torque vector components on a lossless dielectric prolate subwavelength spheroid in the framework of the electric dipole approximation. The results show that negative forces and spin torque sign reversals arise depending on the derivative order of the beam, the polarization of the magnetic vector potential, and the orientation of the subwavelength prolate spheroid in space. The spin torque sign reversal suggests that counter-clockwise or clockwise rotations around the center of mass of the subwavelength spheroid can occur. The results find useful applications in single Airy light-sheet tweezers, particle manipulation, handling, and rotation, to name a few examples.
Computational mechanics analysis tools for parallel-vector supercomputers
NASA Technical Reports Server (NTRS)
Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.; Qin, J.
1993-01-01
Computational algorithms for structural analysis on parallel-vector supercomputers are reviewed. These parallel algorithms, developed by the authors, are for the assembly of structural equations, 'out-of-core' strategies for linear equation solution, massively distributed-memory equation solution, unsymmetric equation solution, general eigen-solution, geometrically nonlinear finite element analysis, design sensitivity analysis for structural dynamics, optimization algorithm and domain decomposition. The source code for many of these algorithms is available from NASA Langley.
Modeling node bandwidth limits and their effects on vector combining algorithms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Littlefield, R.J.
Each node in a message-passing multicomputer typically has several communication links. However, the maximum aggregate communication speed of a node is often less than the sum of its individual link speeds. Such computers are called node bandwidth limited (NBL). The NBL constraint is important when choosing algorithms because it can change the relative performance of different algorithms that accomplish the same task. This paper introduces a model of communication performance for NBL computers and uses the model to analyze the overall performance of three algorithms for vector combining (global sum) on the Intel Touchstone DELTA computer. Each of the three algorithms is found to be at least 33% faster than the other two for some combinations of machine size and vector length. The NBL constraint is shown to significantly affect the conditions under which each algorithm is fastest.
Matrix-vector multiplication using digital partitioning for more accurate optical computing
NASA Technical Reports Server (NTRS)
Gary, C. K.
1992-01-01
Digital partitioning offers a flexible means of increasing the accuracy of an optical matrix-vector processor. This algorithm can be implemented with the same architecture required for a purely analog processor, which gives optical matrix-vector processors the ability to perform high-accuracy calculations at speeds comparable to or greater than those of electronic computers, as well as the ability to perform analog operations at a much greater speed. Digital partitioning is compared with digital multiplication by analog convolution, residue number systems, and redundant number representation in terms of the size and the speed required for an equivalent throughput, as well as in terms of the hardware requirements. Digital partitioning and digital multiplication by analog convolution are found to be the most efficient algorithms if coding time and hardware are considered, and the architecture for digital partitioning permits the use of analog computations to provide the greatest throughput for a single processor.
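The digit-by-digit recombination underlying digital partitioning can be sketched numerically as follows; the base, digit count, data, and the "analog" pass are assumptions standing in for the optical hardware.

```python
# Sketch of digital partitioning: an "analog" processor that only multiplies
# small integers accurately is emulated digit-by-digit, and the partial
# products are recombined digitally (base, digit count, and data assumed).
import numpy as np

BASE = 4                         # digits processed per partition channel
N_DIGITS = 4                     # operands up to BASE**N_DIGITS - 1

def to_digits(x):
    """Decompose non-negative integers into N_DIGITS base-BASE digits."""
    return np.stack([(x // BASE**k) % BASE for k in range(N_DIGITS)], axis=-1)

def analog_matvec(A_digit, v_digit):
    """Stand-in for one low-accuracy analog matrix-vector pass."""
    return A_digit @ v_digit

def partitioned_matvec(A, v):
    Ad, vd = to_digits(A), to_digits(v)          # shapes (m, n, D) and (n, D)
    total = np.zeros(A.shape[0], dtype=np.int64)
    for i in range(N_DIGITS):
        for j in range(N_DIGITS):
            total += BASE**(i + j) * analog_matvec(Ad[:, :, i], vd[:, j])
    return total

rng = np.random.default_rng(3)
A = rng.integers(0, BASE**N_DIGITS, size=(3, 5))
v = rng.integers(0, BASE**N_DIGITS, size=5)
assert np.array_equal(partitioned_matvec(A, v), A @ v)
print(partitioned_matvec(A, v))
```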
An implementation of a tree code on a SIMD, parallel computer
NASA Technical Reports Server (NTRS)
Olson, Kevin M.; Dorband, John E.
1994-01-01
We describe a fast tree algorithm for gravitational N-body simulation on SIMD parallel computers. The tree construction uses fast, parallel sorts. The sorted lists are recursively divided along their x, y and z coordinates. This data structure is a completely balanced tree (i.e., each particle is paired with exactly one other particle) and maintains good spatial locality. An implementation of this tree-building algorithm on a 16k-processor MasPar MP-1 performs well and constitutes only a small fraction (approximately 15%) of the entire cycle of finding the accelerations. Each node in the tree is treated as a monopole. The tree search and the summation of accelerations also perform well. During the tree search, node data that is needed from another processor is simply fetched. Roughly 55% of the tree search time is spent in communications between processors. We apply the code to two problems of astrophysical interest. The first is a simulation of the close passage of two gravitationally interacting disk galaxies using 65,636 particles. We also simulate the formation of structure in an expanding model universe using 1,048,576 particles. Our code attains speeds comparable to one head of a Cray Y-MP, so single instruction, multiple data (SIMD) computers can be used for these simulations. The cost/performance ratio for SIMD machines like the MasPar MP-1 makes them an extremely attractive alternative to either vector processors or large multiple instruction, multiple data (MIMD) parallel computers. With further optimizations (e.g., more careful load balancing), speeds in excess of today's vector processing computers should be possible.
Supercomputer optimizations for stochastic optimal control applications
NASA Technical Reports Server (NTRS)
Chung, Siu-Leung; Hanson, Floyd B.; Xu, Huihuang
1991-01-01
Supercomputer optimizations for a computational method of solving stochastic, multibody, dynamic programming problems are presented. The computational method is valid for a general class of optimal control problems that are nonlinear, multibody dynamical systems, perturbed by general Markov noise in continuous time, i.e., nonsmooth Gaussian as well as jump Poisson random white noise. Optimization techniques for vector multiprocessors or vectorizing supercomputers include advanced data structures, loop restructuring, loop collapsing, blocking, and compiler directives. These advanced computing techniques and supercomputing hardware help alleviate Bellman's curse of dimensionality in dynamic programming computations by permitting the solution of large multibody problems. Possible applications include lumped flight dynamics models for uncertain environments, such as large scale and background random aerospace fluctuations.
Application of the scalar and vector potentials to the aerodynamics of jets
NASA Technical Reports Server (NTRS)
Russell, H. L.; Skifstad, J. G.
1973-01-01
The applicability of a method based on the Stokes potentials (vector and scalar potentials) to computations associated with the aerodynamics of jets was examined. It was found that the aerodynamic field near the nozzle could be represented and that the influence of a nonuniform velocity profile at the nozzle exit plane could be determined. Computations were also made for an axisymmetric jet exhausting into a quiescent atmosphere. The velocity at the axis of the jet and the locations of the half-velocity points along the jet yield accurate aerodynamic field computations. Inconsistencies among the different theoretical characterizations of jet flowfields are shown.
An efficient and portable SIMD algorithm for charge/current deposition in Particle-In-Cell codes
Vincenti, H.; Lobet, M.; Lehe, R.; ...
2016-09-19
In current computer architectures, data movement (from die to network) is by far the most energy-consuming part of an algorithm (≈20 pJ/word on-die to ≈10,000 pJ/word on the network). To increase memory locality at the hardware level and reduce energy consumption related to data movement, future exascale computers will tend to use many-core processors on each compute node that will have a reduced clock speed to allow for efficient cooling. To compensate for the frequency decrease, machine vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD register length is expected to double every four years. As a consequence, Particle-In-Cell (PIC) codes will have to achieve good vectorization to fully take advantage of these upcoming architectures. In this paper, we present a new algorithm that allows for efficient and portable SIMD vectorization of current/charge deposition routines that are, along with the field gathering routines, among the most time consuming parts of the PIC algorithm. Our new algorithm uses a particular data structure that takes into account memory alignment constraints and avoids gather/scatter instructions that can significantly affect vectorization performance on current CPUs. The new algorithm was successfully implemented in the 3D skeleton PIC code PICSAR and tested on Haswell Xeon processors (AVX2, 256-bit wide data registers). Results show a factor of ×2 to ×2.5 speed-up in double precision for particle shape factors of orders 1-3. The new algorithm can be applied as is on future KNL (Knights Landing) architectures that will include AVX-512 instruction sets with 512-bit register lengths (8 doubles/16 singles). Program summary Program Title: vec_deposition Program Files doi: http://dx.doi.org/10.17632/nh77fv9k8c.1 Licensing provisions: BSD 3-Clause Programming language: Fortran 90 External routines/libraries: OpenMP > 4.0 Nature of problem: Exascale architectures will have many-core processors per node with long vector data registers capable of performing one single instruction on multiple data during one clock cycle. Data register lengths are expected to double every four years, and this pushes for new portable solutions for efficiently vectorizing Particle-In-Cell codes on these future many-core architectures. One of the main hotspot routines of the PIC algorithm is the current/charge deposition, for which there is no efficient and portable vector algorithm. Solution method: Here we provide an efficient and portable vector algorithm for the current/charge deposition routines that uses a new data structure, which significantly reduces gather/scatter operations. Vectorization is controlled using OpenMP 4.0 compiler directives, which ensures portability across different architectures. Restrictions: Here we do not provide the full PIC algorithm with an executable but only vector routines for current/charge deposition. These scalar/vector routines can be used as library routines in your 3D Particle-In-Cell code. However, to get the best performance out of the vector routines you have to satisfy the two following requirements: (1) Your code should implement particle tiling (as explained in the manuscript) to allow for maximized cache reuse and reduce memory accesses that can hinder vector performance. The routines can be used directly on each particle tile.
(2) You should compile your code with a Fortran 90 compiler (e.g., Intel, GNU or Cray) and provide proper alignment flags and compiler alignment directives (more details in the README file).
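For orientation only, the sketch below shows a plain 1-D cloud-in-cell deposition in NumPy; it is not the PICSAR Fortran routine, but it makes visible the scatter (particles-to-grid) step that the paper's data structure reorganizes for SIMD, with grid size and charges chosen arbitrarily.

```python
# Illustrative 1-D cloud-in-cell charge deposition in NumPy (not the PICSAR
# routines).  np.add.at performs the unbuffered scatter so that particles
# landing in the same cell accumulate correctly, as a scalar loop would.
import numpy as np

def deposit_charge(x, q, nx, dx):
    """Linear-weight deposition of particle charges q at positions x."""
    rho = np.zeros(nx)
    cell = np.floor(x / dx).astype(int)
    w = x / dx - cell                       # fractional position within the cell
    np.add.at(rho, cell % nx, q * (1.0 - w))
    np.add.at(rho, (cell + 1) % nx, q * w)
    return rho

rng = np.random.default_rng(4)
x = rng.uniform(0.0, 1.0, size=100_000)     # particle positions on a periodic box
rho = deposit_charge(x, np.full_like(x, 1e-3), nx=64, dx=1.0 / 64)
print(rho.sum())                            # total deposited charge (= 100.0)
```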
Intelligent earthquake data processing for global adjoint tomography
NASA Astrophysics Data System (ADS)
Chen, Y.; Hill, J.; Li, T.; Lei, W.; Ruan, Y.; Lefebvre, M. P.; Tromp, J.
2016-12-01
Due to the increased computational capability afforded by modern and future computing architectures, the seismology community is demanding a more comprehensive understanding of the full waveform information from the recorded earthquake seismograms. Global waveform tomography is a complex workflow that matches observed seismic data with synthesized seismograms by iteratively updating the earth model parameters based on the adjoint state method. This methodology allows us to compute a very accurate model of the earth's interior. The synthetic data are simulated by solving the wave equation in the entire globe using a spectral-element method. In order to ensure the inversion accuracy and stability, both the synthesized and observed seismograms must be carefully pre-processed. Because the scale of the inversion problem is extremely large and there is a very large volume of data to be both read and written, an efficient and reliable pre-processing workflow must be developed. We are investigating intelligent algorithms based on a machine-learning (ML) framework that will automatically tune parameters for the data processing chain. One straightforward application of ML in data processing is to classify all possible misfit calculation windows into usable and unusable ones, based on intelligent ML models such as neural networks, support vector machines or principal component analysis. The intelligent earthquake data processing framework will enable the seismology community to compute the global waveform tomography using seismic data from an arbitrarily large number of earthquake events in the fastest, most efficient way.
Cappella, Joseph N
2017-10-01
Simultaneous developments in big data, social media, and computational social science have set the stage for how we think about and understand interpersonal and mass communication. This article explores some of the ways that these developments generate 4 hypothetical "vectors" - directions - into the next generation of communication research. These vectors include developments in network analysis, modeling interpersonal and social influence, recommendation systems, and the blurring of distinctions between interpersonal and mass audiences through narrowcasting and broadcasting. The methods and research in these arenas are occurring in areas outside the typical boundaries of the communication discipline but engage classic, substantive questions in mass and interpersonal communication.
Vectorization for Molecular Dynamics on Intel Xeon Phi Coprocessors
NASA Astrophysics Data System (ADS)
Yi, Hongsuk
2014-03-01
Many modern processors are capable of exploiting data-level parallelism through the use of single instruction multiple data (SIMD) execution. The new Intel Xeon Phi coprocessor supports 512-bit vector registers for high performance computing. In this paper, we have developed a hierarchical parallelization scheme for accelerated molecular dynamics simulations with the Tersoff potentials for covalently bonded solid crystals on Intel Xeon Phi coprocessor systems. The scheme exploits multi-level parallelism, combining tightly coupled thread-level and task-level parallelism with the 512-bit vector registers. The simulation results show that the parallel performance of the SIMD implementations on the Xeon Phi is clearly superior to that of the x86 CPU architecture.
System and method for generating a relationship network
Franks, Kasian; Myers, Cornelia A; Podowski, Raf M
2015-05-05
A computer-implemented system and process for generating a relationship network is disclosed. The system provides a set of data items to be related and generates variable length data vectors to represent the relationships between the terms within each data item. The system can be used to generate a relationship network for documents, images, or any other type of file. This relationship network can then be queried to discover the relationships between terms within the set of data items.
NASA Technical Reports Server (NTRS)
Goorjian, Peter M.; Silberberg, Yaron; Kwak, Dochan (Technical Monitor)
1994-01-01
This paper will present results in computational nonlinear optics. An algorithm will be described that solves the full vector nonlinear Maxwell's equations exactly without the approximations that are currently made. Present methods solve a reduced scalar wave equation, namely the nonlinear Schrodinger equation, and neglect the optical carrier. Also, results will be shown of calculations of 2-D electromagnetic nonlinear waves computed by directly integrating in time the nonlinear vector Maxwell's equations. The results will include simulations of 'light bullet' like pulses. Here diffraction and dispersion will be counteracted by nonlinear effects. The time integration efficiently implements linear and nonlinear convolutions for the electric polarization, and can take into account such quantum effects as Kerr and Raman interactions. The present approach is robust and should permit modeling 2-D and 3-D optical soliton propagation, scattering, and switching directly from the full-vector Maxwell's equations.
Lysine acetylation sites prediction using an ensemble of support vector machine classifiers.
Xu, Yan; Wang, Xiao-Bo; Ding, Jun; Wu, Ling-Yun; Deng, Nai-Yang
2010-05-07
Lysine acetylation is an essentially reversible and highly regulated post-translational modification which regulates diverse protein properties. Experimental identification of acetylation sites is laborious and expensive. Hence, there is significant interest in the development of computational methods for reliable prediction of acetylation sites from amino acid sequences. In this paper we use an ensemble of support vector machine classifiers to perform this work. The experimentally determined lysine acetylation sites are extracted from the Swiss-Prot database and the scientific literature. Experimental results show that an ensemble of support vector machine classifiers outperforms a single support vector machine classifier and other computational methods such as PAIL and LysAcet on the problem of predicting lysine acetylation sites. The resulting method has been implemented in EnsemblePail, a web server for lysine acetylation site prediction available at http://www.aporc.org/EnsemblePail/. Copyright (c) 2010 Elsevier Ltd. All rights reserved.
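A small sketch of an ensemble of SVM classifiers on synthetic sequence-window features is given below; it is illustrative only, with assumed data and settings, and is not the EnsemblePail implementation.

```python
# Hedged sketch of an ensemble of SVM classifiers (bagging over SVCs) for a
# binary site / non-site problem; the window encoding and data are assumed.
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(5)
X = rng.normal(size=(600, 40))               # stand-in for encoded sequence windows
y = rng.integers(0, 2, size=600)             # 1 = acetylated lysine site

ensemble = BaggingClassifier(SVC(probability=True),
                             n_estimators=15, max_samples=0.8).fit(X, y)
print(ensemble.predict_proba(X[:3]))
```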
Method and system for efficient video compression with low-complexity encoder
NASA Technical Reports Server (NTRS)
Chen, Jun (Inventor); He, Dake (Inventor); Sheinin, Vadim (Inventor); Jagmohan, Ashish (Inventor); Lu, Ligang (Inventor)
2012-01-01
Disclosed are a method and system for video compression, wherein the video encoder has low computational complexity and high compression efficiency. The disclosed system comprises a video encoder and a video decoder, wherein the method for encoding includes the steps of converting a source frame into a space-frequency representation; estimating conditional statistics of at least one vector of space-frequency coefficients; estimating encoding rates based on the said conditional statistics; and applying Slepian-Wolf codes with the said computed encoding rates. The preferred method for decoding includes the steps of: generating a side-information vector of frequency coefficients based on previously decoded source data, encoder statistics, and previous reconstructions of the source frequency vector; and performing Slepian-Wolf decoding of at least one source frequency vector based on the generated side-information, the Slepian-Wolf code bits and the encoder statistics.
NASA Technical Reports Server (NTRS)
Goorjian, Peter M.; Silberberg, Yaron; Kwak, Dochan (Technical Monitor)
1995-01-01
This paper will present results in computational nonlinear optics. An algorithm will be described that solves the full vector nonlinear Maxwell's equations exactly, without the approximations that are currently made. Present methods solve a reduced scalar wave equation, namely the nonlinear Schrodinger equation, and neglect the optical carrier. Also, results will be shown of calculations of 2-D electromagnetic nonlinear waves computed by directly integrating in time the nonlinear vector Maxwell's equations. The results will include simulations of 'light bullet' like pulses. Here diffraction and dispersion will be counteracted by nonlinear effects. The time integration efficiently implements linear and nonlinear convolutions for the electric polarization, and can take into account such quantum effects as Kerr and Raman interactions. The present approach is robust and should permit modeling 2-D and 3-D optical soliton propagation, scattering, and switching directly from the full-vector Maxwell's equations.
Renormalizable Electrodynamics of Scalar and Vector Mesons. Part II
DOE R&D Accomplishments Database
Salam, Abdus; Delbourgo, Robert
1964-01-01
The "gauge" technique" for solving theories introduced in an earlier paper is applied to scalar and vector electrodynamics. It is shown that for scalar electrodynamics, there is no {lambda}φ*2φ2 infinity in the theory, while with conventional subtractions vector electrodynamics is completely finite. The essential ideas of the gauge technique are explained in section 3, and a preliminary set of rules for finite computation in vector electrodynamics is set out in Eqs. (7.28) - (7.34).
On-line range images registration with GPGPU
NASA Astrophysics Data System (ADS)
Będkowski, J.; Naruniec, J.
2013-03-01
This paper concerns the implementation of algorithms for two important aspects of modern 3D data processing: data registration and segmentation. The solution proposed for the first topic is based on 3D space decomposition, while the latter is based on image processing and local neighbourhood search. Data processing is implemented using NVIDIA compute unified device architecture (NVIDIA CUDA) parallel computation. The result of the segmentation is a coloured map where different colours correspond to different objects, such as walls, floor and stairs. The research is related to the problem of collecting 3D data with a RGB-D camera mounted on a rotating head, to be used in mobile robot applications. The data registration algorithm is aimed at on-line processing. The iterative closest point (ICP) approach is chosen as the registration method. Computations are based on a parallel fast nearest neighbour search. This procedure decomposes 3D space into cubic buckets and, therefore, the time of the matching is deterministic. The first data segmentation technique uses accelerometers integrated with the RGB-D sensor to obtain rotation compensation and an image processing method for defining pre-requisites of the known categories. The second technique uses the adapted nearest neighbour search procedure for obtaining normal vectors for each range point.
Fast Quaternion Attitude Estimation from Two Vector Measurements
NASA Technical Reports Server (NTRS)
Markley, F. Landis; Bauer, Frank H. (Technical Monitor)
2001-01-01
Many spacecraft attitude determination methods use exactly two vector measurements. The two vectors are typically the unit vector to the Sun and the Earth's magnetic field vector for coarse "sun-mag" attitude determination or unit vectors to two stars tracked by two star trackers for fine attitude determination. Existing closed-form attitude estimates based on Wahba's optimality criterion for two arbitrarily weighted observations are somewhat slow to evaluate. This paper presents two new fast quaternion attitude estimation algorithms using two vector observations, one optimal and one suboptimal. The suboptimal method gives the same estimate as the TRIAD algorithm, at reduced computational cost. Simulations show that the TRIAD estimate is almost as accurate as the optimal estimate in representative test scenarios.
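The suboptimal (TRIAD) solution mentioned above can be written in a few lines; the sketch below assumes exact, noise-free measurements and is only a schematic of the classic algorithm, not the paper's new estimators.

```python
# Sketch of the classic TRIAD attitude solution from two vector observations:
# reference vectors r1, r2 and their body-frame measurements b1, b2.
import numpy as np
from scipy.spatial.transform import Rotation

def triad(b1, b2, r1, r2):
    """Rotation matrix A with b = A r, anchored on the first (more trusted) vector."""
    def frame(v1, v2):
        t1 = v1 / np.linalg.norm(v1)
        t2 = np.cross(v1, v2)
        t2 /= np.linalg.norm(t2)
        return np.column_stack((t1, t2, np.cross(t1, t2)))
    return frame(b1, b2) @ frame(r1, r2).T

# Recover a known attitude from exact (noise-free) measurements.
A_true = Rotation.from_euler("zyx", [30, -10, 45], degrees=True).as_matrix()
r1 = np.array([1.0, 0.0, 0.0])                 # e.g. unit vector to the Sun
r2 = np.array([0.3, 0.9, 0.1])                 # e.g. magnetic field direction
print(np.allclose(triad(A_true @ r1, A_true @ r2, r1, r2), A_true))   # True
```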
Perspectives on the role of mobility, behavior, and time scales in the spread of diseases.
Castillo-Chavez, Carlos; Bichara, Derdei; Morin, Benjamin R
2016-12-20
The dynamics, control, and evolution of communicable and vector-borne diseases are intimately connected to the joint dynamics of epidemiological, behavioral, and mobility processes that operate across multiple spatial, temporal, and organizational scales. The identification of a theoretical explanatory framework that accounts for the pattern regularity exhibited by a large number of host-parasite systems, including those sustained by host-vector epidemiological dynamics, is but one of the challenges facing the coevolving fields of computational, evolutionary, and theoretical epidemiology. Host-parasite epidemiological patterns, including epidemic outbreaks and endemic recurrent dynamics, are characteristic to well-identified regions of the world; the result of processes and constraints such as strain competition, host and vector mobility, and population structure operating over multiple scales in response to recurrent disturbances (like El Niño) and climatological and environmental perturbations over thousands of years. It is therefore important to identify and quantify the processes responsible for observed epidemiological macroscopic patterns: the result of individual interactions in changing social and ecological landscapes. In this perspective, we touch on some of the issues calling for the identification of an encompassing theoretical explanatory framework by identifying some of the limitations of existing theory, in the context of particular epidemiological systems. Fostering the reenergizing of research that aims at disentangling the role of epidemiological and socioeconomic forces on disease dynamics, better understood as complex adaptive systems, is a key aim of this perspective.
Symbolic computer vector analysis
NASA Technical Reports Server (NTRS)
Stoutemyer, D. R.
1977-01-01
A MACSYMA program is described which performs symbolic vector algebra and vector calculus. The program can combine and simplify symbolic expressions including dot products and cross products, together with the gradient, divergence, curl, and Laplacian operators. The distribution of these operators over sums or products is under user control, as are various other expansions, including expansion into components in any specific orthogonal coordinate system. There is also a capability for deriving the scalar or vector potential of a vector field. Examples include derivation of the partial differential equations describing fluid flow and magnetohydrodynamics, for 12 different classic orthogonal curvilinear coordinate systems.
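A present-day analogue of these capabilities, sketched with SymPy's vector module rather than MACSYMA, is shown below; the example fields are arbitrary.

```python
# A present-day analogue of the described capabilities using SymPy's vector
# module: symbolic gradient, divergence, curl, and an identity check.
from sympy import sin
from sympy.vector import CoordSys3D, gradient, divergence, curl

N = CoordSys3D("N")
f = N.x**2 * N.y + sin(N.z)                       # scalar field
F = N.x*N.y*N.i + N.y*N.z*N.j + N.z*N.x*N.k       # vector field

print(gradient(f))
print(divergence(F))
print(curl(F))
print(curl(gradient(f)))                          # identically the zero vector
```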
A GPU-based incompressible Navier-Stokes solver on moving overset grids
NASA Astrophysics Data System (ADS)
Chandar, Dominic D. J.; Sitaraman, Jayanarayanan; Mavriplis, Dimitri J.
2013-07-01
In pursuit of obtaining high fidelity solutions to the fluid flow equations in a short span of time, graphics processing units (GPUs), which were originally intended for gaming applications, are currently being used to accelerate computational fluid dynamics (CFD) codes. With a high peak throughput of about 1 TFLOPS on a PC, GPUs seem to be favourable for many high-resolution computations. One such computation that involves a lot of number crunching is computing time-accurate flow solutions past moving bodies. The aim of the present paper is thus to discuss the development of a flow solver on unstructured and overset grids and its implementation on GPUs. In its present form, the flow solver solves the incompressible fluid flow equations on unstructured/hybrid/overset grids using a fully implicit projection method. The resulting discretised equations are solved using a matrix-free Krylov solver using several GPU kernels such as gradient, Laplacian and reduction. Some of the simple arithmetic vector calculations are implemented using the CU++ approach (An Object Oriented Framework for Computational Fluid Dynamics Applications using Graphics Processing Units, Journal of Supercomputing, 2013, doi:10.1007/s11227-013-0985-9), where GPU kernels are automatically generated at compile time. Results are presented for two- and three-dimensional computations on static and moving grids.
NASA Technical Reports Server (NTRS)
Kumar, A.; Graves, R. A., Jr.; Weilmuenster, K. J.
1980-01-01
A vectorized code, EQUIL, was developed for calculating the equilibrium chemistry of a reacting gas mixture on the Control Data STAR-100 computer. The code provides species mole fractions, mass fractions, and thermodynamic and transport properties of the mixture for given temperature, pressure, and elemental mass fractions. The code is set up for a system of elements consisting of electrons, H, He, C, O, and N. In all, 24 chemical species are included.
Stable computations with flat radial basis functions using vector-valued rational approximations
NASA Astrophysics Data System (ADS)
Wright, Grady B.; Fornberg, Bengt
2017-02-01
One commonly finds in applications of smooth radial basis functions (RBFs) that scaling the kernels so they are 'flat' leads to smaller discretization errors. However, the direct numerical approach for computing with flat RBFs (RBF-Direct) is severely ill-conditioned. We present an algorithm for bypassing this ill-conditioning that is based on a new method for rational approximation (RA) of vector-valued analytic functions with the property that all components of the vector share the same singularities. This new algorithm (RBF-RA) is more accurate, robust, and easier to implement than the Contour-Padé method, which is similarly based on vector-valued rational approximation. In contrast to the stable RBF-QR and RBF-GA algorithms, which are based on finding a better-conditioned basis in the same RBF-space, the new algorithm can be used with any type of smooth radial kernel, and it is also applicable to a wider range of tasks (including calculating Hermite type implicit RBF-FD stencils). We present a series of numerical experiments demonstrating the effectiveness of this new method for computing RBF interpolants in the flat regime. We also demonstrate the flexibility of the method by using it to compute implicit RBF-FD formulas in the flat regime and then using these for solving Poisson's equation in a 3-D spherical shell.
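The ill-conditioning of RBF-Direct in the flat regime, which motivates RBF-RA, can be seen in a few lines; the node set and shape parameters below are arbitrary choices for illustration.

```python
# Quick numerical illustration of why RBF-Direct breaks down in the flat
# regime: the Gaussian kernel matrix's condition number explodes as the
# shape parameter epsilon shrinks (node set and epsilons chosen arbitrarily).
import numpy as np

rng = np.random.default_rng(6)
x = np.sort(rng.uniform(-1, 1, size=30))
r2 = (x[:, None] - x[None, :]) ** 2

for eps in (2.0, 0.5, 0.1, 0.02):
    A = np.exp(-(eps ** 2) * r2)             # Gaussian RBF interpolation matrix
    print(f"epsilon = {eps:5.2f}   cond(A) = {np.linalg.cond(A):.2e}")
```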
A portable approach for PIC on emerging architectures
NASA Astrophysics Data System (ADS)
Decyk, Viktor
2016-03-01
A portable approach for designing Particle-in-Cell (PIC) algorithms on emerging exascale computers is based on the recognition that 3 distinct programming paradigms are needed. They are: low level vector (SIMD) processing, middle level shared memory parallel programming, and high level distributed memory programming. In addition, there is a memory hierarchy associated with each level. Such algorithms can be initially developed using vectorizing compilers, OpenMP, and MPI. This is the approach recommended by Intel for the Phi processor. These algorithms can then be translated and possibly specialized to other programming models and languages, as needed. For example, the vector processing and shared memory programming might be done with CUDA instead of vectorizing compilers and OpenMP, but generally the algorithm itself is not greatly changed. The UCLA PICKSC web site at http://www.idre.ucla.edu/ contains example open source skeleton codes (mini-apps) illustrating each of these three programming models, individually and in combination. Fortran2003 now supports abstract data types, and design patterns can be used to support a variety of implementations within the same code base. Fortran2003 also supports interoperability with C so that implementations in C languages are also easy to use. Finally, main codes can be translated into dynamic environments such as Python, while still taking advantage of high performing compiled languages. Parallel languages are still evolving, with interesting developments in Co-Array Fortran, UPC, and OpenACC, among others, and these can also be supported within the same software architecture. Work supported by NSF and DOE Grants.
Vector adaptive predictive coder for speech and audio
NASA Technical Reports Server (NTRS)
Chen, Juin-Hwey (Inventor); Gersho, Allen (Inventor)
1990-01-01
A real-time vector adaptive predictive coder which approximates each vector of K speech samples by using each of M fixed vectors in a first codebook to excite a time-varying synthesis filter and picking the vector that minimizes distortion. Predictive analysis for each frame determines parameters used for computing, from vectors in the first codebook, zero-state response vectors that are stored at the same address (index) in a second codebook. Encoding of input speech vectors s_n is then carried out using the second codebook. When the vector that minimizes distortion is found, its index is transmitted to a decoder which has a codebook identical to the first codebook of the encoder. There the index is used to read out a vector that is used to synthesize an output speech vector s_n. The parameters used in the encoder are quantized, for example by using a table, and the indices are transmitted to the decoder where they are decoded to specify the transfer characteristics of filters used in producing the vector s_n from the receiver codebook vector selected by the transmitted vector index.
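A toy version of the codebook search step, with random stand-ins for the zero-state response vectors, is sketched below; it is schematic only and omits the synthesis filtering and perceptual weighting of the actual coder.

```python
# Toy sketch of the codebook search step: pick the codevector whose response
# is closest to the target vector and transmit its index (all data assumed).
import numpy as np

rng = np.random.default_rng(7)
K, M = 8, 64                                  # samples per vector, codebook size
codebook = rng.normal(size=(M, K))            # stand-in for zero-state responses
target = rng.normal(size=K)                   # current input speech vector

distortion = np.sum((codebook - target) ** 2, axis=1)
best = int(np.argmin(distortion))             # index transmitted to the decoder
print(best, distortion[best])
```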
Aprà, E; Kowalski, K
2016-03-08
In this paper we discuss the implementation of multireference coupled-cluster formalism with singles, doubles, and noniterative triples (MRCCSD(T)), which is capable of taking advantage of the processing power of the Intel Xeon Phi coprocessor. We discuss the integration of two levels of parallelism underlying the MRCCSD(T) implementation with computational kernels designed to offload the computationally intensive parts of the MRCCSD(T) formalism to Intel Xeon Phi coprocessors. Special attention is given to the enhancement of the parallel performance by task reordering that has improved load balancing in the noniterative part of the MRCCSD(T) calculations. We also discuss aspects regarding efficient optimization and vectorization strategies.
Adaptive compressive learning for prediction of protein-protein interactions from primary sequence.
Zhang, Ya-Nan; Pan, Xiao-Yong; Huang, Yan; Shen, Hong-Bin
2011-08-21
Protein-protein interactions (PPIs) play an important role in biological processes. Although much effort has been devoted to the identification of novel PPIs by integrating experimental biological knowledge, there are still many difficulties because of the lack of sufficient protein structural and functional information. It is highly desirable to develop methods based only on amino acid sequences for predicting PPIs. However, sequence-based predictors often struggle with high dimensionality, which causes over-fitting and high computational complexity, as well as with the redundancy of sequential feature vectors. In this paper, a novel computational approach based on compressed sensing theory is proposed to predict yeast Saccharomyces cerevisiae PPIs from primary sequence, and it has achieved promising results. The key advantage of the proposed compressed sensing algorithm is that it can compress the original high-dimensional protein sequential feature vector into a much lower but more condensed space, taking the sparsity property of the original signal into account. What makes compressed sensing much more attractive in protein sequence analysis is that its compressed signal can be reconstructed from far fewer measurements than what is usually considered necessary in traditional Nyquist sampling theory. Experimental results demonstrate that the proposed compressed sensing method is powerful for analyzing noisy biological data and reducing redundancy in feature vectors. The proposed method represents a new strategy for dealing with high-dimensional protein discrete models and has great potential to be extended to many other complicated biological systems. Copyright © 2011 Elsevier Ltd. All rights reserved.
Marelli, Marco; Baroni, Marco
2015-07-01
The present work proposes a computational model of morpheme combination at the meaning level. The model moves from the tenets of distributional semantics, and assumes that word meanings can be effectively represented by vectors recording their co-occurrence with other words in a large text corpus. Given this assumption, affixes are modeled as functions (matrices) mapping stems onto derived forms. Derived-form meanings can be thought of as the result of a combinatorial procedure that transforms the stem vector on the basis of the affix matrix (e.g., the meaning of nameless is obtained by multiplying the vector of name with the matrix of -less). We show that this architecture accounts for the remarkable human capacity of generating new words that denote novel meanings, correctly predicting semantic intuitions about novel derived forms. Moreover, the proposed compositional approach, once paired with a whole-word route, provides a new interpretative framework for semantic transparency, which is here partially explained in terms of ease of the combinatorial procedure and strength of the transformation brought about by the affix. Model-based predictions are in line with the modulation of semantic transparency on explicit intuitions about existing words, response times in lexical decision, and morphological priming. In conclusion, we introduce a computational model to account for morpheme combination at the meaning level. The model is data-driven, theoretically sound, and empirically supported, and it makes predictions that open new research avenues in the domain of semantic processing. (PsycINFO Database Record (c) 2015 APA, all rights reserved).
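The core compositional step can be sketched in a few lines: an affix matrix is fit by least squares from (stem, derived) vector pairs and applied to a new stem; the vectors below are random stand-ins for corpus-derived embeddings, not the authors' data.

```python
# Tiny sketch of the compositional idea: the matrix for the affix "-less" is
# fit by least squares from (stem, derived) vector pairs and then applied to a
# new stem; vectors are random stand-ins for corpus-derived embeddings.
import numpy as np

rng = np.random.default_rng(8)
d, n_pairs = 50, 200
stems = rng.normal(size=(n_pairs, d))                    # e.g. "name", "hope", ...
M_true = rng.normal(size=(d, d)) / np.sqrt(d)            # hidden affix transformation
derived = stems @ M_true.T + 0.01 * rng.normal(size=(n_pairs, d))   # "nameless", ...

# Least-squares fit of the affix matrix so that derived ~ stems @ M.T
M_less, *_ = np.linalg.lstsq(stems, derived, rcond=None)
M_less = M_less.T

new_stem = rng.normal(size=d)
novel_meaning = M_less @ new_stem                        # vector for a novel derived form
print(np.allclose(novel_meaning, M_true @ new_stem, atol=0.1))
```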
Deblurring for spatial and temporal varying motion with optical computing
NASA Astrophysics Data System (ADS)
Xiao, Xiao; Xue, Dongfeng; Hui, Zhao
2016-05-01
A way to estimate and remove spatially and temporally varying motion blur is proposed, which is based on an optical computing system. The translation and rotation motion can be independently estimated from the joint transform correlator (JTC) system without iterative optimization. The inspiration comes from the fact that the JTC system is immune to rotation motion in a Cartesian coordinate system. The work scheme of the JTC system is designed to keep switching between the Cartesian coordinate system and the polar coordinate system in different time intervals with a ping-pong handover. In the ping interval, the JTC system works in the Cartesian coordinate system to obtain a translation motion vector at optical computing speed. In the pong interval, the JTC system works in the polar coordinate system. The rotation motion is transformed into translation motion through the coordinate transformation. Then the rotation motion vector can also be obtained from the JTC instantaneously. To deal with continuous spatially variant motion blur, submotion vectors based on the projective motion path blur model are proposed. The submotion vector model is more effective and accurate at modeling spatially variant motion blur than conventional methods. The simulation and real experiment results demonstrate its overall effectiveness.
Link-Based Similarity Measures Using Reachability Vectors
Yoon, Seok-Ho; Kim, Ji-Soo; Ryu, Minsoo; Choi, Ho-Jin
2014-01-01
We present a novel approach for computing link-based similarities among objects accurately by utilizing the link information pertaining to the objects involved. We discuss the problems with previous link-based similarity measures and propose a novel approach for computing link based similarities that does not suffer from these problems. In the proposed approach each target object is represented by a vector. Each element of the vector corresponds to all the objects in the given data, and the value of each element denotes the weight for the corresponding object. As for this weight value, we propose to utilize the probability of reaching from the target object to the specific object, computed using the “Random Walk with Restart” strategy. Then, we define the similarity between two objects as the cosine similarity of the two vectors. In this paper, we provide examples to show that our approach does not suffer from the aforementioned problems. We also evaluate the performance of the proposed methods in comparison with existing link-based measures, qualitatively and quantitatively, with respect to two kinds of data sets, scientific papers and Web documents. Our experimental results indicate that the proposed methods significantly outperform the existing measures. PMID:24701188
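A compact sketch of the reachability-vector construction on a toy link graph is given below; the adjacency matrix, restart probability, and iteration count are illustrative assumptions, not the paper's data sets.

```python
# Sketch of the reachability-vector idea: each object is represented by its
# Random Walk with Restart distribution over a toy link graph, and similarity
# is the cosine between two such vectors (all values illustrative).
import numpy as np

A = np.array([[0, 1, 1, 0],          # adjacency of 4 linked objects
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
P = A / A.sum(axis=1, keepdims=True) # row-stochastic transition matrix

def rwr(P, seed, restart=0.15, iters=100):
    """Stationary Random-Walk-with-Restart vector for one seed object."""
    e = np.zeros(P.shape[0])
    e[seed] = 1.0
    r = e.copy()
    for _ in range(iters):
        r = (1 - restart) * P.T @ r + restart * e
    return r

vecs = np.array([rwr(P, i) for i in range(len(A))])
norms = np.linalg.norm(vecs, axis=1)
cosine = vecs @ vecs.T / np.outer(norms, norms)
print(np.round(cosine, 2))           # pairwise link-based similarities
```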
Constraints on muon-specific dark forces
NASA Astrophysics Data System (ADS)
Karshenboim, Savely G.; McKeen, David; Pospelov, Maxim
2014-10-01
The recent measurement of the Lamb shift in muonic hydrogen allows for the most precise extraction of the charge radius of the proton which is currently in conflict with other determinations based on e-p scattering and hydrogen spectroscopy. This discrepancy could be the result of some new muon-specific force with O(1-100) MeV force carrier—in this paper we concentrate on vector mediators. Such an explanation faces challenges from the constraints imposed by the g-2 of the muon and electron as well as precision spectroscopy of muonic atoms. In this work we complement the family of constraints by calculating the contribution of hypothetical forces to the muonium hyperfine structure. We also compute the two-loop contribution to the electron parity-violating amplitude due to a muon loop, which is sensitive to the muon axial-vector coupling. Overall, we find that the combination of low-energy constraints favors the mass of the mediator to be below 10 MeV and that a certain degree of tuning is required between vector and axial-vector couplings of new vector particles to muons in order to satisfy constraints from muon g-2. However, we also observe that in the absence of a consistent standard model embedding high-energy weak-charged processes accompanied by the emission of new vector particles are strongly enhanced by (E/mV)2, with E a characteristic energy scale and mV the mass of the mediator. In particular, leptonic W decays impose the strongest constraints on such models completely disfavoring the remainder of the parameter space.
Predicting protein amidation sites by orchestrating amino acid sequence features
NASA Astrophysics Data System (ADS)
Zhao, Shuqiu; Yu, Hua; Gong, Xiujun
2017-08-01
Amidation is the fourth major category of post-translational modifications and plays an important role in physiological and pathological processes. Identifying amidation sites can help us understand amidation and recognize the original causes of many kinds of diseases. However, traditional experimental methods for identifying amidation sites are often time-consuming and expensive. In this study, we propose a computational method for predicting amidation sites by orchestrating amino acid sequence features. Three kinds of feature extraction methods are used to build a feature vector able to capture not only the physicochemical properties but also position-related information of the amino acids. An extremely randomized trees algorithm is applied to choose the optimal features, removing redundancy and dependence among components of the feature vector in a supervised fashion. Finally, the support vector machine classifier is used to label the amidation sites. When tested on an independent data set, the proposed method performs better than all previous ones, with a prediction accuracy of 0.962, a Matthews correlation coefficient of 0.89, and an area under the curve of 0.964.
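A hedged sketch of the described pipeline, with synthetic stand-in features, follows; the feature dimensions and model settings are assumptions, not the authors' configuration.

```python
# Hedged sketch of the described pipeline: extremely randomized trees rank the
# sequence-derived features, the informative ones are kept, and an SVM labels
# candidate amidation sites (synthetic stand-in data throughout).
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(9)
X = rng.normal(size=(800, 120))             # physicochemical + positional features
y = rng.integers(0, 2, size=800)            # 1 = amidation site

pipe = make_pipeline(
    SelectFromModel(ExtraTreesClassifier(n_estimators=200, random_state=0)),
    SVC(kernel="rbf", C=1.0),
).fit(X, y)
print(pipe.score(X, y))
```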
Color and Vector Flow Imaging in Parallel Ultrasound With Sub-Nyquist Sampling.
Madiena, Craig; Faurie, Julia; Poree, Jonathan; Garcia, Damien
2018-05-01
RF acquisition with a high-performance multichannel ultrasound system generates massive data sets in short periods of time, especially in "ultrafast" ultrasound when digital receive beamforming is required. Sampling at a rate four times the carrier frequency is the standard procedure since this rule complies with the Nyquist-Shannon sampling theorem and simplifies quadrature sampling. Bandpass sampling (or undersampling) outputs a bandpass signal at a rate lower than the maximal frequency without harmful aliasing. Advantages over Nyquist sampling are reduced storage volumes and data workflow, and simplified digital signal processing tasks. We used RF undersampling in color flow imaging (CFI) and vector flow imaging (VFI) to decrease data volume significantly (factor of 3 to 13 in our configurations). CFI and VFI with Nyquist and sub-Nyquist samplings were compared in vitro and in vivo. The estimate errors due to undersampling were small or marginal, which illustrates that Doppler and vector Doppler images can be correctly computed with a drastically reduced amount of RF samples. Undersampling can be a method of choice in CFI and VFI to avoid information overload and reduce data transfer and storage.
Fast higher-order MR image reconstruction using singular-vector separation.
Wilm, Bertram J; Barmet, Christoph; Pruessmann, Klaas P
2012-07-01
Magnetic resonance imaging (MRI) conventionally relies on spatially linear gradient fields for image encoding. However, in practice various sources of nonlinear fields can perturb the encoding process and give rise to artifacts unless they are suitably addressed at the reconstruction level. Accounting for field perturbations that are neither linear in space nor constant over time, i.e., dynamic higher-order fields, is particularly challenging. It was previously shown to be feasible with conjugate-gradient iteration. However, so far this approach has been relatively slow due to the need to carry out explicit matrix-vector multiplications in each cycle. In this work, it is proposed to accelerate higher-order reconstruction by expanding the encoding matrix such that fast Fourier transform can be employed for more efficient matrix-vector computation. The underlying principle is to represent the perturbing terms as sums of separable functions of space and time. Compact representations with this property are found by singular-vector analysis of the perturbing matrix. Guidelines for balancing the accuracy and speed of the resulting algorithm are derived by error propagation analysis. The proposed technique is demonstrated for the case of higher-order field perturbations due to eddy currents caused by diffusion weighting. In this example, image reconstruction was accelerated by two orders of magnitude.
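The singular-vector separation step can be illustrated with a small synthetic space-time perturbation matrix, as below; the matrix sizes and rank threshold are assumptions, not values from the paper.

```python
# Sketch of the singular-vector separation step: a space-time perturbation
# matrix is approximated by a few rank-1 (spatial x temporal) terms, so each
# term can be folded into an FFT-based reconstruction (synthetic example).
import numpy as np

rng = np.random.default_rng(10)
n_space, n_time = 512, 256
# Hypothetical perturbation built from a handful of separable modes.
B = sum(np.outer(rng.normal(size=n_space), rng.normal(size=n_time)) for _ in range(3))

U, s, Vt = np.linalg.svd(B, full_matrices=False)
k = int(np.sum(s > 1e-8 * s[0]))                 # effective separable rank
B_sep = (U[:, :k] * s[:k]) @ Vt[:k]              # sum of k separable terms
print(k, np.linalg.norm(B - B_sep) / np.linalg.norm(B))
```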
Analyzing big data with the hybrid interval regression methods.
Huang, Chia-Hui; Yang, Keng-Chieh; Kao, Han-Ying
2014-01-01
Big data is a new trend at present, forcing significant impacts on information technologies. In big data applications, one of the most concerning issues is dealing with large-scale data sets that often require computation resources provided by public cloud services. How to analyze big data efficiently becomes a big challenge. In this paper, we combine interval regression with the smooth support vector machine (SSVM) to analyze big data. Recently, the smooth support vector machine (SSVM) was proposed as an alternative to the standard SVM that has been proved more efficient than the traditional SVM in processing large-scale data. In addition, a soft margin method is proposed to modify the excursion of the separation margin and to be effective in the gray zone, where the distribution of the data becomes hard to describe and the separation margin between classes is unclear.
Analyzing Big Data with the Hybrid Interval Regression Methods
Kao, Han-Ying
2014-01-01
Big data is a current trend that has a significant impact on information technologies. In big data applications, one of the main concerns is dealing with large-scale data sets that often require computation resources provided by public cloud services. How to analyze big data efficiently becomes a big challenge. In this paper, we combine interval regression with the smooth support vector machine (SSVM) to analyze big data. The SSVM was recently proposed as an alternative to the standard SVM and has been proved more efficient than the traditional SVM in processing large-scale data. In addition, a soft margin method is proposed to adjust the excursion of the separation margin so that it remains effective in the gray zone, where the data distribution is hard to describe and the separation margin between classes is unclear. PMID:25143968
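For orientation, the smoothing that gives the SSVM its name is usually described in the SSVM literature as replacing the non-differentiable plus function (x)_+ of the hinge loss with p(x, beta) = x + (1/beta) * log(1 + exp(-beta*x)), so that Newton-type solvers apply. The sketch below only illustrates that smoothing; it is an assumption about the underlying method, not the authors' implementation.

import numpy as np

def plus(x):
    """Non-smooth plus function (x)_+ appearing in the SVM hinge loss."""
    return np.maximum(x, 0.0)

def smooth_plus(x, beta=5.0):
    """Smooth approximation p(x, beta) = x + (1/beta)*log(1 + exp(-beta*x));
    approaches (x)_+ as beta grows (assumed form from the SSVM literature)."""
    return x + np.log1p(np.exp(-beta * x)) / beta

x = np.linspace(-2, 2, 9)
print(np.max(np.abs(smooth_plus(x, beta=50.0) - plus(x))))  # small for large beta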
Integrated Dual Imaging Detector
NASA Technical Reports Server (NTRS)
Rust, David M.
1999-01-01
A new type of image detector was designed to simultaneously analyze the polarization of light at all picture elements in a scene. The Integrated Dual Imaging Detector (IDID) consists of a lenslet array and a polarizing beamsplitter bonded to a commercial charge coupled device (CCD). The IDID simplifies the design and operation of solar vector magnetographs and of the imaging polarimeters and spectroscopic imagers used, for example, in atmospheric and solar research. When used in a solar telescope, the IDID can map the vector magnetic fields on the solar surface. Other applications include environmental monitoring, robot vision, and medical diagnoses (through the eye). Innovations in the IDID include (1) two interleaved imaging arrays (one for each polarization plane); (2) large dynamic range (well depth of 10(exp 5) electrons per pixel); (3) simultaneous readout and display of both images; and (4) laptop computer signal processing to produce polarization maps in field situations.
NASA Astrophysics Data System (ADS)
Weiss, Chester J.
2013-08-01
An essential element for computational hypothesis testing, data inversion and experiment design for electromagnetic geophysics is a robust forward solver, capable of easily and quickly evaluating the electromagnetic response of arbitrary geologic structure. The usefulness of such a solver hinges on the balance among competing desires like ease of use, speed of forward calculation, scalability to large problems or compute clusters, parsimonious use of memory access, accuracy and by necessity, the ability to faithfully accommodate a broad range of geologic scenarios over extremes in length scale and frequency content. This is indeed a tall order. The present study addresses recent progress toward the development of a forward solver with these properties. Based on the Lorenz-gauged Helmholtz decomposition, a new finite volume solution over Cartesian model domains endowed with complex-valued electrical properties is shown to be stable over the frequency range 10^-2 to 10^10 Hz and the length-scale range 10^-3 to 10^5 m. Benchmark examples are drawn from magnetotellurics, exploration geophysics, geotechnical mapping and laboratory-scale analysis, showing excellent agreement with reference analytic solutions. Computational efficiency is achieved through use of a matrix-free implementation of the quasi-minimum-residual (QMR) iterative solver, which eliminates explicit storage of finite volume matrix elements in favor of "on the fly" computation as needed by the iterative Krylov sequence. Further efficiency is achieved through sparse coupling matrices between the vector and scalar potentials whose non-zero elements arise only in those parts of the model domain where the conductivity gradient is non-zero. Multi-thread parallelization in the QMR solver through OpenMP pragmas is used to reduce the computational cost of its most expensive step: the single matrix-vector product at each iteration. High-level MPI communicators farm independent processes to available compute nodes for simultaneous computation of multi-frequency or multi-transmitter responses.
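The matrix-free strategy, supplying the Krylov solver with an "on the fly" matrix-vector product instead of stored matrix elements, can be sketched with SciPy's LinearOperator interface. The operator below is a generic placeholder (a 1D Laplacian stencil), not the Lorenz-gauged finite-volume operator of the paper, and the symmetric rmatvec is an assumption made so that QMR's transpose products are available.

import numpy as np
from scipy.sparse.linalg import LinearOperator, qmr

n = 1000

def apply_A(v):
    """Matrix-free matvec computed on the fly (a 1D Laplacian stencil standing in
    for the finite-volume operator; no matrix elements are stored)."""
    w = 2.0 * v
    w[:-1] -= v[1:]
    w[1:] -= v[:-1]
    return w

# QMR also needs products with the transpose; this stencil is symmetric.
A = LinearOperator((n, n), matvec=apply_A, rmatvec=apply_A, dtype=float)
b = np.ones(n)
x, info = qmr(A, b)
print(info, np.linalg.norm(apply_A(x) - b))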
Mathematical Theory of Generalized Duality Quantum Computers Acting on Vector-States
NASA Astrophysics Data System (ADS)
Cao, Huai-Xin; Long, Gui-Lu; Guo, Zhi-Hua; Chen, Zheng-Li
2013-06-01
Following the idea of duality quantum computation, a generalized duality quantum computer (GDQC) acting on vector-states is defined as a tuple consisting of a generalized quantum wave divider (GQWD) and a finite number of unitary operators as well as a generalized quantum wave combiner (GQWC). It is proved that the GQWD and GQWC of a GDQC are an isometry and a co-isometry, respectively, and mutually dual. It is also proved that every GDQC gives a contraction, called a generalized duality quantum gate (GDQG). A classification of GDQCs is given and the properties of GDQGs are discussed. Some applications are obtained, including two orthogonal duality quantum computer algorithms for unsorted database search and an understanding of the Mach-Zehnder interferometer.
NASA Technical Reports Server (NTRS)
Waithe, Kenrick A.; Deere, Karen A.
2003-01-01
A computational and experimental study was conducted to investigate the effects of multiple injection ports in a two-dimensional, convergent-divergent nozzle, for fluidic thrust vectoring. The concept of multiple injection ports was conceived to enhance the thrust vectoring capability of a convergent-divergent nozzle over that of a single injection port without increasing the secondary mass flow rate requirements. The experimental study was conducted at static conditions in the Jet Exit Test Facility of the 16-Foot Transonic Tunnel Complex at NASA Langley Research Center. Internal nozzle performance was obtained at nozzle pressure ratios up to 10 with secondary nozzle pressure ratios up to 1 for five configurations. The computational study was conducted using the Reynolds Averaged Navier-Stokes computational fluid dynamics code PAB3D with two-equation turbulence closure and linear Reynolds stress modeling. Internal nozzle performance was predicted for nozzle pressure ratios up to 10 with a secondary nozzle pressure ratio of 0.7 for two configurations. Results from the experimental study indicate a benefit to multiple injection ports in a convergent-divergent nozzle. In general, increasing the number of injection ports from one to two increased the pitch thrust vectoring capability without any thrust performance penalties at nozzle pressure ratios less than 4 with high secondary pressure ratios. Results from the computational study are in excellent agreement with experimental results and validate PAB3D as a tool for predicting the internal nozzle performance of a two-dimensional, convergent-divergent nozzle with multiple injection ports.
Data processing device test apparatus and method therefor
Wilcox, Richard Jacob; Mulig, Jason D.; Eppes, David; Bruce, Michael R.; Bruce, Victoria J.; Ring, Rosalinda M.; Cole, Jr., Edward I.; Tangyunyong, Paiboon; Hawkins, Charles F.; Louie, Arnold Y.
2003-04-08
A method and apparatus for testing data processing devices are implemented. The test mechanism isolates critical paths by correlating a scanning microscope image with a selected speed path failure. A trigger signal having a preselected value is generated at the start of each pattern vector. The sweep of the scanning microscope is controlled by a computer, which also receives and processes the image signals returned from the microscope. The value of the trigger signal is correlated with the set of pattern lines being driven on the device under test (DUT). The trigger is either asserted or negated depending on the detection of a pattern line failure and the particular line that failed. In response to the detection of the particular speed path failure being characterized, and to the trigger signal, the control computer overlays a mask on the image of the DUT. The overlaid image provides a visual correlation of the failure with the structural elements of the DUT at the level of resolution of the microscope itself.
The covariance matrix for the solution vector of an equality-constrained least-squares problem
NASA Technical Reports Server (NTRS)
Lawson, C. L.
1976-01-01
Methods are given for computing the covariance matrix for the solution vector of an equality-constrained least squares problem. The methods are matched to the solution algorithms given in the book, 'Solving Least Squares Problems.'
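One common route to this kind of computation, sketched below under the standard assumption of independent unit-variance observation errors, eliminates the equality constraints Cx = d through an orthonormal null-space basis and propagates the reduced-problem covariance back to the full solution vector. This is a generic illustration, not necessarily the algorithm matched to the book.

import numpy as np

def constrained_lsq_with_covariance(A, b, C, d):
    """Solve min ||A x - b|| subject to C x = d and return (x, cov_x).
    Null-space method; assumes unit-variance, uncorrelated errors in b."""
    # Particular solution of the constraints and an orthonormal null-space basis Z.
    x0, *_ = np.linalg.lstsq(C, d, rcond=None)
    _, s, Vt = np.linalg.svd(C)
    rank = int(np.sum(s > 1e-12 * s[0]))
    Z = Vt[rank:].T                          # columns span null(C)
    # Reduced unconstrained problem in y: min ||A Z y - (b - A x0)||.
    AZ = A @ Z
    y, *_ = np.linalg.lstsq(AZ, b - A @ x0, rcond=None)
    x = x0 + Z @ y
    # cov(x) = Z (AZ^T AZ)^-1 Z^T, since only y is estimated from the noisy data.
    cov_x = Z @ np.linalg.inv(AZ.T @ AZ) @ Z.T
    return x, cov_x

rng = np.random.default_rng(1)
A, b = rng.normal(size=(20, 5)), rng.normal(size=20)
C, d = rng.normal(size=(2, 5)), rng.normal(size=2)
x, cov = constrained_lsq_with_covariance(A, b, C, d)
print(np.allclose(C @ x, d), cov.shape)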
Machine Learning in Intrusion Detection
2005-07-01
machine learning tasks. Anomaly detection provides the core technology for a broad spectrum of security-centric applications. In this dissertation, we examine various aspects of anomaly based intrusion detection in computer security. First, we present a new approach to learn program behavior for intrusion detection. Text categorization techniques are adopted to convert each process to a vector and calculate the similarity between two program activities. Then the k-nearest neighbor classifier is employed to classify program behavior as normal or intrusive. We demonstrate
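A minimal sketch of the described pipeline, treating each process's system-call trace as a text document, vectorizing it with text-categorization techniques, and classifying with k-nearest neighbors, might look like the following. The traces, labels, and library choices are my own placeholders, not material from the dissertation.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical system-call traces, one string per process.
traces = [
    "open read read write close",
    "open read mmap read write close",
    "socket connect send recv close",
    "socket connect send send recv close",
]
labels = ["normal", "normal", "intrusive", "intrusive"]

vectorizer = TfidfVectorizer()                 # text-categorization style vectorization
X = vectorizer.fit_transform(traces)

knn = KNeighborsClassifier(n_neighbors=1, metric="cosine")
knn.fit(X, labels)

new_trace = ["open read write close"]
print(knn.predict(vectorizer.transform(new_trace)))   # expected: 'normal'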
PIV Data Validation Software Package
NASA Technical Reports Server (NTRS)
Blackshire, James L.
1997-01-01
A PIV data validation and post-processing software package was developed to provide semi-automated data validation and data reduction capabilities for Particle Image Velocimetry data sets. The software provides three primary capabilities including (1) removal of spurious vector data, (2) filtering, smoothing, and interpolating of PIV data, and (3) calculations of out-of-plane vorticity, ensemble statistics, and turbulence statistics information. The software runs on an IBM PC/AT host computer working either under Microsoft Windows 3.1 or Windows 95 operating systems.
Li, Ke; Liu, Yi; Wang, Quanxin; Wu, Yalei; Song, Shimin; Sun, Yi; Liu, Tengchong; Wang, Jun; Li, Yang; Du, Shaoyi
2015-01-01
This paper proposes a novel multi-label classification method for spacecraft electrical characteristics problems, which involve large amounts of unlabeled test data, high-dimensional features, long computing times and slow identification rates. Firstly, both the fuzzy c-means (FCM) offline clustering and the principal component feature extraction algorithms are applied for the feature selection process. Secondly, an approximate weighted proximal support vector machine (WPSVM) online classification algorithm is used to reduce the feature dimension and further improve the recognition rate for spacecraft electrical characteristics. Finally, a threshold-based data capture contribution method is proposed to guarantee the validity and consistency of the data selection. The experimental results indicate that the proposed method can obtain better data features of the spacecraft electrical characteristics, improve the accuracy of identification and shorten the computing time effectively. PMID:26544549
Using trees to compute approximate solutions to ordinary differential equations exactly
NASA Technical Reports Server (NTRS)
Grossman, Robert
1991-01-01
Some recent work is reviewed which relates families of trees to symbolic algorithms for the exact computation of series which approximate solutions of ordinary differential equations. It turns out that the vector space whose basis is the set of finite, rooted trees carries a natural multiplication related to the composition of differential operators, making the space of trees an algebra. This algebraic structure can be exploited to yield a variety of algorithms for manipulating vector fields and the series and algebras they generate.
1998-09-01
[Fortran source fragment: if radial loading is applied, an additional term BMAT(NTOT-1) = SR is added to the right-hand-side vector BMAT for the interface-node equations; the L-U decomposition of AMAT is then used in CALL LUBKSB(AMAT, NRA, LDA, IPVT, BMAT, XSOL) to obtain XSOL, the vector of radial and hoop stresses; stresses are computed from XSOL with the boundary conditions S(1,NTOT2) = SR and S(2,1) = S(1,1), followed by computation of the total axial stress.]
Sabooh, M Fazli; Iqbal, Nadeem; Khan, Mukhtaj; Khan, Muslim; Maqbool, H F
2018-05-01
This study examines an accurate and efficient computational method for the identification of 5-methylcytosine sites in RNA modification. The occurrence of 5-methylcytosine (m5C) plays a vital role in a number of biological processes. For better comprehension of the biological functions and mechanisms it is necessary to recognize m5C sites in RNA precisely. Laboratory techniques and procedures are available to identify m5C sites in RNA, but these procedures require a lot of time and resources. This study develops a new computational method for extracting the features of RNA sequences. In this method, the RNA sequence is first encoded via a composite feature vector; then, for the selection of discriminative features, the minimum-redundancy-maximum-relevance algorithm is used. Secondly, the classification method is based on a support vector machine using a jackknife cross-validation test. The suggested method efficiently distinguishes m5C sites from non-m5C sites, achieving an accuracy of 93.33% with a sensitivity of 90.0% and a specificity of 96.66% on benchmark datasets. The results show that the proposed algorithm achieves significantly better identification performance than existing computational techniques. This study extends the knowledge about the occurrence sites of RNA modification, which paves the way for better comprehension of the biological functions and mechanisms. Copyright © 2018 Elsevier Ltd. All rights reserved.
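The classification stage described, an SVM evaluated with a jackknife (leave-one-out) test on composite feature vectors, can be sketched generically with scikit-learn. The random features below are placeholders for encoded RNA sequences, and the library and kernel choices are assumptions rather than the authors' exact setup.

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)
# Placeholder composite feature vectors for m5C / non-m5C sequences (not real data).
X = rng.normal(size=(60, 40))
y = np.array([1] * 30 + [0] * 30)
X[y == 1] += 0.8                      # inject a separable signal for illustration

clf = SVC(kernel="rbf", C=1.0, gamma="scale")
scores = cross_val_score(clf, X, y, cv=LeaveOneOut())   # jackknife cross-validation
print(f"jackknife accuracy: {scores.mean():.3f}")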
2014-01-01
Background Support vector regression (SVR) and Gaussian process regression (GPR) were used for the analysis of electroanalytical experimental data to estimate diffusion coefficients. Results For simulated cyclic voltammograms based on the EC, Eqr, and EqrC mechanisms these regression algorithms in combination with nonlinear kernel/covariance functions yielded diffusion coefficients with higher accuracy as compared to the standard approach of calculating diffusion coefficients relying on the Nicholson-Shain equation. The level of accuracy achieved by SVR and GPR is virtually independent of the rate constants governing the respective reaction steps. Further, the reduction of high-dimensional voltammetric signals by manual selection of typical voltammetric peak features decreased the performance of both regression algorithms compared to a reduction by downsampling or principal component analysis. After training on simulated data sets, diffusion coefficients were estimated by the regression algorithms for experimental data comprising voltammetric signals for three organometallic complexes. Conclusions Estimated diffusion coefficients closely matched the values determined by the parameter fitting method, but reduced the required computational time considerably for one of the reaction mechanisms. The automated processing of voltammograms according to the regression algorithms yields better results than the conventional analysis of peak-related data. PMID:24987463
A Code Generation Approach for Auto-Vectorization in the Spade Compiler
NASA Astrophysics Data System (ADS)
Wang, Huayong; Andrade, Henrique; Gedik, Buğra; Wu, Kun-Lung
We describe an auto-vectorization approach for the Spade stream processing programming language, comprising two ideas. First, we provide support for vectors as a primitive data type. Second, we provide a C++ library with architecture-specific implementations of a large number of pre-vectorized operations as the means to support language extensions. We evaluate our approach with several stream processing operators, contrasting Spade's auto-vectorization with the native auto-vectorization provided by the GNU gcc and Intel icc compilers.
Comparison of five-axis milling and rapid prototyping for implant surgical templates.
Park, Ji-Man; Yi, Tae-Kyoung; Koak, Jai-Young; Kim, Seong-Kyoon; Park, Eun-Jin; Heo, Seong-Joo
2014-01-01
This study aims to compare and evaluate the accuracy of surgical templates fabricated using coordinate synchronization processing with five-axis milling and design-related processing with rapid prototyping (RP). Master phantoms with 10 embedded gutta-percha cylinders hidden under artificial gingiva were fabricated and imaged using cone beam computed tomography. Vectors of the hidden cylinders were extracted and transferred to those of the planned implants through reverse engineering using virtual planning software. An RP-produced template was fabricated by stereolithography in photopolymer at the RP center according to planned data. Metal sleeves were bonded after holes were bored (group RP). For the milled template, milling coordinates were synchronized using the conversion process for the coordinate synchronization platform located on the model's bottom. Metal bushings were set on holes milled on the five-axis milling machine, on which the model was fixed through the coordinate synchronization plate, and the framework was constructed on the model using orthodontic resin (group CS). A computed tomography image was taken with templates firmly fixed on models using anchor pins (RP) or anchor screws (CS). The accuracy was analyzed via reverse engineering. Differences between the two groups were compared by repeated measures two-factor analysis. From the reverse-engineered image of the template on the experimental model, RP-produced templates showed significantly larger deviations than did milled surgical guides. Maximum deviations of the group RP were 1.58 mm (horizontal), 1.68 mm (vertical), and 8.51 degrees (angular); those of the group CS were 0.68 mm (horizontal), 0.41 mm (vertical), and 3.23 degrees (angular). A comparison of milling and RP template production methods showed that a vector-milled surgical guide had significantly smaller deviations than did an RP-produced template. The accuracy of computer-guided milled surgical templates was within the safety margin of previous studies.
Object recognition of real targets using modelled SAR images
NASA Astrophysics Data System (ADS)
Zherdev, D. A.
2017-12-01
In this work the problem of object recognition from SAR images is studied. The recognition algorithm is based on the computation of conjugation indices with class vectors. The support subspaces for each class are constructed by excluding the most and the least correlated vectors in a class. In this study we examine the possibility of significantly reducing the feature vector size, which decreases recognition time. The target images form the feature vectors, which are transformed using a pre-trained convolutional neural network (CNN).
NASA Technical Reports Server (NTRS)
Boyalakuntla, Kishore; Soni, Bharat K.; Thornburg, Hugh J.; Yu, Robert
1996-01-01
During the past decade, computational simulation of fluid flow around complex configurations has progressed significantly and many notable successes have been reported, however, unsteady time-dependent solutions are not easily obtainable. The present effort involves unsteady time dependent simulation of temporally deforming geometries. Grid generation for a complex configuration can be a time consuming process and temporally varying geometries necessitate the regeneration of such grids for every time step. Traditional grid generation techniques have been tried and demonstrated to be inadequate to such simulations. Non-Uniform Rational B-splines (NURBS) based techniques provide a compact and accurate representation of the geometry. This definition can be coupled with a distribution mesh for a user defined spacing. The present method greatly reduces cpu requirements for time dependent remeshing, facilitating the simulation of more complex unsteady problems. A thrust vectoring nozzle has been chosen to demonstrate the capability as it is of current interest in the aerospace industry for better maneuverability of fighter aircraft in close combat and in post stall regimes. This current effort is the first step towards multidisciplinary design optimization which involves coupling the aerodynamic heat transfer and structural analysis techniques. Applications include simulation of temporally deforming bodies and aeroelastic problems.
NASA Astrophysics Data System (ADS)
Voznyuk, I.; Litman, A.; Tortel, H.
2015-08-01
A Quasi-Newton method for reconstructing the constitutive parameters of three-dimensional (3D) penetrable scatterers from scattered field measurements is presented. This method is adapted for handling large-scale electromagnetic problems while keeping the memory requirement and the time flexibility as low as possible. The forward scattering problem is solved by applying the finite-element tearing and interconnecting full-dual-primal (FETI-FDP2) method which shares the same spirit as the domain decomposition methods for finite element methods. The idea is to split the computational domain into smaller non-overlapping sub-domains in order to simultaneously solve local sub-problems. Various strategies are proposed in order to efficiently couple the inversion algorithm with the FETI-FDP2 method: a separation into permanent and non-permanent subdomains is performed, iterative solvers are favored for resolving the interface problem and a marching-on-in-anything initial guess selection further accelerates the process. The computational burden is also reduced by applying the adjoint state vector methodology. Finally, the inversion algorithm is tested against measurements extracted from the 3D Fresnel database.
Incoherent averaging of phase singularities in speckle-shearing interferometry.
Mantel, Klaus; Nercissian, Vanusch; Lindlein, Norbert
2014-08-01
Interferometric speckle techniques are plagued by the omnipresence of phase singularities, impairing the phase unwrapping process. To reduce the number of phase singularities by physical means, an incoherent averaging of multiple speckle fields may be applied. It turns out, however, that the results may strongly deviate from the expected √N behavior. Using speckle-shearing interferometry as an example, we investigate the mechanism behind the reduction of phase singularities, both by calculations and by computer simulations. Key to an understanding of the reduction mechanism during incoherent averaging is the representation of the physical averaging process in terms of certain vector fields associated with each speckle field.
Predicting Error Bars for QSAR Models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schroeter, Timon; Technische Universitaet Berlin, Department of Computer Science, Franklinstrasse 28/29, 10587 Berlin; Schwaighofer, Anton
2007-09-18
Unfavorable physicochemical properties often cause drug failures. It is therefore important to take lipophilicity and water solubility into account early on in lead discovery. This study presents log D7 models built using Gaussian Process regression, Support Vector Machines, decision trees and ridge regression algorithms based on 14556 drug discovery compounds of Bayer Schering Pharma. A blind test was conducted using 7013 new measurements from the last months. We also present independent evaluations using public data. Apart from accuracy, we discuss the quality of error bars that can be computed by Gaussian Process models, and ensemble and distance based techniques for the other modelling approaches.
Elastic and acoustic wavefield decompositions and application to reverse time migrations
NASA Astrophysics Data System (ADS)
Wang, Wenlong
P- and S-waves coexist in elastic wavefields, and separation between them is an essential step in elastic reverse-time migrations (RTMs). Unlike the traditional separation methods that use curl and divergence operators, which do not preserve the wavefield vector component information, we propose and compare two vector decomposition methods, which preserve the same vector components that exist in the input elastic wavefield. The amplitude and phase information is automatically preserved, so no amplitude or phase corrections are required. The decoupled propagation method is extended from elastic to viscoelastic wavefields. To use the decomposed P and S vector wavefields and generate PP and PS images, we create a new 2D migration context for isotropic, elastic RTM which includes PS vector decomposition; the propagation directions of both incident and reflected P- and S-waves are calculated directly from the stress and particle velocity definitions of the decomposed P- and S-wave Poynting vectors. Then an excitation-amplitude image condition that scales the receiver wavelet by the source vector magnitude produces angle-dependent images of PP and PS reflection coefficients with the correct polarities, polarization, and amplitudes. It thus simplifies the process of obtaining PP and PS angle-domain common-image gathers (ADCIGs); it is less effort to generate ADCIGs from vector data than from scalar data. Besides P- and S-waves decomposition, separations of up- and down-going waves are also a part of processing of multi-component recorded data and propagating wavefields. A complex trace based up/down separation approach is extended from acoustic to elastic, and combined with P- and S-wave decomposition by decoupled propagation. This eliminates the need for a Fourier transform over time, thereby significantly reducing the storage cost and improving computational efficiency. Wavefield decomposition is applied to both synthetic elastic VSP data and propagating wavefield snapshots. Poynting vectors obtained from the particle-velocity and stress fields after P/S and up/down decompositions are much more accurate than those without. The up/down separation algorithm is also applicable in acoustic RTMs, where both (forward-time extrapolated) source and (reverse-time extrapolated) receiver wavefields are decomposed into up-going and down-going parts. Together with the crosscorrelation imaging condition, four images (down-up, up-down, up-up and down-down) are generated, which facilitate the analysis of artifacts and the imaging ability of the four images. Artifacts may exist in all the decomposed images, but their positions and types are different. The causes of artifacts in different images are explained and illustrated with sketches and numerical tests.
Motion camera based on a custom vision sensor and an FPGA architecture
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel
1998-09-01
A digital camera for custom focal plane arrays was developed. The camera allows the test and development of analog or mixed-mode arrays for focal plane processing. The camera is used with a custom sensor for motion detection to implement a motion computation system. The custom focal plane sensor detects moving edges at the pixel level using analog VLSI techniques. The sensor communicates motion events using the event-address protocol associated to a temporal reference. In a second stage, a coprocessing architecture based on a field programmable gate array (FPGA) computes the time-of-travel between adjacent pixels. The FPGA allows rapid prototyping and flexible architecture development. Furthermore, the FPGA interfaces the sensor to a compact PC computer which is used for high level control and data communication to the local network. The camera could be used in applications such as self-guided vehicles, mobile robotics and smart surveillance systems. The programmability of the FPGA allows the exploration of further signal processing like spatial edge detection or image segmentation tasks. The article details the motion algorithm, the sensor architecture, the use of the event- address protocol for velocity vector computation and the FPGA architecture used in the motion camera system.
NASA Technical Reports Server (NTRS)
Schmidt, R. F.
1971-01-01
Some results obtained with a digital computer program written at Goddard Space Flight Center to obtain electromagnetic fields scattered by perfectly reflecting surfaces are presented. For purposes of illustration a paraboloidal reflector was illuminated at radio frequencies in the simulation for both receiving and transmitting modes of operation. Fields were computed in the Fresnel and Fraunhofer regions. A dual-reflector system (Cassegrain) was also simulated for the transmitting case, and fields were computed in the Fraunhofer region. Appended results include derivations which show that the vector Kirchhoff-Kottler formulation has an equivalent form requiring only incident magnetic fields as a driving function. Satisfaction of the radiation conditions at infinity by the equivalent form is demonstrated by a conversion from Cartesian to spherical vector operators. A subsequent development presents the formulation by which Fresnel or Fraunhofer patterns are obtainable for dual-reflector systems. A discussion of the time-average Poynting vector is also appended.
NASA Astrophysics Data System (ADS)
Fei, Cheng-Wei; Bai, Guang-Chen
2014-12-01
To improve the computational precision and efficiency of probabilistic design for mechanical dynamic assemblies such as the blade-tip radial running clearance (BTRRC) of gas turbines, a distribution collaborative probabilistic design method based on support vector machine regression (called DCSRM) is proposed by integrating a distribution collaborative response surface method with a support vector machine regression model. The mathematical model of DCSRM is established and the probabilistic design idea of DCSRM is introduced. The dynamic assembly probabilistic design of an aeroengine high-pressure turbine (HPT) BTRRC is accomplished to verify the proposed DCSRM. The analysis results reveal that the optimal static blade-tip clearance of the HPT is obtained for designing the BTRRC, improving the performance and reliability of the aeroengine. The comparison of methods shows that DCSRM has high computational accuracy and high computational efficiency in BTRRC probabilistic analysis. The present research offers an effective way for the reliability design of mechanical dynamic assemblies and enriches mechanical reliability theory and methods.
Support vector machine firefly algorithm based optimization of lens system.
Shamshirband, Shahaboddin; Petković, Dalibor; Pavlović, Nenad T; Ch, Sudheer; Altameem, Torki A; Gani, Abdullah
2015-01-01
Lens system design is an important factor in image quality. The main aspect of the lens system design methodology is the optimization procedure. Since optimization is a complex, nonlinear task, soft computing optimization algorithms can be used. There are many tools that can be employed to measure optical performance, but the spot diagram is the most useful. The spot diagram gives an indication of the image of a point object. In this paper, the spot size radius is considered as an optimization criterion. An intelligent soft computing scheme, support vector machines (SVMs) coupled with the firefly algorithm (FFA), is implemented. The performance of the proposed estimators is confirmed with the simulation results. The results of the proposed SVM-FFA model have been compared with support vector regression (SVR), artificial neural networks, and genetic programming methods. The results show that the SVM-FFA model performs more accurately than the other methodologies. Therefore, SVM-FFA can be used as an efficient soft computing technique in the optimization of lens system designs.
Computer-Assisted Transgenesis of Caenorhabditis elegans for Deep Phenotyping
Gilleland, Cody L.; Falls, Adam T.; Noraky, James; Heiman, Maxwell G.; Yanik, Mehmet F.
2015-01-01
A major goal in the study of human diseases is to assign functions to genes or genetic variants. The model organism Caenorhabditis elegans provides a powerful tool because homologs of many human genes are identifiable, and large collections of genetic vectors and mutant strains are available. However, the delivery of such vector libraries into mutant strains remains a long-standing experimental bottleneck for phenotypic analysis. Here, we present a computer-assisted microinjection platform to streamline the production of transgenic C. elegans with multiple vectors for deep phenotyping. Briefly, animals are immobilized in a temperature-sensitive hydrogel using a standard multiwell platform. Microinjections are then performed under control of an automated microscope using precision robotics driven by customized computer vision algorithms. We demonstrate utility by phenotyping the morphology of 12 neuronal classes in six mutant backgrounds using combinations of neuron-type-specific fluorescent reporters. This technology can industrialize the assignment of in vivo gene function by enabling large-scale transgenic engineering. PMID:26163188
Agulleiro, Jose-Ignacio; Fernandez, Jose-Jesus
2015-01-01
Cache blocking is a technique widely used in scientific computing to minimize the exchange of information with main memory by reusing the data kept in cache memory. In tomographic reconstruction on standard computers using vector instructions, cache blocking turns out to be central to optimize performance. To this end, sinograms of the tilt-series and slices of the volumes to be reconstructed have to be divided into small blocks that fit into the different levels of cache memory. The code is then reorganized so as to operate with a block as much as possible before proceeding with another one. This data article is related to the research article titled Tomo3D 2.0 – Exploitation of Advanced Vector eXtensions (AVX) for 3D reconstruction (Agulleiro and Fernandez, 2015) [1]. Here we present data of a thorough study of the performance of tomographic reconstruction by varying cache block sizes, which allows derivation of expressions for their automatic quasi-optimal tuning. PMID:26217710
Agulleiro, Jose-Ignacio; Fernandez, Jose-Jesus
2015-06-01
Cache blocking is a technique widely used in scientific computing to minimize the exchange of information with main memory by reusing the data kept in cache memory. In tomographic reconstruction on standard computers using vector instructions, cache blocking turns out to be central to optimize performance. To this end, sinograms of the tilt-series and slices of the volumes to be reconstructed have to be divided into small blocks that fit into the different levels of cache memory. The code is then reorganized so as to operate with a block as much as possible before proceeding with another one. This data article is related to the research article titled Tomo3D 2.0 - Exploitation of Advanced Vector eXtensions (AVX) for 3D reconstruction (Agulleiro and Fernandez, 2015) [1]. Here we present data of a thorough study of the performance of tomographic reconstruction by varying cache block sizes, which allows derivation of expressions for their automatic quasi-optimal tuning.
Krylov subspace methods for computing hydrodynamic interactions in Brownian dynamics simulations
Ando, Tadashi; Chow, Edmond; Saad, Yousef; Skolnick, Jeffrey
2012-01-01
Hydrodynamic interactions play an important role in the dynamics of macromolecules. The most common way to take into account hydrodynamic effects in molecular simulations is in the context of a Brownian dynamics simulation. However, the calculation of correlated Brownian noise vectors in these simulations is computationally very demanding and alternative methods are desirable. This paper studies methods based on Krylov subspaces for computing Brownian noise vectors. These methods are related to Chebyshev polynomial approximations, but do not require eigenvalue estimates. We show that only low accuracy is required in the Brownian noise vectors to accurately compute values of dynamic and static properties of polymer and monodisperse suspension models. With this level of accuracy, the computational time of Krylov subspace methods scales very nearly as O(N2) for the number of particles N up to 10 000, which was the limit tested. The performance of the Krylov subspace methods, especially the “block” version, is slightly better than that of the Chebyshev method, even without taking into account the additional cost of eigenvalue estimates required by the latter. Furthermore, at N = 10 000, the Krylov subspace method is 13 times faster than the exact Cholesky method. Thus, Krylov subspace methods are recommended for performing large-scale Brownian dynamics simulations with hydrodynamic interactions. PMID:22897254
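The flavor of these Krylov methods can be conveyed by the standard Lanczos approximation of D^(1/2) z for a symmetric positive-definite diffusion tensor D accessed only through matrix-vector products. This is a generic single-vector sketch of my own simplification, not the paper's block algorithm, and the explicit test matrix is a stand-in for the hydrodynamic tensor.

import numpy as np

def lanczos_sqrt_apply(matvec, z, m=30):
    """Approximate D^(1/2) z with an m-step Lanczos recurrence, D SPD and available
    only via matvec(v). Generic sketch of the Krylov-subspace idea."""
    n = z.size
    Q = np.zeros((n, m))
    alpha = np.zeros(m)
    beta = np.zeros(m)
    Q[:, 0] = z / np.linalg.norm(z)
    for j in range(m):
        w = matvec(Q[:, j])
        if j > 0:
            w -= beta[j - 1] * Q[:, j - 1]
        alpha[j] = Q[:, j] @ w
        w -= alpha[j] * Q[:, j]
        w -= Q[:, : j + 1] @ (Q[:, : j + 1].T @ w)    # full reorthogonalization
        if j + 1 < m:
            beta[j] = np.linalg.norm(w)
            if beta[j] < 1e-12:
                m = j + 1
                break
            Q[:, j + 1] = w / beta[j]
    # sqrt(T) e1 via the eigendecomposition of the small tridiagonal matrix T.
    T = np.diag(alpha[:m]) + np.diag(beta[: m - 1], 1) + np.diag(beta[: m - 1], -1)
    evals, evecs = np.linalg.eigh(T)
    f_T_e1 = evecs @ (np.sqrt(np.clip(evals, 0.0, None)) * evecs[0, :])
    return np.linalg.norm(z) * (Q[:, :m] @ f_T_e1)

rng = np.random.default_rng(0)
B = rng.normal(size=(200, 200))
D = B @ B.T + 200 * np.eye(200)            # SPD stand-in for the diffusion tensor
z = rng.normal(size=200)
y = lanczos_sqrt_apply(lambda v: D @ v, z, m=40)
print(abs(y @ y - z @ (D @ z)))            # y^T y should match z^T D z when y = D^(1/2) z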
DOE Office of Scientific and Technical Information (OSTI.GOV)
Agarwal, Sapan; Quach, Tu -Thach; Parekh, Ojas
In this study, the exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-based architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.
Agarwal, Sapan; Quach, Tu -Thach; Parekh, Ojas; ...
2016-01-06
In this study, the exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational properties of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an N × N crossbar, these two kernels can be O(N) more energy efficient than a conventional digital memory-based architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1)). These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N) reduction in energy for the entire algorithm when run with finite precision. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.
NASA Astrophysics Data System (ADS)
Lee, M.; Leiter, K.; Eisner, C.; Breuer, A.; Wang, X.
2017-09-01
In this work, we investigate a block Jacobi-Davidson (J-D) variant suitable for sparse symmetric eigenproblems where a substantial number of extremal eigenvalues are desired (e.g., ground-state real-space quantum chemistry). Most J-D algorithm variations tend to slow down as the number of desired eigenpairs increases due to frequent orthogonalization against a growing list of solved eigenvectors. In our specification of block J-D, all of the steps of the algorithm are performed in clusters, including the linear solves, which allows us to greatly reduce computational effort with blocked matrix-vector multiplies. In addition, we move orthogonalization against locked eigenvectors and working eigenvectors outside of the inner loop but retain the single Ritz vector projection corresponding to the index of the correction vector. Furthermore, we minimize the computational effort by constraining the working subspace to the current vectors being updated and the latest set of corresponding correction vectors. Finally, we incorporate accuracy thresholds based on the precision required by the Fermi-Dirac distribution. The net result is a significant reduction in the computational effort against most previous block J-D implementations, especially as the number of wanted eigenpairs grows. We compare our approach with another robust implementation of block J-D (JDQMR) and the state-of-the-art Chebyshev filter subspace (CheFSI) method for various real-space density functional theory systems. Versus CheFSI, for first-row elements, our method yields competitive timings for valence-only systems and 4-6× speedups for all-electron systems with up to 10× reduced matrix-vector multiplies. For all-electron calculations on larger elements (e.g., gold) where the wanted spectrum is quite narrow compared to the full spectrum, we observe 60× speedup with 200× fewer matrix-vector multiples vs. CheFSI.
Lee, M; Leiter, K; Eisner, C; Breuer, A; Wang, X
2017-09-21
In this work, we investigate a block Jacobi-Davidson (J-D) variant suitable for sparse symmetric eigenproblems where a substantial number of extremal eigenvalues are desired (e.g., ground-state real-space quantum chemistry). Most J-D algorithm variations tend to slow down as the number of desired eigenpairs increases due to frequent orthogonalization against a growing list of solved eigenvectors. In our specification of block J-D, all of the steps of the algorithm are performed in clusters, including the linear solves, which allows us to greatly reduce computational effort with blocked matrix-vector multiplies. In addition, we move orthogonalization against locked eigenvectors and working eigenvectors outside of the inner loop but retain the single Ritz vector projection corresponding to the index of the correction vector. Furthermore, we minimize the computational effort by constraining the working subspace to the current vectors being updated and the latest set of corresponding correction vectors. Finally, we incorporate accuracy thresholds based on the precision required by the Fermi-Dirac distribution. The net result is a significant reduction in the computational effort against most previous block J-D implementations, especially as the number of wanted eigenpairs grows. We compare our approach with another robust implementation of block J-D (JDQMR) and the state-of-the-art Chebyshev filter subspace (CheFSI) method for various real-space density functional theory systems. Versus CheFSI, for first-row elements, our method yields competitive timings for valence-only systems and 4-6× speedups for all-electron systems with up to 10× reduced matrix-vector multiplies. For all-electron calculations on larger elements (e.g., gold) where the wanted spectrum is quite narrow compared to the full spectrum, we observe 60× speedup with 200× fewer matrix-vector multiples vs. CheFSI.
The Advanced Software Development and Commercialization Project
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gallopoulos, E.; Canfield, T.R.; Minkoff, M.
1990-09-01
This is the first of a series of reports pertaining to progress in the Advanced Software Development and Commercialization Project, a joint collaborative effort between the Center for Supercomputing Research and Development of the University of Illinois and the Computing and Telecommunications Division of Argonne National Laboratory. The purpose of this work is to apply techniques of parallel computing that were pioneered by University of Illinois researchers to mature computational fluid dynamics (CFD) and structural dynamics (SD) computer codes developed at Argonne. The collaboration in this project will bring this unique combination of expertise to bear, for the first time, on industrially important problems. By so doing, it will expose the strengths and weaknesses of existing techniques for parallelizing programs and will identify those problems that need to be solved in order to enable widespread production use of parallel computers. Secondly, the increased efficiency of the CFD and SD codes themselves will enable the simulation of larger, more accurate engineering models that involve fluid and structural dynamics. In order to realize the above two goals, we are considering two production codes that have been developed at ANL and are widely used by both industry and Universities. These are COMMIX and WHAMS-3D. The first is a computational fluid dynamics code that is used for both nuclear reactor design and safety and as a design tool for the casting industry. The second is a three-dimensional structural dynamics code used in nuclear reactor safety as well as crashworthiness studies. These codes are currently available for both sequential and vector computers only. Our main goal is to port and optimize these two codes on shared memory multiprocessors. In so doing, we shall establish a process that can be followed in optimizing other sequential or vector engineering codes for parallel processors.
A vectorial semantics approach to personality assessment.
Neuman, Yair; Cohen, Yochai
2014-04-23
Personality assessment and, specifically, the assessment of personality disorders have traditionally been indifferent to computational models. Computational personality is a new field that involves the automatic classification of individuals' personality traits that can be compared against gold-standard labels. In this context, we introduce a new vectorial semantics approach to personality assessment, which involves the construction of vectors representing personality dimensions and disorders, and the automatic measurements of the similarity between these vectors and texts written by human subjects. We evaluated our approach by using a corpus of 2468 essays written by students who were also assessed through the five-factor personality model. To validate our approach, we measured the similarity between the essays and the personality vectors to produce personality disorder scores. These scores and their correspondence with the subjects' classification of the five personality factors reproduce patterns well-documented in the psychological literature. In addition, we show that, based on the personality vectors, we can predict each of the five personality factors with high accuracy.
A Vectorial Semantics Approach to Personality Assessment
NASA Astrophysics Data System (ADS)
Neuman, Yair; Cohen, Yochai
2014-04-01
Personality assessment and, specifically, the assessment of personality disorders have traditionally been indifferent to computational models. Computational personality is a new field that involves the automatic classification of individuals' personality traits that can be compared against gold-standard labels. In this context, we introduce a new vectorial semantics approach to personality assessment, which involves the construction of vectors representing personality dimensions and disorders, and the automatic measurements of the similarity between these vectors and texts written by human subjects. We evaluated our approach by using a corpus of 2468 essays written by students who were also assessed through the five-factor personality model. To validate our approach, we measured the similarity between the essays and the personality vectors to produce personality disorder scores. These scores and their correspondence with the subjects' classification of the five personality factors reproduce patterns well-documented in the psychological literature. In addition, we show that, based on the personality vectors, we can predict each of the five personality factors with high accuracy.
NASA Technical Reports Server (NTRS)
Charlesworth, Arthur
1990-01-01
The nondeterministic divide partitions a vector into two non-empty slices by allowing the point of division to be chosen nondeterministically. Support for high-level divide-and-conquer programming provided by the nondeterministic divide is investigated. A diva algorithm is a recursive divide-and-conquer sequential algorithm on one or more vectors of the same range, whose division point for a new pair of recursive calls is chosen nondeterministically before any computation is performed and whose recursive calls are made immediately after the choice of division point; also, access to vector components is only permitted during activations in which the vector parameters have unit length. The notion of diva algorithm is formulated precisely as a diva call, a restricted call on a sequential procedure. Diva calls are proven to be intimately related to associativity. Numerous applications of diva calls are given and strategies are described for translating a diva call into code for a variety of parallel computers. Thus diva algorithms separate logical correctness concerns from implementation concerns.
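A toy rendering of the idea may help: a recursive reduction in which the split point of each non-unit slice is chosen arbitrarily before any computation, and elements are touched only at unit length. Correctness of the sum below for every choice of division points rests on associativity, as the abstract notes; the code is my illustration, not taken from the paper.

import random

def diva_sum(v, lo=0, hi=None):
    """Nondeterministic-divide reduction: split a non-empty slice v[lo:hi] at an
    arbitrarily chosen point, recurse on both halves, and combine with '+'.
    Elements are accessed only when the slice has unit length."""
    if hi is None:
        hi = len(v)
    if hi - lo == 1:
        return v[lo]                          # unit-length slice: the only element access
    mid = random.randint(lo + 1, hi - 1)      # nondeterministic division point
    return diva_sum(v, lo, mid) + diva_sum(v, mid, hi)

data = [3, 1, 4, 1, 5, 9, 2, 6]
print(diva_sum(data), sum(data))              # same answer for every division choice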
A Vectorial Semantics Approach to Personality Assessment
Neuman, Yair; Cohen, Yochai
2014-01-01
Personality assessment and, specifically, the assessment of personality disorders have traditionally been indifferent to computational models. Computational personality is a new field that involves the automatic classification of individuals' personality traits that can be compared against gold-standard labels. In this context, we introduce a new vectorial semantics approach to personality assessment, which involves the construction of vectors representing personality dimensions and disorders, and the automatic measurements of the similarity between these vectors and texts written by human subjects. We evaluated our approach by using a corpus of 2468 essays written by students who were also assessed through the five-factor personality model. To validate our approach, we measured the similarity between the essays and the personality vectors to produce personality disorder scores. These scores and their correspondence with the subjects' classification of the five personality factors reproduce patterns well-documented in the psychological literature. In addition, we show that, based on the personality vectors, we can predict each of the five personality factors with high accuracy. PMID:24755833
Crosstalk Cancellation for a Simultaneous Phase Shifting Interferometer
NASA Technical Reports Server (NTRS)
Olczak, Eugene (Inventor)
2014-01-01
A method of minimizing fringe print-through in a phase-shifting interferometer, includes the steps of: (a) determining multiple transfer functions of pixels in the phase-shifting interferometer; (b) computing a crosstalk term for each transfer function; and (c) displaying, to a user, a phase-difference map using the crosstalk terms computed in step (b). Determining a transfer function in step (a) includes measuring intensities of a reference beam and a test beam at the pixels, and measuring an optical path difference between the reference beam and the test beam at the pixels. Computing crosstalk terms in step (b) includes computing an N-dimensional vector, where N corresponds to the number of transfer functions, and the N-dimensional vector is obtained by minimizing a variance of a modulation function in phase shifted images.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, K; Able, A
Purpose: To evaluate an Enhanced Dynamic Wedge (EDW) as part of the machine commissioning process with a feature study. Methods: The EDW system in this study was from a Truebeam, a linear accelerator manufactured by Varian Medical Systems. The EDW feature vector includes selected elements: dosimetric output spot checks, field size, wedge angles, dose rate, collimator orientation, and different energy settings. Point dose measurements were made with a PTW Farmer chamber, and profiles were measured with Gafchromic EBT2 films positioned at different depths in the Solidwater phantom according to the study elements. The output spot measurements were made with the PTW Farmer chamber in the Solidwater setup for all orientations and wedge angles in the EDW system. The profile comparisons were performed with the IMRT measurement function in RIT software version 6.3, and the films were scanned with a Vidar scanner. Dosimetry calculations were performed using the same Solidwater phantom scanned by a GE LightSpeed CT in the Eclipse Treatment Planning System (TPS). Measurements were then compared to simulation results in the TPS. Results: The average percentage difference in energy between chamber measurements and the TPS was 0.16% with a standard deviation (SD) of 0.93%. For the selected features, the average percentage difference between film measurements and computation was 0.93% (SD 1.55%) for horizontal profiles and 1.18% (SD 0.98%) for vertical profiles. The average gamma difference between film measurements and TPS computed results was 0.924 with an SD of 0.314. Conclusion: A feature vector was developed to describe the commissioning of the EDW, and developing a complete set of features sufficient for commissioning a LINAC function could provide an optimal commissioning instance with an acceptable confidence level for clinical application of the machine. Given the institution-specific vector pattern and big-data processing, it could provide wide-ranging clinical outcome comparison information for the application of EDW.
Adjudicating between face-coding models with individual-face fMRI responses
Kriegeskorte, Nikolaus
2017-01-01
The perceptual representation of individual faces is often explained with reference to a norm-based face space. In such spaces, individuals are encoded as vectors where identity is primarily conveyed by direction and distinctiveness by eccentricity. Here we measured human fMRI responses and psychophysical similarity judgments of individual face exemplars, which were generated as realistic 3D animations using a computer-graphics model. We developed and evaluated multiple neurobiologically plausible computational models, each of which predicts a representational distance matrix and a regional-mean activation profile for 24 face stimuli. In the fusiform face area, a face-space coding model with sigmoidal ramp tuning provided a better account of the data than one based on exemplar tuning. However, an image-processing model with weighted banks of Gabor filters performed similarly. Accounting for the data required the inclusion of a measurement-level population averaging mechanism that approximates how fMRI voxels locally average distinct neuronal tunings. Our study demonstrates the importance of comparing multiple models and of modeling the measurement process in computational neuroimaging. PMID:28746335
Two-dimensional nonsteady viscous flow simulation on the Navier-Stokes computer miniNode
NASA Technical Reports Server (NTRS)
Nosenchuck, Daniel M.; Littman, Michael G.; Flannery, William
1986-01-01
The needs of large-scale scientific computation are outpacing the growth in performance of mainframe supercomputers. In particular, problems in fluid mechanics involving complex flow simulations require far more speed and capacity than that provided by current and proposed Class VI supercomputers. To address this concern, the Navier-Stokes Computer (NSC) was developed. The NSC is a parallel-processing machine, comprised of individual Nodes, each comparable in performance to current supercomputers. The global architecture is that of a hypercube, and a 128-Node NSC has been designed. New architectural features, such as a reconfigurable many-function ALU pipeline and a multifunction memory-ALU switch, have provided the capability to efficiently implement a wide range of algorithms. Efficient algorithms typically involve numerically intensive tasks, which often include conditional operations. These operations may be efficiently implemented on the NSC without, in general, sacrificing vector-processing speed. To illustrate the architecture, programming, and several of the capabilities of the NSC, the simulation of two-dimensional, nonsteady viscous flows on a prototype Node, called the miniNode, is presented.
Stewart, Terrence C; Eliasmith, Chris
2013-06-01
Quantum probability (QP) theory can be seen as a type of vector symbolic architecture (VSA): mental states are vectors storing structured information and manipulated using algebraic operations. Furthermore, the operations needed by QP match those in other VSAs. This allows existing biologically realistic neural models to be adapted to provide a mechanistic explanation of the cognitive phenomena described in the target article by Pothos & Busemeyer (P&B).
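As a concrete example of the algebraic operations shared by QP and VSAs, circular convolution is one standard binding operation in such architectures (e.g., holographic reduced representations). The sketch below binds a role vector to a filler and approximately recovers the filler; the random vectors and dimensionality are my own illustrative choices.

import numpy as np

def cconv(a, b):
    """Circular convolution: a common VSA binding operation."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def ccorr(a, b):
    """Circular correlation: the approximate inverse (unbinding) of cconv."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

d = 512
rng = np.random.default_rng(0)
role = rng.normal(0, 1 / np.sqrt(d), d)
filler = rng.normal(0, 1 / np.sqrt(d), d)

bound = cconv(role, filler)                   # structured state stored as one vector
recovered = ccorr(role, bound)                # noisy copy of the filler

cos = recovered @ filler / (np.linalg.norm(recovered) * np.linalg.norm(filler))
print(f"cosine(recovered, filler) = {cos:.2f}")   # close to 1 for large d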
Achieving High Performance on the i860 Microprocessor
NASA Technical Reports Server (NTRS)
Lee, King; Kutler, Paul (Technical Monitor)
1998-01-01
The i860 is a high performance microprocessor used in the Intel Touchstone project. This paper proposes a paradigm for programming the i860 that is modelled on the vector instructions of the Cray computers. Fortran-callable assembler subroutines were written that mimic the concurrent vector instructions of the Cray. Cache takes the place of vector registers. Using this paradigm we have achieved twice the performance of compiled code on a traditional solver.
FPGA Implementation of Generalized Hebbian Algorithm for Texture Classification
Lin, Shiow-Jyu; Hwang, Wen-Jyi; Lee, Wei-Hao
2012-01-01
This paper presents a novel hardware architecture for principal component analysis. The architecture is based on the Generalized Hebbian Algorithm (GHA) because of its simplicity and effectiveness. The architecture is separated into three portions: the weight vector updating unit, the principal computation unit and the memory unit. In the weight vector updating unit, the computation of different synaptic weight vectors shares the same circuit for reducing the area costs. To show the effectiveness of the circuit, a texture classification system based on the proposed architecture is physically implemented by Field Programmable Gate Array (FPGA). It is embedded in a System-On-Programmable-Chip (SOPC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient design for attaining both high speed performance and low area costs. PMID:22778640
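As a rough software analogue of the weight-updating unit, a minimal NumPy sketch of the Generalized Hebbian Algorithm (Sanger's rule) is given below; the learning rate, data, and number of components are illustrative assumptions rather than parameters of the FPGA design.

```python
import numpy as np

def gha(X, n_components=3, lr=1e-3, epochs=50, seed=0):
    """Generalized Hebbian Algorithm: rows of W converge to the leading principal directions of X."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.01, size=(n_components, X.shape[1]))
    for _ in range(epochs):
        for x in X:
            y = W @ x
            # Sanger's rule: dW = lr * (y x^T - lower_triangular(y y^T) W)
            W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 8)) * np.array([3.0, 2.0, 1.0, 0.5, 0.4, 0.3, 0.2, 0.1])
X -= X.mean(axis=0)                 # GHA assumes zero-mean inputs
W = gha(X)
print(np.round(W @ W.T, 2))         # close to the identity: rows are near-orthonormal directions
```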
NASA Astrophysics Data System (ADS)
Wang, Qiqi; Rigas, Georgios; Esclapez, Lucas; Magri, Luca; Blonigan, Patrick
2016-11-01
Bluff body flows are of fundamental importance to many engineering applications involving massive flow separation and in particular the transport industry. Coherent flow structures emanating in the wake of three-dimensional bluff bodies, such as cars, trucks and lorries, are directly linked to increased aerodynamic drag, noise and structural fatigue. For low-Reynolds-number laminar and transitional regimes, hydrodynamic stability theory has aided the understanding and prediction of the unstable dynamics. In the same framework, sensitivity analysis provides the means for efficient and optimal control, provided the unstable modes can be accurately predicted. However, these methodologies are limited to laminar regimes where only a few unstable modes manifest. Here we extend the stability analysis to low-dimensional chaotic regimes by computing the Lyapunov covariant vectors and their associated Lyapunov exponents. We compare them to eigenvectors and eigenvalues computed in traditional hydrodynamic stability analysis. Computing Lyapunov covariant vectors and Lyapunov exponents also enables the extension of sensitivity analysis to chaotic flows via the shadowing method. We compare the computed shadowing sensitivities to traditional sensitivity analysis. These Lyapunov-based methodologies do not rely on mean flow assumptions, and are mathematically rigorous for calculating sensitivities of fully unsteady flow simulations.
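The Lyapunov machinery referred to above can be illustrated on a low-dimensional surrogate. The sketch below estimates Lyapunov exponents of the Lorenz system with the standard tangent-space QR (Benettin-type) procedure; it is a toy stand-in for the flow computations in the abstract, with step size and integrator chosen only for brevity.

```python
import numpy as np

def lorenz(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    return np.array([sigma * (x[1] - x[0]),
                     x[0] * (rho - x[2]) - x[1],
                     x[0] * x[1] - beta * x[2]])

def jacobian(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    return np.array([[-sigma, sigma, 0.0],
                     [rho - x[2], -1.0, -x[0]],
                     [x[1], x[0], -beta]])

def lyapunov_exponents(x0, dt=1e-3, n_steps=200_000):
    x, Q = np.asarray(x0, float), np.eye(3)
    sums = np.zeros(3)
    for _ in range(n_steps):
        x = x + dt * lorenz(x)            # Euler step for the trajectory (illustrative only)
        Q = Q + dt * jacobian(x) @ Q      # evolve tangent vectors alongside the trajectory
        Q, R = np.linalg.qr(Q)            # re-orthonormalize and accumulate growth rates
        sums += np.log(np.abs(np.diag(R)))
    return sums / (n_steps * dt)

print(lyapunov_exponents([1.0, 1.0, 1.0]))  # roughly (0.9, 0.0, -14.6) for these parameters
```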
Entanglement-Based Machine Learning on a Quantum Computer
NASA Astrophysics Data System (ADS)
Cai, X.-D.; Wu, D.; Su, Z.-E.; Chen, M.-C.; Wang, X.-L.; Li, Li; Liu, N.-L.; Lu, C.-Y.; Pan, J.-W.
2015-03-01
Machine learning, a branch of artificial intelligence, learns from previous experience to optimize performance, which is ubiquitous in various fields such as computer sciences, financial analysis, robotics, and bioinformatics. A challenge is that machine learning with the rapidly growing "big data" could become intractable for classical computers. Recently, quantum machine learning algorithms [Lloyd, Mohseni, and Rebentrost, arXiv.1307.0411] were proposed which could offer an exponential speedup over classical algorithms. Here, we report the first experimental entanglement-based classification of two-, four-, and eight-dimensional vectors to different clusters using a small-scale photonic quantum computer, which are then used to implement supervised and unsupervised machine learning. The results demonstrate the working principle of using quantum computers to manipulate and classify high-dimensional vectors, the core mathematical routine in machine learning. The method can, in principle, be scaled to larger numbers of qubits, and may provide a new route to accelerate machine learning.
Polarization ellipse and Stokes parameters in geometric algebra.
Santos, Adler G; Sugon, Quirino M; McNamara, Daniel J
2012-01-01
In this paper, we use geometric algebra to describe the polarization ellipse and Stokes parameters. We show that a solution to Maxwell's equation is a product of a complex basis vector in Jackson and a linear combination of plane wave functions. We convert both the amplitudes and the wave function arguments from complex scalars to complex vectors. This conversion allows us to separate the electric field vector and the imaginary magnetic field vector, because exponentials of imaginary scalars convert vectors to imaginary vectors and vice versa, while exponentials of imaginary vectors only rotate the vector or imaginary vector they are multiplied to. We convert this expression for polarized light into two other representations: the Cartesian representation and the rotated ellipse representation. We compute the conversion relations among the representation parameters and their corresponding Stokes parameters. And finally, we propose a set of geometric relations between the electric and magnetic fields that satisfy an equation similar to the Poincaré sphere equation.
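For reference, the conventional (non-geometric-algebra) route from complex field amplitudes to Stokes parameters and ellipse angles is sketched below; the sign conventions are one common choice and are assumptions here, not necessarily those of the paper.

```python
import numpy as np

def stokes(Ex, Ey):
    """Stokes parameters from complex transverse field amplitudes (one common sign convention)."""
    S0 = abs(Ex) ** 2 + abs(Ey) ** 2
    S1 = abs(Ex) ** 2 - abs(Ey) ** 2
    S2 = 2.0 * np.real(Ex * np.conj(Ey))
    S3 = -2.0 * np.imag(Ex * np.conj(Ey))
    return S0, S1, S2, S3

# Example: circular polarization (Ey lags Ex by 90 degrees)
Ex, Ey = 1.0, 1.0j
S0, S1, S2, S3 = stokes(Ex, Ey)
psi = 0.5 * np.arctan2(S2, S1)      # orientation angle of the polarization ellipse
chi = 0.5 * np.arcsin(S3 / S0)      # ellipticity angle
print(S0, S1, S2, S3, np.degrees(psi), np.degrees(chi))
```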
A Review of Tensors and Tensor Signal Processing
NASA Astrophysics Data System (ADS)
Cammoun, L.; Castaño-Moraga, C. A.; Muñoz-Moreno, E.; Sosa-Cabrera, D.; Acar, B.; Rodriguez-Florido, M. A.; Brun, A.; Knutsson, H.; Thiran, J. P.
Tensors have been broadly used in mathematics and physics, since they are a generalization of scalars or vectors and allow to represent more complex properties. In this chapter we present an overview of some tensor applications, especially those focused on the image processing field. From a mathematical point of view, a lot of work has been developed about tensor calculus, which obviously is more complex than scalar or vectorial calculus. Moreover, tensors can represent the metric of a vector space, which is very useful in the field of differential geometry. In physics, tensors have been used to describe several magnitudes, such as the strain or stress of materials. In solid mechanics, tensors are used to define the generalized Hooke’s law, where a fourth order tensor relates the strain and stress tensors. In fluid dynamics, the velocity gradient tensor provides information about the vorticity and the strain of the fluids. Also an electromagnetic tensor is defined, that simplifies the notation of the Maxwell equations. But tensors are not constrained to physics and mathematics. They have been used, for instance, in medical imaging, where we can highlight two applications: the diffusion tensor image, which represents how molecules diffuse inside the tissues and is broadly used for brain imaging; and the tensorial elastography, which computes the strain and vorticity tensor to analyze the tissues properties. Tensors have also been used in computer vision to provide information about the local structure or to define anisotropic image filters.
Texture Feature Extraction and Classification for Iris Diagnosis
NASA Astrophysics Data System (ADS)
Ma, Lin; Li, Naimin
Applying computer-aided techniques in iris image processing, and combining occidental iridology with traditional Chinese medicine, is a challenging research area in digital image processing and artificial intelligence. This paper proposes an iridology model that consists of iris image pre-processing, texture feature analysis and disease classification. For the pre-processing, a 2-step iris localization approach is proposed; a 2-D Gabor filter based texture analysis and a texture fractal dimension estimation method are proposed for pathological feature extraction; and finally support vector machines are constructed to recognize 2 typical diseases, namely alimentary canal disease and nervous system disease. Experimental results show that the proposed iridology diagnosis model is quite effective and promising for medical diagnosis and health surveillance for both hospital and public use.
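A compact sketch of the feature-extraction-plus-classification pipeline outlined above, using scikit-image Gabor filters and a scikit-learn SVM on synthetic textures; filter frequencies, statistics, and labels are illustrative assumptions rather than the paper's actual settings.

```python
import numpy as np
from skimage.filters import gabor
from sklearn.svm import SVC

def gabor_features(image, frequencies=(0.1, 0.2, 0.3), thetas=(0, np.pi / 4, np.pi / 2)):
    """Mean and variance of Gabor filter response magnitudes as a texture feature vector."""
    feats = []
    for f in frequencies:
        for theta in thetas:
            real, imag = gabor(image, frequency=f, theta=theta)
            mag = np.hypot(real, imag)
            feats.extend([mag.mean(), mag.var()])
    return np.array(feats)

rng = np.random.default_rng(0)
# Synthetic stand-ins for two texture classes
images = [rng.normal(size=(64, 64)) for _ in range(20)] + \
         [np.sin(np.arange(64) * 0.5)[None, :] + rng.normal(scale=0.3, size=(64, 64)) for _ in range(20)]
labels = [0] * 20 + [1] * 20

X = np.array([gabor_features(im) for im in images])
clf = SVC(kernel="rbf").fit(X, labels)
print(clf.score(X, labels))          # training accuracy on the synthetic textures
```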
Gold, Peter O.; Cowgill, Eric; Kreylos, Oliver; Gold, Ryan D.
2012-01-01
Three-dimensional (3D) slip vectors recorded by displaced landforms are difficult to constrain across complex fault zones, and the uncertainties associated with such measurements become increasingly challenging to assess as landforms degrade over time. We approach this problem from a remote sensing perspective by using terrestrial laser scanning (TLS) and 3D structural analysis. We have developed an integrated TLS data collection and point-based analysis workflow that incorporates accurate assessments of aleatoric and epistemic uncertainties using experimental surveys, Monte Carlo simulations, and iterative site reconstructions. Our scanning workflow and equipment requirements are optimized for single-operator surveying, and our data analysis process is largely completed using new point-based computing tools in an immersive 3D virtual reality environment. In a case study, we measured slip vector orientations at two sites along the rupture trace of the 1954 Dixie Valley earthquake (central Nevada, United States), yielding measurements that are the first direct constraints on the 3D slip vector for this event. These observations are consistent with a previous approximation of net extension direction for this event. We find that errors introduced by variables in our survey method result in <2.5 cm of variability in components of displacement, and are eclipsed by the 10–60 cm epistemic errors introduced by reconstructing the field sites to their pre-erosion geometries. Although the higher resolution TLS data sets enabled visualization and data interactivity critical for reconstructing the 3D slip vector and for assessing uncertainties, dense topographic constraints alone were not sufficient to significantly narrow the wide (<26°) range of allowable slip vector orientations that resulted from accounting for epistemic uncertainties.
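As a concrete illustration of the geometry involved (not the authors' TLS workflow), the sketch below converts a 3D displacement between matched piercing points into slip magnitude, trend, and plunge, assuming an east-north-up coordinate convention.

```python
import numpy as np

def slip_vector(p_before, p_after):
    """Slip magnitude, trend (deg clockwise from north) and plunge (deg below horizontal),
    assuming east-north-up coordinates."""
    d = np.asarray(p_after, float) - np.asarray(p_before, float)
    e, n, u = d
    magnitude = np.linalg.norm(d)
    trend = np.degrees(np.arctan2(e, n)) % 360.0
    plunge = np.degrees(np.arctan2(-u, np.hypot(e, n)))
    return magnitude, trend, plunge

# Hypothetical piercing points (metres) on either side of a fault trace
print(slip_vector([0.0, 0.0, 0.0], [1.2, -0.8, -0.4]))
```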
NASA Astrophysics Data System (ADS)
Lavergne, T.; Eastwood, S.; Teffah, Z.; Schyberg, H.; Breivik, L.-A.
2010-10-01
The retrieval of sea ice motion with the Maximum Cross-Correlation (MCC) method from low-resolution (10-15 km) spaceborne imaging sensors is challenged by a dominating quantization noise as the time span of displacement vectors is shortened. To allow investigating shorter displacements from these instruments, we introduce an alternative sea ice motion tracking algorithm that builds on the MCC method but relies on a continuous optimization step for computing the motion vector. The prime effect of this method is to effectively dampen the quantization noise, an artifact of the MCC. It allows for retrieving spatially smooth 48 h sea ice motion vector fields in the Arctic. Strategies to detect and correct erroneous vectors as well as to optimally merge several polarization channels of a given instrument are also described. A test processing chain is implemented and run with several active and passive microwave imagers (Advanced Microwave Scanning Radiometer-EOS (AMSR-E), Special Sensor Microwave Imager, and Advanced Scatterometer) during three Arctic autumn, winter, and spring seasons. Ice motion vectors are collocated to and compared with GPS positions of in situ drifters. Error statistics are shown to be ranging from 2.5 to 4.5 km (standard deviation for components of the vectors) depending on the sensor, without significant bias. We discuss the relative contribution of measurement and representativeness errors by analyzing monthly validation statistics. The 37 GHz channels of the AMSR-E instrument allow for the best validation statistics. The operational low-resolution sea ice drift product of the EUMETSAT OSI SAF (European Organisation for the Exploitation of Meteorological Satellites Ocean and Sea Ice Satellite Application Facility) is based on the algorithms presented in this paper.
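A toy version of the maximum cross-correlation (MCC) step that the continuous-optimization scheme refines: for a template patch from the first image, integer offsets in the second image are searched and the one with the highest normalized cross-correlation is kept. Image content and search radius are assumptions for illustration.

```python
import numpy as np

def ncc(a, b):
    a = a - a.mean(); b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def mcc_displacement(img0, img1, center, half=8, search=10):
    """Integer displacement (dy, dx) maximizing normalized cross-correlation of a patch."""
    y, x = center
    tpl = img0[y - half:y + half, x - half:x + half]
    best, best_d = -np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = img1[y + dy - half:y + dy + half, x + dx - half:x + dx + half]
            score = ncc(tpl, cand)
            if score > best:
                best, best_d = score, (dy, dx)
    return best_d, best

rng = np.random.default_rng(0)
img0 = rng.normal(size=(64, 64))
img1 = np.roll(img0, shift=(3, -2), axis=(0, 1))      # known drift of (3, -2) pixels
print(mcc_displacement(img0, img1, center=(32, 32)))  # expected: ((3, -2), ~1.0)
```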
Analysis of a dual-reflector antenna system using physical optics and digital computers
NASA Technical Reports Server (NTRS)
Schmidt, R. F.
1972-01-01
The application of physical-optics diffraction theory to a deployable dual-reflector geometry is discussed. The methods employed are not restricted to the Conical-Gregorian antenna, but apply in a general way to dual and even multiple reflector systems. Complex vector wave methods are used in the Fresnel and Fraunhofer regions of the reflectors. Field amplitude, phase, polarization data, and time average Poynting vectors are obtained via an IBM 360/91 digital computer. Focal region characteristics are plotted with the aid of a CalComp plotter. Comparison between the GSFC Huygens wavelet approach, JPL measurements, and JPL computer results based on the near field spherical wave expansion method are made wherever possible.
NASA Astrophysics Data System (ADS)
Oware, E. K.; Moysey, S. M.
2016-12-01
Regularization stabilizes the geophysical imaging problem resulting from sparse and noisy measurements that render solutions unstable and non-unique. Conventional regularization constraints are, however, independent of the physics of the underlying process and often produce smoothed-out tomograms with mass underestimation. Cascaded time-lapse (CTL) is a widely used reconstruction technique for monitoring wherein a tomogram obtained from the background dataset is employed as starting model for the inversion of subsequent time-lapse datasets. In contrast, a proper orthogonal decomposition (POD)-constrained inversion framework enforces physics-based regularization based upon prior understanding of the expected evolution of state variables. The physics-based constraints are represented in the form of POD basis vectors. The basis vectors are constructed from numerically generated training images (TIs) that mimic the desired process. The target can be reconstructed from a small number of selected basis vectors, hence, there is a reduction in the number of inversion parameters compared to the full dimensional space. The inversion involves finding the optimal combination of the selected basis vectors conditioned on the geophysical measurements. We apply the algorithm to 2-D lab-scale saline transport experiments with electrical resistivity (ER) monitoring. We consider two transport scenarios with one and two mass injection points evolving into unimodal and bimodal plume morphologies, respectively. The unimodal plume is consistent with the assumptions underlying the generation of the TIs, whereas bimodality in plume morphology was not conceptualized. We compare difference tomograms retrieved from POD with those obtained from CTL. Qualitative comparisons of the difference tomograms with images of their corresponding dye plumes suggest that POD recovered more compact plumes in contrast to those of CTL. While mass recovery generally deteriorated with increasing number of time-steps, POD outperformed CTL in terms of mass recovery accuracy rates. POD is computationally superior requiring only 2.5 mins to complete each inversion compared to 3 hours for CTL to do the same.
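A minimal sketch of the POD parameterization described above: training images are stacked as columns, an SVD yields the basis vectors, and a candidate plume is expressed by a handful of basis coefficients. The synthetic Gaussian-plume training images are an assumption for illustration only.

```python
import numpy as np

# Synthetic training images (TIs): Gaussian plumes at random positions on a 32x32 grid
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:32, 0:32]
tis = [np.exp(-(((xx - cx) ** 2 + (yy - cy) ** 2) / (2.0 * 9.0)))
       for cx, cy in rng.uniform(8, 24, size=(200, 2))]
A = np.stack([ti.ravel() for ti in tis], axis=1)          # pixels x training images

# POD basis: leading left singular vectors of the mean-removed training matrix
U, s, _ = np.linalg.svd(A - A.mean(axis=1, keepdims=True), full_matrices=False)
k = 10
basis = U[:, :k]

# Any candidate plume is represented by k coefficients instead of 1024 pixel values
target = np.exp(-(((xx - 20) ** 2 + (yy - 12) ** 2) / (2.0 * 9.0))).ravel()
coeffs = basis.T @ (target - A.mean(axis=1))
reconstruction = A.mean(axis=1) + basis @ coeffs
print("relative error:", np.linalg.norm(reconstruction - target) / np.linalg.norm(target))
```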
Framework to trade optimality for local processing in large-scale wavefront reconstruction problems.
Haber, Aleksandar; Verhaegen, Michel
2016-11-15
We show that the minimum variance wavefront estimation problems permit localized approximate solutions, in the sense that the wavefront value at a point (excluding unobservable modes, such as the piston mode) can be approximated by a linear combination of the wavefront slope measurements in the point's neighborhood. This enables us to efficiently compute a wavefront estimate by performing a single sparse matrix-vector multiplication. Moreover, our results open the possibility for the development of wavefront estimators that can be easily implemented in a decentralized/distributed manner, and in which the estimate optimality can be easily traded for computational efficiency. We numerically validate our approach on Hudgin wavefront sensor geometries, and the results can be easily generalized to Fried geometries.
New-Sum: A Novel Online ABFT Scheme For General Iterative Methods
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tao, Dingwen; Song, Shuaiwen; Krishnamoorthy, Sriram
Emerging high-performance computing platforms, with large component counts and lower power margins, are anticipated to be more susceptible to soft errors in both logic circuits and memory subsystems. We present an online algorithm-based fault tolerance (ABFT) approach to efficiently detect and recover soft errors for general iterative methods. We design a novel checksum-based encoding scheme for matrix-vector multiplication that is resilient to both arithmetic and memory errors. Our design decouples the checksum updating process from the actual computation, and allows adaptive checksum overhead control. Building on this new encoding mechanism, we propose two online ABFT designs that can effectively recover from errors when combined with a checkpoint/rollback scheme.
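A minimal software sketch of checksum-protected matrix-vector multiplication in the general spirit of ABFT (not the paper's decoupled encoding scheme): a column-sum checksum row is appended to the matrix so that a corrupted entry of the output can be detected by comparing the result's sum with the checksum.

```python
import numpy as np

def encode(A):
    """Append a checksum row (column sums) to A."""
    return np.vstack([A, A.sum(axis=0)])

def checked_matvec(A_enc, x, tol=1e-8):
    """Compute y = A x and verify it against the appended checksum entry."""
    y_enc = A_enc @ x
    y, checksum = y_enc[:-1], y_enc[-1]
    ok = abs(y.sum() - checksum) <= tol * max(1.0, abs(checksum))
    return y, ok

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 6))
x = rng.normal(size=6)
A_enc = encode(A)

y, ok = checked_matvec(A_enc, x)
print("clean run consistent:", ok)

y_faulty = y.copy()
y_faulty[2] += 1.0                   # simulate a soft error in the computed result
print("faulty result consistent:", abs(y_faulty.sum() - (A_enc @ x)[-1]) <= 1e-8)
```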
Analysis of rotary engine combustion processes based on unsteady, three-dimensional computations
NASA Technical Reports Server (NTRS)
Raju, M. S.; Willis, E. A.
1990-01-01
A new computer code was developed for predicting the turbulent and chemically reacting flows with sprays occurring inside of a stratified charge rotary engine. The solution procedure is based on an Eulerian Lagrangian approach where the unsteady, three-dimensional Navier-Stokes equations for a perfect gas mixture with variable properties are solved in generalized, Eulerian coordinates on a moving grid by making use of an implicit finite volume, Steger-Warming flux vector splitting scheme, and the liquid phase equations are solved in Lagrangian coordinates. Both the details of the numerical algorithm and the finite difference predictions of the combustor flow field during the opening of exhaust and/or intake, and also during fuel vaporization and combustion, are presented.
Analysis of rotary engine combustion processes based on unsteady, three-dimensional computations
NASA Technical Reports Server (NTRS)
Raju, M. S.; Willis, E. A.
1989-01-01
A new computer code was developed for predicting the turbulent and chemically reacting flows with sprays occurring inside of a stratified charge rotary engine. The solution procedure is based on an Eulerian Lagrangian approach where the unsteady, 3-D Navier-Stokes equations for a perfect gas mixture with variable properties are solved in generalized, Eulerian coordinates on a moving grid by making use of an implicit finite volume, Steger-Warming flux vector splitting scheme, and the liquid phase equations are solved in Lagrangian coordinates. Both the details of the numerical algorithm and the finite difference predictions of the combustor flow field during the opening of exhaust and/or intake, and also during fuel vaporization and combustion, are presented.
A VLBI variance-covariance analysis interactive computer program. M.S. Thesis
NASA Technical Reports Server (NTRS)
Bock, Y.
1980-01-01
An interactive computer program (in FORTRAN) for the variance covariance analysis of VLBI experiments is presented for use in experiment planning, simulation studies and optimal design problems. The interactive mode is especially suited to these types of analyses, providing ease of operation as well as savings in time and cost. The geodetic parameters include baseline vector parameters and variations in polar motion and Earth rotation. A discussion of the theory on which the program is based provides an overview of the VLBI process, emphasizing the areas of interest to geodesy. Special emphasis is placed on the problem of determining correlations between simultaneous observations from a network of stations. A model suitable for covariance analyses is presented. Suggestions towards developing optimal observation schedules are included.
Intrinsic Bayesian Active Contours for Extraction of Object Boundaries in Images
Srivastava, Anuj
2010-01-01
We present a framework for incorporating prior information about high-probability shapes in the process of contour extraction and object recognition in images. Here one studies shapes as elements of an infinite-dimensional, non-linear quotient space, and statistics of shapes are defined and computed intrinsically using differential geometry of this shape space. Prior models on shapes are constructed using probability distributions on tangent bundles of shape spaces. Similar to the past work on active contours, where curves are driven by vector fields based on image gradients and roughness penalties, we incorporate the prior shape knowledge in the form of vector fields on curves. Through experimental results, we demonstrate the use of prior shape models in the estimation of object boundaries, and their success in handling partial obscuration and missing data. Furthermore, we describe the use of this framework in shape-based object recognition or classification. PMID:21076692
An Indoor Slam Method Based on Kinect and Multi-Feature Extended Information Filter
NASA Astrophysics Data System (ADS)
Chang, M.; Kang, Z.
2017-09-01
Based on the ORB-SLAM framework, in this paper the transformation parameters between adjacent Kinect image frames are computed using ORB keypoints, from which the a priori information matrix and information vector are calculated. The motion update of the multi-feature extended information filter is then realized. From the point cloud formed by the depth image, the ICP algorithm is used to extract point features of the scene and to build an observation model while calculating the a posteriori information matrix and information vector, weakening the influence of error accumulation in the positioning process. Furthermore, this paper applies the ORB-SLAM framework to realize autonomous real-time positioning in an unknown indoor environment. Finally, lidar data of the scene are used to estimate the positioning accuracy of the method put forward in this paper.
NASA Astrophysics Data System (ADS)
Zheng, Bin; Pleass, Charles M.; Ih, Charles S.
1993-11-01
A hybrid three-axis laser Doppler velocimeter system has been demonstrated in our laboratory. The system can monitor the motion of microorganisms in an unconstrained environment. During measurement, a computer system collects and processes time series data from the transit of a microorganism through the measurement volume. The fast Fourier transform of this data contains the motion signature of this microorganism. Because individual microorganisms can be selected from the field, ambiguity caused by multiscattering among two or more microorganisms can be avoided. Using this new system, we can obtain a feature vector that relates to features of the microorganism, such as its size, average translational velocity, rotation or wobbling, and its flagellum beat frequency. Such a vector appears to be a useful criterion for distinguishing the species using statistical pattern recognition. Successful experiments demonstrate that the new system and technique have some unique advantages.
Semantically enabled image similarity search
NASA Astrophysics Data System (ADS)
Casterline, May V.; Emerick, Timothy; Sadeghi, Kolia; Gosse, C. A.; Bartlett, Brent; Casey, Jason
2015-05-01
Georeferenced data of various modalities are increasingly available for intelligence and commercial use; however, effectively exploiting these sources demands a unified data space capable of capturing the unique contribution of each input. This work presents a suite of software tools for representing geospatial vector data and overhead imagery in a shared high-dimensional vector or "embedding" space that supports fused learning and similarity search across dissimilar modalities. While the approach is suitable for fusing arbitrary input types, including free text, the present work exploits the obvious but computationally difficult relationship between GIS and overhead imagery. GIS provides temporally smoothed but information-limited content, while overhead imagery provides an information-rich but temporally limited perspective. This processing framework includes some important extensions of concepts in the literature but, more critically, presents a means to accomplish them as a unified framework at scale on commodity cloud architectures.
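At query time the cross-modal retrieval described above reduces to nearest-neighbour search in the shared embedding space; the sketch below shows cosine-similarity search over hypothetical, already-computed embeddings (the embedding models themselves are outside the scope of the sketch).

```python
import numpy as np

def cosine_search(query, embeddings, k=3):
    """Return indices and scores of the k embeddings most similar to the query (cosine similarity)."""
    q = query / np.linalg.norm(query)
    E = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    scores = E @ q
    top = np.argsort(-scores)[:k]
    return top, scores[top]

rng = np.random.default_rng(0)
image_embeddings = rng.normal(size=(1000, 128))                   # hypothetical imagery embeddings
gis_query = image_embeddings[42] + 0.1 * rng.normal(size=128)     # a GIS-derived vector near item 42

idx, scores = cosine_search(gis_query, image_embeddings)
print(idx, scores)      # item 42 should rank first
```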
A hybrid dynamic harmony search algorithm for identical parallel machines scheduling
NASA Astrophysics Data System (ADS)
Chen, Jing; Pan, Quan-Ke; Wang, Ling; Li, Jun-Qing
2012-02-01
In this article, a dynamic harmony search (DHS) algorithm is proposed for the identical parallel machines scheduling problem with the objective to minimize makespan. First, an encoding scheme based on a list scheduling rule is developed to convert the continuous harmony vectors to discrete job assignments. Second, the whole harmony memory (HM) is divided into multiple small-sized sub-HMs, and each sub-HM performs evolution independently and exchanges information with others periodically by using a regrouping schedule. Third, a novel improvisation process is applied to generate a new harmony by making use of the information of harmony vectors in each sub-HM. Moreover, a local search strategy is presented and incorporated into the DHS algorithm to find promising solutions. Simulation results show that the hybrid DHS (DHS_LS) is very competitive in comparison to its competitors in terms of mean performance and average computational time.
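A rough sketch of the kind of list-scheduling decoding described above: the continuous harmony vector is treated as a set of random keys that orders the jobs, and each job is then assigned to the machine with the smallest current load. Processing times and vector values are illustrative assumptions.

```python
import numpy as np

def decode(harmony, processing_times, n_machines):
    """Decode a continuous harmony vector into job-to-machine assignments via list scheduling."""
    order = np.argsort(harmony)              # harmony values act as job priorities (random keys)
    loads = np.zeros(n_machines)
    assignment = {}
    for job in order:
        m = int(np.argmin(loads))            # greedy: assign to the least-loaded machine
        assignment[int(job)] = m
        loads[m] += processing_times[job]
    return assignment, loads.max()           # makespan = largest machine load

rng = np.random.default_rng(0)
p = rng.integers(1, 10, size=8)              # processing times of 8 jobs
harmony = rng.uniform(size=8)                # one continuous harmony vector
print(decode(harmony, p, n_machines=3))
```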
Fast image interpolation for motion estimation using graphics hardware
NASA Astrophysics Data System (ADS)
Kelly, Francis; Kokaram, Anil
2004-05-01
Motion estimation and compensation is the key to high quality video coding. Block matching motion estimation is used in most video codecs, including MPEG-2, MPEG-4, H.263 and H.26L. Motion estimation is also a key component in the digital restoration of archived video and for post-production and special effects in the movie industry. Sub-pixel accurate motion vectors can improve the quality of the vector field and lead to more efficient video coding. However sub-pixel accuracy requires interpolation of the image data. Image interpolation is a key requirement of many image processing algorithms. Often interpolation can be a bottleneck in these applications, especially in motion estimation due to the large number of pixels involved. In this paper we propose using commodity computer graphics hardware for fast image interpolation. We use the full search block matching algorithm to illustrate the problems and limitations of using graphics hardware in this way.
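A CPU-side sketch of the bilinear interpolation that the paper offloads to graphics hardware: evaluating image intensity at a fractional position, as needed when testing sub-pixel motion vectors. This is purely illustrative; no GPU code is shown.

```python
import numpy as np

def bilinear(img, y, x):
    """Bilinear interpolation of img at fractional coordinates (y, x)."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    dy, dx = y - y0, x - x0
    return ((1 - dy) * (1 - dx) * img[y0, x0] +
            (1 - dy) * dx       * img[y0, x0 + 1] +
            dy       * (1 - dx) * img[y0 + 1, x0] +
            dy       * dx       * img[y0 + 1, x0 + 1])

img = np.arange(16.0).reshape(4, 4)
print(bilinear(img, 1.5, 2.25))   # value interpolated between rows 1-2 and columns 2-3 (8.25)
```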
Processing-in-Memory Enabled Graphics Processors for 3D Rendering
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xie, Chenhao; Song, Shuaiwen; Wang, Jing
2017-02-06
The performance of 3D rendering on a Graphics Processing Unit, which converts a 3D vector stream into a 2D frame with 3D image effects, significantly impacts users' gaming experience on modern computer systems. Due to the high texture throughput in 3D rendering, main memory bandwidth becomes a critical obstacle for improving the overall rendering performance. 3D stacked memory systems such as the Hybrid Memory Cube (HMC) provide opportunities to significantly overcome the memory wall by directly connecting logic controllers to DRAM dies. Based on the observation that texel fetches significantly impact off-chip memory traffic, we propose two architectural designs to enable Processing-In-Memory based GPUs for efficient 3D rendering.
EEG feature selection method based on decision tree.
Duan, Lijuan; Ge, Hui; Ma, Wei; Miao, Jun
2015-01-01
This paper aims to solve the automated feature selection problem in brain computer interfaces (BCI). In order to automate the feature selection process, we proposed a novel EEG feature selection method based on a decision tree (DT). During the electroencephalogram (EEG) signal processing, a feature extraction method based on principal component analysis (PCA) was used, and the selection process based on the decision tree was performed by searching the feature space and automatically selecting optimal features. Considering that EEG signals are a series of non-linear signals, a generalized linear classifier named support vector machine (SVM) was chosen. In order to test the validity of the proposed method, we applied the EEG feature selection method based on the decision tree to BCI Competition II dataset Ia, and the experiment showed encouraging results.
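A schematic scikit-learn rendering of the pipeline described above, with synthetic data standing in for EEG features: PCA for extraction, a decision tree's feature importances for selection, and an SVM for classification. Sizes, thresholds and labels are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))                       # stand-in for extracted EEG features
y = (X[:, 3] + 0.5 * X[:, 17] > 0).astype(int)       # labels depend on a few informative features

X_pca = PCA(n_components=20).fit_transform(X)        # feature extraction

tree = DecisionTreeClassifier(random_state=0).fit(X_pca, y)
selected = np.argsort(tree.feature_importances_)[-5:]    # keep the 5 most important components

svm = SVC(kernel="linear")
print(cross_val_score(svm, X_pca[:, selected], y, cv=5).mean())
```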
A Two-Layer Least Squares Support Vector Machine Approach to Credit Risk Assessment
NASA Astrophysics Data System (ADS)
Liu, Jingli; Li, Jianping; Xu, Weixuan; Shi, Yong
Least squares support vector machine (LS-SVM) is a revised version of the support vector machine (SVM) and has been proved to be a useful tool for pattern recognition. LS-SVM has excellent generalization performance and low computational cost. In this paper, we propose a new method called the two-layer least squares support vector machine, which combines kernel principal component analysis (KPCA) and the linear programming form of the least squares support vector machine. With this method, sparseness and robustness are obtained when solving high-dimensional, large-scale databases. A U.S. commercial credit card database is used to test the efficiency of our method, and the result proved to be a satisfactory one.
GISpark: A Geospatial Distributed Computing Platform for Spatiotemporal Big Data
NASA Astrophysics Data System (ADS)
Wang, S.; Zhong, E.; Wang, E.; Zhong, Y.; Cai, W.; Li, S.; Gao, S.
2016-12-01
Geospatial data are growing exponentially because of the proliferation of cost-effective and ubiquitous positioning technologies such as global remote-sensing satellites and location-based devices. Analyzing large amounts of geospatial data can provide great value for both industrial and scientific applications. The data- and compute-intensive characteristics inherent in geospatial big data increasingly pose great challenges to technologies for data storage, computing and analysis. Such challenges require a scalable and efficient architecture that can store, query, analyze, and visualize large-scale spatiotemporal data. Therefore, we developed GISpark - a geospatial distributed computing platform for processing large-scale vector, raster and stream data. GISpark is constructed on the latest virtualized computing infrastructures and distributed computing architecture. OpenStack and Docker are used to build a multi-user hosting cloud computing infrastructure for GISpark. Virtual storage systems such as HDFS, Ceph and MongoDB are combined and adopted for spatiotemporal data storage management. A Spark-based algorithm framework is developed for efficient parallel computing. Within this framework, SuperMap GIScript and various open-source GIS libraries can be integrated into GISpark. GISpark can also be integrated with scientific computing environments (e.g., Anaconda), interactive computing web applications (e.g., Jupyter notebook), and machine learning tools (e.g., TensorFlow/Orange). The associated geospatial facilities of GISpark, in conjunction with the scientific computing environment, exploratory spatial data analysis tools, and temporal data management and analysis systems, make up a powerful geospatial computing tool. GISpark not only provides spatiotemporal big data processing capacity in the geospatial field, but also provides a spatiotemporal computational model and advanced geospatial visualization tools that deal with other domains related to spatial properties. We tested the performance of the platform based on taxi trajectory analysis. Results suggested that GISpark achieves excellent run time performance in spatiotemporal big data applications.
Summary of Fluidic Thrust Vectoring Research Conducted at NASA Langley Research Center
NASA Technical Reports Server (NTRS)
Deere, Karen A.
2003-01-01
Interest in low-observable aircraft and in lowering an aircraft's exhaust system weight sparked decades of research for fixed geometry exhaust nozzles. The desire for such integrated exhaust nozzles was the catalyst for new fluidic control techniques; including throat area control, expansion control, and thrust-vector angle control. This paper summarizes a variety of fluidic thrust vectoring concepts that have been tested both experimentally and computationally at NASA Langley Research Center. The nozzle concepts are divided into three categories according to the method used for fluidic thrust vectoring: the shock vector control method, the throat shifting method, and the counterflow method. This paper explains the thrust vectoring mechanism for each fluidic method, provides examples of configurations tested for each method, and discusses the advantages and disadvantages of each method.
Gu, Rui; Xu, Jinglei
2014-01-01
The dual throat nozzle (DTN) technique is capable of achieving higher thrust-vectoring efficiencies than other fluidic techniques without compromising thrust efficiency significantly during vectoring operation. The excellent performance of the DTN is mainly due to the concave cavity. In this paper, two DTNs of different scales have been investigated by unsteady numerical simulations to compare the parameter variations and study the effects of the cavity during the vector starting process. The results indicate that dynamic loads may be generated during the vector starting process, which is a potentially challenging problem for aircraft trim and control.
An image-processing software package: UU and Fig for optical metrology applications
NASA Astrophysics Data System (ADS)
Chen, Lujie
2013-06-01
Modern optical metrology applications are largely supported by computational methods, such as phase shifting [1], Fourier transform [2], digital image correlation [3], camera calibration [4], etc., in which image processing is a critical and indispensable component. While it is not too difficult to obtain a wide variety of image-processing programs from the internet, few are catered for the relatively special area of optical metrology. This paper introduces an image-processing software package: UU (data processing) and Fig (data rendering) that incorporates many useful functions to process optical metrological data. The cross-platform programs UU and Fig are developed based on wxWidgets. At the time of writing, it has been tested on Windows, Linux and Mac OS. The user interface is designed to offer precise control of the underlying processing procedures in a scientific manner. The data input/output mechanism is designed to accommodate diverse file formats and to facilitate interaction with other independent programs. In terms of robustness, although the software was initially developed for personal use, it is comparable in stability and accuracy to most commercial software of a similar nature. In addition to functions for optical metrology, the software package has a rich collection of useful tools in the following areas: real-time image streaming from USB and GigE cameras, computational geometry, computer vision, fitting of data, 3D image processing, vector image processing, precision device control (rotary stage, PZT stage, etc.), point cloud to surface reconstruction, volume rendering, batch processing, etc. The software package is currently used in a number of universities for teaching and research.
Vectorization of linear discrete filtering algorithms
NASA Technical Reports Server (NTRS)
Schiess, J. R.
1977-01-01
Linear filters, including the conventional Kalman filter and versions of square root filters devised by Potter and Carlson, are studied for potential application on streaming computers. The square root filters are known to maintain a positive definite covariance matrix in cases in which the Kalman filter diverges due to ill-conditioning of the matrix. Vectorization of the filters is discussed, and comparisons are made of the number of operations and storage locations required by each filter. The Carlson filter is shown to be the most efficient of the filters on the Control Data STAR-100 computer.
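For reference, a generic (non-vectorized) Kalman measurement update is sketched below; this is the textbook form whose operations the study counts and compares against the Potter and Carlson square-root variants, and the dimensions here are arbitrary.

```python
import numpy as np

def kalman_update(x, P, z, H, R):
    """Standard Kalman measurement update: state x, covariance P, measurement z."""
    S = H @ P @ H.T + R                      # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
    x_new = x + K @ (z - H @ x)
    P_new = (np.eye(len(x)) - K @ H) @ P     # can lose positive definiteness numerically,
    return x_new, P_new                      # which motivates the square-root formulations

x = np.zeros(4); P = np.eye(4)
H = np.eye(2, 4); R = 0.1 * np.eye(2)
print(kalman_update(x, P, np.array([1.0, -0.5]), H, R)[0])
```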
LFSPMC: Linear feature selection program using the probability of misclassification
NASA Technical Reports Server (NTRS)
Guseman, L. F., Jr.; Marion, B. P.
1975-01-01
The computational procedure and associated computer program for a linear feature selection technique are presented. The technique assumes that: a finite number, m, of classes exists; each class is described by an n-dimensional multivariate normal density function of its measurement vectors; the mean vector and covariance matrix for each density function are known (or can be estimated); and the a priori probability for each class is known. The technique produces a single linear combination of the original measurements which minimizes the one-dimensional probability of misclassification defined by the transformed densities.
A path model for Whittaker vectors
NASA Astrophysics Data System (ADS)
Di Francesco, Philippe; Kedem, Rinat; Turmunkh, Bolor
2017-06-01
In this paper we construct weighted path models to compute Whittaker vectors in the completion of Verma modules, as well as Whittaker functions of fundamental type, for all finite-dimensional simple Lie algebras, affine Lie algebras, and the quantum algebra U_q(sl_{r+1}). This leads to series expressions for the Whittaker functions. We show how this construction leads directly to the quantum Toda equations satisfied by these functions, and to the q-difference equations in the quantum case. We investigate the critical limit of affine Whittaker functions computed in this way.
HOSVD-Based 3D Active Appearance Model: Segmentation of Lung Fields in CT Images.
Wang, Qingzhu; Kang, Wanjun; Hu, Haihui; Wang, Bin
2016-07-01
An Active Appearance Model (AAM) is a computer vision model which can be used to effectively segment lung fields in CT images. However, the fitting result is often inadequate when the lungs are affected by high-density pathologies. To overcome this problem, we propose a Higher-order Singular Value Decomposition (HOSVD)-based three-dimensional (3D) AAM. An evaluation was performed on 310 diseased lungs from the Lung Image Database Consortium Image Collection. Other contemporary AAMs operate directly on patterns represented by vectors, i.e., before applying the AAM to a 3D lung volume, it has to be vectorized first into a vector pattern by some technique like concatenation. However, some implicit structural or local contextual information may be lost in this transformation. According to the nature of the 3D lung volume, HOSVD is introduced to represent and process the lung in tensor space. Our method can not only directly operate on the original 3D tensor patterns, but also efficiently reduce computer memory usage. The evaluation resulted in an average Dice coefficient of 97.0% ± 0.59%, a mean absolute surface distance error of 1.0403 ± 0.5716 mm, a mean border positioning error of 0.9187 ± 0.5381 pixels, and a Hausdorff distance of 20.4064 ± 4.3855, respectively. Experimental results showed that our method delivered significant and better segmentation results compared with three other model-based lung segmentation approaches, namely 3D Snake, 3D ASM and 3D AAM.
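A minimal NumPy sketch of the HOSVD idea referenced above: unfold the 3D volume along each mode, keep the leading left singular vectors per mode, and form a truncated core tensor. Array sizes and ranks are illustrative assumptions, not those used in the paper.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding of a 3-way tensor into a matrix."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    """Truncated higher-order SVD: per-mode factor matrices and a core tensor."""
    Us = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        Us.append(U[:, :r])
    S = T
    for mode, U in enumerate(Us):
        # Multiply the current core by U^T along the given mode
        S = np.moveaxis(np.tensordot(U.T, np.moveaxis(S, mode, 0), axes=1), 0, mode)
    return S, Us

T = np.random.default_rng(0).normal(size=(20, 20, 10))   # stand-in for a small 3D volume
S, Us = hosvd(T, ranks=(5, 5, 3))
print(S.shape, [U.shape for U in Us])                    # (5, 5, 3) core and per-mode factors
```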
Recent Progress on the Second Generation CMORPH: A Prototype Operational Processing System
NASA Astrophysics Data System (ADS)
Xie, Pingping; Joyce, Robert; Wu, Shaorong
2016-04-01
As reported at the EGU General Assembly of 2015, a conceptual test system was developed for the second generation CMORPH to produce global analyses of 30-min precipitation on a 0.05deg lat/lon grid over the entire globe from pole to pole through integration of information from satellite observations as well as numerical model simulations. The second generation CMORPH is built upon the Kalman Filter based CMORPH algorithm of Joyce and Xie (2011). Inputs to the system include both rainfall and snowfall rate retrievals from passive microwave (PMW) measurements aboard all available low earth orbit (LEO) satellites, precipitation estimates derived from infrared (IR) observations of geostationary (GEO) as well as LEO platforms, and precipitation simulations from numerical global models. Sub-systems were developed and refined to derive precipitation estimates from the GEO and LEO IR observations and to compute precipitating cloud motion vectors. The results were reported at the EGU of 2014 and the AGU 2015 Fall Meetings. In this presentation, we report our recent work on the construction of a prototype operational processing system for the second generation CMORPH. The second generation CMORPH prototype operational processing system takes in the passive microwave (PMW) retrievals of instantaneous precipitation rates from all available sensors, the full-resolution GEO and LEO IR data, as well as the hourly precipitation fields generated by the NOAA/NCEP Climate Forecast System (CFS) Reanalysis (CFSR). First, a combined field of PMW based precipitation retrievals (MWCOMB) is created on a 0.05deg lat/lon grid over the entire globe through inter-calibrating retrievals from various sensors against a common reference. For this experiment, the reference field is the GMI based retrievals with climatological adjustment against the TMI retrievals using data over the overlapping period. Precipitation estimates are then derived from the GEO and LEO IR data through calibration against the global MWCOMB and the CloudSat CPR based estimates. Meanwhile, precipitating cloud motion vectors are derived through the combination of vectors computed from the GEO IR based precipitation estimates and the CFSR precipitation with a 2DVAR technique. A prototype system is applied to generate integrated global precipitation estimates over the entire globe for a three-month period from June 1 to August 31 of 2015. Preliminary tests are conducted to optimize the performance of the system. Specific efforts are made to improve the computational efficiency of the system. The second generation CMORPH test products are compared to the first generation CMORPH and ground observations. Detailed results will be reported at the EGU.
NASA Technical Reports Server (NTRS)
Natanson, G. A.
1997-01-01
New algorithms are described covering the simulation, processing, and calibration of penetration angles of the Barnes static Earth sensor assembly (SESA) as implemented in the Goddard Space Flight Center Flight Dynamics Division ground support system for the Tropical Rainfall Measuring Mission (TRMM) Observatory. The new treatment involves a detailed analysis of the measurements by individual quadrants. It is shown that, to a good approximation, individual quadrant misalignments can be treated simply as penetration angle biases. Simple formulas suitable for real-time applications are introduced for computing quadrant-dependent effects. The simulator generates penetration angles by solving a quadratic equation with coefficients uniquely determined by the spacecraft's position and the quadrant's orientation in GeoCentric Inertial (GCI) coordinates. Measurement processing for attitude determination is based on linearized equations obtained by expanding the coefficients of the aforementioned quadratic equation as a Taylor series in both the Earth oblateness coefficient (alpha approx. 1/150) and the angle between the pointing axis and the geodetic nadir vector. A simple formula relating a measured value of the penetration angle to the deviation of the Earth-pointed axis from the geodetic nadir vector is derived. It is shown that even near the very edge of the quadrant's Field Of View (FOV), attitude errors resulting from quadratic effects are a few hundredths of a degree, which is small compared to the attitude determination accuracy requirement (0.18 degree, 3 sigma) of TRMM. Calibration of SESA measurements is complicated by a first-order filtering used in the TRMM onboard algorithm to compute penetration angles from raw voltages. A simple calibration scheme is introduced where these complications are avoided by treating penetration angles as the primary raw measurements, which are adjusted using biases and scale factors. In addition to three misalignment parameters, the calibration state vector contains only two average penetration angle biases (one per each pair of opposite quadrants) since, because of the very narrow sensor FOV (+/- 2.6 degrees), differences between biases of the penetration angles measured by opposite quadrants cannot be distinguished from roll and pitch sensor misalignments. After calibration, the estimated misalignments and average penetration angle biases are converted to the four penetration angle biases and to the yaw misalignment angle. The resultant biases and the estimated scale factors are finally used to update the coefficients necessary for onboard computations of penetration angles from measured voltages.
Robust and accurate vectorization of line drawings.
Hilaire, Xavier; Tombre, Karl
2006-06-01
This paper presents a method for vectorizing the graphical parts of paper-based line drawings. The method consists of separating the input binary image into layers of homogeneous thickness, skeletonizing each layer, segmenting the skeleton by a method based on random sampling, and simplifying the result. The segmentation method is robust with a best bound of 50 percent noise reached for indefinitely long primitives. Accurate estimation of the recognized vector's parameters is enabled by explicitly computing their feasibility domains. Theoretical performance analysis and expression of the complexity of the segmentation method are derived. Experimental results and comparisons with other vectorization systems are also provided.
Amplitudes for multiphoton quantum processes in linear optics
NASA Astrophysics Data System (ADS)
Urías, Jesús
2011-07-01
The prominent role that linear optical networks have acquired in the engineering of photon states calls for physically intuitive and automatic methods to compute the probability amplitudes for the multiphoton quantum processes occurring in linear optics. A version of Wick's theorem for the expectation value, on any vector state, of products of linear operators, in general, is proved. We use it to extract the combinatorics of any multiphoton quantum processes in linear optics. The result is presented as a concise rule to write down directly explicit formulae for the probability amplitude of any multiphoton process in linear optics. The rule achieves a considerable simplification and provides an intuitive physical insight about quantum multiphoton processes. The methodology is applied to the generation of high-photon-number entangled states by interferometrically mixing coherent light with spontaneously down-converted light.
Development of a distributed-parameter mathematical model for simulation of cryogenic wind tunnels
NASA Technical Reports Server (NTRS)
Tripp, J. S.
1983-01-01
A one-dimensional distributed-parameter dynamic model of a cryogenic wind tunnel was developed which accounts for internal and external heat transfer, viscous momentum losses, and slotted-test-section dynamics. Boundary conditions imposed by liquid-nitrogen injection, gas venting, and the tunnel fan were included. A time-dependent numerical solution to the resultant set of partial differential equations was obtained on a CDC CYBER 203 vector-processing digital computer at a usable computational rate. Preliminary computational studies were performed by using parameters of the Langley 0.3-Meter Transonic Cryogenic Tunnel. Studies were performed by using parameters from the National Transonic Facility (NTF). The NTF wind-tunnel model was used in the design of control loops for Mach number, total temperature, and total pressure and for determining interactions between the control loops. It was employed in the application of optimal linear-regulator theory and eigenvalue-placement techniques to develop Mach number control laws.
Learning Motion Features for Example-Based Finger Motion Estimation for Virtual Characters
NASA Astrophysics Data System (ADS)
Mousas, Christos; Anagnostopoulos, Christos-Nikolaos
2017-09-01
This paper presents a methodology for estimating the motion of a character's fingers based on the use of motion features provided by a virtual character's hand. In the presented methodology, firstly, the motion data is segmented into discrete phases. Then, a number of motion features are computed for each motion segment of a character's hand. The motion features are pre-processed using restricted Boltzmann machines, and by using the different variations of semantically similar finger gestures in a support vector machine learning mechanism, the optimal weights for each feature assigned to a metric are computed. The advantages of the presented methodology in comparison to previous solutions are the following: First, we automate the computation of optimal weights that are assigned to each motion feature counted in our metric. Second, the presented methodology achieves an increase (about 17%) in correctly estimated finger gestures in comparison to a previous method.
Machine learning methods in chemoinformatics
Mitchell, John B O
2014-01-01
Machine learning algorithms are generally developed in computer science or adjacent disciplines and find their way into chemical modeling by a process of diffusion. Though particular machine learning methods are popular in chemoinformatics and quantitative structure–activity relationships (QSAR), many others exist in the technical literature. This discussion is methods-based and focused on some algorithms that chemoinformatics researchers frequently use. It makes no claim to be exhaustive. We concentrate on methods for supervised learning, predicting the unknown property values of a test set of instances, usually molecules, based on the known values for a training set. Particularly relevant approaches include Artificial Neural Networks, Random Forest, Support Vector Machine, k-Nearest Neighbors and naïve Bayes classifiers. WIREs Comput Mol Sci 2014, 4:468–481. doi:10.1002/wcms.1183 PMID:25285160
Essential issues in multiprocessor systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gajski, D.D.; Peir, J.K.
1985-06-01
During the past several years, a great number of proposals have been made with the objective to increase supercomputer performance by an order of magnitude on the basis of a utilization of new computer architectures. The present paper is concerned with a suitable classification scheme for comparing these architectures. It is pointed out that there are basically four schools of thought as to the most important factor for an enhancement of computer performance. According to one school, the development of faster circuits will make it possible to retain present architectures, except, possibly, for a mechanism providing synchronization of parallel processes. A second school assigns priority to the optimization and vectorization of compilers, which will detect parallelism and help users to write better parallel programs. A third school believes in the predominant importance of new parallel algorithms, while the fourth school supports new models of computation. The merits of the four approaches are critically evaluated. 50 references.
COMPUTATION OF GLOBAL PHOTOCHEMISTRY WITH SMVGEAR II (R823186)
A computer model was developed to simulate global gas-phase photochemistry. The model solves chemical equations with SMVGEAR II, a sparse-matrix, vectorized Gear-type code. To obtain SMVGEAR II, the original SMVGEAR code was modified to allow computation of different sets of chem...
Multitasking a three-dimensional Navier-Stokes algorithm on the Cray-2
NASA Technical Reports Server (NTRS)
Swisshelm, Julie M.
1989-01-01
A three-dimensional computational aerodynamics algorithm has been multitasked for efficient parallel execution on the Cray-2. It provides a means for examining the multitasking performance of a complete CFD application code. An embedded zonal multigrid scheme is used to solve the Reynolds-averaged Navier-Stokes equations for an internal flow model problem. The explicit nature of each component of the method allows a spatial partitioning of the computational domain to achieve a well-balanced task load for MIMD computers with vector-processing capability. Experiments have been conducted with both two- and three-dimensional multitasked cases. The best speedup attained by an individual task group was 3.54 on four processors of the Cray-2, while the entire solver yielded a speedup of 2.67 on four processors for the three-dimensional case. The multiprocessing efficiency of various types of computational tasks is examined, performance on two Cray-2s with different memory access speeds is compared, and extrapolation to larger problems is discussed.
Spin-dependent post-Newtonian parameters from EMRI computation in Kerr background
NASA Astrophysics Data System (ADS)
Friedman, John; Le Tiec, Alexandre; Shah, Abhay
2013-04-01
Because the extreme mass-ratio inspiral (EMRI) approximation is accurate to all orders in v/c, it can be used to find high order post-Newtonian parameters that are not yet analytically accessible. We report here on progress in computing spin-dependent, conservative, post-Newtonian parameters from a radiation-gauge computation for a particle in circular orbit in a family of Kerr geometries. For a particle with 4-velocity u^α= U k^α, with k^α the helical Killing vector of the perturbed spacetime, the renormalized perturbation δU, when written as a function of the particle's angular velocity, is invariant under gauge transformations generated by helically symmetric vectors. The EMRI computations are done in a modified radiation gauge. Extracted parameters are compared to previously known and newly computed spin-dependent post-Newtonian terms. This work is modeled on earlier computations by Blanchet, Detweiler, Le Tiec and Whiting of spin-independent terms for a particle in circular orbit in a Schwarzschild geometry.
Vector assembly of colloids on monolayer substrates
NASA Astrophysics Data System (ADS)
Jiang, Lingxiang; Yang, Shenyu; Tsang, Boyce; Tu, Mei; Granick, Steve
2017-06-01
The key to spontaneous and directed assembly is to encode the desired assembly information to building blocks in a programmable and efficient way. In computer graphics, raster graphics encodes images on a single-pixel level, conferring fine details at the expense of large file sizes, whereas vector graphics encrypts shape information into vectors that allow small file sizes and operational transformations. Here, we adapt this raster/vector concept to a 2D colloidal system and realize 'vector assembly' by manipulating particles on a colloidal monolayer substrate with optical tweezers. In contrast to raster assembly that assigns optical tweezers to each particle, vector assembly requires a minimal number of optical tweezers that allow operations like chain elongation and shortening. This vector approach enables simple uniform particles to form a vast collection of colloidal arenes and colloidenes, the spontaneous dissociation of which is achieved with precision and stage-by-stage complexity by simply removing the optical tweezers.
NASA Astrophysics Data System (ADS)
Li, W.; Shao, H.
2017-12-01
For geospatial cyberinfrastructure enabled web services, the ability of rapidly transmitting and sharing spatial data over the Internet plays a critical role to meet the demands of real-time change detection, response and decision-making. Especially for the vector datasets which serve as irreplaceable and concrete material in data-driven geospatial applications, their rich geometry and property information facilitates the development of interactive, efficient and intelligent data analysis and visualization applications. However, the big-data issues of vector datasets have hindered their wide adoption in web services. In this research, we propose a comprehensive optimization strategy to enhance the performance of vector data transmitting and processing. This strategy combines: 1) pre- and on-the-fly generalization, which automatically determines proper simplification level through the introduction of appropriate distance tolerance (ADT) to meet various visualization requirements, and at the same time speed up simplification efficiency; 2) a progressive attribute transmission method to reduce data size and therefore the service response time; 3) compressed data transmission and dynamic adoption of a compression method to maximize the service efficiency under different computing and network environments. A cyberinfrastructure web portal was developed for implementing the proposed technologies. After applying our optimization strategies, substantial performance enhancement is achieved. We expect this work to widen the use of web service providing vector data to support real-time spatial feature sharing, visual analytics and decision-making.
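The generalization step can be illustrated with the classic Douglas-Peucker routine, in which a distance tolerance (playing the role of the ADT above) controls how aggressively a polyline is simplified; the implementation below is a generic sketch, not the authors' generalization code.

```python
import numpy as np

def point_line_distance(p, a, b):
    """Perpendicular distance of point p from the line through a and b (2D)."""
    p, a, b = (np.asarray(v, float) for v in (p, a, b))
    if np.allclose(a, b):
        return np.linalg.norm(p - a)
    ab, ap = b - a, p - a
    return abs(ab[0] * ap[1] - ab[1] * ap[0]) / np.linalg.norm(ab)

def douglas_peucker(points, tolerance):
    """Recursively drop vertices closer than `tolerance` to the chord of their subchain."""
    points = [np.asarray(p, float) for p in points]
    dists = [point_line_distance(p, points[0], points[-1]) for p in points[1:-1]]
    if not dists or max(dists) <= tolerance:
        return [points[0], points[-1]]
    i = int(np.argmax(dists)) + 1
    left = douglas_peucker(points[:i + 1], tolerance)
    right = douglas_peucker(points[i:], tolerance)
    return left[:-1] + right

line = [(x, np.sin(x)) for x in np.linspace(0, 6, 200)]
print(len(douglas_peucker(line, tolerance=0.05)))   # far fewer vertices than the original 200
```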
Effects of camera location on the reconstruction of 3D flare trajectory with two cameras
NASA Astrophysics Data System (ADS)
Özsaraç, Seçkin; Yeşilkaya, Muhammed
2015-05-01
Flares are valuable electronic warfare assets in the battle against infrared-guided missiles. The trajectory of the flare is one of the most important factors determining the effectiveness of the countermeasure. Reconstructing the three-dimensional (3D) position of a point seen by multiple cameras is a common problem. Camera placement, camera calibration, determination of corresponding pixels between the images of different cameras, and the triangulation algorithm all affect the performance of 3D position estimation. In this paper, we investigate by simulation the specific effect of camera placement on flare trajectory estimation performance. First, the 3D trajectories of a flare and of the aircraft that dispenses it are generated with simple motion models. Then, we place two virtual ideal pinhole camera models at different locations. Assuming the cameras track the aircraft perfectly, the view vectors of the cameras are computed. Afterwards, using the view vector of each camera and the 3D position of the flare, the image plane coordinates of the flare in both cameras are computed from the field of view (FOV) values. To increase the fidelity of the simulation, we use two sources of error. The first models the uncertainties in the determination of the camera view vectors, i.e., the camera orientations are measured with noise. The second models the imperfections of the corresponding-pixel determination of the flare between the two cameras. Finally, the 3D position of the flare is estimated by triangulation from the corresponding pixel indices, the view vectors and the FOVs of the cameras. All of these steps are repeated for different relative camera placements so that the optimum estimation error performance is found for the given aircraft and flare trajectories.
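A minimal sketch of the final triangulation step (illustrative only; it assumes the corresponding pixels have already been back-projected, using the FOV, into unit ray directions d1 and d2 from camera centers c1 and c2, and the midpoint method shown here is just one of several possible triangulation algorithms):

  import numpy as np

  def triangulate_midpoint(c1, d1, c2, d2):
      # Estimate the point closest to both camera rays (midpoint method).
      # c1, c2: 3D camera centers; d1, d2: unit ray directions toward the flare.
      # Solve min over (s, t) of |(c1 + s*d1) - (c2 + t*d2)|^2 in the least-squares sense.
      A = np.column_stack((d1, -d2))
      b = c2 - c1
      (s, t), *_ = np.linalg.lstsq(A, b, rcond=None)
      p1 = c1 + s * d1          # closest point on ray 1
      p2 = c2 + t * d2          # closest point on ray 2
      return 0.5 * (p1 + p2)    # midpoint of the shortest connecting segment

Repeating such an estimate along the trajectory, with noise injected into the ray directions and pixel correspondences, reproduces the kind of error statistics studied in the paper.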
Syngeneic AAV pseudo-vectors potentiate full vector transduction
USDA-ARS?s Scientific Manuscript database
An excessive amount of empty capsids is generated during the regular AAV vector production process. These pseudo-vectors often remain in the final vectors used for animal studies or clinical trials. The potential effects of these pseudo-vectors on AAV transduction have been a major concern. In the current ...
Numerical simulation using vorticity-vector potential formulation
NASA Technical Reports Server (NTRS)
Tokunaga, Hiroshi
1993-01-01
An accurate and efficient computational method is needed for three-dimensional incompressible viscous flows in engineering applications. When solving turbulent shear flows directly or with a subgrid-scale model, it is indispensable to resolve the small-scale fluid motions as well as the large-scale motions. From this point of view, the pseudo-spectral method has so far been used as the computational method. However, finite difference and finite element methods are widely applied to flows of practical importance, since these methods are easily adapted to complex geometric configurations. Several problems nevertheless arise in applying the finite difference method to direct and large eddy simulations. Accuracy is one of the most important; this point was already addressed by the present author in direct simulations of the instability of plane Poiseuille flow and of the transition to turbulence. In order to obtain high efficiency, a multigrid Poisson solver is combined with the higher-order accurate finite difference method. The choice of formulation is another important problem in applying the finite difference method to incompressible turbulent flows. The three-dimensional Navier-Stokes equations have so far mostly been solved in the primitive-variables formulation, and one of the major difficulties of this approach is the rigorous satisfaction of the equation of continuity. In general, a staggered grid is used to satisfy the solenoidal condition for the velocity field at the wall boundary. In the vorticity-vector potential formulation, by contrast, the velocity field satisfies the equation of continuity automatically. For this reason, the vorticity-vector potential method was extended to the generalized coordinate system. In the present article, we adopt the vorticity-vector potential formulation, the generalized coordinate system, and a 4th-order accurate difference method as the computational method. We present the method and apply it to computations of flow in a square cavity at large Reynolds number in order to investigate its effectiveness.
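For reference, the formulation described here can be summarized in standard form (incompressible flow, written in notation assumed here): the velocity is derived from a vector potential, so the continuity equation is satisfied identically, and the vorticity is advanced by its transport equation,

  \mathbf{u} = \nabla \times \boldsymbol{\psi}, \qquad
  \boldsymbol{\omega} = \nabla \times \mathbf{u}, \qquad
  \nabla^{2}\boldsymbol{\psi} = -\boldsymbol{\omega} \quad (\text{gauge } \nabla\cdot\boldsymbol{\psi} = 0),

  \frac{\partial\boldsymbol{\omega}}{\partial t}
    + (\mathbf{u}\cdot\nabla)\,\boldsymbol{\omega}
    = (\boldsymbol{\omega}\cdot\nabla)\,\mathbf{u}
    + \nu\,\nabla^{2}\boldsymbol{\omega},

with the vector Poisson equations for \boldsymbol{\psi} handled by the multigrid solver and the spatial derivatives by the 4th-order accurate differences mentioned above.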