computationally intensive algorithms: Topics by Science.gov

Sample records for computationally intensive algorithms

A Fast Synthetic Aperture Radar Raw Data Simulation Using Cloud Computing.

PubMed

Li, Zhixin; Su, Dandan; Zhu, Haijiang; Li, Wei; Zhang, Fan; Li, Ruirui

2017-01-08

Synthetic Aperture Radar (SAR) raw data simulation is a fundamental problem in radar system design and imaging algorithm research. The growth of surveying swath and resolution results in a significant increase in data volume and simulation period, which can be considered to be a comprehensive data intensive and computing intensive issue. Although several high performance computing (HPC) methods have demonstrated their potential for accelerating simulation, the input/output (I/O) bottleneck of huge raw data has not been eased. In this paper, we propose a cloud computing based SAR raw data simulation algorithm, which employs the MapReduce model to accelerate the raw data computing and the Hadoop distributed file system (HDFS) for fast I/O access. The MapReduce model is designed for the irregular parallel accumulation of raw data simulation, which greatly reduces the parallel efficiency of graphics processing unit (GPU) based simulation methods. In addition, three kinds of optimization strategies are put forward from the aspects of programming model, HDFS configuration and scheduling. The experimental results show that the cloud computing based algorithm achieves 4_ speedup over the baseline serial approach in an 8-node cloud environment, and each optimization strategy can improve about 20%. This work proves that the proposed cloud algorithm is capable of solving the computing intensive and data intensive issues in SAR raw data simulation, and is easily extended to large scale computing to achieve higher acceleration.
A study of real-time computer graphic display technology for aeronautical applications

NASA Technical Reports Server (NTRS)

Rajala, S. A.

1981-01-01

The development, simulation, and testing of an algorithm for anti-aliasing vector drawings is discussed. The pseudo anti-aliasing line drawing algorithm is an extension to Bresenham's algorithm for computer control of a digital plotter. The algorithm produces a series of overlapping line segments where the display intensity shifts from one segment to the other in this overlap (transition region). In this algorithm the length of the overlap and the intensity shift are essentially constants because the transition region is an aid to the eye in integrating the segments into a single smooth line.
Efficient Parallel Algorithm For Direct Numerical Simulation of Turbulent Flows

NASA Technical Reports Server (NTRS)

Moitra, Stuti; Gatski, Thomas B.

1997-01-01

A distributed algorithm for a high-order-accurate finite-difference approach to the direct numerical simulation (DNS) of transition and turbulence in compressible flows is described. This work has two major objectives. The first objective is to demonstrate that parallel and distributed-memory machines can be successfully and efficiently used to solve computationally intensive and input/output intensive algorithms of the DNS class. The second objective is to show that the computational complexity involved in solving the tridiagonal systems inherent in the DNS algorithm can be reduced by algorithm innovations that obviate the need to use a parallelized tridiagonal solver.
A Fast Synthetic Aperture Radar Raw Data Simulation Using Cloud Computing

PubMed Central

Li, Zhixin; Su, Dandan; Zhu, Haijiang; Li, Wei; Zhang, Fan; Li, Ruirui

2017-01-01

Synthetic Aperture Radar (SAR) raw data simulation is a fundamental problem in radar system design and imaging algorithm research. The growth of surveying swath and resolution results in a significant increase in data volume and simulation period, which can be considered to be a comprehensive data intensive and computing intensive issue. Although several high performance computing (HPC) methods have demonstrated their potential for accelerating simulation, the input/output (I/O) bottleneck of huge raw data has not been eased. In this paper, we propose a cloud computing based SAR raw data simulation algorithm, which employs the MapReduce model to accelerate the raw data computing and the Hadoop distributed file system (HDFS) for fast I/O access. The MapReduce model is designed for the irregular parallel accumulation of raw data simulation, which greatly reduces the parallel efficiency of graphics processing unit (GPU) based simulation methods. In addition, three kinds of optimization strategies are put forward from the aspects of programming model, HDFS configuration and scheduling. The experimental results show that the cloud computing based algorithm achieves 4× speedup over the baseline serial approach in an 8-node cloud environment, and each optimization strategy can improve about 20%. This work proves that the proposed cloud algorithm is capable of solving the computing intensive and data intensive issues in SAR raw data simulation, and is easily extended to large scale computing to achieve higher acceleration. PMID:28075343
MRPack: Multi-Algorithm Execution Using Compute-Intensive Approach in MapReduce

PubMed Central

2015-01-01

Large quantities of data have been generated from multiple sources at exponential rates in the last few years. These data are generated at high velocity as real time and streaming data in variety of formats. These characteristics give rise to challenges in its modeling, computation, and processing. Hadoop MapReduce (MR) is a well known data-intensive distributed processing framework using the distributed file system (DFS) for Big Data. Current implementations of MR only support execution of a single algorithm in the entire Hadoop cluster. In this paper, we propose MapReducePack (MRPack), a variation of MR that supports execution of a set of related algorithms in a single MR job. We exploit the computational capability of a cluster by increasing the compute-intensiveness of MapReduce while maintaining its data-intensive approach. It uses the available computing resources by dynamically managing the task assignment and intermediate data. Intermediate data from multiple algorithms are managed using multi-key and skew mitigation strategies. The performance study of the proposed system shows that it is time, I/O, and memory efficient compared to the default MapReduce. The proposed approach reduces the execution time by 200% with an approximate 50% decrease in I/O cost. Complexity and qualitative results analysis shows significant performance improvement. PMID:26305223
MRPack: Multi-Algorithm Execution Using Compute-Intensive Approach in MapReduce.

PubMed

Idris, Muhammad; Hussain, Shujaat; Siddiqi, Muhammad Hameed; Hassan, Waseem; Syed Muhammad Bilal, Hafiz; Lee, Sungyoung

2015-01-01

Large quantities of data have been generated from multiple sources at exponential rates in the last few years. These data are generated at high velocity as real time and streaming data in variety of formats. These characteristics give rise to challenges in its modeling, computation, and processing. Hadoop MapReduce (MR) is a well known data-intensive distributed processing framework using the distributed file system (DFS) for Big Data. Current implementations of MR only support execution of a single algorithm in the entire Hadoop cluster. In this paper, we propose MapReducePack (MRPack), a variation of MR that supports execution of a set of related algorithms in a single MR job. We exploit the computational capability of a cluster by increasing the compute-intensiveness of MapReduce while maintaining its data-intensive approach. It uses the available computing resources by dynamically managing the task assignment and intermediate data. Intermediate data from multiple algorithms are managed using multi-key and skew mitigation strategies. The performance study of the proposed system shows that it is time, I/O, and memory efficient compared to the default MapReduce. The proposed approach reduces the execution time by 200% with an approximate 50% decrease in I/O cost. Complexity and qualitative results analysis shows significant performance improvement.
Wind velocity profile reconstruction from intensity fluctuations of a plane wave propagating in a turbulent atmosphere.

PubMed

Banakh, V A; Marakasov, D A

2007-08-01

Reconstruction of a wind profile based on the statistics of plane-wave intensity fluctuations in a turbulent atmosphere is considered. The algorithm for wind profile retrieval from the spatiotemporal spectrum of plane-wave weak intensity fluctuations is described, and the results of end-to-end computer experiments on wind profiling based on the developed algorithm are presented. It is shown that the reconstructing algorithm allows retrieval of a wind profile from turbulent plane-wave intensity fluctuations with acceptable accuracy.
Data intensive computing at Sandia.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wilson, Andrew T.

2010-09-01

Data-Intensive Computing is parallel computing where you design your algorithms and your software around efficient access and traversal of a data set; where hardware requirements are dictated by data size as much as by desired run times usually distilling compact results from massive data.
Colour based fire detection method with temporal intensity variation filtration

NASA Astrophysics Data System (ADS)

Trambitckii, K.; Anding, K.; Musalimov, V.; Linß, G.

2015-02-01

Development of video, computing technologies and computer vision gives a possibility of automatic fire detection on video information. Under that project different algorithms was implemented to find more efficient way of fire detection. In that article colour based fire detection algorithm is described. But it is not enough to use only colour information to detect fire properly. The main reason of this is that in the shooting conditions may be a lot of things having colour similar to fire. A temporary intensity variation of pixels is used to separate them from the fire. These variations are averaged over the series of several frames. This algorithm shows robust work and was realised as a computer program by using of the OpenCV library.
Developing Subdomain Allocation Algorithms Based on Spatial and Communicational Constraints to Accelerate Dust Storm Simulation

PubMed Central

Gui, Zhipeng; Yu, Manzhu; Yang, Chaowei; Jiang, Yunfeng; Chen, Songqing; Xia, Jizhe; Huang, Qunying; Liu, Kai; Li, Zhenlong; Hassan, Mohammed Anowarul; Jin, Baoxuan

2016-01-01

Dust storm has serious disastrous impacts on environment, human health, and assets. The developments and applications of dust storm models have contributed significantly to better understand and predict the distribution, intensity and structure of dust storms. However, dust storm simulation is a data and computing intensive process. To improve the computing performance, high performance computing has been widely adopted by dividing the entire study area into multiple subdomains and allocating each subdomain on different computing nodes in a parallel fashion. Inappropriate allocation may introduce imbalanced task loads and unnecessary communications among computing nodes. Therefore, allocation is a key factor that may impact the efficiency of parallel process. An allocation algorithm is expected to consider the computing cost and communication cost for each computing node to minimize total execution time and reduce overall communication cost for the entire simulation. This research introduces three algorithms to optimize the allocation by considering the spatial and communicational constraints: 1) an Integer Linear Programming (ILP) based algorithm from combinational optimization perspective; 2) a K-Means and Kernighan-Lin combined heuristic algorithm (K&K) integrating geometric and coordinate-free methods by merging local and global partitioning; 3) an automatic seeded region growing based geometric and local partitioning algorithm (ASRG). The performance and effectiveness of the three algorithms are compared based on different factors. Further, we adopt the K&K algorithm as the demonstrated algorithm for the experiment of dust model simulation with the non-hydrostatic mesoscale model (NMM-dust) and compared the performance with the MPI default sequential allocation. The results demonstrate that K&K method significantly improves the simulation performance with better subdomain allocation. This method can also be adopted for other relevant atmospheric and numerical modeling. PMID:27044039
An Evolutionary Algorithm for Fast Intensity Based Image Matching Between Optical and SAR Satellite Imagery

NASA Astrophysics Data System (ADS)

Fischer, Peter; Schuegraf, Philipp; Merkle, Nina; Storch, Tobias

2018-04-01

This paper presents a hybrid evolutionary algorithm for fast intensity based matching between satellite imagery from SAR and very high-resolution (VHR) optical sensor systems. The precise and accurate co-registration of image time series and images of different sensors is a key task in multi-sensor image processing scenarios. The necessary preprocessing step of image matching and tie-point detection is divided into a search problem and a similarity measurement. Within this paper we evaluate the use of an evolutionary search strategy for establishing the spatial correspondence between satellite imagery of optical and radar sensors. The aim of the proposed algorithm is to decrease the computational costs during the search process by formulating the search as an optimization problem. Based upon the canonical evolutionary algorithm, the proposed algorithm is adapted for SAR/optical imagery intensity based matching. Extensions are drawn using techniques like hybridization (e.g. local search) and others to lower the number of objective function calls and refine the result. The algorithm significantely decreases the computational costs whilst finding the optimal solution in a reliable way.
CT to Cone-beam CT Deformable Registration With Simultaneous Intensity Correction

PubMed Central

Zhen, Xin; Gu, Xuejun; Yan, Hao; Zhou, Linghong; Jia, Xun; Jiang, Steve B.

2012-01-01

Computed tomography (CT) to cone-beam computed tomography (CBCT) deformable image registration (DIR) is a crucial step in adaptive radiation therapy. Current intensity-based registration algorithms, such as demons, may fail in the context of CT-CBCT DIR because of inconsistent intensities between the two modalities. In this paper, we propose a variant of demons, called Deformation with Intensity Simultaneously Corrected (DISC), to deal with CT-CBCT DIR. DISC distinguishes itself from the original demons algorithm by performing an adaptive intensity correction step on the CBCT image at every iteration step of the demons registration. Specifically, the intensity correction of a voxel in CBCT is achieved by matching the first and the second moments of the voxel intensities inside a patch around the voxel with those on the CT image. It is expected that such a strategy can remove artifacts in the CBCT image, as well as ensuring the intensity consistency between the two modalities. DISC is implemented on computer graphics processing units (GPUs) in compute unified device architecture (CUDA) programming environment. The performance of DISC is evaluated on a simulated patient case and six clinical head-and-neck cancer patient data. It is found that DISC is robust against the CBCT artifacts and intensity inconsistency and significantly improves the registration accuracy when compared with the original demons. PMID:23032638
Inherent smoothness of intensity patterns for intensity modulated radiation therapy generated by simultaneous projection algorithms

NASA Astrophysics Data System (ADS)

Xiao, Ying; Michalski, Darek; Censor, Yair; Galvin, James M.

2004-07-01

The efficient delivery of intensity modulated radiation therapy (IMRT) depends on finding optimized beam intensity patterns that produce dose distributions, which meet given constraints for the tumour as well as any critical organs to be spared. Many optimization algorithms that are used for beamlet-based inverse planning are susceptible to large variations of neighbouring intensities. Accurately delivering an intensity pattern with a large number of extrema can prove impossible given the mechanical limitations of standard multileaf collimator (MLC) delivery systems. In this study, we apply Cimmino's simultaneous projection algorithm to the beamlet-based inverse planning problem, modelled mathematically as a system of linear inequalities. We show that using this method allows us to arrive at a smoother intensity pattern. Including nonlinear terms in the simultaneous projection algorithm to deal with dose-volume histogram (DVH) constraints does not compromise this property from our experimental observation. The smoothness properties are compared with those from other optimization algorithms which include simulated annealing and the gradient descent method. The simultaneous property of these algorithms is ideally suited to parallel computing technologies.
Architecutres, Models, Algorithms, and Software Tools for Configurable Computing

DTIC Science & Technology

2000-03-06

and J.G. Nash. The gated interconnection network for dynamic programming. Plenum, 1988 . [18] Ju wook Jang, Heonchul Park, and Viktor K. Prasanna. A ...Sep. 1997. [2] C. Ebeling, D. C. Cronquist , P. Franklin and C. Fisher, "RaPiD - A configurable computing architecture for compute-intensive...ABSTRACT (Maximum 200 words) The Models, Algorithms, and Architectures for Reconfigurable Computing (MAARC) project developed a sound framework for
A novel iris localization algorithm using correlation filtering

NASA Astrophysics Data System (ADS)

Pohit, Mausumi; Sharma, Jitu

2015-06-01

Fast and efficient segmentation of iris from the eye images is a primary requirement for robust database independent iris recognition. In this paper we have presented a new algorithm for computing the inner and outer boundaries of the iris and locating the pupil centre. Pupil-iris boundary computation is based on correlation filtering approach, whereas iris-sclera boundary is determined through one dimensional intensity mapping. The proposed approach is computationally less extensive when compared with the existing algorithms like Hough transform.
Wind profiling based on the optical beam intensity statistics in a turbulent atmosphere.

PubMed

Banakh, Victor A; Marakasov, Dimitrii A

2007-10-01

Reconstruction of the wind profile from the statistics of intensity fluctuations of an optical beam propagating in a turbulent atmosphere is considered. The equations for the spatiotemporal correlation function and the spectrum of weak intensity fluctuations of a Gaussian beam are obtained. The algorithms of wind profile retrieval from the spatiotemporal intensity spectrum are described and the results of end-to-end computer experiments on wind profiling based on the developed algorithms are presented. It is shown that the developed algorithms allow retrieval of the wind profile from the turbulent optical beam intensity fluctuations with acceptable accuracy in many practically feasible laser measurements set up in the atmosphere.
Parallel Architectures and Parallel Algorithms for Integrated Vision Systems. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Choudhary, Alok Nidhi

1989-01-01

Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform for a high level application (e.g., object recognition). An IVS normally involves algorithms from low level, intermediate level, and high level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues are addressed in parallel architectures and parallel algorithms for integrated vision systems.
Convex optimization problem prototyping for image reconstruction in computed tomography with the Chambolle-Pock algorithm

PubMed Central

Sidky, Emil Y.; Jørgensen, Jakob H.; Pan, Xiaochuan

2012-01-01

The primal-dual optimization algorithm developed in Chambolle and Pock (CP), 2011 is applied to various convex optimization problems of interest in computed tomography (CT) image reconstruction. This algorithm allows for rapid prototyping of optimization problems for the purpose of designing iterative image reconstruction algorithms for CT. The primal-dual algorithm is briefly summarized in the article, and its potential for prototyping is demonstrated by explicitly deriving CP algorithm instances for many optimization problems relevant to CT. An example application modeling breast CT with low-intensity X-ray illumination is presented. PMID:22538474
A Survey on the Feasibility of Sound Classification on Wireless Sensor Nodes

PubMed Central

Salomons, Etto L.; Havinga, Paul J. M.

2015-01-01

Wireless sensor networks are suitable to gain context awareness for indoor environments. As sound waves form a rich source of context information, equipping the nodes with microphones can be of great benefit. The algorithms to extract features from sound waves are often highly computationally intensive. This can be problematic as wireless nodes are usually restricted in resources. In order to be able to make a proper decision about which features to use, we survey how sound is used in the literature for global sound classification, age and gender classification, emotion recognition, person verification and identification and indoor and outdoor environmental sound classification. The results of the surveyed algorithms are compared with respect to accuracy and computational load. The accuracies are taken from the surveyed papers; the computational loads are determined by benchmarking the algorithms on an actual sensor node. We conclude that for indoor context awareness, the low-cost algorithms for feature extraction perform equally well as the more computationally-intensive variants. As the feature extraction still requires a large amount of processing time, we present four possible strategies to deal with this problem. PMID:25822142
Implementation of Multispectral Image Classification on a Remote Adaptive Computer

NASA Technical Reports Server (NTRS)

Figueiredo, Marco A.; Gloster, Clay S.; Stephens, Mark; Graves, Corey A.; Nakkar, Mouna

1999-01-01

As the demand for higher performance computers for the processing of remote sensing science algorithms increases, the need to investigate new computing paradigms its justified. Field Programmable Gate Arrays enable the implementation of algorithms at the hardware gate level, leading to orders of m a,gnitude performance increase over microprocessor based systems. The automatic classification of spaceborne multispectral images is an example of a computation intensive application, that, can benefit from implementation on an FPGA - based custom computing machine (adaptive or reconfigurable computer). A probabilistic neural network is used here to classify pixels of of a multispectral LANDSAT-2 image. The implementation described utilizes Java client/server application programs to access the adaptive computer from a remote site. Results verify that a remote hardware version of the algorithm (implemented on an adaptive computer) is significantly faster than a local software version of the same algorithm implemented on a typical general - purpose computer).

Contextual classification of multispectral image data: Approximate algorithm

NASA Technical Reports Server (NTRS)

Tilton, J. C. (Principal Investigator)

1980-01-01

An approximation to a classification algorithm incorporating spatial context information in a general, statistical manner is presented which is computationally less intensive. Classifications that are nearly as accurate are produced.
Randomized algorithms for high quality treatment planning in volumetric modulated arc therapy

NASA Astrophysics Data System (ADS)

Yang, Yu; Dong, Bin; Wen, Zaiwen

2017-02-01

In recent years, volumetric modulated arc therapy (VMAT) has been becoming a more and more important radiation technique widely used in clinical application for cancer treatment. One of the key problems in VMAT is treatment plan optimization, which is complicated due to the constraints imposed by the involved equipments. In this paper, we consider a model with four major constraints: the bound on the beam intensity, an upper bound on the rate of the change of the beam intensity, the moving speed of leaves of the multi-leaf collimator (MLC) and its directional-convexity. We solve the model by a two-stage algorithm: performing minimization with respect to the shapes of the aperture and the beam intensities alternatively. Specifically, the shapes of the aperture are obtained by a greedy algorithm whose performance is enhanced by random sampling in the leaf pairs with a decremental rate. The beam intensity is optimized using a gradient projection method with non-monotonic line search. We further improve the proposed algorithm by an incremental random importance sampling of the voxels to reduce the computational cost of the energy functional. Numerical simulations on two clinical cancer date sets demonstrate that our method is highly competitive to the state-of-the-art algorithms in terms of both computational time and quality of treatment planning.
A Discussion of Using a Reconfigurable Processor to Implement the Discrete Fourier Transform

NASA Technical Reports Server (NTRS)

White, Michael J.

2004-01-01

This paper presents the design and implementation of the Discrete Fourier Transform (DFT) algorithm on a reconfigurable processor system. While highly applicable to many engineering problems, the DFT is an extremely computationally intensive algorithm. Consequently, the eventual goal of this work is to enhance the execution of a floating-point precision DFT algorithm by off loading the algorithm from the computing system. This computing system, within the context of this research, is a typical high performance desktop computer with an may of field programmable gate arrays (FPGAs). FPGAs are hardware devices that are configured by software to execute an algorithm. If it is desired to change the algorithm, the software is changed to reflect the modification, then download to the FPGA, which is then itself modified. This paper will discuss methodology for developing the DFT algorithm to be implemented on the FPGA. We will discuss the algorithm, the FPGA code effort, and the results to date.
Applications in Data-Intensive Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shah, Anuj R.; Adkins, Joshua N.; Baxter, Douglas J.

2010-04-01

This book chapter, to be published in Advances in Computers, Volume 78, in 2010 describes applications of data intensive computing (DIC). This is an invited chapter resulting from a previous publication on DIC. This work summarizes efforts coming out of the PNNL's Data Intensive Computing Initiative. Advances in technology have empowered individuals with the ability to generate digital content with mouse clicks and voice commands. Digital pictures, emails, text messages, home videos, audio, and webpages are common examples of digital content that are generated on a regular basis. Data intensive computing facilitates human understanding of complex problems. Data-intensive applications providemore » timely and meaningful analytical results in response to exponentially growing data complexity and associated analysis requirements through the development of new classes of software, algorithms, and hardware.« less
OCT angiography by absolute intensity difference applied to normal and diseased human retinas

PubMed Central

Ruminski, Daniel; Sikorski, Bartosz L.; Bukowska, Danuta; Szkulmowski, Maciej; Krawiec, Krzysztof; Malukiewicz, Grazyna; Bieganowski, Lech; Wojtkowski, Maciej

2015-01-01

We compare four optical coherence tomography techniques for noninvasive visualization of microcapillary network in the human retina and murine cortex. We perform phantom studies to investigate contrast-to-noise ratio for angiographic images obtained with each of the algorithm. We show that the computationally simplest absolute intensity difference angiographic OCT algorithm that bases only on two cross-sectional intensity images may be successfully used in clinical study of healthy eyes and eyes with diabetic maculopathy and branch retinal vein occlusion. PMID:26309740
Accelerating artificial intelligence with reconfigurable computing

NASA Astrophysics Data System (ADS)

Cieszewski, Radoslaw

Reconfigurable computing is emerging as an important area of research in computer architectures and software systems. Many algorithms can be greatly accelerated by placing the computationally intense portions of an algorithm into reconfigurable hardware. Reconfigurable computing combines many benefits of both software and ASIC implementations. Like software, the mapped circuit is flexible, and can be changed over the lifetime of the system. Similar to an ASIC, reconfigurable systems provide a method to map circuits into hardware. Reconfigurable systems therefore have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. Such a field, where there is many different algorithms which can be accelerated, is an artificial intelligence. This paper presents example hardware implementations of Artificial Neural Networks, Genetic Algorithms and Expert Systems.
Approximation algorithms for planning and control

NASA Technical Reports Server (NTRS)

Boddy, Mark; Dean, Thomas

1989-01-01

A control system operating in a complex environment will encounter a variety of different situations, with varying amounts of time available to respond to critical events. Ideally, such a control system will do the best possible with the time available. In other words, its responses should approximate those that would result from having unlimited time for computation, where the degree of the approximation depends on the amount of time it actually has. There exist approximation algorithms for a wide variety of problems. Unfortunately, the solution to any reasonably complex control problem will require solving several computationally intensive problems. Algorithms for successive approximation are a subclass of the class of anytime algorithms, algorithms that return answers for any amount of computation time, where the answers improve as more time is allotted. An architecture is described for allocating computation time to a set of anytime algorithms, based on expectations regarding the value of the answers they return. The architecture described is quite general, producing optimal schedules for a set of algorithms under widely varying conditions.
Accelerated Adaptive MGS Phase Retrieval

NASA Technical Reports Server (NTRS)

Lam, Raymond K.; Ohara, Catherine M.; Green, Joseph J.; Bikkannavar, Siddarayappa A.; Basinger, Scott A.; Redding, David C.; Shi, Fang

2011-01-01

The Modified Gerchberg-Saxton (MGS) algorithm is an image-based wavefront-sensing method that can turn any science instrument focal plane into a wavefront sensor. MGS characterizes optical systems by estimating the wavefront errors in the exit pupil using only intensity images of a star or other point source of light. This innovative implementation of MGS significantly accelerates the MGS phase retrieval algorithm by using stream-processing hardware on conventional graphics cards. Stream processing is a relatively new, yet powerful, paradigm to allow parallel processing of certain applications that apply single instructions to multiple data (SIMD). These stream processors are designed specifically to support large-scale parallel computing on a single graphics chip. Computationally intensive algorithms, such as the Fast Fourier Transform (FFT), are particularly well suited for this computing environment. This high-speed version of MGS exploits commercially available hardware to accomplish the same objective in a fraction of the original time. The exploit involves performing matrix calculations in nVidia graphic cards. The graphical processor unit (GPU) is hardware that is specialized for computationally intensive, highly parallel computation. From the software perspective, a parallel programming model is used, called CUDA, to transparently scale multicore parallelism in hardware. This technology gives computationally intensive applications access to the processing power of the nVidia GPUs through a C/C++ programming interface. The AAMGS (Accelerated Adaptive MGS) software takes advantage of these advanced technologies, to accelerate the optical phase error characterization. With a single PC that contains four nVidia GTX-280 graphic cards, the new implementation can process four images simultaneously to produce a JWST (James Webb Space Telescope) wavefront measurement 60 times faster than the previous code.
Generation of Non-Homogeneous Poisson Processes by Thinning: Programming Considerations and Comparision with Competing Algorithms.

DTIC Science & Technology

1978-12-01

Poisson processes . The method is valid for Poisson processes with any given intensity function. The basic thinning algorithm is modified to exploit several refinements which reduce computer execution time by approximately one-third. The basic and modified thinning programs are compared with the Poisson decomposition and gap-statistics algorithm, which is easily implemented for Poisson processes with intensity functions of the form exp(a sub 0 + a sub 1t + a sub 2 t-squared. The thinning programs are competitive in both execution
Mathematical algorithm development and parametric studies with the GEOFRAC three-dimensional stochastic model of natural rock fracture systems

NASA Astrophysics Data System (ADS)

Ivanova, Violeta M.; Sousa, Rita; Murrihy, Brian; Einstein, Herbert H.

2014-06-01

This paper presents results from research conducted at MIT during 2010-2012 on modeling of natural rock fracture systems with the GEOFRAC three-dimensional stochastic model. Following a background summary of discrete fracture network models and a brief introduction of GEOFRAC, the paper provides a thorough description of the newly developed mathematical and computer algorithms for fracture intensity, aperture, and intersection representation, which have been implemented in MATLAB. The new methods optimize, in particular, the representation of fracture intensity in terms of cumulative fracture area per unit volume, P32, via the Poisson-Voronoi Tessellation of planes into polygonal fracture shapes. In addition, fracture apertures now can be represented probabilistically or deterministically whereas the newly implemented intersection algorithms allow for computing discrete pathways of interconnected fractures. In conclusion, results from a statistical parametric study, which was conducted with the enhanced GEOFRAC model and the new MATLAB-based Monte Carlo simulation program FRACSIM, demonstrate how fracture intensity, size, and orientations influence fracture connectivity.
PNNLs Data Intensive Computing research battles Homeland Security threats

ScienceCinema

David Thurman; Joe Kielman; Katherine Wolf; David Atkinson

2018-05-11

The Pacific Northwest National Laboratorys (PNNL's) approach to data intensive computing (DIC) is focused on three key research areas: hybrid hardware architecture, software architectures, and analytic algorithms. Advancements in these areas will help to address, and solve, DIC issues associated with capturing, managing, analyzing and understanding, in near real time, data at volumes and rates that push the frontiers of current technologies.
PNNL pushing scientific discovery through data intensive computing breakthroughs

ScienceCinema

Deborah Gracio; David Koppenaal; Ruby Leung

2018-05-18

The Pacific Northwest National Laboratory's approach to data intensive computing (DIC) is focused on three key research areas: hybrid hardware architectures, software architectures, and analytic algorithms. Advancements in these areas will help to address, and solve, DIC issues associated with capturing, managing, analyzing and understanding, in near real time, data at volumes and rates that push the frontiers of current technologies.
Accelerating Dust Storm Simulation by Balancing Task Allocation in Parallel Computing Environment

NASA Astrophysics Data System (ADS)

Gui, Z.; Yang, C.; XIA, J.; Huang, Q.; YU, M.

2013-12-01

Dust storm has serious negative impacts on environment, human health, and assets. The continuing global climate change has increased the frequency and intensity of dust storm in the past decades. To better understand and predict the distribution, intensity and structure of dust storm, a series of dust storm models have been developed, such as Dust Regional Atmospheric Model (DREAM), the NMM meteorological module (NMM-dust) and Chinese Unified Atmospheric Chemistry Environment for Dust (CUACE/Dust). The developments and applications of these models have contributed significantly to both scientific research and our daily life. However, dust storm simulation is a data and computing intensive process. Normally, a simulation for a single dust storm event may take several days or hours to run. It seriously impacts the timeliness of prediction and potential applications. To speed up the process, high performance computing is widely adopted. By partitioning a large study area into small subdomains according to their geographic location and executing them on different computing nodes in a parallel fashion, the computing performance can be significantly improved. Since spatiotemporal correlations exist in the geophysical process of dust storm simulation, each subdomain allocated to a node need to communicate with other geographically adjacent subdomains to exchange data. Inappropriate allocations may introduce imbalance task loads and unnecessary communications among computing nodes. Therefore, task allocation method is the key factor, which may impact the feasibility of the paralleling. The allocation algorithm needs to carefully leverage the computing cost and communication cost for each computing node to minimize total execution time and reduce overall communication cost for the entire system. This presentation introduces two algorithms for such allocation and compares them with evenly distributed allocation method. Specifically, 1) In order to get optimized solutions, a quadratic programming based modeling method is proposed. This algorithm performs well with small amount of computing tasks. However, its efficiency decreases significantly as the subdomain number and computing node number increase. 2) To compensate performance decreasing for large scale tasks, a K-Means clustering based algorithm is introduced. Instead of dedicating to get optimized solutions, this method can get relatively good feasible solutions within acceptable time. However, it may introduce imbalance communication for nodes or node-isolated subdomains. This research shows both two algorithms have their own strength and weakness for task allocation. A combination of the two algorithms is under study to obtain a better performance. Keywords: Scheduling; Parallel Computing; Load Balance; Optimization; Cost Model
Point spread function based classification of regions for linear digital tomosynthesis

NASA Astrophysics Data System (ADS)

Israni, Kenny; Avinash, Gopal; Li, Baojun

2007-03-01

In digital tomosynthesis, one of the limitations is the presence of out-of-plane blur due to the limited angle acquisition. The point spread function (PSF) characterizes blur in the imaging volume, and is shift-variant in tomosynthesis. The purpose of this research is to classify the tomosynthesis imaging volume into four different categories based on PSF-driven focus criteria. We considered linear tomosynthesis geometry and simple back projection algorithm for reconstruction. The three-dimensional PSF at every pixel in the imaging volume was determined. Intensity profiles were computed for every pixel by integrating the PSF-weighted intensities contained within the line segment defined by the PSF, at each slice. Classification rules based on these intensity profiles were used to categorize image regions. At background and low-frequency pixels, the derived intensity profiles were flat curves with relatively low and high maximum intensities respectively. At in-focus pixels, the maximum intensity of the profiles coincided with the PSF-weighted intensity of the pixel. At out-of-focus pixels, the PSF-weighted intensity of the pixel was always less than the maximum intensity of the profile. We validated our method using human observer classified regions as gold standard. Based on the computed and manual classifications, the mean sensitivity and specificity of the algorithm were 77+/-8.44% and 91+/-4.13% respectively (t=-0.64, p=0.56, DF=4). Such a classification algorithm may assist in mitigating out-of-focus blur from tomosynthesis image slices.
On the development of efficient algorithms for three dimensional fluid flow

NASA Technical Reports Server (NTRS)

Maccormack, R. W.

1988-01-01

The difficulties of constructing efficient algorithms for three-dimensional flow are discussed. Reasonable candidates are analyzed and tested, and most are found to have obvious shortcomings. Yet, there is promise that an efficient class of algorithms exist between the severely time-step sized-limited explicit or approximately factored algorithms and the computationally intensive direct inversion of large sparse matrices by Gaussian elimination.
Computer Generated Holography with Intensity-Graded Patterns

PubMed Central

Conti, Rossella; Assayag, Osnath; de Sars, Vincent; Guillon, Marc; Emiliani, Valentina

2016-01-01

Computer Generated Holography achieves patterned illumination at the sample plane through phase modulation of the laser beam at the objective back aperture. This is obtained by using liquid crystal-based spatial light modulators (LC-SLMs), which modulate the spatial phase of the incident laser beam. A variety of algorithms is employed to calculate the phase modulation masks addressed to the LC-SLM. These algorithms range from simple gratings-and-lenses to generate multiple diffraction-limited spots, to iterative Fourier-transform algorithms capable of generating arbitrary illumination shapes perfectly tailored on the base of the target contour. Applications for holographic light patterning include multi-trap optical tweezers, patterned voltage imaging and optical control of neuronal excitation using uncaging or optogenetics. These past implementations of computer generated holography used binary input profile to generate binary light distribution at the sample plane. Here we demonstrate that using graded input sources, enables generating intensity graded light patterns and extend the range of application of holographic light illumination. At first, we use intensity-graded holograms to compensate for LC-SLM position dependent diffraction efficiency or sample fluorescence inhomogeneity. Finally we show that intensity-graded holography can be used to equalize photo evoked currents from cells expressing different levels of chanelrhodopsin2 (ChR2), one of the most commonly used optogenetics light gated channels, taking into account the non-linear dependence of channel opening on incident light. PMID:27799896
A vectorized Lanczos eigensolver for high-performance computers

NASA Technical Reports Server (NTRS)

Bostic, Susan W.

1990-01-01

The computational strategies used to implement a Lanczos-based-method eigensolver on the latest generation of supercomputers are described. Several examples of structural vibration and buckling problems are presented that show the effects of using optimization techniques to increase the vectorization of the computational steps. The data storage and access schemes and the tools and strategies that best exploit the computer resources are presented. The method is implemented on the Convex C220, the Cray 2, and the Cray Y-MP computers. Results show that very good computation rates are achieved for the most computationally intensive steps of the Lanczos algorithm and that the Lanczos algorithm is many times faster than other methods extensively used in the past.
A parallel algorithm for the two-dimensional time fractional diffusion equation with implicit difference method.

PubMed

Gong, Chunye; Bao, Weimin; Tang, Guojian; Jiang, Yuewen; Liu, Jie

2014-01-01

It is very time consuming to solve fractional differential equations. The computational complexity of two-dimensional fractional differential equation (2D-TFDE) with iterative implicit finite difference method is O(M(x)M(y)N(2)). In this paper, we present a parallel algorithm for 2D-TFDE and give an in-depth discussion about this algorithm. A task distribution model and data layout with virtual boundary are designed for this parallel algorithm. The experimental results show that the parallel algorithm compares well with the exact solution. The parallel algorithm on single Intel Xeon X5540 CPU runs 3.16-4.17 times faster than the serial algorithm on single CPU core. The parallel efficiency of 81 processes is up to 88.24% compared with 9 processes on a distributed memory cluster system. We do think that the parallel computing technology will become a very basic method for the computational intensive fractional applications in the near future.
Lanczos eigensolution method for high-performance computers

NASA Technical Reports Server (NTRS)

Bostic, Susan W.

1991-01-01

The theory, computational analysis, and applications are presented of a Lanczos algorithm on high performance computers. The computationally intensive steps of the algorithm are identified as: the matrix factorization, the forward/backward equation solution, and the matrix vector multiples. These computational steps are optimized to exploit the vector and parallel capabilities of high performance computers. The savings in computational time from applying optimization techniques such as: variable band and sparse data storage and access, loop unrolling, use of local memory, and compiler directives are presented. Two large scale structural analysis applications are described: the buckling of a composite blade stiffened panel with a cutout, and the vibration analysis of a high speed civil transport. The sequential computational time for the panel problem executed on a CONVEX computer of 181.6 seconds was decreased to 14.1 seconds with the optimized vector algorithm. The best computational time of 23 seconds for the transport problem with 17,000 degs of freedom was on the the Cray-YMP using an average of 3.63 processors.
Image restoration for three-dimensional fluorescence microscopy using an orthonormal basis for efficient representation of depth-variant point-spread functions

PubMed Central

Patwary, Nurmohammed; Preza, Chrysanthe

2015-01-01

A depth-variant (DV) image restoration algorithm for wide field fluorescence microscopy, using an orthonormal basis decomposition of DV point-spread functions (PSFs), is investigated in this study. The efficient PSF representation is based on a previously developed principal component analysis (PCA), which is computationally intensive. We present an approach developed to reduce the number of DV PSFs required for the PCA computation, thereby making the PCA-based approach computationally tractable for thick samples. Restoration results from both synthetic and experimental images show consistency and that the proposed algorithm addresses efficiently depth-induced aberration using a small number of principal components. Comparison of the PCA-based algorithm with a previously-developed strata-based DV restoration algorithm demonstrates that the proposed method improves performance by 50% in terms of accuracy and simultaneously reduces the processing time by 64% using comparable computational resources. PMID:26504634

Comparison of optimization algorithms in intensity-modulated radiation therapy planning

NASA Astrophysics Data System (ADS)

Kendrick, Rachel

Intensity-modulated radiation therapy is used to better conform the radiation dose to the target, which includes avoiding healthy tissue. Planning programs employ optimization methods to search for the best fluence of each photon beam, and therefore to create the best treatment plan. The Computational Environment for Radiotherapy Research (CERR), a program written in MATLAB, was used to examine some commonly-used algorithms for one 5-beam plan. Algorithms include the genetic algorithm, quadratic programming, pattern search, constrained nonlinear optimization, simulated annealing, the optimization method used in Varian EclipseTM, and some hybrids of these. Quadratic programing, simulated annealing, and a quadratic/simulated annealing hybrid were also separately compared using different prescription doses. The results of each dose-volume histogram as well as the visual dose color wash were used to compare the plans. CERR's built-in quadratic programming provided the best overall plan, but avoidance of the organ-at-risk was rivaled by other programs. Hybrids of quadratic programming with some of these algorithms seems to suggest the possibility of better planning programs, as shown by the improved quadratic/simulated annealing plan when compared to the simulated annealing algorithm alone. Further experimentation will be done to improve cost functions and computational time.
Medical image registration by combining global and local information: a chain-type diffeomorphic demons algorithm.

PubMed

Liu, Xiaozheng; Yuan, Zhenming; Zhu, Junming; Xu, Dongrong

2013-12-07

The demons algorithm is a popular algorithm for non-rigid image registration because of its computational efficiency and simple implementation. The deformation forces of the classic demons algorithm were derived from image gradients by considering the deformation to decrease the intensity dissimilarity between images. However, the methods using the difference of image intensity for medical image registration are easily affected by image artifacts, such as image noise, non-uniform imaging and partial volume effects. The gradient magnitude image is constructed from the local information of an image, so the difference in a gradient magnitude image can be regarded as more reliable and robust for these artifacts. Then, registering medical images by considering the differences in both image intensity and gradient magnitude is a straightforward selection. In this paper, based on a diffeomorphic demons algorithm, we propose a chain-type diffeomorphic demons algorithm by combining the differences in both image intensity and gradient magnitude for medical image registration. Previous work had shown that the classic demons algorithm can be considered as an approximation of a second order gradient descent on the sum of the squared intensity differences. By optimizing the new dissimilarity criteria, we also present a set of new demons forces which were derived from the gradients of the image and gradient magnitude image. We show that, in controlled experiments, this advantage is confirmed, and yields a fast convergence.
An S N Algorithm for Modern Architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, Randal Scott

2016-08-29

LANL discrete ordinates transport packages are required to perform large, computationally intensive time-dependent calculations on massively parallel architectures, where even a single such calculation may need many months to complete. While KBA methods scale out well to very large numbers of compute nodes, we are limited by practical constraints on the number of such nodes we can actually apply to any given calculation. Instead, we describe a modified KBA algorithm that allows realization of the reductions in solution time offered by both the current, and future, architectural changes within a compute node.
Efficient mapping algorithms for scheduling robot inverse dynamics computation on a multiprocessor system

NASA Technical Reports Server (NTRS)

Lee, C. S. G.; Chen, C. L.

1989-01-01

Two efficient mapping algorithms for scheduling the robot inverse dynamics computation consisting of m computational modules with precedence relationship to be executed on a multiprocessor system consisting of p identical homogeneous processors with processor and communication costs to achieve minimum computation time are presented. An objective function is defined in terms of the sum of the processor finishing time and the interprocessor communication time. The minimax optimization is performed on the objective function to obtain the best mapping. This mapping problem can be formulated as a combination of the graph partitioning and the scheduling problems; both have been known to be NP-complete. Thus, to speed up the searching for a solution, two heuristic algorithms were proposed to obtain fast but suboptimal mapping solutions. The first algorithm utilizes the level and the communication intensity of the task modules to construct an ordered priority list of ready modules and the module assignment is performed by a weighted bipartite matching algorithm. For a near-optimal mapping solution, the problem can be solved by the heuristic algorithm with simulated annealing. These proposed optimization algorithms can solve various large-scale problems within a reasonable time. Computer simulations were performed to evaluate and verify the performance and the validity of the proposed mapping algorithms. Finally, experiments for computing the inverse dynamics of a six-jointed PUMA-like manipulator based on the Newton-Euler dynamic equations were implemented on an NCUBE/ten hypercube computer to verify the proposed mapping algorithms. Computer simulation and experimental results are compared and discussed.
Scalable software-defined optical networking with high-performance routing and wavelength assignment algorithms.

PubMed

Lee, Chankyun; Cao, Xiaoyuan; Yoshikane, Noboru; Tsuritani, Takehiro; Rhee, June-Koo Kevin

2015-10-19

The feasibility of software-defined optical networking (SDON) for a practical application critically depends on scalability of centralized control performance. The paper, highly scalable routing and wavelength assignment (RWA) algorithms are investigated on an OpenFlow-based SDON testbed for proof-of-concept demonstration. Efficient RWA algorithms are proposed to achieve high performance in achieving network capacity with reduced computation cost, which is a significant attribute in a scalable centralized-control SDON. The proposed heuristic RWA algorithms differ in the orders of request processes and in the procedures of routing table updates. Combined in a shortest-path-based routing algorithm, a hottest-request-first processing policy that considers demand intensity and end-to-end distance information offers both the highest throughput of networks and acceptable computation scalability. We further investigate trade-off relationship between network throughput and computation complexity in routing table update procedure by a simulation study.
An acceleration framework for synthetic aperture radar algorithms

NASA Astrophysics Data System (ADS)

Kim, Youngsoo; Gloster, Clay S.; Alexander, Winser E.

2017-04-01

Algorithms for radar signal processing, such as Synthetic Aperture Radar (SAR) are computationally intensive and require considerable execution time on a general purpose processor. Reconfigurable logic can be used to off-load the primary computational kernel onto a custom computing machine in order to reduce execution time by an order of magnitude as compared to kernel execution on a general purpose processor. Specifically, Field Programmable Gate Arrays (FPGAs) can be used to accelerate these kernels using hardware-based custom logic implementations. In this paper, we demonstrate a framework for algorithm acceleration. We used SAR as a case study to illustrate the potential for algorithm acceleration offered by FPGAs. Initially, we profiled the SAR algorithm and implemented a homomorphic filter using a hardware implementation of the natural logarithm. Experimental results show a linear speedup by adding reasonably small processing elements in Field Programmable Gate Array (FPGA) as opposed to using a software implementation running on a typical general purpose processor.
Volumetric visualization algorithm development for an FPGA-based custom computing machine

NASA Astrophysics Data System (ADS)

Sallinen, Sami J.; Alakuijala, Jyrki; Helminen, Hannu; Laitinen, Joakim

1998-05-01

Rendering volumetric medical images is a burdensome computational task for contemporary computers due to the large size of the data sets. Custom designed reconfigurable hardware could considerably speed up volume visualization if an algorithm suitable for the platform is used. We present an algorithm and speedup techniques for visualizing volumetric medical CT and MR images with a custom-computing machine based on a Field Programmable Gate Array (FPGA). We also present simulated performance results of the proposed algorithm calculated with a software implementation running on a desktop PC. Our algorithm is capable of generating perspective projection renderings of single and multiple isosurfaces with transparency, simulated X-ray images, and Maximum Intensity Projections (MIP). Although more speedup techniques exist for parallel projection than for perspective projection, we have constrained ourselves to perspective viewing, because of its importance in the field of radiotherapy. The algorithm we have developed is based on ray casting, and the rendering is sped up by three different methods: shading speedup by gradient precalculation, a new generalized version of Ray-Acceleration by Distance Coding (RADC), and background ray elimination by speculative ray selection.
Parallel Computing Strategies for Irregular Algorithms

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

2002-01-01

Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.
Development of a voltage-dependent current noise algorithm for conductance-based stochastic modelling of auditory nerve fibres.

PubMed

Badenhorst, Werner; Hanekom, Tania; Hanekom, Johan J

2016-12-01

This study presents the development of an alternative noise current term and novel voltage-dependent current noise algorithm for conductance-based stochastic auditory nerve fibre (ANF) models. ANFs are known to have significant variance in threshold stimulus which affects temporal characteristics such as latency. This variance is primarily caused by the stochastic behaviour or microscopic fluctuations of the node of Ranvier's voltage-dependent sodium channels of which the intensity is a function of membrane voltage. Though easy to implement and low in computational cost, existing current noise models have two deficiencies: it is independent of membrane voltage, and it is unable to inherently determine the noise intensity required to produce in vivo measured discharge probability functions. The proposed algorithm overcomes these deficiencies while maintaining its low computational cost and ease of implementation compared to other conductance and Markovian-based stochastic models. The algorithm is applied to a Hodgkin-Huxley-based compartmental cat ANF model and validated via comparison of the threshold probability and latency distributions to measured cat ANF data. Simulation results show the algorithm's adherence to in vivo stochastic fibre characteristics such as an exponential relationship between the membrane noise and transmembrane voltage, a negative linear relationship between the log of the relative spread of the discharge probability and the log of the fibre diameter and a decrease in latency with an increase in stimulus intensity.
Phase retrieval from intensity-only data by relative entropy minimization.

PubMed

Deming, Ross W

2007-11-01

A recursive algorithm, which appears to be new, is presented for estimating the amplitude and phase of a wave field from intensity-only measurements on two or more scan planes at different axial positions. The problem is framed as a nonlinear optimization, in which the angular spectrum of the complex field model is adjusted in order to minimize the relative entropy, or Kullback-Leibler divergence, between the measured and reconstructed intensities. The most common approach to this so-called phase retrieval problem is a variation of the well-known Gerchberg-Saxton algorithm devised by Misell (J. Phys. D6, L6, 1973), which is efficient and extremely simple to implement. The new algorithm has a computational structure that is very similar to Misell's approach, despite the fundamental difference in the optimization criteria used for each. Based upon results from noisy simulated data, the new algorithm appears to be more robust than Misell's approach and to produce better results from low signal-to-noise ratio data. The convergence of the new algorithm is examined.
Investigation of the effect of atmospheric dust on the determination of total ozone from the earth's ultraviolet reflectivity measurements, 1

NASA Technical Reports Server (NTRS)

Dave, J. V.

1976-01-01

Two computer algorithms are described. These algorithms were used for computing the aximuth-independent component of the intensity of the monochromatic radiation emerging at the top of a pseudo-spherical atmosphere with arbitrary vertical distribution of ozone, and with any arbitrary height distribution of up to two different kinds of aerosol. This atmospheric model was assumed to rest on a surface obeying Lambert's law of reflection.
Robust incremental compensation of the light attenuation with depth in 3D fluorescence microscopy.

PubMed

Kervrann, C; Legland, D; Pardini, L

2004-06-01

Summary Fluorescent signal intensities from confocal laser scanning microscopes (CLSM) suffer from several distortions inherent to the method. Namely, layers which lie deeper within the specimen are relatively dark due to absorption and scattering of both excitation and fluorescent light, photobleaching and/or other factors. Because of these effects, a quantitative analysis of images is not always possible without correction. Under certain assumptions, the decay of intensities can be estimated and used for a partial depth intensity correction. In this paper we propose an original robust incremental method for compensating the attenuation of intensity signals. Most previous correction methods are more or less empirical and based on fitting a decreasing parametric function to the section mean intensity curve computed by summing all pixel values in each section. The fitted curve is then used for the calculation of correction factors for each section and a new compensated sections series is computed. However, these methods do not perfectly correct the images. Hence, the algorithm we propose for the automatic correction of intensities relies on robust estimation, which automatically ignores pixels where measurements deviate from the decay model. It is based on techniques adopted from the computer vision literature for image motion estimation. The resulting algorithm is used to correct volumes acquired in CLSM. An implementation of such a restoration filter is discussed and examples of successful restorations are given.
Low-Light Image Enhancement Using Adaptive Digital Pixel Binning

PubMed Central

Yoo, Yoonjong; Im, Jaehyun; Paik, Joonki

2015-01-01

This paper presents an image enhancement algorithm for low-light scenes in an environment with insufficient illumination. Simple amplification of intensity exhibits various undesired artifacts: noise amplification, intensity saturation, and loss of resolution. In order to enhance low-light images without undesired artifacts, a novel digital binning algorithm is proposed that considers brightness, context, noise level, and anti-saturation of a local region in the image. The proposed algorithm does not require any modification of the image sensor or additional frame-memory; it needs only two line-memories in the image signal processor (ISP). Since the proposed algorithm does not use an iterative computation, it can be easily embedded in an existing digital camera ISP pipeline containing a high-resolution image sensor. PMID:26121609
An efficient and accurate 3D displacements tracking strategy for digital volume correlation

NASA Astrophysics Data System (ADS)

Pan, Bing; Wang, Bo; Wu, Dafang; Lubineau, Gilles

2014-07-01

Owing to its inherent computational complexity, practical implementation of digital volume correlation (DVC) for internal displacement and strain mapping faces important challenges in improving its computational efficiency. In this work, an efficient and accurate 3D displacement tracking strategy is proposed for fast DVC calculation. The efficiency advantage is achieved by using three improvements. First, to eliminate the need of updating Hessian matrix in each iteration, an efficient 3D inverse compositional Gauss-Newton (3D IC-GN) algorithm is introduced to replace existing forward additive algorithms for accurate sub-voxel displacement registration. Second, to ensure the 3D IC-GN algorithm that converges accurately and rapidly and avoid time-consuming integer-voxel displacement searching, a generalized reliability-guided displacement tracking strategy is designed to transfer accurate and complete initial guess of deformation for each calculation point from its computed neighbors. Third, to avoid the repeated computation of sub-voxel intensity interpolation coefficients, an interpolation coefficient lookup table is established for tricubic interpolation. The computational complexity of the proposed fast DVC and the existing typical DVC algorithms are first analyzed quantitatively according to necessary arithmetic operations. Then, numerical tests are performed to verify the performance of the fast DVC algorithm in terms of measurement accuracy and computational efficiency. The experimental results indicate that, compared with the existing DVC algorithm, the presented fast DVC algorithm produces similar precision and slightly higher accuracy at a substantially reduced computational cost.
The Impact of Monte Carlo Dose Calculations on Intensity-Modulated Radiation Therapy

NASA Astrophysics Data System (ADS)

Siebers, J. V.; Keall, P. J.; Mohan, R.

The effect of dose calculation accuracy for IMRT was studied by comparing different dose calculation algorithms. A head and neck IMRT plan was optimized using a superposition dose calculation algorithm. Dose was re-computed for the optimized plan using both Monte Carlo and pencil beam dose calculation algorithms to generate patient and phantom dose distributions. Tumor control probabilities (TCP) and normal tissue complication probabilities (NTCP) were computed to estimate the plan outcome. For the treatment plan studied, Monte Carlo best reproduces phantom dose measurements, the TCP was slightly lower than the superposition and pencil beam results, and the NTCP values differed little.
TORC3: Token-ring clearing heuristic for currency circulation

NASA Astrophysics Data System (ADS)

Humes, Carlos, Jr.; Lauretto, Marcelo S.; Nakano, Fábio; Pereira, Carlos A. B.; Rafare, Guilherme F. G.; Stern, Julio Michael

2012-10-01

Clearing algorithms are at the core of modern payment systems, facilitating the settling of multilateral credit messages with (near) minimum transfers of currency. Traditional clearing procedures use batch processing based on MILP - mixed-integer linear programming algorithms. The MILP approach demands intensive computational resources; moreover, it is also vulnerable to operational risks generated by possible defaults during the inter-batch period. This paper presents TORC3 - the Token-Ring Clearing Algorithm for Currency Circulation. In contrast to the MILP approach, TORC3 is a real time heuristic procedure, demanding modest computational resources, and able to completely shield the clearing operation against the participating agents' risk of default.
Computational Discovery of Materials Using the Firefly Algorithm

NASA Astrophysics Data System (ADS)

Avendaño-Franco, Guillermo; Romero, Aldo

Our current ability to model physical phenomena accurately, the increase computational power and better algorithms are the driving forces behind the computational discovery and design of novel materials, allowing for virtual characterization before their realization in the laboratory. We present the implementation of a novel firefly algorithm, a population-based algorithm for global optimization for searching the structure/composition space. This novel computation-intensive approach naturally take advantage of concurrency, targeted exploration and still keeping enough diversity. We apply the new method in both periodic and non-periodic structures and we present the implementation challenges and solutions to improve efficiency. The implementation makes use of computational materials databases and network analysis to optimize the search and get insights about the geometric structure of local minima on the energy landscape. The method has been implemented in our software PyChemia, an open-source package for materials discovery. We acknowledge the support of DMREF-NSF 1434897 and the Donors of the American Chemical Society Petroleum Research Fund for partial support of this research under Contract 54075-ND10.
The importance of ray pathlengths when measuring objects in maximum intensity projection images.

PubMed

Schreiner, S; Dawant, B M; Paschal, C B; Galloway, R L

1996-01-01

It is important to understand any process that affects medical data. Once the data have changed from the original form, one must consider the possibility that the information contained in the data has also changed. In general, false negative and false positive diagnoses caused by this post-processing must be minimized. Medical imaging is one area in which post-processing is commonly performed, but there is often little or no discussion of how these algorithms affect the data. This study uncovers some interesting properties of maximum intensity projection (MIP) algorithms which are commonly used in the post-processing of magnetic resonance (MR) and computed tomography (CT) angiographic data. The appearance of the width of vessels and the extent of malformations such as aneurysms is of interest to clinicians. This study will show how MIP algorithms interact with the shape of the object being projected. MIP's can make objects appear thinner in the projection than in the original data set and also alter the shape of the profile of the object seen in the original data. These effects have consequences for width-measuring algorithms which will be discussed. Each projected intensity is dependent upon the pathlength of the ray from which the projected pixel arises. The morphology (shape and intensity profile) of an object will change the pathlength that each ray experiences. This is termed the pathlength effect. In order to demonstrate the pathlength effect, simple computer models of an imaged vessel were created. Additionally, a static MR phantom verified that the derived equation for the projection-plane probability density function (pdf) predicts the projection-plane intensities well (R(2)=0.96). Finally, examples of projections through in vivo MR angiography and CT angiography data are presented.
TU-AB-303-08: GPU-Based Software Platform for Efficient Image-Guided Adaptive Radiation Therapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Park, S; Robinson, A; McNutt, T

2015-06-15

Purpose: In this study, we develop an integrated software platform for adaptive radiation therapy (ART) that combines fast and accurate image registration, segmentation, and dose computation/accumulation methods. Methods: The proposed system consists of three key components; 1) deformable image registration (DIR), 2) automatic segmentation, and 3) dose computation/accumulation. The computationally intensive modules including DIR and dose computation have been implemented on a graphics processing unit (GPU). All required patient-specific data including the planning CT (pCT) with contours, daily cone-beam CTs, and treatment plan are automatically queried and retrieved from their own databases. To improve the accuracy of DIR between pCTmore » and CBCTs, we use the double force demons DIR algorithm in combination with iterative CBCT intensity correction by local intensity histogram matching. Segmentation of daily CBCT is then obtained by propagating contours from the pCT. Daily dose delivered to the patient is computed on the registered pCT by a GPU-accelerated superposition/convolution algorithm. Finally, computed daily doses are accumulated to show the total delivered dose to date. Results: Since the accuracy of DIR critically affects the quality of the other processes, we first evaluated our DIR method on eight head-and-neck cancer cases and compared its performance. Normalized mutual-information (NMI) and normalized cross-correlation (NCC) computed as similarity measures, and our method produced overall NMI of 0.663 and NCC of 0.987, outperforming conventional methods by 3.8% and 1.9%, respectively. Experimental results show that our registration method is more consistent and roust than existing algorithms, and also computationally efficient. Computation time at each fraction took around one minute (30–50 seconds for registration and 15–25 seconds for dose computation). Conclusion: We developed an integrated GPU-accelerated software platform that enables accurate and efficient DIR, auto-segmentation, and dose computation, thus supporting an efficient ART workflow. This work was supported by NIH/NCI under grant R42CA137886.« less
A micro-hydrology computation ordering algorithm

NASA Astrophysics Data System (ADS)

Croley, Thomas E.

1980-11-01

Discrete-distributed-parameter models are essential for watershed modelling where practical consideration of spatial variations in watershed properties and inputs is desired. Such modelling is necessary for analysis of detailed hydrologic impacts from management strategies and land-use effects. Trade-offs between model validity and model complexity exist in resolution of the watershed. Once these are determined, the watershed is then broken into sub-areas which each have essentially spatially-uniform properties. Lumped-parameter (micro-hydrology) models are applied to these sub-areas and their outputs are combined through the use of a computation ordering technique, as illustrated by many discrete-distributed-parameter hydrology models. Manual ordering of these computations requires fore-thought, and is tedious, error prone, sometimes storage intensive and least adaptable to changes in watershed resolution. A programmable algorithm for ordering micro-hydrology computations is presented that enables automatic ordering of computations within the computer via an easily understood and easily implemented "node" definition, numbering and coding scheme. This scheme and the algorithm are detailed in logic flow-charts and an example application is presented. Extensions and modifications of the algorithm are easily made for complex geometries or differing microhydrology models. The algorithm is shown to be superior to manual ordering techniques and has potential use in high-resolution studies.

GPU-accelerated phase extraction algorithm for interferograms: a real-time application

NASA Astrophysics Data System (ADS)

Zhu, Xiaoqiang; Wu, Yongqian; Liu, Fengwei

2016-11-01

Optical testing, having the merits of non-destruction and high sensitivity, provides a vital guideline for optical manufacturing. But the testing process is often computationally intensive and expensive, usually up to a few seconds, which is sufferable for dynamic testing. In this paper, a GPU-accelerated phase extraction algorithm is proposed, which is based on the advanced iterative algorithm. The accelerated algorithm can extract the right phase-distribution from thirteen 1024x1024 fringe patterns with arbitrary phase shifts in 233 milliseconds on average using NVIDIA Quadro 4000 graphic card, which achieved a 12.7x speedup ratio than the same algorithm executed on CPU and 6.6x speedup ratio than that on Matlab using DWANING W5801 workstation. The performance improvement can fulfill the demand of computational accuracy and real-time application.
A Parallel and Incremental Approach for Data-Intensive Learning of Bayesian Networks.

PubMed

Yue, Kun; Fang, Qiyu; Wang, Xiaoling; Li, Jin; Liu, Weiyi

2015-12-01

Bayesian network (BN) has been adopted as the underlying model for representing and inferring uncertain knowledge. As the basis of realistic applications centered on probabilistic inferences, learning a BN from data is a critical subject of machine learning, artificial intelligence, and big data paradigms. Currently, it is necessary to extend the classical methods for learning BNs with respect to data-intensive computing or in cloud environments. In this paper, we propose a parallel and incremental approach for data-intensive learning of BNs from massive, distributed, and dynamically changing data by extending the classical scoring and search algorithm and using MapReduce. First, we adopt the minimum description length as the scoring metric and give the two-pass MapReduce-based algorithms for computing the required marginal probabilities and scoring the candidate graphical model from sample data. Then, we give the corresponding strategy for extending the classical hill-climbing algorithm to obtain the optimal structure, as well as that for storing a BN by pairs. Further, in view of the dynamic characteristics of the changing data, we give the concept of influence degree to measure the coincidence of the current BN with new data, and then propose the corresponding two-pass MapReduce-based algorithms for BNs incremental learning. Experimental results show the efficiency, scalability, and effectiveness of our methods.
A Parallel Point Matching Algorithm for Landmark Based Image Registration Using Multicore Platform

PubMed Central

Yang, Lin; Gong, Leiguang; Zhang, Hong; Nosher, John L.; Foran, David J.

2013-01-01

Point matching is crucial for many computer vision applications. Establishing the correspondence between a large number of data points is a computationally intensive process. Some point matching related applications, such as medical image registration, require real time or near real time performance if applied to critical clinical applications like image assisted surgery. In this paper, we report a new multicore platform based parallel algorithm for fast point matching in the context of landmark based medical image registration. We introduced a non-regular data partition algorithm which utilizes the K-means clustering algorithm to group the landmarks based on the number of available processing cores, which optimize the memory usage and data transfer. We have tested our method using the IBM Cell Broadband Engine (Cell/B.E.) platform. The results demonstrated a significant speed up over its sequential implementation. The proposed data partition and parallelization algorithm, though tested only on one multicore platform, is generic by its design. Therefore the parallel algorithm can be extended to other computing platforms, as well as other point matching related applications. PMID:24308014
Accelerated computer generated holography using sparse bases in the STFT domain.

PubMed

Blinder, David; Schelkens, Peter

2018-01-22

Computer-generated holography at high resolutions is a computationally intensive task. Efficient algorithms are needed to generate holograms at acceptable speeds, especially for real-time and interactive applications such as holographic displays. We propose a novel technique to generate holograms using a sparse basis representation in the short-time Fourier space combined with a wavefront-recording plane placed in the middle of the 3D object. By computing the point spread functions in the transform domain, we update only a small subset of the precomputed largest-magnitude coefficients to significantly accelerate the algorithm over conventional look-up table methods. We implement the algorithm on a GPU, and report a speedup factor of over 30. We show that this transform is superior over wavelet-based approaches, and show quantitative and qualitative improvements over the state-of-the-art WASABI method; we report accuracy gains of 2dB PSNR, as well improved view preservation.
Three-dimensional spiral CT during arterial portography: comparison of three rendering techniques.

PubMed

Heath, D G; Soyer, P A; Kuszyk, B S; Bliss, D F; Calhoun, P S; Bluemke, D A; Choti, M A; Fishman, E K

1995-07-01

The three most common techniques for three-dimensional reconstruction are surface rendering, maximum-intensity projection (MIP), and volume rendering. Surface-rendering algorithms model objects as collections of geometric primitives that are displayed with surface shading. The MIP algorithm renders an image by selecting the voxel with the maximum intensity signal along a line extended from the viewer's eye through the data volume. Volume-rendering algorithms sum the weighted contributions of all voxels along the line. Each technique has advantages and shortcomings that must be considered during selection of one for a specific clinical problem and during interpretation of the resulting images. With surface rendering, sharp-edged, clear three-dimensional reconstruction can be completed on modest computer systems; however, overlapping structures cannot be visualized and artifacts are a problem. MIP is computationally a fast technique, but it does not allow depiction of overlapping structures, and its images are three-dimensionally ambiguous unless depth cues are provided. Both surface rendering and MIP use less than 10% of the image data. In contrast, volume rendering uses nearly all of the data, allows demonstration of overlapping structures, and engenders few artifacts, but it requires substantially more computer power than the other techniques.
Infrared and visible image fusion based on total variation and augmented Lagrangian.

PubMed

Guo, Hanqi; Ma, Yong; Mei, Xiaoguang; Ma, Jiayi

2017-11-01

This paper proposes a new algorithm for infrared and visible image fusion based on gradient transfer that achieves fusion by preserving the intensity of the infrared image and then transferring gradients in the corresponding visible one to the result. The gradient transfer suffers from the problems of low dynamic range and detail loss because it ignores the intensity from the visible image. The new algorithm solves these problems by providing additive intensity from the visible image to balance the intensity between the infrared image and the visible one. It formulates the fusion task as an l 1 -l 1 -TV minimization problem and then employs variable splitting and augmented Lagrangian to convert the unconstrained problem to a constrained one that can be solved in the framework of alternating the multiplier direction method. Experiments demonstrate that the new algorithm achieves better fusion results with a high computation efficiency in both qualitative and quantitative tests than gradient transfer and most state-of-the-art methods.
Advanced processing for high-bandwidth sensor systems

NASA Astrophysics Data System (ADS)

Szymanski, John J.; Blain, Phil C.; Bloch, Jeffrey J.; Brislawn, Christopher M.; Brumby, Steven P.; Cafferty, Maureen M.; Dunham, Mark E.; Frigo, Janette R.; Gokhale, Maya; Harvey, Neal R.; Kenyon, Garrett; Kim, Won-Ha; Layne, J.; Lavenier, Dominique D.; McCabe, Kevin P.; Mitchell, Melanie; Moore, Kurt R.; Perkins, Simon J.; Porter, Reid B.; Robinson, S.; Salazar, Alfonso; Theiler, James P.; Young, Aaron C.

2000-11-01

Compute performance and algorithm design are key problems of image processing and scientific computing in general. For example, imaging spectrometers are capable of producing data in hundreds of spectral bands with millions of pixels. These data sets show great promise for remote sensing applications, but require new and computationally intensive processing. The goal of the Deployable Adaptive Processing Systems (DAPS) project at Los Alamos National Laboratory is to develop advanced processing hardware and algorithms for high-bandwidth sensor applications. The project has produced electronics for processing multi- and hyper-spectral sensor data, as well as LIDAR data, while employing processing elements using a variety of technologies. The project team is currently working on reconfigurable computing technology and advanced feature extraction techniques, with an emphasis on their application to image and RF signal processing. This paper presents reconfigurable computing technology and advanced feature extraction algorithm work and their application to multi- and hyperspectral image processing. Related projects on genetic algorithms as applied to image processing will be introduced, as will the collaboration between the DAPS project and the DARPA Adaptive Computing Systems program. Further details are presented in other talks during this conference and in other conferences taking place during this symposium.
Benchmarking Memory Performance with the Data Cube Operator

NASA Technical Reports Server (NTRS)

Frumkin, Michael A.; Shabanov, Leonid V.

2004-01-01

Data movement across a computer memory hierarchy and across computational grids is known to be a limiting factor for applications processing large data sets. We use the Data Cube Operator on an Arithmetic Data Set, called ADC, to benchmark capabilities of computers and of computational grids to handle large distributed data sets. We present a prototype implementation of a parallel algorithm for computation of the operatol: The algorithm follows a known approach for computing views from the smallest parent. The ADC stresses all levels of grid memory and storage by producing some of 2d views of an Arithmetic Data Set of d-tuples described by a small number of integers. We control data intensity of the ADC by selecting the tuple parameters, the sizes of the views, and the number of realized views. Benchmarking results of memory performance of a number of computer architectures and of a small computational grid are presented.
Adaptive control of turbulence intensity is accelerated by frugal flow sampling.

PubMed

Quinn, Daniel B; van Halder, Yous; Lentink, David

2017-11-01

The aerodynamic performance of vehicles and animals, as well as the productivity of turbines and energy harvesters, depends on the turbulence intensity of the incoming flow. Previous studies have pointed at the potential benefits of active closed-loop turbulence control. However, it is unclear what the minimal sensory and algorithmic requirements are for realizing this control. Here we show that very low-bandwidth anemometers record sufficient information for an adaptive control algorithm to converge quickly. Our online Newton-Raphson algorithm tunes the turbulence in a recirculating wind tunnel by taking readings from an anemometer in the test section. After starting at 9% turbulence intensity, the algorithm converges on values ranging from 10% to 45% in less than 12 iterations within 1% accuracy. By down-sampling our measurements, we show that very-low-bandwidth anemometers record sufficient information for convergence. Furthermore, down-sampling accelerates convergence by smoothing gradients in turbulence intensity. Our results explain why low-bandwidth anemometers in engineering and mechanoreceptors in biology may be sufficient for adaptive control of turbulence intensity. Finally, our analysis suggests that, if certain turbulent eddy sizes are more important to control than others, frugal adaptive control schemes can be particularly computationally effective for improving performance. © 2017 The Author(s).
Parameters Free Computational Characterization of Defects in Transition Metal Oxides with Diffusion Quantum Monte Carlo

NASA Astrophysics Data System (ADS)

Santana, Juan A.; Krogel, Jaron T.; Kent, Paul R.; Reboredo, Fernando

Materials based on transition metal oxides (TMO's) are among the most challenging systems for computational characterization. Reliable and practical computations are possible by directly solving the many-body problem for TMO's with quantum Monte Carlo (QMC) methods. These methods are very computationally intensive, but recent developments in algorithms and computational infrastructures have enabled their application to real materials. We will show our efforts on the application of the diffusion quantum Monte Carlo (DMC) method to study the formation of defects in binary and ternary TMO and heterostructures of TMO. We will also outline current limitations in hardware and algorithms. This work is supported by the Materials Sciences & Engineering Division of the Office of Basic Energy Sciences, U.S. Department of Energy (DOE).
Real-time implementation of optimized maximum noise fraction transform for feature extraction of hyperspectral images

NASA Astrophysics Data System (ADS)

Wu, Yuanfeng; Gao, Lianru; Zhang, Bing; Zhao, Haina; Li, Jun

2014-01-01

We present a parallel implementation of the optimized maximum noise fraction (G-OMNF) transform algorithm for feature extraction of hyperspectral images on commodity graphics processing units (GPUs). The proposed approach explored the algorithm data-level concurrency and optimized the computing flow. We first defined a three-dimensional grid, in which each thread calculates a sub-block data to easily facilitate the spatial and spectral neighborhood data searches in noise estimation, which is one of the most important steps involved in OMNF. Then, we optimized the processing flow and computed the noise covariance matrix before computing the image covariance matrix to reduce the original hyperspectral image data transmission. These optimization strategies can greatly improve the computing efficiency and can be applied to other feature extraction algorithms. The proposed parallel feature extraction algorithm was implemented on an Nvidia Tesla GPU using the compute unified device architecture and basic linear algebra subroutines library. Through the experiments on several real hyperspectral images, our GPU parallel implementation provides a significant speedup of the algorithm compared with the CPU implementation, especially for highly data parallelizable and arithmetically intensive algorithm parts, such as noise estimation. In order to further evaluate the effectiveness of G-OMNF, we used two different applications: spectral unmixing and classification for evaluation. Considering the sensor scanning rate and the data acquisition time, the proposed parallel implementation met the on-board real-time feature extraction.
GPU-based parallel algorithm for blind image restoration using midfrequency-based methods

NASA Astrophysics Data System (ADS)

Xie, Lang; Luo, Yi-han; Bao, Qi-liang

2013-08-01

GPU-based general-purpose computing is a new branch of modern parallel computing, so the study of parallel algorithms specially designed for GPU hardware architecture is of great significance. In order to solve the problem of high computational complexity and poor real-time performance in blind image restoration, the midfrequency-based algorithm for blind image restoration was analyzed and improved in this paper. Furthermore, a midfrequency-based filtering method is also used to restore the image hardly with any recursion or iteration. Combining the algorithm with data intensiveness, data parallel computing and GPU execution model of single instruction and multiple threads, a new parallel midfrequency-based algorithm for blind image restoration is proposed in this paper, which is suitable for stream computing of GPU. In this algorithm, the GPU is utilized to accelerate the estimation of class-G point spread functions and midfrequency-based filtering. Aiming at better management of the GPU threads, the threads in a grid are scheduled according to the decomposition of the filtering data in frequency domain after the optimization of data access and the communication between the host and the device. The kernel parallelism structure is determined by the decomposition of the filtering data to ensure the transmission rate to get around the memory bandwidth limitation. The results show that, with the new algorithm, the operational speed is significantly increased and the real-time performance of image restoration is effectively improved, especially for high-resolution images.
A low-cost vector processor boosting compute-intensive image processing operations

NASA Technical Reports Server (NTRS)

Adorf, Hans-Martin

1992-01-01

Low-cost vector processing (VP) is within reach of everyone seriously engaged in scientific computing. The advent of affordable add-on VP-boards for standard workstations complemented by mathematical/statistical libraries is beginning to impact compute-intensive tasks such as image processing. A case in point in the restoration of distorted images from the Hubble Space Telescope. A low-cost implementation is presented of the standard Tarasko-Richardson-Lucy restoration algorithm on an Intel i860-based VP-board which is seamlessly interfaced to a commercial, interactive image processing system. First experience is reported (including some benchmarks for standalone FFT's) and some conclusions are drawn.
Optimizing ion channel models using a parallel genetic algorithm on graphical processors.

PubMed

Ben-Shalom, Roy; Aviv, Amit; Razon, Benjamin; Korngreen, Alon

2012-01-01

We have recently shown that we can semi-automatically constrain models of voltage-gated ion channels by combining a stochastic search algorithm with ionic currents measured using multiple voltage-clamp protocols. Although numerically successful, this approach is highly demanding computationally, with optimization on a high performance Linux cluster typically lasting several days. To solve this computational bottleneck we converted our optimization algorithm for work on a graphical processing unit (GPU) using NVIDIA's CUDA. Parallelizing the process on a Fermi graphic computing engine from NVIDIA increased the speed ∼180 times over an application running on an 80 node Linux cluster, considerably reducing simulation times. This application allows users to optimize models for ion channel kinetics on a single, inexpensive, desktop "super computer," greatly reducing the time and cost of building models relevant to neuronal physiology. We also demonstrate that the point of algorithm parallelization is crucial to its performance. We substantially reduced computing time by solving the ODEs (Ordinary Differential Equations) so as to massively reduce memory transfers to and from the GPU. This approach may be applied to speed up other data intensive applications requiring iterative solutions of ODEs. Copyright © 2012 Elsevier B.V. All rights reserved.
A parallel implementation of the network identification by multiple regression (NIR) algorithm to reverse-engineer regulatory gene networks.

PubMed

Gregoretti, Francesco; Belcastro, Vincenzo; di Bernardo, Diego; Oliva, Gennaro

2010-04-21

The reverse engineering of gene regulatory networks using gene expression profile data has become crucial to gain novel biological knowledge. Large amounts of data that need to be analyzed are currently being produced due to advances in microarray technologies. Using current reverse engineering algorithms to analyze large data sets can be very computational-intensive. These emerging computational requirements can be met using parallel computing techniques. It has been shown that the Network Identification by multiple Regression (NIR) algorithm performs better than the other ready-to-use reverse engineering software. However it cannot be used with large networks with thousands of nodes--as is the case in biological networks--due to the high time and space complexity. In this work we overcome this limitation by designing and developing a parallel version of the NIR algorithm. The new implementation of the algorithm reaches a very good accuracy even for large gene networks, improving our understanding of the gene regulatory networks that is crucial for a wide range of biomedical applications.
Monte Carlo evaluation of Acuros XB dose calculation Algorithm for intensity modulated radiation therapy of nasopharyngeal carcinoma

NASA Astrophysics Data System (ADS)

Yeh, Peter C. Y.; Lee, C. C.; Chao, T. C.; Tung, C. J.

2017-11-01

Intensity-modulated radiation therapy is an effective treatment modality for the nasopharyngeal carcinoma. One important aspect of this cancer treatment is the need to have an accurate dose algorithm dealing with the complex air/bone/tissue interface in the head-neck region to achieve the cure without radiation-induced toxicities. The Acuros XB algorithm explicitly solves the linear Boltzmann transport equation in voxelized volumes to account for the tissue heterogeneities such as lungs, bone, air, and soft tissues in the treatment field receiving radiotherapy. With the single beam setup in phantoms, this algorithm has already been demonstrated to achieve the comparable accuracy with Monte Carlo simulations. In the present study, five nasopharyngeal carcinoma patients treated with the intensity-modulated radiation therapy were examined for their dose distributions calculated using the Acuros XB in the planning target volume and the organ-at-risk. Corresponding results of Monte Carlo simulations were computed from the electronic portal image data and the BEAMnrc/DOSXYZnrc code. Analysis of dose distributions in terms of the clinical indices indicated that the Acuros XB was in comparable accuracy with Monte Carlo simulations and better than the anisotropic analytical algorithm for dose calculations in real patients.
Parallelizing flow-accumulation calculations on graphics processing units—From iterative DEM preprocessing algorithm to recursive multiple-flow-direction algorithm

NASA Astrophysics Data System (ADS)

Qin, Cheng-Zhi; Zhan, Lijun

2012-06-01

As one of the important tasks in digital terrain analysis, the calculation of flow accumulations from gridded digital elevation models (DEMs) usually involves two steps in a real application: (1) using an iterative DEM preprocessing algorithm to remove the depressions and flat areas commonly contained in real DEMs, and (2) using a recursive flow-direction algorithm to calculate the flow accumulation for every cell in the DEM. Because both algorithms are computationally intensive, quick calculation of the flow accumulations from a DEM (especially for a large area) presents a practical challenge to personal computer (PC) users. In recent years, rapid increases in hardware capacity of the graphics processing units (GPUs) provided in modern PCs have made it possible to meet this challenge in a PC environment. Parallel computing on GPUs using a compute-unified-device-architecture (CUDA) programming model has been explored to speed up the execution of the single-flow-direction algorithm (SFD). However, the parallel implementation on a GPU of the multiple-flow-direction (MFD) algorithm, which generally performs better than the SFD algorithm, has not been reported. Moreover, GPU-based parallelization of the DEM preprocessing step in the flow-accumulation calculations has not been addressed. This paper proposes a parallel approach to calculate flow accumulations (including both iterative DEM preprocessing and a recursive MFD algorithm) on a CUDA-compatible GPU. For the parallelization of an MFD algorithm (MFD-md), two different parallelization strategies using a GPU are explored. The first parallelization strategy, which has been used in the existing parallel SFD algorithm on GPU, has the problem of computing redundancy. Therefore, we designed a parallelization strategy based on graph theory. The application results show that the proposed parallel approach to calculate flow accumulations on a GPU performs much faster than either sequential algorithms or other parallel GPU-based algorithms based on existing parallelization strategies.
An auto-adaptive optimization approach for targeting nonpoint source pollution control practices.

PubMed

Chen, Lei; Wei, Guoyuan; Shen, Zhenyao

2015-10-21

To solve computationally intensive and technically complex control of nonpoint source pollution, the traditional genetic algorithm was modified into an auto-adaptive pattern, and a new framework was proposed by integrating this new algorithm with a watershed model and an economic module. Although conceptually simple and comprehensive, the proposed algorithm would search automatically for those Pareto-optimality solutions without a complex calibration of optimization parameters. The model was applied in a case study in a typical watershed of the Three Gorges Reservoir area, China. The results indicated that the evolutionary process of optimization was improved due to the incorporation of auto-adaptive parameters. In addition, the proposed algorithm outperformed the state-of-the-art existing algorithms in terms of convergence ability and computational efficiency. At the same cost level, solutions with greater pollutant reductions could be identified. From a scientific viewpoint, the proposed algorithm could be extended to other watersheds to provide cost-effective configurations of BMPs.
Dense depth maps from correspondences derived from perceived motion

NASA Astrophysics Data System (ADS)

Kirby, Richard; Whitaker, Ross

2017-01-01

Many computer vision applications require finding corresponding points between images and using the corresponding points to estimate disparity. Today's correspondence finding algorithms primarily use image features or pixel intensities common between image pairs. Some 3-D computer vision applications, however, do not produce the desired results using correspondences derived from image features or pixel intensities. Two examples are the multimodal camera rig and the center region of a coaxial camera rig. We present an image correspondence finding technique that aligns pairs of image sequences using optical flow fields. The optical flow fields provide information about the structure and motion of the scene, which are not available in still images but can be used in image alignment. We apply the technique to a dual focal length stereo camera rig consisting of a visible light-infrared camera pair and to a coaxial camera rig. We test our method on real image sequences and compare our results with the state-of-the-art multimodal and structure from motion (SfM) algorithms. Our method produces more accurate depth and scene velocity reconstruction estimates than the state-of-the-art multimodal and SfM algorithms.
Semiautomated hybrid algorithm for estimation of three-dimensional liver surface in CT using dynamic cellular automata and level-sets

PubMed Central

Dakua, Sarada Prasad; Abinahed, Julien; Al-Ansari, Abdulla

2015-01-01

Abstract. Liver segmentation continues to remain a major challenge, largely due to its intense complexity with surrounding anatomical structures (stomach, kidney, and heart), high noise level and lack of contrast in pathological computed tomography (CT) data. We present an approach to reconstructing the liver surface in low contrast CT. The main contributions are: (1) a stochastic resonance-based methodology in discrete cosine transform domain is developed to enhance the contrast of pathological liver images, (2) a new formulation is proposed to prevent the object boundary, resulting from the cellular automata method, from leaking into the surrounding areas of similar intensity, and (3) a level-set method is suggested to generate intermediate segmentation contours from two segmented slices distantly located in a subject sequence. We have tested the algorithm on real datasets obtained from two sources, Hamad General Hospital and medical image computing and computer-assisted interventions grand challenge workshop. Various parameters in the algorithm, such as w, Δt, z, α, μ, α1, and α2, play imperative roles, thus their values are precisely selected. Both qualitative and quantitative evaluation performed on liver data show promising segmentation accuracy when compared with ground truth data reflecting the potential of the proposed method. PMID:26158101

A novel approach to multiple sequence alignment using hadoop data grids.

PubMed

Sudha Sadasivam, G; Baktavatchalam, G

2010-01-01

Multiple alignment of protein sequences helps to determine evolutionary linkage and to predict molecular structures. The factors to be considered while aligning multiple sequences are speed and accuracy of alignment. Although dynamic programming algorithms produce accurate alignments, they are computation intensive. In this paper we propose a time efficient approach to sequence alignment that also produces quality alignment. The dynamic nature of the algorithm coupled with data and computational parallelism of hadoop data grids improves the accuracy and speed of sequence alignment. The principle of block splitting in hadoop coupled with its scalability facilitates alignment of very large sequences.
Encryption and decryption algorithm using algebraic matrix approach

NASA Astrophysics Data System (ADS)

Thiagarajan, K.; Balasubramanian, P.; Nagaraj, J.; Padmashree, J.

2018-04-01

Cryptographic algorithms provide security of data against attacks during encryption and decryption. However, they are computationally intensive process which consume large amount of CPU time and space at time of encryption and decryption. The goal of this paper is to study the encryption and decryption algorithm and to find space complexity of the encrypted and decrypted data by using of algorithm. In this paper, we encrypt and decrypt the message using key with the help of cyclic square matrix provides the approach applicable for any number of words having more number of characters and longest word. Also we discussed about the time complexity of the algorithm. The proposed algorithm is simple but difficult to break the process.
SALUTE Grid Application using Message-Oriented Middleware

NASA Astrophysics Data System (ADS)

Atanassov, E.; Dimitrov, D. Sl.; Gurov, T.

2009-10-01

Stochastic ALgorithms for Ultra-fast Transport in sEmiconductors (SALUTE) is a grid application developed for solving various computationally intensive problems which describe ultra-fast carrier transport in semiconductors. SALUTE studies memory and quantum effects during the relaxation process due to electronphonon interaction in one-band semiconductors or quantum wires. Formally, SALUTE integrates a set of novel Monte Carlo, quasi-Monte Carlo and hybrid algorithms for solving various computationally intensive problems which describe the femtosecond relaxation process of optically excited carriers in one-band semiconductors or quantum wires. In this paper we present application-specific job submission and reservation management tool named a Job Track Server (JTS). It is developed using Message-Oriented middleware to implement robust, versatile job submission and tracing mechanism, which can be tailored to application specific failover and quality of service requirements. Experience from using the JTS for submission of SALUTE jobs is presented.
Improve threshold segmentation using features extraction to automatic lung delimitation.

PubMed

França, Cleunio; Vasconcelos, Germano; Diniz, Paula; Melo, Pedro; Diniz, Jéssica; Novaes, Magdala

2013-01-01

With the consolidation of PACS and RIS systems, the development of algorithms for tissue segmentation and diseases detection have intensely evolved in recent years. These algorithms have advanced to improve its accuracy and specificity, however, there is still some way until these algorithms achieved satisfactory error rates and reduced processing time to be used in daily diagnosis. The objective of this study is to propose a algorithm for lung segmentation in x-ray computed tomography images using features extraction, as Centroid and orientation measures, to improve the basic threshold segmentation. As result we found a accuracy of 85.5%.
Spot quantification in two dimensional gel electrophoresis image analysis: comparison of different approaches and presentation of a novel compound fitting algorithm

PubMed Central

2014-01-01

Background Various computer-based methods exist for the detection and quantification of protein spots in two dimensional gel electrophoresis images. Area-based methods are commonly used for spot quantification: an area is assigned to each spot and the sum of the pixel intensities in that area, the so-called volume, is used a measure for spot signal. Other methods use the optical density, i.e. the intensity of the most intense pixel of a spot, or calculate the volume from the parameters of a fitted function. Results In this study we compare the performance of different spot quantification methods using synthetic and real data. We propose a ready-to-use algorithm for spot detection and quantification that uses fitting of two dimensional Gaussian function curves for the extraction of data from two dimensional gel electrophoresis (2-DE) images. The algorithm implements fitting using logical compounds and is computationally efficient. The applicability of the compound fitting algorithm was evaluated for various simulated data and compared with other quantification approaches. We provide evidence that even if an incorrect bell-shaped function is used, the fitting method is superior to other approaches, especially when spots overlap. Finally, we validated the method with experimental data of urea-based 2-DE of Aβ peptides andre-analyzed published data sets. Our methods showed higher precision and accuracy than other approaches when applied to exposure time series and standard gels. Conclusion Compound fitting as a quantification method for 2-DE spots shows several advantages over other approaches and could be combined with various spot detection methods. The algorithm was scripted in MATLAB (Mathworks) and is available as a supplemental file. PMID:24915860
Two frameworks for integrating knowledge in induction

NASA Technical Reports Server (NTRS)

Rosenbloom, Paul S.; Hirsh, Haym; Cohen, William W.; Smith, Benjamin D.

1994-01-01

The use of knowledge in inductive learning is critical for improving the quality of the concept definitions generated, reducing the number of examples required in order to learn effective concept definitions, and reducing the computation needed to find good concept definitions. Relevant knowledge may come in many forms (such as examples, descriptions, advice, and constraints) and from many sources (such as books, teachers, databases, and scientific instruments). How to extract the relevant knowledge from this plethora of possibilities, and then to integrate it together so as to appropriately affect the induction process is perhaps the key issue at this point in inductive learning. Here the focus is on the integration part of this problem; that is, how induction algorithms can, and do, utilize a range of extracted knowledge. Preliminary work on a transformational framework for defining knowledge-intensive inductive algorithms out of relatively knowledge-free algorithms is described, as is a more tentative problems-space framework that attempts to cover all induction algorithms within a single general approach. These frameworks help to organize what is known about current knowledge-intensive induction algorithms, and to point towards new algorithms.
A multi-emitter fitting algorithm for potential live cell super-resolution imaging over a wide range of molecular densities.

PubMed

Takeshima, T; Takahashi, T; Yamashita, J; Okada, Y; Watanabe, S

2018-05-25

Multi-emitter fitting algorithms have been developed to improve the temporal resolution of single-molecule switching nanoscopy, but the molecular density range they can analyse is narrow and the computation required is intensive, significantly limiting their practical application. Here, we propose a computationally fast method, wedged template matching (WTM), an algorithm that uses a template matching technique to localise molecules at any overlapping molecular density from sparse to ultrahigh density with subdiffraction resolution. WTM achieves the localization of overlapping molecules at densities up to 600 molecules μm -2 with a high detection sensitivity and fast computational speed. WTM also shows localization precision comparable with that of DAOSTORM (an algorithm for high-density super-resolution microscopy), at densities up to 20 molecules μm -2 , and better than DAOSTORM at higher molecular densities. The application of WTM to a high-density biological sample image demonstrated that it resolved protein dynamics from live cell images with subdiffraction resolution and a temporal resolution of several hundred milliseconds or less through a significant reduction in the number of camera images required for a high-density reconstruction. WTM algorithm is a computationally fast, multi-emitter fitting algorithm that can analyse over a wide range of molecular densities. The algorithm is available through the website. https://doi.org/10.17632/bf3z6xpn5j.1. © 2018 The Authors. Journal of Microscopy published by JohnWiley & Sons Ltd on behalf of Royal Microscopical Society.
Anatomy assisted PET image reconstruction incorporating multi-resolution joint entropy

NASA Astrophysics Data System (ADS)

Tang, Jing; Rahmim, Arman

2015-01-01

A promising approach in PET image reconstruction is to incorporate high resolution anatomical information (measured from MR or CT) taking the anato-functional similarity measures such as mutual information or joint entropy (JE) as the prior. These similarity measures only classify voxels based on intensity values, while neglecting structural spatial information. In this work, we developed an anatomy-assisted maximum a posteriori (MAP) reconstruction algorithm wherein the JE measure is supplied by spatial information generated using wavelet multi-resolution analysis. The proposed wavelet-based JE (WJE) MAP algorithm involves calculation of derivatives of the subband JE measures with respect to individual PET image voxel intensities, which we have shown can be computed very similarly to how the inverse wavelet transform is implemented. We performed a simulation study with the BrainWeb phantom creating PET data corresponding to different noise levels. Realistically simulated T1-weighted MR images provided by BrainWeb modeling were applied in the anatomy-assisted reconstruction with the WJE-MAP algorithm and the intensity-only JE-MAP algorithm. Quantitative analysis showed that the WJE-MAP algorithm performed similarly to the JE-MAP algorithm at low noise level in the gray matter (GM) and white matter (WM) regions in terms of noise versus bias tradeoff. When noise increased to medium level in the simulated data, the WJE-MAP algorithm started to surpass the JE-MAP algorithm in the GM region, which is less uniform with smaller isolated structures compared to the WM region. In the high noise level simulation, the WJE-MAP algorithm presented clear improvement over the JE-MAP algorithm in both the GM and WM regions. In addition to the simulation study, we applied the reconstruction algorithms to real patient studies involving DPA-173 PET data and Florbetapir PET data with corresponding T1-MPRAGE MRI images. Compared to the intensity-only JE-MAP algorithm, the WJE-MAP algorithm resulted in comparable regional mean values to those from the maximum likelihood algorithm while reducing noise. Achieving robust performance in various noise-level simulation and patient studies, the WJE-MAP algorithm demonstrates its potential in clinical quantitative PET imaging.
Experimental image alignment system

NASA Technical Reports Server (NTRS)

Moyer, A. L.; Kowel, S. T.; Kornreich, P. G.

1980-01-01

A microcomputer-based instrument for image alignment with respect to a reference image is described which uses the DEFT sensor (Direct Electronic Fourier Transform) for image sensing and preprocessing. The instrument alignment algorithm which uses the two-dimensional Fourier transform as input is also described. It generates signals used to steer the stage carrying the test image into the correct orientation. This algorithm has computational advantages over algorithms which use image intensity data as input and is suitable for a microcomputer-based instrument since the two-dimensional Fourier transform is provided by the DEFT sensor.
Multimedia systems in ultrasound image boundary detection and measurements

NASA Astrophysics Data System (ADS)

Pathak, Sayan D.; Chalana, Vikram; Kim, Yongmin

1997-05-01

Ultrasound as a medical imaging modality offers the clinician a real-time of the anatomy of the internal organs/tissues, their movement, and flow noninvasively. One of the applications of ultrasound is to monitor fetal growth by measuring biparietal diameter (BPD) and head circumference (HC). We have been working on automatic detection of fetal head boundaries in ultrasound images. These detected boundaries are used to measure BPD and HC. The boundary detection algorithm is based on active contour models and takes 32 seconds on an external high-end workstation, SUN SparcStation 20/71. Our goal has been to make this tool available within an ultrasound machine and at the same time significantly improve its performance utilizing multimedia technology. With the advent of high- performance programmable digital signal processors (DSP), the software solution within an ultrasound machine instead of the traditional hardwired approach or requiring an external computer is now possible. We have integrated our boundary detection algorithm into a programmable ultrasound image processor (PUIP) that fits into a commercial ultrasound machine. The PUIP provides both the high computing power and flexibility needed to support computationally-intensive image processing algorithms within an ultrasound machine. According to our data analysis, BPD/HC measurements made on PUIP lie within the interobserver variability. Hence, the errors in the automated BPD/HC measurements using the algorithm are on the same order as the average interobserver differences. On PUIP, it takes 360 ms to measure the values of BPD/HC on one head image. When processing multiple head images in sequence, it takes 185 ms per image, thus enabling 5.4 BPD/HC measurements per second. Reduction in the overall execution time from 32 seconds to a fraction of a second and making this multimedia system available within an ultrasound machine will help this image processing algorithm and other computer-intensive imaging applications become a practical tool for the sonographers in the feature.
SU-E-T-33: A Feasibility-Seeking Algorithm Applied to Planning of Intensity Modulated Proton Therapy: A Proof of Principle Study

DOE Office of Scientific and Technical Information (OSTI.GOV)

Penfold, S; Casiraghi, M; Dou, T

2015-06-15

Purpose: To investigate the applicability of feasibility-seeking cyclic orthogonal projections to the field of intensity modulated proton therapy (IMPT) inverse planning. Feasibility of constraints only, as opposed to optimization of a merit function, is less demanding algorithmically and holds a promise of parallel computations capability with non-cyclic orthogonal projections algorithms such as string-averaging or block-iterative strategies. Methods: A virtual 2D geometry was designed containing a C-shaped planning target volume (PTV) surrounding an organ at risk (OAR). The geometry was pixelized into 1 mm pixels. Four beams containing a subset of proton pencil beams were simulated in Geant4 to provide themore » system matrix A whose elements a-ij correspond to the dose delivered to pixel i by a unit intensity pencil beam j. A cyclic orthogonal projections algorithm was applied with the goal of finding a pencil beam intensity distribution that would meet the following dose requirements: D-OAR < 54 Gy and 57 Gy < D-PTV < 64.2 Gy. The cyclic algorithm was based on the concept of orthogonal projections onto half-spaces according to the Agmon-Motzkin-Schoenberg algorithm, also known as ‘ART for inequalities’. Results: The cyclic orthogonal projections algorithm resulted in less than 5% of the PTV pixels and less than 1% of OAR pixels violating their dose constraints, respectively. Because of the abutting OAR-PTV geometry and the realistic modelling of the pencil beam penumbra, complete satisfaction of the dose objectives was not achieved, although this would be a clinically acceptable plan for a meningioma abutting the brainstem, for example. Conclusion: The cyclic orthogonal projections algorithm was demonstrated to be an effective tool for inverse IMPT planning in the 2D test geometry described. We plan to further develop this linear algorithm to be capable of incorporating dose-volume constraints into the feasibility-seeking algorithm.« less
Algorithm 971: An Implementation of a Randomized Algorithm for Principal Component Analysis

PubMed Central

LI, HUAMIN; LINDERMAN, GEORGE C.; SZLAM, ARTHUR; STANTON, KELLY P.; KLUGER, YUVAL; TYGERT, MARK

2017-01-01

Recent years have witnessed intense development of randomized methods for low-rank approximation. These methods target principal component analysis and the calculation of truncated singular value decompositions. The present article presents an essentially black-box, foolproof implementation for Mathworks’ MATLAB, a popular software platform for numerical computation. As illustrated via several tests, the randomized algorithms for low-rank approximation outperform or at least match the classical deterministic techniques (such as Lanczos iterations run to convergence) in basically all respects: accuracy, computational efficiency (both speed and memory usage), ease-of-use, parallelizability, and reliability. However, the classical procedures remain the methods of choice for estimating spectral norms and are far superior for calculating the least singular values and corresponding singular vectors (or singular subspaces). PMID:28983138
Optical systolic solutions of linear algebraic equations

NASA Technical Reports Server (NTRS)

Neuman, C. P.; Casasent, D.

1984-01-01

The philosophy and data encoding possible in systolic array optical processor (SAOP) were reviewed. The multitude of linear algebraic operations achievable on this architecture is examined. These operations include such linear algebraic algorithms as: matrix-decomposition, direct and indirect solutions, implicit and explicit methods for partial differential equations, eigenvalue and eigenvector calculations, and singular value decomposition. This architecture can be utilized to realize general techniques for solving matrix linear and nonlinear algebraic equations, least mean square error solutions, FIR filters, and nested-loop algorithms for control engineering applications. The data flow and pipelining of operations, design of parallel algorithms and flexible architectures, application of these architectures to computationally intensive physical problems, error source modeling of optical processors, and matching of the computational needs of practical engineering problems to the capabilities of optical processors are emphasized.
Power System Decomposition for Practical Implementation of Bulk-Grid Voltage Control Methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vallem, Mallikarjuna R.; Vyakaranam, Bharat GNVSR; Holzer, Jesse T.

Power system algorithms such as AC optimal power flow and coordinated volt/var control of the bulk power system are computationally intensive and become difficult to solve in operational time frames. The computational time required to run these algorithms increases exponentially as the size of the power system increases. The solution time for multiple subsystems is less than that for solving the entire system simultaneously, and the local nature of the voltage problem lends itself to such decomposition. This paper describes an algorithm that can be used to perform power system decomposition from the point of view of the voltage controlmore » problem. Our approach takes advantage of the dominant localized effect of voltage control and is based on clustering buses according to the electrical distances between them. One of the contributions of the paper is to use multidimensional scaling to compute n-dimensional Euclidean coordinates for each bus based on electrical distance to perform algorithms like K-means clustering. A simple coordinated reactive power control of photovoltaic inverters for voltage regulation is used to demonstrate the effectiveness of the proposed decomposition algorithm and its components. The proposed decomposition method is demonstrated on the IEEE 118-bus system.« less
Spectral Regularization Algorithms for Learning Large Incomplete Matrices.

PubMed

Mazumder, Rahul; Hastie, Trevor; Tibshirani, Robert

2010-03-01

We use convex relaxation techniques to provide a sequence of regularized low-rank solutions for large-scale matrix completion problems. Using the nuclear norm as a regularizer, we provide a simple and very efficient convex algorithm for minimizing the reconstruction error subject to a bound on the nuclear norm. Our algorithm Soft-Impute iteratively replaces the missing elements with those obtained from a soft-thresholded SVD. With warm starts this allows us to efficiently compute an entire regularization path of solutions on a grid of values of the regularization parameter. The computationally intensive part of our algorithm is in computing a low-rank SVD of a dense matrix. Exploiting the problem structure, we show that the task can be performed with a complexity linear in the matrix dimensions. Our semidefinite-programming algorithm is readily scalable to large matrices: for example it can obtain a rank-80 approximation of a 10(6) × 10(6) incomplete matrix with 10(5) observed entries in 2.5 hours, and can fit a rank 40 approximation to the full Netflix training set in 6.6 hours. Our methods show very good performance both in training and test error when compared to other competitive state-of-the art techniques.
Spectral Regularization Algorithms for Learning Large Incomplete Matrices

PubMed Central

Mazumder, Rahul; Hastie, Trevor; Tibshirani, Robert

2010-01-01

We use convex relaxation techniques to provide a sequence of regularized low-rank solutions for large-scale matrix completion problems. Using the nuclear norm as a regularizer, we provide a simple and very efficient convex algorithm for minimizing the reconstruction error subject to a bound on the nuclear norm. Our algorithm Soft-Impute iteratively replaces the missing elements with those obtained from a soft-thresholded SVD. With warm starts this allows us to efficiently compute an entire regularization path of solutions on a grid of values of the regularization parameter. The computationally intensive part of our algorithm is in computing a low-rank SVD of a dense matrix. Exploiting the problem structure, we show that the task can be performed with a complexity linear in the matrix dimensions. Our semidefinite-programming algorithm is readily scalable to large matrices: for example it can obtain a rank-80 approximation of a 106 × 106 incomplete matrix with 105 observed entries in 2.5 hours, and can fit a rank 40 approximation to the full Netflix training set in 6.6 hours. Our methods show very good performance both in training and test error when compared to other competitive state-of-the art techniques. PMID:21552465
Fast and accurate image recognition algorithms for fresh produce food safety sensing

NASA Astrophysics Data System (ADS)

Yang, Chun-Chieh; Kim, Moon S.; Chao, Kuanglin; Kang, Sukwon; Lefcourt, Alan M.

2011-06-01

This research developed and evaluated the multispectral algorithms derived from hyperspectral line-scan fluorescence imaging under violet LED excitation for detection of fecal contamination on Golden Delicious apples. The algorithms utilized the fluorescence intensities at four wavebands, 680 nm, 684 nm, 720 nm, and 780 nm, for computation of simple functions for effective detection of contamination spots created on the apple surfaces using four concentrations of aqueous fecal dilutions. The algorithms detected more than 99% of the fecal spots. The effective detection of feces showed that a simple multispectral fluorescence imaging algorithm based on violet LED excitation may be appropriate to detect fecal contamination on fast-speed apple processing lines.
A Parametric k-Means Algorithm

PubMed Central

Tarpey, Thaddeus

2007-01-01

Summary The k points that optimally represent a distribution (usually in terms of a squared error loss) are called the k principal points. This paper presents a computationally intensive method that automatically determines the principal points of a parametric distribution. Cluster means from the k-means algorithm are nonparametric estimators of principal points. A parametric k-means approach is introduced for estimating principal points by running the k-means algorithm on a very large simulated data set from a distribution whose parameters are estimated using maximum likelihood. Theoretical and simulation results are presented comparing the parametric k-means algorithm to the usual k-means algorithm and an example on determining sizes of gas masks is used to illustrate the parametric k-means algorithm. PMID:17917692
A finite element method to correct deformable image registration errors in low-contrast regions

NASA Astrophysics Data System (ADS)

Zhong, Hualiang; Kim, Jinkoo; Li, Haisen; Nurushev, Teamour; Movsas, Benjamin; Chetty, Indrin J.

2012-06-01

Image-guided adaptive radiotherapy requires deformable image registration to map radiation dose back and forth between images. The purpose of this study is to develop a novel method to improve the accuracy of an intensity-based image registration algorithm in low-contrast regions. A computational framework has been developed in this study to improve the quality of the ‘demons’ registration. For each voxel in the registration's target image, the standard deviation of image intensity in a neighborhood of this voxel was calculated. A mask for high-contrast regions was generated based on their standard deviations. In the masked regions, a tetrahedral mesh was refined recursively so that a sufficient number of tetrahedral nodes in these regions can be selected as driving nodes. An elastic system driven by the displacements of the selected nodes was formulated using a finite element method (FEM) and implemented on the refined mesh. The displacements of these driving nodes were generated with the ‘demons’ algorithm. The solution of the system was derived using a conjugated gradient method, and interpolated to generate a displacement vector field for the registered images. The FEM correction method was compared with the ‘demons’ algorithm on the computed tomography (CT) images of lung and prostate patients. The performance of the FEM correction relating to the ‘demons’ registration was analyzed based on the physical property of their deformation maps, and quantitatively evaluated through a benchmark model developed specifically for this study. Compared to the benchmark model, the ‘demons’ registration has the maximum error of 1.2 cm, which can be corrected by the FEM to 0.4 cm, and the average error of the ‘demons’ registration is reduced from 0.17 to 0.11 cm. For the CT images of lung and prostate patients, the deformation maps generated by the ‘demons’ algorithm were found unrealistic at several places. In these places, the displacement differences between the ‘demons’ registrations and their FEM corrections were found in the range of 0.4 and 1.1 cm. The mesh refinement and FEM simulation were implemented in a single thread application which requires about 45 min of computation time on a 2.6 GHz computer. This study has demonstrated that the FEM can be integrated with intensity-based image registration algorithms to improve their registration accuracy, especially in low-contrast regions.
Parallel algorithm of VLBI software correlator under multiprocessor environment

NASA Astrophysics Data System (ADS)

Zheng, Weimin; Zhang, Dong

2007-11-01

The correlator is the key signal processing equipment of a Very Lone Baseline Interferometry (VLBI) synthetic aperture telescope. It receives the mass data collected by the VLBI observatories and produces the visibility function of the target, which can be used to spacecraft position, baseline length measurement, synthesis imaging, and other scientific applications. VLBI data correlation is a task of data intensive and computation intensive. This paper presents the algorithms of two parallel software correlators under multiprocessor environments. A near real-time correlator for spacecraft tracking adopts the pipelining and thread-parallel technology, and runs on the SMP (Symmetric Multiple Processor) servers. Another high speed prototype correlator using the mixed Pthreads and MPI (Massage Passing Interface) parallel algorithm is realized on a small Beowulf cluster platform. Both correlators have the characteristic of flexible structure, scalability, and with 10-station data correlating abilities.

A model of proto-object based saliency

PubMed Central

Russell, Alexander F.; Mihalaş, Stefan; von der Heydt, Rudiger; Niebur, Ernst; Etienne-Cummings, Ralph

2013-01-01

Organisms use the process of selective attention to optimally allocate their computational resources to the instantaneously most relevant subsets of a visual scene, ensuring that they can parse the scene in real time. Many models of bottom-up attentional selection assume that elementary image features, like intensity, color and orientation, attract attention. Gestalt psychologists, how-ever, argue that humans perceive whole objects before they analyze individual features. This is supported by recent psychophysical studies that show that objects predict eye-fixations better than features. In this report we present a neurally inspired algorithm of object based, bottom-up attention. The model rivals the performance of state of the art non-biologically plausible feature based algorithms (and outperforms biologically plausible feature based algorithms) in its ability to predict perceptual saliency (eye fixations and subjective interest points) in natural scenes. The model achieves this by computing saliency as a function of proto-objects that establish the perceptual organization of the scene. All computational mechanisms of the algorithm have direct neural correlates, and our results provide evidence for the interface theory of attention. PMID:24184601
Acceleration of the Smith-Waterman algorithm using single and multiple graphics processors

NASA Astrophysics Data System (ADS)

Khajeh-Saeed, Ali; Poole, Stephen; Blair Perot, J.

2010-06-01

Finding regions of similarity between two very long data streams is a computationally intensive problem referred to as sequence alignment. Alignment algorithms must allow for imperfect sequence matching with different starting locations and some gaps and errors between the two data sequences. Perhaps the most well known application of sequence matching is the testing of DNA or protein sequences against genome databases. The Smith-Waterman algorithm is a method for precisely characterizing how well two sequences can be aligned and for determining the optimal alignment of those two sequences. Like many applications in computational science, the Smith-Waterman algorithm is constrained by the memory access speed and can be accelerated significantly by using graphics processors (GPUs) as the compute engine. In this work we show that effective use of the GPU requires a novel reformulation of the Smith-Waterman algorithm. The performance of this new version of the algorithm is demonstrated using the SSCA#1 (Bioinformatics) benchmark running on one GPU and on up to four GPUs executing in parallel. The results indicate that for large problems a single GPU is up to 45 times faster than a CPU for this application, and the parallel implementation shows linear speed up on up to 4 GPUs.
Accelerated gradient-based free form deformable registration for online adaptive radiotherapy

NASA Astrophysics Data System (ADS)

Yu, Gang; Liang, Yueqiang; Yang, Guanyu; Shu, Huazhong; Li, Baosheng; Yin, Yong; Li, Dengwang

2015-04-01

The registration of planning fan-beam computed tomography (FBCT) and daily cone-beam CT (CBCT) is a crucial step in adaptive radiation therapy. The current intensity-based registration algorithms, such as Demons, may fail when they are used to register FBCT and CBCT, because the CT numbers in CBCT cannot exactly correspond to the electron densities. In this paper, we investigated the effects of CBCT intensity inaccuracy on the registration accuracy and developed an accurate gradient-based free form deformation algorithm (GFFD). GFFD distinguishes itself from other free form deformable registration algorithms by (a) measuring the similarity using the 3D gradient vector fields to avoid the effect of inconsistent intensities between the two modalities; (b) accommodating image sampling anisotropy using the local polynomial approximation-intersection of confidence intervals (LPA-ICI) algorithm to ensure a smooth and continuous displacement field; and (c) introducing a ‘bi-directional’ force along with an adaptive force strength adjustment to accelerate the convergence process. It is expected that such a strategy can decrease the effect of the inconsistent intensities between the two modalities, thus improving the registration accuracy and robustness. Moreover, for clinical application, the algorithm was implemented by graphics processing units (GPU) through OpenCL framework. The registration time of the GFFD algorithm for each set of CT data ranges from 8 to 13 s. The applications of on-line adaptive image-guided radiation therapy, including auto-propagation of contours, aperture-optimization and dose volume histogram (DVH) in the course of radiation therapy were also studied by in-house-developed software.
Semivariogram Analysis of Bone Images Implemented on FPGA Architectures.

PubMed

Shirvaikar, Mukul; Lagadapati, Yamuna; Dong, Xuanliang

2017-03-01

Osteoporotic fractures are a major concern for the healthcare of elderly and female populations. Early diagnosis of patients with a high risk of osteoporotic fractures can be enhanced by introducing second-order statistical analysis of bone image data using techniques such as variogram analysis. Such analysis is computationally intensive thereby creating an impediment for introduction into imaging machines found in common clinical settings. This paper investigates the fast implementation of the semivariogram algorithm, which has been proven to be effective in modeling bone strength, and should be of interest to readers in the areas of computer-aided diagnosis and quantitative image analysis. The semivariogram is a statistical measure of the spatial distribution of data, and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. A semi-variance, γ ( h ), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h . Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O ( n 2 ) Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from DXA scans are utilized for the experiments. Implementation results show that a significant advantage in computational speed is attained by the architectures with respect to implementation on a personal computer with an Intel i7 multi-core processor.
Semivariogram Analysis of Bone Images Implemented on FPGA Architectures

PubMed Central

Shirvaikar, Mukul; Lagadapati, Yamuna; Dong, Xuanliang

2016-01-01

Osteoporotic fractures are a major concern for the healthcare of elderly and female populations. Early diagnosis of patients with a high risk of osteoporotic fractures can be enhanced by introducing second-order statistical analysis of bone image data using techniques such as variogram analysis. Such analysis is computationally intensive thereby creating an impediment for introduction into imaging machines found in common clinical settings. This paper investigates the fast implementation of the semivariogram algorithm, which has been proven to be effective in modeling bone strength, and should be of interest to readers in the areas of computer-aided diagnosis and quantitative image analysis. The semivariogram is a statistical measure of the spatial distribution of data, and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. A semi-variance, γ(h), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h. Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O (n2) Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from DXA scans are utilized for the experiments. Implementation results show that a significant advantage in computational speed is attained by the architectures with respect to implementation on a personal computer with an Intel i7 multi-core processor. PMID:28428829
Determining the Effectiveness of Incorporating Geographic Information Into Vehicle Performance Algorithms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sera White

2012-04-01

This thesis presents a research study using one year of driving data obtained from plug-in hybrid electric vehicles (PHEV) located in Sacramento and San Francisco, California to determine the effectiveness of incorporating geographic information into vehicle performance algorithms. Sacramento and San Francisco were chosen because of the availability of high resolution (1/9 arc second) digital elevation data. First, I present a method for obtaining instantaneous road slope, given a latitude and longitude, and introduce its use into common driving intensity algorithms. I show that for trips characterized by >40m of net elevation change (from key on to key off), themore » use of instantaneous road slope significantly changes the results of driving intensity calculations. For trips exhibiting elevation loss, algorithms ignoring road slope overestimated driving intensity by as much as 211 Wh/mile, while for trips exhibiting elevation gain these algorithms underestimated driving intensity by as much as 333 Wh/mile. Second, I describe and test an algorithm that incorporates vehicle route type into computations of city and highway fuel economy. Route type was determined by intersecting trip GPS points with ESRI StreetMap road types and assigning each trip as either city or highway route type according to whichever road type comprised the largest distance traveled. The fuel economy results produced by the geographic classification were compared to the fuel economy results produced by algorithms that assign route type based on average speed or driving style. Most results were within 1 mile per gallon ({approx}3%) of one another; the largest difference was 1.4 miles per gallon for charge depleting highway trips. The methods for acquiring and using geographic data introduced in this thesis will enable other vehicle technology researchers to incorporate geographic data into their research problems.« less
An efficient parallel algorithm: Poststack and prestack Kirchhoff 3D depth migration using flexi-depth iterations

NASA Astrophysics Data System (ADS)

Rastogi, Richa; Srivastava, Abhishek; Khonde, Kiran; Sirasala, Kirannmayi M.; Londhe, Ashutosh; Chavhan, Hitesh

2015-07-01

This paper presents an efficient parallel 3D Kirchhoff depth migration algorithm suitable for current class of multicore architecture. The fundamental Kirchhoff depth migration algorithm exhibits inherent parallelism however, when it comes to 3D data migration, as the data size increases the resource requirement of the algorithm also increases. This challenges its practical implementation even on current generation high performance computing systems. Therefore a smart parallelization approach is essential to handle 3D data for migration. The most compute intensive part of Kirchhoff depth migration algorithm is the calculation of traveltime tables due to its resource requirements such as memory/storage and I/O. In the current research work, we target this area and develop a competent parallel algorithm for post and prestack 3D Kirchhoff depth migration, using hybrid MPI+OpenMP programming techniques. We introduce a concept of flexi-depth iterations while depth migrating data in parallel imaging space, using optimized traveltime table computations. This concept provides flexibility to the algorithm by migrating data in a number of depth iterations, which depends upon the available node memory and the size of data to be migrated during runtime. Furthermore, it minimizes the requirements of storage, I/O and inter-node communication, thus making it advantageous over the conventional parallelization approaches. The developed parallel algorithm is demonstrated and analysed on Yuva II, a PARAM series of supercomputers. Optimization, performance and scalability experiment results along with the migration outcome show the effectiveness of the parallel algorithm.
Semiautomated hybrid algorithm for estimation of three-dimensional liver surface in CT using dynamic cellular automata and level-sets.

PubMed

Dakua, Sarada Prasad; Abinahed, Julien; Al-Ansari, Abdulla

2015-04-01

Liver segmentation continues to remain a major challenge, largely due to its intense complexity with surrounding anatomical structures (stomach, kidney, and heart), high noise level and lack of contrast in pathological computed tomography (CT) data. We present an approach to reconstructing the liver surface in low contrast CT. The main contributions are: (1) a stochastic resonance-based methodology in discrete cosine transform domain is developed to enhance the contrast of pathological liver images, (2) a new formulation is proposed to prevent the object boundary, resulting from the cellular automata method, from leaking into the surrounding areas of similar intensity, and (3) a level-set method is suggested to generate intermediate segmentation contours from two segmented slices distantly located in a subject sequence. We have tested the algorithm on real datasets obtained from two sources, Hamad General Hospital and medical image computing and computer-assisted interventions grand challenge workshop. Various parameters in the algorithm, such as [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text], play imperative roles, thus their values are precisely selected. Both qualitative and quantitative evaluation performed on liver data show promising segmentation accuracy when compared with ground truth data reflecting the potential of the proposed method.
On-line determination of pork color and intramuscular fat by computer vision

NASA Astrophysics Data System (ADS)

Liao, Yi-Tao; Fan, Yu-Xia; Wu, Xue-Qian; Xie, Li-juan; Cheng, Fang

2010-04-01

In this study, the application potential of computer vision in on-line determination of CIE L*a*b* and content of intramuscular fat (IMF) of pork was evaluated. Images of pork chop from 211 pig carcasses were captured while samples were on a conveyor belt at the speed of 0.25 m•s-1 to simulate the on-line environment. CIE L*a*b* and IMF content were measured with colorimeter and chemical extractor as reference. The KSW algorithm combined with region selection was employed in eliminating the surrounding fat of longissimus dorsi muscle (MLD). RGB values of the pork were counted and five methods were applied for transforming RGB values to CIE L*a*b* values. The region growing algorithm with multiple seed points was applied to mask out the IMF pixels within the intensity corrected images. The performances of the proposed algorithms were verified by comparing the measured reference values and the quality characteristics obtained by image processing. MLD region of six samples could not be identified using the KSW algorithm. Intensity nonuniformity of pork surface in the image can be eliminated efficiently, and IMF region of three corrected images failed to be extracted. Given considerable variety of color and complexity of the pork surface, CIE L*, a* and b* color of MLD could be predicted with correlation coefficients of 0.84, 0.54 and 0.47 respectively, and IMF content could be determined with a correlation coefficient more than 0.70. The study demonstrated that it is feasible to evaluate CIE L*a*b* values and IMF content on-line using computer vision.
Hardware Architectures for Data-Intensive Computing Problems: A Case Study for String Matching

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tumeo, Antonino; Villa, Oreste; Chavarría-Miranda, Daniel

DNA analysis is an emerging application of high performance bioinformatic. Modern sequencing machinery are able to provide, in few hours, large input streams of data, which needs to be matched against exponentially growing databases of known fragments. The ability to recognize these patterns effectively and fastly may allow extending the scale and the reach of the investigations performed by biology scientists. Aho-Corasick is an exact, multiple pattern matching algorithm often at the base of this application. High performance systems are a promising platform to accelerate this algorithm, which is computationally intensive but also inherently parallel. Nowadays, high performance systems alsomore » include heterogeneous processing elements, such as Graphic Processing Units (GPUs), to further accelerate parallel algorithms. Unfortunately, the Aho-Corasick algorithm exhibits large performance variability, depending on the size of the input streams, on the number of patterns to search and on the number of matches, and poses significant challenges on current high performance software and hardware implementations. An adequate mapping of the algorithm on the target architecture, coping with the limit of the underlining hardware, is required to reach the desired high throughputs. In this paper, we discuss the implementation of the Aho-Corasick algorithm for GPU-accelerated high performance systems. We present an optimized implementation of Aho-Corasick for GPUs and discuss its tradeoffs on the Tesla T10 and he new Tesla T20 (codename Fermi) GPUs. We then integrate the optimized GPU code, respectively, in a MPI-based and in a pthreads-based load balancer to enable execution of the algorithm on clusters and large sharedmemory multiprocessors (SMPs) accelerated with multiple GPUs.« less
Optimized 3D stitching algorithm for whole body SPECT based on transition error minimization (TEM)

NASA Astrophysics Data System (ADS)

Cao, Xinhua; Xu, Xiaoyin; Voss, Stephan

2017-02-01

Standard Single Photon Emission Computed Tomography (SPECT) has a limited field of view (FOV) and cannot provide a 3D image of an entire long whole body SPECT. To produce a 3D whole body SPECT image, two to five overlapped SPECT FOVs from head to foot are acquired and assembled using image stitching. Most commercial software from medical imaging manufacturers applies a direct mid-slice stitching method to avoid blurring or ghosting from 3D image blending. Due to intensity changes across the middle slice of overlapped images, direct mid-slice stitching often produces visible seams in the coronal and sagittal views and maximal intensity projection (MIP). In this study, we proposed an optimized algorithm to reduce the visibility of stitching edges. The new algorithm computed, based on transition error minimization (TEM), a 3D stitching interface between two overlapped 3D SPECT images. To test the suggested algorithm, four studies of 2-FOV whole body SPECT were used and included two different reconstruction methods (filtered back projection (FBP) and ordered subset expectation maximization (OSEM)) as well as two different radiopharmaceuticals (Tc-99m MDP for bone metastases and I-131 MIBG for neuroblastoma tumors). Relative transition errors of stitched whole body SPECT using mid-slice stitching and the TEM-based algorithm were measured for objective evaluation. Preliminary experiments showed that the new algorithm reduced the visibility of the stitching interface in the coronal, sagittal, and MIP views. Average relative transition errors were reduced from 56.7% of mid-slice stitching to 11.7% of TEM-based stitching. The proposed algorithm also avoids blurring artifacts by preserving the noise properties of the original SPECT images.
Novel hybrid GPU-CPU implementation of parallelized Monte Carlo parametric expectation maximization estimation method for population pharmacokinetic data analysis.

PubMed

Ng, C M

2013-10-01

The development of a population PK/PD model, an essential component for model-based drug development, is both time- and labor-intensive. A graphical-processing unit (GPU) computing technology has been proposed and used to accelerate many scientific computations. The objective of this study was to develop a hybrid GPU-CPU implementation of parallelized Monte Carlo parametric expectation maximization (MCPEM) estimation algorithm for population PK data analysis. A hybrid GPU-CPU implementation of the MCPEM algorithm (MCPEMGPU) and identical algorithm that is designed for the single CPU (MCPEMCPU) were developed using MATLAB in a single computer equipped with dual Xeon 6-Core E5690 CPU and a NVIDIA Tesla C2070 GPU parallel computing card that contained 448 stream processors. Two different PK models with rich/sparse sampling design schemes were used to simulate population data in assessing the performance of MCPEMCPU and MCPEMGPU. Results were analyzed by comparing the parameter estimation and model computation times. Speedup factor was used to assess the relative benefit of parallelized MCPEMGPU over MCPEMCPU in shortening model computation time. The MCPEMGPU consistently achieved shorter computation time than the MCPEMCPU and can offer more than 48-fold speedup using a single GPU card. The novel hybrid GPU-CPU implementation of parallelized MCPEM algorithm developed in this study holds a great promise in serving as the core for the next-generation of modeling software for population PK/PD analysis.
The performance analysis of three-dimensional track-before-detect algorithm based on Fisher-Tippett-Gnedenko theorem

NASA Astrophysics Data System (ADS)

Cho, Hoonkyung; Chun, Joohwan; Song, Sungchan

2016-09-01

The dim moving target tracking from the infrared image sequence in the presence of high clutter and noise has been recently under intensive investigation. The track-before-detect (TBD) algorithm processing the image sequence over a number of frames before decisions on the target track and existence is known to be especially attractive in very low SNR environments (⩽ 3 dB). In this paper, we shortly present a three-dimensional (3-D) TBD with dynamic programming (TBD-DP) algorithm using multiple IR image sensors. Since traditional two-dimensional TBD algorithm cannot track and detect the along the viewing direction, we use 3-D TBD with multiple sensors and also strictly analyze the detection performance (false alarm and detection probabilities) based on Fisher-Tippett-Gnedenko theorem. The 3-D TBD-DP algorithm which does not require a separate image registration step uses the pixel intensity values jointly read off from multiple image frames to compute the merit function required in the DP process. Therefore, we also establish the relationship between the pixel coordinates of image frame and the reference coordinates.
A fast optimization algorithm for multicriteria intensity modulated proton therapy planning.

PubMed

Chen, Wei; Craft, David; Madden, Thomas M; Zhang, Kewu; Kooy, Hanne M; Herman, Gabor T

2010-09-01

To describe a fast projection algorithm for optimizing intensity modulated proton therapy (IMPT) plans and to describe and demonstrate the use of this algorithm in multicriteria IMPT planning. The authors develop a projection-based solver for a class of convex optimization problems and apply it to IMPT treatment planning. The speed of the solver permits its use in multicriteria optimization, where several optimizations are performed which span the space of possible treatment plans. The authors describe a plan database generation procedure which is customized to the requirements of the solver. The optimality precision of the solver can be specified by the user. The authors apply the algorithm to three clinical cases: A pancreas case, an esophagus case, and a tumor along the rib cage case. Detailed analysis of the pancreas case shows that the algorithm is orders of magnitude faster than industry-standard general purpose algorithms (MOSEK'S interior point optimizer, primal simplex optimizer, and dual simplex optimizer). Additionally, the projection solver has almost no memory overhead. The speed and guaranteed accuracy of the algorithm make it suitable for use in multicriteria treatment planning, which requires the computation of several diverse treatment plans. Additionally, given the low memory overhead of the algorithm, the method can be extended to include multiple geometric instances and proton range possibilities, for robust optimization.
Modern Computational Techniques for the HMMER Sequence Analysis

PubMed Central

2013-01-01

This paper focuses on the latest research and critical reviews on modern computing architectures, software and hardware accelerated algorithms for bioinformatics data analysis with an emphasis on one of the most important sequence analysis applications—hidden Markov models (HMM). We show the detailed performance comparison of sequence analysis tools on various computing platforms recently developed in the bioinformatics society. The characteristics of the sequence analysis, such as data and compute-intensive natures, make it very attractive to optimize and parallelize by using both traditional software approach and innovated hardware acceleration technologies. PMID:25937944
A Faster Parallel Algorithm and Efficient Multithreaded Implementations for Evaluating Betweenness Centrality on Massive Datasets

DOE Office of Scientific and Technical Information (OSTI.GOV)

Madduri, Kamesh; Ediger, David; Jiang, Karl

2009-05-29

We present a new lock-free parallel algorithm for computing betweenness centrality of massive small-world networks. With minor changes to the data structures, our algorithm also achieves better spatial cache locality compared to previous approaches. Betweenness centrality is a key algorithm kernel in the HPCS SSCA#2 Graph Analysis benchmark, which has been extensively used to evaluate the performance of emerging high-performance computing architectures for graph-theoretic computations. We design optimized implementations of betweenness centrality and the SSCA#2 benchmark for two hardware multithreaded systems: a Cray XMT system with the ThreadStorm processor, and a single-socket Sun multicore server with the UltraSparc T2 processor.more » For a small-world network of 134 million vertices and 1.073 billion edges, the 16-processor XMT system and the 8-core Sun Fire T5120 server achieve TEPS scores (an algorithmic performance count for the SSCA#2 benchmark) of 160 million and 90 million respectively, which corresponds to more than a 2X performance improvement over the previous parallel implementations. To better characterize the performance of these multithreaded systems, we correlate the SSCA#2 performance results with data from the memory-intensive STREAM and RandomAccess benchmarks. Finally, we demonstrate the applicability of our implementation to analyze massive real-world datasets by computing approximate betweenness centrality for a large-scale IMDb movie-actor network.« less
Lattice surgery on the Raussendorf lattice

NASA Astrophysics Data System (ADS)

Herr, Daniel; Paler, Alexandru; Devitt, Simon J.; Nori, Franco

2018-07-01

Lattice surgery is a method to perform quantum computation fault-tolerantly by using operations on boundary qubits between different patches of the planar code. This technique allows for universal planar code computation without eliminating the intrinsic two-dimensional nearest-neighbor properties of the surface code that eases physical hardware implementations. Lattice surgery approaches to algorithmic compilation and optimization have been demonstrated to be more resource efficient for resource-intensive components of a fault-tolerant algorithm, and consequently may be preferable over braid-based logic. Lattice surgery can be extended to the Raussendorf lattice, providing a measurement-based approach to the surface code. In this paper we describe how lattice surgery can be performed on the Raussendorf lattice and therefore give a viable alternative to computation using braiding in measurement-based implementations of topological codes.
Automating software design system DESTA

NASA Technical Reports Server (NTRS)

Lovitsky, Vladimir A.; Pearce, Patricia D.

1992-01-01

'DESTA' is the acronym for the Dialogue Evolutionary Synthesizer of Turnkey Algorithms by means of a natural language (Russian or English) functional specification of algorithms or software being developed. DESTA represents the computer-aided and/or automatic artificial intelligence 'forgiving' system which provides users with software tools support for algorithm and/or structured program development. The DESTA system is intended to provide support for the higher levels and earlier stages of engineering design of software in contrast to conventional Computer Aided Design (CAD) systems which provide low level tools for use at a stage when the major planning and structuring decisions have already been taken. DESTA is a knowledge-intensive system. The main features of the knowledge are procedures, functions, modules, operating system commands, batch files, their natural language specifications, and their interlinks. The specific domain for the DESTA system is a high level programming language like Turbo Pascal 6.0. The DESTA system is operational and runs on an IBM PC computer.
A Feature Selection Algorithm to Compute Gene Centric Methylation from Probe Level Methylation Data.

PubMed

Baur, Brittany; Bozdag, Serdar

2016-01-01

DNA methylation is an important epigenetic event that effects gene expression during development and various diseases such as cancer. Understanding the mechanism of action of DNA methylation is important for downstream analysis. In the Illumina Infinium HumanMethylation 450K array, there are tens of probes associated with each gene. Given methylation intensities of all these probes, it is necessary to compute which of these probes are most representative of the gene centric methylation level. In this study, we developed a feature selection algorithm based on sequential forward selection that utilized different classification methods to compute gene centric DNA methylation using probe level DNA methylation data. We compared our algorithm to other feature selection algorithms such as support vector machines with recursive feature elimination, genetic algorithms and ReliefF. We evaluated all methods based on the predictive power of selected probes on their mRNA expression levels and found that a K-Nearest Neighbors classification using the sequential forward selection algorithm performed better than other algorithms based on all metrics. We also observed that transcriptional activities of certain genes were more sensitive to DNA methylation changes than transcriptional activities of other genes. Our algorithm was able to predict the expression of those genes with high accuracy using only DNA methylation data. Our results also showed that those DNA methylation-sensitive genes were enriched in Gene Ontology terms related to the regulation of various biological processes.
Special-purpose computer for holography HORN-4 with recurrence algorithm

NASA Astrophysics Data System (ADS)

Shimobaba, Tomoyoshi; Hishinuma, Sinsuke; Ito, Tomoyoshi

2002-10-01

We designed and built a special-purpose computer for holography, HORN-4 (HOlographic ReconstructioN) using PLD (Programmable Logic Device) technology. HORN computers have a pipeline architecture. We use HORN-4 as an attached processor to enhance the performance of a general-purpose computer when it is used to generate holograms using a "recurrence formulas" algorithm developed by our previous paper. In the HORN-4 system, we designed the pipeline by adopting our "recurrence formulas" algorithm which can calculate the phase on a hologram. As the result, we could integrate the pipeline composed of 21 units into one PLD chip. The units in the pipeline consists of one BPU (Basic Phase Unit) unit and twenty CU (Cascade Unit) units. These CU units can compute twenty light intensities on a hologram plane at one time. By mounting two of the PLD chips on a PCI (Peripheral Component Interconnect) universal board, HORN-4 can calculate holograms at high speed of about 42 Gflops equivalent. The cost of HORN-4 board is about 1700 US dollar. We could obtain 800×600 grids hologram from a 3D-image composed of 415 points in about 0.45 sec with the HORN-4 system.

Computing Interactions Of Free-Space Radiation With Matter

NASA Technical Reports Server (NTRS)

Wilson, J. W.; Cucinotta, F. A.; Shinn, J. L.; Townsend, L. W.; Badavi, F. F.; Tripathi, R. K.; Silberberg, R.; Tsao, C. H.; Badwar, G. D.

1995-01-01

High Charge and Energy Transport (HZETRN) computer program computationally efficient, user-friendly package of software adressing problem of transport of, and shielding against, radiation in free space. Designed as "black box" for design engineers not concerned with physics of underlying atomic and nuclear radiation processes in free-space environment, but rather primarily interested in obtaining fast and accurate dosimetric information for design and construction of modules and devices for use in free space. Computational efficiency achieved by unique algorithm based on deterministic approach to solution of Boltzmann equation rather than computationally intensive statistical Monte Carlo method. Written in FORTRAN.
Cascaded VLSI neural network architecture for on-line learning

NASA Technical Reports Server (NTRS)

Thakoor, Anilkumar P. (Inventor); Duong, Tuan A. (Inventor); Daud, Taher (Inventor)

1992-01-01

High-speed, analog, fully-parallel, and asynchronous building blocks are cascaded for larger sizes and enhanced resolution. A hardware compatible algorithm permits hardware-in-the-loop learning despite limited weight resolution. A computation intensive feature classification application was demonstrated with this flexible hardware and new algorithm at high speed. This result indicates that these building block chips can be embedded as an application specific coprocessor for solving real world problems at extremely high data rates.
Reflection symmetry detection using locally affine invariant edge correspondence.

PubMed

Wang, Zhaozhong; Tang, Zesheng; Zhang, Xiao

2015-04-01

Reflection symmetry detection receives increasing attentions in recent years. The state-of-the-art algorithms mainly use the matching of intensity-based features (such as the SIFT) within a single image to find symmetry axes. This paper proposes a novel approach by establishing the correspondence of locally affine invariant edge-based features, which are superior to the intensity based in the aspects that it is insensitive to illumination variations, and applicable to textureless objects. The locally affine invariance is achieved by simple linear algebra for efficient and robust computations, making the algorithm suitable for detections under object distortions like perspective projection. Commonly used edge detectors and a voting process are, respectively, used before and after the edge description and matching steps to form a complete reflection detection pipeline. Experiments are performed using synthetic and real-world images with both multiple and single reflection symmetry axis. The test results are compared with existing algorithms to validate the proposed method.
Mathematical model of a rotational bioreactor for the dynamic cultivation of scaffold-adhered human mesenchymal stem cells for bone regeneration

NASA Astrophysics Data System (ADS)

Ganimedov, V. L.; Papaeva, E. O.; Maslov, N. A.; Larionov, P. M.

2017-09-01

Development of cell-mediated scaffold technologies for the treatment of critical bone defects is very important for the purpose of reparative bone regeneration. Today the properties of the bioreactor for cell-seeded scaffold cultivation are the subject of intensive research. We used the mathematical modeling of rotational reactor and construct computational algorithm with the help of ANSYS software package to develop this new procedure. The solution obtained with the help of the constructed computational algorithm is in good agreement with the analytical solution of Couette for the task of two coaxial cylinders. The series of flow computations for different rotation frequencies (1, 0.75, 0.5, 0.33, 1.125 Hz) was performed for the laminar flow regime approximation with the help of computational algorithm. It was found that Taylor vortices appear in the annular gap between the cylinders in a simulated bioreactor. It was obtained that shear stress in the range of interest (0.002-0.1 Pa) arise on outer surface of inner cylinder when it rotates with the frequency not exceeding 0.8 Hz. So the constructed mathematical model and the created computational algorithm for calculating the flow parameters allow predicting the shear stress and pressure values depending on the rotation frequency and geometric parameters, as well as optimizing the operating mode of the bioreactor.
Single-intensity-recording optical encryption technique based on phase retrieval algorithm and QR code

NASA Astrophysics Data System (ADS)

Wang, Zhi-peng; Zhang, Shuai; Liu, Hong-zhao; Qin, Yi

2014-12-01

Based on phase retrieval algorithm and QR code, a new optical encryption technology that only needs to record one intensity distribution is proposed. In this encryption process, firstly, the QR code is generated from the information to be encrypted; and then the generated QR code is placed in the input plane of 4-f system to have a double random phase encryption. For only one intensity distribution in the output plane is recorded as the ciphertext, the encryption process is greatly simplified. In the decryption process, the corresponding QR code is retrieved using phase retrieval algorithm. A priori information about QR code is used as support constraint in the input plane, which helps solve the stagnation problem. The original information can be recovered without distortion by scanning the QR code. The encryption process can be implemented either optically or digitally, and the decryption process uses digital method. In addition, the security of the proposed optical encryption technology is analyzed. Theoretical analysis and computer simulations show that this optical encryption system is invulnerable to various attacks, and suitable for harsh transmission conditions.
Analysis of deformable image registration accuracy using computational modeling

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhong Hualiang; Kim, Jinkoo; Chetty, Indrin J.

2010-03-15

Computer aided modeling of anatomic deformation, allowing various techniques and protocols in radiation therapy to be systematically verified and studied, has become increasingly attractive. In this study the potential issues in deformable image registration (DIR) were analyzed based on two numerical phantoms: One, a synthesized, low intensity gradient prostate image, and the other a lung patient's CT image data set. Each phantom was modeled with region-specific material parameters with its deformation solved using a finite element method. The resultant displacements were used to construct a benchmark to quantify the displacement errors of the Demons and B-Spline-based registrations. The results showmore » that the accuracy of these registration algorithms depends on the chosen parameters, the selection of which is closely associated with the intensity gradients of the underlying images. For the Demons algorithm, both single resolution (SR) and multiresolution (MR) registrations required approximately 300 iterations to reach an accuracy of 1.4 mm mean error in the lung patient's CT image (and 0.7 mm mean error averaged in the lung only). For the low gradient prostate phantom, these algorithms (both SR and MR) required at least 1600 iterations to reduce their mean errors to 2 mm. For the B-Spline algorithms, best performance (mean errors of 1.9 mm for SR and 1.6 mm for MR, respectively) on the low gradient prostate was achieved using five grid nodes in each direction. Adding more grid nodes resulted in larger errors. For the lung patient's CT data set, the B-Spline registrations required ten grid nodes in each direction for highest accuracy (1.4 mm for SR and 1.5 mm for MR). The numbers of iterations or grid nodes required for optimal registrations depended on the intensity gradients of the underlying images. In summary, the performance of the Demons and B-Spline registrations have been quantitatively evaluated using numerical phantoms. The results show that parameter selection for optimal accuracy is closely related to the intensity gradients of the underlying images. Also, the result that the DIR algorithms produce much lower errors in heterogeneous lung regions relative to homogeneous (low intensity gradient) regions, suggests that feature-based evaluation of deformable image registration accuracy must be viewed cautiously.« less
Probabilistic segmentation and intensity estimation for microarray images.

PubMed

Gottardo, Raphael; Besag, Julian; Stephens, Matthew; Murua, Alejandro

2006-01-01

We describe a probabilistic approach to simultaneous image segmentation and intensity estimation for complementary DNA microarray experiments. The approach overcomes several limitations of existing methods. In particular, it (a) uses a flexible Markov random field approach to segmentation that allows for a wider range of spot shapes than existing methods, including relatively common 'doughnut-shaped' spots; (b) models the image directly as background plus hybridization intensity, and estimates the two quantities simultaneously, avoiding the common logical error that estimates of foreground may be less than those of the corresponding background if the two are estimated separately; and (c) uses a probabilistic modeling approach to simultaneously perform segmentation and intensity estimation, and to compute spot quality measures. We describe two approaches to parameter estimation: a fast algorithm, based on the expectation-maximization and the iterated conditional modes algorithms, and a fully Bayesian framework. These approaches produce comparable results, and both appear to offer some advantages over other methods. We use an HIV experiment to compare our approach to two commercial software products: Spot and Arrayvision.
Optimization of image processing algorithms on mobile platforms

NASA Astrophysics Data System (ADS)

Poudel, Pramod; Shirvaikar, Mukul

2011-03-01

This work presents a technique to optimize popular image processing algorithms on mobile platforms such as cell phones, net-books and personal digital assistants (PDAs). The increasing demand for video applications like context-aware computing on mobile embedded systems requires the use of computationally intensive image processing algorithms. The system engineer has a mandate to optimize them so as to meet real-time deadlines. A methodology to take advantage of the asymmetric dual-core processor, which includes an ARM and a DSP core supported by shared memory, is presented with implementation details. The target platform chosen is the popular OMAP 3530 processor for embedded media systems. It has an asymmetric dual-core architecture with an ARM Cortex-A8 and a TMS320C64x Digital Signal Processor (DSP). The development platform was the BeagleBoard with 256 MB of NAND RAM and 256 MB SDRAM memory. The basic image correlation algorithm is chosen for benchmarking as it finds widespread application for various template matching tasks such as face-recognition. The basic algorithm prototypes conform to OpenCV, a popular computer vision library. OpenCV algorithms can be easily ported to the ARM core which runs a popular operating system such as Linux or Windows CE. However, the DSP is architecturally more efficient at handling DFT algorithms. The algorithms are tested on a variety of images and performance results are presented measuring the speedup obtained due to dual-core implementation. A major advantage of this approach is that it allows the ARM processor to perform important real-time tasks, while the DSP addresses performance-hungry algorithms.
A brief introduction to computer-intensive methods, with a view towards applications in spatial statistics and stereology.

PubMed

Mattfeldt, Torsten

2011-04-01

Computer-intensive methods may be defined as data analytical procedures involving a huge number of highly repetitive computations. We mention resampling methods with replacement (bootstrap methods), resampling methods without replacement (randomization tests) and simulation methods. The resampling methods are based on simple and robust principles and are largely free from distributional assumptions. Bootstrap methods may be used to compute confidence intervals for a scalar model parameter and for summary statistics from replicated planar point patterns, and for significance tests. For some simple models of planar point processes, point patterns can be simulated by elementary Monte Carlo methods. The simulation of models with more complex interaction properties usually requires more advanced computing methods. In this context, we mention simulation of Gibbs processes with Markov chain Monte Carlo methods using the Metropolis-Hastings algorithm. An alternative to simulations on the basis of a parametric model consists of stochastic reconstruction methods. The basic ideas behind the methods are briefly reviewed and illustrated by simple worked examples in order to encourage novices in the field to use computer-intensive methods. © 2010 The Authors Journal of Microscopy © 2010 Royal Microscopical Society.
A parallel computing engine for a class of time critical processes.

PubMed

Nabhan, T M; Zomaya, A Y

1997-01-01

This paper focuses on the efficient parallel implementation of systems of numerically intensive nature over loosely coupled multiprocessor architectures. These analytical models are of significant importance to many real-time systems that have to meet severe time constants. A parallel computing engine (PCE) has been developed in this work for the efficient simplification and the near optimal scheduling of numerical models over the different cooperating processors of the parallel computer. First, the analytical system is efficiently coded in its general form. The model is then simplified by using any available information (e.g., constant parameters). A task graph representing the interconnections among the different components (or equations) is generated. The graph can then be compressed to control the computation/communication requirements. The task scheduler employs a graph-based iterative scheme, based on the simulated annealing algorithm, to map the vertices of the task graph onto a Multiple-Instruction-stream Multiple-Data-stream (MIMD) type of architecture. The algorithm uses a nonanalytical cost function that properly considers the computation capability of the processors, the network topology, the communication time, and congestion possibilities. Moreover, the proposed technique is simple, flexible, and computationally viable. The efficiency of the algorithm is demonstrated by two case studies with good results.
Numerical characteristics of quantum computer simulation

NASA Astrophysics Data System (ADS)

Chernyavskiy, A.; Khamitov, K.; Teplov, A.; Voevodin, V.; Voevodin, Vl.

2016-12-01

The simulation of quantum circuits is significantly important for the implementation of quantum information technologies. The main difficulty of such modeling is the exponential growth of dimensionality, thus the usage of modern high-performance parallel computations is relevant. As it is well known, arbitrary quantum computation in circuit model can be done by only single- and two-qubit gates, and we analyze the computational structure and properties of the simulation of such gates. We investigate the fact that the unique properties of quantum nature lead to the computational properties of the considered algorithms: the quantum parallelism make the simulation of quantum gates highly parallel, and on the other hand, quantum entanglement leads to the problem of computational locality during simulation. We use the methodology of the AlgoWiki project (algowiki-project.org) to analyze the algorithm. This methodology consists of theoretical (sequential and parallel complexity, macro structure, and visual informational graph) and experimental (locality and memory access, scalability and more specific dynamic characteristics) parts. Experimental part was made by using the petascale Lomonosov supercomputer (Moscow State University, Russia). We show that the simulation of quantum gates is a good base for the research and testing of the development methods for data intense parallel software, and considered methodology of the analysis can be successfully used for the improvement of the algorithms in quantum information science.
CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions.

PubMed

Liu, Yongchao; Wirawan, Adrianto; Schmidt, Bertil

2013-04-04

The maximal sensitivity for local alignments makes the Smith-Waterman algorithm a popular choice for protein sequence database search based on pairwise alignment. However, the algorithm is compute-intensive due to a quadratic time complexity. Corresponding runtimes are further compounded by the rapid growth of sequence databases. We present CUDASW++ 3.0, a fast Smith-Waterman protein database search algorithm, which couples CPU and GPU SIMD instructions and carries out concurrent CPU and GPU computations. For the CPU computation, this algorithm employs SSE-based vector execution units as accelerators. For the GPU computation, we have investigated for the first time a GPU SIMD parallelization, which employs CUDA PTX SIMD video instructions to gain more data parallelism beyond the SIMT execution model. Moreover, sequence alignment workloads are automatically distributed over CPUs and GPUs based on their respective compute capabilities. Evaluation on the Swiss-Prot database shows that CUDASW++ 3.0 gains a performance improvement over CUDASW++ 2.0 up to 2.9 and 3.2, with a maximum performance of 119.0 and 185.6 GCUPS, on a single-GPU GeForce GTX 680 and a dual-GPU GeForce GTX 690 graphics card, respectively. In addition, our algorithm has demonstrated significant speedups over other top-performing tools: SWIPE and BLAST+. CUDASW++ 3.0 is written in CUDA C++ and PTX assembly languages, targeting GPUs based on the Kepler architecture. This algorithm obtains significant speedups over its predecessor: CUDASW++ 2.0, by benefiting from the use of CPU and GPU SIMD instructions as well as the concurrent execution on CPUs and GPUs. The source code and the simulated data are available at http://cudasw.sourceforge.net.
1 kHz 2D Visual Motion Sensor Using 20 × 20 Silicon Retina Optical Sensor and DSP Microcontroller.

PubMed

Liu, Shih-Chii; Yang, MinHao; Steiner, Andreas; Moeckel, Rico; Delbruck, Tobi

2015-04-01

Optical flow sensors have been a long running theme in neuromorphic vision sensors which include circuits that implement the local background intensity adaptation mechanism seen in biological retinas. This paper reports a bio-inspired optical motion sensor aimed towards miniature robotic and aerial platforms. It combines a 20 × 20 continuous-time CMOS silicon retina vision sensor with a DSP microcontroller. The retina sensor has pixels that have local gain control and adapt to background lighting. The system allows the user to validate various motion algorithms without building dedicated custom solutions. Measurements are presented to show that the system can compute global 2D translational motion from complex natural scenes using one particular algorithm: the image interpolation algorithm (I2A). With this algorithm, the system can compute global translational motion vectors at a sample rate of 1 kHz, for speeds up to ±1000 pixels/s, using less than 5 k instruction cycles (12 instructions per pixel) per frame. At 1 kHz sample rate the DSP is 12% occupied with motion computation. The sensor is implemented as a 6 g PCB consuming 170 mW of power.
Reverse time migration by Krylov subspace reduced order modeling

NASA Astrophysics Data System (ADS)

Basir, Hadi Mahdavi; Javaherian, Abdolrahim; Shomali, Zaher Hossein; Firouz-Abadi, Roohollah Dehghani; Gholamy, Shaban Ali

2018-04-01

Imaging is a key step in seismic data processing. To date, a myriad of advanced pre-stack depth migration approaches have been developed; however, reverse time migration (RTM) is still considered as the high-end imaging algorithm. The main limitations associated with the performance cost of reverse time migration are the intensive computation of the forward and backward simulations, time consumption, and memory allocation related to imaging condition. Based on the reduced order modeling, we proposed an algorithm, which can be adapted to all the aforementioned factors. Our proposed method benefit from Krylov subspaces method to compute certain mode shapes of the velocity model computed by as an orthogonal base of reduced order modeling. Reverse time migration by reduced order modeling is helpful concerning the highly parallel computation and strongly reduces the memory requirement of reverse time migration. The synthetic model results showed that suggested method can decrease the computational costs of reverse time migration by several orders of magnitudes, compared with reverse time migration by finite element method.
Cloud Based Metalearning System for Predictive Modeling of Biomedical Data

PubMed Central

Vukićević, Milan

2014-01-01

Rapid growth and storage of biomedical data enabled many opportunities for predictive modeling and improvement of healthcare processes. On the other side analysis of such large amounts of data is a difficult and computationally intensive task for most existing data mining algorithms. This problem is addressed by proposing a cloud based system that integrates metalearning framework for ranking and selection of best predictive algorithms for data at hand and open source big data technologies for analysis of biomedical data. PMID:24892101
Cross-wind profiling based on the scattered wave scintillation in a telescope focus.

PubMed

Banakh, V A; Marakasov, D A; Vorontsov, M A

2007-11-20

The problem of wind profile reconstruction from scintillation of an optical wave scattered off a rough surface in a telescope focus plane is considered. Both the expression for the spatiotemporal correlation function and the algorithm of cross-wind velocity and direction profiles reconstruction based on the spatiotemporal spectrum of intensity of an optical wave scattered by a diffuse target in a turbulent atmosphere are presented. Computer simulations performed under conditions of weak optical turbulence show wind profiles reconstruction by the developed algorithm.
Large-scale atomistic calculations of clusters in intense x-ray pulses

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ho, Phay J.; Knight, Chris

Here, we present the methodology of our recently developed Monte-Carlo/ Molecular-Dynamics method for studying the fundamental ultrafast dynamics induced by high-fluence, high-intensity x-ray free electron laser (XFEL) pulses in clusters. The quantum nature of the initiating ionization process is accounted for by a Monte Carlo method to calculate probabilities of electronic transitions, including photo absorption, inner-shell relaxation, photon scattering, electron collision and recombination dynamics, and thus track the transient electronic configurations explicitly. The freed electrons and ions are followed by classical particle trajectories using a molecular dynamics algorithm. These calculations reveal the surprising role of electron-ion recombination processes that leadmore » to the development of nonuniform spatial charge density profiles in x-ray excited clusters over femtosecond timescales. In the high-intensity limit, it is important to include the recombination dynamics in the calculated scattering response even for a 2- fs pulse. We also demonstrate that our numerical codes and algorithms can make e!cient use of the computational power of massively parallel supercomputers to investigate the intense-field dynamics in systems with increasing complexity and size at the ultrafast timescale and in non-linear x-ray interaction regimes. In particular, picosecond trajectories of XFEL clusters with attosecond time resolution containing millions of particles can be e!ciently computed on upwards of 262,144 processes.« less
Large-scale atomistic calculations of clusters in intense x-ray pulses

DOE PAGES

Ho, Phay J.; Knight, Chris

2017-04-28

Here, we present the methodology of our recently developed Monte-Carlo/ Molecular-Dynamics method for studying the fundamental ultrafast dynamics induced by high-fluence, high-intensity x-ray free electron laser (XFEL) pulses in clusters. The quantum nature of the initiating ionization process is accounted for by a Monte Carlo method to calculate probabilities of electronic transitions, including photo absorption, inner-shell relaxation, photon scattering, electron collision and recombination dynamics, and thus track the transient electronic configurations explicitly. The freed electrons and ions are followed by classical particle trajectories using a molecular dynamics algorithm. These calculations reveal the surprising role of electron-ion recombination processes that leadmore » to the development of nonuniform spatial charge density profiles in x-ray excited clusters over femtosecond timescales. In the high-intensity limit, it is important to include the recombination dynamics in the calculated scattering response even for a 2- fs pulse. We also demonstrate that our numerical codes and algorithms can make e!cient use of the computational power of massively parallel supercomputers to investigate the intense-field dynamics in systems with increasing complexity and size at the ultrafast timescale and in non-linear x-ray interaction regimes. In particular, picosecond trajectories of XFEL clusters with attosecond time resolution containing millions of particles can be e!ciently computed on upwards of 262,144 processes.« less
A Robust Random Forest-Based Approach for Heart Rate Monitoring Using Photoplethysmography Signal Contaminated by Intense Motion Artifacts.

PubMed

Ye, Yalan; He, Wenwen; Cheng, Yunfei; Huang, Wenxia; Zhang, Zhilin

2017-02-16

The estimation of heart rate (HR) based on wearable devices is of interest in fitness. Photoplethysmography (PPG) is a promising approach to estimate HR due to low cost; however, it is easily corrupted by motion artifacts (MA). In this work, a robust approach based on random forest is proposed for accurately estimating HR from the photoplethysmography signal contaminated by intense motion artifacts, consisting of two stages. Stage 1 proposes a hybrid method to effectively remove MA with a low computation complexity, where two MA removal algorithms are combined by an accurate binary decision algorithm whose aim is to decide whether or not to adopt the second MA removal algorithm. Stage 2 proposes a random forest-based spectral peak-tracking algorithm, whose aim is to locate the spectral peak corresponding to HR, formulating the problem of spectral peak tracking into a pattern classification problem. Experiments on the PPG datasets including 22 subjects used in the 2015 IEEE Signal Processing Cup showed that the proposed approach achieved the average absolute error of 1.65 beats per minute (BPM) on the 22 PPG datasets. Compared to state-of-the-art approaches, the proposed approach has better accuracy and robustness to intense motion artifacts, indicating its potential use in wearable sensors for health monitoring and fitness tracking.
A Subspace Pursuit–based Iterative Greedy Hierarchical Solution to the Neuromagnetic Inverse Problem

PubMed Central

Babadi, Behtash; Obregon-Henao, Gabriel; Lamus, Camilo; Hämäläinen, Matti S.; Brown, Emery N.; Purdon, Patrick L.

2013-01-01

Magnetoencephalography (MEG) is an important non-invasive method for studying activity within the human brain. Source localization methods can be used to estimate spatiotemporal activity from MEG measurements with high temporal resolution, but the spatial resolution of these estimates is poor due to the ill-posed nature of the MEG inverse problem. Recent developments in source localization methodology have emphasized temporal as well as spatial constraints to improve source localization accuracy, but these methods can be computationally intense. Solutions emphasizing spatial sparsity hold tremendous promise, since the underlying neurophysiological processes generating MEG signals are often sparse in nature, whether in the form of focal sources, or distributed sources representing large-scale functional networks. Recent developments in the theory of compressed sensing (CS) provide a rigorous framework to estimate signals with sparse structure. In particular, a class of CS algorithms referred to as greedy pursuit algorithms can provide both high recovery accuracy and low computational complexity. Greedy pursuit algorithms are difficult to apply directly to the MEG inverse problem because of the high-dimensional structure of the MEG source space and the high spatial correlation in MEG measurements. In this paper, we develop a novel greedy pursuit algorithm for sparse MEG source localization that overcomes these fundamental problems. This algorithm, which we refer to as the Subspace Pursuit-based Iterative Greedy Hierarchical (SPIGH) inverse solution, exhibits very low computational complexity while achieving very high localization accuracy. We evaluate the performance of the proposed algorithm using comprehensive simulations, as well as the analysis of human MEG data during spontaneous brain activity and somatosensory stimuli. These studies reveal substantial performance gains provided by the SPIGH algorithm in terms of computational complexity, localization accuracy, and robustness. PMID:24055554

A hybrid multi-objective evolutionary algorithm for wind-turbine blade optimization

NASA Astrophysics Data System (ADS)

Sessarego, M.; Dixon, K. R.; Rival, D. E.; Wood, D. H.

2015-08-01

A concurrent-hybrid non-dominated sorting genetic algorithm (hybrid NSGA-II) has been developed and applied to the simultaneous optimization of the annual energy production, flapwise root-bending moment and mass of the NREL 5 MW wind-turbine blade. By hybridizing a multi-objective evolutionary algorithm (MOEA) with gradient-based local search, it is believed that the optimal set of blade designs could be achieved in lower computational cost than for a conventional MOEA. To measure the convergence between the hybrid and non-hybrid NSGA-II on a wind-turbine blade optimization problem, a computationally intensive case was performed using the non-hybrid NSGA-II. From this particular case, a three-dimensional surface representing the optimal trade-off between the annual energy production, flapwise root-bending moment and blade mass was achieved. The inclusion of local gradients in the blade optimization, however, shows no improvement in the convergence for this three-objective problem.
An improved ant colony optimization algorithm with fault tolerance for job scheduling in grid computing systems

PubMed Central

Idris, Hajara; Junaidu, Sahalu B.; Adewumi, Aderemi O.

2017-01-01

The Grid scheduler, schedules user jobs on the best available resource in terms of resource characteristics by optimizing job execution time. Resource failure in Grid is no longer an exception but a regular occurring event as resources are increasingly being used by the scientific community to solve computationally intensive problems which typically run for days or even months. It is therefore absolutely essential that these long-running applications are able to tolerate failures and avoid re-computations from scratch after resource failure has occurred, to satisfy the user’s Quality of Service (QoS) requirement. Job Scheduling with Fault Tolerance in Grid Computing using Ant Colony Optimization is proposed to ensure that jobs are executed successfully even when resource failure has occurred. The technique employed in this paper, is the use of resource failure rate, as well as checkpoint-based roll back recovery strategy. Check-pointing aims at reducing the amount of work that is lost upon failure of the system by immediately saving the state of the system. A comparison of the proposed approach with an existing Ant Colony Optimization (ACO) algorithm is discussed. The experimental results of the implemented Fault Tolerance scheduling algorithm show that there is an improvement in the user’s QoS requirement over the existing ACO algorithm, which has no fault tolerance integrated in it. The performance evaluation of the two algorithms was measured in terms of the three main scheduling performance metrics: makespan, throughput and average turnaround time. PMID:28545075
Parallelization and implementation of approximate root isolation for nonlinear system by Monte Carlo

NASA Astrophysics Data System (ADS)

Khosravi, Ebrahim

1998-12-01

This dissertation solves a fundamental problem of isolating the real roots of nonlinear systems of equations by Monte-Carlo that were published by Bush Jones. This algorithm requires only function values and can be applied readily to complicated systems of transcendental functions. The implementation of this sequential algorithm provides scientists with the means to utilize function analysis in mathematics or other fields of science. The algorithm, however, is so computationally intensive that the system is limited to a very small set of variables, and this will make it unfeasible for large systems of equations. Also a computational technique was needed for investigating a metrology of preventing the algorithm structure from converging to the same root along different paths of computation. The research provides techniques for improving the efficiency and correctness of the algorithm. The sequential algorithm for this technique was corrected and a parallel algorithm is presented. This parallel method has been formally analyzed and is compared with other known methods of root isolation. The effectiveness, efficiency, enhanced overall performance of the parallel processing of the program in comparison to sequential processing is discussed. The message passing model was used for this parallel processing, and it is presented and implemented on Intel/860 MIMD architecture. The parallel processing proposed in this research has been implemented in an ongoing high energy physics experiment: this algorithm has been used to track neutrinoes in a super K detector. This experiment is located in Japan, and data can be processed on-line or off-line locally or remotely.
Detection of unmanned aerial vehicles using a visible camera system.

PubMed

Hu, Shuowen; Goldman, Geoffrey H; Borel-Donohue, Christoph C

2017-01-20

Unmanned aerial vehicles (UAVs) flown by adversaries are an emerging asymmetric threat to homeland security and the military. To help address this threat, we developed and tested a computationally efficient UAV detection algorithm consisting of horizon finding, motion feature extraction, blob analysis, and coherence analysis. We compare the performance of this algorithm against two variants, one using the difference image intensity as the motion features and another using higher-order moments. The proposed algorithm and its variants are tested using field test data of a group 3 UAV acquired with a panoramic video camera in the visible spectrum. The performance of the algorithms was evaluated using receiver operating characteristic curves. The results show that the proposed approach had the best performance compared to the two algorithmic variants.
BFL: a node and edge betweenness based fast layout algorithm for large scale networks

PubMed Central

Hashimoto, Tatsunori B; Nagasaki, Masao; Kojima, Kaname; Miyano, Satoru

2009-01-01

Background Network visualization would serve as a useful first step for analysis. However, current graph layout algorithms for biological pathways are insensitive to biologically important information, e.g. subcellular localization, biological node and graph attributes, or/and not available for large scale networks, e.g. more than 10000 elements. Results To overcome these problems, we propose the use of a biologically important graph metric, betweenness, a measure of network flow. This metric is highly correlated with many biological phenomena such as lethality and clusters. We devise a new fast parallel algorithm calculating betweenness to minimize the preprocessing cost. Using this metric, we also invent a node and edge betweenness based fast layout algorithm (BFL). BFL places the high-betweenness nodes to optimal positions and allows the low-betweenness nodes to reach suboptimal positions. Furthermore, BFL reduces the runtime by combining a sequential insertion algorim with betweenness. For a graph with n nodes, this approach reduces the expected runtime of the algorithm to O(n2) when considering edge crossings, and to O(n log n) when considering only density and edge lengths. Conclusion Our BFL algorithm is compared against fast graph layout algorithms and approaches requiring intensive optimizations. For gene networks, we show that our algorithm is faster than all layout algorithms tested while providing readability on par with intensive optimization algorithms. We achieve a 1.4 second runtime for a graph with 4000 nodes and 12000 edges on a standard desktop computer. PMID:19146673
Controlling Light Transmission Through Highly Scattering Media Using Semi-Definite Programming as a Phase Retrieval Computation Method.

PubMed

N'Gom, Moussa; Lien, Miao-Bin; Estakhri, Nooshin M; Norris, Theodore B; Michielssen, Eric; Nadakuditi, Raj Rao

2017-05-31

Complex Semi-Definite Programming (SDP) is introduced as a novel approach to phase retrieval enabled control of monochromatic light transmission through highly scattering media. In a simple optical setup, a spatial light modulator is used to generate a random sequence of phase-modulated wavefronts, and the resulting intensity speckle patterns in the transmitted light are acquired on a camera. The SDP algorithm allows computation of the complex transmission matrix of the system from this sequence of intensity-only measurements, without need for a reference beam. Once the transmission matrix is determined, optimal wavefronts are computed that focus the incident beam to any position or sequence of positions on the far side of the scattering medium, without the need for any subsequent measurements or wavefront shaping iterations. The number of measurements required and the degree of enhancement of the intensity at focus is determined by the number of pixels controlled by the spatial light modulator.
Reducing false asystole alarms in intensive care.

PubMed

Dekimpe, Remi; Heldt, Thomas

2017-07-01

High rates of false monitoring alarms in intensive care can desensitize staff and therefore pose a significant risk to patient safety. Like other critical arrhythmia alarms, asystole alarms require immediate attention by the care providers as a true asystole event can be acutely life threatening. Here, it is illustrated that most false asystole alarms can be attributed to poor signal quality, and we propose and evaluate an algorithm to identify data windows of poor signal quality and thereby help suppress false asystole alarms. The algorithm combines intuitive signal-quality features (degree of signal saturation and baseline wander) and information from other physiological signals that might be available. Algorithm training and testing was performed on the MIMIC II and 2015 PhysioNet/Computing in Cardiology Challenge databases, respectively. The algorithm achieved an alarm specificity of 81.0% and sensitivity of 95.4%, missing only one out of 22 true asystole alarms. On a separate neonatal data set, the algorithm was able to reject 89.7% (890 out of 992) of false asystole alarms while keeping all 22 true events. The results show that the false asystole alarm rate can be significantly reduced through basic signal quality evaluation.
Scalable Algorithms for Clustering Large Geospatiotemporal Data Sets on Manycore Architectures

NASA Astrophysics Data System (ADS)

Mills, R. T.; Hoffman, F. M.; Kumar, J.; Sreepathi, S.; Sripathi, V.

2016-12-01

The increasing availability of high-resolution geospatiotemporal data sets from sources such as observatory networks, remote sensing platforms, and computational Earth system models has opened new possibilities for knowledge discovery using data sets fused from disparate sources. Traditional algorithms and computing platforms are impractical for the analysis and synthesis of data sets of this size; however, new algorithmic approaches that can effectively utilize the complex memory hierarchies and the extremely high levels of available parallelism in state-of-the-art high-performance computing platforms can enable such analysis. We describe a massively parallel implementation of accelerated k-means clustering and some optimizations to boost computational intensity and utilization of wide SIMD lanes on state-of-the art multi- and manycore processors, including the second-generation Intel Xeon Phi ("Knights Landing") processor based on the Intel Many Integrated Core (MIC) architecture, which includes several new features, including an on-package high-bandwidth memory. We also analyze the code in the context of a few practical applications to the analysis of climatic and remotely-sensed vegetation phenology data sets, and speculate on some of the new applications that such scalable analysis methods may enable.
Parallel algorithms for quantum chemistry. I. Integral transformations on a hypercube multiprocessor

DOE Office of Scientific and Technical Information (OSTI.GOV)

Whiteside, R.A.; Binkley, J.S.; Colvin, M.E.

1987-02-15

For many years it has been recognized that fundamental physical constraints such as the speed of light will limit the ultimate speed of single processor computers to less than about three billion floating point operations per second (3 GFLOPS). This limitation is becoming increasingly restrictive as commercially available machines are now within an order of magnitude of this asymptotic limit. A natural way to avoid this limit is to harness together many processors to work on a single computational problem. In principle, these parallel processing computers have speeds limited only by the number of processors one chooses to acquire. Themore » usefulness of potentially unlimited processing speed to a computationally intensive field such as quantum chemistry is obvious. If these methods are to be applied to significantly larger chemical systems, parallel schemes will have to be employed. For this reason we have developed distributed-memory algorithms for a number of standard quantum chemical methods. We are currently implementing these on a 32 processor Intel hypercube. In this paper we present our algorithm and benchmark results for one of the bottleneck steps in quantum chemical calculations: the four index integral transformation.« less
Optimal Filter Estimation for Lucas-Kanade Optical Flow

PubMed Central

Sharmin, Nusrat; Brad, Remus

2012-01-01

Optical flow algorithms offer a way to estimate motion from a sequence of images. The computation of optical flow plays a key-role in several computer vision applications, including motion detection and segmentation, frame interpolation, three-dimensional scene reconstruction, robot navigation and video compression. In the case of gradient based optical flow implementation, the pre-filtering step plays a vital role, not only for accurate computation of optical flow, but also for the improvement of performance. Generally, in optical flow computation, filtering is used at the initial level on original input images and afterwards, the images are resized. In this paper, we propose an image filtering approach as a pre-processing step for the Lucas-Kanade pyramidal optical flow algorithm. Based on a study of different types of filtering methods and applied on the Iterative Refined Lucas-Kanade, we have concluded on the best filtering practice. As the Gaussian smoothing filter was selected, an empirical approach for the Gaussian variance estimation was introduced. Tested on the Middlebury image sequences, a correlation between the image intensity value and the standard deviation value of the Gaussian function was established. Finally, we have found that our selection method offers a better performance for the Lucas-Kanade optical flow algorithm.
A predictive software tool for optimal timing in contrast enhanced carotid MR angiography

NASA Astrophysics Data System (ADS)

Moghaddam, Abbas N.; Balawi, Tariq; Habibi, Reza; Panknin, Christoph; Laub, Gerhard; Ruehm, Stefan; Finn, J. Paul

2008-03-01

A clear understanding of the first pass dynamics of contrast agents in the vascular system is crucial in synchronizing data acquisition of 3D MR angiography (MRA) with arrival of the contrast bolus in the vessels of interest. We implemented a computational model to simulate contrast dynamics in the vessels using the theory of linear time-invariant systems. The algorithm calculates a patient-specific impulse response for the contrast concentration from time-resolved images following a small test bolus injection. This is performed for a specific region of interest and through deconvolution of the intensity curve using the long division method. Since high spatial resolution 3D MRA is not time-resolved, the method was validated on time-resolved arterial contrast enhancement in Multi Slice CT angiography. For 20 patients, the timing of the contrast enhancement of the main bolus was predicted by our algorithm from the response to the test bolus, and then for each case the predicted time of maximum intensity was compared to the corresponding time in the actual scan which resulted in an acceptable agreement. Furthermore, as a qualitative validation, the algorithm's predictions of the timing of the carotid MRA in 20 patients with high quality MRA were correlated with the actual timing of those studies. We conclude that the above algorithm can be used as a practical clinical tool to eliminate guesswork and to replace empiric formulae by a priori computation of patient-specific timing of data acquisition for MR angiography.
An 'adding' algorithm for the Markov chain formalism for radiation transfer

NASA Technical Reports Server (NTRS)

Esposito, L. W.

1979-01-01

An adding algorithm is presented, that extends the Markov chain method and considers a preceding calculation as a single state of a new Markov chain. This method takes advantage of the description of the radiation transport as a stochastic process. Successive application of this procedure makes calculation possible for any optical depth without increasing the size of the linear system used. It is determined that the time required for the algorithm is comparable to that for a doubling calculation for homogeneous atmospheres. For an inhomogeneous atmosphere the new method is considerably faster than the standard adding routine. It is concluded that the algorithm is efficient, accurate, and suitable for smaller computers in calculating the diffuse intensity scattered by an inhomogeneous planetary atmosphere.
Parallel programming of gradient-based iterative image reconstruction schemes for optical tomography.

PubMed

Hielscher, Andreas H; Bartel, Sebastian

2004-02-01

Optical tomography (OT) is a fast developing novel imaging modality that uses near-infrared (NIR) light to obtain cross-sectional views of optical properties inside the human body. A major challenge remains the time-consuming, computational-intensive image reconstruction problem that converts NIR transmission measurements into cross-sectional images. To increase the speed of iterative image reconstruction schemes that are commonly applied for OT, we have developed and implemented several parallel algorithms on a cluster of workstations. Static process distribution as well as dynamic load balancing schemes suitable for heterogeneous clusters and varying machine performances are introduced and tested. The resulting algorithms are shown to accelerate the reconstruction process to various degrees, substantially reducing the computation times for clinically relevant problems.
A Communication-Optimal Framework for Contracting Distributed Tensors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rajbhandari, Samyam; NIkam, Akshay; Lai, Pai-Wei

Tensor contractions are extremely compute intensive generalized matrix multiplication operations encountered in many computational science fields, such as quantum chemistry and nuclear physics. Unlike distributed matrix multiplication, which has been extensively studied, limited work has been done in understanding distributed tensor contractions. In this paper, we characterize distributed tensor contraction algorithms on torus networks. We develop a framework with three fundamental communication operators to generate communication-efficient contraction algorithms for arbitrary tensor contractions. We show that for a given amount of memory per processor, our framework is communication optimal for all tensor contractions. We demonstrate performance and scalability of our frameworkmore » on up to 262,144 cores of BG/Q supercomputer using five tensor contraction examples.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)

Rajbhandari, Samyam; NIkam, Akshay; Lai, Pai-Wei

Tensor contractions represent the most compute-intensive core kernels in ab initio computational quantum chemistry and nuclear physics. Symmetries in these tensor contractions makes them difficult to load balance and scale to large distributed systems. In this paper, we develop an efficient and scalable algorithm to contract symmetric tensors. We introduce a novel approach that avoids data redistribution in contracting symmetric tensors while also avoiding redundant storage and maintaining load balance. We present experimental results on two parallel supercomputers for several symmetric contractions that appear in the CCSD quantum chemistry method. We also present a novel approach to tensor redistribution thatmore » can take advantage of parallel hyperplanes when the initial distribution has replicated dimensions, and use collective broadcast when the final distribution has replicated dimensions, making the algorithm very efficient.« less
Linear static structural and vibration analysis on high-performance computers

NASA Technical Reports Server (NTRS)

Baddourah, M. A.; Storaasli, O. O.; Bostic, S. W.

1993-01-01

Parallel computers offer the oppurtunity to significantly reduce the computation time necessary to analyze large-scale aerospace structures. This paper presents algorithms developed for and implemented on massively-parallel computers hereafter referred to as Scalable High-Performance Computers (SHPC), for the most computationally intensive tasks involved in structural analysis, namely, generation and assembly of system matrices, solution of systems of equations and calculation of the eigenvalues and eigenvectors. Results on SHPC are presented for large-scale structural problems (i.e. models for High-Speed Civil Transport). The goal of this research is to develop a new, efficient technique which extends structural analysis to SHPC and makes large-scale structural analyses tractable.
Robust feature extraction for rapid classification of damage in composites

NASA Astrophysics Data System (ADS)

Coelho, Clyde K.; Reynolds, Whitney; Chattopadhyay, Aditi

2009-03-01

The ability to detect anomalies in signals from sensors is imperative for structural health monitoring (SHM) applications. Many of the candidate algorithms for these applications either require a lot of training examples or are very computationally inefficient for large sample sizes. The damage detection framework presented in this paper uses a combination of Linear Discriminant Analysis (LDA) along with Support Vector Machines (SVM) to obtain a computationally efficient classification scheme for rapid damage state determination. LDA was used for feature extraction of damage signals from piezoelectric sensors on a composite plate and these features were used to train the SVM algorithm in parts, reducing the computational intensity associated with the quadratic optimization problem that needs to be solved during training. SVM classifiers were organized into a binary tree structure to speed up classification, which also reduces the total training time required. This framework was validated on composite plates that were impacted at various locations. The results show that the algorithm was able to correctly predict the different impact damage cases in composite laminates using less than 21 percent of the total available training data after data reduction.
A Faster Parallel Algorithm and Efficient Multithreaded Implementations for Evaluating Betweenness Centrality on Massive Datasets

DOE Office of Scientific and Technical Information (OSTI.GOV)

Madduri, Kamesh; Ediger, David; Jiang, Karl

2009-02-15

We present a new lock-free parallel algorithm for computing betweenness centralityof massive small-world networks. With minor changes to the data structures, ouralgorithm also achieves better spatial cache locality compared to previous approaches. Betweenness centrality is a key algorithm kernel in HPCS SSCA#2, a benchmark extensively used to evaluate the performance of emerging high-performance computing architectures for graph-theoretic computations. We design optimized implementations of betweenness centrality and the SSCA#2 benchmark for two hardware multithreaded systems: a Cray XMT system with the Threadstorm processor, and a single-socket Sun multicore server with the UltraSPARC T2 processor. For a small-world network of 134 millionmore » vertices and 1.073 billion edges, the 16-processor XMT system and the 8-core Sun Fire T5120 server achieve TEPS scores (an algorithmic performance count for the SSCA#2 benchmark) of 160 million and 90 million respectively, which corresponds to more than a 2X performance improvement over the previous parallel implementations. To better characterize the performance of these multithreaded systems, we correlate the SSCA#2 performance results with data from the memory-intensive STREAM and RandomAccess benchmarks. Finally, we demonstrate the applicability of our implementation to analyze massive real-world datasets by computing approximate betweenness centrality for a large-scale IMDb movie-actor network.« less
GATECloud.net: a platform for large-scale, open-source text processing on the cloud.

PubMed

Tablan, Valentin; Roberts, Ian; Cunningham, Hamish; Bontcheva, Kalina

2013-01-28

Cloud computing is increasingly being regarded as a key enabler of the 'democratization of science', because on-demand, highly scalable cloud computing facilities enable researchers anywhere to carry out data-intensive experiments. In the context of natural language processing (NLP), algorithms tend to be complex, which makes their parallelization and deployment on cloud platforms a non-trivial task. This study presents a new, unique, cloud-based platform for large-scale NLP research--GATECloud. net. It enables researchers to carry out data-intensive NLP experiments by harnessing the vast, on-demand compute power of the Amazon cloud. Important infrastructural issues are dealt with by the platform, completely transparently for the researcher: load balancing, efficient data upload and storage, deployment on the virtual machines, security and fault tolerance. We also include a cost-benefit analysis and usage evaluation.
NETRA: A parallel architecture for integrated vision systems. 1: Architecture and organization

NASA Technical Reports Server (NTRS)

Choudhary, Alok N.; Patel, Janak H.; Ahuja, Narendra

1989-01-01

Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is considered to be a system that uses vision algorithms from all levels of processing for a high level application (such as object recognition). A model of computation is presented for parallel processing for an IVS. Using the model, desired features and capabilities of a parallel architecture suitable for IVSs are derived. Then a multiprocessor architecture (called NETRA) is presented. This architecture is highly flexible without the use of complex interconnection schemes. The topology of NETRA is recursively defined and hence is easily scalable from small to large systems. Homogeneity of NETRA permits fault tolerance and graceful degradation under faults. It is a recursively defined tree-type hierarchical architecture where each of the leaf nodes consists of a cluster of processors connected with a programmable crossbar with selective broadcast capability to provide for desired flexibility. A qualitative evaluation of NETRA is presented. Then general schemes are described to map parallel algorithms onto NETRA. Algorithms are classified according to their communication requirements for parallel processing. An extensive analysis of inter-cluster communication strategies in NETRA is presented, and parameters affecting performance of parallel algorithms when mapped on NETRA are discussed. Finally, a methodology to evaluate performance of algorithms on NETRA is described.

Online learning in optical tomography: a stochastic approach

NASA Astrophysics Data System (ADS)

Chen, Ke; Li, Qin; Liu, Jian-Guo

2018-07-01

We study the inverse problem of radiative transfer equation (RTE) using stochastic gradient descent method (SGD) in this paper. Mathematically, optical tomography amounts to recovering the optical parameters in RTE using the incoming–outgoing pair of light intensity. We formulate it as a PDE-constraint optimization problem, where the mismatch of computed and measured outgoing data is minimized with same initial data and RTE constraint. The memory and computation cost it requires, however, is typically prohibitive, especially in high dimensional space. Smart iterative solvers that only use partial information in each step is called for thereafter. Stochastic gradient descent method is an online learning algorithm that randomly selects data for minimizing the mismatch. It requires minimum memory and computation, and advances fast, therefore perfectly serves the purpose. In this paper we formulate the problem, in both nonlinear and its linearized setting, apply SGD algorithm and analyze the convergence performance.
A Real Time Controller For Applications In Smart Structures

NASA Astrophysics Data System (ADS)

Ahrens, Christian P.; Claus, Richard O.

1990-02-01

Research in smart structures, especially the area of vibration suppression, has warranted the investigation of advanced computing environments. Real time PC computing power has limited development of high order control algorithms. This paper presents a simple Real Time Embedded Control System (RTECS) in an application of Intelligent Structure Monitoring by way of modal domain sensing for vibration control. It is compared to a PC AT based system for overall functionality and speed. The system employs a novel Reduced Instruction Set Computer (RISC) microcontroller capable of 15 million instructions per second (MIPS) continuous performance and burst rates of 40 MIPS. Advanced Complimentary Metal Oxide Semiconductor (CMOS) circuits are integrated on a single 100 mm by 160 mm printed circuit board requiring only 1 Watt of power. An operating system written in Forth provides high speed operation and short development cycles. The system allows for implementation of Input/Output (I/O) intensive algorithms and provides capability for advanced system development.
Multi-layer Lanczos iteration approach to calculations of vibrational energies and dipole transition intensities for polyatomic molecules

DOE PAGES

Yu, Hua-Gen

2015-01-28

We report a rigorous full dimensional quantum dynamics algorithm, the multi-layer Lanczos method, for computing vibrational energies and dipole transition intensities of polyatomic molecules without any dynamics approximation. The multi-layer Lanczos method is developed by using a few advanced techniques including the guided spectral transform Lanczos method, multi-layer Lanczos iteration approach, recursive residue generation method, and dipole-wavefunction contraction. The quantum molecular Hamiltonian at the total angular momentum J = 0 is represented in a set of orthogonal polyspherical coordinates so that the large amplitude motions of vibrations are naturally described. In particular, the algorithm is general and problem-independent. An applicationmore » is illustrated by calculating the infrared vibrational dipole transition spectrum of CH₄ based on the ab initio T8 potential energy surface of Schwenke and Partridge and the low-order truncated ab initio dipole moment surfaces of Yurchenko and co-workers. A comparison with experiments is made. The algorithm is also applicable for Raman polarizability active spectra.« less
A real-time photogrammetric algorithm for sensor and synthetic image fusion with application to aviation combined vision

NASA Astrophysics Data System (ADS)

Lebedev, M. A.; Stepaniants, D. G.; Komarov, D. V.; Vygolov, O. V.; Vizilter, Yu. V.; Zheltov, S. Yu.

2014-08-01

The paper addresses a promising visualization concept related to combination of sensor and synthetic images in order to enhance situation awareness of a pilot during an aircraft landing. A real-time algorithm for a fusion of a sensor image, acquired by an onboard camera, and a synthetic 3D image of the external view, generated in an onboard computer, is proposed. The pixel correspondence between the sensor and the synthetic images is obtained by an exterior orientation of a "virtual" camera using runway points as a geospatial reference. The runway points are detected by the Projective Hough Transform, which idea is to project the edge map onto a horizontal plane in the object space (the runway plane) and then to calculate intensity projections of edge pixels on different directions of intensity gradient. The performed experiments on simulated images show that on a base glide path the algorithm provides image fusion with pixel accuracy, even in the case of significant navigation errors.
Activity recognition using a single accelerometer placed at the wrist or ankle.

PubMed

Mannini, Andrea; Intille, Stephen S; Rosenberger, Mary; Sabatini, Angelo M; Haskell, William

2013-11-01

Large physical activity surveillance projects such as the UK Biobank and NHANES are using wrist-worn accelerometer-based activity monitors that collect raw data. The goal is to increase wear time by asking subjects to wear the monitors on the wrist instead of the hip, and then to use information in the raw signal to improve activity type and intensity estimation. The purposes of this work was to obtain an algorithm to process wrist and ankle raw data and to classify behavior into four broad activity classes: ambulation, cycling, sedentary, and other activities. Participants (N = 33) wearing accelerometers on the wrist and ankle performed 26 daily activities. The accelerometer data were collected, cleaned, and preprocessed to extract features that characterize 2-, 4-, and 12.8-s data windows. Feature vectors encoding information about frequency and intensity of motion extracted from analysis of the raw signal were used with a support vector machine classifier to identify a subject's activity. Results were compared with categories classified by a human observer. Algorithms were validated using a leave-one-subject-out strategy. The computational complexity of each processing step was also evaluated. With 12.8-s windows, the proposed strategy showed high classification accuracies for ankle data (95.0%) that decreased to 84.7% for wrist data. Shorter (4 s) windows only minimally decreased performances of the algorithm on the wrist to 84.2%. A classification algorithm using 13 features shows good classification into the four classes given the complexity of the activities in the original data set. The algorithm is computationally efficient and could be implemented in real time on mobile devices with only 4-s latency.
Medical Imaging Lesion Detection Based on Unified Gravitational Fuzzy Clustering

PubMed Central

Vianney Kinani, Jean Marie; Gallegos Funes, Francisco; Mújica Vargas, Dante; Ramos Díaz, Eduardo; Arellano, Alfonso

2017-01-01

We develop a swift, robust, and practical tool for detecting brain lesions with minimal user intervention to assist clinicians and researchers in the diagnosis process, radiosurgery planning, and assessment of the patient's response to the therapy. We propose a unified gravitational fuzzy clustering-based segmentation algorithm, which integrates the Newtonian concept of gravity into fuzzy clustering. We first perform fuzzy rule-based image enhancement on our database which is comprised of T1/T2 weighted magnetic resonance (MR) and fluid-attenuated inversion recovery (FLAIR) images to facilitate a smoother segmentation. The scalar output obtained is fed into a gravitational fuzzy clustering algorithm, which separates healthy structures from the unhealthy. Finally, the lesion contour is automatically outlined through the initialization-free level set evolution method. An advantage of this lesion detection algorithm is its precision and its simultaneous use of features computed from the intensity properties of the MR scan in a cascading pattern, which makes the computation fast, robust, and self-contained. Furthermore, we validate our algorithm with large-scale experiments using clinical and synthetic brain lesion datasets. As a result, an 84%–93% overlap performance is obtained, with an emphasis on robustness with respect to different and heterogeneous types of lesion and a swift computation time. PMID:29158887
Electron transport theory in magnetic nanostructures

NASA Astrophysics Data System (ADS)

Choy, Tat-Sang

Magnetic nanostructure has been a new trend because of its application in making magnetic sensors, magnetic memories, and magnetic reading heads in hard disks drives. Although a variety of nanostructures have been realized in experiments in recent years by innovative sample growth techniques, the theoretical study of these devices remain a challenge. On one hand, atomic scale modeling is often required for studying the magnetic nanostructures; on the other, these structures often have a dimension on the order of one micrometer, which makes the calculation numerically intensive. In this work, we have studied the electron transport theory in magnetic nanostructures, with special attention to the giant magnetoresistance (GMR) structure. We have developed a model that includes the details of the band structure and disorder, both of which are both important in obtaining the conductivity. We have also developed an efficient algorithm to compute the conductivity in magnetic nanostructures. The model and the algorithm are general and can be applied to complicated structures. We have applied the theory to current-perpendicular-to-plane GMR structures and the results agree with experiments. Finally, we have searched for the atomic configuration with the highest GMR using the simulated annealing algorithm. This method is computationally intensive because we have to compute the GMR for 103 to 104 configurations. However it is still very efficient because the number of steps it takes to find the maximum is much smaller than the number of all possible GMR structures. We found that ultra-thin NiCu superlattices have surprisingly large GMR even at the moderate disorder in experiments. This finding may be useful in improving the GMR technology.
Nonlinear histogram binning for quantitative analysis of lung tissue fibrosis in high-resolution CT data

NASA Astrophysics Data System (ADS)

Zavaletta, Vanessa A.; Bartholmai, Brian J.; Robb, Richard A.

2007-03-01

Diffuse lung diseases, such as idiopathic pulmonary fibrosis (IPF), can be characterized and quantified by analysis of volumetric high resolution CT scans of the lungs. These data sets typically have dimensions of 512 x 512 x 400. It is too subjective and labor intensive for a radiologist to analyze each slice and quantify regional abnormalities manually. Thus, computer aided techniques are necessary, particularly texture analysis techniques which classify various lung tissue types. Second and higher order statistics which relate the spatial variation of the intensity values are good discriminatory features for various textures. The intensity values in lung CT scans range between [-1024, 1024]. Calculation of second order statistics on this range is too computationally intensive so the data is typically binned between 16 or 32 gray levels. There are more effective ways of binning the gray level range to improve classification. An optimal and very efficient way to nonlinearly bin the histogram is to use a dynamic programming algorithm. The objective of this paper is to show that nonlinear binning using dynamic programming is computationally efficient and improves the discriminatory power of the second and higher order statistics for more accurate quantification of diffuse lung disease.
The estimation of pointing angle and normalized surface scattering cross section from GEOS-3 radar altimeter measurements

NASA Technical Reports Server (NTRS)

Brown, G. S.; Curry, W. J.

1977-01-01

The statistical error of the pointing angle estimation technique is determined as a function of the effective receiver signal to noise ratio. Other sources of error are addressed and evaluated with inadequate calibration being of major concern. The impact of pointing error on the computation of normalized surface scattering cross section (sigma) from radar and the waveform attitude induced altitude bias is considered and quantitative results are presented. Pointing angle and sigma processing algorithms are presented along with some initial data. The intensive mode clean vs. clutter AGC calibration problem is analytically resolved. The use clutter AGC data in the intensive mode is confirmed as the correct calibration set for the sigma computations.
Online System for Faster Multipoint Linkage Analysis via Parallel Execution on Thousands of Personal Computers

PubMed Central

Silberstein, M.; Tzemach, A.; Dovgolevsky, N.; Fishelson, M.; Schuster, A.; Geiger, D.

2006-01-01

Computation of LOD scores is a valuable tool for mapping disease-susceptibility genes in the study of Mendelian and complex diseases. However, computation of exact multipoint likelihoods of large inbred pedigrees with extensive missing data is often beyond the capabilities of a single computer. We present a distributed system called “SUPERLINK-ONLINE,” for the computation of multipoint LOD scores of large inbred pedigrees. It achieves high performance via the efficient parallelization of the algorithms in SUPERLINK, a state-of-the-art serial program for these tasks, and through the use of the idle cycles of thousands of personal computers. The main algorithmic challenge has been to efficiently split a large task for distributed execution in a highly dynamic, nondedicated running environment. Notably, the system is available online, which allows computationally intensive analyses to be performed with no need for either the installation of software or the maintenance of a complicated distributed environment. As the system was being developed, it was extensively tested by collaborating medical centers worldwide on a variety of real data sets, some of which are presented in this article. PMID:16685644
Parallel Markov chain Monte Carlo - bridging the gap to high-performance Bayesian computation in animal breeding and genetics.

PubMed

Wu, Xiao-Lin; Sun, Chuanyu; Beissinger, Timothy M; Rosa, Guilherme Jm; Weigel, Kent A; Gatti, Natalia de Leon; Gianola, Daniel

2012-09-25

Most Bayesian models for the analysis of complex traits are not analytically tractable and inferences are based on computationally intensive techniques. This is true of Bayesian models for genome-enabled selection, which uses whole-genome molecular data to predict the genetic merit of candidate animals for breeding purposes. In this regard, parallel computing can overcome the bottlenecks that can arise from series computing. Hence, a major goal of the present study is to bridge the gap to high-performance Bayesian computation in the context of animal breeding and genetics. Parallel Monte Carlo Markov chain algorithms and strategies are described in the context of animal breeding and genetics. Parallel Monte Carlo algorithms are introduced as a starting point including their applications to computing single-parameter and certain multiple-parameter models. Then, two basic approaches for parallel Markov chain Monte Carlo are described: one aims at parallelization within a single chain; the other is based on running multiple chains, yet some variants are discussed as well. Features and strategies of the parallel Markov chain Monte Carlo are illustrated using real data, including a large beef cattle dataset with 50K SNP genotypes. Parallel Markov chain Monte Carlo algorithms are useful for computing complex Bayesian models, which does not only lead to a dramatic speedup in computing but can also be used to optimize model parameters in complex Bayesian models. Hence, we anticipate that use of parallel Markov chain Monte Carlo will have a profound impact on revolutionizing the computational tools for genomic selection programs.
Parallel Markov chain Monte Carlo - bridging the gap to high-performance Bayesian computation in animal breeding and genetics

PubMed Central

2012-01-01

Background Most Bayesian models for the analysis of complex traits are not analytically tractable and inferences are based on computationally intensive techniques. This is true of Bayesian models for genome-enabled selection, which uses whole-genome molecular data to predict the genetic merit of candidate animals for breeding purposes. In this regard, parallel computing can overcome the bottlenecks that can arise from series computing. Hence, a major goal of the present study is to bridge the gap to high-performance Bayesian computation in the context of animal breeding and genetics. Results Parallel Monte Carlo Markov chain algorithms and strategies are described in the context of animal breeding and genetics. Parallel Monte Carlo algorithms are introduced as a starting point including their applications to computing single-parameter and certain multiple-parameter models. Then, two basic approaches for parallel Markov chain Monte Carlo are described: one aims at parallelization within a single chain; the other is based on running multiple chains, yet some variants are discussed as well. Features and strategies of the parallel Markov chain Monte Carlo are illustrated using real data, including a large beef cattle dataset with 50K SNP genotypes. Conclusions Parallel Markov chain Monte Carlo algorithms are useful for computing complex Bayesian models, which does not only lead to a dramatic speedup in computing but can also be used to optimize model parameters in complex Bayesian models. Hence, we anticipate that use of parallel Markov chain Monte Carlo will have a profound impact on revolutionizing the computational tools for genomic selection programs. PMID:23009363
Three-dimensional digital mapping of the optic nerve head cupping in glaucoma

NASA Astrophysics Data System (ADS)

Mitra, Sunanda; Ramirez, Manuel; Morales, Jose

1992-08-01

Visualization of the optic nerve head cupping is clinically achieved by stereoscopic viewing of a fundus image pair of the suspected eye. A novel algorithm for three-dimensional digital surface representation of the optic nerve head, using fusion of stereo depth map with a linearly stretched intensity image of a stereo fundus image pair, is presented. Prior to depth map acquisition, a number of preprocessing tasks including feature extraction, registration by cepstral analysis, and correction for intensity variations are performed. The depth map is obtained by using a coarse to fine strategy for obtaining disparities between corresponding areas. The required matching techniques to obtain the translational differences in every step, uses cepstral analysis and correlation-like scanning technique in the spatial domain for the finest details. The quantitative and precise representation of the optic nerve head surface topography following this algorithm is not computationally intensive and should provide more useful information than just qualitative stereoscopic viewing of the fundus as one of the diagnostic criteria for diagnosis of glaucoma.
A robust close-range photogrammetric target extraction algorithm for size and type variant targets

NASA Astrophysics Data System (ADS)

Nyarko, Kofi; Thomas, Clayton; Torres, Gilbert

2016-05-01

The Photo-G program conducted by Naval Air Systems Command at the Atlantic Test Range in Patuxent River, Maryland, uses photogrammetric analysis of large amounts of real-world imagery to characterize the motion of objects in a 3-D scene. Current approaches involve several independent processes including target acquisition, target identification, 2-D tracking of image features, and 3-D kinematic state estimation. Each process has its own inherent complications and corresponding degrees of both human intervention and computational complexity. One approach being explored for automated target acquisition relies on exploiting the pixel intensity distributions of photogrammetric targets, which tend to be patterns with bimodal intensity distributions. The bimodal distribution partitioning algorithm utilizes this distribution to automatically deconstruct a video frame into regions of interest (ROI) that are merged and expanded to target boundaries, from which ROI centroids are extracted to mark target acquisition points. This process has proved to be scale, position and orientation invariant, as well as fairly insensitive to global uniform intensity disparities.
Parallel matrix multiplication on the Connection Machine

NASA Technical Reports Server (NTRS)

Tichy, Walter F.

1988-01-01

Matrix multiplication is a computation and communication intensive problem. Six parallel algorithms for matrix multiplication on the Connection Machine are presented and compared with respect to their performance and processor usage. For n by n matrices, the algorithms have theoretical running times of O(n to the 2nd power log n), O(n log n), O(n), and O(log n), and require n, n to the 2nd power, n to the 2nd power, and n to the 3rd power processors, respectively. With careful attention to communication patterns, the theoretically predicted runtimes can indeed be achieved in practice. The parallel algorithms illustrate the tradeoffs between performance, communication cost, and processor usage.
On the Multilevel Solution Algorithm for Markov Chains

NASA Technical Reports Server (NTRS)

Horton, Graham

1997-01-01

We discuss the recently introduced multilevel algorithm for the steady-state solution of Markov chains. The method is based on an aggregation principle which is well established in the literature and features a multiplicative coarse-level correction. Recursive application of the aggregation principle, which uses an operator-dependent coarsening, yields a multi-level method which has been shown experimentally to give results significantly faster than the typical methods currently in use. When cast as a multigrid-like method, the algorithm is seen to be a Galerkin-Full Approximation Scheme with a solution-dependent prolongation operator. Special properties of this prolongation lead to the cancellation of the computationally intensive terms of the coarse-level equations.
Computation of transmitted and received B1 fields in magnetic resonance imaging.

PubMed

Milles, Julien; Zhu, Yue Min; Chen, Nan-Kuei; Panych, Lawrence P; Gimenez, Gérard; Guttmann, Charles R G

2006-05-01

Computation of B1 fields is a key issue for determination and correction of intensity nonuniformity in magnetic resonance images. This paper presents a new method for computing transmitted and received B1 fields. Our method combines a modified MRI acquisition protocol and an estimation technique based on the Levenberg-Marquardt algorithm and spatial filtering. It enables accurate estimation of transmitted and received B1 fields for both homogeneous and heterogeneous objects. The method is validated using numerical simulations and experimental data from phantom and human scans. The experimental results are in agreement with theoretical expectations.
A supportive architecture for CFD-based design optimisation

NASA Astrophysics Data System (ADS)

Li, Ni; Su, Zeya; Bi, Zhuming; Tian, Chao; Ren, Zhiming; Gong, Guanghong

2014-03-01

Multi-disciplinary design optimisation (MDO) is one of critical methodologies to the implementation of enterprise systems (ES). MDO requiring the analysis of fluid dynamics raises a special challenge due to its extremely intensive computation. The rapid development of computational fluid dynamic (CFD) technique has caused a rise of its applications in various fields. Especially for the exterior designs of vehicles, CFD has become one of the three main design tools comparable to analytical approaches and wind tunnel experiments. CFD-based design optimisation is an effective way to achieve the desired performance under the given constraints. However, due to the complexity of CFD, integrating with CFD analysis in an intelligent optimisation algorithm is not straightforward. It is a challenge to solve a CFD-based design problem, which is usually with high dimensions, and multiple objectives and constraints. It is desirable to have an integrated architecture for CFD-based design optimisation. However, our review on existing works has found that very few researchers have studied on the assistive tools to facilitate CFD-based design optimisation. In the paper, a multi-layer architecture and a general procedure are proposed to integrate different CFD toolsets with intelligent optimisation algorithms, parallel computing technique and other techniques for efficient computation. In the proposed architecture, the integration is performed either at the code level or data level to fully utilise the capabilities of different assistive tools. Two intelligent algorithms are developed and embedded with parallel computing. These algorithms, together with the supportive architecture, lay a solid foundation for various applications of CFD-based design optimisation. To illustrate the effectiveness of the proposed architecture and algorithms, the case studies on aerodynamic shape design of a hypersonic cruising vehicle are provided, and the result has shown that the proposed architecture and developed algorithms have performed successfully and efficiently in dealing with the design optimisation with over 200 design variables.
High-speed parallel implementation of a modified PBR algorithm on DSP-based EH topology

NASA Astrophysics Data System (ADS)

Rajan, K.; Patnaik, L. M.; Ramakrishna, J.

1997-08-01

Algebraic Reconstruction Technique (ART) is an age-old method used for solving the problem of three-dimensional (3-D) reconstruction from projections in electron microscopy and radiology. In medical applications, direct 3-D reconstruction is at the forefront of investigation. The simultaneous iterative reconstruction technique (SIRT) is an ART-type algorithm with the potential of generating in a few iterations tomographic images of a quality comparable to that of convolution backprojection (CBP) methods. Pixel-based reconstruction (PBR) is similar to SIRT reconstruction, and it has been shown that PBR algorithms give better quality pictures compared to those produced by SIRT algorithms. In this work, we propose a few modifications to the PBR algorithms. The modified algorithms are shown to give better quality pictures compared to PBR algorithms. The PBR algorithm and the modified PBR algorithms are highly compute intensive, Not many attempts have been made to reconstruct objects in the true 3-D sense because of the high computational overhead. In this study, we have developed parallel two-dimensional (2-D) and 3-D reconstruction algorithms based on modified PBR. We attempt to solve the two problems encountered by the PBR and modified PBR algorithms, i.e., the long computational time and the large memory requirements, by parallelizing the algorithm on a multiprocessor system. We investigate the possible task and data partitioning schemes by exploiting the potential parallelism in the PBR algorithm subject to minimizing the memory requirement. We have implemented an extended hypercube (EH) architecture for the high-speed execution of the 3-D reconstruction algorithm using the commercially available fast floating point digital signal processor (DSP) chips as the processing elements (PEs) and dual-port random access memories (DPR) as channels between the PEs. We discuss and compare the performances of the PBR algorithm on an IBM 6000 RISC workstation, on a Silicon Graphics Indigo 2 workstation, and on an EH system. The results show that an EH(3,1) using DSP chips as PEs executes the modified PBR algorithm about 100 times faster than an LBM 6000 RISC workstation. We have executed the algorithms on a 4-node IBM SP2 parallel computer. The results show that execution time of the algorithm on an EH(3,1) is better than that of a 4-node IBM SP2 system. The speed-up of an EH(3,1) system with eight PEs and one network controller is approximately 7.85.
Combined mine tremors source location and error evaluation in the Lubin Copper Mine (Poland)

NASA Astrophysics Data System (ADS)

Leśniak, Andrzej; Pszczoła, Grzegorz

2008-08-01

A modified method of mine tremors location used in Lubin Copper Mine is presented in the paper. In mines where an intensive exploration is carried out a high accuracy source location technique is usually required. The effect of the flatness of the geophones array, complex geological structure of the rock mass and intense exploitation make the location results ambiguous in such mines. In the present paper an effective method of source location and location's error evaluations are presented, combining data from two different arrays of geophones. The first consists of uniaxial geophones spaced in the whole mine area. The second is installed in one of the mining panels and consists of triaxial geophones. The usage of the data obtained from triaxial geophones allows to increase the hypocenter vertical coordinate precision. The presented two-step location procedure combines standard location methods: P-waves directions and P-waves arrival times. Using computer simulations the efficiency of the created algorithm was tested. The designed algorithm is fully non-linear and was tested on the multilayered rock mass model of the Lubin Copper Mine, showing a computational better efficiency than the traditional P-wave arrival times location algorithm. In this paper we present the complete procedure that effectively solves the non-linear location problems, i.e. the mine tremor location and measurement of the error propagation.

Variational Calculations of Ro-Vibrational Energy Levels and Transition Intensities for Tetratomic Molecules

NASA Technical Reports Server (NTRS)

Schwenke, David W.; Langhoff, Stephen R. (Technical Monitor)

1995-01-01

A description is given of an algorithm for computing ro-vibrational energy levels for tetratomic molecules. The expressions required for evaluating transition intensities are also given. The variational principle is used to determine the energy levels and the kinetic energy operator is simple and evaluated exactly. The computational procedure is split up into the determination of one dimensional radial basis functions, the computation of a contracted rotational-bending basis, followed by a final variational step coupling all degrees of freedom. An angular basis is proposed whereby the rotational-bending contraction takes place in three steps. Angular matrix elements of the potential are evaluated by expansion in terms of a suitable basis and the angular integrals are given in a factorized form which simplifies their evaluation. The basis functions in the final variational step have the full permutation symmetries of the identical particles. Sample results are given for HCCH and BH3.
Computer Simulation for Calculating the Second-Order Correlation Function of Classical and Quantum Light

ERIC Educational Resources Information Center

Facao, M.; Lopes, A.; Silva, A. L.; Silva, P.

2011-01-01

We propose an undergraduate numerical project for simulating the results of the second-order correlation function as obtained by an intensity interference experiment for two kinds of light, namely bunched light with Gaussian or Lorentzian power density spectrum and antibunched light obtained from single-photon sources. While the algorithm for…
Franck-Condon Factors for Diatomics: Insights and Analysis Using the Fourier Grid Hamiltonian Method

ERIC Educational Resources Information Center

Ghosh, Supriya; Dixit, Mayank Kumar; Bhattacharyya, S. P.; Tembe, B. L.

2013-01-01

Franck-Condon factors (FCFs) play a crucial role in determining the intensities of the vibrational bands in electronic transitions. In this article, a relatively simple method to calculate the FCFs is illustrated. An algorithm for the Fourier Grid Hamiltonian (FGH) method for computing the vibrational wave functions and the corresponding energy…
Point Target Detection in IR Image Sequences using Spatio-Temporal Hypotheses Testing

DTIC Science & Technology

1999-02-01

incorporate temporal as well as spatial infor- mation, they are often referred to as \\ track before detect " algorithms. The standard approach was to pose the...6, 3]. A drawback of these track - before - detect techniques is that they are very computationally intensive since the entire 3-D space must be ltered
Investigating power capping toward energy-efficient scientific applications: Investigating Power Capping toward Energy-Efficient Scientific Applications

DOE PAGES

Haidar, Azzam; Jagode, Heike; Vaccaro, Phil; ...

2018-03-22

The emergence of power efficiency as a primary constraint in processor and system design poses new challenges concerning power and energy awareness for numerical libraries and scientific applications. Power consumption also plays a major role in the design of data centers, which may house petascale or exascale-level computing systems. At these extreme scales, understanding and improving the energy efficiency of numerical libraries and their related applications becomes a crucial part of the successful implementation and operation of the computing system. In this paper, we study and investigate the practice of controlling a compute system's power usage, and we explore howmore » different power caps affect the performance of numerical algorithms with different computational intensities. Further, we determine the impact, in terms of performance and energy usage, that these caps have on a system running scientific applications. This analysis will enable us to characterize the types of algorithms that benefit most from these power management schemes. Our experiments are performed using a set of representative kernels and several popular scientific benchmarks. Lastly, we quantify a number of power and performance measurements and draw observations and conclusions that can be viewed as a roadmap to achieving energy efficiency in the design and execution of scientific algorithms.« less
Investigating power capping toward energy-efficient scientific applications: Investigating Power Capping toward Energy-Efficient Scientific Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Haidar, Azzam; Jagode, Heike; Vaccaro, Phil

The emergence of power efficiency as a primary constraint in processor and system design poses new challenges concerning power and energy awareness for numerical libraries and scientific applications. Power consumption also plays a major role in the design of data centers, which may house petascale or exascale-level computing systems. At these extreme scales, understanding and improving the energy efficiency of numerical libraries and their related applications becomes a crucial part of the successful implementation and operation of the computing system. In this paper, we study and investigate the practice of controlling a compute system's power usage, and we explore howmore » different power caps affect the performance of numerical algorithms with different computational intensities. Further, we determine the impact, in terms of performance and energy usage, that these caps have on a system running scientific applications. This analysis will enable us to characterize the types of algorithms that benefit most from these power management schemes. Our experiments are performed using a set of representative kernels and several popular scientific benchmarks. Lastly, we quantify a number of power and performance measurements and draw observations and conclusions that can be viewed as a roadmap to achieving energy efficiency in the design and execution of scientific algorithms.« less
High-Performance Java Codes for Computational Fluid Dynamics

NASA Technical Reports Server (NTRS)

Riley, Christopher; Chatterjee, Siddhartha; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

2001-01-01

The computational science community is reluctant to write large-scale computationally -intensive applications in Java due to concerns over Java's poor performance, despite the claimed software engineering advantages of its object-oriented features. Naive Java implementations of numerical algorithms can perform poorly compared to corresponding Fortran or C implementations. To achieve high performance, Java applications must be designed with good performance as a primary goal. This paper presents the object-oriented design and implementation of two real-world applications from the field of Computational Fluid Dynamics (CFD): a finite-volume fluid flow solver (LAURA, from NASA Langley Research Center), and an unstructured mesh adaptation algorithm (2D_TAG, from NASA Ames Research Center). This work builds on our previous experience with the design of high-performance numerical libraries in Java. We examine the performance of the applications using the currently available Java infrastructure and show that the Java version of the flow solver LAURA performs almost within a factor of 2 of the original procedural version. Our Java version of the mesh adaptation algorithm 2D_TAG performs within a factor of 1.5 of its original procedural version on certain platforms. Our results demonstrate that object-oriented software design principles are not necessarily inimical to high performance.
Comparison of maximum intensity projection and digitally reconstructed radiographic projection for carotid artery stenosis measurement

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hyde, Derek E.; Habets, Damiaan F.; Fox, Allan J.

2007-07-15

Digital subtraction angiography is being supplanted by three-dimensional imaging techniques in many clinical applications, leading to extensive use of maximum intensity projection (MIP) images to depict volumetric vascular data. The MIP algorithm produces intensity profiles that are different than conventional angiograms, and can also increase the vessel-to-tissue contrast-to-noise ratio. We evaluated the effect of the MIP algorithm in a clinical application where quantitative vessel measurement is important: internal carotid artery stenosis grading. Three-dimensional computed rotational angiography (CRA) was performed on 26 consecutive symptomatic patients to verify an internal carotid artery stenosis originally found using duplex ultrasound. These volumes of datamore » were visualized using two different postprocessing projection techniques: MIP and digitally reconstructed radiographic (DRR) projection. A DRR is a radiographic image simulating a conventional digitally subtracted angiogram, but it is derived computationally from the same CRA dataset as the MIP. By visualizing a single volume with two different projection techniques, the postprocessing effect of the MIP algorithm is isolated. Vessel measurements were made, according to the NASCET guidelines, and percentage stenosis grades were calculated. The paired t-test was used to determine if the measurement difference between the two techniques was statistically significant. The CRA technique provided an isotropic voxel spacing of 0.38 mm. The MIPs and DRRs had a mean signal-difference-to-noise-ratio of 30:1 and 26:1, respectively. Vessel measurements from MIPs were, on average, 0.17 mm larger than those from DRRs (P<0.0001). The NASCET-type stenosis grades tended to be underestimated on average by 2.4% with the MIP algorithm, although this was not statistically significant (P=0.09). The mean interobserver variability (standard deviation) of both the MIP and DRR images was 0.35 mm. It was concluded that the MIP algorithm slightly increased the apparent dimensions of the arteries, when applied to these intra-arterial CRA images. This subpixel increase was smaller than both the voxel size and interobserver variability, and was therefore not clinically relevant.« less
Whole brain diffeomorphic metric mapping via integration of sulcal and gyral curves, cortical surfaces, and images

PubMed Central

Du, Jia; Younes, Laurent; Qiu, Anqi

2011-01-01

This paper introduces a novel large deformation diffeomorphic metric mapping algorithm for whole brain registration where sulcal and gyral curves, cortical surfaces, and intensity images are simultaneously carried from one subject to another through a flow of diffeomorphisms. To the best of our knowledge, this is the first time that the diffeomorphic metric from one brain to another is derived in a shape space of intensity images and point sets (such as curves and surfaces) in a unified manner. We describe the Euler–Lagrange equation associated with this algorithm with respect to momentum, a linear transformation of the velocity vector field of the diffeomorphic flow. The numerical implementation for solving this variational problem, which involves large-scale kernel convolution in an irregular grid, is made feasible by introducing a class of computationally friendly kernels. We apply this algorithm to align magnetic resonance brain data. Our whole brain mapping results show that our algorithm outperforms the image-based LDDMM algorithm in terms of the mapping accuracy of gyral/sulcal curves, sulcal regions, and cortical and subcortical segmentation. Moreover, our algorithm provides better whole brain alignment than combined volumetric and surface registration (Postelnicu et al., 2009) and hierarchical attribute matching mechanism for elastic registration (HAMMER) (Shen and Davatzikos, 2002) in terms of cortical and subcortical volume segmentation. PMID:21281722
A Rare Case of Malignant Melanoma of the Mandible:  CT and MRI Findings.

PubMed

Ogura, Ichiro; Sasaki, Yoshihiko; Kameta, Ayako; Sue, Mikiko; Oda, Takaaki

Malignant melanoma of the mandibular gingiva is extremely rare. It is a malignant tumour of melanocytes or their precursor cells, and often misinterpreted as a benign pigmented process. A few reports have described computed tomography (CT) and magnetic resonance imaging (MRI) findings of malignant melanoma in the oral cavity. We report a rare case of malignant melanoma of the mandible and the related CT and MRI findings. Soft tissue algorithm contrast-enhanced CT showed an expansile mass and irregular destruction of alveolar bone in the right side of the mandibular molar area. MR images showed an enhancing mass and the tumour had a low to intermediate signal intensity and a high-signal intensity. Soft tissue algorithm contrast-enhanced CT and MR images showed lymphadenopathy involving the submandibular lymph nodes. Histopathological examination confirmed the diagnosis of malignant melanoma.
Multimodal computational microscopy based on transport of intensity equation

NASA Astrophysics Data System (ADS)

Li, Jiaji; Chen, Qian; Sun, Jiasong; Zhang, Jialin; Zuo, Chao

2016-12-01

Transport of intensity equation (TIE) is a powerful tool for phase retrieval and quantitative phase imaging, which requires intensity measurements only at axially closely spaced planes without a separate reference beam. It does not require coherent illumination and works well on conventional bright-field microscopes. The quantitative phase reconstructed by TIE gives valuable information that has been encoded in the complex wave field by passage through a sample of interest. Such information may provide tremendous flexibility to emulate various microscopy modalities computationally without requiring specialized hardware components. We develop a requisite theory to describe such a hybrid computational multimodal imaging system, which yields quantitative phase, Zernike phase contrast, differential interference contrast, and light field moment imaging, simultaneously. It makes the various observations for biomedical samples easy. Then we give the experimental demonstration of these ideas by time-lapse imaging of live HeLa cell mitosis. Experimental results verify that a tunable lens-based TIE system, combined with the appropriate postprocessing algorithm, can achieve a variety of promising imaging modalities in parallel with the quantitative phase images for the dynamic study of cellular processes.
TADSim: Discrete Event-based Performance Prediction for Temperature Accelerated Dynamics

DOE PAGES

Mniszewski, Susan M.; Junghans, Christoph; Voter, Arthur F.; ...

2015-04-16

Next-generation high-performance computing will require more scalable and flexible performance prediction tools to evaluate software--hardware co-design choices relevant to scientific applications and hardware architectures. Here, we present a new class of tools called application simulators—parameterized fast-running proxies of large-scale scientific applications using parallel discrete event simulation. Parameterized choices for the algorithmic method and hardware options provide a rich space for design exploration and allow us to quickly find well-performing software--hardware combinations. We demonstrate our approach with a TADSim simulator that models the temperature-accelerated dynamics (TAD) method, an algorithmically complex and parameter-rich member of the accelerated molecular dynamics (AMD) family ofmore » molecular dynamics methods. The essence of the TAD application is captured without the computational expense and resource usage of the full code. We accomplish this by identifying the time-intensive elements, quantifying algorithm steps in terms of those elements, abstracting them out, and replacing them by the passage of time. We use TADSim to quickly characterize the runtime performance and algorithmic behavior for the otherwise long-running simulation code. We extend TADSim to model algorithm extensions, such as speculative spawning of the compute-bound stages, and predict performance improvements without having to implement such a method. Validation against the actual TAD code shows close agreement for the evolution of an example physical system, a silver surface. Finally, focused parameter scans have allowed us to study algorithm parameter choices over far more scenarios than would be possible with the actual simulation. This has led to interesting performance-related insights and suggested extensions.« less
Color segmentation in the HSI color space using the K-means algorithm

NASA Astrophysics Data System (ADS)

Weeks, Arthur R.; Hague, G. Eric

1997-04-01

Segmentation of images is an important aspect of image recognition. While grayscale image segmentation has become quite a mature field, much less work has been done with regard to color image segmentation. Until recently, this was predominantly due to the lack of available computing power and color display hardware that is required to manipulate true color images (24-bit). TOday, it is not uncommon to find a standard desktop computer system with a true-color 24-bit display, at least 8 million bytes of memory, and 2 gigabytes of hard disk storage. Segmentation of color images is not as simple as segmenting each of the three RGB color components separately. The difficulty of using the RGB color space is that it doesn't closely model the psychological understanding of color. A better color model, which closely follows that of human visual perception is the hue, saturation, intensity model. This color model separates the color components in terms of chromatic and achromatic information. Strickland et al. was able to show the importance of color in the extraction of edge features form an image. His method enhances the edges that are detectable in the luminance image with information from the saturation image. Segmentation of both the saturation and intensity components is easily accomplished with any gray scale segmentation algorithm, since these spaces are linear. The modulus 2(pi) nature of the hue color component makes its segmentation difficult. For example, a hue of 0 and 2(pi) yields the same color tint. Instead of applying separate image segmentation to each of the hue, saturation, and intensity components, a better method is to segment the chromatic component separately from the intensity component because of the importance that the chromatic information plays in the segmentation of color images. This paper presents a method of using the gray scale K-means algorithm to segment 24-bit color images. Additionally, this paper will show the importance the hue component plays in the segmentation of color images.
Approximate Bayesian Computation in the estimation of the parameters of the Forbush decrease model

NASA Astrophysics Data System (ADS)

Wawrzynczak, A.; Kopka, P.

2017-12-01

Realistic modeling of the complicated phenomena as Forbush decrease of the galactic cosmic ray intensity is a quite challenging task. One aspect is a numerical solution of the Fokker-Planck equation in five-dimensional space (three spatial variables, the time and particles energy). The second difficulty arises from a lack of detailed knowledge about the spatial and time profiles of the parameters responsible for the creation of the Forbush decrease. Among these parameters, the central role plays a diffusion coefficient. Assessment of the correctness of the proposed model can be done only by comparison of the model output with the experimental observations of the galactic cosmic ray intensity. We apply the Approximate Bayesian Computation (ABC) methodology to match the Forbush decrease model to experimental data. The ABC method is becoming increasing exploited for dynamic complex problems in which the likelihood function is costly to compute. The main idea of all ABC methods is to accept samples as an approximate posterior draw if its associated modeled data are close enough to the observed one. In this paper, we present application of the Sequential Monte Carlo Approximate Bayesian Computation algorithm scanning the space of the diffusion coefficient parameters. The proposed algorithm is adopted to create the model of the Forbush decrease observed by the neutron monitors at the Earth in March 2002. The model of the Forbush decrease is based on the stochastic approach to the solution of the Fokker-Planck equation.
FHSA-SED: Two-Locus Model Detection for Genome-Wide Association Study with Harmony Search Algorithm.

PubMed

Tuo, Shouheng; Zhang, Junying; Yuan, Xiguo; Zhang, Yuanyuan; Liu, Zhaowen

2016-01-01

Two-locus model is a typical significant disease model to be identified in genome-wide association study (GWAS). Due to intensive computational burden and diversity of disease models, existing methods have drawbacks on low detection power, high computation cost, and preference for some types of disease models. In this study, two scoring functions (Bayesian network based K2-score and Gini-score) are used for characterizing two SNP locus as a candidate model, the two criteria are adopted simultaneously for improving identification power and tackling the preference problem to disease models. Harmony search algorithm (HSA) is improved for quickly finding the most likely candidate models among all two-locus models, in which a local search algorithm with two-dimensional tabu table is presented to avoid repeatedly evaluating some disease models that have strong marginal effect. Finally G-test statistic is used to further test the candidate models. We investigate our method named FHSA-SED on 82 simulated datasets and a real AMD dataset, and compare it with two typical methods (MACOED and CSE) which have been developed recently based on swarm intelligent search algorithm. The results of simulation experiments indicate that our method outperforms the two compared algorithms in terms of detection power, computation time, evaluation times, sensitivity (TPR), specificity (SPC), positive predictive value (PPV) and accuracy (ACC). Our method has identified two SNPs (rs3775652 and rs10511467) that may be also associated with disease in AMD dataset.
FHSA-SED: Two-Locus Model Detection for Genome-Wide Association Study with Harmony Search Algorithm

PubMed Central

Tuo, Shouheng; Zhang, Junying; Yuan, Xiguo; Zhang, Yuanyuan; Liu, Zhaowen

2016-01-01

Motivation Two-locus model is a typical significant disease model to be identified in genome-wide association study (GWAS). Due to intensive computational burden and diversity of disease models, existing methods have drawbacks on low detection power, high computation cost, and preference for some types of disease models. Method In this study, two scoring functions (Bayesian network based K2-score and Gini-score) are used for characterizing two SNP locus as a candidate model, the two criteria are adopted simultaneously for improving identification power and tackling the preference problem to disease models. Harmony search algorithm (HSA) is improved for quickly finding the most likely candidate models among all two-locus models, in which a local search algorithm with two-dimensional tabu table is presented to avoid repeatedly evaluating some disease models that have strong marginal effect. Finally G-test statistic is used to further test the candidate models. Results We investigate our method named FHSA-SED on 82 simulated datasets and a real AMD dataset, and compare it with two typical methods (MACOED and CSE) which have been developed recently based on swarm intelligent search algorithm. The results of simulation experiments indicate that our method outperforms the two compared algorithms in terms of detection power, computation time, evaluation times, sensitivity (TPR), specificity (SPC), positive predictive value (PPV) and accuracy (ACC). Our method has identified two SNPs (rs3775652 and rs10511467) that may be also associated with disease in AMD dataset. PMID:27014873
Adaptive-weighted Total Variation Minimization for Sparse Data toward Low-dose X-ray Computed Tomography Image Reconstruction

PubMed Central

Liu, Yan; Ma, Jianhua; Fan, Yi; Liang, Zhengrong

2012-01-01

Previous studies have shown that by minimizing the total variation (TV) of the to-be-estimated image with some data and other constraints, a piecewise-smooth X-ray computed tomography (CT) can be reconstructed from sparse-view projection data without introducing noticeable artifacts. However, due to the piecewise constant assumption for the image, a conventional TV minimization algorithm often suffers from over-smoothness on the edges of the resulting image. To mitigate this drawback, we present an adaptive-weighted TV (AwTV) minimization algorithm in this paper. The presented AwTV model is derived by considering the anisotropic edge property among neighboring image voxels, where the associated weights are expressed as an exponential function and can be adaptively adjusted by the local image-intensity gradient for the purpose of preserving the edge details. Inspired by the previously-reported TV-POCS (projection onto convex sets) implementation, a similar AwTV-POCS implementation was developed to minimize the AwTV subject to data and other constraints for the purpose of sparse-view low-dose CT image reconstruction. To evaluate the presented AwTV-POCS algorithm, both qualitative and quantitative studies were performed by computer simulations and phantom experiments. The results show that the presented AwTV-POCS algorithm can yield images with several noticeable gains, in terms of noise-resolution tradeoff plots and full width at half maximum values, as compared to the corresponding conventional TV-POCS algorithm. PMID:23154621
Adaptive-weighted total variation minimization for sparse data toward low-dose x-ray computed tomography image reconstruction.

PubMed

Liu, Yan; Ma, Jianhua; Fan, Yi; Liang, Zhengrong

2012-12-07

Previous studies have shown that by minimizing the total variation (TV) of the to-be-estimated image with some data and other constraints, piecewise-smooth x-ray computed tomography (CT) can be reconstructed from sparse-view projection data without introducing notable artifacts. However, due to the piecewise constant assumption for the image, a conventional TV minimization algorithm often suffers from over-smoothness on the edges of the resulting image. To mitigate this drawback, we present an adaptive-weighted TV (AwTV) minimization algorithm in this paper. The presented AwTV model is derived by considering the anisotropic edge property among neighboring image voxels, where the associated weights are expressed as an exponential function and can be adaptively adjusted by the local image-intensity gradient for the purpose of preserving the edge details. Inspired by the previously reported TV-POCS (projection onto convex sets) implementation, a similar AwTV-POCS implementation was developed to minimize the AwTV subject to data and other constraints for the purpose of sparse-view low-dose CT image reconstruction. To evaluate the presented AwTV-POCS algorithm, both qualitative and quantitative studies were performed by computer simulations and phantom experiments. The results show that the presented AwTV-POCS algorithm can yield images with several notable gains, in terms of noise-resolution tradeoff plots and full-width at half-maximum values, as compared to the corresponding conventional TV-POCS algorithm.
GPU acceleration of a petascale application for turbulent mixing at high Schmidt number using OpenMP 4.5

NASA Astrophysics Data System (ADS)

Clay, M. P.; Buaria, D.; Yeung, P. K.; Gotoh, T.

2018-07-01

This paper reports on the successful implementation of a massively parallel GPU-accelerated algorithm for the direct numerical simulation of turbulent mixing at high Schmidt number. The work stems from a recent development (Comput. Phys. Commun., vol. 219, 2017, 313-328), in which a low-communication algorithm was shown to attain high degrees of scalability on the Cray XE6 architecture when overlapping communication and computation via dedicated communication threads. An even higher level of performance has now been achieved using OpenMP 4.5 on the Cray XK7 architecture, where on each node the 16 integer cores of an AMD Interlagos processor share a single Nvidia K20X GPU accelerator. In the new algorithm, data movements are minimized by performing virtually all of the intensive scalar field computations in the form of combined compact finite difference (CCD) operations on the GPUs. A memory layout in departure from usual practices is found to provide much better performance for a specific kernel required to apply the CCD scheme. Asynchronous execution enabled by adding the OpenMP 4.5 NOWAIT clause to TARGET constructs improves scalability when used to overlap computation on the GPUs with computation and communication on the CPUs. On the 27-petaflops supercomputer Titan at Oak Ridge National Laboratory, USA, a GPU-to-CPU speedup factor of approximately 5 is consistently observed at the largest problem size of 81923 grid points for the scalar field computed with 8192 XK7 nodes.
A Finite Element Method to Correct Deformable Image Registration Errors in Low-Contrast Regions

PubMed Central

Zhong, Hualiang; Kim, Jinkoo; Li, Haisen; Nurushev, Teamour; Movsas, Benjamin; Chetty, Indrin J.

2012-01-01

Image-guided adaptive radiotherapy requires deformable image registration to map radiation dose back and forth between images. The purpose of this study is to develop a novel method to improve the accuracy of an intensity-based image registration algorithm in low-contrast regions. A computational framework has been developed in this study to improve the quality of the “demons” registration. For each voxel in the registration’s target image, the standard deviation of image intensity in a neighborhood of this voxel was calculated. A mask for high-contrast regions was generated based on their standard deviations. In the masked regions, a tetrahedral mesh was refined recursively so that a sufficient number of tetrahedral nodes in these regions can be selected as driving nodes. An elastic system driven by the displacements of the selected nodes was formulated using a finite element method (FEM) and implemented on the refined mesh. The displacements of these driving nodes were generated with the “demons” algorithm. The solution of the system was derived using a conjugated gradient method, and interpolated to generate a displacement vector field for the registered images. The FEM correction method was compared with the “demons” algorithm on the CT images of lung and prostate patients. The performance of the FEM correction relating to the “demons” registration was analyzed based on the physical property of their deformation maps, and quantitatively evaluated through a benchmark model developed specifically for this study. Compared to the benchmark model, the “demons” registration has the maximum error of 1.2 cm, which can be corrected by the FEM method to 0.4 cm, and the average error of the “demons” registration is reduced from 0.17 cm to 0.11 cm. For the CT images of lung and prostate patients, the deformation maps generated by the “demons” algorithm were found unrealistic at several places. In these places, the displacement differences between the “demons” registrations and their FEM corrections were found in the range of 0.4 cm and 1.1cm. The mesh refinement and FEM simulation were implemented in a single thread application which requires about 45 minutes of computation time on a 2.6 GH computer. This study has demonstrated that the finite element method can be integrated with intensity-based image registration algorithms to improve their registration accuracy, especially in low-contrast regions. PMID:22581269

ASTEC: Controls analysis for personal computers

NASA Technical Reports Server (NTRS)

Downing, John P.; Bauer, Frank H.; Thorpe, Christopher J.

1989-01-01

The ASTEC (Analysis and Simulation Tools for Engineering Controls) software is under development at Goddard Space Flight Center (GSFC). The design goal is to provide a wide selection of controls analysis tools at the personal computer level, as well as the capability to upload compute-intensive jobs to a mainframe or supercomputer. The project is a follow-on to the INCA (INteractive Controls Analysis) program that has been developed at GSFC over the past five years. While ASTEC makes use of the algorithms and expertise developed for the INCA program, the user interface was redesigned to take advantage of the capabilities of the personal computer. The design philosophy and the current capabilities of the ASTEC software are described.
Accelerating Wright–Fisher Forward Simulations on the Graphics Processing Unit

PubMed Central

Lawrie, David S.

2017-01-01

Forward Wright–Fisher simulations are powerful in their ability to model complex demography and selection scenarios, but suffer from slow execution on the Central Processor Unit (CPU), thus limiting their usefulness. However, the single-locus Wright–Fisher forward algorithm is exceedingly parallelizable, with many steps that are so-called “embarrassingly parallel,” consisting of a vast number of individual computations that are all independent of each other and thus capable of being performed concurrently. The rise of modern Graphics Processing Units (GPUs) and programming languages designed to leverage the inherent parallel nature of these processors have allowed researchers to dramatically speed up many programs that have such high arithmetic intensity and intrinsic concurrency. The presented GPU Optimized Wright–Fisher simulation, or “GO Fish” for short, can be used to simulate arbitrary selection and demographic scenarios while running over 250-fold faster than its serial counterpart on the CPU. Even modest GPU hardware can achieve an impressive speedup of over two orders of magnitude. With simulations so accelerated, one can not only do quick parametric bootstrapping of previously estimated parameters, but also use simulated results to calculate the likelihoods and summary statistics of demographic and selection models against real polymorphism data, all without restricting the demographic and selection scenarios that can be modeled or requiring approximations to the single-locus forward algorithm for efficiency. Further, as many of the parallel programming techniques used in this simulation can be applied to other computationally intensive algorithms important in population genetics, GO Fish serves as an exciting template for future research into accelerating computation in evolution. GO Fish is part of the Parallel PopGen Package available at: http://dl42.github.io/ParallelPopGen/. PMID:28768689
Computer-automated evolution of an X-band antenna for NASA's Space Technology 5 mission.

PubMed

Hornby, Gregory S; Lohn, Jason D; Linden, Derek S

2011-01-01

Whereas the current practice of designing antennas by hand is severely limited because it is both time and labor intensive and requires a significant amount of domain knowledge, evolutionary algorithms can be used to search the design space and automatically find novel antenna designs that are more effective than would otherwise be developed. Here we present our work in using evolutionary algorithms to automatically design an X-band antenna for NASA's Space Technology 5 (ST5) spacecraft. Two evolutionary algorithms were used: the first uses a vector of real-valued parameters and the second uses a tree-structured generative representation for constructing the antenna. The highest-performance antennas from both algorithms were fabricated and tested and both outperformed a hand-designed antenna produced by the antenna contractor for the mission. Subsequent changes to the spacecraft orbit resulted in a change in requirements for the spacecraft antenna. By adjusting our fitness function we were able to rapidly evolve a new set of antennas for this mission in less than a month. One of these new antenna designs was built, tested, and approved for deployment on the three ST5 spacecraft, which were successfully launched into space on March 22, 2006. This evolved antenna design is the first computer-evolved antenna to be deployed for any application and is the first computer-evolved hardware in space.
Scalability problems of simple genetic algorithms.

PubMed

Thierens, D

1999-01-01

Scalable evolutionary computation has become an intensively studied research topic in recent years. The issue of scalability is predominant in any field of algorithmic design, but it became particularly relevant for the design of competent genetic algorithms once the scalability problems of simple genetic algorithms were understood. Here we present some of the work that has aided in getting a clear insight in the scalability problems of simple genetic algorithms. Particularly, we discuss the important issue of building block mixing. We show how the need for mixing places a boundary in the GA parameter space that, together with the boundary from the schema theorem, delimits the region where the GA converges reliably to the optimum in problems of bounded difficulty. This region shrinks rapidly with increasing problem size unless the building blocks are tightly linked in the problem coding structure. In addition, we look at how straightforward extensions of the simple genetic algorithm-namely elitism, niching, and restricted mating are not significantly improving the scalability problems.
Enhanced Automated Guidance System for Horizontal Auger Boring Based on Image Processing

PubMed Central

Wu, Lingling; Wen, Guojun; Wang, Yudan; Huang, Lei; Zhou, Jiang

2018-01-01

Horizontal auger boring (HAB) is a widely used trenchless technology for the high-accuracy installation of gravity or pressure pipelines on line and grade. Differing from other pipeline installations, HAB requires a more precise and automated guidance system for use in a practical project. This paper proposes an economic and enhanced automated optical guidance system, based on optimization research of light-emitting diode (LED) light target and five automated image processing bore-path deviation algorithms. An LED target was optimized for many qualities, including light color, filter plate color, luminous intensity, and LED layout. The image preprocessing algorithm, feature extraction algorithm, angle measurement algorithm, deflection detection algorithm, and auto-focus algorithm, compiled in MATLAB, are used to automate image processing for deflection computing and judging. After multiple indoor experiments, this guidance system is applied in a project of hot water pipeline installation, with accuracy controlled within 2 mm in 48-m distance, providing accurate line and grade controls and verifying the feasibility and reliability of the guidance system. PMID:29462855
Scheduling in Sensor Grid Middleware for Telemedicine Using ABC Algorithm

PubMed Central

Vigneswari, T.; Mohamed, M. A. Maluk

2014-01-01

Advances in microelectromechanical systems (MEMS) and nanotechnology have enabled design of low power wireless sensor nodes capable of sensing different vital signs in our body. These nodes can communicate with each other to aggregate data and transmit vital parameters to a base station (BS). The data collected in the base station can be used to monitor health in real time. The patient wearing sensors may be mobile leading to aggregation of data from different BS for processing. Processing real time data is compute-intensive and telemedicine facilities may not have appropriate hardware to process the real time data effectively. To overcome this, sensor grid has been proposed in literature wherein sensor data is integrated to the grid for processing. This work proposes a scheduling algorithm to efficiently process telemedicine data in the grid. The proposed algorithm uses the popular swarm intelligence algorithm for scheduling to overcome the NP complete problem of grid scheduling. Results compared with other heuristic scheduling algorithms show the effectiveness of the proposed algorithm. PMID:25548557
Enhanced Automated Guidance System for Horizontal Auger Boring Based on Image Processing.

PubMed

Wu, Lingling; Wen, Guojun; Wang, Yudan; Huang, Lei; Zhou, Jiang

2018-02-15

Horizontal auger boring (HAB) is a widely used trenchless technology for the high-accuracy installation of gravity or pressure pipelines on line and grade. Differing from other pipeline installations, HAB requires a more precise and automated guidance system for use in a practical project. This paper proposes an economic and enhanced automated optical guidance system, based on optimization research of light-emitting diode (LED) light target and five automated image processing bore-path deviation algorithms. An LED light target was optimized for many qualities, including light color, filter plate color, luminous intensity, and LED layout. The image preprocessing algorithm, direction location algorithm, angle measurement algorithm, deflection detection algorithm, and auto-focus algorithm, compiled in MATLAB, are used to automate image processing for deflection computing and judging. After multiple indoor experiments, this guidance system is applied in a project of hot water pipeline installation, with accuracy controlled within 2 mm in 48-m distance, providing accurate line and grade controls and verifying the feasibility and reliability of the guidance system.
A computationally efficient method for incorporating spike waveform information into decoding algorithms.

PubMed

Ventura, Valérie; Todorova, Sonia

2015-05-01

Spike-based brain-computer interfaces (BCIs) have the potential to restore motor ability to people with paralysis and amputation, and have shown impressive performance in the lab. To transition BCI devices from the lab to the clinic, decoding must proceed automatically and in real time, which prohibits the use of algorithms that are computationally intensive or require manual tweaking. A common choice is to avoid spike sorting and treat the signal on each electrode as if it came from a single neuron, which is fast, easy, and therefore desirable for clinical use. But this approach ignores the kinematic information provided by individual neurons recorded on the same electrode. The contribution of this letter is a linear decoding model that extracts kinematic information from individual neurons without spike-sorting the electrode signals. The method relies on modeling sample averages of waveform features as functions of kinematics, which is automatic and requires minimal data storage and computation. In offline reconstruction of arm trajectories of a nonhuman primate performing reaching tasks, the proposed method performs as well as decoders based on expertly manually and automatically sorted spikes.
Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

NASA Astrophysics Data System (ADS)

Shao, Meiyue; Aktulga, H. Metin; Yang, Chao; Ng, Esmond G.; Maris, Pieter; Vary, James P.

2018-01-01

We describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of special structure of the nuclear configuration interaction problem which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. We also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.
A novel parallel architecture for local histogram equalization

NASA Astrophysics Data System (ADS)

Ohannessian, Mesrob I.; Choueiter, Ghinwa F.; Diab, Hassan

2005-07-01

Local histogram equalization is an image enhancement algorithm that has found wide application in the pre-processing stage of areas such as computer vision, pattern recognition and medical imaging. The computationally intensive nature of the procedure, however, is a main limitation when real time interactive applications are in question. This work explores the possibility of performing parallel local histogram equalization, using an array of special purpose elementary processors, through an HDL implementation that targets FPGA or ASIC platforms. A novel parallelization scheme is presented and the corresponding architecture is derived. The algorithm is reduced to pixel-level operations. Processing elements are assigned image blocks, to maintain a reasonable performance-cost ratio. To further simplify both processor and memory organizations, a bit-serial access scheme is used. A brief performance assessment is provided to illustrate and quantify the merit of the approach.
Memory-Intensive Benchmarks: IRAM vs. Cache-Based Machines

NASA Technical Reports Server (NTRS)

Biswas, Rupak; Gaeke, Brian R.; Husbands, Parry; Li, Xiaoye S.; Oliker, Leonid; Yelick, Katherine A.; Biegel, Bryan (Technical Monitor)

2002-01-01

The increasing gap between processor and memory performance has lead to new architectural models for memory-intensive applications. In this paper, we explore the performance of a set of memory-intensive benchmarks and use them to compare the performance of conventional cache-based microprocessors to a mixed logic and DRAM processor called VIRAM. The benchmarks are based on problem statements, rather than specific implementations, and in each case we explore the fundamental hardware requirements of the problem, as well as alternative algorithms and data structures that can help expose fine-grained parallelism or simplify memory access patterns. The benchmarks are characterized by their memory access patterns, their basic control structures, and the ratio of computation to memory operation.
Computational Intelligence for Medical Imaging Simulations.

PubMed

Chang, Victor

2017-11-25

This paper describes how to simulate medical imaging by computational intelligence to explore areas that cannot be easily achieved by traditional ways, including genes and proteins simulations related to cancer development and immunity. This paper has presented simulations and virtual inspections of BIRC3, BIRC6, CCL4, KLKB1 and CYP2A6 with their outputs and explanations, as well as brain segment intensity due to dancing. Our proposed MapReduce framework with the fusion algorithm can simulate medical imaging. The concept is very similar to the digital surface theories to simulate how biological units can get together to form bigger units, until the formation of the entire unit of biological subject. The M-Fusion and M-Update function by the fusion algorithm can achieve a good performance evaluation which can process and visualize up to 40 GB of data within 600 s. We conclude that computational intelligence can provide effective and efficient healthcare research offered by simulations and visualization.
Computational code in atomic and nuclear quantum optics: Advanced computing multiphoton resonance parameters for atoms in a strong laser field

NASA Astrophysics Data System (ADS)

Glushkov, A. V.; Gurskaya, M. Yu; Ignatenko, A. V.; Smirnov, A. V.; Serga, I. N.; Svinarenko, A. A.; Ternovsky, E. V.

2017-10-01

The consistent relativistic energy approach to the finite Fermi-systems (atoms and nuclei) in a strong realistic laser field is presented and applied to computing the multiphoton resonances parameters in some atoms and nuclei. The approach is based on the Gell-Mann and Low S-matrix formalism, multiphoton resonance lines moments technique and advanced Ivanov-Ivanova algorithm of calculating the Green’s function of the Dirac equation. The data for multiphoton resonance width and shift for the Cs atom and the 57Fe nucleus in dependence upon the laser intensity are listed.
Multiscale Simulations of Reactive Transport

NASA Astrophysics Data System (ADS)

Tartakovsky, D. M.; Bakarji, J.

2014-12-01

Discrete, particle-based simulations offer distinct advantages when modeling solute transport and chemical reactions. For example, Brownian motion is often used to model diffusion in complex pore networks, and Gillespie-type algorithms allow one to handle multicomponent chemical reactions with uncertain reaction pathways. Yet such models can be computationally more intensive than their continuum-scale counterparts, e.g., advection-dispersion-reaction equations. Combining the discrete and continuum models has a potential to resolve the quantity of interest with a required degree of physicochemical granularity at acceptable computational cost. We present computational examples of such "hybrid models" and discuss the challenges associated with coupling these two levels of description.
Computational methods for reactive transport modeling: A Gibbs energy minimization approach for multiphase equilibrium calculations

NASA Astrophysics Data System (ADS)

Leal, Allan M. M.; Kulik, Dmitrii A.; Kosakowski, Georg

2016-02-01

We present a numerical method for multiphase chemical equilibrium calculations based on a Gibbs energy minimization approach. The method can accurately and efficiently determine the stable phase assemblage at equilibrium independently of the type of phases and species that constitute the chemical system. We have successfully applied our chemical equilibrium algorithm in reactive transport simulations to demonstrate its effective use in computationally intensive applications. We used FEniCS to solve the governing partial differential equations of mass transport in porous media using finite element methods in unstructured meshes. Our equilibrium calculations were benchmarked with GEMS3K, the numerical kernel of the geochemical package GEMS. This allowed us to compare our results with a well-established Gibbs energy minimization algorithm, as well as their performance on every mesh node, at every time step of the transport simulation. The benchmark shows that our novel chemical equilibrium algorithm is accurate, robust, and efficient for reactive transport applications, and it is an improvement over the Gibbs energy minimization algorithm used in GEMS3K. The proposed chemical equilibrium method has been implemented in Reaktoro, a unified framework for modeling chemically reactive systems, which is now used as an alternative numerical kernel of GEMS.
An exact and efficient first passage time algorithm for reaction-diffusion processes on a 2D-lattice

NASA Astrophysics Data System (ADS)

Bezzola, Andri; Bales, Benjamin B.; Alkire, Richard C.; Petzold, Linda R.

2014-01-01

We present an exact and efficient algorithm for reaction-diffusion-nucleation processes on a 2D-lattice. The algorithm makes use of first passage time (FPT) to replace the computationally intensive simulation of diffusion hops in KMC by larger jumps when particles are far away from step-edges or other particles. Our approach computes exact probability distributions of jump times and target locations in a closed-form formula, based on the eigenvectors and eigenvalues of the corresponding 1D transition matrix, maintaining atomic-scale resolution of resulting shapes of deposit islands. We have applied our method to three different test cases of electrodeposition: pure diffusional aggregation for large ranges of diffusivity rates and for simulation domain sizes of up to 4096×4096 sites, the effect of diffusivity on island shapes and sizes in combination with a KMC edge diffusion, and the calculation of an exclusion zone in front of a step-edge, confirming statistical equivalence to standard KMC simulations. The algorithm achieves significant speedup compared to standard KMC for cases where particles diffuse over long distances before nucleating with other particles or being captured by larger islands.
An exact and efficient first passage time algorithm for reaction–diffusion processes on a 2D-lattice

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bezzola, Andri, E-mail: andri.bezzola@gmail.com; Bales, Benjamin B., E-mail: bbbales2@gmail.com; Alkire, Richard C., E-mail: r-alkire@uiuc.edu

2014-01-01

We present an exact and efficient algorithm for reaction–diffusion–nucleation processes on a 2D-lattice. The algorithm makes use of first passage time (FPT) to replace the computationally intensive simulation of diffusion hops in KMC by larger jumps when particles are far away from step-edges or other particles. Our approach computes exact probability distributions of jump times and target locations in a closed-form formula, based on the eigenvectors and eigenvalues of the corresponding 1D transition matrix, maintaining atomic-scale resolution of resulting shapes of deposit islands. We have applied our method to three different test cases of electrodeposition: pure diffusional aggregation for largemore » ranges of diffusivity rates and for simulation domain sizes of up to 4096×4096 sites, the effect of diffusivity on island shapes and sizes in combination with a KMC edge diffusion, and the calculation of an exclusion zone in front of a step-edge, confirming statistical equivalence to standard KMC simulations. The algorithm achieves significant speedup compared to standard KMC for cases where particles diffuse over long distances before nucleating with other particles or being captured by larger islands.« less
Efficient Parallel Video Processing Techniques on GPU: From Framework to Implementation

PubMed Central

Su, Huayou; Wen, Mei; Wu, Nan; Ren, Ju; Zhang, Chunyuan

2014-01-01

Through reorganizing the execution order and optimizing the data structure, we proposed an efficient parallel framework for H.264/AVC encoder based on massively parallel architecture. We implemented the proposed framework by CUDA on NVIDIA's GPU. Not only the compute intensive components of the H.264 encoder are parallelized but also the control intensive components are realized effectively, such as CAVLC and deblocking filter. In addition, we proposed serial optimization methods, including the multiresolution multiwindow for motion estimation, multilevel parallel strategy to enhance the parallelism of intracoding as much as possible, component-based parallel CAVLC, and direction-priority deblocking filter. More than 96% of workload of H.264 encoder is offloaded to GPU. Experimental results show that the parallel implementation outperforms the serial program by 20 times of speedup ratio and satisfies the requirement of the real-time HD encoding of 30 fps. The loss of PSNR is from 0.14 dB to 0.77 dB, when keeping the same bitrate. Through the analysis to the kernels, we found that speedup ratios of the compute intensive algorithms are proportional with the computation power of the GPU. However, the performance of the control intensive parts (CAVLC) is much related to the memory bandwidth, which gives an insight for new architecture design. PMID:24757432
Computational study of scattering of a zero-order Bessel beam by large nonspherical homogeneous particles with the multilevel fast multipole algorithm

NASA Astrophysics Data System (ADS)

Yang, Minglin; Wu, Yueqian; Sheng, Xinqing; Ren, Kuan Fang

2017-12-01

Computation of scattering of shaped beams by large nonspherical particles is a challenge in both optics and electromagnetics domains since it concerns many research fields. In this paper, we report our new progress in the numerical computation of the scattering diagrams. Our algorithm permits to calculate the scattering of a particle of size as large as 110 wavelengths or 700 in size parameter. The particle can be transparent or absorbing of arbitrary shape, smooth or with a sharp surface, such as the Chebyshev particles or ice crystals. To illustrate the capacity of the algorithm, a zero order Bessel beam is taken as the incident beam, and the scattering of ellipsoidal particles and Chebyshev particles are taken as examples. Some special phenomena have been revealed and examined. The scattering problem is formulated with the combined tangential formulation and solved iteratively with the aid of the multilevel fast multipole algorithm, which is well parallelized with the message passing interface on the distributed memory computer platform using the hybrid partitioning strategy. The numerical predictions are compared with the results of the rigorous method for a spherical particle to validate the accuracy of the approach. The scattering diagrams of large ellipsoidal particles with various parameters are examined. The effect of aspect ratios, as well as half-cone angle of the incident zero-order Bessel beam and the off-axis distance on scattered intensity, is studied. Scattering by asymmetry Chebyshev particle with size parameter larger than 700 is also given to show the capability of the method for computing scattering by arbitrary shaped particles.
Fast semivariogram computation using FPGA architectures

NASA Astrophysics Data System (ADS)

Lagadapati, Yamuna; Shirvaikar, Mukul; Dong, Xuanliang

2015-02-01

The semivariogram is a statistical measure of the spatial distribution of data and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. The semivariogram is a plot of semivariances for different lag distances between pixels. A semi-variance, γ(h), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h. Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O(n2). Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz, but they can perform tens of thousands of calculations per clock cycle while operating in the low range of power. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. The design consists of several modules dedicated to the constituent computational tasks. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. Anisotropic semivariogram implementation is anticipated to be an extension of the current architecture, ostensibly based on refinements to the current modules. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from MRI scans are utilized for the experiments. Computational speedup is measured with respect to Matlab implementation on a personal computer with an Intel i7 multi-core processor. Preliminary simulation results indicate that a significant advantage in speed can be attained by the architectures, making the algorithm viable for implementation in medical devices

Using patient-specific phantoms to evaluate deformable image registration algorithms for adaptive radiation therapy

PubMed Central

Stanley, Nick; Glide-Hurst, Carri; Kim, Jinkoo; Adams, Jeffrey; Li, Shunshan; Wen, Ning; Chetty, Indrin J.; Zhong, Hualiang

2014-01-01

The quality of adaptive treatment planning depends on the accuracy of its underlying deformable image registration (DIR). The purpose of this study is to evaluate the performance of two DIR algorithms, B-spline–based deformable multipass (DMP) and deformable demons (Demons), implemented in a commercial software package. Evaluations were conducted using both computational and physical deformable phantoms. Based on a finite element method (FEM), a total of 11 computational models were developed from a set of CT images acquired from four lung and one prostate cancer patients. FEM generated displacement vector fields (DVF) were used to construct the lung and prostate image phantoms. Based on a fast-Fourier transform technique, image noise power spectrum was incorporated into the prostate image phantoms to create simulated CBCT images. The FEM-DVF served as a gold standard for verification of the two registration algorithms performed on these phantoms. The registration algorithms were also evaluated at the homologous points quantified in the CT images of a physical lung phantom. The results indicated that the mean errors of the DMP algorithm were in the range of 1.0 ~ 3.1 mm for the computational phantoms and 1.9 mm for the physical lung phantom. For the computational prostate phantoms, the corresponding mean error was 1.0–1.9 mm in the prostate, 1.9–2.4 mm in the rectum, and 1.8–2.1 mm over the entire patient body. Sinusoidal errors induced by B-spline interpolations were observed in all the displacement profiles of the DMP registrations. Regions of large displacements were observed to have more registration errors. Patient-specific FEM models have been developed to evaluate the DIR algorithms implemented in the commercial software package. It has been found that the accuracy of these algorithms is patient-dependent and related to various factors including tissue deformation magnitudes and image intensity gradients across the regions of interest. This may suggest that DIR algorithms need to be verified for each registration instance when implementing adaptive radiation therapy. PMID:24257278
Using patient‐specific phantoms to evaluate deformable image registration algorithms for adaptive radiation therapy

PubMed Central

Stanley, Nick; Glide‐Hurst, Carri; Kim, Jinkoo; Adams, Jeffrey; Li, Shunshan; Wen, Ning; Chetty, Indrin J

2013-01-01

The quality of adaptive treatment planning depends on the accuracy of its underlying deformable image registration (DIR). The purpose of this study is to evaluate the performance of two DIR algorithms, B‐spline‐based deformable multipass (DMP) and deformable demons (Demons), implemented in a commercial software package. Evaluations were conducted using both computational and physical deformable phantoms. Based on a finite element method (FEM), a total of 11 computational models were developed from a set of CT images acquired from four lung and one prostate cancer patients. FEM generated displacement vector fields (DVF) were used to construct the lung and prostate image phantoms. Based on a fast‐Fourier transform technique, image noise power spectrum was incorporated into the prostate image phantoms to create simulated CBCT images. The FEM‐DVF served as a gold standard for verification of the two registration algorithms performed on these phantoms. The registration algorithms were also evaluated at the homologous points quantified in the CT images of a physical lung phantom. The results indicated that the mean errors of the DMP algorithm were in the range of 1.0~3.1mm for the computational phantoms and 1.9 mm for the physical lung phantom. For the computational prostate phantoms, the corresponding mean error was 1.0–1.9 mm in the prostate, 1.9–2.4 mm in the rectum, and 1.8–2.1 mm over the entire patient body. Sinusoidal errors induced by B‐spline interpolations were observed in all the displacement profiles of the DMP registrations. Regions of large displacements were observed to have more registration errors. Patient‐specific FEM models have been developed to evaluate the DIR algorithms implemented in the commercial software package. It has been found that the accuracy of these algorithms is patient‐dependent and related to various factors including tissue deformation magnitudes and image intensity gradients across the regions of interest. This may suggest that DIR algorithms need to be verified for each registration instance when implementing adaptive radiation therapy. PACS numbers: 87.10.Kn, 87.55.km, 87.55.Qr, 87.57.nj
Comparative Implementation of High Performance Computing for Power System Dynamic Simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Shuangshuang; Huang, Zhenyu; Diao, Ruisheng

Dynamic simulation for transient stability assessment is one of the most important, but intensive, computations for power system planning and operation. Present commercial software is mainly designed for sequential computation to run a single simulation, which is very time consuming with a single processer. The application of High Performance Computing (HPC) to dynamic simulations is very promising in accelerating the computing process by parallelizing its kernel algorithms while maintaining the same level of computation accuracy. This paper describes the comparative implementation of four parallel dynamic simulation schemes in two state-of-the-art HPC environments: Message Passing Interface (MPI) and Open Multi-Processing (OpenMP).more » These implementations serve to match the application with dedicated multi-processor computing hardware and maximize the utilization and benefits of HPC during the development process.« less
Deformable registration of CT and cone-beam CT with local intensity matching.

PubMed

Park, Seyoun; Plishker, William; Quon, Harry; Wong, John; Shekhar, Raj; Lee, Junghoon

2017-02-07

Cone-beam CT (CBCT) is a widely used intra-operative imaging modality in image-guided radiotherapy and surgery. A short scan followed by a filtered-backprojection is typically used for CBCT reconstruction. While data on the mid-plane (plane of source-detector rotation) is complete, off-mid-planes undergo different information deficiency and the computed reconstructions are approximate. This causes different reconstruction artifacts at off-mid-planes depending on slice locations, and therefore impedes accurate registration between CT and CBCT. In this paper, we propose a method to accurately register CT and CBCT by iteratively matching local CT and CBCT intensities. We correct CBCT intensities by matching local intensity histograms slice by slice in conjunction with intensity-based deformable registration. The correction-registration steps are repeated in an alternating way until the result image converges. We integrate the intensity matching into three different deformable registration methods, B-spline, demons, and optical flow that are widely used for CT-CBCT registration. All three registration methods were implemented on a graphics processing unit for efficient parallel computation. We tested the proposed methods on twenty five head and neck cancer cases and compared the performance with state-of-the-art registration methods. Normalized cross correlation (NCC), structural similarity index (SSIM), and target registration error (TRE) were computed to evaluate the registration performance. Our method produced overall NCC of 0.96, SSIM of 0.94, and TRE of 2.26 → 2.27 mm, outperforming existing methods by 9%, 12%, and 27%, respectively. Experimental results also show that our method performs consistently and is more accurate than existing algorithms, and also computationally efficient.
Deformable registration of CT and cone-beam CT with local intensity matching

NASA Astrophysics Data System (ADS)

Park, Seyoun; Plishker, William; Quon, Harry; Wong, John; Shekhar, Raj; Lee, Junghoon

2017-02-01

Cone-beam CT (CBCT) is a widely used intra-operative imaging modality in image-guided radiotherapy and surgery. A short scan followed by a filtered-backprojection is typically used for CBCT reconstruction. While data on the mid-plane (plane of source-detector rotation) is complete, off-mid-planes undergo different information deficiency and the computed reconstructions are approximate. This causes different reconstruction artifacts at off-mid-planes depending on slice locations, and therefore impedes accurate registration between CT and CBCT. In this paper, we propose a method to accurately register CT and CBCT by iteratively matching local CT and CBCT intensities. We correct CBCT intensities by matching local intensity histograms slice by slice in conjunction with intensity-based deformable registration. The correction-registration steps are repeated in an alternating way until the result image converges. We integrate the intensity matching into three different deformable registration methods, B-spline, demons, and optical flow that are widely used for CT-CBCT registration. All three registration methods were implemented on a graphics processing unit for efficient parallel computation. We tested the proposed methods on twenty five head and neck cancer cases and compared the performance with state-of-the-art registration methods. Normalized cross correlation (NCC), structural similarity index (SSIM), and target registration error (TRE) were computed to evaluate the registration performance. Our method produced overall NCC of 0.96, SSIM of 0.94, and TRE of 2.26 → 2.27 mm, outperforming existing methods by 9%, 12%, and 27%, respectively. Experimental results also show that our method performs consistently and is more accurate than existing algorithms, and also computationally efficient.
SU-F-T-148: Are the Approximations in Analytic Semi-Empirical Dose Calculation Algorithms for Intensity Modulated Proton Therapy for Complex Heterogeneities of Head and Neck Clinically Significant?

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yepes, P; UT MD Anderson Cancer Center, Houston, TX; Titt, U

2016-06-15

Purpose: Evaluate the differences in dose distributions between the proton analytic semi-empirical dose calculation algorithm used in the clinic and Monte Carlo calculations for a sample of 50 head-and-neck (H&N) patients and estimate the potential clinical significance of the differences. Methods: A cohort of 50 H&N patients, treated at the University of Texas Cancer Center with Intensity Modulated Proton Therapy (IMPT), were selected for evaluation of clinical significance of approximations in computed dose distributions. H&N site was selected because of the highly inhomogeneous nature of the anatomy. The Fast Dose Calculator (FDC), a fast track-repeating accelerated Monte Carlo algorithm formore » proton therapy, was utilized for the calculation of dose distributions delivered during treatment plans. Because of its short processing time, FDC allows for the processing of large cohorts of patients. FDC has been validated versus GEANT4, a full Monte Carlo system and measurements in water and for inhomogeneous phantoms. A gamma-index analysis, DVHs, EUDs, and TCP and NTCPs computed using published models were utilized to evaluate the differences between the Treatment Plan System (TPS) and FDC. Results: The Monte Carlo results systematically predict lower dose delivered in the target. The observed differences can be as large as 8 Gy, and should have a clinical impact. Gamma analysis also showed significant differences between both approaches, especially for the target volumes. Conclusion: Monte Carlo calculations with fast algorithms is practical and should be considered for the clinic, at least as a treatment plan verification tool.« less
Combining watershed and graph cuts methods to segment organs at risk in radiotherapy

NASA Astrophysics Data System (ADS)

Dolz, Jose; Kirisli, Hortense A.; Viard, Romain; Massoptier, Laurent

2014-03-01

Computer-aided segmentation of anatomical structures in medical images is a valuable tool for efficient radiation therapy planning (RTP). As delineation errors highly affect the radiation oncology treatment, it is crucial to delineate geometric structures accurately. In this paper, a semi-automatic segmentation approach for computed tomography (CT) images, based on watershed and graph-cuts methods, is presented. The watershed pre-segmentation groups small areas of similar intensities in homogeneous labels, which are subsequently used as input for the graph-cuts algorithm. This methodology does not require of prior knowledge of the structure to be segmented; even so, it performs well with complex shapes and low intensity. The presented method also allows the user to add foreground and background strokes in any of the three standard orthogonal views - axial, sagittal or coronal - making the interaction with the algorithm easy and fast. Hence, the segmentation information is propagated within the whole volume, providing a spatially coherent result. The proposed algorithm has been evaluated using 9 CT volumes, by comparing its segmentation performance over several organs - lungs, liver, spleen, heart and aorta - to those of manual delineation from experts. A Dicés coefficient higher than 0.89 was achieved in every case. That demonstrates that the proposed approach works well for all the anatomical structures analyzed. Due to the quality of the results, the introduction of the proposed approach in the RTP process will be a helpful tool for organs at risk (OARs) segmentation.
Efficient Fourier-based algorithms for time-periodic unsteady problems

NASA Astrophysics Data System (ADS)

Gopinath, Arathi Kamath

2007-12-01

This dissertation work proposes two algorithms for the simulation of time-periodic unsteady problems via the solution of Unsteady Reynolds-Averaged Navier-Stokes (URANS) equations. These algorithms use a Fourier representation in time and hence solve for the periodic state directly without resolving transients (which consume most of the resources in a time-accurate scheme). In contrast to conventional Fourier-based techniques which solve the governing equations in frequency space, the new algorithms perform all the calculations in the time domain, and hence require minimal modifications to an existing solver. The complete space-time solution is obtained by iterating in a fifth pseudo-time dimension. Various time-periodic problems such as helicopter rotors, wind turbines, turbomachinery and flapping-wings can be simulated using the Time Spectral method. The algorithm is first validated using pitching airfoil/wing test cases. The method is further extended to turbomachinery problems, and computational results verified by comparison with a time-accurate calculation. The technique can be very memory intensive for large problems, since the solution is computed (and hence stored) simultaneously at all time levels. Often, the blade counts of a turbomachine are rescaled such that a periodic fraction of the annulus can be solved. This approximation enables the solution to be obtained at a fraction of the cost of a full-scale time-accurate solution. For a viscous computation over a three-dimensional single-stage rescaled compressor, an order of magnitude savings is achieved. The second algorithm, the reduced-order Harmonic Balance method is applicable only to turbomachinery flows, and offers even larger computational savings than the Time Spectral method. It simulates the true geometry of the turbomachine using only one blade passage per blade row as the computational domain. In each blade row of the turbomachine, only the dominant frequencies are resolved, namely, combinations of neighbor's blade passing. An appropriate set of frequencies can be chosen by the analyst/designer based on a trade-off between accuracy and computational resources available. A cost comparison with a time-accurate computation for an Euler calculation on a two-dimensional multi-stage compressor obtained an order of magnitude savings, and a RANS calculation on a three-dimensional single-stage compressor achieved two orders of magnitude savings, with comparable accuracy.
Hierarchical layered and semantic-based image segmentation using ergodicity map

NASA Astrophysics Data System (ADS)

Yadegar, Jacob; Liu, Xiaoqing

2010-04-01

Image segmentation plays a foundational role in image understanding and computer vision. Although great strides have been made and progress achieved on automatic/semi-automatic image segmentation algorithms, designing a generic, robust, and efficient image segmentation algorithm is still challenging. Human vision is still far superior compared to computer vision, especially in interpreting semantic meanings/objects in images. We present a hierarchical/layered semantic image segmentation algorithm that can automatically and efficiently segment images into hierarchical layered/multi-scaled semantic regions/objects with contextual topological relationships. The proposed algorithm bridges the gap between high-level semantics and low-level visual features/cues (such as color, intensity, edge, etc.) through utilizing a layered/hierarchical ergodicity map, where ergodicity is computed based on a space filling fractal concept and used as a region dissimilarity measurement. The algorithm applies a highly scalable, efficient, and adaptive Peano- Cesaro triangulation/tiling technique to decompose the given image into a set of similar/homogenous regions based on low-level visual cues in a top-down manner. The layered/hierarchical ergodicity map is built through a bottom-up region dissimilarity analysis. The recursive fractal sweep associated with the Peano-Cesaro triangulation provides efficient local multi-resolution refinement to any level of detail. The generated binary decomposition tree also provides efficient neighbor retrieval mechanisms for contextual topological object/region relationship generation. Experiments have been conducted within the maritime image environment where the segmented layered semantic objects include the basic level objects (i.e. sky/land/water) and deeper level objects in the sky/land/water surfaces. Experimental results demonstrate the proposed algorithm has the capability to robustly and efficiently segment images into layered semantic objects/regions with contextual topological relationships.
Large-Scale Compute-Intensive Analysis via a Combined In-situ and Co-scheduling Workflow Approach

DOE Office of Scientific and Technical Information (OSTI.GOV)

Messer, Bronson; Sewell, Christopher; Heitmann, Katrin

2015-01-01

Large-scale simulations can produce tens of terabytes of data per analysis cycle, complicating and limiting the efficiency of workflows. Traditionally, outputs are stored on the file system and analyzed in post-processing. With the rapidly increasing size and complexity of simulations, this approach faces an uncertain future. Trending techniques consist of performing the analysis in situ, utilizing the same resources as the simulation, and/or off-loading subsets of the data to a compute-intensive analysis system. We introduce an analysis framework developed for HACC, a cosmological N-body code, that uses both in situ and co-scheduling approaches for handling Petabyte-size outputs. An initial inmore » situ step is used to reduce the amount of data to be analyzed, and to separate out the data-intensive tasks handled off-line. The analysis routines are implemented using the PISTON/VTK-m framework, allowing a single implementation of an algorithm that simultaneously targets a variety of GPU, multi-core, and many-core architectures.« less
A Probabilistic Framework for Peptide and Protein Quantification from Data-Dependent and Data-Independent LC-MS Proteomics Experiments

PubMed Central

Richardson, Keith; Denny, Richard; Hughes, Chris; Skilling, John; Sikora, Jacek; Dadlez, Michał; Manteca, Angel; Jung, Hye Ryung; Jensen, Ole Nørregaard; Redeker, Virginie; Melki, Ronald; Langridge, James I.; Vissers, Johannes P.C.

2013-01-01

A probability-based quantification framework is presented for the calculation of relative peptide and protein abundance in label-free and label-dependent LC-MS proteomics data. The results are accompanied by credible intervals and regulation probabilities. The algorithm takes into account data uncertainties via Poisson statistics modified by a noise contribution that is determined automatically during an initial normalization stage. Protein quantification relies on assignments of component peptides to the acquired data. These assignments are generally of variable reliability and may not be present across all of the experiments comprising an analysis. It is also possible for a peptide to be identified to more than one protein in a given mixture. For these reasons the algorithm accepts a prior probability of peptide assignment for each intensity measurement. The model is constructed in such a way that outliers of any type can be automatically reweighted. Two discrete normalization methods can be employed. The first method is based on a user-defined subset of peptides, while the second method relies on the presence of a dominant background of endogenous peptides for which the concentration is assumed to be unaffected. Normalization is performed using the same computational and statistical procedures employed by the main quantification algorithm. The performance of the algorithm will be illustrated on example data sets, and its utility demonstrated for typical proteomics applications. The quantification algorithm supports relative protein quantification based on precursor and product ion intensities acquired by means of data-dependent methods, originating from all common isotopically-labeled approaches, as well as label-free ion intensity-based data-independent methods. PMID:22871168
Reconstructing three-dimensional protein crystal intensities from sparse unoriented two-axis X-ray diffraction patterns

PubMed Central

Lan, Ti-Yen; Wierman, Jennifer L.; Tate, Mark W.; Philipp, Hugh T.; Elser, Veit

2017-01-01

Recently, there has been a growing interest in adapting serial microcrystallography (SMX) experiments to existing storage ring (SR) sources. For very small crystals, however, radiation damage occurs before sufficient numbers of photons are diffracted to determine the orientation of the crystal. The challenge is to merge data from a large number of such ‘sparse’ frames in order to measure the full reciprocal space intensity. To simulate sparse frames, a dataset was collected from a large lysozyme crystal illuminated by a dim X-ray source. The crystal was continuously rotated about two orthogonal axes to sample a subset of the rotation space. With the EMC algorithm [expand–maximize–compress; Loh & Elser (2009). Phys. Rev. E, 80, 026705], it is shown that the diffracted intensity of the crystal can still be reconstructed even without knowledge of the orientation of the crystal in any sparse frame. Moreover, parallel computation implementations were designed to considerably improve the time and memory scaling of the algorithm. The results show that EMC-based SMX experiments should be feasible at SR sources. PMID:28808431
TU-H-206-04: An Effective Homomorphic Unsharp Mask Filtering Method to Correct Intensity Inhomogeneity in Daily Treatment MR Images

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, D; Gach, H; Li, H

Purpose: The daily treatment MRIs acquired on MR-IGRT systems, like diagnostic MRIs, suffer from intensity inhomogeneity issue, associated with B1 and B0 inhomogeneities. An improved homomorphic unsharp mask (HUM) filtering method, automatic and robust body segmentation, and imaging field-of-view (FOV) detection methods were developed to compute the multiplicative slow-varying correction field and correct the intensity inhomogeneity. The goal is to improve and normalize the voxel intensity so that the images could be processed more accurately by quantitative methods (e.g., segmentation and registration) that require consistent image voxel intensity values. Methods: HUM methods have been widely used for years. A bodymore » mask is required, otherwise the body surface in the corrected image would be incorrectly bright due to the sudden intensity transition at the body surface. In this study, we developed an improved HUM-based correction method that includes three main components: 1) Robust body segmentation on the normalized image gradient map, 2) Robust FOV detection (needed for body segmentation) using region growing and morphologic filters, and 3) An effective implementation of HUM using repeated Gaussian convolution. Results: The proposed method was successfully tested on patient images of common anatomical sites (H/N, lung, abdomen and pelvis). Initial qualitative comparisons showed that this improved HUM method outperformed three recently published algorithms (FCM, LEMS, MICO) in both computation speed (by 50+ times) and robustness (in intermediate to severe inhomogeneity situations). Currently implemented in MATLAB, it takes 20 to 25 seconds to process a 3D MRI volume. Conclusion: Compared to more sophisticated MRI inhomogeneity correction algorithms, the improved HUM method is simple and effective. The inhomogeneity correction, body mask, and FOV detection methods developed in this study would be useful as preprocessing tools for many MRI-related research and clinical applications in radiotherapy. Authors have received research grants from ViewRay and Varian.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)

Cohen, J; Dossa, D; Gokhale, M

Critical data science applications requiring frequent access to storage perform poorly on today's computing architectures. This project addresses efficient computation of data-intensive problems in national security and basic science by exploring, advancing, and applying a new form of computing called storage-intensive supercomputing (SISC). Our goal is to enable applications that simply cannot run on current systems, and, for a broad range of data-intensive problems, to deliver an order of magnitude improvement in price/performance over today's data-intensive architectures. This technical report documents much of the work done under LDRD 07-ERD-063 Storage Intensive Supercomputing during the period 05/07-09/07. The following chapters describe:more » (1) a new file I/O monitoring tool iotrace developed to capture the dynamic I/O profiles of Linux processes; (2) an out-of-core graph benchmark for level-set expansion of scale-free graphs; (3) an entity extraction benchmark consisting of a pipeline of eight components; and (4) an image resampling benchmark drawn from the SWarp program in the LSST data processing pipeline. The performance of the graph and entity extraction benchmarks was measured in three different scenarios: data sets residing on the NFS file server and accessed over the network; data sets stored on local disk; and data sets stored on the Fusion I/O parallel NAND Flash array. The image resampling benchmark compared performance of software-only to GPU-accelerated. In addition to the work reported here, an additional text processing application was developed that used an FPGA to accelerate n-gram profiling for language classification. The n-gram application will be presented at SC07 at the High Performance Reconfigurable Computing Technologies and Applications Workshop. The graph and entity extraction benchmarks were run on a Supermicro server housing the NAND Flash 40GB parallel disk array, the Fusion-io. The Fusion system specs are as follows: SuperMicro X7DBE Xeon Dual Socket Blackford Server Motherboard; 2 Intel Xeon Dual-Core 2.66 GHz processors; 1 GB DDR2 PC2-5300 RAM (2 x 512); 80GB Hard Drive (Seagate SATA II Barracuda). The Fusion board is presently capable of 4X in a PCIe slot. The image resampling benchmark was run on a dual Xeon workstation with NVIDIA graphics card (see Chapter 5 for full specification). An XtremeData Opteron+FPGA was used for the language classification application. We observed that these benchmarks are not uniformly I/O intensive. The only benchmark that showed greater that 50% of the time in I/O was the graph algorithm when it accessed data files over NFS. When local disk was used, the graph benchmark spent at most 40% of its time in I/O. The other benchmarks were CPU dominated. The image resampling benchmark and language classification showed order of magnitude speedup over software by using co-processor technology to offload the CPU-intensive kernels. Our experiments to date suggest that emerging hardware technologies offer significant benefit to boosting the performance of data-intensive algorithms. Using GPU and FPGA co-processors, we were able to improve performance by more than an order of magnitude on the benchmark algorithms, eliminating the processor bottleneck of CPU-bound tasks. Experiments with a prototype solid state nonvolative memory available today show 10X better throughput on random reads than disk, with a 2X speedup on a graph processing benchmark when compared to the use of local SATA disk.« less
Parallelization of sequential Gaussian, indicator and direct simulation algorithms

NASA Astrophysics Data System (ADS)

Nunes, Ruben; Almeida, José A.

2010-08-01

Improving the performance and robustness of algorithms on new high-performance parallel computing architectures is a key issue in efficiently performing 2D and 3D studies with large amount of data. In geostatistics, sequential simulation algorithms are good candidates for parallelization. When compared with other computational applications in geosciences (such as fluid flow simulators), sequential simulation software is not extremely computationally intensive, but parallelization can make it more efficient and creates alternatives for its integration in inverse modelling approaches. This paper describes the implementation and benchmarking of a parallel version of the three classic sequential simulation algorithms: direct sequential simulation (DSS), sequential indicator simulation (SIS) and sequential Gaussian simulation (SGS). For this purpose, the source used was GSLIB, but the entire code was extensively modified to take into account the parallelization approach and was also rewritten in the C programming language. The paper also explains in detail the parallelization strategy and the main modifications. Regarding the integration of secondary information, the DSS algorithm is able to perform simple kriging with local means, kriging with an external drift and collocated cokriging with both local and global correlations. SIS includes a local correction of probabilities. Finally, a brief comparison is presented of simulation results using one, two and four processors. All performance tests were carried out on 2D soil data samples. The source code is completely open source and easy to read. It should be noted that the code is only fully compatible with Microsoft Visual C and should be adapted for other systems/compilers.
Automatic segmentation of tumor-laden lung volumes from the LIDC database

NASA Astrophysics Data System (ADS)

O'Dell, Walter G.

2012-03-01

The segmentation of the lung parenchyma is often a critical pre-processing step prior to application of computer-aided detection of lung nodules. Segmentation of the lung volume can dramatically decrease computation time and reduce the number of false positive detections by excluding from consideration extra-pulmonary tissue. However, while many algorithms are capable of adequately segmenting the healthy lung, none have been demonstrated to work reliably well on tumor-laden lungs. Of particular challenge is to preserve tumorous masses attached to the chest wall, mediastinum or major vessels. In this role, lung volume segmentation comprises an important computational step that can adversely affect the performance of the overall CAD algorithm. An automated lung volume segmentation algorithm has been developed with the goals to maximally exclude extra-pulmonary tissue while retaining all true nodules. The algorithm comprises a series of tasks including intensity thresholding, 2-D and 3-D morphological operations, 2-D and 3-D floodfilling, and snake-based clipping of nodules attached to the chest wall. It features the ability to (1) exclude trachea and bowels, (2) snip large attached nodules using snakes, (3) snip small attached nodules using dilation, (4) preserve large masses fully internal to lung volume, (5) account for basal aspects of the lung where in a 2-D slice the lower sections appear to be disconnected from main lung, and (6) achieve separation of the right and left hemi-lungs. The algorithm was developed and trained to on the first 100 datasets of the LIDC image database.
ON COMPUTING UPPER LIMITS TO SOURCE INTENSITIES

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kashyap, Vinay L.; Siemiginowska, Aneta; Van Dyk, David A.

2010-08-10

A common problem in astrophysics is determining how bright a source could be and still not be detected in an observation. Despite the simplicity with which the problem can be stated, the solution involves complicated statistical issues that require careful analysis. In contrast to the more familiar confidence bound, this concept has never been formally analyzed, leading to a great variety of often ad hoc solutions. Here we formulate and describe the problem in a self-consistent manner. Detection significance is usually defined by the acceptable proportion of false positives (background fluctuations that are claimed as detections, or Type I error),more » and we invoke the complementary concept of false negatives (real sources that go undetected, or Type II error), based on the statistical power of a test, to compute an upper limit to the detectable source intensity. To determine the minimum intensity that a source must have for it to be detected, we first define a detection threshold and then compute the probabilities of detecting sources of various intensities at the given threshold. The intensity that corresponds to the specified Type II error probability defines that minimum intensity and is identified as the upper limit. Thus, an upper limit is a characteristic of the detection procedure rather than the strength of any particular source. It should not be confused with confidence intervals or other estimates of source intensity. This is particularly important given the large number of catalogs that are being generated from increasingly sensitive surveys. We discuss, with examples, the differences between these upper limits and confidence bounds. Both measures are useful quantities that should be reported in order to extract the most science from catalogs, though they answer different statistical questions: an upper bound describes an inference range on the source intensity, while an upper limit calibrates the detection process. We provide a recipe for computing upper limits that applies to all detection algorithms.« less
RAINLINK: Retrieval algorithm for rainfall monitoring employing microwave links from a cellular communication network

NASA Astrophysics Data System (ADS)

Uijlenhoet, R.; Overeem, A.; Leijnse, H.; Rios Gaona, M. F.

2017-12-01

The basic principle of rainfall estimation using microwave links is as follows. Rainfall attenuates the electromagnetic signals transmitted from one telephone tower to another. By measuring the received power at one end of a microwave link as a function of time, the path-integrated attenuation due to rainfall can be calculated, which can be converted to average rainfall intensities over the length of a link. Microwave links from cellular communication networks have been proposed as a promising new rainfall measurement technique for one decade. They are particularly interesting for those countries where few surface rainfall observations are available. Yet to date no operational (real-time) link-based rainfall products are available. To advance the process towards operational application and upscaling of this technique, there is a need for freely available, user-friendly computer code for microwave link data processing and rainfall mapping. Such software is now available as R package "RAINLINK" on GitHub (https://github.com/overeem11/RAINLINK). It contains a working example to compute link-based 15-min rainfall maps for the entire surface area of The Netherlands for 40 hours from real microwave link data. This is a working example using actual data from an extensive network of commercial microwave links, for the first time, which will allow users to test their own algorithms and compare their results with ours. The package consists of modular functions, which facilitates running only part of the algorithm. The main processings steps are: 1) Preprocessing of link data (initial quality and consistency checks); 2) Wet-dry classification using link data; 3) Reference signal determination; 4) Removal of outliers ; 5) Correction of received signal powers; 6) Computation of mean path-averaged rainfall intensities; 7) Interpolation of rainfall intensities ; 8) Rainfall map visualisation. Some applications of RAINLINK will be shown based on microwave link data from a temperate climate (the Netherlands), and from a subtropical climate (Brazil). We hope that RAINLINK will promote the application of rainfall monitoring using microwave links in poorly gauged regions around the world. We invite researchers to contribute to RAINLINK to make the code more generally applicable to data from different networks and climates.
Study of Variation in Dose Calculation Accuracy Between kV Cone-Beam Computed Tomography and kV fan-Beam Computed Tomography

PubMed Central

Kaliyaperumal, Venkatesan; Raphael, C. Jomon; Varghese, K. Mathew; Gopu, Paul; Sivakumar, S.; Boban, Minu; Raj, N. Arunai Nambi; Senthilnathan, K.; Babu, P. Ramesh

2017-01-01

Cone-beam computed tomography (CBCT) images are presently used for geometric verification for daily patient positioning. In this work, we have compared the images of CBCT with the images of conventional fan beam CT (FBCT) in terms of image quality and Hounsfield units (HUs). We also compared the dose calculated using CBCT with that of FBCT. Homogenous RW3 plates and Catphan phantom were scanned by FBCT and CBCT. In RW3 and Catphan phantom, percentage depth dose (PDD), profiles, isodose distributions (for intensity modulated radiotherapy plans), and calculated dose volume histograms were compared. The HU difference was within ± 20 HU (central region) and ± 30 HU (peripheral region) for homogeneous RW3 plates. In the Catphan phantom, the difference in HU was ± 20 HU in the central area and peripheral areas. The HU differences were within ± 30 HU for all HU ranges starting from −1000 to 990 in phantom and patient images. In treatment plans done with simple symmetric and asymmetric fields, dose difference (DD) between CBCT plan and FBCT plan was within 1.2% for both phantoms. In intensity modulated radiotherapy (IMRT) treatment plans, for different target volumes, the difference was <2%. This feasibility study investigated HU variation and dose calculation accuracy between FBCT and CBCT based planning and has validated inverse planning algorithms with CBCT. In our study, we observed a larger deviation of HU values in the peripheral region compared to the central region. This is due to the ring artifact and scatter contribution which may prevent the use of CBCT as the primary imaging modality for radiotherapy treatment planning. The reconstruction algorithm needs to be modified further for improving the image quality and accuracy in HU values. However, our study with TG-119 and intensity modulated radiotherapy test targets shows that CBCT can be used for adaptive replanning as the recalculation of dose with the anisotropic analytical algorithm is in full accord with conventional planning CT except in the build-up regions. Patient images with CBCT have to be carefully analyzed for any artifacts before using them for such dose calculations. PMID:28974864
Performance analysis of parallel branch and bound search with the hypercube architecture

NASA Technical Reports Server (NTRS)

Mraz, Richard T.

1987-01-01

With the availability of commercial parallel computers, researchers are examining new classes of problems which might benefit from parallel computing. This paper presents results of an investigation of the class of search intensive problems. The specific problem discussed is the Least-Cost Branch and Bound search method of deadline job scheduling. The object-oriented design methodology was used to map the problem into a parallel solution. While the initial design was good for a prototype, the best performance resulted from fine-tuning the algorithm for a specific computer. The experiments analyze the computation time, the speed up over a VAX 11/785, and the load balance of the problem when using loosely coupled multiprocessor system based on the hypercube architecture.

Wavefront sensorless adaptive optics OCT with the DONE algorithm for in vivo human retinal imaging [Invited

PubMed Central

Verstraete, Hans R. G. W.; Heisler, Morgan; Ju, Myeong Jin; Wahl, Daniel; Bliek, Laurens; Kalkman, Jeroen; Bonora, Stefano; Jian, Yifan; Verhaegen, Michel; Sarunic, Marinko V.

2017-01-01

In this report, which is an international collaboration of OCT, adaptive optics, and control research, we demonstrate the Data-based Online Nonlinear Extremum-seeker (DONE) algorithm to guide the image based optimization for wavefront sensorless adaptive optics (WFSL-AO) OCT for in vivo human retinal imaging. The ocular aberrations were corrected using a multi-actuator adaptive lens after linearization of the hysteresis in the piezoelectric actuators. The DONE algorithm succeeded in drastically improving image quality and the OCT signal intensity, up to a factor seven, while achieving a computational time of 1 ms per iteration, making it applicable for many high speed applications. We demonstrate the correction of five aberrations using 70 iterations of the DONE algorithm performed over 2.8 s of continuous volumetric OCT acquisition. Data acquired from an imaging phantom and in vivo from human research volunteers are presented. PMID:28736670
Wavefront sensorless adaptive optics OCT with the DONE algorithm for in vivo human retinal imaging [Invited].

PubMed

Verstraete, Hans R G W; Heisler, Morgan; Ju, Myeong Jin; Wahl, Daniel; Bliek, Laurens; Kalkman, Jeroen; Bonora, Stefano; Jian, Yifan; Verhaegen, Michel; Sarunic, Marinko V

2017-04-01

In this report, which is an international collaboration of OCT, adaptive optics, and control research, we demonstrate the Data-based Online Nonlinear Extremum-seeker (DONE) algorithm to guide the image based optimization for wavefront sensorless adaptive optics (WFSL-AO) OCT for in vivo human retinal imaging. The ocular aberrations were corrected using a multi-actuator adaptive lens after linearization of the hysteresis in the piezoelectric actuators. The DONE algorithm succeeded in drastically improving image quality and the OCT signal intensity, up to a factor seven, while achieving a computational time of 1 ms per iteration, making it applicable for many high speed applications. We demonstrate the correction of five aberrations using 70 iterations of the DONE algorithm performed over 2.8 s of continuous volumetric OCT acquisition. Data acquired from an imaging phantom and in vivo from human research volunteers are presented.
Discovering and understanding oncogenic gene fusions through data intensive computational approaches

PubMed Central

Latysheva, Natasha S.; Babu, M. Madan

2016-01-01

Abstract Although gene fusions have been recognized as important drivers of cancer for decades, our understanding of the prevalence and function of gene fusions has been revolutionized by the rise of next-generation sequencing, advances in bioinformatics theory and an increasing capacity for large-scale computational biology. The computational work on gene fusions has been vastly diverse, and the present state of the literature is fragmented. It will be fruitful to merge three camps of gene fusion bioinformatics that appear to rarely cross over: (i) data-intensive computational work characterizing the molecular biology of gene fusions; (ii) development research on fusion detection tools, candidate fusion prioritization algorithms and dedicated fusion databases and (iii) clinical research that seeks to either therapeutically target fusion transcripts and proteins or leverages advances in detection tools to perform large-scale surveys of gene fusion landscapes in specific cancer types. In this review, we unify these different—yet highly complementary and symbiotic—approaches with the view that increased synergy will catalyze advancements in gene fusion identification, characterization and significance evaluation. PMID:27105842
The correlation study of parallel feature extractor and noise reduction approaches

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dewi, Deshinta Arrova; Sundararajan, Elankovan; Prabuwono, Anton Satria

2015-05-15

This paper presents literature reviews that show variety of techniques to develop parallel feature extractor and finding its correlation with noise reduction approaches for low light intensity images. Low light intensity images are normally displayed as darker images and low contrast. Without proper handling techniques, those images regularly become evidences of misperception of objects and textures, the incapability to section them. The visual illusions regularly clues to disorientation, user fatigue, poor detection and classification performance of humans and computer algorithms. Noise reduction approaches (NR) therefore is an essential step for other image processing steps such as edge detection, image segmentation,more » image compression, etc. Parallel Feature Extractor (PFE) meant to capture visual contents of images involves partitioning images into segments, detecting image overlaps if any, and controlling distributed and redistributed segments to extract the features. Working on low light intensity images make the PFE face challenges and closely depend on the quality of its pre-processing steps. Some papers have suggested many well established NR as well as PFE strategies however only few resources have suggested or mentioned the correlation between them. This paper reviews best approaches of the NR and the PFE with detailed explanation on the suggested correlation. This finding may suggest relevant strategies of the PFE development. With the help of knowledge based reasoning, computational approaches and algorithms, we present the correlation study between the NR and the PFE that can be useful for the development and enhancement of other existing PFE.« less
New correction procedures for the fast field program which extend its range

NASA Technical Reports Server (NTRS)

West, M.; Sack, R. A.

1990-01-01

A fast field program (FFP) algorithm was developed based on the method of Lee et al., for the prediction of sound pressure level from low frequency, high intensity sources. In order to permit accurate predictions at distances greater than 2 km, new correction procedures have had to be included in the algorithm. Certain functions, whose Hankel transforms can be determined analytically, are subtracted from the depth dependent Green's function. The distance response is then obtained as the sum of these transforms and the Fast Fourier Transformation (FFT) of the residual k dependent function. One procedure, which permits the elimination of most complex exponentials, has allowed significant changes in the structure of the FFP algorithm, which has resulted in a substantial reduction in computation time.
A CT and MRI scan to MCNP input conversion program.

PubMed

Van Riper, Kenneth A

2005-01-01

We describe a new program to read a sequence of tomographic scans and prepare the geometry and material sections of an MCNP input file. Image processing techniques include contrast controls and mapping of grey scales to colour. The user interface provides several tools with which the user can associate a range of image intensities to an MCNP material. Materials are loaded from a library. A separate material assignment can be made to a pixel intensity or range of intensities when that intensity dominates the image boundaries; this material is assigned to all pixels with that intensity contiguous with the boundary. Material fractions are computed in a user-specified voxel grid overlaying the scans. New materials are defined by mixing the library materials using the fractions. The geometry can be written as an MCNP lattice or as individual cells. A combination algorithm can be used to join neighbouring cells with the same material.
Demons deformable registration of CT and cone-beam CT using an iterative intensity matching approach.

PubMed

Nithiananthan, Sajendra; Schafer, Sebastian; Uneri, Ali; Mirota, Daniel J; Stayman, J Webster; Zbijewski, Wojciech; Brock, Kristy K; Daly, Michael J; Chan, Harley; Irish, Jonathan C; Siewerdsen, Jeffrey H

2011-04-01

A method of intensity-based deformable registration of CT and cone-beam CT (CBCT) images is described, in which intensity correction occurs simultaneously within the iterative registration process. The method preserves the speed and simplicity of the popular Demons algorithm while providing robustness and accuracy in the presence of large mismatch between CT and CBCT voxel values ("intensity"). A variant of the Demons algorithm was developed in which an estimate of the relationship between CT and CBCT intensity values for specific materials in the image is computed at each iteration based on the set of currently overlapping voxels. This tissue-specific intensity correction is then used to estimate the registration output for that iteration and the process is repeated. The robustness of the method was tested in CBCT images of a cadaveric head exhibiting a broad range of simulated intensity variations associated with x-ray scatter, object truncation, and/or errors in the reconstruction algorithm. The accuracy of CT-CBCT registration was also measured in six real cases, exhibiting deformations ranging from simple to complex during surgery or radiotherapy guided by a CBCT-capable C-arm or linear accelerator, respectively. The iterative intensity matching approach was robust against all levels of intensity variation examined, including spatially varying errors in voxel value of a factor of 2 or more, as can be encountered in cases of high x-ray scatter. Registration accuracy without intensity matching degraded severely with increasing magnitude of intensity error and introduced image distortion. A single histogram match performed prior to registration alleviated some of these effects but was also prone to image distortion and was quantifiably less robust and accurate than the iterative approach. Within the six case registration accuracy study, iterative intensity matching Demons reduced mean TRE to (2.5 +/- 2.8) mm compared to (3.5 +/- 3.0) mm with rigid registration. A method was developed to iteratively correct CT-CBCT intensity disparity during Demons registration, enabling fast, intensity-based registration in CBCT-guided procedures such as surgery and radiotherapy, in which CBCT voxel values may be inaccurate. Accurate CT-CBCT registration in turn facilitates registration of multimodality preoperative image and planning data to intraoperative CBCT by way of the preoperative CT, thereby linking the intraoperative frame of reference to a wealth of preoperative information that could improve interventional guidance.
Spatiotemporal Domain Decomposition for Massive Parallel Computation of Space-Time Kernel Density

NASA Astrophysics Data System (ADS)

Hohl, A.; Delmelle, E. M.; Tang, W.

2015-07-01

Accelerated processing capabilities are deemed critical when conducting analysis on spatiotemporal datasets of increasing size, diversity and availability. High-performance parallel computing offers the capacity to solve computationally demanding problems in a limited timeframe, but likewise poses the challenge of preventing processing inefficiency due to workload imbalance between computing resources. Therefore, when designing new algorithms capable of implementing parallel strategies, careful spatiotemporal domain decomposition is necessary to account for heterogeneity in the data. In this study, we perform octtree-based adaptive decomposition of the spatiotemporal domain for parallel computation of space-time kernel density. In order to avoid edge effects near subdomain boundaries, we establish spatiotemporal buffers to include adjacent data-points that are within the spatial and temporal kernel bandwidths. Then, we quantify computational intensity of each subdomain to balance workloads among processors. We illustrate the benefits of our methodology using a space-time epidemiological dataset of Dengue fever, an infectious vector-borne disease that poses a severe threat to communities in tropical climates. Our parallel implementation of kernel density reaches substantial speedup compared to sequential processing, and achieves high levels of workload balance among processors due to great accuracy in quantifying computational intensity. Our approach is portable of other space-time analytical tests.
Particle-in-cell simulations on graphic processing units

NASA Astrophysics Data System (ADS)

Ren, C.; Zhou, X.; Li, J.; Huang, M. C.; Zhao, Y.

2014-10-01

We will show our recent progress in using GPU's to accelerate the PIC code OSIRIS [Fonseca et al. LNCS 2331, 342 (2002)]. The OISRIS parallel structure is retained and the computation-intensive kernels are shipped to GPU's. Algorithms for the kernels are adapted for the GPU, including high-order charge-conserving current deposition schemes with few branching and parallel particle sorting [Kong et al., JCP 230, 1676 (2011)]. These algorithms make efficient use of the GPU shared memory. This work was supported by U.S. Department of Energy under Grant No. DE-FC02-04ER54789 and by NSF under Grant No. PHY-1314734.
Low-level processing for real-time image analysis

NASA Technical Reports Server (NTRS)

Eskenazi, R.; Wilf, J. M.

1979-01-01

A system that detects object outlines in television images in real time is described. A high-speed pipeline processor transforms the raw image into an edge map and a microprocessor, which is integrated into the system, clusters the edges, and represents them as chain codes. Image statistics, useful for higher level tasks such as pattern recognition, are computed by the microprocessor. Peak intensity and peak gradient values are extracted within a programmable window and are used for iris and focus control. The algorithms implemented in hardware and the pipeline processor architecture are described. The strategy for partitioning functions in the pipeline was chosen to make the implementation modular. The microprocessor interface allows flexible and adaptive control of the feature extraction process. The software algorithms for clustering edge segments, creating chain codes, and computing image statistics are also discussed. A strategy for real time image analysis that uses this system is given.
Combination of the discontinuous Galerkin method with finite differences for simulation of seismic wave propagation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lisitsa, Vadim, E-mail: lisitsavv@ipgg.sbras.ru; Novosibirsk State University, Novosibirsk; Tcheverda, Vladimir

We present an algorithm for the numerical simulation of seismic wave propagation in models with a complex near surface part and free surface topography. The approach is based on the combination of finite differences with the discontinuous Galerkin method. The discontinuous Galerkin method can be used on polyhedral meshes; thus, it is easy to handle the complex surfaces in the models. However, this approach is computationally intense in comparison with finite differences. Finite differences are computationally efficient, but in general, they require rectangular grids, leading to the stair-step approximation of the interfaces, which causes strong diffraction of the wavefield. Inmore » this research we present a hybrid algorithm where the discontinuous Galerkin method is used in a relatively small upper part of the model and finite differences are applied to the main part of the model.« less
An efficient direct method for image registration of flat objects

NASA Astrophysics Data System (ADS)

Nikolaev, Dmitry; Tihonkih, Dmitrii; Makovetskii, Artyom; Voronin, Sergei

2017-09-01

Image alignment of rigid surfaces is a rapidly developing area of research and has many practical applications. Alignment methods can be roughly divided into two types: feature-based methods and direct methods. Known SURF and SIFT algorithms are examples of the feature-based methods. Direct methods refer to those that exploit the pixel intensities without resorting to image features and image-based deformations are general direct method to align images of deformable objects in 3D space. Nevertheless, it is not good for the registration of images of 3D rigid objects since the underlying structure cannot be directly evaluated. In the article, we propose a model that is suitable for image alignment of rigid flat objects under various illumination models. The brightness consistency assumptions used for reconstruction of optimal geometrical transformation. Computer simulation results are provided to illustrate the performance of the proposed algorithm for computing of an accordance between pixels of two images.
Load management strategy for Particle-In-Cell simulations in high energy particle acceleration

NASA Astrophysics Data System (ADS)

Beck, A.; Frederiksen, J. T.; Dérouillat, J.

2016-09-01

In the wake of the intense effort made for the experimental CILEX project, numerical simulation campaigns have been carried out in order to finalize the design of the facility and to identify optimal laser and plasma parameters. These simulations bring, of course, important insight into the fundamental physics at play. As a by-product, they also characterize the quality of our theoretical and numerical models. In this paper, we compare the results given by different codes and point out algorithmic limitations both in terms of physical accuracy and computational performances. These limitations are illustrated in the context of electron laser wakefield acceleration (LWFA). The main limitation we identify in state-of-the-art Particle-In-Cell (PIC) codes is computational load imbalance. We propose an innovative algorithm to deal with this specific issue as well as milestones towards a modern, accurate high-performance PIC code for high energy particle acceleration.
Sub-Selective Quantization for Learning Binary Codes in Large-Scale Image Search.

PubMed

Li, Yeqing; Liu, Wei; Huang, Junzhou

2018-06-01

Recently with the explosive growth of visual content on the Internet, large-scale image search has attracted intensive attention. It has been shown that mapping high-dimensional image descriptors to compact binary codes can lead to considerable efficiency gains in both storage and performing similarity computation of images. However, most existing methods still suffer from expensive training devoted to large-scale binary code learning. To address this issue, we propose a sub-selection based matrix manipulation algorithm, which can significantly reduce the computational cost of code learning. As case studies, we apply the sub-selection algorithm to several popular quantization techniques including cases using linear and nonlinear mappings. Crucially, we can justify the resulting sub-selective quantization by proving its theoretic properties. Extensive experiments are carried out on three image benchmarks with up to one million samples, corroborating the efficacy of the sub-selective quantization method in terms of image retrieval.
Mean curvature and texture constrained composite weighted random walk algorithm for optic disc segmentation towards glaucoma screening.

PubMed

Panda, Rashmi; Puhan, N B; Panda, Ganapati

2018-02-01

Accurate optic disc (OD) segmentation is an important step in obtaining cup-to-disc ratio-based glaucoma screening using fundus imaging. It is a challenging task because of the subtle OD boundary, blood vessel occlusion and intensity inhomogeneity. In this Letter, the authors propose an improved version of the random walk algorithm for OD segmentation to tackle such challenges. The algorithm incorporates the mean curvature and Gabor texture energy features to define the new composite weight function to compute the edge weights. Unlike the deformable model-based OD segmentation techniques, the proposed algorithm remains unaffected by curve initialisation and local energy minima problem. The effectiveness of the proposed method is verified with DRIVE, DIARETDB1, DRISHTI-GS and MESSIDOR database images using the performance measures such as mean absolute distance, overlapping ratio, dice coefficient, sensitivity, specificity and precision. The obtained OD segmentation results and quantitative performance measures show robustness and superiority of the proposed algorithm in handling the complex challenges in OD segmentation.
Generation and assessment of turntable SAR data for the support of ATR development

NASA Astrophysics Data System (ADS)

Cohen, Marvin N.; Showman, Gregory A.; Sangston, K. James; Sylvester, Vincent B.; Gostin, Lamar; Scheer, C. Ruby

1998-10-01

Inverse synthetic aperture radar (ISAR) imaging on a turntable-tower test range permits convenient generation of high resolution two-dimensional images of radar targets under controlled conditions for testing SAR image processing and for supporting automatic target recognition (ATR) algorithm development. However, turntable ISAR images are often obtained under near-field geometries and hence may suffer geometric distortions not present in airborne SAR images. In this paper, turntable data collected at Georgia Tech's Electromagnetic Test Facility are used to begin to assess the utility of two- dimensional ISAR imaging algorithms in forming images to support ATR development. The imaging algorithms considered include a simple 2D discrete Fourier transform (DFT), a 2-D DFT with geometric correction based on image domain resampling, and a computationally-intensive geometric matched filter solution. Images formed with the various algorithms are used to develop ATR templates, which are then compared with an eye toward utilization in an ATR algorithm.
Emergence of a snake-like structure in mobile distributed agents: an exploratory agent-based modeling approach.

PubMed

Niazi, Muaz A

2014-01-01

The body structure of snakes is composed of numerous natural components thereby making it resilient, flexible, adaptive, and dynamic. In contrast, current computer animations as well as physical implementations of snake-like autonomous structures are typically designed to use either a single or a relatively smaller number of components. As a result, not only these artificial structures are constrained by the dimensions of the constituent components but often also require relatively more computationally intensive algorithms to model and animate. Still, these animations often lack life-like resilience and adaptation. This paper presents a solution to the problem of modeling snake-like structures by proposing an agent-based, self-organizing algorithm resulting in an emergent and surprisingly resilient dynamic structure involving a minimal of interagent communication. Extensive simulation experiments demonstrate the effectiveness as well as resilience of the proposed approach. The ideas originating from the proposed algorithm can not only be used for developing self-organizing animations but can also have practical applications such as in the form of complex, autonomous, evolvable robots with self-organizing, mobile components with minimal individual computational capabilities. The work also demonstrates the utility of exploratory agent-based modeling (EABM) in the engineering of artificial life-like complex adaptive systems.
Emergence of a Snake-Like Structure in Mobile Distributed Agents: An Exploratory Agent-Based Modeling Approach

PubMed Central

Niazi, Muaz A.

2014-01-01

The body structure of snakes is composed of numerous natural components thereby making it resilient, flexible, adaptive, and dynamic. In contrast, current computer animations as well as physical implementations of snake-like autonomous structures are typically designed to use either a single or a relatively smaller number of components. As a result, not only these artificial structures are constrained by the dimensions of the constituent components but often also require relatively more computationally intensive algorithms to model and animate. Still, these animations often lack life-like resilience and adaptation. This paper presents a solution to the problem of modeling snake-like structures by proposing an agent-based, self-organizing algorithm resulting in an emergent and surprisingly resilient dynamic structure involving a minimal of interagent communication. Extensive simulation experiments demonstrate the effectiveness as well as resilience of the proposed approach. The ideas originating from the proposed algorithm can not only be used for developing self-organizing animations but can also have practical applications such as in the form of complex, autonomous, evolvable robots with self-organizing, mobile components with minimal individual computational capabilities. The work also demonstrates the utility of exploratory agent-based modeling (EABM) in the engineering of artificial life-like complex adaptive systems. PMID:24701135
Exact Synthesis of Reversible Circuits Using A* Algorithm

NASA Astrophysics Data System (ADS)

Datta, K.; Rathi, G. K.; Sengupta, I.; Rahaman, H.

2015-06-01

With the growing emphasis on low-power design methodologies, and the result that theoretical zero power dissipation is possible only if computations are information lossless, design and synthesis of reversible logic circuits have become very important in recent years. Reversible logic circuits are also important in the context of quantum computing, where the basic operations are reversible in nature. Several synthesis methodologies for reversible circuits have been reported. Some of these methods are termed as exact, where the motivation is to get the minimum-gate realization for a given reversible function. These methods are computationally very intensive, and are able to synthesize only very small functions. There are other methods based on function transformations or higher-level representation of functions like binary decision diagrams or exclusive-or sum-of-products, that are able to handle much larger circuits without any guarantee of optimality or near-optimality. Design of exact synthesis algorithms is interesting in this context, because they set some kind of benchmarks against which other methods can be compared. This paper proposes an exact synthesis approach based on an iterative deepening version of the A* algorithm using the multiple-control Toffoli gate library. Experimental results are presented with comparisons with other exact and some heuristic based synthesis approaches.
A Direct Position-Determination Approach for Multiple Sources Based on Neural Network Computation.

PubMed

Chen, Xin; Wang, Ding; Yin, Jiexin; Wu, Ying

2018-06-13

The most widely used localization technology is the two-step method that localizes transmitters by measuring one or more specified positioning parameters. Direct position determination (DPD) is a promising technique that directly localizes transmitters from sensor outputs and can offer superior localization performance. However, existing DPD algorithms such as maximum likelihood (ML)-based and multiple signal classification (MUSIC)-based estimations are computationally expensive, making it difficult to satisfy real-time demands. To solve this problem, we propose the use of a modular neural network for multiple-source DPD. In this method, the area of interest is divided into multiple sub-areas. Multilayer perceptron (MLP) neural networks are employed to detect the presence of a source in a sub-area and filter sources in other sub-areas, and radial basis function (RBF) neural networks are utilized for position estimation. Simulation results show that a number of appropriately trained neural networks can be successfully used for DPD. The performance of the proposed MLP-MLP-RBF method is comparable to the performance of the conventional MUSIC-based DPD algorithm for various signal-to-noise ratios and signal power ratios. Furthermore, the MLP-MLP-RBF network is less computationally intensive than the classical DPD algorithm and is therefore an attractive choice for real-time applications.

Complex Spiral Structure in the HD 100546 Transitional Disk as Revealed by GPI and MagAO

DOE Office of Scientific and Technical Information (OSTI.GOV)

Follette, Katherine B.; Macintosh, Bruce; Mullen, Wyatt

We present optical and near-infrared high-contrast images of the transitional disk HD 100546 taken with the Magellan Adaptive Optics system (MagAO) and the Gemini Planet Imager (GPI). GPI data include both polarized intensity and total intensity imagery, and MagAO data are taken in Simultaneous Differential Imaging mode at H α . The new GPI H -band total intensity data represent a significant enhancement in sensitivity and field rotation compared to previous data sets and enable a detailed exploration of substructure in the disk. The data are processed with a variety of differential imaging techniques (polarized, angular, reference, and simultaneous differentialmore » imaging) in an attempt to identify the disk structures that are most consistent across wavelengths, processing techniques, and algorithmic parameters. The inner disk cavity at 15 au is clearly resolved in multiple data sets, as are a variety of spiral features. While the cavity and spiral structures are identified at levels significantly distinct from the neighboring regions of the disk under several algorithms and with a range of algorithmic parameters, emission at the location of HD 100546 “ c ” varies from point-like under aggressive algorithmic parameters to a smooth continuous structure with conservative parameters, and is consistent with disk emission. Features identified in the HD 100546 disk bear qualitative similarity to computational models of a moderately inclined two-armed spiral disk, where projection effects and wrapping of the spiral arms around the star result in a number of truncated spiral features in forward-modeled images.« less
Optical image encryption via high-quality computational ghost imaging using iterative phase retrieval

NASA Astrophysics Data System (ADS)

Liansheng, Sui; Yin, Cheng; Bing, Li; Ailing, Tian; Krishna Asundi, Anand

2018-07-01

A novel computational ghost imaging scheme based on specially designed phase-only masks, which can be efficiently applied to encrypt an original image into a series of measured intensities, is proposed in this paper. First, a Hadamard matrix with a certain order is generated, where the number of elements in each row is equal to the size of the original image to be encrypted. Each row of the matrix is rearranged into the corresponding 2D pattern. Then, each pattern is encoded into the phase-only masks by making use of an iterative phase retrieval algorithm. These specially designed masks can be wholly or partially used in the process of computational ghost imaging to reconstruct the original information with high quality. When a significantly small number of phase-only masks are used to record the measured intensities in a single-pixel bucket detector, the information can be authenticated without clear visualization by calculating the nonlinear correlation map between the original image and its reconstruction. The results illustrate the feasibility and effectiveness of the proposed computational ghost imaging mechanism, which will provide an effective alternative for enriching the related research on the computational ghost imaging technique.
A parallel-processing approach to computing for the geographic sciences

USGS Publications Warehouse

Crane, Michael; Steinwand, Dan; Beckmann, Tim; Krpan, Greg; Haga, Jim; Maddox, Brian; Feller, Mark

2001-01-01

The overarching goal of this project is to build a spatially distributed infrastructure for information science research by forming a team of information science researchers and providing them with similar hardware and software tools to perform collaborative research. Four geographically distributed Centers of the U.S. Geological Survey (USGS) are developing their own clusters of low-cost personal computers into parallel computing environments that provide a costeffective way for the USGS to increase participation in the high-performance computing community. Referred to as Beowulf clusters, these hybrid systems provide the robust computing power required for conducting research into various areas, such as advanced computer architecture, algorithms to meet the processing needs for real-time image and data processing, the creation of custom datasets from seamless source data, rapid turn-around of products for emergency response, and support for computationally intense spatial and temporal modeling.
Antenna analysis using neural networks

NASA Technical Reports Server (NTRS)

Smith, William T.

1992-01-01

Conventional computing schemes have long been used to analyze problems in electromagnetics (EM). The vast majority of EM applications require computationally intensive algorithms involving numerical integration and solutions to large systems of equations. The feasibility of using neural network computing algorithms for antenna analysis is investigated. The ultimate goal is to use a trained neural network algorithm to reduce the computational demands of existing reflector surface error compensation techniques. Neural networks are computational algorithms based on neurobiological systems. Neural nets consist of massively parallel interconnected nonlinear computational elements. They are often employed in pattern recognition and image processing problems. Recently, neural network analysis has been applied in the electromagnetics area for the design of frequency selective surfaces and beam forming networks. The backpropagation training algorithm was employed to simulate classical antenna array synthesis techniques. The Woodward-Lawson (W-L) and Dolph-Chebyshev (D-C) array pattern synthesis techniques were used to train the neural network. The inputs to the network were samples of the desired synthesis pattern. The outputs are the array element excitations required to synthesize the desired pattern. Once trained, the network is used to simulate the W-L or D-C techniques. Various sector patterns and cosecant-type patterns (27 total) generated using W-L synthesis were used to train the network. Desired pattern samples were then fed to the neural network. The outputs of the network were the simulated W-L excitations. A 20 element linear array was used. There were 41 input pattern samples with 40 output excitations (20 real parts, 20 imaginary). A comparison between the simulated and actual W-L techniques is shown for a triangular-shaped pattern. Dolph-Chebyshev is a different class of synthesis technique in that D-C is used for side lobe control as opposed to pattern shaping. The interesting thing about D-C synthesis is that the side lobes have the same amplitude. Five-element arrays were used. Again, 41 pattern samples were used for the input. Nine actual D-C patterns ranging from -10 dB to -30 dB side lobe levels were used to train the network. A comparison between simulated and actual D-C techniques for a pattern with -22 dB side lobe level is shown. The goal for this research was to evaluate the performance of neural network computing with antennas. Future applications will employ the backpropagation training algorithm to drastically reduce the computational complexity involved in performing EM compensation for surface errors in large space reflector antennas.
Antenna analysis using neural networks

NASA Astrophysics Data System (ADS)

Smith, William T.

1992-09-01

Conventional computing schemes have long been used to analyze problems in electromagnetics (EM). The vast majority of EM applications require computationally intensive algorithms involving numerical integration and solutions to large systems of equations. The feasibility of using neural network computing algorithms for antenna analysis is investigated. The ultimate goal is to use a trained neural network algorithm to reduce the computational demands of existing reflector surface error compensation techniques. Neural networks are computational algorithms based on neurobiological systems. Neural nets consist of massively parallel interconnected nonlinear computational elements. They are often employed in pattern recognition and image processing problems. Recently, neural network analysis has been applied in the electromagnetics area for the design of frequency selective surfaces and beam forming networks. The backpropagation training algorithm was employed to simulate classical antenna array synthesis techniques. The Woodward-Lawson (W-L) and Dolph-Chebyshev (D-C) array pattern synthesis techniques were used to train the neural network. The inputs to the network were samples of the desired synthesis pattern. The outputs are the array element excitations required to synthesize the desired pattern. Once trained, the network is used to simulate the W-L or D-C techniques. Various sector patterns and cosecant-type patterns (27 total) generated using W-L synthesis were used to train the network. Desired pattern samples were then fed to the neural network. The outputs of the network were the simulated W-L excitations. A 20 element linear array was used. There were 41 input pattern samples with 40 output excitations (20 real parts, 20 imaginary).

Processor core for real time background identification of HD video based on OpenCV Gaussian mixture model algorithm

NASA Astrophysics Data System (ADS)

Genovese, Mariangela; Napoli, Ettore

2013-05-01

The identification of moving objects is a fundamental step in computer vision processing chains. The development of low cost and lightweight smart cameras steadily increases the request of efficient and high performance circuits able to process high definition video in real time. The paper proposes two processor cores aimed to perform the real time background identification on High Definition (HD, 1920 1080 pixel) video streams. The implemented algorithm is the OpenCV version of the Gaussian Mixture Model (GMM), an high performance probabilistic algorithm for the segmentation of the background that is however computationally intensive and impossible to implement on general purpose CPU with the constraint of real time processing. In the proposed paper, the equations of the OpenCV GMM algorithm are optimized in such a way that a lightweight and low power implementation of the algorithm is obtained. The reported performances are also the result of the use of state of the art truncated binary multipliers and ROM compression techniques for the implementation of the non-linear functions. The first circuit has commercial FPGA devices as a target and provides speed and logic resource occupation that overcome previously proposed implementations. The second circuit is oriented to an ASIC (UMC-90nm) standard cell implementation. Both implementations are able to process more than 60 frames per second in 1080p format, a frame rate compatible with HD television.
Iterative pixelwise approach applied to computer-generated holograms and diffractive optical elements.

PubMed

Hsu, Wei-Feng; Lin, Shih-Chih

2018-01-01

This paper presents a novel approach to optimizing the design of phase-only computer-generated holograms (CGH) for the creation of binary images in an optical Fourier transform system. Optimization begins by selecting an image pixel with a temporal change in amplitude. The modulated image function undergoes an inverse Fourier transform followed by the imposition of a CGH constraint and the Fourier transform to yield an image function associated with the change in amplitude of the selected pixel. In iterations where the quality of the image is improved, that image function is adopted as the input for the next iteration. In cases where the image quality is not improved, the image function before the pixel changed is used as the input. Thus, the proposed approach is referred to as the pixelwise hybrid input-output (PHIO) algorithm. The PHIO algorithm was shown to achieve image quality far exceeding that of the Gerchberg-Saxton (GS) algorithm. The benefits were particularly evident when the PHIO algorithm was equipped with a dynamic range of image intensities equivalent to the amplitude freedom of the image signal. The signal variation of images reconstructed from the GS algorithm was 1.0223, but only 0.2537 when using PHIO, i.e., a 75% improvement. Nonetheless, the proposed scheme resulted in a 10% degradation in diffraction efficiency and signal-to-noise ratio.
FPGA implementation of image dehazing algorithm for real time applications

NASA Astrophysics Data System (ADS)

Kumar, Rahul; Kaushik, Brajesh Kumar; Balasubramanian, R.

2017-09-01

Weather degradation such as haze, fog, mist, etc. severely reduces the effective range of visual surveillance. This degradation is a spatially varying phenomena, which makes this problem non trivial. Dehazing is an essential preprocessing stage in applications such as long range imaging, border security, intelligent transportation system, etc. However, these applications require low latency of the preprocessing block. In this work, single image dark channel prior algorithm is modified and implemented for fast processing with comparable visual quality of the restored image/video. Although conventional single image dark channel prior algorithm is computationally expensive, it yields impressive results. Moreover, a two stage image dehazing architecture is introduced, wherein, dark channel and airlight are estimated in the first stage. Whereas, transmission map and intensity restoration are computed in the next stages. The algorithm is implemented using Xilinx Vivado software and validated by using Xilinx zc702 development board, which contains an Artix7 equivalent Field Programmable Gate Array (FPGA) and ARM Cortex A9 dual core processor. Additionally, high definition multimedia interface (HDMI) has been incorporated for video feed and display purposes. The results show that the dehazing algorithm attains 29 frames per second for the image resolution of 1920x1080 which is suitable of real time applications. The design utilizes 9 18K_BRAM, 97 DSP_48, 6508 FFs and 8159 LUTs.
Recursive least-squares learning algorithms for neural networks

NASA Astrophysics Data System (ADS)

Lewis, Paul S.; Hwang, Jenq N.

1990-11-01

This paper presents the development of a pair of recursive least squares (ItLS) algorithms for online training of multilayer perceptrons which are a class of feedforward artificial neural networks. These algorithms incorporate second order information about the training error surface in order to achieve faster learning rates than are possible using first order gradient descent algorithms such as the generalized delta rule. A least squares formulation is derived from a linearization of the training error function. Individual training pattern errors are linearized about the network parameters that were in effect when the pattern was presented. This permits the recursive solution of the least squares approximation either via conventional RLS recursions or by recursive QR decomposition-based techniques. The computational complexity of the update is 0(N2) where N is the number of network parameters. This is due to the estimation of the N x N inverse Hessian matrix. Less computationally intensive approximations of the ilLS algorithms can be easily derived by using only block diagonal elements of this matrix thereby partitioning the learning into independent sets. A simulation example is presented in which a neural network is trained to approximate a two dimensional Gaussian bump. In this example RLS training required an order of magnitude fewer iterations on average (527) than did training with the generalized delta rule (6 1 BACKGROUND Artificial neural networks (ANNs) offer an interesting and potentially useful paradigm for signal processing and pattern recognition. The majority of ANN applications employ the feed-forward multilayer perceptron (MLP) network architecture in which network parameters are " trained" by a supervised learning algorithm employing the generalized delta rule (GDIt) [1 2]. The GDR algorithm approximates a fixed step steepest descent algorithm using derivatives computed by error backpropagatiori. The GDII algorithm is sometimes referred to as the backpropagation algorithm. However in this paper we will use the term backpropagation to refer only to the process of computing error derivatives. While multilayer perceptrons provide a very powerful nonlinear modeling capability GDR training can be very slow and inefficient. In linear adaptive filtering the analog of the GDR algorithm is the leastmean- squares (LMS) algorithm. Steepest descent-based algorithms such as GDR or LMS are first order because they use only first derivative or gradient information about the training error to be minimized. To speed up the training process second order algorithms may be employed that take advantage of second derivative or Hessian matrix information. Second order information can be incorporated into MLP training in different ways. In many applications especially in the area of pattern recognition the training set is finite. In these cases block learning can be applied using standard nonlinear optimization techniques [3 4 5].
Generic algorithms for high performance scalable geocomputing

NASA Astrophysics Data System (ADS)

de Jong, Kor; Schmitz, Oliver; Karssenberg, Derek

2016-04-01

During the last decade, the characteristics of computing hardware have changed a lot. For example, instead of a single general purpose CPU core, personal computers nowadays contain multiple cores per CPU and often general purpose accelerators, like GPUs. Additionally, compute nodes are often grouped together to form clusters or a supercomputer, providing enormous amounts of compute power. For existing earth simulation models to be able to use modern hardware platforms, their compute intensive parts must be rewritten. This can be a major undertaking and may involve many technical challenges. Compute tasks must be distributed over CPU cores, offloaded to hardware accelerators, or distributed to different compute nodes. And ideally, all of this should be done in such a way that the compute task scales well with the hardware resources. This presents two challenges: 1) how to make good use of all the compute resources and 2) how to make these compute resources available for developers of simulation models, who may not (want to) have the required technical background for distributing compute tasks. The first challenge requires the use of specialized technology (e.g.: threads, OpenMP, MPI, OpenCL, CUDA). The second challenge requires the abstraction of the logic handling the distribution of compute tasks from the model-specific logic, hiding the technical details from the model developer. To assist the model developer, we are developing a C++ software library (called Fern) containing algorithms that can use all CPU cores available in a single compute node (distributing tasks over multiple compute nodes will be done at a later stage). The algorithms are grid-based (finite difference) and include local and spatial operations such as convolution filters. The algorithms handle distribution of the compute tasks to CPU cores internally. In the resulting model the low-level details of how this is done is separated from the model-specific logic representing the modeled system. This contrasts with practices in which code for distributing of compute tasks is mixed with model-specific code, and results in a better maintainable model. For flexibility and efficiency, the algorithms are configurable at compile-time with the respect to the following aspects: data type, value type, no-data handling, input value domain handling, and output value range handling. This makes the algorithms usable in very different contexts, without the need for making intrusive changes to existing models when using them. Applications that benefit from using the Fern library include the construction of forward simulation models in (global) hydrology (e.g. PCR-GLOBWB (Van Beek et al. 2011)), ecology, geomorphology, or land use change (e.g. PLUC (Verstegen et al. 2014)) and manipulation of hyper-resolution land surface data such as digital elevation models and remote sensing data. Using the Fern library, we have also created an add-on to the PCRaster Python Framework (Karssenberg et al. 2010) allowing its users to speed up their spatio-temporal models, sometimes by changing just a single line of Python code in their model. In our presentation we will give an overview of the design of the algorithms, providing examples of different contexts where they can be used to replace existing sequential algorithms, including the PCRaster environmental modeling software (www.pcraster.eu). We will show how the algorithms can be configured to behave differently when necessary. References Karssenberg, D., Schmitz, O., Salamon, P., De Jong, K. and Bierkens, M.F.P., 2010, A software framework for construction of process-based stochastic spatio-temporal models and data assimilation. Environmental Modelling & Software, 25, pp. 489-502, Link. Best Paper Award 2010: Software and Decision Support. Van Beek, L. P. H., Y. Wada, and M. F. P. Bierkens. 2011. Global monthly water stress: 1. Water balance and water availability. Water Resources Research. 47. Verstegen, J. A., D. Karssenberg, F. van der Hilst, and A. P. C. Faaij. 2014. Identifying a land use change cellular automaton by Bayesian data assimilation. Environmental Modelling & Software 53:121-136.
Multi-atlas segmentation for abdominal organs with Gaussian mixture models

NASA Astrophysics Data System (ADS)

Burke, Ryan P.; Xu, Zhoubing; Lee, Christopher P.; Baucom, Rebeccah B.; Poulose, Benjamin K.; Abramson, Richard G.; Landman, Bennett A.

2015-03-01

Abdominal organ segmentation with clinically acquired computed tomography (CT) is drawing increasing interest in the medical imaging community. Gaussian mixture models (GMM) have been extensively used through medical segmentation, most notably in the brain for cerebrospinal fluid / gray matter / white matter differentiation. Because abdominal CT exhibit strong localized intensity characteristics, GMM have recently been incorporated in multi-stage abdominal segmentation algorithms. In the context of variable abdominal anatomy and rich algorithms, it is difficult to assess the marginal contribution of GMM. Herein, we characterize the efficacy of an a posteriori framework that integrates GMM of organ-wise intensity likelihood with spatial priors from multiple target-specific registered labels. In our study, we first manually labeled 100 CT images. Then, we assigned 40 images to use as training data for constructing target-specific spatial priors and intensity likelihoods. The remaining 60 images were evaluated as test targets for segmenting 12 abdominal organs. The overlap between the true and the automatic segmentations was measured by Dice similarity coefficient (DSC). A median improvement of 145% was achieved by integrating the GMM intensity likelihood against the specific spatial prior. The proposed framework opens the opportunities for abdominal organ segmentation by efficiently using both the spatial and appearance information from the atlases, and creates a benchmark for large-scale automatic abdominal segmentation.
Deterministic object tracking using Gaussian ringlet and directional edge features

NASA Astrophysics Data System (ADS)

Krieger, Evan W.; Sidike, Paheding; Aspiras, Theus; Asari, Vijayan K.

2017-10-01

Challenges currently existing for intensity-based histogram feature tracking methods in wide area motion imagery (WAMI) data include object structural information distortions, background variations, and object scale change. These issues are caused by different pavement or ground types and from changing the sensor or altitude. All of these challenges need to be overcome in order to have a robust object tracker, while attaining a computation time appropriate for real-time processing. To achieve this, we present a novel method, Directional Ringlet Intensity Feature Transform (DRIFT), which employs Kirsch kernel filtering for edge features and a ringlet feature mapping for rotational invariance. The method also includes an automatic scale change component to obtain accurate object boundaries and improvements for lowering computation times. We evaluated the DRIFT algorithm on two challenging WAMI datasets, namely Columbus Large Image Format (CLIF) and Large Area Image Recorder (LAIR), to evaluate its robustness and efficiency. Additional evaluations on general tracking video sequences are performed using the Visual Tracker Benchmark and Visual Object Tracking 2014 databases to demonstrate the algorithms ability with additional challenges in long complex sequences including scale change. Experimental results show that the proposed approach yields competitive results compared to state-of-the-art object tracking methods on the testing datasets.
In Situ Three-Dimensional Reciprocal-Space Mapping of Diffuse Scattering Intensity Distribution and Data Analysis for Precursor Phenomenon in Shape-Memory Alloy

NASA Astrophysics Data System (ADS)

Cheng, Tian-Le; Ma, Fengde D.; Zhou, Jie E.; Jennings, Guy; Ren, Yang; Jin, Yongmei M.; Wang, Yu U.

2012-01-01

Diffuse scattering contains rich information on various structural disorders, thus providing a useful means to study the nanoscale structural deviations from the average crystal structures determined by Bragg peak analysis. Extraction of maximal information from diffuse scattering requires concerted efforts in high-quality three-dimensional (3D) data measurement, quantitative data analysis and visualization, theoretical interpretation, and computer simulations. Such an endeavor is undertaken to study the correlated dynamic atomic position fluctuations caused by thermal vibrations (phonons) in precursor state of shape-memory alloys. High-quality 3D diffuse scattering intensity data around representative Bragg peaks are collected by using in situ high-energy synchrotron x-ray diffraction and two-dimensional digital x-ray detector (image plate). Computational algorithms and codes are developed to construct the 3D reciprocal-space map of diffuse scattering intensity distribution from the measured data, which are further visualized and quantitatively analyzed to reveal in situ physical behaviors. Diffuse scattering intensity distribution is explicitly formulated in terms of atomic position fluctuations to interpret the experimental observations and identify the most relevant physical mechanisms, which help set up reduced structural models with minimal parameters to be efficiently determined by computer simulations. Such combined procedures are demonstrated by a study of phonon softening phenomenon in precursor state and premartensitic transformation of Ni-Mn-Ga shape-memory alloy.
A Hierarchical Auction-Based Mechanism for Real-Time Resource Allocation in Cloud Robotic Systems.

PubMed

Wang, Lujia; Liu, Ming; Meng, Max Q-H

2017-02-01

Cloud computing enables users to share computing resources on-demand. The cloud computing framework cannot be directly mapped to cloud robotic systems with ad hoc networks since cloud robotic systems have additional constraints such as limited bandwidth and dynamic structure. However, most multirobotic applications with cooperative control adopt this decentralized approach to avoid a single point of failure. Robots need to continuously update intensive data to execute tasks in a coordinated manner, which implies real-time requirements. Thus, a resource allocation strategy is required, especially in such resource-constrained environments. This paper proposes a hierarchical auction-based mechanism, namely link quality matrix (LQM) auction, which is suitable for ad hoc networks by introducing a link quality indicator. The proposed algorithm produces a fast and robust method that is accurate and scalable. It reduces both global communication and unnecessary repeated computation. The proposed method is designed for firm real-time resource retrieval for physical multirobot systems. A joint surveillance scenario empirically validates the proposed mechanism by assessing several practical metrics. The results show that the proposed LQM auction outperforms state-of-the-art algorithms for resource allocation.
Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

DOE PAGES

Shao, Meiyue; Aktulga, H. Metin; Yang, Chao; ...

2017-09-14

In this paper, we describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of special structure of the nuclear configuration interaction problem which we discuss in detail. Themore » use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. Finally, we also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.« less
GPU-based acceleration of computations in nonlinear finite element deformation analysis.

PubMed

Mafi, Ramin; Sirouspour, Shahin

2014-03-01

The physics of deformation for biological soft-tissue is best described by nonlinear continuum mechanics-based models, which then can be discretized by the FEM for a numerical solution. However, computational complexity of such models have limited their use in applications requiring real-time or fast response. In this work, we propose a graphic processing unit-based implementation of the FEM using implicit time integration for dynamic nonlinear deformation analysis. This is the most general formulation of the deformation analysis. It is valid for large deformations and strains and can account for material nonlinearities. The data-parallel nature and the intense arithmetic computations of nonlinear FEM equations make it particularly suitable for implementation on a parallel computing platform such as graphic processing unit. In this work, we present and compare two different designs based on the matrix-free and conventional preconditioned conjugate gradients algorithms for solving the FEM equations arising in deformation analysis. The speedup achieved with the proposed parallel implementations of the algorithms will be instrumental in the development of advanced surgical simulators and medical image registration methods involving soft-tissue deformation. Copyright © 2013 John Wiley & Sons, Ltd.
Using Approximations to Accelerate Engineering Design Optimization

NASA Technical Reports Server (NTRS)

Torczon, Virginia; Trosset, Michael W.

1998-01-01

Optimization problems that arise in engineering design are often characterized by several features that hinder the use of standard nonlinear optimization techniques. Foremost among these features is that the functions used to define the engineering optimization problem often are computationally intensive. Within a standard nonlinear optimization algorithm, the computational expense of evaluating the functions that define the problem would necessarily be incurred for each iteration of the optimization algorithm. Faced with such prohibitive computational costs, an attractive alternative is to make use of surrogates within an optimization context since surrogates can be chosen or constructed so that they are typically much less expensive to compute. For the purposes of this paper, we will focus on the use of algebraic approximations as surrogates for the objective. In this paper we introduce the use of so-called merit functions that explicitly recognize the desirability of improving the current approximation to the objective during the course of the optimization. We define and experiment with the use of merit functions chosen to simultaneously improve both the solution to the optimization problem (the objective) and the quality of the approximation. Our goal is to further improve the effectiveness of our general approach without sacrificing any of its rigor.
Edge-Based Efficient Search over Encrypted Data Mobile Cloud Storage

PubMed Central

Liu, Fang; Cai, Zhiping; Xiao, Nong; Zhao, Ziming

2018-01-01

Smart sensor-equipped mobile devices sense, collect, and process data generated by the edge network to achieve intelligent control, but such mobile devices usually have limited storage and computing resources. Mobile cloud storage provides a promising solution owing to its rich storage resources, great accessibility, and low cost. But it also brings a risk of information leakage. The encryption of sensitive data is the basic step to resist the risk. However, deploying a high complexity encryption and decryption algorithm on mobile devices will greatly increase the burden of terminal operation and the difficulty to implement the necessary privacy protection algorithm. In this paper, we propose ENSURE (EfficieNt and SecURE), an efficient and secure encrypted search architecture over mobile cloud storage. ENSURE is inspired by edge computing. It allows mobile devices to offload the computation intensive task onto the edge server to achieve a high efficiency. Besides, to protect data security, it reduces the information acquisition of untrusted cloud by hiding the relevance between query keyword and search results from the cloud. Experiments on a real data set show that ENSURE reduces the computation time by 15% to 49% and saves the energy consumption by 38% to 69% per query. PMID:29652810
Edge-Based Efficient Search over Encrypted Data Mobile Cloud Storage.

PubMed

Guo, Yeting; Liu, Fang; Cai, Zhiping; Xiao, Nong; Zhao, Ziming

2018-04-13

Smart sensor-equipped mobile devices sense, collect, and process data generated by the edge network to achieve intelligent control, but such mobile devices usually have limited storage and computing resources. Mobile cloud storage provides a promising solution owing to its rich storage resources, great accessibility, and low cost. But it also brings a risk of information leakage. The encryption of sensitive data is the basic step to resist the risk. However, deploying a high complexity encryption and decryption algorithm on mobile devices will greatly increase the burden of terminal operation and the difficulty to implement the necessary privacy protection algorithm. In this paper, we propose ENSURE (EfficieNt and SecURE), an efficient and secure encrypted search architecture over mobile cloud storage. ENSURE is inspired by edge computing. It allows mobile devices to offload the computation intensive task onto the edge server to achieve a high efficiency. Besides, to protect data security, it reduces the information acquisition of untrusted cloud by hiding the relevance between query keyword and search results from the cloud. Experiments on a real data set show that ENSURE reduces the computation time by 15% to 49% and saves the energy consumption by 38% to 69% per query.
Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shao, Meiyue; Aktulga, H. Metin; Yang, Chao

In this paper, we describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of special structure of the nuclear configuration interaction problem which we discuss in detail. Themore » use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. Finally, we also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.« less

GPS-SNO: computational prediction of protein S-nitrosylation sites with a modified GPS algorithm.

PubMed

Xue, Yu; Liu, Zexian; Gao, Xinjiao; Jin, Changjiang; Wen, Longping; Yao, Xuebiao; Ren, Jian

2010-06-24

As one of the most important and ubiquitous post-translational modifications (PTMs) of proteins, S-nitrosylation plays important roles in a variety of biological processes, including the regulation of cellular dynamics and plasticity. Identification of S-nitrosylated substrates with their exact sites is crucial for understanding the molecular mechanisms of S-nitrosylation. In contrast with labor-intensive and time-consuming experimental approaches, prediction of S-nitrosylation sites using computational methods could provide convenience and increased speed. In this work, we developed a novel software of GPS-SNO 1.0 for the prediction of S-nitrosylation sites. We greatly improved our previously developed algorithm and released the GPS 3.0 algorithm for GPS-SNO. By comparison, the prediction performance of GPS 3.0 algorithm was better than other methods, with an accuracy of 75.80%, a sensitivity of 53.57% and a specificity of 80.14%. As an application of GPS-SNO 1.0, we predicted putative S-nitrosylation sites for hundreds of potentially S-nitrosylated substrates for which the exact S-nitrosylation sites had not been experimentally determined. In this regard, GPS-SNO 1.0 should prove to be a useful tool for experimentalists. The online service and local packages of GPS-SNO were implemented in JAVA and are freely available at: http://sno.biocuckoo.org/.
Accurate in silico identification of protein succinylation sites using an iterative semi-supervised learning technique.

PubMed

Zhao, Xiaowei; Ning, Qiao; Chai, Haiting; Ma, Zhiqiang

2015-06-07

As a widespread type of protein post-translational modifications (PTMs), succinylation plays an important role in regulating protein conformation, function and physicochemical properties. Compared with the labor-intensive and time-consuming experimental approaches, computational predictions of succinylation sites are much desirable due to their convenient and fast speed. Currently, numerous computational models have been developed to identify PTMs sites through various types of two-class machine learning algorithms. These methods require both positive and negative samples for training. However, designation of the negative samples of PTMs was difficult and if it is not properly done can affect the performance of computational models dramatically. So that in this work, we implemented the first application of positive samples only learning (PSoL) algorithm to succinylation sites prediction problem, which was a special class of semi-supervised machine learning that used positive samples and unlabeled samples to train the model. Meanwhile, we proposed a novel succinylation sites computational predictor called SucPred (succinylation site predictor) by using multiple feature encoding schemes. Promising results were obtained by the SucPred predictor with an accuracy of 88.65% using 5-fold cross validation on the training dataset and an accuracy of 84.40% on the independent testing dataset, which demonstrated that the positive samples only learning algorithm presented here was particularly useful for identification of protein succinylation sites. Besides, the positive samples only learning algorithm can be applied to build predictors for other types of PTMs sites with ease. A web server for predicting succinylation sites was developed and was freely accessible at http://59.73.198.144:8088/SucPred/. Copyright © 2015 Elsevier Ltd. All rights reserved.
Boundary identification and error analysis of shocked material images

NASA Astrophysics Data System (ADS)

Hock, Margaret; Howard, Marylesa; Cooper, Leora; Meehan, Bernard; Nelson, Keith

2017-06-01

To compute quantities such as pressure and velocity from laser-induced shock waves propagating through materials, high-speed images are captured and analyzed. Shock images typically display high noise and spatially-varying intensities, causing conventional analysis techniques to have difficulty identifying boundaries in the images without making significant assumptions about the data. We present a novel machine learning algorithm that efficiently segments, or partitions, images with high noise and spatially-varying intensities, and provides error maps that describe a level of uncertainty in the partitioning. The user trains the algorithm by providing locations of known materials within the image but no assumptions are made on the geometries in the image. The error maps are used to provide lower and upper bounds on quantities of interest, such as velocity and pressure, once boundaries have been identified and propagated through equations of state. This algorithm will be demonstrated on images of shock waves with noise and aberrations to quantify properties of the wave as it progresses. DOE/NV/25946-3126 This work was done by National Security Technologies, LLC, under Contract No. DE- AC52-06NA25946 with the U.S. Department of Energy and supported by the SDRD Program.
Image recombination transform algorithm for superresolution structured illumination microscopy

PubMed Central

Zhou, Xing; Lei, Ming; Dan, Dan; Yao, Baoli; Yang, Yanlong; Qian, Jia; Chen, Guangde; Bianco, Piero R.

2016-01-01

Abstract. Structured illumination microscopy (SIM) is an attractive choice for fast superresolution imaging. The generation of structured illumination patterns made by interference of laser beams is broadly employed to obtain high modulation depth of patterns, while the polarizations of the laser beams must be elaborately controlled to guarantee the high contrast of interference intensity, which brings a more complex configuration for the polarization control. The emerging pattern projection strategy is much more compact, but the modulation depth of patterns is deteriorated by the optical transfer function of the optical system, especially in high spatial frequency near the diffraction limit. Therefore, the traditional superresolution reconstruction algorithm for interference-based SIM will suffer from many artifacts in the case of projection-based SIM that possesses a low modulation depth. Here, we propose an alternative reconstruction algorithm based on image recombination transform, which provides an alternative solution to address this problem even in a weak modulation depth. We demonstrated the effectiveness of this algorithm in the multicolor superresolution imaging of bovine pulmonary arterial endothelial cells in our developed projection-based SIM system, which applies a computer controlled digital micromirror device for fast fringe generation and multicolor light-emitting diodes for illumination. The merit of the system incorporated with the proposed algorithm allows for a low excitation intensity fluorescence imaging even less than 1 W/cm2, which is beneficial for the long-term, in vivo superresolved imaging of live cells and tissues. PMID:27653935
The reduced basis method for the electric field integral equation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fares, M., E-mail: fares@cerfacs.f; Hesthaven, J.S., E-mail: Jan_Hesthaven@Brown.ed; Maday, Y., E-mail: maday@ann.jussieu.f

We introduce the reduced basis method (RBM) as an efficient tool for parametrized scattering problems in computational electromagnetics for problems where field solutions are computed using a standard Boundary Element Method (BEM) for the parametrized electric field integral equation (EFIE). This combination enables an algorithmic cooperation which results in a two step procedure. The first step consists of a computationally intense assembling of the reduced basis, that needs to be effected only once. In the second step, we compute output functionals of the solution, such as the Radar Cross Section (RCS), independently of the dimension of the discretization space, formore » many different parameter values in a many-query context at very little cost. Parameters include the wavenumber, the angle of the incident plane wave and its polarization.« less
Intensity-based hierarchical clustering in CT-scans: application to interactive segmentation in cardiology

NASA Astrophysics Data System (ADS)

Hadida, Jonathan; Desrosiers, Christian; Duong, Luc

2011-03-01

The segmentation of anatomical structures in Computed Tomography Angiography (CTA) is a pre-operative task useful in image guided surgery. Even though very robust and precise methods have been developed to help achieving a reliable segmentation (level sets, active contours, etc), it remains very time consuming both in terms of manual interactions and in terms of computation time. The goal of this study is to present a fast method to find coarse anatomical structures in CTA with few parameters, based on hierarchical clustering. The algorithm is organized as follows: first, a fast non-parametric histogram clustering method is proposed to compute a piecewise constant mask. A second step then indexes all the space-connected regions in the piecewise constant mask. Finally, a hierarchical clustering is achieved to build a graph representing the connections between the various regions in the piecewise constant mask. This step builds up a structural knowledge about the image. Several interactive features for segmentation are presented, for instance association or disassociation of anatomical structures. A comparison with the Mean-Shift algorithm is presented.
Computing the Evans function via solving a linear boundary value ODE

NASA Astrophysics Data System (ADS)

Wahl, Colin; Nguyen, Rose; Ventura, Nathaniel; Barker, Blake; Sandstede, Bjorn

2015-11-01

Determining the stability of traveling wave solutions to partial differential equations can oftentimes be computationally intensive but of great importance to understanding the effects of perturbations on the physical systems (chemical reactions, hydrodynamics, etc.) they model. For waves in one spatial dimension, one may linearize around the wave and form an Evans function - an analytic Wronskian-like function which has zeros that correspond in multiplicity to the eigenvalues of the linearized system. If eigenvalues with a positive real part do not exist, the traveling wave will be stable. Two methods exist for calculating the Evans function numerically: the exterior-product method and the method of continuous orthogonalization. The first is numerically expensive, and the second reformulates the originally linear system as a nonlinear system. We develop a new algorithm for computing the Evans function through appropriate linear boundary-value problems. This algorithm is cheaper than the previous methods, and we prove that it preserves analyticity of the Evans function. We also provide error estimates and implement it on some classical one- and two-dimensional systems, one being the Swift-Hohenberg equation in a channel, to show the advantages.
Two-dimensional nonsteady viscous flow simulation on the Navier-Stokes computer miniNode

NASA Technical Reports Server (NTRS)

Nosenchuck, Daniel M.; Littman, Michael G.; Flannery, William

1986-01-01

The needs of large-scale scientific computation are outpacing the growth in performance of mainframe supercomputers. In particular, problems in fluid mechanics involving complex flow simulations require far more speed and capacity than that provided by current and proposed Class VI supercomputers. To address this concern, the Navier-Stokes Computer (NSC) was developed. The NSC is a parallel-processing machine, comprised of individual Nodes, each comparable in performance to current supercomputers. The global architecture is that of a hypercube, and a 128-Node NSC has been designed. New architectural features, such as a reconfigurable many-function ALU pipeline and a multifunction memory-ALU switch, have provided the capability to efficiently implement a wide range of algorithms. Efficient algorithms typically involve numerically intensive tasks, which often include conditional operations. These operations may be efficiently implemented on the NSC without, in general, sacrificing vector-processing speed. To illustrate the architecture, programming, and several of the capabilities of the NSC, the simulation of two-dimensional, nonsteady viscous flows on a prototype Node, called the miniNode, is presented.
An image morphing technique based on optimal mass preserving mapping.

PubMed

Zhu, Lei; Yang, Yan; Haker, Steven; Tannenbaum, Allen

2007-06-01

Image morphing, or image interpolation in the time domain, deals with the metamorphosis of one image into another. In this paper, a new class of image morphing algorithms is proposed based on the theory of optimal mass transport. The L(2) mass moving energy functional is modified by adding an intensity penalizing term, in order to reduce the undesired double exposure effect. It is an intensity-based approach and, thus, is parameter free. The optimal warping function is computed using an iterative gradient descent approach. This proposed morphing method is also extended to doubly connected domains using a harmonic parameterization technique, along with finite-element methods.
An Image Morphing Technique Based on Optimal Mass Preserving Mapping

PubMed Central

Zhu, Lei; Yang, Yan; Haker, Steven; Tannenbaum, Allen

2013-01-01

Image morphing, or image interpolation in the time domain, deals with the metamorphosis of one image into another. In this paper, a new class of image morphing algorithms is proposed based on the theory of optimal mass transport. The L2 mass moving energy functional is modified by adding an intensity penalizing term, in order to reduce the undesired double exposure effect. It is an intensity-based approach and, thus, is parameter free. The optimal warping function is computed using an iterative gradient descent approach. This proposed morphing method is also extended to doubly connected domains using a harmonic parameterization technique, along with finite-element methods. PMID:17547128
Computational method for multi-modal microscopy based on transport of intensity equation

NASA Astrophysics Data System (ADS)

Li, Jiaji; Chen, Qian; Sun, Jiasong; Zhang, Jialin; Zuo, Chao

2017-02-01

In this paper, we develop the requisite theory to describe a hybrid virtual-physical multi-modal imaging system which yields quantitative phase, Zernike phase contrast, differential interference contrast (DIC), and light field moment imaging simultaneously based on transport of intensity equation(TIE). We then give the experimental demonstration of these ideas by time-lapse imaging of live HeLa cell mitosis. Experimental results verify that a tunable lens based TIE system, combined with the appropriate post-processing algorithm, can achieve a variety of promising imaging modalities in parallel with the quantitative phase images for the dynamic study of cellular processes.
Automated intraretinal layer segmentation of optical coherence tomography images using graph-theoretical methods

NASA Astrophysics Data System (ADS)

Roy, Priyanka; Gholami, Peyman; Kuppuswamy Parthasarathy, Mohana; Zelek, John; Lakshminarayanan, Vasudevan

2018-02-01

Segmentation of spectral-domain Optical Coherence Tomography (SD-OCT) images facilitates visualization and quantification of sub-retinal layers for diagnosis of retinal pathologies. However, manual segmentation is subjective, expertise dependent, and time-consuming, which limits applicability of SD-OCT. Efforts are therefore being made to implement active-contours, artificial intelligence, and graph-search to automatically segment retinal layers with accuracy comparable to that of manual segmentation, to ease clinical decision-making. Although, low optical contrast, heavy speckle noise, and pathologies pose challenges to automated segmentation. Graph-based image segmentation approach stands out from the rest because of its ability to minimize the cost function while maximising the flow. This study has developed and implemented a shortest-path based graph-search algorithm for automated intraretinal layer segmentation of SD-OCT images. The algorithm estimates the minimal-weight path between two graph-nodes based on their gradients. Boundary position indices (BPI) are computed from the transition between pixel intensities. The mean difference between BPIs of two consecutive layers quantify individual layer thicknesses, which shows statistically insignificant differences when compared to a previous study [for overall retina: p = 0.17, for individual layers: p > 0.05 (except one layer: p = 0.04)]. These results substantiate the accurate delineation of seven intraretinal boundaries in SD-OCT images by this algorithm, with a mean computation time of 0.93 seconds (64-bit Windows10, core i5, 8GB RAM). Besides being self-reliant for denoising, the algorithm is further computationally optimized to restrict segmentation within the user defined region-of-interest. The efficiency and reliability of this algorithm, even in noisy image conditions, makes it clinically applicable.
Implementation in an FPGA circuit of Edge detection algorithm based on the Discrete Wavelet Transforms

NASA Astrophysics Data System (ADS)

Bouganssa, Issam; Sbihi, Mohamed; Zaim, Mounia

2017-07-01

The 2D Discrete Wavelet Transform (DWT) is a computationally intensive task that is usually implemented on specific architectures in many imaging systems in real time. In this paper, a high throughput edge or contour detection algorithm is proposed based on the discrete wavelet transform. A technique for applying the filters on the three directions (Horizontal, Vertical and Diagonal) of the image is used to present the maximum of the existing contours. The proposed architectures were designed in VHDL and mapped to a Xilinx Sparten6 FPGA. The results of the synthesis show that the proposed architecture has a low area cost and can operate up to 100 MHz, which can perform 2D wavelet analysis for a sequence of images while maintaining the flexibility of the system to support an adaptive algorithm.
Automatic selection of landmarks in T1-weighted head MRI with regression forests for image registration initialization

NASA Astrophysics Data System (ADS)

Wang, Jianing; Liu, Yuan; Noble, Jack H.; Dawant, Benoit M.

2017-02-01

Medical image registration establishes a correspondence between images of biological structures and it is at the core of many applications. Commonly used deformable image registration methods are dependent on a good preregistration initialization. The initialization can be performed by localizing homologous landmarks and calculating a point-based transformation between the images. The selection of landmarks is however important. In this work, we present a learning-based method to automatically find a set of robust landmarks in 3D MR image volumes of the head to initialize non-rigid transformations. To validate our method, these selected landmarks are localized in unknown image volumes and they are used to compute a smoothing thin-plate splines transformation that registers the atlas to the volumes. The transformed atlas image is then used as the preregistration initialization of an intensity-based non-rigid registration algorithm. We show that the registration accuracy of this algorithm is statistically significantly improved when using the presented registration initialization over a standard intensity-based affine registration.
Some uses of wavelets for imaging dynamic processes in live cochlear structures

NASA Astrophysics Data System (ADS)

Boutet de Monvel, J.

2007-09-01

A variety of image and signal processing algorithms based on wavelet filtering tools have been developed during the last few decades, that are well adapted to the experimental variability typically encountered in live biological microscopy. A number of processing tools are reviewed, that use wavelets for adaptive image restoration and for motion or brightness variation analysis by optical flow computation. The usefulness of these tools for biological imaging is illustrated in the context of the restoration of images of the inner ear and the analysis of cochlear motion patterns in two and three dimensions. I also report on recent work that aims at capturing fluorescence intensity changes associated with vesicle dynamics at synaptic zones of sensory hair cells. This latest application requires one to separate the intensity variations associated with the physiological process under study from the variations caused by motion of the observed structures. A wavelet optical flow algorithm for doing this is presented, and its effectiveness is demonstrated on artificial and experimental image sequences.
Algorithms to eliminate the influence of non-uniform intensity distributions on wavefront reconstruction by quadri-wave lateral shearing interferometers

NASA Astrophysics Data System (ADS)

Chen, Xiao-jun; Dong, Li-zhi; Wang, Shuai; Yang, Ping; Xu, Bing

2017-11-01

In quadri-wave lateral shearing interferometry (QWLSI), when the intensity distribution of the incident light wave is non-uniform, part of the information of the intensity distribution will couple with the wavefront derivatives to cause wavefront reconstruction errors. In this paper, we propose two algorithms to reduce the influence of a non-uniform intensity distribution on wavefront reconstruction. Our simulation results demonstrate that the reconstructed amplitude distribution (RAD) algorithm can effectively reduce the influence of the intensity distribution on the wavefront reconstruction and that the collected amplitude distribution (CAD) algorithm can almost eliminate it.
Navigational strategies underlying phototaxis in larval zebrafish.

PubMed

Chen, Xiuye; Engert, Florian

2014-01-01

Understanding how the brain transforms sensory input into complex behavior is a fundamental question in systems neuroscience. Using larval zebrafish, we study the temporal component of phototaxis, which is defined as orientation decisions based on comparisons of light intensity at successive moments in time. We developed a novel "Virtual Circle" assay where whole-field illumination is abruptly turned off when the fish swims out of a virtually defined circular border, and turned on again when it returns into the circle. The animal receives no direct spatial cues and experiences only whole-field temporal light changes. Remarkably, the fish spends most of its time within the invisible virtual border. Behavioral analyses of swim bouts in relation to light transitions were used to develop four discrete temporal algorithms that transform the binary visual input (uniform light/uniform darkness) into the observed spatial behavior. In these algorithms, the turning angle is dependent on the behavioral history immediately preceding individual turning events. Computer simulations show that the algorithms recapture most of the swim statistics of real fish. We discovered that turning properties in larval zebrafish are distinctly modulated by temporal step functions in light intensity in combination with the specific motor history preceding these turns. Several aspects of the behavior suggest memory usage of up to 10 swim bouts (~10 sec). Thus, we show that a complex behavior like spatial navigation can emerge from a small number of relatively simple behavioral algorithms.
A stochastic estimation procedure for intermittently-observed semi-Markov multistate models with back transitions.

PubMed

Aralis, Hilary; Brookmeyer, Ron

2017-01-01

Multistate models provide an important method for analyzing a wide range of life history processes including disease progression and patient recovery following medical intervention. Panel data consisting of the states occupied by an individual at a series of discrete time points are often used to estimate transition intensities of the underlying continuous-time process. When transition intensities depend on the time elapsed in the current state and back transitions between states are possible, this intermittent observation process presents difficulties in estimation due to intractability of the likelihood function. In this manuscript, we present an iterative stochastic expectation-maximization algorithm that relies on a simulation-based approximation to the likelihood function and implement this algorithm using rejection sampling. In a simulation study, we demonstrate the feasibility and performance of the proposed procedure. We then demonstrate application of the algorithm to a study of dementia, the Nun Study, consisting of intermittently-observed elderly subjects in one of four possible states corresponding to intact cognition, impaired cognition, dementia, and death. We show that the proposed stochastic expectation-maximization algorithm substantially reduces bias in model parameter estimates compared to an alternative approach used in the literature, minimal path estimation. We conclude that in estimating intermittently observed semi-Markov models, the proposed approach is a computationally feasible and accurate estimation procedure that leads to substantial improvements in back transition estimates.
Label-free sensor for automatic identification of erythrocytes using digital in-line holographic microscopy and machine learning.

PubMed

Go, Taesik; Byeon, Hyeokjun; Lee, Sang Joon

2018-04-30

Cell types of erythrocytes should be identified because they are closely related to their functionality and viability. Conventional methods for classifying erythrocytes are time consuming and labor intensive. Therefore, an automatic and accurate erythrocyte classification system is indispensable in healthcare and biomedical fields. In this study, we proposed a new label-free sensor for automatic identification of erythrocyte cell types using a digital in-line holographic microscopy (DIHM) combined with machine learning algorithms. A total of 12 features, including information on intensity distributions, morphological descriptors, and optical focusing characteristics, is quantitatively obtained from numerically reconstructed holographic images. All individual features for discocytes, echinocytes, and spherocytes are statistically different. To improve the performance of cell type identification, we adopted several machine learning algorithms, such as decision tree model, support vector machine, linear discriminant classification, and k-nearest neighbor classification. With the aid of these machine learning algorithms, the extracted features are effectively utilized to distinguish erythrocytes. Among the four tested algorithms, the decision tree model exhibits the best identification performance for the training sets (n = 440, 98.18%) and test sets (n = 190, 97.37%). This proposed methodology, which smartly combined DIHM and machine learning, would be helpful for sensing abnormal erythrocytes and computer-aided diagnosis of hematological diseases in clinic. Copyright © 2017 Elsevier B.V. All rights reserved.
The Application of Support Vector Machine (svm) Using Cielab Color Model, Color Intensity and Color Constancy as Features for Ortho Image Classification of Benthic Habitats in Hinatuan, Surigao del Sur, Philippines

NASA Astrophysics Data System (ADS)

Cubillas, J. E.; Japitana, M.

2016-06-01

This study demonstrates the application of CIELAB, Color intensity, and One Dimensional Scalar Constancy as features for image recognition and classifying benthic habitats in an image with the coastal areas of Hinatuan, Surigao Del Sur, Philippines as the study area. The study area is composed of four datasets, namely: (a) Blk66L005, (b) Blk66L021, (c) Blk66L024, and (d) Blk66L0114. SVM optimization was performed in Matlab® software with the help of Parallel Computing Toolbox to hasten the SVM computing speed. The image used for collecting samples for SVM procedure was Blk66L0114 in which a total of 134,516 sample objects of mangrove, possible coral existence with rocks, sand, sea, fish pens and sea grasses were collected and processed. The collected samples were then used as training sets for the supervised learning algorithm and for the creation of class definitions. The learned hyper-planes separating one class from another in the multi-dimensional feature space can be thought of as a super feature which will then be used in developing the C (classifier) rule set in eCognition® software. The classification results of the sampling site yielded an accuracy of 98.85% which confirms the reliability of remote sensing techniques and analysis employed to orthophotos like the CIELAB, Color Intensity and One dimensional scalar constancy and the use of SVM classification algorithm in classifying benthic habitats.

Application of a distributed network in computational fluid dynamic simulations

NASA Technical Reports Server (NTRS)

Deshpande, Manish; Feng, Jinzhang; Merkle, Charles L.; Deshpande, Ashish

1994-01-01

A general-purpose 3-D, incompressible Navier-Stokes algorithm is implemented on a network of concurrently operating workstations using parallel virtual machine (PVM) and compared with its performance on a CRAY Y-MP and on an Intel iPSC/860. The problem is relatively computationally intensive, and has a communication structure based primarily on nearest-neighbor communication, making it ideally suited to message passing. Such problems are frequently encountered in computational fluid dynamics (CDF), and their solution is increasingly in demand. The communication structure is explicitly coded in the implementation to fully exploit the regularity in message passing in order to produce a near-optimal solution. Results are presented for various grid sizes using up to eight processors.
Computer-Simulation Surrogates for Optimization: Application to Trapezoidal Ducts and Axisymmetric Bodies

NASA Technical Reports Server (NTRS)

Otto, John C.; Paraschivoiu, Marius; Yesilyurt, Serhat; Patera, Anthony T.

1995-01-01

Engineering design and optimization efforts using computational systems rapidly become resource intensive. The goal of the surrogate-based approach is to perform a complete optimization with limited resources. In this paper we present a Bayesian-validated approach that informs the designer as to how well the surrogate performs; in particular, our surrogate framework provides precise (albeit probabilistic) bounds on the errors incurred in the surrogate-for-simulation substitution. The theory and algorithms of our computer{simulation surrogate framework are first described. The utility of the framework is then demonstrated through two illustrative examples: maximization of the flowrate of fully developed ow in trapezoidal ducts; and design of an axisymmetric body that achieves a target Stokes drag.
Machine learning: Trends, perspectives, and prospects.

PubMed

Jordan, M I; Mitchell, T M

2015-07-17

Machine learning addresses the question of how to build computers that improve automatically through experience. It is one of today's most rapidly growing technical fields, lying at the intersection of computer science and statistics, and at the core of artificial intelligence and data science. Recent progress in machine learning has been driven both by the development of new learning algorithms and theory and by the ongoing explosion in the availability of online data and low-cost computation. The adoption of data-intensive machine-learning methods can be found throughout science, technology and commerce, leading to more evidence-based decision-making across many walks of life, including health care, manufacturing, education, financial modeling, policing, and marketing. Copyright © 2015, American Association for the Advancement of Science.
Cloud Computing Boosts Business Intelligence of Telecommunication Industry

NASA Astrophysics Data System (ADS)

Xu, Meng; Gao, Dan; Deng, Chao; Luo, Zhiguo; Sun, Shaoling

Business Intelligence becomes an attracting topic in today's data intensive applications, especially in telecommunication industry. Meanwhile, Cloud Computing providing IT supporting Infrastructure with excellent scalability, large scale storage, and high performance becomes an effective way to implement parallel data processing and data mining algorithms. BC-PDM (Big Cloud based Parallel Data Miner) is a new MapReduce based parallel data mining platform developed by CMRI (China Mobile Research Institute) to fit the urgent requirements of business intelligence in telecommunication industry. In this paper, the architecture, functionality and performance of BC-PDM are presented, together with the experimental evaluation and case studies of its applications. The evaluation result demonstrates both the usability and the cost-effectiveness of Cloud Computing based Business Intelligence system in applications of telecommunication industry.
Novel Fluorescein Angiography-Based Computer-Aided Algorithm for Assessment of Retinal Vessel Permeability

PubMed Central

Chassidim, Yoash; Parmet, Yisrael; Tomkins, Oren; Knyazer, Boris; Friedman, Alon; Levy, Jaime

2013-01-01

Purpose To present a novel method for quantitative assessment of retinal vessel permeability using a fluorescein angiography-based computer algorithm. Methods Twenty-one subjects (13 with diabetic retinopathy, 8 healthy volunteers) underwent fluorescein angiography (FA). Image pre-processing included removal of non-retinal and noisy images and registration to achieve spatial and temporal pixel-based analysis. Permeability was assessed for each pixel by computing intensity kinetics normalized to arterial values. A linear curve was fitted and the slope value was assigned, color-coded and displayed. The initial FA studies and the computed permeability maps were interpreted in a masked and randomized manner by three experienced ophthalmologists for statistical validation of diagnosis accuracy and efficacy. Results Permeability maps were successfully generated for all subjects. For healthy volunteers permeability values showed a normal distribution with a comparable range between subjects. Based on the mean cumulative histogram for the healthy population a threshold (99.5%) for pathological permeability was determined. Clear differences were found between patients and healthy subjects in the number and spatial distribution of pixels with pathological vascular leakage. The computed maps improved the discrimination between patients and healthy subjects, achieved sensitivity and specificity of 0.974 and 0.833 respectively, and significantly improved the consensus among raters for the localization of pathological regions. Conclusion The new algorithm allows quantification of retinal vessel permeability and provides objective, more sensitive and accurate evaluation than the present subjective clinical diagnosis. Future studies with a larger patients’ cohort and different retinal pathologies are awaited to further validate this new approach and its role in diagnosis and treatment follow-up. Successful evaluation of vasculature permeability may be used for the early diagnosis of brain microvascular pathology and potentially predict associated neurological sequelae. Finally, the algorithm could be implemented for intraoperative evaluation of micovascular integrity in other organs or during animal experiments. PMID:23626701
Optimal Load Shedding and Generation Rescheduling for Overload Suppression in Large Power Systems.

NASA Astrophysics Data System (ADS)

Moon, Young-Hyun

Ever-increasing size, complexity and operation costs in modern power systems have stimulated the intensive study of an optimal Load Shedding and Generator Rescheduling (LSGR) strategy in the sense of a secure and economic system operation. The conventional approach to LSGR has been based on the application of LP (Linear Programming) with the use of an approximately linearized model, and the LP algorithm is currently considered to be the most powerful tool for solving the LSGR problem. However, all of the LP algorithms presented in the literature essentially lead to the following disadvantages: (i) piecewise linearization involved in the LP algorithms requires the introduction of a number of new inequalities and slack variables, which creates significant burden to the computing facilities, and (ii) objective functions are not formulated in terms of the state variables of the adopted models, resulting in considerable numerical inefficiency in the process of computing the optimal solution. A new approach is presented, based on the development of a new linearized model and on the application of QP (Quadratic Programming). The changes in line flows as a result of changes to bus injection power are taken into account in the proposed model by the introduction of sensitivity coefficients, which avoids the mentioned second disadvantages. A precise method to calculate these sensitivity coefficients is given. A comprehensive review of the theory of optimization is included, in which results of the development of QP algorithms for LSGR as based on Wolfe's method and Kuhn -Tucker theory are evaluated in detail. The validity of the proposed model and QP algorithms has been verified and tested on practical power systems, showing the significant reduction of both computation time and memory requirements as well as the expected lower generation costs of the optimal solution as compared with those obtained from computing the optimal solution with LP. Finally, it is noted that an efficient reactive power compensation algorithm is developed to suppress voltage disturbances due to load sheddings, and that a new method for multiple contingency simulation is presented.
Efficient 3D geometric and Zernike moments computation from unstructured surface meshes.

PubMed

Pozo, José María; Villa-Uriol, Maria-Cruz; Frangi, Alejandro F

2011-03-01

This paper introduces and evaluates a fast exact algorithm and a series of faster approximate algorithms for the computation of 3D geometric moments from an unstructured surface mesh of triangles. Being based on the object surface reduces the computational complexity of these algorithms with respect to volumetric grid-based algorithms. In contrast, it can only be applied for the computation of geometric moments of homogeneous objects. This advantage and restriction is shared with other proposed algorithms based on the object boundary. The proposed exact algorithm reduces the computational complexity for computing geometric moments up to order N with respect to previously proposed exact algorithms, from N(9) to N(6). The approximate series algorithm appears as a power series on the rate between triangle size and object size, which can be truncated at any desired degree. The higher the number and quality of the triangles, the better the approximation. This approximate algorithm reduces the computational complexity to N(3). In addition, the paper introduces a fast algorithm for the computation of 3D Zernike moments from the computed geometric moments, with a computational complexity N(4), while the previously proposed algorithm is of order N(6). The error introduced by the proposed approximate algorithms is evaluated in different shapes and the cost-benefit ratio in terms of error, and computational time is analyzed for different moment orders.
Demons deformable registration of CT and cone-beam CT using an iterative intensity matching approach

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nithiananthan, Sajendra; Schafer, Sebastian; Uneri, Ali

2011-04-15

Purpose: A method of intensity-based deformable registration of CT and cone-beam CT (CBCT) images is described, in which intensity correction occurs simultaneously within the iterative registration process. The method preserves the speed and simplicity of the popular Demons algorithm while providing robustness and accuracy in the presence of large mismatch between CT and CBCT voxel values (''intensity''). Methods: A variant of the Demons algorithm was developed in which an estimate of the relationship between CT and CBCT intensity values for specific materials in the image is computed at each iteration based on the set of currently overlapping voxels. This tissue-specificmore » intensity correction is then used to estimate the registration output for that iteration and the process is repeated. The robustness of the method was tested in CBCT images of a cadaveric head exhibiting a broad range of simulated intensity variations associated with x-ray scatter, object truncation, and/or errors in the reconstruction algorithm. The accuracy of CT-CBCT registration was also measured in six real cases, exhibiting deformations ranging from simple to complex during surgery or radiotherapy guided by a CBCT-capable C-arm or linear accelerator, respectively. Results: The iterative intensity matching approach was robust against all levels of intensity variation examined, including spatially varying errors in voxel value of a factor of 2 or more, as can be encountered in cases of high x-ray scatter. Registration accuracy without intensity matching degraded severely with increasing magnitude of intensity error and introduced image distortion. A single histogram match performed prior to registration alleviated some of these effects but was also prone to image distortion and was quantifiably less robust and accurate than the iterative approach. Within the six case registration accuracy study, iterative intensity matching Demons reduced mean TRE to (2.5{+-}2.8) mm compared to (3.5{+-}3.0) mm with rigid registration. Conclusions: A method was developed to iteratively correct CT-CBCT intensity disparity during Demons registration, enabling fast, intensity-based registration in CBCT-guided procedures such as surgery and radiotherapy, in which CBCT voxel values may be inaccurate. Accurate CT-CBCT registration in turn facilitates registration of multimodality preoperative image and planning data to intraoperative CBCT by way of the preoperative CT, thereby linking the intraoperative frame of reference to a wealth of preoperative information that could improve interventional guidance.« less
Hidden Markov Model-Based CNV Detection Algorithms for Illumina Genotyping Microarrays.

PubMed

Seiser, Eric L; Innocenti, Federico

2014-01-01

Somatic alterations in DNA copy number have been well studied in numerous malignancies, yet the role of germline DNA copy number variation in cancer is still emerging. Genotyping microarrays generate allele-specific signal intensities to determine genotype, but may also be used to infer DNA copy number using additional computational approaches. Numerous tools have been developed to analyze Illumina genotype microarray data for copy number variant (CNV) discovery, although commonly utilized algorithms freely available to the public employ approaches based upon the use of hidden Markov models (HMMs). QuantiSNP, PennCNV, and GenoCN utilize HMMs with six copy number states but vary in how transition and emission probabilities are calculated. Performance of these CNV detection algorithms has been shown to be variable between both genotyping platforms and data sets, although HMM approaches generally outperform other current methods. Low sensitivity is prevalent with HMM-based algorithms, suggesting the need for continued improvement in CNV detection methodologies.
Boost-phase discrimination research activities

NASA Technical Reports Server (NTRS)

Cooper, David M.; Deiwert, George S.

1989-01-01

Theoretical research in two areas was performed. The aerothermodynamics research focused on the hard-body and rocket plume flows. Analytical real gas models to describe finite rate chemistry were developed and incorporated into the three-dimensional flow codes. New numerical algorithms capable of treating multi-species reacting gas equations and treating flows with large gradients were also developed. The computational chemistry research focused on the determination of spectral radiative intensity factors, transport properties and reaction rates. Ab initio solutions to the Schrodinger equation provided potential energy curves transition moments (radiative probabilities and strengths) and potential energy surfaces. These surfaces were then coupled with classical particle reactive trajectories to compute reaction cross-sections and rates.
Implementation of Parallel Dynamic Simulation on Shared-Memory vs. Distributed-Memory Environments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Shuangshuang; Chen, Yousu; Wu, Di

2015-12-09

Power system dynamic simulation computes the system response to a sequence of large disturbance, such as sudden changes in generation or load, or a network short circuit followed by protective branch switching operation. It consists of a large set of differential and algebraic equations, which is computational intensive and challenging to solve using single-processor based dynamic simulation solution. High-performance computing (HPC) based parallel computing is a very promising technology to speed up the computation and facilitate the simulation process. This paper presents two different parallel implementations of power grid dynamic simulation using Open Multi-processing (OpenMP) on shared-memory platform, and Messagemore » Passing Interface (MPI) on distributed-memory clusters, respectively. The difference of the parallel simulation algorithms and architectures of the two HPC technologies are illustrated, and their performances for running parallel dynamic simulation are compared and demonstrated.« less
Integrating the Apache Big Data Stack with HPC for Big Data

NASA Astrophysics Data System (ADS)

Fox, G. C.; Qiu, J.; Jha, S.

2014-12-01

There is perhaps a broad consensus as to important issues in practical parallel computing as applied to large scale simulations; this is reflected in supercomputer architectures, algorithms, libraries, languages, compilers and best practice for application development. However, the same is not so true for data intensive computing, even though commercially clouds devote much more resources to data analytics than supercomputers devote to simulations. We look at a sample of over 50 big data applications to identify characteristics of data intensive applications and to deduce needed runtime and architectures. We suggest a big data version of the famous Berkeley dwarfs and NAS parallel benchmarks and use these to identify a few key classes of hardware/software architectures. Our analysis builds on combining HPC and ABDS the Apache big data software stack that is well used in modern cloud computing. Initial results on clouds and HPC systems are encouraging. We propose the development of SPIDAL - Scalable Parallel Interoperable Data Analytics Library -- built on system aand data abstractions suggested by the HPC-ABDS architecture. We discuss how it can be used in several application areas including Polar Science.
A Computational Framework for High-Throughput Isotopic Natural Abundance Correction of Omics-Level Ultra-High Resolution FT-MS Datasets

PubMed Central

Carreer, William J.; Flight, Robert M.; Moseley, Hunter N. B.

2013-01-01

New metabolomics applications of ultra-high resolution and accuracy mass spectrometry can provide thousands of detectable isotopologues, with the number of potentially detectable isotopologues increasing exponentially with the number of stable isotopes used in newer isotope tracing methods like stable isotope-resolved metabolomics (SIRM) experiments. This huge increase in usable data requires software capable of correcting the large number of isotopologue peaks resulting from SIRM experiments in a timely manner. We describe the design of a new algorithm and software system capable of handling these high volumes of data, while including quality control methods for maintaining data quality. We validate this new algorithm against a previous single isotope correction algorithm in a two-step cross-validation. Next, we demonstrate the algorithm and correct for the effects of natural abundance for both 13C and 15N isotopes on a set of raw isotopologue intensities of UDP-N-acetyl-D-glucosamine derived from a 13C/15N-tracing experiment. Finally, we demonstrate the algorithm on a full omics-level dataset. PMID:24404440
Neuromimetic Sound Representation for Percept Detection and Manipulation

NASA Astrophysics Data System (ADS)

Zotkin, Dmitry N.; Chi, Taishih; Shamma, Shihab A.; Duraiswami, Ramani

2005-12-01

The acoustic wave received at the ears is processed by the human auditory system to separate different sounds along the intensity, pitch, and timbre dimensions. Conventional Fourier-based signal processing, while endowed with fast algorithms, is unable to easily represent a signal along these attributes. In this paper, we discuss the creation of maximally separable sounds in auditory user interfaces and use a recently proposed cortical sound representation, which performs a biomimetic decomposition of an acoustic signal, to represent and manipulate sound for this purpose. We briefly overview algorithms for obtaining, manipulating, and inverting a cortical representation of a sound and describe algorithms for manipulating signal pitch and timbre separately. The algorithms are also used to create sound of an instrument between a "guitar" and a "trumpet." Excellent sound quality can be achieved if processing time is not a concern, and intelligible signals can be reconstructed in reasonable processing time (about ten seconds of computational time for a one-second signal sampled at [InlineEquation not available: see fulltext.]). Work on bringing the algorithms into the real-time processing domain is ongoing.
Continuous intensity map optimization (CIMO): A novel approach to leaf sequencing in step and shoot IMRT

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cao Daliang; Earl, Matthew A.; Luan, Shuang

2006-04-15

A new leaf-sequencing approach has been developed that is designed to reduce the number of required beam segments for step-and-shoot intensity modulated radiation therapy (IMRT). This approach to leaf sequencing is called continuous-intensity-map-optimization (CIMO). Using a simulated annealing algorithm, CIMO seeks to minimize differences between the optimized and sequenced intensity maps. Two distinguishing features of the CIMO algorithm are (1) CIMO does not require that each optimized intensity map be clustered into discrete levels and (2) CIMO is not rule-based but rather simultaneously optimizes both the aperture shapes and weights. To test the CIMO algorithm, ten IMRT patient cases weremore » selected (four head-and-neck, two pancreas, two prostate, one brain, and one pelvis). For each case, the optimized intensity maps were extracted from the Pinnacle{sup 3} treatment planning system. The CIMO algorithm was applied, and the optimized aperture shapes and weights were loaded back into Pinnacle. A final dose calculation was performed using Pinnacle's convolution/superposition based dose calculation. On average, the CIMO algorithm provided a 54% reduction in the number of beam segments as compared with Pinnacle's leaf sequencer. The plans sequenced using the CIMO algorithm also provided improved target dose uniformity and a reduced discrepancy between the optimized and sequenced intensity maps. For ten clinical intensity maps, comparisons were performed between the CIMO algorithm and the power-of-two reduction algorithm of Xia and Verhey [Med. Phys. 25(8), 1424-1434 (1998)]. When the constraints of a Varian Millennium multileaf collimator were applied, the CIMO algorithm resulted in a 26% reduction in the number of segments. For an Elekta multileaf collimator, the CIMO algorithm resulted in a 67% reduction in the number of segments. An average leaf sequencing time of less than one minute per beam was observed.« less
A fast and pragmatic approach for scatter correction in flat-detector CT using elliptic modeling and iterative optimization

NASA Astrophysics Data System (ADS)

Meyer, Michael; Kalender, Willi A.; Kyriakou, Yiannis

2010-01-01

Scattered radiation is a major source of artifacts in flat detector computed tomography (FDCT) due to the increased irradiated volumes. We propose a fast projection-based algorithm for correction of scatter artifacts. The presented algorithm combines a convolution method to determine the spatial distribution of the scatter intensity distribution with an object-size-dependent scaling of the scatter intensity distributions using a priori information generated by Monte Carlo simulations. A projection-based (PBSE) and an image-based (IBSE) strategy for size estimation of the scanned object are presented. Both strategies provide good correction and comparable results; the faster PBSE strategy is recommended. Even with such a fast and simple algorithm that in the PBSE variant does not rely on reconstructed volumes or scatter measurements, it is possible to provide a reasonable scatter correction even for truncated scans. For both simulations and measurements, scatter artifacts were significantly reduced and the algorithm showed stable behavior in the z-direction. For simulated voxelized head, hip and thorax phantoms, a figure of merit Q of 0.82, 0.76 and 0.77 was reached, respectively (Q = 0 for uncorrected, Q = 1 for ideal). For a water phantom with 15 cm diameter, for example, a cupping reduction from 10.8% down to 2.1% was achieved. The performance of the correction method has limitations in the case of measurements using non-ideal detectors, intensity calibration, etc. An iterative approach to overcome most of these limitations was proposed. This approach is based on root finding of a cupping metric and may be useful for other scatter correction methods as well. By this optimization, cupping of the measured water phantom was further reduced down to 0.9%. The algorithm was evaluated on a commercial system including truncated and non-homogeneous clinically relevant objects.
Acoustic intensity calculations for axisymmetrically modeled fluid regions

NASA Technical Reports Server (NTRS)

Hambric, Stephen A.; Everstine, Gordon C.

1992-01-01

An algorithm for calculating acoustic intensities from a time harmonic pressure field in an axisymmetric fluid region is presented. Acoustic pressures are computed in a mesh of NASTRAN triangular finite elements of revolution (TRIAAX) using an analogy between the scalar wave equation and elasticity equations. Acoustic intensities are then calculated from pressures and pressure derivatives taken over the mesh of TRIAAX elements. Intensities are displayed as vectors indicating the directions and magnitudes of energy flow at all mesh points in the acoustic field. A prolate spheroidal shell is modeled with axisymmetric shell elements (CONEAX) and submerged in a fluid region of TRIAAX elements. The model is analyzed to illustrate the acoustic intensity method and the usefulness of energy flow paths in the understanding of the response of fluid-structure interaction problems. The structural-acoustic analogy used is summarized for completeness. This study uncovered a NASTRAN limitation involving numerical precision issues in the CONEAX stiffness calculation causing large errors in the system matrices for nearly cylindrical cones.
Fast direct fourier reconstruction of radial and PROPELLER MRI data using the chirp transform algorithm on graphics hardware.

PubMed

Feng, Yanqiu; Song, Yanli; Wang, Cong; Xin, Xuegang; Feng, Qianjin; Chen, Wufan

2013-10-01

To develop and test a new algorithm for fast direct Fourier transform (DrFT) reconstruction of MR data on non-Cartesian trajectories composed of lines with equally spaced points. The DrFT, which is normally used as a reference in evaluating the accuracy of other reconstruction methods, can reconstruct images directly from non-Cartesian MR data without interpolation. However, DrFT reconstruction involves substantially intensive computation, which makes the DrFT impractical for clinical routine applications. In this article, the Chirp transform algorithm was introduced to accelerate the DrFT reconstruction of radial and Periodically Rotated Overlapping ParallEL Lines with Enhanced Reconstruction (PROPELLER) MRI data located on the trajectories that are composed of lines with equally spaced points. The performance of the proposed Chirp transform algorithm-DrFT algorithm was evaluated by using simulation and in vivo MRI data. After implementing the algorithm on a graphics processing unit, the proposed Chirp transform algorithm-DrFT algorithm achieved an acceleration of approximately one order of magnitude, and the speed-up factor was further increased to approximately three orders of magnitude compared with the traditional single-thread DrFT reconstruction. Implementation the Chirp transform algorithm-DrFT algorithm on the graphics processing unit can efficiently calculate the DrFT reconstruction of the radial and PROPELLER MRI data. Copyright © 2012 Wiley Periodicals, Inc.
Demons deformable registration of CT and cone-beam CT using an iterative intensity matching approach

PubMed Central

Nithiananthan, Sajendra; Schafer, Sebastian; Uneri, Ali; Mirota, Daniel J.; Stayman, J. Webster; Zbijewski, Wojciech; Brock, Kristy K.; Daly, Michael J.; Chan, Harley; Irish, Jonathan C.; Siewerdsen, Jeffrey H.

2011-01-01

Purpose: A method of intensity-based deformable registration of CT and cone-beam CT (CBCT) images is described, in which intensity correction occurs simultaneously within the iterative registration process. The method preserves the speed and simplicity of the popular Demons algorithm while providing robustness and accuracy in the presence of large mismatch between CT and CBCT voxel values (“intensity”). Methods: A variant of the Demons algorithm was developed in which an estimate of the relationship between CT and CBCT intensity values for specific materials in the image is computed at each iteration based on the set of currently overlapping voxels. This tissue-specific intensity correction is then used to estimate the registration output for that iteration and the process is repeated. The robustness of the method was tested in CBCT images of a cadaveric head exhibiting a broad range of simulated intensity variations associated with x-ray scatter, object truncation, and∕or errors in the reconstruction algorithm. The accuracy of CT-CBCT registration was also measured in six real cases, exhibiting deformations ranging from simple to complex during surgery or radiotherapy guided by a CBCT-capable C-arm or linear accelerator, respectively. Results: The iterative intensity matching approach was robust against all levels of intensity variation examined, including spatially varying errors in voxel value of a factor of 2 or more, as can be encountered in cases of high x-ray scatter. Registration accuracy without intensity matching degraded severely with increasing magnitude of intensity error and introduced image distortion. A single histogram match performed prior to registration alleviated some of these effects but was also prone to image distortion and was quantifiably less robust and accurate than the iterative approach. Within the six case registration accuracy study, iterative intensity matching Demons reduced mean TRE to (2.5±2.8) mm compared to (3.5±3.0) mm with rigid registration. Conclusions: A method was developed to iteratively correct CT-CBCT intensity disparity during Demons registration, enabling fast, intensity-based registration in CBCT-guided procedures such as surgery and radiotherapy, in which CBCT voxel values may be inaccurate. Accurate CT-CBCT registration in turn facilitates registration of multimodality preoperative image and planning data to intraoperative CBCT by way of the preoperative CT, thereby linking the intraoperative frame of reference to a wealth of preoperative information that could improve interventional guidance. PMID:21626913
An Evolutionary Method for Financial Forecasting in Microscopic High-Speed Trading Environment.

PubMed

Huang, Chien-Feng; Li, Hsu-Chih

2017-01-01

The advancement of information technology in financial applications nowadays have led to fast market-driven events that prompt flash decision-making and actions issued by computer algorithms. As a result, today's markets experience intense activity in the highly dynamic environment where trading systems respond to others at a much faster pace than before. This new breed of technology involves the implementation of high-speed trading strategies which generate significant portion of activity in the financial markets and present researchers with a wealth of information not available in traditional low-speed trading environments. In this study, we aim at developing feasible computational intelligence methodologies, particularly genetic algorithms (GA), to shed light on high-speed trading research using price data of stocks on the microscopic level. Our empirical results show that the proposed GA-based system is able to improve the accuracy of the prediction significantly for price movement, and we expect this GA-based methodology to advance the current state of research for high-speed trading and other relevant financial applications.

High-Performance Signal Detection for Adverse Drug Events using MapReduce Paradigm.

PubMed

Fan, Kai; Sun, Xingzhi; Tao, Ying; Xu, Linhao; Wang, Chen; Mao, Xianling; Peng, Bo; Pan, Yue

2010-11-13

Post-marketing pharmacovigilance is important for public health, as many Adverse Drug Events (ADEs) are unknown when those drugs were approved for marketing. However, due to the large number of reported drugs and drug combinations, detecting ADE signals by mining these reports is becoming a challenging task in terms of computational complexity. Recently, a parallel programming model, MapReduce has been introduced by Google to support large-scale data intensive applications. In this study, we proposed a MapReduce-based algorithm, for common ADE detection approach, Proportional Reporting Ratio (PRR), and tested it in mining spontaneous ADE reports from FDA. The purpose is to investigate the possibility of using MapReduce principle to speed up biomedical data mining tasks using this pharmacovigilance case as one specific example. The results demonstrated that MapReduce programming model could improve the performance of common signal detection algorithm for pharmacovigilance in a distributed computation environment at approximately liner speedup rates.
Optimization and real-time control for laser treatment of heterogeneous soft tissues.

PubMed

Feng, Yusheng; Fuentes, David; Hawkins, Andrea; Bass, Jon M; Rylander, Marissa Nichole

2009-01-01

Predicting the outcome of thermotherapies in cancer treatment requires an accurate characterization of the bioheat transfer processes in soft tissues. Due to the biological and structural complexity of tumor (soft tissue) composition and vasculature, it is often very difficult to obtain reliable tissue properties that is one of the key factors for the accurate treatment outcome prediction. Efficient algorithms employing in vivo thermal measurements to determine heterogeneous thermal tissues properties in conjunction with a detailed sensitivity analysis can produce essential information for model development and optimal control. The goals of this paper are to present a general formulation of the bioheat transfer equation for heterogeneous soft tissues, review models and algorithms developed for cell damage, heat shock proteins, and soft tissues with nanoparticle inclusion, and demonstrate an overall computational strategy for developing a laser treatment framework with the ability to perform real-time robust calibrations and optimal control. This computational strategy can be applied to other thermotherapies using the heat source such as radio frequency or high intensity focused ultrasound.
Adaptively combined FIR and functional link artificial neural network equalizer for nonlinear communication channel.

PubMed

Zhao, Haiquan; Zhang, Jiashu

2009-04-01

This paper proposes a novel computational efficient adaptive nonlinear equalizer based on combination of finite impulse response (FIR) filter and functional link artificial neural network (CFFLANN) to compensate linear and nonlinear distortions in nonlinear communication channel. This convex nonlinear combination results in improving the speed while retaining the lower steady-state error. In addition, since the CFFLANN needs not the hidden layers, which exist in conventional neural-network-based equalizers, it exhibits a simpler structure than the traditional neural networks (NNs) and can require less computational burden during the training mode. Moreover, appropriate adaptation algorithm for the proposed equalizer is derived by the modified least mean square (MLMS). Results obtained from the simulations clearly show that the proposed equalizer using the MLMS algorithm can availably eliminate various intensity linear and nonlinear distortions, and be provided with better anti-jamming performance. Furthermore, comparisons of the mean squared error (MSE), the bit error rate (BER), and the effect of eigenvalue ratio (EVR) of input correlation matrix are presented.
Automated medication reconciliation and complexity of care transitions.

PubMed

Silva, Pamela A Bozzo; Bernstam, Elmer V; Markowitz, Eliz; Johnson, Todd R; Zhang, Jiajie; Herskovic, Jorge R

2011-01-01

Medication reconciliation is a National Patient Safety Goal (NPSG) from The Joint Commission (TJC) that entails reviewing all medications a patient takes after a health care transition. Medication reconciliation is a resource-intensive, error-prone task, and the resources to accomplish it may not be routinely available. Computer-based methods have the potential to overcome these barriers. We designed and explored a rule-based medication reconciliation algorithm to accomplish this task across different healthcare transitions. We tested our algorithm on a random sample of 94 transitions from the Clinical Data Warehouse at the University of Texas Health Science Center at Houston. We found that the algorithm reconciled, on average, 23.4% of the potentially reconcilable medications. Our study did not have sufficient statistical power to establish whether the kind of transition affects reconcilability. We conclude that automated reconciliation is possible and will help accomplish the NPSG.
Fashion sketch design by interactive genetic algorithms

NASA Astrophysics Data System (ADS)

Mok, P. Y.; Wang, X. X.; Xu, J.; Kwok, Y. L.

2012-11-01

Computer aided design is vitally important for the modern industry, particularly for the creative industry. Fashion industry faced intensive challenges to shorten the product development process. In this paper, a methodology is proposed for sketch design based on interactive genetic algorithms. The sketch design system consists of a sketch design model, a database and a multi-stage sketch design engine. First, a sketch design model is developed based on the knowledge of fashion design to describe fashion product characteristics by using parameters. Second, a database is built based on the proposed sketch design model to define general style elements. Third, a multi-stage sketch design engine is used to construct the design. Moreover, an interactive genetic algorithm (IGA) is used to accelerate the sketch design process. The experimental results have demonstrated that the proposed method is effective in helping laypersons achieve satisfied fashion design sketches.
Reducing the number of templates for aligned-spin compact binary coalescence gravitational wave searches using metric-agnostic template nudging

NASA Astrophysics Data System (ADS)

Indik, Nathaniel; Fehrmann, Henning; Harke, Franz; Krishnan, Badri; Nielsen, Alex B.

2018-06-01

Efficient multidimensional template placement is crucial in computationally intensive matched-filtering searches for gravitational waves (GWs). Here, we implement the neighboring cell algorithm (NCA) to improve the detection volume of an existing compact binary coalescence (CBC) template bank. This algorithm has already been successfully applied for a binary millisecond pulsar search in data from the Fermi satellite. It repositions templates from overdense regions to underdense regions and reduces the number of templates that would have been required by a stochastic method to achieve the same detection volume. Our method is readily generalizable to other CBC parameter spaces. Here we apply this method to the aligned-single-spin neutron star-black hole binary coalescence inspiral-merger-ringdown gravitational wave parameter space. We show that the template nudging algorithm can attain the equivalent effectualness of the stochastic method with 12% fewer templates.
An algorithm for the detection and characterisation of volcanic plumes using thermal camera imagery

NASA Astrophysics Data System (ADS)

Bombrun, Maxime; Jessop, David; Harris, Andrew; Barra, Vincent

2018-02-01

Volcanic plumes are turbulent mixtures of particles and gas which are injected into the atmosphere during a volcanic eruption. Depending on the intensity of the eruption, plumes can rise from a few tens of metres up to many tens of kilometres above the vent and thus, present a major hazard for the surrounding population. Currently, however, few if any algorithms are available for automated plume tracking and assessment. Here, we present a new image processing algorithm for segmentation, tracking and parameters extraction of convective plume recorded with thermal cameras. We used thermal video of two volcanic eruptions and two plumes simulated in laboratory to develop and test an efficient technique for analysis of volcanic plumes. We validated our method by two different approaches. First, we compare our segmentation method to previously published algorithms. Next, we computed plume parameters, such as height, width and spreading angle at regular intervals of time. These parameters allowed us to calculate an entrainment coefficient and obtain information about the entrainment efficiency in Strombolian eruptions. Our proposed algorithm is rapid, automated while producing better visual outlines compared to the other segmentation algorithms, and provides output that is at least as accurate as manual measurements of plumes.
CAMPAIGN: an open-source library of GPU-accelerated data clustering algorithms.

PubMed

Kohlhoff, Kai J; Sosnick, Marc H; Hsu, William T; Pande, Vijay S; Altman, Russ B

2011-08-15

Data clustering techniques are an essential component of a good data analysis toolbox. Many current bioinformatics applications are inherently compute-intense and work with very large datasets. Sequential algorithms are inadequate for providing the necessary performance. For this reason, we have created Clustering Algorithms for Massively Parallel Architectures, Including GPU Nodes (CAMPAIGN), a central resource for data clustering algorithms and tools that are implemented specifically for execution on massively parallel processing architectures. CAMPAIGN is a library of data clustering algorithms and tools, written in 'C for CUDA' for Nvidia GPUs. The library provides up to two orders of magnitude speed-up over respective CPU-based clustering algorithms and is intended as an open-source resource. New modules from the community will be accepted into the library and the layout of it is such that it can easily be extended to promising future platforms such as OpenCL. Releases of the CAMPAIGN library are freely available for download under the LGPL from https://simtk.org/home/campaign. Source code can also be obtained through anonymous subversion access as described on https://simtk.org/scm/?group_id=453. kjk33@cantab.net.
Computation of Symmetric Discrete Cosine Transform Using Bakhvalov's Algorithm

NASA Technical Reports Server (NTRS)

Aburdene, Maurice F.; Strojny, Brian C.; Dorband, John E.

2005-01-01

A number of algorithms for recursive computation of the discrete cosine transform (DCT) have been developed recently. This paper presents a new method for computing the discrete cosine transform and its inverse using Bakhvalov's algorithm, a method developed for evaluation of a polynomial at a point. In this paper, we will focus on both the application of the algorithm to the computation of the DCT-I and its complexity. In addition, Bakhvalov s algorithm is compared with Clenshaw s algorithm for the computation of the DCT.
Hybrid optimization and Bayesian inference techniques for a non-smooth radiation detection problem

DOE PAGES

Stefanescu, Razvan; Schmidt, Kathleen; Hite, Jason; ...

2016-12-12

In this paper, we propose several algorithms to recover the location and intensity of a radiation source located in a simulated 250 × 180 m block of an urban center based on synthetic measurements. Radioactive decay and detection are Poisson random processes, so we employ likelihood functions based on this distribution. Owing to the domain geometry and the proposed response model, the negative logarithm of the likelihood is only piecewise continuous differentiable, and it has multiple local minima. To address these difficulties, we investigate three hybrid algorithms composed of mixed optimization techniques. For global optimization, we consider simulated annealing, particlemore » swarm, and genetic algorithm, which rely solely on objective function evaluations; that is, they do not evaluate the gradient in the objective function. By employing early stopping criteria for the global optimization methods, a pseudo-optimum point is obtained. This is subsequently utilized as the initial value by the deterministic implicit filtering method, which is able to find local extrema in non-smooth functions, to finish the search in a narrow domain. These new hybrid techniques, combining global optimization and implicit filtering address, difficulties associated with the non-smooth response, and their performances, are shown to significantly decrease the computational time over the global optimization methods. To quantify uncertainties associated with the source location and intensity, we employ the delayed rejection adaptive Metropolis and DiffeRential Evolution Adaptive Metropolis algorithms. Finally, marginal densities of the source properties are obtained, and the means of the chains compare accurately with the estimates produced by the hybrid algorithms.« less
Computation of Material Demand in the Risk Assessment and Mitigation Framework for Strategic Materials (RAMF-SM) Process

DTIC Science & Technology

2015-08-01

Congress concerning requirements for the National Defense Stockpile (NDS) of strategic and critical non- fuel materials. 1 RAMF-SM, which was...critical non- fuel materials. The NDS was established in the World War II era and has been managed by the Department of Defense (DOD) since 1988. By...Department of the Interior. An alternative algorithm is used for materials with intensive defense demands. v Contents 1 . Introduction
The Virtual Earthquake and Seismology Research Community e-science environment in Europe (VERCE) FP7-INFRA-2011-2 project

NASA Astrophysics Data System (ADS)

Vilotte, J.-P.; Atkinson, M.; Michelini, A.; Igel, H.; van Eck, T.

2012-04-01

Increasingly dense seismic and geodetic networks are continuously transmitting a growing wealth of data from around the world. The multi-use of these data leaded the seismological community to pioneer globally distributed open-access data infrastructures, standard services and formats, e.g., the Federation of Digital Seismic Networks (FDSN) and the European Integrated Data Archives (EIDA). Our ability to acquire observational data outpaces our ability to manage, analyze and model them. Research in seismology is today facing a fundamental paradigm shift. Enabling advanced data-intensive analysis and modeling applications challenges conventional storage, computation and communication models and requires a new holistic approach. It is instrumental to exploit the cornucopia of data, and to guarantee optimal operation and design of the high-cost monitoring facilities. The strategy of VERCE is driven by the needs of the seismological data-intensive applications in data analysis and modeling. It aims to provide a comprehensive architecture and framework adapted to the scale and the diversity of those applications, and integrating the data infrastructures with Grid, Cloud and HPC infrastructures. It will allow prototyping solutions for new use cases as they emerge within the European Plate Observatory Systems (EPOS), the ESFRI initiative of the solid Earth community. Computational seismology, and information management, is increasingly revolving around massive amounts of data that stem from: (1) the flood of data from the observational systems; (2) the flood of data from large-scale simulations and inversions; (3) the ability to economically store petabytes of data online; (4) the evolving Internet and Data-aware computing capabilities. As data-intensive applications are rapidly increasing in scale and complexity, they require additional services-oriented architectures offering a virtualization-based flexibility for complex and re-usable workflows. Scientific information management poses computer science challenges: acquisition, organization, query and visualization tasks scale almost linearly with the data volumes. Commonly used FTP-GREP metaphor allows today to scan gigabyte-sized datasets but will not work for scanning terabyte-sized continuous waveform datasets. New data analysis and modeling methods, exploiting the signal coherence within dense network arrays, are nonlinear. Pair-algorithms on N points scale as N2. Wave form inversion and stochastic simulations raise computing and data handling challenges These applications are unfeasible for tera-scale datasets without new parallel algorithms that use near-linear processing, storage and bandwidth, and that can exploit new computing paradigms enabled by the intersection of several technologies (HPC, parallel scalable database crawler, data-aware HPC). This issues will be discussed based on a number of core pilot data-intensive applications and use cases retained in VERCE. This core applications are related to: (1) data processing and data analysis methods based on correlation techniques; (2) cpu-intensive applications such as large-scale simulation of synthetic waveforms in complex earth systems, and full waveform inversion and tomography. We shall analyze their workflow and data flow, and their requirements for a new service-oriented architecture and a data-aware platform with services and tools. Finally, we will outline the importance of a new collaborative environment between seismology and computer science, together with the need for the emergence and the recognition of 'research technologists' mastering the evolving data-aware technologies and the data-intensive research goals in seismology.
An efficient parallel termination detection algorithm

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, A. H.; Crivelli, S.; Jessup, E. R.

2004-05-27

Information local to any one processor is insufficient to monitor the overall progress of most distributed computations. Typically, a second distributed computation for detecting termination of the main computation is necessary. In order to be a useful computational tool, the termination detection routine must operate concurrently with the main computation, adding minimal overhead, and it must promptly and correctly detect termination when it occurs. In this paper, we present a new algorithm for detecting the termination of a parallel computation on distributed-memory MIMD computers that satisfies all of those criteria. A variety of termination detection algorithms have been devised. Ofmore » these, the algorithm presented by Sinha, Kale, and Ramkumar (henceforth, the SKR algorithm) is unique in its ability to adapt to the load conditions of the system on which it runs, thereby minimizing the impact of termination detection on performance. Because their algorithm also detects termination quickly, we consider it to be the most efficient practical algorithm presently available. The termination detection algorithm presented here was developed for use in the PMESC programming library for distributed-memory MIMD computers. Like the SKR algorithm, our algorithm adapts to system loads and imposes little overhead. Also like the SKR algorithm, ours is tree-based, and it does not depend on any assumptions about the physical interconnection topology of the processors or the specifics of the distributed computation. In addition, our algorithm is easier to implement and requires only half as many tree traverses as does the SKR algorithm. This paper is organized as follows. In section 2, we define our computational model. In section 3, we review the SKR algorithm. We introduce our new algorithm in section 4, and prove its correctness in section 5. We discuss its efficiency and present experimental results in section 6.« less
Accelerated Compressed Sensing Based CT Image Reconstruction.

PubMed

Hashemi, SayedMasoud; Beheshti, Soosan; Gill, Patrick R; Paul, Narinder S; Cobbold, Richard S C

2015-01-01

In X-ray computed tomography (CT) an important objective is to reduce the radiation dose without significantly degrading the image quality. Compressed sensing (CS) enables the radiation dose to be reduced by producing diagnostic images from a limited number of projections. However, conventional CS-based algorithms are computationally intensive and time-consuming. We propose a new algorithm that accelerates the CS-based reconstruction by using a fast pseudopolar Fourier based Radon transform and rebinning the diverging fan beams to parallel beams. The reconstruction process is analyzed using a maximum-a-posterior approach, which is transformed into a weighted CS problem. The weights involved in the proposed model are calculated based on the statistical characteristics of the reconstruction process, which is formulated in terms of the measurement noise and rebinning interpolation error. Therefore, the proposed method not only accelerates the reconstruction, but also removes the rebinning and interpolation errors. Simulation results are shown for phantoms and a patient. For example, a 512 × 512 Shepp-Logan phantom when reconstructed from 128 rebinned projections using a conventional CS method had 10% error, whereas with the proposed method the reconstruction error was less than 1%. Moreover, computation times of less than 30 sec were obtained using a standard desktop computer without numerical optimization.
Accelerated Compressed Sensing Based CT Image Reconstruction

PubMed Central

Hashemi, SayedMasoud; Beheshti, Soosan; Gill, Patrick R.; Paul, Narinder S.; Cobbold, Richard S. C.

2015-01-01

In X-ray computed tomography (CT) an important objective is to reduce the radiation dose without significantly degrading the image quality. Compressed sensing (CS) enables the radiation dose to be reduced by producing diagnostic images from a limited number of projections. However, conventional CS-based algorithms are computationally intensive and time-consuming. We propose a new algorithm that accelerates the CS-based reconstruction by using a fast pseudopolar Fourier based Radon transform and rebinning the diverging fan beams to parallel beams. The reconstruction process is analyzed using a maximum-a-posterior approach, which is transformed into a weighted CS problem. The weights involved in the proposed model are calculated based on the statistical characteristics of the reconstruction process, which is formulated in terms of the measurement noise and rebinning interpolation error. Therefore, the proposed method not only accelerates the reconstruction, but also removes the rebinning and interpolation errors. Simulation results are shown for phantoms and a patient. For example, a 512 × 512 Shepp-Logan phantom when reconstructed from 128 rebinned projections using a conventional CS method had 10% error, whereas with the proposed method the reconstruction error was less than 1%. Moreover, computation times of less than 30 sec were obtained using a standard desktop computer without numerical optimization. PMID:26167200
Efficient volumetric estimation from plenoptic data

NASA Astrophysics Data System (ADS)

Anglin, Paul; Reeves, Stanley J.; Thurow, Brian S.

2013-03-01

The commercial release of the Lytro camera, and greater availability of plenoptic imaging systems in general, have given the image processing community cost-effective tools for light-field imaging. While this data is most commonly used to generate planar images at arbitrary focal depths, reconstruction of volumetric fields is also possible. Similarly, deconvolution is a technique that is conventionally used in planar image reconstruction, or deblurring, algorithms. However, when leveraged with the ability of a light-field camera to quickly reproduce multiple focal planes within an imaged volume, deconvolution offers a computationally efficient method of volumetric reconstruction. Related research has shown than light-field imaging systems in conjunction with tomographic reconstruction techniques are also capable of estimating the imaged volume and have been successfully applied to particle image velocimetry (PIV). However, while tomographic volumetric estimation through algorithms such as multiplicative algebraic reconstruction techniques (MART) have proven to be highly accurate, they are computationally intensive. In this paper, the reconstruction problem is shown to be solvable by deconvolution. Deconvolution offers significant improvement in computational efficiency through the use of fast Fourier transforms (FFTs) when compared to other tomographic methods. This work describes a deconvolution algorithm designed to reconstruct a 3-D particle field from simulated plenoptic data. A 3-D extension of existing 2-D FFT-based refocusing techniques is presented to further improve efficiency when computing object focal stacks and system point spread functions (PSF). Reconstruction artifacts are identified; their underlying source and methods of mitigation are explored where possible, and reconstructions of simulated particle fields are provided.
Performance Comparison of Big Data Analytics With NEXUS and Giovanni

NASA Astrophysics Data System (ADS)

Jacob, J. C.; Huang, T.; Lynnes, C.

2016-12-01

NEXUS is an emerging data-intensive analysis framework developed with a new approach for handling science data that enables large-scale data analysis. It is available through open source. We compare performance of NEXUS and Giovanni for 3 statistics algorithms applied to NASA datasets. Giovanni is a statistics web service at NASA Distributed Active Archive Centers (DAACs). NEXUS is a cloud-computing environment developed at JPL and built on Apache Solr, Cassandra, and Spark. We compute global time-averaged map, correlation map, and area-averaged time series. The first two algorithms average over time to produce a value for each pixel in a 2-D map. The third algorithm averages spatially to produce a single value for each time step. This talk is our report on benchmark comparison findings that indicate 15x speedup with NEXUS over Giovanni to compute area-averaged time series of daily precipitation rate for the Tropical Rainfall Measuring Mission (TRMM with 0.25 degree spatial resolution) for the Continental United States over 14 years (2000-2014) with 64-way parallelism and 545 tiles per granule. 16-way parallelism with 16 tiles per granule worked best with NEXUS for computing an 18-year (1998-2015) TRMM daily precipitation global time averaged map (2.5 times speedup) and 18-year global map of correlation between TRMM daily precipitation and TRMM real time daily precipitation (7x speedup). These and other benchmark results will be presented along with key lessons learned in applying the NEXUS tiling approach to big data analytics in the cloud.
SAR processing in the cloud for oil detection in the Arctic

NASA Astrophysics Data System (ADS)

Garron, J.; Stoner, C.; Meyer, F. J.

2016-12-01

A new world of opportunity is being thawed from the ice of the Arctic, driven by decreased persistent Arctic sea-ice cover, increases in shipping, tourism, natural resource development. Tools that can automatically monitor key sea ice characteristics and potential oil spills are essential for safe passage in these changing waters. Synthetic aperture radar (SAR) data can be used to discriminate sea ice types and oil on the ocean surface and also for feature tracking. Additionally, SAR can image the earth through the night and most weather conditions. SAR data is volumetrically large and requires significant computing power to manipulate. Algorithms designed to identify key environmental features, like oil spills, in SAR imagery require secondary processing, and are computationally intensive, which can functionally limit their application in a real-time setting. Cloud processing is designed to manage big data and big data processing jobs by means of small cycles of off-site computations, eliminating up-front hardware costs. Pairing SAR data with cloud processing has allowed us to create and solidify a processing pipeline for SAR data products in the cloud to compare operational algorithms efficiency and effectiveness when run using an Alaska Satellite Facility (ASF) defined Amazon Machine Image (AMI). The products created from this secondary processing, were compared to determine which algorithm was most accurate in Arctic feature identification, and what operational conditions were required to produce the results on the ASF defined AMI. Results will be used to inform a series of recommendations to oil-spill response data managers and SAR users interested in expanding their analytical computing power.
Rapid algorithm prototyping and implementation for power quality measurement

NASA Astrophysics Data System (ADS)

Kołek, Krzysztof; Piątek, Krzysztof

2015-12-01

This article presents a Model-Based Design (MBD) approach to rapidly implement power quality (PQ) metering algorithms. Power supply quality is a very important aspect of modern power systems and will become even more important in future smart grids. In this case, maintaining the PQ parameters at the desired level will require efficient implementation methods of the metering algorithms. Currently, the development of new, advanced PQ metering algorithms requires new hardware with adequate computational capability and time intensive, cost-ineffective manual implementations. An alternative, considered here, is an MBD approach. The MBD approach focuses on the modelling and validation of the model by simulation, which is well-supported by a Computer-Aided Engineering (CAE) packages. This paper presents two algorithms utilized in modern PQ meters: a phase-locked loop based on an Enhanced Phase Locked Loop (EPLL), and the flicker measurement according to the IEC 61000-4-15 standard. The algorithms were chosen because of their complexity and non-trivial development. They were first modelled in the MATLAB/Simulink package, then tested and validated in a simulation environment. The models, in the form of Simulink diagrams, were next used to automatically generate C code. The code was compiled and executed in real-time on the Zynq Xilinx platform that combines a reconfigurable Field Programmable Gate Array (FPGA) with a dual-core processor. The MBD development of PQ algorithms, automatic code generation, and compilation form a rapid algorithm prototyping and implementation path for PQ measurements. The main advantage of this approach is the ability to focus on the design, validation, and testing stages while skipping over implementation issues. The code generation process renders production-ready code that can be easily used on the target hardware. This is especially important when standards for PQ measurement are in constant development, and the PQ issues in emerging smart grids will require tools for rapid development and implementation of such algorithms.
Standardized evaluation of algorithms for computer-aided diagnosis of dementia based on structural MRI: the CADDementia challenge.

PubMed

Bron, Esther E; Smits, Marion; van der Flier, Wiesje M; Vrenken, Hugo; Barkhof, Frederik; Scheltens, Philip; Papma, Janne M; Steketee, Rebecca M E; Méndez Orellana, Carolina; Meijboom, Rozanna; Pinto, Madalena; Meireles, Joana R; Garrett, Carolina; Bastos-Leite, António J; Abdulkadir, Ahmed; Ronneberger, Olaf; Amoroso, Nicola; Bellotti, Roberto; Cárdenas-Peña, David; Álvarez-Meza, Andrés M; Dolph, Chester V; Iftekharuddin, Khan M; Eskildsen, Simon F; Coupé, Pierrick; Fonov, Vladimir S; Franke, Katja; Gaser, Christian; Ledig, Christian; Guerrero, Ricardo; Tong, Tong; Gray, Katherine R; Moradi, Elaheh; Tohka, Jussi; Routier, Alexandre; Durrleman, Stanley; Sarica, Alessia; Di Fatta, Giuseppe; Sensi, Francesco; Chincarini, Andrea; Smith, Garry M; Stoyanov, Zhivko V; Sørensen, Lauge; Nielsen, Mads; Tangaro, Sabina; Inglese, Paolo; Wachinger, Christian; Reuter, Martin; van Swieten, John C; Niessen, Wiro J; Klein, Stefan

2015-05-01

Algorithms for computer-aided diagnosis of dementia based on structural MRI have demonstrated high performance in the literature, but are difficult to compare as different data sets and methodology were used for evaluation. In addition, it is unclear how the algorithms would perform on previously unseen data, and thus, how they would perform in clinical practice when there is no real opportunity to adapt the algorithm to the data at hand. To address these comparability, generalizability and clinical applicability issues, we organized a grand challenge that aimed to objectively compare algorithms based on a clinically representative multi-center data set. Using clinical practice as the starting point, the goal was to reproduce the clinical diagnosis. Therefore, we evaluated algorithms for multi-class classification of three diagnostic groups: patients with probable Alzheimer's disease, patients with mild cognitive impairment and healthy controls. The diagnosis based on clinical criteria was used as reference standard, as it was the best available reference despite its known limitations. For evaluation, a previously unseen test set was used consisting of 354 T1-weighted MRI scans with the diagnoses blinded. Fifteen research teams participated with a total of 29 algorithms. The algorithms were trained on a small training set (n=30) and optionally on data from other sources (e.g., the Alzheimer's Disease Neuroimaging Initiative, the Australian Imaging Biomarkers and Lifestyle flagship study of aging). The best performing algorithm yielded an accuracy of 63.0% and an area under the receiver-operating-characteristic curve (AUC) of 78.8%. In general, the best performances were achieved using feature extraction based on voxel-based morphometry or a combination of features that included volume, cortical thickness, shape and intensity. The challenge is open for new submissions via the web-based framework: http://caddementia.grand-challenge.org. Copyright © 2015 Elsevier Inc. All rights reserved.

QPSO-Based Adaptive DNA Computing Algorithm

PubMed Central

Karakose, Mehmet; Cigdem, Ugur

2013-01-01

DNA (deoxyribonucleic acid) computing that is a new computation model based on DNA molecules for information storage has been increasingly used for optimization and data analysis in recent years. However, DNA computing algorithm has some limitations in terms of convergence speed, adaptability, and effectiveness. In this paper, a new approach for improvement of DNA computing is proposed. This new approach aims to perform DNA computing algorithm with adaptive parameters towards the desired goal using quantum-behaved particle swarm optimization (QPSO). Some contributions provided by the proposed QPSO based on adaptive DNA computing algorithm are as follows: (1) parameters of population size, crossover rate, maximum number of operations, enzyme and virus mutation rate, and fitness function of DNA computing algorithm are simultaneously tuned for adaptive process, (2) adaptive algorithm is performed using QPSO algorithm for goal-driven progress, faster operation, and flexibility in data, and (3) numerical realization of DNA computing algorithm with proposed approach is implemented in system identification. Two experiments with different systems were carried out to evaluate the performance of the proposed approach with comparative results. Experimental results obtained with Matlab and FPGA demonstrate ability to provide effective optimization, considerable convergence speed, and high accuracy according to DNA computing algorithm. PMID:23935409
Numerical phase retrieval from beam intensity measurements in three planes

NASA Astrophysics Data System (ADS)

Bruel, Laurent

2003-05-01

A system and method have been developed at CEA to retrieve phase information from multiple intensity measurements along a laser beam. The device has been patented. Commonly used devices for beam measurement provide phase and intensity information separately or with a rather poor resolution whereas the MIROMA method provides both at the same time, allowing direct use of the results in numerical models. Usual phase retrieval algorithms use two intensity measurements, typically the image plane and the focal plane (Gerschberg-Saxton algorithm) related by a Fourier transform, or the image plane and a lightly defocus plane (D.L. Misell). The principal drawback of such iterative algorithms is their inability to provide unambiguous convergence in all situations. The algorithms can stagnate on bad solutions and the error between measured and calculated intensities remains unacceptable. If three planes rather than two are used, the data redundancy created confers to the method good convergence capability and noise immunity. It provides an excellent agreement between intensity determined from the retrieved phase data set in the image plane and intensity measurements in any diffraction plane. The method employed for MIROMA is inspired from GS algorithm, replacing Fourier transforms by a beam-propagating kernel with gradient search accelerating techniques and special care for phase branch cuts. A fast one dimensional algorithm provides an initial guess for the iterative algorithm. Applications of the algorithm on synthetic data find out the best reconstruction planes that have to be chosen. Robustness and sensibility are evaluated. Results on collimated and distorted laser beams are presented.
Anisotropic scattering of discrete particle arrays.

PubMed

Paul, Joseph S; Fu, Wai Chong; Dokos, Socrates; Box, Michael

2010-05-01

Far-field intensities of light scattered from a linear centro-symmetric array illuminated by a plane wave of incident light are estimated at a series of detector angles. The intensities are computed from the superposition of E-fields scattered by the individual array elements. An average scattering phase function is used to model the scattered fields of individual array elements. The nature of scattering from the array is investigated using an image (theta-phi plot) of the far-field intensities computed at a series of locations obtained by rotating the detector angle from 0 degrees to 360 degrees, corresponding to each angle of incidence in the interval [0 degrees 360 degrees]. The diffraction patterns observed from the theta-Phi plot are compared with those for isotropic scattering. In the absence of prior information on the array geometry, the intensities corresponding to theta-Phi pairs satisfying the Bragg condition are used to estimate the phase function. An algorithmic procedure is presented for this purpose and tested using synthetic data. The relative error between estimated and theoretical values of the phase function is shown to be determined by the mean spacing factor, the number of elements, and the far-field distance. An empirical relationship is presented to calculate the optimal far-field distance for a given specification of the percentage error.
Extraction of edge-based and region-based features for object recognition

NASA Astrophysics Data System (ADS)

Coutts, Benjamin; Ravi, Srinivas; Hu, Gongzhu; Shrikhande, Neelima

1993-08-01

One of the central problems of computer vision is object recognition. A catalogue of model objects is described as a set of features such as edges and surfaces. The same features are extracted from the scene and matched against the models for object recognition. Edges and surfaces extracted from the scenes are often noisy and imperfect. In this paper algorithms are described for improving low level edge and surface features. Existing edge extraction algorithms are applied to the intensity image to obtain edge features. Initial edges are traced by following directions of the current contour. These are improved by using corresponding depth and intensity information for decision making at branch points. Surface fitting routines are applied to the range image to obtain planar surface patches. An algorithm of region growing is developed that starts with a coarse segmentation and uses quadric surface fitting to iteratively merge adjacent regions into quadric surfaces based on approximate orthogonal distance regression. Surface information obtained is returned to the edge extraction routine to detect and remove fake edges. This process repeats until no more merging or edge improvement can take place. Both synthetic (with Gaussian noise) and real images containing multiple object scenes have been tested using the merging criteria. Results appeared quite encouraging.
Web-client based distributed generalization and geoprocessing

USGS Publications Warehouse

Wolf, E.B.; Howe, K.

2009-01-01

Generalization and geoprocessing operations on geospatial information were once the domain of complex software running on high-performance workstations. Currently, these computationally intensive processes are the domain of desktop applications. Recent efforts have been made to move geoprocessing operations server-side in a distributed, web accessible environment. This paper initiates research into portable client-side generalization and geoprocessing operations as part of a larger effort in user-centered design for the US Geological Survey's The National Map. An implementation of the Ramer-Douglas-Peucker (RDP) line simplification algorithm was created in the open source OpenLayers geoweb client. This algorithm implementation was benchmarked using differing data structures and browser platforms. The implementation and results of the benchmarks are discussed in the general context of client-side geoprocessing. (Abstract).
Gene selection in cancer classification using sparse logistic regression with Bayesian regularization.

PubMed

Cawley, Gavin C; Talbot, Nicola L C

2006-10-01

Gene selection algorithms for cancer classification, based on the expression of a small number of biomarker genes, have been the subject of considerable research in recent years. Shevade and Keerthi propose a gene selection algorithm based on sparse logistic regression (SLogReg) incorporating a Laplace prior to promote sparsity in the model parameters, and provide a simple but efficient training procedure. The degree of sparsity obtained is determined by the value of a regularization parameter, which must be carefully tuned in order to optimize performance. This normally involves a model selection stage, based on a computationally intensive search for the minimizer of the cross-validation error. In this paper, we demonstrate that a simple Bayesian approach can be taken to eliminate this regularization parameter entirely, by integrating it out analytically using an uninformative Jeffrey's prior. The improved algorithm (BLogReg) is then typically two or three orders of magnitude faster than the original algorithm, as there is no longer a need for a model selection step. The BLogReg algorithm is also free from selection bias in performance estimation, a common pitfall in the application of machine learning algorithms in cancer classification. The SLogReg, BLogReg and Relevance Vector Machine (RVM) gene selection algorithms are evaluated over the well-studied colon cancer and leukaemia benchmark datasets. The leave-one-out estimates of the probability of test error and cross-entropy of the BLogReg and SLogReg algorithms are very similar, however the BlogReg algorithm is found to be considerably faster than the original SLogReg algorithm. Using nested cross-validation to avoid selection bias, performance estimation for SLogReg on the leukaemia dataset takes almost 48 h, whereas the corresponding result for BLogReg is obtained in only 1 min 24 s, making BLogReg by far the more practical algorithm. BLogReg also demonstrates better estimates of conditional probability than the RVM, which are of great importance in medical applications, with similar computational expense. A MATLAB implementation of the sparse logistic regression algorithm with Bayesian regularization (BLogReg) is available from http://theoval.cmp.uea.ac.uk/~gcc/cbl/blogreg/
Angle Statistics Reconstruction: a robust reconstruction algorithm for Muon Scattering Tomography

NASA Astrophysics Data System (ADS)

Stapleton, M.; Burns, J.; Quillin, S.; Steer, C.

2014-11-01

Muon Scattering Tomography (MST) is a technique for using the scattering of cosmic ray muons to probe the contents of enclosed volumes. As a muon passes through material it undergoes multiple Coulomb scattering, where the amount of scattering is dependent on the density and atomic number of the material as well as the path length. Hence, MST has been proposed as a means of imaging dense materials, for instance to detect special nuclear material in cargo containers. Algorithms are required to generate an accurate reconstruction of the material density inside the volume from the muon scattering information and some have already been proposed, most notably the Point of Closest Approach (PoCA) and Maximum Likelihood/Expectation Maximisation (MLEM) algorithms. However, whilst PoCA-based algorithms are easy to implement, they perform rather poorly in practice. Conversely, MLEM is a complicated algorithm to implement and computationally intensive and there is currently no published, fast and easily-implementable algorithm that performs well in practice. In this paper, we first provide a detailed analysis of the source of inaccuracy in PoCA-based algorithms. We then motivate an alternative method, based on ideas first laid out by Morris et al, presenting and fully specifying an algorithm that performs well against simulations of realistic scenarios. We argue this new algorithm should be adopted by developers of Muon Scattering Tomography as an alternative to PoCA.
A comparison between computer-controlled and set work rate exercise based on target heart rate

NASA Technical Reports Server (NTRS)

Pratt, Wanda M.; Siconolfi, Steven F.; Webster, Laurie; Hayes, Judith C.; Mazzocca, Augustus D.; Harris, Bernard A., Jr.

1991-01-01

Two methods are compared for observing the heart rate (HR), metabolic equivalents, and time in target HR zone (defined as the target HR + or - 5 bpm) during 20 min of exercise at a prescribed intensity of the maximum working capacity. In one method, called set-work rate exercise, the information from a graded exercise test is used to select a target HR and to calculate a corresponding constant work rate that should induce the desired HR. In the other method, the work rate is controlled by a computer algorithm to achieve and maintain a prescribed target HR. It is shown that computer-controlled exercise is an effective alternative to the traditional set work rate exercise, particularly when tight control of cardiovascular responses is necessary.
Termination of the solar wind in the hot, partially ionized interstellar medium. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Lombard, C. K.

1974-01-01

Theoretical foundations for understanding the problem of the termination of the solar wind are reexamined in the light of most recent findings concerning the states of the solar wind and the local interstellar medium. The investigation suggests that a simple extention of Parker's (1961) analytical model provides a useful approximate description of the combined solar wind, interstellar wind plasma flowfield under conditions presently thought to occur. A linear perturbation solution exhibiting both the effects of photoionization and charge exchange is obtained for the supersonic solar wind. A numerical algorithm is described for computing moments of the non-equilibrium hydrogen distribution function and associated source terms for the MHD equations. Computed using the algorithm in conjunction with the extended Parker solution to approximate the plasma flowfield, profiles of hydrogen number density are given in the solar wind along the upstream and downstream axes of flow with respect to the direction of the interstellar wind. Predictions of solar Lyman-alpha backscatter intensities to be observed at 1 a.u. have been computed, in turn, from a set of such hydrogen number density profiles varied over assumed conditions of the interstellar wind.
Iterative methods for 3D implicit finite-difference migration using the complex Padé approximation

NASA Astrophysics Data System (ADS)

Costa, Carlos A. N.; Campos, Itamara S.; Costa, Jessé C.; Neto, Francisco A.; Schleicher, Jörg; Novais, Amélia

2013-08-01

Conventional implementations of 3D finite-difference (FD) migration use splitting techniques to accelerate performance and save computational cost. However, such techniques are plagued with numerical anisotropy that jeopardises the correct positioning of dipping reflectors in the directions not used for the operator splitting. We implement 3D downward continuation FD migration without splitting using a complex Padé approximation. In this way, the numerical anisotropy is eliminated at the expense of a computationally more intensive solution of a large-band linear system. We compare the performance of the iterative stabilized biconjugate gradient (BICGSTAB) and that of the multifrontal massively parallel direct solver (MUMPS). It turns out that the use of the complex Padé approximation not only stabilizes the solution, but also acts as an effective preconditioner for the BICGSTAB algorithm, reducing the number of iterations as compared to the implementation using the real Padé expansion. As a consequence, the iterative BICGSTAB method is more efficient than the direct MUMPS method when solving a single term in the Padé expansion. The results of both algorithms, here evaluated by computing the migration impulse response in the SEG/EAGE salt model, are of comparable quality.
Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo

DOE PAGES

McDaniel, Tyler; D’Azevedo, Ed F.; Li, Ying Wai; ...

2017-11-07

Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is therefore formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with applicationmore » of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. Here this procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi- core CPUs and GPUs.« less
Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo

DOE Office of Scientific and Technical Information (OSTI.GOV)

McDaniel, Tyler; D’Azevedo, Ed F.; Li, Ying Wai

Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is therefore formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with applicationmore » of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. Here this procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi- core CPUs and GPUs.« less
Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo.

PubMed

McDaniel, T; D'Azevedo, E F; Li, Y W; Wong, K; Kent, P R C

2017-11-07

Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is, therefore, formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with an application of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. This procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo, where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi-core central processing units and graphical processing units.
Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo

NASA Astrophysics Data System (ADS)

McDaniel, T.; D'Azevedo, E. F.; Li, Y. W.; Wong, K.; Kent, P. R. C.

2017-11-01

Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is, therefore, formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with an application of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. This procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo, where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi-core central processing units and graphical processing units.
High-efficiency wavefunction updates for large scale Quantum Monte Carlo

NASA Astrophysics Data System (ADS)

Kent, Paul; McDaniel, Tyler; Li, Ying Wai; D'Azevedo, Ed

Within ab intio Quantum Monte Carlo (QMC) simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunctions. The evaluation of each Monte Carlo move requires finding the determinant of a dense matrix, which is traditionally iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. For calculations with thousands of electrons, this operation dominates the execution profile. We propose a novel rank- k delayed update scheme. This strategy enables probability evaluation for multiple successive Monte Carlo moves, with application of accepted moves to the matrices delayed until after a predetermined number of moves, k. Accepted events grouped in this manner are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency. This procedure does not change the underlying Monte Carlo sampling or the sampling efficiency. For large systems and algorithms such as diffusion Monte Carlo where the acceptance ratio is high, order of magnitude speedups can be obtained on both multi-core CPU and on GPUs, making this algorithm highly advantageous for current petascale and future exascale computations.
Accelerated probabilistic inference of RNA structure evolution

PubMed Central

Holmes, Ian

2005-01-01

Background Pairwise stochastic context-free grammars (Pair SCFGs) are powerful tools for evolutionary analysis of RNA, including simultaneous RNA sequence alignment and secondary structure prediction, but the associated algorithms are intensive in both CPU and memory usage. The same problem is faced by other RNA alignment-and-folding algorithms based on Sankoff's 1985 algorithm. It is therefore desirable to constrain such algorithms, by pre-processing the sequences and using this first pass to limit the range of structures and/or alignments that can be considered. Results We demonstrate how flexible classes of constraint can be imposed, greatly reducing the computational costs while maintaining a high quality of structural homology prediction. Any score-attributed context-free grammar (e.g. energy-based scoring schemes, or conditionally normalized Pair SCFGs) is amenable to this treatment. It is now possible to combine independent structural and alignment constraints of unprecedented general flexibility in Pair SCFG alignment algorithms. We outline several applications to the bioinformatics of RNA sequence and structure, including Waterman-Eggert N-best alignments and progressive multiple alignment. We evaluate the performance of the algorithm on test examples from the RFAM database. Conclusion A program, Stemloc, that implements these algorithms for efficient RNA sequence alignment and structure prediction is available under the GNU General Public License. PMID:15790387
Efficient terrestrial laser scan segmentation exploiting data structure

NASA Astrophysics Data System (ADS)

Mahmoudabadi, Hamid; Olsen, Michael J.; Todorovic, Sinisa

2016-09-01

New technologies such as lidar enable the rapid collection of massive datasets to model a 3D scene as a point cloud. However, while hardware technology continues to advance, processing 3D point clouds into informative models remains complex and time consuming. A common approach to increase processing efficiently is to segment the point cloud into smaller sections. This paper proposes a novel approach for point cloud segmentation using computer vision algorithms to analyze panoramic representations of individual laser scans. These panoramas can be quickly created using an inherent neighborhood structure that is established during the scanning process, which scans at fixed angular increments in a cylindrical or spherical coordinate system. In the proposed approach, a selected image segmentation algorithm is applied on several input layers exploiting this angular structure including laser intensity, range, normal vectors, and color information. These segments are then mapped back to the 3D point cloud so that modeling can be completed more efficiently. This approach does not depend on pre-defined mathematical models and consequently setting parameters for them. Unlike common geometrical point cloud segmentation methods, the proposed method employs the colorimetric and intensity data as another source of information. The proposed algorithm is demonstrated on several datasets encompassing variety of scenes and objects. Results show a very high perceptual (visual) level of segmentation and thereby the feasibility of the proposed algorithm. The proposed method is also more efficient compared to Random Sample Consensus (RANSAC), which is a common approach for point cloud segmentation.
Navigational strategies underlying phototaxis in larval zebrafish

PubMed Central

Chen, Xiuye; Engert, Florian

2014-01-01

Understanding how the brain transforms sensory input into complex behavior is a fundamental question in systems neuroscience. Using larval zebrafish, we study the temporal component of phototaxis, which is defined as orientation decisions based on comparisons of light intensity at successive moments in time. We developed a novel “Virtual Circle” assay where whole-field illumination is abruptly turned off when the fish swims out of a virtually defined circular border, and turned on again when it returns into the circle. The animal receives no direct spatial cues and experiences only whole-field temporal light changes. Remarkably, the fish spends most of its time within the invisible virtual border. Behavioral analyses of swim bouts in relation to light transitions were used to develop four discrete temporal algorithms that transform the binary visual input (uniform light/uniform darkness) into the observed spatial behavior. In these algorithms, the turning angle is dependent on the behavioral history immediately preceding individual turning events. Computer simulations show that the algorithms recapture most of the swim statistics of real fish. We discovered that turning properties in larval zebrafish are distinctly modulated by temporal step functions in light intensity in combination with the specific motor history preceding these turns. Several aspects of the behavior suggest memory usage of up to 10 swim bouts (~10 sec). Thus, we show that a complex behavior like spatial navigation can emerge from a small number of relatively simple behavioral algorithms. PMID:24723859
Chemotaxis can provide biological organisms with good solutions to the travelling salesman problem.

PubMed

Reynolds, A M

2011-05-01

The ability to find good solutions to the traveling salesman problem can benefit some biological organisms. Bacterial infection would, for instance, be eradicated most promptly if cells of the immune system minimized the total distance they traveled when moving between bacteria. Similarly, foragers would maximize their net energy gain if the distance that they traveled between multiple dispersed prey items was minimized. The traveling salesman problem is one of the most intensively studied problems in combinatorial optimization. There are no efficient algorithms for even solving the problem approximately (within a guaranteed constant factor from the optimum) because the problem is nondeterministic polynomial time complete. The best approximate algorithms can typically find solutions within 1%-2% of the optimal, but these are computationally intensive and can not be implemented by biological organisms. Biological organisms could, in principle, implement the less efficient greedy nearest-neighbor algorithm, i.e., always move to the nearest surviving target. Implementation of this strategy does, however, require quite sophisticated cognitive abilities and prior knowledge of the target locations. Here, with the aid of numerical simulations, it is shown that biological organisms can simply use chemotaxis to solve, or at worst provide good solutions (comparable to those found by the greedy algorithm) to, the traveling salesman problem when the targets are sources of a chemoattractant and are modest in number (n < 10). This applies to neutrophils and macrophages in microbial defense and to some predators.
Cost-Benefit Analysis of Computer Resources for Machine Learning

USGS Publications Warehouse

Champion, Richard A.

2007-01-01

Machine learning describes pattern-recognition algorithms - in this case, probabilistic neural networks (PNNs). These can be computationally intensive, in part because of the nonlinear optimizer, a numerical process that calibrates the PNN by minimizing a sum of squared errors. This report suggests efficiencies that are expressed as cost and benefit. The cost is computer time needed to calibrate the PNN, and the benefit is goodness-of-fit, how well the PNN learns the pattern in the data. There may be a point of diminishing returns where a further expenditure of computer resources does not produce additional benefits. Sampling is suggested as a cost-reduction strategy. One consideration is how many points to select for calibration and another is the geometric distribution of the points. The data points may be nonuniformly distributed across space, so that sampling at some locations provides additional benefit while sampling at other locations does not. A stratified sampling strategy can be designed to select more points in regions where they reduce the calibration error and fewer points in regions where they do not. Goodness-of-fit tests ensure that the sampling does not introduce bias. This approach is illustrated by statistical experiments for computing correlations between measures of roadless area and population density for the San Francisco Bay Area. The alternative to training efficiencies is to rely on high-performance computer systems. These may require specialized programming and algorithms that are optimized for parallel performance.

Robust generative asymmetric GMM for brain MR image segmentation.

PubMed

Ji, Zexuan; Xia, Yong; Zheng, Yuhui

2017-11-01

Accurate segmentation of brain tissues from magnetic resonance (MR) images based on the unsupervised statistical models such as Gaussian mixture model (GMM) has been widely studied during last decades. However, most GMM based segmentation methods suffer from limited accuracy due to the influences of noise and intensity inhomogeneity in brain MR images. To further improve the accuracy for brain MR image segmentation, this paper presents a Robust Generative Asymmetric GMM (RGAGMM) for simultaneous brain MR image segmentation and intensity inhomogeneity correction. First, we develop an asymmetric distribution to fit the data shapes, and thus construct a spatial constrained asymmetric model. Then, we incorporate two pseudo-likelihood quantities and bias field estimation into the model's log-likelihood, aiming to exploit the neighboring priors of within-cluster and between-cluster and to alleviate the impact of intensity inhomogeneity, respectively. Finally, an expectation maximization algorithm is derived to iteratively maximize the approximation of the data log-likelihood function to overcome the intensity inhomogeneity in the image and segment the brain MR images simultaneously. To demonstrate the performances of the proposed algorithm, we first applied the proposed algorithm to a synthetic brain MR image to show the intermediate illustrations and the estimated distribution of the proposed algorithm. The next group of experiments is carried out in clinical 3T-weighted brain MR images which contain quite serious intensity inhomogeneity and noise. Then we quantitatively compare our algorithm to state-of-the-art segmentation approaches by using Dice coefficient (DC) on benchmark images obtained from IBSR and BrainWeb with different level of noise and intensity inhomogeneity. The comparison results on various brain MR images demonstrate the superior performances of the proposed algorithm in dealing with the noise and intensity inhomogeneity. In this paper, the RGAGMM algorithm is proposed which can simply and efficiently incorporate spatial constraints into an EM framework to simultaneously segment brain MR images and estimate the intensity inhomogeneity. The proposed algorithm is flexible to fit the data shapes, and can simultaneously overcome the influence of noise and intensity inhomogeneity, and hence is capable of improving over 5% segmentation accuracy comparing with several state-of-the-art algorithms. Copyright © 2017 Elsevier B.V. All rights reserved.
Real-time non-rigid target tracking for ultrasound-guided clinical interventions

NASA Astrophysics Data System (ADS)

Zachiu, C.; Ries, M.; Ramaekers, P.; Guey, J.-L.; Moonen, C. T. W.; de Senneville, B. Denis

2017-10-01

Biological motion is a problem for non- or mini-invasive interventions when conducted in mobile/deformable organs due to the targeted pathology moving/deforming with the organ. This may lead to high miss rates and/or incomplete treatment of the pathology. Therefore, real-time tracking of the target anatomy during the intervention would be beneficial for such applications. Since the aforementioned interventions are often conducted under B-mode ultrasound (US) guidance, target tracking can be achieved via image registration, by comparing the acquired US images to a separate image established as positional reference. However, such US images are intrinsically altered by speckle noise, introducing incoherent gray-level intensity variations. This may prove problematic for existing intensity-based registration methods. In the current study we address US-based target tracking by employing the recently proposed EVolution registration algorithm. The method is, by construction, robust to transient gray-level intensities. Instead of directly matching image intensities, EVolution aligns similar contrast patterns in the images. Moreover, the displacement is computed by evaluating a matching criterion for image sub-regions rather than on a point-by-point basis, which typically provides more robust motion estimates. However, unlike similar previously published approaches, which assume rigid displacements in the image sub-regions, the EVolution algorithm integrates the matching criterion in a global functional, allowing the estimation of an elastic dense deformation. The approach was validated for soft tissue tracking under free-breathing conditions on the abdomen of seven healthy volunteers. Contact echography was performed on all volunteers, while three of the volunteers also underwent standoff echography. Each of the two modalities is predominantly specific to a particular type of non- or mini-invasive clinical intervention. The method demonstrated on average an accuracy of ˜1.5 mm and submillimeter precision. This, together with a computational performance of 20 images per second make the proposed method an attractive solution for real-time target tracking during US-guided clinical interventions.
Real-time non-rigid target tracking for ultrasound-guided clinical interventions.

PubMed

Zachiu, C; Ries, M; Ramaekers, P; Guey, J-L; Moonen, C T W; de Senneville, B Denis

2017-10-04

Biological motion is a problem for non- or mini-invasive interventions when conducted in mobile/deformable organs due to the targeted pathology moving/deforming with the organ. This may lead to high miss rates and/or incomplete treatment of the pathology. Therefore, real-time tracking of the target anatomy during the intervention would be beneficial for such applications. Since the aforementioned interventions are often conducted under B-mode ultrasound (US) guidance, target tracking can be achieved via image registration, by comparing the acquired US images to a separate image established as positional reference. However, such US images are intrinsically altered by speckle noise, introducing incoherent gray-level intensity variations. This may prove problematic for existing intensity-based registration methods. In the current study we address US-based target tracking by employing the recently proposed EVolution registration algorithm. The method is, by construction, robust to transient gray-level intensities. Instead of directly matching image intensities, EVolution aligns similar contrast patterns in the images. Moreover, the displacement is computed by evaluating a matching criterion for image sub-regions rather than on a point-by-point basis, which typically provides more robust motion estimates. However, unlike similar previously published approaches, which assume rigid displacements in the image sub-regions, the EVolution algorithm integrates the matching criterion in a global functional, allowing the estimation of an elastic dense deformation. The approach was validated for soft tissue tracking under free-breathing conditions on the abdomen of seven healthy volunteers. Contact echography was performed on all volunteers, while three of the volunteers also underwent standoff echography. Each of the two modalities is predominantly specific to a particular type of non- or mini-invasive clinical intervention. The method demonstrated on average an accuracy of ∼1.5 mm and submillimeter precision. This, together with a computational performance of 20 images per second make the proposed method an attractive solution for real-time target tracking during US-guided clinical interventions.
Color enhancement and image defogging in HSI based on Retinex model

NASA Astrophysics Data System (ADS)

Gao, Han; Wei, Ping; Ke, Jun

2015-08-01

Retinex is a luminance perceptual algorithm based on color consistency. It has a good performance in color enhancement. But in some cases, the traditional Retinex algorithms, both Single-Scale Retinex(SSR) and Multi-Scale Retinex(MSR) in RGB color space, do not work well and will cause color deviation. To solve this problem, we present improved SSR and MSR algorithms. Compared to other Retinex algorithms, we implement Retinex algorithms in HSI(Hue, Saturation, Intensity) color space, and use a parameter αto improve quality of the image. Moreover, the algorithms presented in this paper has a good performance in image defogging. Contrasted with traditional Retinex algorithms, we use intensity channel to obtain reflection information of an image. The intensity channel is processed using a Gaussian center-surround image filter to get light information, which should be removed from intensity channel. After that, we subtract the light information from intensity channel to obtain the reflection image, which only includes the attribute of the objects in image. Using the reflection image and a parameter α, which is an arbitrary scale factor set manually, we improve the intensity channel, and complete the color enhancement. Our experiments show that this approach works well compared with existing methods for color enhancement. Besides a better performance in color deviation problem and image defogging, a visible improvement in the image quality for human contrast perception is also observed.
A novel constructive-optimizer neural network for the traveling salesman problem.

PubMed

Saadatmand-Tarzjan, Mahdi; Khademi, Morteza; Akbarzadeh-T, Mohammad-R; Moghaddam, Hamid Abrishami

2007-08-01

In this paper, a novel constructive-optimizer neural network (CONN) is proposed for the traveling salesman problem (TSP). CONN uses a feedback structure similar to Hopfield-type neural networks and a competitive training algorithm similar to the Kohonen-type self-organizing maps (K-SOMs). Consequently, CONN is composed of a constructive part, which grows the tour and an optimizer part to optimize it. In the training algorithm, an initial tour is created first and introduced to CONN. Then, it is trained in the constructive phase for adding a number of cities to the tour. Next, the training algorithm switches to the optimizer phase for optimizing the current tour by displacing the tour cities. After convergence in this phase, the training algorithm switches to the constructive phase anew and is continued until all cities are added to the tour. Furthermore, we investigate a relationship between the number of TSP cities and the number of cities to be added in each constructive phase. CONN was tested on nine sets of benchmark TSPs from TSPLIB to demonstrate its performance and efficiency. It performed better than several typical Neural networks (NNs), including KNIES_TSP_Local, KNIES_TSP_Global, Budinich's SOM, Co-Adaptive Net, and multivalued Hopfield network as wall as computationally comparable variants of the simulated annealing algorithm, in terms of both CPU time and accuracy. Furthermore, CONN converged considerably faster than expanding SOM and evolved integrated SOM and generated shorter tours compared to KNIES_DECOMPOSE. Although CONN is not yet comparable in terms of accuracy with some sophisticated computationally intensive algorithms, it converges significantly faster than they do. Generally speaking, CONN provides the best compromise between CPU time and accuracy among currently reported NNs for TSP.
Restoration of MRI Data for Field Nonuniformities using High Order Neighborhood Statistics

PubMed Central

Hadjidemetriou, Stathis; Studholme, Colin; Mueller, Susanne; Weiner, Michael; Schuff, Norbert

2007-01-01

MRI at high magnetic fields (> 3.0 T ) is complicated by strong inhomogeneous radio-frequency fields, sometimes termed the “bias field”. These lead to nonuniformity of image intensity, greatly complicating further analysis such as registration and segmentation. Existing methods for bias field correction are effective for 1.5 T or 3.0 T MRI, but are not completely satisfactory for higher field data. This paper develops an effective bias field correction for high field MRI based on the assumption that the nonuniformity is smoothly varying in space. Also, nonuniformity is quantified and unmixed using high order neighborhood statistics of intensity cooccurrences. They are computed within spherical windows of limited size over the entire image. The restoration is iterative and makes use of a novel stable stopping criterion that depends on the scaled entropy of the cooccurrence statistics, which is a non monotonic function of the iterations; the Shannon entropy of the cooccurrence statistics normalized to the effective dynamic range of the image. The algorithm restores whole head data, is robust to intense nonuniformities present in high field acquisitions, and is robust to variations in anatomy. This algorithm significantly improves bias field correction in comparison to N3 on phantom 1.5 T head data and high field 4 T human head data. PMID:18193095
On the Fast Evaluation Method of Temperature and Gas Mixing Ratio Weighting Functions for Remote Sensing of Planetary Atmospheres in Thermal IR and Microwave

NASA Technical Reports Server (NTRS)

Ustinov, E. A.

1999-01-01

Evaluation of weighting functions in the atmospheric remote sensing is usually the most computer-intensive part of the inversion algorithms. We present an analytic approach to computations of temperature and mixing ratio weighting functions that is based on our previous results but the resulting expressions use the intermediate variables that are generated in computations of observable radiances themselves. Upwelling radiances at the given level in the atmosphere and atmospheric transmittances from space to the given level are combined with local values of the total absorption coefficient and its components due to absorption of atmospheric constituents under study. This makes it possible to evaluate the temperature and mixing ratio weighting functions in parallel with evaluation of radiances. This substantially decreases the computer time required for evaluation of weighting functions. Implications for the nadir and limb viewing geometries are discussed.
Automated image quality assessment for chest CT scans.

PubMed

Reeves, Anthony P; Xie, Yiting; Liu, Shuang

2018-02-01

Medical image quality needs to be maintained at standards sufficient for effective clinical reading. Automated computer analytic methods may be applied to medical images for quality assessment. For chest CT scans in a lung cancer screening context, an automated quality assessment method is presented that characterizes image noise and image intensity calibration. This is achieved by image measurements in three automatically segmented homogeneous regions of the scan: external air, trachea lumen air, and descending aorta blood. Profiles of CT scanner behavior are also computed. The method has been evaluated on both phantom and real low-dose chest CT scans and results show that repeatable noise and calibration measures may be realized by automated computer algorithms. Noise and calibration profiles show relevant differences between different scanners and protocols. Automated image quality assessment may be useful for quality control for lung cancer screening and may enable performance improvements to automated computer analysis methods. © 2017 American Association of Physicists in Medicine.
Deployment of the OSIRIS EM-PIC code on the Intel Knights Landing architecture

NASA Astrophysics Data System (ADS)

Fonseca, Ricardo

2017-10-01

Electromagnetic particle-in-cell (EM-PIC) codes such as OSIRIS have found widespread use in modelling the highly nonlinear and kinetic processes that occur in several relevant plasma physics scenarios, ranging from astrophysical settings to high-intensity laser plasma interaction. Being computationally intensive, these codes require large scale HPC systems, and a continuous effort in adapting the algorithm to new hardware and computing paradigms. In this work, we report on our efforts on deploying the OSIRIS code on the new Intel Knights Landing (KNL) architecture. Unlike the previous generation (Knights Corner), these boards are standalone systems, and introduce several new features, include the new AVX-512 instructions and on-package MCDRAM. We will focus on the parallelization and vectorization strategies followed, as well as memory management, and present a detailed performance evaluation of code performance in comparison with the CPU code. This work was partially supported by Fundaçã para a Ciência e Tecnologia (FCT), Portugal, through Grant No. PTDC/FIS-PLA/2940/2014.
Ganalyzer: A tool for automatic galaxy image analysis

NASA Astrophysics Data System (ADS)

Shamir, Lior

2011-05-01

Ganalyzer is a model-based tool that automatically analyzes and classifies galaxy images. Ganalyzer works by separating the galaxy pixels from the background pixels, finding the center and radius of the galaxy, generating the radial intensity plot, and then computing the slopes of the peaks detected in the radial intensity plot to measure the spirality of the galaxy and determine its morphological class. Unlike algorithms that are based on machine learning, Ganalyzer is based on measuring the spirality of the galaxy, a task that is difficult to perform manually, and in many cases can provide a more accurate analysis compared to manual observation. Ganalyzer is simple to use, and can be easily embedded into other image analysis applications. Another advantage is its speed, which allows it to analyze ~10,000,000 galaxy images in five days using a standard modern desktop computer. These capabilities can make Ganalyzer a useful tool in analyzing large datasets of galaxy images collected by autonomous sky surveys such as SDSS, LSST or DES.
Simple and sensitive technique for alignment of the pinhole of a spatial filter of a high-energy, high-power laser system.

PubMed

Sharma, Avnish Kumar; Patidar, Rajesh Kumar; Daiya, Deepak; Joshi, Anandverdhan; Naik, Prasad Anant; Gupta, Parshotam Dass

2013-04-20

In this paper, a new method for alignment of the pinhole of a spatial filter (SF) has been proposed and demonstrated experimentally. The effect of the misalignment of the pinhole on the laser beam profiles has been calculated for circular and elliptical Gaussian laser beams. Theoretical computation has been carried out to illustrate the effect of an intensity mask, placed before the focusing lens of the SF, on the spatial beam profile after the pinhole of the SF. It is shown, both theoretically and experimentally, that a simple intensity mask, consisting of a black dot, can be used to visually align the pinhole with a high accuracy of 5% of the pinhole diameter. The accuracy may be further improved using a computer-based image processing algorithm. Finally, the proposed technique has been demonstrated to align a vacuum SF of a compact 40 J Nd:phosphate glass laser system.
Computer Modeling of High-Intensity Cs-Sputter Ion Sources

NASA Astrophysics Data System (ADS)

Brown, T. A.; Roberts, M. L.; Southon, J. R.

The grid-point mesh program NEDLab has been used to computer model the interior of the high-intensity Cs-sputter source used in routine operations at the Center for Accelerator Mass Spectrometry (CAMS), with the goal of improving negative ion output. NEDLab has several features that are important to realistic modeling of such sources. First, space-charge effects are incorporated in the calculations through an automated ion-trajectories/Poissonelectric-fields successive-iteration process. Second, space charge distributions can be averaged over successive iterations to suppress model instabilities. Third, space charge constraints on ion emission from surfaces can be incorporate under Child's Law based algorithms. Fourth, the energy of ions emitted from a surface can be randomly chosen from within a thermal energy distribution. And finally, ions can be emitted from a surface at randomized angles The results of our modeling effort indicate that significant modification of the interior geometry of the source will double Cs+ ion production from our spherical ionizer and produce a significant increase in negative ion output from the source.
Augmentation of Early Intensity Forecasting in Tropical Cyclones

DTIC Science & Technology

2011-09-30

modeled storms to the measured signatures. APPROACH The deviation-angle variance technique was introduced in Pineros et al. (2008) as a procedure to...the algorithm developed in the first year of the project. The new method used best-track storm fixes as the points to compute the DAV signal. We...In the North Atlantic basin, RMSE for tropical storm category is 11 kt, hurricane categories 1-3 is 12.5 kt, category 4 is 18 kt and category 5 is
ProteoCloud: a full-featured open source proteomics cloud computing pipeline.

PubMed

Muth, Thilo; Peters, Julian; Blackburn, Jonathan; Rapp, Erdmann; Martens, Lennart

2013-08-02

We here present the ProteoCloud pipeline, a freely available, full-featured cloud-based platform to perform computationally intensive, exhaustive searches in a cloud environment using five different peptide identification algorithms. ProteoCloud is entirely open source, and is built around an easy to use and cross-platform software client with a rich graphical user interface. This client allows full control of the number of cloud instances to initiate and of the spectra to assign for identification. It also enables the user to track progress, and to visualize and interpret the results in detail. Source code, binaries and documentation are all available at http://proteocloud.googlecode.com. Copyright © 2012 Elsevier B.V. All rights reserved.
Diffraction Correlation to Reconstruct Highly Strained Particles

NASA Astrophysics Data System (ADS)

Brown, Douglas; Harder, Ross; Clark, Jesse; Kim, J. W.; Kiefer, Boris; Fullerton, Eric; Shpyrko, Oleg; Fohtung, Edwin

2015-03-01

Through the use of coherent x-ray diffraction a three-dimensional diffraction pattern of a highly strained nano-crystal can be recorded in reciprocal space by a detector. Only the intensities are recorded, resulting in a loss of the complex phase. The recorded diffraction pattern therefore requires computational processing to reconstruct the density and complex distribution of the diffracted nano-crystal. For highly strained crystals, standard methods using HIO and ER algorithms are no longer sufficient to reconstruct the diffraction pattern. Our solution is to correlate the symmetry in reciprocal space to generate an a priori shape constraint to guide the computational reconstruction of the diffraction pattern. This approach has improved the ability to accurately reconstruct highly strained nano-crystals.
Window-based method for approximating the Hausdorff in three-dimensional range imagery

DOEpatents

Koch, Mark W [Albuquerque, NM

2009-06-02

One approach to pattern recognition is to use a template from a database of objects and match it to a probe image containing the unknown. Accordingly, the Hausdorff distance can be used to measure the similarity of two sets of points. In particular, the Hausdorff can measure the goodness of a match in the presence of occlusion, clutter, and noise. However, existing 3D algorithms for calculating the Hausdorff are computationally intensive, making them impractical for pattern recognition that requires scanning of large databases. The present invention is directed to a new method that can efficiently, in time and memory, compute the Hausdorff for 3D range imagery. The method uses a window-based approach.
Pteros: fast and easy to use open-source C++ library for molecular analysis.

PubMed

Yesylevskyy, Semen O

2012-07-15

An open-source Pteros library for molecular modeling and analysis of molecular dynamics trajectories for C++ programming language is introduced. Pteros provides a number of routine analysis operations ranging from reading and writing trajectory files and geometry transformations to structural alignment and computation of nonbonded interaction energies. The library features asynchronous trajectory reading and parallel execution of several analysis routines, which greatly simplifies development of computationally intensive trajectory analysis algorithms. Pteros programming interface is very simple and intuitive while the source code is well documented and easily extendible. Pteros is available for free under open-source Artistic License from http://sourceforge.net/projects/pteros/. Copyright © 2012 Wiley Periodicals, Inc.
Synthetic Generation of Myocardial Blood-Oxygen-Level-Dependent MRI Time Series via Structural Sparse Decomposition Modeling

PubMed Central

Rusu, Cristian; Morisi, Rita; Boschetto, Davide; Dharmakumar, Rohan; Tsaftaris, Sotirios A.

2014-01-01

This paper aims to identify approaches that generate appropriate synthetic data (computer generated) for Cardiac Phase-resolved Blood-Oxygen-Level-Dependent (CP–BOLD) MRI. CP–BOLD MRI is a new contrast agent- and stress-free approach for examining changes in myocardial oxygenation in response to coronary artery disease. However, since signal intensity changes are subtle, rapid visualization is not possible with the naked eye. Quantifying and visualizing the extent of disease relies on myocardial segmentation and registration to isolate the myocardium and establish temporal correspondences and ischemia detection algorithms to identify temporal differences in BOLD signal intensity patterns. If transmurality of the defect is of interest pixel-level analysis is necessary and thus a higher precision in registration is required. Such precision is currently not available affecting the design and performance of the ischemia detection algorithms. In this work, to enable algorithmic developments of ischemia detection irrespective to registration accuracy, we propose an approach that generates synthetic pixel-level myocardial time series. We do this by (a) modeling the temporal changes in BOLD signal intensity based on sparse multi-component dictionary learning, whereby segmentally derived myocardial time series are extracted from canine experimental data to learn the model; and (b) demonstrating the resemblance between real and synthetic time series for validation purposes. We envision that the proposed approach has the capacity to accelerate development of tools for ischemia detection while markedly reducing experimental costs so that cardiac BOLD MRI can be rapidly translated into the clinical arena for the noninvasive assessment of ischemic heart disease. PMID:24691119
Synthetic generation of myocardial blood-oxygen-level-dependent MRI time series via structural sparse decomposition modeling.

PubMed

Rusu, Cristian; Morisi, Rita; Boschetto, Davide; Dharmakumar, Rohan; Tsaftaris, Sotirios A

2014-07-01

This paper aims to identify approaches that generate appropriate synthetic data (computer generated) for cardiac phase-resolved blood-oxygen-level-dependent (CP-BOLD) MRI. CP-BOLD MRI is a new contrast agent- and stress-free approach for examining changes in myocardial oxygenation in response to coronary artery disease. However, since signal intensity changes are subtle, rapid visualization is not possible with the naked eye. Quantifying and visualizing the extent of disease relies on myocardial segmentation and registration to isolate the myocardium and establish temporal correspondences and ischemia detection algorithms to identify temporal differences in BOLD signal intensity patterns. If transmurality of the defect is of interest pixel-level analysis is necessary and thus a higher precision in registration is required. Such precision is currently not available affecting the design and performance of the ischemia detection algorithms. In this work, to enable algorithmic developments of ischemia detection irrespective to registration accuracy, we propose an approach that generates synthetic pixel-level myocardial time series. We do this by 1) modeling the temporal changes in BOLD signal intensity based on sparse multi-component dictionary learning, whereby segmentally derived myocardial time series are extracted from canine experimental data to learn the model; and 2) demonstrating the resemblance between real and synthetic time series for validation purposes. We envision that the proposed approach has the capacity to accelerate development of tools for ischemia detection while markedly reducing experimental costs so that cardiac BOLD MRI can be rapidly translated into the clinical arena for the noninvasive assessment of ischemic heart disease.
A system for automatic aorta sections measurements on chest CT

NASA Astrophysics Data System (ADS)

Pfeffer, Yitzchak; Mayer, Arnaldo; Zholkover, Adi; Konen, Eli

2016-03-01

A new method is proposed for caliber measurement of the ascending aorta (AA) and descending aorta (DA). A key component of the method is the automatic detection of the carina, as an anatomical landmark around which an axial volume of interest (VOI) can be defined to observe the aortic caliber. For each slice in the VOI, a linear profile line connecting the AA with the DA is found by pattern matching on the underlying intensity profile. Next, the aortic center position is found using Hough transform on the best linear segment candidate. Finally, region growing around the center provides an accurate segmentation and caliber measurement. We evaluated the algorithm on 113 sequential chest CT scans, slice thickness of 0.75 - 3.75mm, 90 with contrast agent injected. The algorithm success rates were computed as the percentage of scans in which the center of the AA was found. Automated measurements of AA caliber were compared with independent measurements of two experienced chest radiologists, comparing the absolute difference between the two radiologists with the absolute difference between the algorithm and each of the radiologists. The measurement stability was demonstrated by computing the STD of the absolute difference between the radiologists, and between the algorithm and the radiologists. Results: Success rates of 93% and 74% were achieved, for contrast injected cases and non-contrast cases, respectively. These results indicate that the algorithm can be robust in large variability of image quality, such as the cases in a realworld clinical setting. The average absolute difference between the algorithm and the radiologists was 1.85mm, lower than the average absolute difference between the radiologists, which was 2.1mm. The STD of the absolute difference between the algorithm and the radiologists was 1.5mm vs 1.6mm between the two radiologists. These results demonstrate the clinical relevance of the algorithm measurements.

Efficient Simulation of Tropical Cyclone Pathways with Stochastic Perturbations

NASA Astrophysics Data System (ADS)

Webber, R.; Plotkin, D. A.; Abbot, D. S.; Weare, J.

2017-12-01

Global Climate Models (GCMs) are known to statistically underpredict intense tropical cyclones (TCs) because they fail to capture the rapid intensification and high wind speeds characteristic of the most destructive TCs. Stochastic parametrization schemes have the potential to improve the accuracy of GCMs. However, current analysis of these schemes through direct sampling is limited by the computational expense of simulating a rare weather event at fine spatial gridding. The present work introduces a stochastically perturbed parametrization tendency (SPPT) scheme to increase simulated intensity of TCs. We adapt the Weighted Ensemble algorithm to simulate the distribution of TCs at a fraction of the computational effort required in direct sampling. We illustrate the efficiency of the SPPT scheme by comparing simulations at different spatial resolutions and stochastic parameter regimes. Stochastic parametrization and rare event sampling strategies have great potential to improve TC prediction and aid understanding of tropical cyclogenesis. Since rising sea surface temperatures are postulated to increase the intensity of TCs, these strategies can also improve predictions about climate change-related weather patterns. The rare event sampling strategies used in the current work are not only a novel tool for studying TCs, but they may also be applied to sampling any range of extreme weather events.
Scalable Conjunction Processing using Spatiotemporally Indexed Ephemeris Data

NASA Astrophysics Data System (ADS)

Budianto-Ho, I.; Johnson, S.; Sivilli, R.; Alberty, C.; Scarberry, R.

2014-09-01

The collision warnings produced by the Joint Space Operations Center (JSpOC) are of critical importance in protecting U.S. and allied spacecraft against destructive collisions and protecting the lives of astronauts during space flight. As the Space Surveillance Network (SSN) improves its sensor capabilities for tracking small and dim space objects, the number of tracked objects increases from thousands to hundreds of thousands of objects, while the number of potential conjunctions increases with the square of the number of tracked objects. Classical filtering techniques such as apogee and perigee filters have proven insufficient. Novel and orders of magnitude faster conjunction analysis algorithms are required to find conjunctions in a timely manner. Stellar Science has developed innovative filtering techniques for satellite conjunction processing using spatiotemporally indexed ephemeris data that efficiently and accurately reduces the number of objects requiring high-fidelity and computationally-intensive conjunction analysis. Two such algorithms, one based on the k-d Tree pioneered in robotics applications and the other based on Spatial Hash Tables used in computer gaming and animation, use, at worst, an initial O(N log N) preprocessing pass (where N is the number of tracked objects) to build large O(N) spatial data structures that substantially reduce the required number of O(N^2) computations, substituting linear memory usage for quadratic processing time. The filters have been implemented as Open Services Gateway initiative (OSGi) plug-ins for the Continuous Anomalous Orbital Situation Discriminator (CAOS-D) conjunction analysis architecture. We have demonstrated the effectiveness, efficiency, and scalability of the techniques using a catalog of 100,000 objects, an analysis window of one day, on a 64-core computer with 1TB shared memory. Each algorithm can process the full catalog in 6 minutes or less, almost a twenty-fold performance improvement over the baseline implementation running on the same machine. We will present an overview of the algorithms and results that demonstrate the scalability of our concepts.
Design of an FPGA-Based Algorithm for Real-Time Solutions of Statistics-Based Positioning

PubMed Central

DeWitt, Don; Johnson-Williams, Nathan G.; Miyaoka, Robert S.; Li, Xiaoli; Lockhart, Cate; Lewellen, Tom K.; Hauck, Scott

2010-01-01

We report on the implementation of an algorithm and hardware platform to allow real-time processing of the statistics-based positioning (SBP) method for continuous miniature crystal element (cMiCE) detectors. The SBP method allows an intrinsic spatial resolution of ~1.6 mm FWHM to be achieved using our cMiCE design. Previous SBP solutions have required a postprocessing procedure due to the computation and memory intensive nature of SBP. This new implementation takes advantage of a combination of algebraic simplifications, conversion to fixed-point math, and a hierarchal search technique to greatly accelerate the algorithm. For the presented seven stage, 127 × 127 bin LUT implementation, these algorithm improvements result in a reduction from >7 × 106 floating-point operations per event for an exhaustive search to < 5 × 103 integer operations per event. Simulations show nearly identical FWHM positioning resolution for this accelerated SBP solution, and positioning differences of <0.1 mm from the exhaustive search solution. A pipelined field programmable gate array (FPGA) implementation of this optimized algorithm is able to process events in excess of 250 K events per second, which is greater than the maximum expected coincidence rate for an individual detector. In contrast with all detectors being processed at a centralized host, as in the current system, a separate FPGA is available at each detector, thus dividing the computational load. These methods allow SBP results to be calculated in real-time and to be presented to the image generation components in real-time. A hardware implementation has been developed using a commercially available prototype board. PMID:21197135
The pseudo-compartment method for coupling partial differential equation and compartment-based models of diffusion.

PubMed

Yates, Christian A; Flegg, Mark B

2015-05-06

Spatial reaction-diffusion models have been employed to describe many emergent phenomena in biological systems. The modelling technique most commonly adopted in the literature implements systems of partial differential equations (PDEs), which assumes there are sufficient densities of particles that a continuum approximation is valid. However, owing to recent advances in computational power, the simulation and therefore postulation, of computationally intensive individual-based models has become a popular way to investigate the effects of noise in reaction-diffusion systems in which regions of low copy numbers exist. The specific stochastic models with which we shall be concerned in this manuscript are referred to as 'compartment-based' or 'on-lattice'. These models are characterized by a discretization of the computational domain into a grid/lattice of 'compartments'. Within each compartment, particles are assumed to be well mixed and are permitted to react with other particles within their compartment or to transfer between neighbouring compartments. Stochastic models provide accuracy, but at the cost of significant computational resources. For models that have regions of both low and high concentrations, it is often desirable, for reasons of efficiency, to employ coupled multi-scale modelling paradigms. In this work, we develop two hybrid algorithms in which a PDE in one region of the domain is coupled to a compartment-based model in the other. Rather than attempting to balance average fluxes, our algorithms answer a more fundamental question: 'how are individual particles transported between the vastly different model descriptions?' First, we present an algorithm derived by carefully redefining the continuous PDE concentration as a probability distribution. While this first algorithm shows very strong convergence to analytical solutions of test problems, it can be cumbersome to simulate. Our second algorithm is a simplified and more efficient implementation of the first, it is derived in the continuum limit over the PDE region alone. We test our hybrid methods for functionality and accuracy in a variety of different scenarios by comparing the averaged simulations with analytical solutions of PDEs for mean concentrations. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
Pre-Hardware Optimization and Implementation Of Fast Optics Closed Control Loop Algorithms

NASA Technical Reports Server (NTRS)

Kizhner, Semion; Lyon, Richard G.; Herman, Jay R.; Abuhassan, Nader

2004-01-01

One of the main heritage tools used in scientific and engineering data spectrum analysis is the Fourier Integral Transform and its high performance digital equivalent - the Fast Fourier Transform (FFT). The FFT is particularly useful in two-dimensional (2-D) image processing (FFT2) within optical systems control. However, timing constraints of a fast optics closed control loop would require a supercomputer to run the software implementation of the FFT2 and its inverse, as well as other image processing representative algorithm, such as numerical image folding and fringe feature extraction. A laboratory supercomputer is not always available even for ground operations and is not feasible for a night project. However, the computationally intensive algorithms still warrant alternative implementation using reconfigurable computing technologies (RC) such as Digital Signal Processors (DSP) and Field Programmable Gate Arrays (FPGA), which provide low cost compact super-computing capabilities. We present a new RC hardware implementation and utilization architecture that significantly reduces the computational complexity of a few basic image-processing algorithm, such as FFT2, image folding and phase diversity for the NASA Solar Viewing Interferometer Prototype (SVIP) using a cluster of DSPs and FPGAs. The DSP cluster utilization architecture also assures avoidance of a single point of failure, while using commercially available hardware. This, combined with the control algorithms pre-hardware optimization, or the first time allows construction of image-based 800 Hertz (Hz) optics closed control loops on-board a spacecraft, based on the SVIP ground instrument. That spacecraft is the proposed Earth Atmosphere Solar Occultation Imager (EASI) to study greenhouse gases CO2, C2H, H2O, O3, O2, N2O from Lagrange-2 point in space. This paper provides an advanced insight into a new type of science capabilities for future space exploration missions based on on-board image processing for control and for robotics missions using vision sensors. It presents a top-level description of technologies required for the design and construction of SVIP and EASI and to advance the spatial-spectral imaging and large-scale space interferometry science and engineering.
Comparison of optimization strategy and similarity metric in atlas-to-subject registration using statistical deformation model

NASA Astrophysics Data System (ADS)

Otake, Y.; Murphy, R. J.; Grupp, R. B.; Sato, Y.; Taylor, R. H.; Armand, M.

2015-03-01

A robust atlas-to-subject registration using a statistical deformation model (SDM) is presented. The SDM uses statistics of voxel-wise displacement learned from pre-computed deformation vectors of a training dataset. This allows an atlas instance to be directly translated into an intensity volume and compared with a patient's intensity volume. Rigid and nonrigid transformation parameters were simultaneously optimized via the Covariance Matrix Adaptation - Evolutionary Strategy (CMA-ES), with image similarity used as the objective function. The algorithm was tested on CT volumes of the pelvis from 55 female subjects. A performance comparison of the CMA-ES and Nelder-Mead downhill simplex optimization algorithms with the mutual information and normalized cross correlation similarity metrics was conducted. Simulation studies using synthetic subjects were performed, as well as leave-one-out cross validation studies. Both studies suggested that mutual information and CMA-ES achieved the best performance. The leave-one-out test demonstrated 4.13 mm error with respect to the true displacement field, and 26,102 function evaluations in 180 seconds, on average.
Edge Pushing is Equivalent to Vertex Elimination for Computing Hessians

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Mu; Pothen, Alex; Hovland, Paul

We prove the equivalence of two different Hessian evaluation algorithms in AD. The first is the Edge Pushing algorithm of Gower and Mello, which may be viewed as a second order Reverse mode algorithm for computing the Hessian. In earlier work, we have derived the Edge Pushing algorithm by exploiting a Reverse mode invariant based on the concept of live variables in compiler theory. The second algorithm is based on eliminating vertices in a computational graph of the gradient, in which intermediate variables are successively eliminated from the graph, and the weights of the edges are updated suitably. We provemore » that if the vertices are eliminated in a reverse topological order while preserving symmetry in the computational graph of the gradient, then the Vertex Elimination algorithm and the Edge Pushing algorithm perform identical computations. In this sense, the two algorithms are equivalent. This insight that unifies two seemingly disparate approaches to Hessian computations could lead to improved algorithms and implementations for computing Hessians. Read More: http://epubs.siam.org/doi/10.1137/1.9781611974690.ch11« less
Efficient visibility-driven medical image visualisation via adaptive binned visibility histogram.

PubMed

Jung, Younhyun; Kim, Jinman; Kumar, Ashnil; Feng, David Dagan; Fulham, Michael

2016-07-01

'Visibility' is a fundamental optical property that represents the observable, by users, proportion of the voxels in a volume during interactive volume rendering. The manipulation of this 'visibility' improves the volume rendering processes; for instance by ensuring the visibility of regions of interest (ROIs) or by guiding the identification of an optimal rendering view-point. The construction of visibility histograms (VHs), which represent the distribution of all the visibility of all voxels in the rendered volume, enables users to explore the volume with real-time feedback about occlusion patterns among spatially related structures during volume rendering manipulations. Volume rendered medical images have been a primary beneficiary of VH given the need to ensure that specific ROIs are visible relative to the surrounding structures, e.g. the visualisation of tumours that may otherwise be occluded by neighbouring structures. VH construction and its subsequent manipulations, however, are computationally expensive due to the histogram binning of the visibilities. This limits the real-time application of VH to medical images that have large intensity ranges and volume dimensions and require a large number of histogram bins. In this study, we introduce an efficient adaptive binned visibility histogram (AB-VH) in which a smaller number of histogram bins are used to represent the visibility distribution of the full VH. We adaptively bin medical images by using a cluster analysis algorithm that groups the voxels according to their intensity similarities into a smaller subset of bins while preserving the distribution of the intensity range of the original images. We increase efficiency by exploiting the parallel computation and multiple render targets (MRT) extension of the modern graphical processing units (GPUs) and this enables efficient computation of the histogram. We show the application of our method to single-modality computed tomography (CT), magnetic resonance (MR) imaging and multi-modality positron emission tomography-CT (PET-CT). In our experiments, the AB-VH markedly improved the computational efficiency for the VH construction and thus improved the subsequent VH-driven volume manipulations. This efficiency was achieved without major degradation in the VH visually and numerical differences between the AB-VH and its full-bin counterpart. We applied several variants of the K-means clustering algorithm with varying Ks (the number of clusters) and found that higher values of K resulted in better performance at a lower computational gain. The AB-VH also had an improved performance when compared to the conventional method of down-sampling of the histogram bins (equal binning) for volume rendering visualisation. Copyright © 2016 Elsevier Ltd. All rights reserved.
Algorithmic Mechanism Design of Evolutionary Computation.

PubMed

Pei, Yan

2015-01-01

We consider algorithmic design, enhancement, and improvement of evolutionary computation as a mechanism design problem. All individuals or several groups of individuals can be considered as self-interested agents. The individuals in evolutionary computation can manipulate parameter settings and operations by satisfying their own preferences, which are defined by an evolutionary computation algorithm designer, rather than by following a fixed algorithm rule. Evolutionary computation algorithm designers or self-adaptive methods should construct proper rules and mechanisms for all agents (individuals) to conduct their evolution behaviour correctly in order to definitely achieve the desired and preset objective(s). As a case study, we propose a formal framework on parameter setting, strategy selection, and algorithmic design of evolutionary computation by considering the Nash strategy equilibrium of a mechanism design in the search process. The evaluation results present the efficiency of the framework. This primary principle can be implemented in any evolutionary computation algorithm that needs to consider strategy selection issues in its optimization process. The final objective of our work is to solve evolutionary computation design as an algorithmic mechanism design problem and establish its fundamental aspect by taking this perspective. This paper is the first step towards achieving this objective by implementing a strategy equilibrium solution (such as Nash equilibrium) in evolutionary computation algorithm.
Algorithmic Mechanism Design of Evolutionary Computation

PubMed Central

2015-01-01

We consider algorithmic design, enhancement, and improvement of evolutionary computation as a mechanism design problem. All individuals or several groups of individuals can be considered as self-interested agents. The individuals in evolutionary computation can manipulate parameter settings and operations by satisfying their own preferences, which are defined by an evolutionary computation algorithm designer, rather than by following a fixed algorithm rule. Evolutionary computation algorithm designers or self-adaptive methods should construct proper rules and mechanisms for all agents (individuals) to conduct their evolution behaviour correctly in order to definitely achieve the desired and preset objective(s). As a case study, we propose a formal framework on parameter setting, strategy selection, and algorithmic design of evolutionary computation by considering the Nash strategy equilibrium of a mechanism design in the search process. The evaluation results present the efficiency of the framework. This primary principle can be implemented in any evolutionary computation algorithm that needs to consider strategy selection issues in its optimization process. The final objective of our work is to solve evolutionary computation design as an algorithmic mechanism design problem and establish its fundamental aspect by taking this perspective. This paper is the first step towards achieving this objective by implementing a strategy equilibrium solution (such as Nash equilibrium) in evolutionary computation algorithm. PMID:26257777
A configurable distributed high-performance computing framework for satellite's TDI-CCD imaging simulation

NASA Astrophysics Data System (ADS)

Xue, Bo; Mao, Bingjing; Chen, Xiaomei; Ni, Guoqiang

2010-11-01

This paper renders a configurable distributed high performance computing(HPC) framework for TDI-CCD imaging simulation. It uses strategy pattern to adapt multi-algorithms. Thus, this framework help to decrease the simulation time with low expense. Imaging simulation for TDI-CCD mounted on satellite contains four processes: 1) atmosphere leads degradation, 2) optical system leads degradation, 3) electronic system of TDI-CCD leads degradation and re-sampling process, 4) data integration. Process 1) to 3) utilize diversity data-intensity algorithms such as FFT, convolution and LaGrange Interpol etc., which requires powerful CPU. Even uses Intel Xeon X5550 processor, regular series process method takes more than 30 hours for a simulation whose result image size is 1500 * 1462. With literature study, there isn't any mature distributing HPC framework in this field. Here we developed a distribute computing framework for TDI-CCD imaging simulation, which is based on WCF[1], uses Client/Server (C/S) layer and invokes the free CPU resources in LAN. The server pushes the process 1) to 3) tasks to those free computing capacity. Ultimately we rendered the HPC in low cost. In the computing experiment with 4 symmetric nodes and 1 server , this framework reduced about 74% simulation time. Adding more asymmetric nodes to the computing network, the time decreased namely. In conclusion, this framework could provide unlimited computation capacity in condition that the network and task management server are affordable. And this is the brand new HPC solution for TDI-CCD imaging simulation and similar applications.
xPerm: fast index canonicalization for tensor computer algebra

NASA Astrophysics Data System (ADS)

Martín-García, José M.

2008-10-01

We present a very fast implementation of the Butler-Portugal algorithm for index canonicalization with respect to permutation symmetries. It is called xPerm, and has been written as a combination of a Mathematica package and a C subroutine. The latter performs the most demanding parts of the computations and can be linked from any other program or computer algebra system. We demonstrate with tests and timings the effectively polynomial performance of the Butler-Portugal algorithm with respect to the number of indices, though we also show a case in which it is exponential. Our implementation handles generic tensorial expressions with several dozen indices in hundredths of a second, or one hundred indices in a few seconds, clearly outperforming all other current canonicalizers. The code has been already under intensive testing for several years and has been essential in recent investigations in large-scale tensor computer algebra. Program summaryProgram title: xPerm Catalogue identifier: AEBH_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEBH_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 93 582 No. of bytes in distributed program, including test data, etc.: 1 537 832 Distribution format: tar.gz Programming language: C and Mathematica (version 5.0 or higher) Computer: Any computer running C and Mathematica (version 5.0 or higher) Operating system: Linux, Unix, Windows XP, MacOS RAM:: 20 Mbyte Word size: 64 or 32 bits Classification: 1.5, 5 Nature of problem: Canonicalization of indexed expressions with respect to permutation symmetries. Solution method: The Butler-Portugal algorithm. Restrictions: Multiterm symmetries are not considered. Running time: A few seconds with generic expressions of up to 100 indices. The xPermDoc.nb notebook supplied with the distribution takes approximately one and a half hours to execute in full.
Clinical application and validation of an iterative forward projection matching algorithm for permanent brachytherapy seed localization from conebeam-CT x-ray projections.

PubMed

Pokhrel, Damodar; Murphy, Martin J; Todor, Dorin A; Weiss, Elisabeth; Williamson, Jeffrey F

2010-09-01

To experimentally validate a new algorithm for reconstructing the 3D positions of implanted brachytherapy seeds from postoperatively acquired 2D conebeam-CT (CBCT) projection images. The iterative forward projection matching (IFPM) algorithm finds the 3D seed geometry that minimizes the sum of the squared intensity differences between computed projections of an initial estimate of the seed configuration and radiographic projections of the implant. In-house machined phantoms, containing arrays of 12 and 72 seeds, respectively, are used to validate this method. Also, four 103Pd postimplant patients are scanned using an ACUITY digital simulator. Three to ten x-ray images are selected from the CBCT projection set and processed to create binary seed-only images. To quantify IFPM accuracy, the reconstructed seed positions are forward projected and overlaid on the measured seed images to find the nearest-neighbor distance between measured and computed seed positions for each image pair. Also, the estimated 3D seed coordinates are compared to known seed positions in the phantom and clinically obtained VariSeed planning coordinates for the patient data. For the phantom study, seed localization error is (0.58 +/- 0.33) mm. For all four patient cases, the mean registration error is better than 1 mm while compared against the measured seed projections. IFPM converges in 20-28 iterations, with a computation time of about 1.9-2.8 min/ iteration on a 1 GHz processor. The IFPM algorithm avoids the need to match corresponding seeds in each projection as required by standard back-projection methods. The authors' results demonstrate approximately 1 mm accuracy in reconstructing the 3D positions of brachytherapy seeds from the measured 2D projections. This algorithm also successfully localizes overlapping clustered and highly migrated seeds in the implant.
Clinical application and validation of an iterative forward projection matching algorithm for permanent brachytherapy seed localization from conebeam-CT x-ray projections

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pokhrel, Damodar; Murphy, Martin J.; Todor, Dorin A.

2010-09-15

Purpose: To experimentally validate a new algorithm for reconstructing the 3D positions of implanted brachytherapy seeds from postoperatively acquired 2D conebeam-CT (CBCT) projection images. Methods: The iterative forward projection matching (IFPM) algorithm finds the 3D seed geometry that minimizes the sum of the squared intensity differences between computed projections of an initial estimate of the seed configuration and radiographic projections of the implant. In-house machined phantoms, containing arrays of 12 and 72 seeds, respectively, are used to validate this method. Also, four {sup 103}Pd postimplant patients are scanned using an ACUITY digital simulator. Three to ten x-ray images are selectedmore » from the CBCT projection set and processed to create binary seed-only images. To quantify IFPM accuracy, the reconstructed seed positions are forward projected and overlaid on the measured seed images to find the nearest-neighbor distance between measured and computed seed positions for each image pair. Also, the estimated 3D seed coordinates are compared to known seed positions in the phantom and clinically obtained VariSeed planning coordinates for the patient data. Results: For the phantom study, seed localization error is (0.58{+-}0.33) mm. For all four patient cases, the mean registration error is better than 1 mm while compared against the measured seed projections. IFPM converges in 20-28 iterations, with a computation time of about 1.9-2.8 min/iteration on a 1 GHz processor. Conclusions: The IFPM algorithm avoids the need to match corresponding seeds in each projection as required by standard back-projection methods. The authors' results demonstrate {approx}1 mm accuracy in reconstructing the 3D positions of brachytherapy seeds from the measured 2D projections. This algorithm also successfully localizes overlapping clustered and highly migrated seeds in the implant.« less
A review on quantum search algorithms

NASA Astrophysics Data System (ADS)

Giri, Pulak Ranjan; Korepin, Vladimir E.

2017-12-01

The use of superposition of states in quantum computation, known as quantum parallelism, has significant advantage in terms of speed over the classical computation. It is evident from the early invented quantum algorithms such as Deutsch's algorithm, Deutsch-Jozsa algorithm and its variation as Bernstein-Vazirani algorithm, Simon algorithm, Shor's algorithms, etc. Quantum parallelism also significantly speeds up the database search algorithm, which is important in computer science because it comes as a subroutine in many important algorithms. Quantum database search of Grover achieves the task of finding the target element in an unsorted database in a time quadratically faster than the classical computer. We review Grover's quantum search algorithms for a singe and multiple target elements in a database. The partial search algorithm of Grover and Radhakrishnan and its optimization by Korepin called GRK algorithm are also discussed.
A system for the delivery of programmable, adaptive stimulation intensity envelopes for drop foot correction applications.

PubMed

Breen, P P; O'Keeffe, D T; Conway, R; Lyons, G M

2006-03-01

We describe the design of an intelligent drop foot stimulator unit for use in conjunction with a commercial neuromuscular electrical nerve stimulation (NMES) unit, the NT2000. The developed micro-controller unit interfaces to a personal computer (PC) and a graphical user interface (GUI) allows the clinician to graphically specify the shape of the stimulation intensity envelope required for a subject undergoing drop foot correction. The developed unit is based on the ADuC812S micro-controller evaluation board from Analog Devices and uses two force sensitive resistor (FSR) based foot-switches to control application of stimulus. The unit has the ability to display to the clinician how the stimulus intensity envelope is being delivered during walking using a data capture capability. The developed system has a built-in algorithm to dynamically adjust the delivery of stimulus to reflect changes both within the gait cycle and from cycle to cycle. Thus, adaptive control of stimulus intensity is achieved.
Direct recovery of regional tracer kinetics from temporally inconsistent dynamic ECT projections using dimension-reduced time-activity basis

NASA Astrophysics Data System (ADS)

Maltz, Jonathan S.

2000-11-01

We present an algorithm of reduced computational cost which is able to estimate kinetic model parameters directly from dynamic ECT sinograms made up of temporally inconsistent projections. The algorithm exploits the extreme degree of parameter redundancy inherent in linear combinations of the exponential functions which represent the modes of first-order compartmental systems. The singular value decomposition is employed to find a small set of orthogonal functions, the linear combinations of which are able to accurately represent all modes within the physiologically anticipated range in a given study. The reduced-dimension basis is formed as the convolution of this orthogonal set with a measured input function. The Moore-Penrose pseudoinverse is used to find coefficients of this basis. Algorithm performance is evaluated at realistic count rates using MCAT phantom and clinical 99mTc-teboroxime myocardial study data. Phantom data are modelled as originating from a Poisson process. For estimates recovered from a single slice projection set containing 2.5×105 total counts, recovered tissue responses compare favourably with those obtained using more computationally intensive methods. The corresponding kinetic parameter estimates (coefficients of the new basis) exhibit negligible bias, while parameter variances are low, falling within 30% of the Cramér-Rao lower bound.
IMAGEP - A FORTRAN ALGORITHM FOR DIGITAL IMAGE PROCESSING

NASA Technical Reports Server (NTRS)

Roth, D. J.

1994-01-01

IMAGEP is a FORTRAN computer algorithm containing various image processing, analysis, and enhancement functions. It is a keyboard-driven program organized into nine subroutines. Within the subroutines are other routines, also, selected via keyboard. Some of the functions performed by IMAGEP include digitization, storage and retrieval of images; image enhancement by contrast expansion, addition and subtraction, magnification, inversion, and bit shifting; display and movement of cursor; display of grey level histogram of image; and display of the variation of grey level intensity as a function of image position. This algorithm has possible scientific, industrial, and biomedical applications in material flaw studies, steel and ore analysis, and pathology, respectively. IMAGEP is written in VAX FORTRAN for DEC VAX series computers running VMS. The program requires the use of a Grinnell 274 image processor which can be obtained from Mark McCloud Associates, Campbell, CA. An object library of the required GMR series software is included on the distribution media. IMAGEP requires 1Mb of RAM for execution. The standard distribution medium for this program is a 1600 BPI 9track magnetic tape in VAX FILES-11 format. It is also available on a TK50 tape cartridge in VAX FILES-11 format. This program was developed in 1991. DEC, VAX, VMS, and TK50 are trademarks of Digital Equipment Corporation.
Reverse engineering and analysis of large genome-scale gene networks

PubMed Central

Aluru, Maneesha; Zola, Jaroslaw; Nettleton, Dan; Aluru, Srinivas

2013-01-01

Reverse engineering the whole-genome networks of complex multicellular organisms continues to remain a challenge. While simpler models easily scale to large number of genes and gene expression datasets, more accurate models are compute intensive limiting their scale of applicability. To enable fast and accurate reconstruction of large networks, we developed Tool for Inferring Network of Genes (TINGe), a parallel mutual information (MI)-based program. The novel features of our approach include: (i) B-spline-based formulation for linear-time computation of MI, (ii) a novel algorithm for direct permutation testing and (iii) development of parallel algorithms to reduce run-time and facilitate construction of large networks. We assess the quality of our method by comparison with ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks) and GeneNet and demonstrate its unique capability by reverse engineering the whole-genome network of Arabidopsis thaliana from 3137 Affymetrix ATH1 GeneChips in just 9 min on a 1024-core cluster. We further report on the development of a new software Gene Network Analyzer (GeNA) for extracting context-specific subnetworks from a given set of seed genes. Using TINGe and GeNA, we performed analysis of 241 Arabidopsis AraCyc 8.0 pathways, and the results are made available through the web. PMID:23042249
Structured surface reflector design for oblique incidence beam splitter at 610 GHz.

PubMed

Defrance, F; Casaletti, M; Sarrazin, J; Wiedner, M C; Gibson, H; Gay, G; Lefèvre, R; Delorme, Y

2016-09-05

An iterative alternate projection-based algorithm is developed to design structured surface reflectors to operate as beam splitters at GHz and THz frequencies. To validate the method, a surface profile is determined to achieve a reflector at 610 GHz that generates four equal-intensity beams towards desired directions of ±12.6° with respect to the specular reflection axis. A prototype is fabricated and the beam splitter behavior is experimentally demonstrated. Measurements confirm a good agreement (within 1%) with computer simulations using Feko, validating the method. The beam splitter at 610 GHz has a measured efficiency of 78% under oblique incidence illumination that ensures a similar intensity between the four reflected beams (variation of about 1%).

Optimization of beam orientation in radiotherapy using planar geometry

NASA Astrophysics Data System (ADS)

Haas, O. C. L.; Burnham, K. J.; Mills, J. A.

1998-08-01

This paper proposes a new geometrical formulation of the coplanar beam orientation problem combined with a hybrid multiobjective genetic algorithm. The approach is demonstrated by optimizing the beam orientation in two dimensions, with the objectives being formulated using planar geometry. The traditional formulation of the objectives associated with the organs at risk has been modified to account for the use of complex dose delivery techniques such as beam intensity modulation. The new algorithm attempts to replicate the approach of a treatment planner whilst reducing the amount of computation required. Hybrid genetic search operators have been developed to improve the performance of the genetic algorithm by exploiting problem-specific features. The multiobjective genetic algorithm is formulated around the concept of Pareto optimality which enables the algorithm to search in parallel for different objectives. When the approach is applied without constraining the number of beams, the solution produces an indication of the minimum number of beams required. It is also possible to obtain non-dominated solutions for various numbers of beams, thereby giving the clinicians a choice in terms of the number of beams as well as in the orientation of these beams.
Pruning Rogue Taxa Improves Phylogenetic Accuracy: An Efficient Algorithm and Webservice

PubMed Central

Aberer, Andre J.; Krompass, Denis; Stamatakis, Alexandros

2013-01-01

Abstract The presence of rogue taxa (rogues) in a set of trees can frequently have a negative impact on the results of a bootstrap analysis (e.g., the overall support in consensus trees). We introduce an efficient graph-based algorithm for rogue taxon identification as well as an interactive webservice implementing this algorithm. Compared with our previous method, the new algorithm is up to 4 orders of magnitude faster, while returning qualitatively identical results. Because of this significant improvement in scalability, the new algorithm can now identify substantially more complex and compute-intensive rogue taxon constellations. On a large and diverse collection of real-world data sets, we show that our method yields better supported reduced/pruned consensus trees than any competing rogue taxon identification method. Using the parallel version of our open-source code, we successfully identified rogue taxa in a set of 100 trees with 116 334 taxa each. For simulated data sets, we show that when removing/pruning rogue taxa with our method from a tree set, we consistently obtain bootstrap consensus trees as well as maximum-likelihood trees that are topologically closer to the respective true trees. PMID:22962004
Pruning rogue taxa improves phylogenetic accuracy: an efficient algorithm and webservice.

PubMed

Aberer, Andre J; Krompass, Denis; Stamatakis, Alexandros

2013-01-01

The presence of rogue taxa (rogues) in a set of trees can frequently have a negative impact on the results of a bootstrap analysis (e.g., the overall support in consensus trees). We introduce an efficient graph-based algorithm for rogue taxon identification as well as an interactive webservice implementing this algorithm. Compared with our previous method, the new algorithm is up to 4 orders of magnitude faster, while returning qualitatively identical results. Because of this significant improvement in scalability, the new algorithm can now identify substantially more complex and compute-intensive rogue taxon constellations. On a large and diverse collection of real-world data sets, we show that our method yields better supported reduced/pruned consensus trees than any competing rogue taxon identification method. Using the parallel version of our open-source code, we successfully identified rogue taxa in a set of 100 trees with 116 334 taxa each. For simulated data sets, we show that when removing/pruning rogue taxa with our method from a tree set, we consistently obtain bootstrap consensus trees as well as maximum-likelihood trees that are topologically closer to the respective true trees.
Distribution majorization of corner points by reinforcement learning for moving object detection

NASA Astrophysics Data System (ADS)

Wu, Hao; Yu, Hao; Zhou, Dongxiang; Cheng, Yongqiang

2018-04-01

Corner points play an important role in moving object detection, especially in the case of free-moving camera. Corner points provide more accurate information than other pixels and reduce the computation which is unnecessary. Previous works only use intensity information to locate the corner points, however, the information that former and the last frames provided also can be used. We utilize the information to focus on more valuable area and ignore the invaluable area. The proposed algorithm is based on reinforcement learning, which regards the detection of corner points as a Markov process. In the Markov model, the video to be detected is regarded as environment, the selections of blocks for one corner point are regarded as actions and the performance of detection is regarded as state. Corner points are assigned to be the blocks which are seperated from original whole image. Experimentally, we select a conventional method which uses marching and Random Sample Consensus algorithm to obtain objects as the main framework and utilize our algorithm to improve the result. The comparison between the conventional method and the same one with our algorithm show that our algorithm reduce 70% of the false detection.
Algorithmic processing of intrinsic signals in affixed transmission speckle analysis (ATSA) (Conference Presentation)

NASA Astrophysics Data System (ADS)

Ghijsen, Michael T.; Tromberg, Bruce J.

2017-03-01

Affixed Transmission Speckle Analysis (ATSA) is a method recently developed to measure blood flow that is based on laser speckle imaging miniaturized into a clip-on form factor the size of a pulse-oximeter. Measuring at a rate of 250 Hz, ATSA is capable or obtaining the cardiac waveform in blood flow data, referred to as the Speckle-Plethysmogram (SPG). ATSA is also capable of simultaneously measuring the Photoplethysmogram (PPG), a more conventional signal related to light intensity. In this work we present several novel algorithms for extracting physiologically relevant information from the combined SPG-PPG waveform data. First we show that there is a slight time-delay between the SPG and PPG that can be extracted computationally. Second, we present a set of frequency domain algorithms that measure harmonic content on pulse-by-pulse basis for both the SPG and PPG. Finally, we apply these algorithms to data obtained from a set of subjects including healthy controls and individuals with heightened cardiovascular risk. We hypothesize that the time-delay and frequency content are correlated with cardiovascular health; specifically with vascular stiffening.
An underwater turbulence degraded image restoration algorithm

NASA Astrophysics Data System (ADS)

Furhad, Md. Hasan; Tahtali, Murat; Lambert, Andrew

2017-09-01

Underwater turbulence occurs due to random fluctuations of temperature and salinity in the water. These fluctuations are responsible for variations in water density, refractive index and attenuation. These impose random geometric distortions, spatio-temporal varying blur, limited range visibility and limited contrast on the acquired images. There are some restoration techniques developed to address this problem, such as image registration based, lucky region based and centroid-based image restoration algorithms. Although these methods demonstrate better results in terms of removing turbulence, they require computationally intensive image registration, higher CPU load and memory allocations. Thus, in this paper, a simple patch based dictionary learning algorithm is proposed to restore the image by alleviating the costly image registration step. Dictionary learning is a machine learning technique which builds a dictionary of non-zero atoms derived from the sparse representation of an image or signal. The image is divided into several patches and the sharp patches are detected from them. Next, dictionary learning is performed on these patches to estimate the restored image. Finally, an image deconvolution algorithm is employed on the estimated restored image to remove noise that still exists.
High dynamic range algorithm based on HSI color space

NASA Astrophysics Data System (ADS)

Zhang, Jiancheng; Liu, Xiaohua; Dong, Liquan; Zhao, Yuejin; Liu, Ming

2014-10-01

This paper presents a High Dynamic Range algorithm based on HSI color space. To keep hue and saturation of original image and conform to human eye vision effect is the first problem, convert the input image data to HSI color space which include intensity dimensionality. To raise the speed of the algorithm is the second problem, use integral image figure out the average of every pixel intensity value under a certain scale, as local intensity component of the image, and figure out detail intensity component. To adjust the overall image intensity is the third problem, we can get an S type curve according to the original image information, adjust the local intensity component according to the S type curve. To enhance detail information is the fourth problem, adjust the detail intensity component according to the curve designed in advance. The weighted sum of local intensity component after adjusted and detail intensity component after adjusted is final intensity. Converting synthetic intensity and other two dimensionality to output color space can get final processed image.
Low Latency Workflow Scheduling and an Application of Hyperspectral Brightness Temperatures

NASA Astrophysics Data System (ADS)

Nguyen, P. T.; Chapman, D. R.; Halem, M.

2012-12-01

New system analytics for Big Data computing holds the promise of major scientific breakthroughs and discoveries from the exploration and mining of the massive data sets becoming available to the science community. However, such data intensive scientific applications face severe challenges in accessing, managing and analyzing petabytes of data. While the Hadoop MapReduce environment has been successfully applied to data intensive problems arising in business, there are still many scientific problem domains where limitations in the functionality of MapReduce systems prevent its wide adoption by those communities. This is mainly because MapReduce does not readily support the unique science discipline needs such as special science data formats, graphic and computational data analysis tools, maintaining high degrees of computational accuracies, and interfacing with application's existing components across heterogeneous computing processors. We address some of these limitations by exploiting the MapReduce programming model for satellite data intensive scientific problems and address scalability, reliability, scheduling, and data management issues when dealing with climate data records and their complex observational challenges. In addition, we will present techniques to support the unique Earth science discipline needs such as dealing with special science data formats (HDF and NetCDF). We have developed a Hadoop task scheduling algorithm that improves latency by 2x for a scientific workflow including the gridding of the EOS AIRS hyperspectral Brightness Temperatures (BT). This workflow processing algorithm has been tested at the Multicore Computing Center private Hadoop based Intel Nehalem cluster, as well as in a virtual mode under the Open Source Eucalyptus cloud. The 55TB AIRS hyperspectral L1b Brightness Temperature record has been gridded at the resolution of 0.5x1.0 degrees, and we have computed a 0.9 annual anti-correlation to the El Nino Southern oscillation in the Nino 4 region, as well as a 1.9 Kelvin decadal Arctic warming in the 4u and 12u spectral regions. Additionally, we will present the frequency of extreme global warming events by the use of a normalized maximum BT in a grid cell relative to its local standard deviation. A low-latency Hadoop scheduling environment maintains data integrity and fault tolerance in a MapReduce data intensive Cloud environment while improving the "time to solution" metric by 35% when compared to a more traditional parallel processing system for the same dataset. Our next step will be to improve the usability of our Hadoop task scheduling system, to enable rapid prototyping of data intensive experiments by means of processing "kernels". We will report on the performance and experience of implementing these experiments on the NEX testbed, and propose the use of a graphical directed acyclic graph (DAG) interface to help us develop on-demand scientific experiments. Our workflow system works within Hadoop infrastructure as a replacement for the FIFO or FairScheduler, thus the use of Apache "Pig" latin or other Apache tools may also be worth investigating on the NEX system to improve the usability of our workflow scheduling infrastructure for rapid experimentation.
Real-time processing of radar return on a parallel computer

NASA Technical Reports Server (NTRS)

Aalfs, David D.

1992-01-01

NASA is working with the FAA to demonstrate the feasibility of pulse Doppler radar as a candidate airborne sensor to detect low altitude windshears. The need to provide the pilot with timely information about possible hazards has motivated a demand for real-time processing of a radar return. Investigated here is parallel processing as a means of accommodating the high data rates required. A PC based parallel computer, called the transputer, is used to investigate issues in real time concurrent processing of radar signals. A transputer network is made up of an array of single instruction stream processors that can be networked in a variety of ways. They are easily reconfigured and software development is largely independent of the particular network topology. The performance of the transputer is evaluated in light of the computational requirements. A number of algorithms have been implemented on the transputers in OCCAM, a language specially designed for parallel processing. These include signal processing algorithms such as the Fast Fourier Transform (FFT), pulse-pair, and autoregressive modelling, as well as routing software to support concurrency. The most computationally intensive task is estimating the spectrum. Two approaches have been taken on this problem, the first and most conventional of which is to use the FFT. By using table look-ups for the basis function and other optimizing techniques, an algorithm has been developed that is sufficient for real time. The other approach is to model the signal as an autoregressive process and estimate the spectrum based on the model coefficients. This technique is attractive because it does not suffer from the spectral leakage problem inherent in the FFT. Benchmark tests indicate that autoregressive modeling is feasible in real time.
Harmony search optimization for HDR prostate brachytherapy

NASA Astrophysics Data System (ADS)

Panchal, Aditya

In high dose-rate (HDR) prostate brachytherapy, multiple catheters are inserted interstitially into the target volume. The process of treating the prostate involves calculating and determining the best dose distribution to the target and organs-at-risk by means of optimizing the time that the radioactive source dwells at specified positions within the catheters. It is the goal of this work to investigate the use of a new optimization algorithm, known as Harmony Search, in order to optimize dwell times for HDR prostate brachytherapy. The new algorithm was tested on 9 different patients and also compared with the genetic algorithm. Simulations were performed to determine the optimal value of the Harmony Search parameters. Finally, multithreading of the simulation was examined to determine potential benefits. First, a simulation environment was created using the Python programming language and the wxPython graphical interface toolkit, which was necessary to run repeated optimizations. DICOM RT data from Varian BrachyVision was parsed and used to obtain patient anatomy and HDR catheter information. Once the structures were indexed, the volume of each structure was determined and compared to the original volume calculated in BrachyVision for validation. Dose was calculated using the AAPM TG-43 point source model of the GammaMed 192Ir HDR source and was validated against Varian BrachyVision. A DVH-based objective function was created and used for the optimization simulation. Harmony Search and the genetic algorithm were implemented as optimization algorithms for the simulation and were compared against each other. The optimal values for Harmony Search parameters (Harmony Memory Size [HMS], Harmony Memory Considering Rate [HMCR], and Pitch Adjusting Rate [PAR]) were also determined. Lastly, the simulation was modified to use multiple threads of execution in order to achieve faster computational times. Experimental results show that the volume calculation that was implemented in this thesis was within 2% of the values computed by Varian BrachyVision for the prostate, within 3% for the rectum and bladder and 6% for the urethra. The calculation of dose compared to BrachyVision was determined to be different by only 0.38%. Isodose curves were also generated and were found to be similar to BrachyVision. The comparison between Harmony Search and genetic algorithm showed that Harmony Search was over 4 times faster when compared over multiple data sets. The optimal Harmony Memory Size was found to be 5 or lower; the Harmony Memory Considering Rate was determined to be 0.95, and the Pitch Adjusting Rate was found to be 0.9. Ultimately, the effect of multithreading showed that as intensive computations such as optimization and dose calculation are involved, the threads of execution scale with the number of processors, achieving a speed increase proportional to the number of processor cores. In conclusion, this work showed that Harmony Search is a viable alternative to existing algorithms for use in HDR prostate brachytherapy optimization. Coupled with the optimal parameters for the algorithm and a multithreaded simulation, this combination has the capability to significantly decrease the time spent on minimizing optimization problems in the clinic that are time intensive, such as brachytherapy, IMRT and beam angle optimization.
Novel approach for image skeleton and distance transformation parallel algorithms

NASA Astrophysics Data System (ADS)

Qing, Kent P.; Means, Robert W.

1994-05-01

Image Understanding is more important in medical imaging than ever, particularly where real-time automatic inspection, screening and classification systems are installed. Skeleton and distance transformations are among the common operations that extract useful information from binary images and aid in Image Understanding. The distance transformation describes the objects in an image by labeling every pixel in each object with the distance to its nearest boundary. The skeleton algorithm starts from the distance transformation and finds the set of pixels that have a locally maximum label. The distance algorithm has to scan the entire image several times depending on the object width. For each pixel, the algorithm must access the neighboring pixels and find the maximum distance from the nearest boundary. It is a computational and memory access intensive procedure. In this paper, we propose a novel parallel approach to the distance transform and skeleton algorithms using the latest VLSI high- speed convolutional chips such as HNC's ViP. The algorithm speed is dependent on the object's width and takes (k + [(k-1)/3]) * 7 milliseconds for a 512 X 512 image with k being the maximum distance of the largest object. All objects in the image will be skeletonized at the same time in parallel.
SU-E-T-175: Clinical Evaluations of Monte Carlo-Based Inverse Treatment Plan Optimization for Intensity Modulated Radiotherapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chi, Y; Li, Y; Tian, Z

2015-06-15

Purpose: Pencil-beam or superposition-convolution type dose calculation algorithms are routinely used in inverse plan optimization for intensity modulated radiation therapy (IMRT). However, due to their limited accuracy in some challenging cases, e.g. lung, the resulting dose may lose its optimality after being recomputed using an accurate algorithm, e.g. Monte Carlo (MC). It is the objective of this study to evaluate the feasibility and advantages of a new method to include MC in the treatment planning process. Methods: We developed a scheme to iteratively perform MC-based beamlet dose calculations and plan optimization. In the MC stage, a GPU-based dose engine wasmore » used and the particle number sampled from a beamlet was proportional to its optimized fluence from the previous step. We tested this scheme in four lung cancer IMRT cases. For each case, the original plan dose, plan dose re-computed by MC, and dose optimized by our scheme were obtained. Clinically relevant dosimetric quantities in these three plans were compared. Results: Although the original plan achieved a satisfactory PDV dose coverage, after re-computing doses using MC method, it was found that the PTV D95% were reduced by 4.60%–6.67%. After re-optimizing these cases with our scheme, the PTV coverage was improved to the same level as in the original plan, while the critical OAR coverages were maintained to clinically acceptable levels. Regarding the computation time, it took on average 144 sec per case using only one GPU card, including both MC-based beamlet dose calculation and treatment plan optimization. Conclusion: The achieved dosimetric gains and high computational efficiency indicate the feasibility and advantages of the proposed MC-based IMRT optimization method. Comprehensive validations in more patient cases are in progress.« less
Real-time data-intensive computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Parkinson, Dilworth Y., E-mail: dyparkinson@lbl.gov; Chen, Xian; Hexemer, Alexander

2016-07-27

Today users visit synchrotrons as sources of understanding and discovery—not as sources of just light, and not as sources of data. To achieve this, the synchrotron facilities frequently provide not just light but often the entire end station and increasingly, advanced computational facilities that can reduce terabytes of data into a form that can reveal a new key insight. The Advanced Light Source (ALS) has partnered with high performance computing, fast networking, and applied mathematics groups to create a “super-facility”, giving users simultaneous access to the experimental, computational, and algorithmic resources to make this possible. This combination forms an efficientmore » closed loop, where data—despite its high rate and volume—is transferred and processed immediately and automatically on appropriate computing resources, and results are extracted, visualized, and presented to users or to the experimental control system, both to provide immediate insight and to guide decisions about subsequent experiments during beamtime. We will describe our work at the ALS ptychography, scattering, micro-diffraction, and micro-tomography beamlines.« less
Raw data normalization for a multi source inverse geometry CT system

PubMed Central

Baek, Jongduk; De Man, Bruno; Harrison, Daniel; Pelc, Norbert J.

2015-01-01

A multi-source inverse-geometry CT (MS-IGCT) system consists of a small 2D detector array and multiple x-ray sources. During data acquisition, each source is activated sequentially, and may have random source intensity fluctuations relative to their respective nominal intensity. While a conventional 3rd generation CT system uses a reference channel to monitor the source intensity fluctuation, the MS-IGCT system source illuminates a small portion of the entire field-of-view (FOV). Therefore, it is difficult for all sources to illuminate the reference channel and the projection data computed by standard normalization using flat field data of each source contains error and can cause significant artifacts. In this work, we present a raw data normalization algorithm to reduce the image artifacts caused by source intensity fluctuation. The proposed method was tested using computer simulations with a uniform water phantom and a Shepp-Logan phantom, and experimental data of an ice-filled PMMA phantom and a rabbit. The effect on image resolution and robustness of the noise were tested using MTF and standard deviation of the reconstructed noise image. With the intensity fluctuation and no correction, reconstructed images from simulation and experimental data show high frequency artifacts and ring artifacts which are removed effectively using the proposed method. It is also observed that the proposed method does not degrade the image resolution and is very robust to the presence of noise. PMID:25837090
Coupling reconstruction and motion estimation for dynamic MRI through optical flow constraint

NASA Astrophysics Data System (ADS)

Zhao, Ningning; O'Connor, Daniel; Gu, Wenbo; Ruan, Dan; Basarab, Adrian; Sheng, Ke

2018-03-01

This paper addresses the problem of dynamic magnetic resonance image (DMRI) reconstruction and motion estimation jointly. Because of the inherent anatomical movements in DMRI acquisition, reconstruction of DMRI using motion estimation/compensation (ME/MC) has been explored under the compressed sensing (CS) scheme. In this paper, by embedding the intensity based optical flow (OF) constraint into the traditional CS scheme, we are able to couple the DMRI reconstruction and motion vector estimation. Moreover, the OF constraint is employed in a specific coarse resolution scale in order to reduce the computational complexity. The resulting optimization problem is then solved using a primal-dual algorithm due to its efficiency when dealing with nondifferentiable problems. Experiments on highly accelerated dynamic cardiac MRI with multiple receiver coils validate the performance of the proposed algorithm.
THE SCREENING AND RANKING ALGORITHM FOR CHANGE-POINTS DETECTION IN MULTIPLE SAMPLES

PubMed Central

Song, Chi; Min, Xiaoyi; Zhang, Heping

2016-01-01

The chromosome copy number variation (CNV) is the deviation of genomic regions from their normal copy number states, which may associate with many human diseases. Current genetic studies usually collect hundreds to thousands of samples to study the association between CNV and diseases. CNVs can be called by detecting the change-points in mean for sequences of array-based intensity measurements. Although multiple samples are of interest, the majority of the available CNV calling methods are single sample based. Only a few multiple sample methods have been proposed using scan statistics that are computationally intensive and designed toward either common or rare change-points detection. In this paper, we propose a novel multiple sample method by adaptively combining the scan statistic of the screening and ranking algorithm (SaRa), which is computationally efficient and is able to detect both common and rare change-points. We prove that asymptotically this method can find the true change-points with almost certainty and show in theory that multiple sample methods are superior to single sample methods when shared change-points are of interest. Additionally, we report extensive simulation studies to examine the performance of our proposed method. Finally, using our proposed method as well as two competing approaches, we attempt to detect CNVs in the data from the Primary Open-Angle Glaucoma Genes and Environment study, and conclude that our method is faster and requires less information while our ability to detect the CNVs is comparable or better. PMID:28090239
GPU accelerated dynamic functional connectivity analysis for functional MRI data.

PubMed

Akgün, Devrim; Sakoğlu, Ünal; Esquivel, Johnny; Adinoff, Bryon; Mete, Mutlu

2015-07-01

Recent advances in multi-core processors and graphics card based computational technologies have paved the way for an improved and dynamic utilization of parallel computing techniques. Numerous applications have been implemented for the acceleration of computationally-intensive problems in various computational science fields including bioinformatics, in which big data problems are prevalent. In neuroimaging, dynamic functional connectivity (DFC) analysis is a computationally demanding method used to investigate dynamic functional interactions among different brain regions or networks identified with functional magnetic resonance imaging (fMRI) data. In this study, we implemented and analyzed a parallel DFC algorithm based on thread-based and block-based approaches. The thread-based approach was designed to parallelize DFC computations and was implemented in both Open Multi-Processing (OpenMP) and Compute Unified Device Architecture (CUDA) programming platforms. Another approach developed in this study to better utilize CUDA architecture is the block-based approach, where parallelization involves smaller parts of fMRI time-courses obtained by sliding-windows. Experimental results showed that the proposed parallel design solutions enabled by the GPUs significantly reduce the computation time for DFC analysis. Multicore implementation using OpenMP on 8-core processor provides up to 7.7× speed-up. GPU implementation using CUDA yielded substantial accelerations ranging from 18.5× to 157× speed-up once thread-based and block-based approaches were combined in the analysis. Proposed parallel programming solutions showed that multi-core processor and CUDA-supported GPU implementations accelerated the DFC analyses significantly. Developed algorithms make the DFC analyses more practical for multi-subject studies with more dynamic analyses. Copyright © 2015 Elsevier Ltd. All rights reserved.
Key issues of ultraviolet radiation of OH at high altitudes

NASA Astrophysics Data System (ADS)

Zhang, Yuhuai; Wan, Tian; Jiang, Jianzheng; Fan, Jing

2014-12-01

Ultraviolet (UV) emissions radiated by hydroxyl (OH) is one of the fundamental elements in the prediction of radiation signature of high-altitude and high-speed vehicle. In this work, the OH A2Σ+→ X2Π ultraviolet emission band behind the bow shock is computed under the experimental condition of the second bow-shock ultraviolet flight (BSUV-2). Four related key issues are discussed, namely, the source of hydrogen element in the high-altitude atmosphere, the formation mechanism of OH species, efficient computational algorithm of trace species in rarefied flows, and accurate calculation of OH emission spectra. Firstly, by analyzing the typical atmospheric model, the vertical distributions of the number densities of different species containing hydrogen element are given. According to the different dominating species containing hydrogen element, the atmosphere is divided into three zones, and the formation mechanism of OH species is analyzed in the different zones. The direct simulation Monte Carlo (DSMC) method and the Navier-Stokes equations are employed to compute the number densities of the different OH electronically and vibrationally excited states. Different to the previous work, the trace species separation (TSS) algorithm is applied twice in order to accurately calculate the densities of OH species and its excited states. Using a non-equilibrium radiation model, the OH ultraviolet emission spectra and intensity at different altitudes are computed, and good agreement is obtained with the flight measured data.
Computational Software for Fitting Seismic Data to Epidemic-Type Aftershock Sequence Models

NASA Astrophysics Data System (ADS)

Chu, A.

2014-12-01

Modern earthquake catalogs are often analyzed using spatial-temporal point process models such as the epidemic-type aftershock sequence (ETAS) models of Ogata (1998). My work introduces software to implement two of ETAS models described in Ogata (1998). To find the Maximum-Likelihood Estimates (MLEs), my software provides estimates of the homogeneous background rate parameter and the temporal and spatial parameters that govern triggering effects by applying the Expectation-Maximization (EM) algorithm introduced in Veen and Schoenberg (2008). Despite other computer programs exist for similar data modeling purpose, using EM-algorithm has the benefits of stability and robustness (Veen and Schoenberg, 2008). Spatial shapes that are very long and narrow cause difficulties in optimization convergence and problems with flat or multi-modal log-likelihood functions encounter similar issues. My program uses a robust method to preset a parameter to overcome the non-convergence computational issue. In addition to model fitting, the software is equipped with useful tools for examining modeling fitting results, for example, visualization of estimated conditional intensity, and estimation of expected number of triggered aftershocks. A simulation generator is also given with flexible spatial shapes that may be defined by the user. This open-source software has a very simple user interface. The user may execute it on a local computer, and the program also has potential to be hosted online. Java language is used for the software's core computing part and an optional interface to the statistical package R is provided.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Strout, Michelle

Programming parallel machines is fraught with difficulties: the obfuscation of algorithms due to implementation details such as communication and synchronization, the need for transparency between language constructs and performance, the difficulty of performing program analysis to enable automatic parallelization techniques, and the existence of important "dusty deck" codes. The SAIMI project developed abstractions that enable the orthogonal specification of algorithms and implementation details within the context of existing DOE applications. The main idea is to enable the injection of small programming models such as expressions involving transcendental functions, polyhedral iteration spaces with sparse constraints, and task graphs into full programsmore » through the use of pragmas. These smaller, more restricted programming models enable orthogonal specification of many implementation details such as how to map the computation on to parallel processors, how to schedule the computation, and how to allocation storage for the computation. At the same time, these small programming models enable the expression of the most computationally intense and communication heavy portions in many scientific simulations. The ability to orthogonally manipulate the implementation for such computations will significantly ease performance programming efforts and expose transformation possibilities and parameter to automated approaches such as autotuning. At Colorado State University, the SAIMI project was supported through DOE grant DE-SC3956 from April 2010 through August 2015. The SAIMI project has contributed a number of important results to programming abstractions that enable the orthogonal specification of implementation details in scientific codes. This final report summarizes the research that was funded by the SAIMI project.« less

Trans-algorithmic nature of learning in biological systems.

PubMed

Shimansky, Yury P

2018-05-02

Learning ability is a vitally important, distinctive property of biological systems, which provides dynamic stability in non-stationary environments. Although several different types of learning have been successfully modeled using a universal computer, in general, learning cannot be described by an algorithm. In other words, algorithmic approach to describing the functioning of biological systems is not sufficient for adequate grasping of what is life. Since biosystems are parts of the physical world, one might hope that adding some physical mechanisms and principles to the concept of algorithm could provide extra possibilities for describing learning in its full generality. However, a straightforward approach to that through the so-called physical hypercomputation so far has not been successful. Here an alternative approach is proposed. Biosystems are described as achieving enumeration of possible physical compositions though random incremental modifications inflicted on them by active operating resources (AORs) in the environment. Biosystems learn through algorithmic regulation of the intensity of the above modifications according to a specific optimality criterion. From the perspective of external observers, biosystems move in the space of different algorithms driven by random modifications imposed by the environmental AORs. A particular algorithm is only a snapshot of that motion, while the motion itself is essentially trans-algorithmic. In this conceptual framework, death of unfit members of a population, for example, is viewed as a trans-algorithmic modification made in the population as a biosystem by environmental AORs. Numerous examples of AOR utilization in biosystems of different complexity, from viruses to multicellular organisms, are provided.
Exploring quantum computing application to satellite data assimilation

NASA Astrophysics Data System (ADS)

Cheung, S.; Zhang, S. Q.

2015-12-01

This is an exploring work on potential application of quantum computing to a scientific data optimization problem. On classical computational platforms, the physical domain of a satellite data assimilation problem is represented by a discrete variable transform, and classical minimization algorithms are employed to find optimal solution of the analysis cost function. The computation becomes intensive and time-consuming when the problem involves large number of variables and data. The new quantum computer opens a very different approach both in conceptual programming and in hardware architecture for solving optimization problem. In order to explore if we can utilize the quantum computing machine architecture, we formulate a satellite data assimilation experimental case in the form of quadratic programming optimization problem. We find a transformation of the problem to map it into Quadratic Unconstrained Binary Optimization (QUBO) framework. Binary Wavelet Transform (BWT) will be applied to the data assimilation variables for its invertible decomposition and all calculations in BWT are performed by Boolean operations. The transformed problem will be experimented as to solve for a solution of QUBO instances defined on Chimera graphs of the quantum computer.
QCE: A Simulator for Quantum Computer Hardware

NASA Astrophysics Data System (ADS)

Michielsen, Kristel; de Raedt, Hans

2003-09-01

The Quantum Computer Emulator (QCE) described in this paper consists of a simulator of a generic, general purpose quantum computer and a graphical user interface. The latter is used to control the simulator, to define the hardware of the quantum computer and to debug and execute quantum algorithms. QCE runs in a Windows 98/NT/2000/ME/XP environment. It can be used to validate designs of physically realizable quantum processors and as an interactive educational tool to learn about quantum computers and quantum algorithms. A detailed exposition is given of the implementation of the CNOT and the Toffoli gate, the quantum Fourier transform, Grover's database search algorithm, an order finding algorithm, Shor's algorithm, a three-input adder and a number partitioning algorithm. We also review the results of simulations of an NMR-like quantum computer.
Cardiac MRI in mice at 9.4 Tesla with a transmit-receive surface coil and a cardiac-tailored intensity-correction algorithm.

PubMed

Sosnovik, David E; Dai, Guangping; Nahrendorf, Matthias; Rosen, Bruce R; Seethamraju, Ravi

2007-08-01

To evaluate the use of a transmit-receive surface (TRS) coil and a cardiac-tailored intensity-correction algorithm for cardiac MRI in mice at 9.4 Tesla (9.4T). Fast low-angle shot (FLASH) cines, with and without delays alternating with nutations for tailored excitation (DANTE) tagging, were acquired in 13 mice. An intensity-correction algorithm was developed to compensate for the sensitivity profile of the surface coil, and was tailored to account for the unique distribution of noise and flow artifacts in cardiac MR images. Image quality was extremely high and allowed fine structures such as trabeculations, valve cusps, and coronary arteries to be clearly visualized. The tag lines created with the surface coil were also sharp and clearly visible. Application of the intensity-correction algorithm improved signal intensity, tissue contrast, and image quality even further. Importantly, the cardiac-tailored properties of the correction algorithm prevented noise and flow artifacts from being significantly amplified. The feasibility and value of cardiac MRI in mice with a TRS coil has been demonstrated. In addition, a cardiac-tailored intensity-correction algorithm has been developed and shown to improve image quality even further. The use of these techniques could produce significant potential benefits over a broad range of scanners, coil configurations, and field strengths. (c) 2007 Wiley-Liss, Inc.
Radio Synthesis Imaging - A High Performance Computing and Communications Project

NASA Astrophysics Data System (ADS)

Crutcher, Richard M.

The National Science Foundation has funded a five-year High Performance Computing and Communications project at the National Center for Supercomputing Applications (NCSA) for the direct implementation of several of the computing recommendations of the Astronomy and Astrophysics Survey Committee (the "Bahcall report"). This paper is a summary of the project goals and a progress report. The project will implement a prototype of the next generation of astronomical telescope systems - remotely located telescopes connected by high-speed networks to very high performance, scalable architecture computers and on-line data archives, which are accessed by astronomers over Gbit/sec networks. Specifically, a data link has been installed between the BIMA millimeter-wave synthesis array at Hat Creek, California and NCSA at Urbana, Illinois for real-time transmission of data to NCSA. Data are automatically archived, and may be browsed and retrieved by astronomers using the NCSA Mosaic software. In addition, an on-line digital library of processed images will be established. BIMA data will be processed on a very high performance distributed computing system, with I/O, user interface, and most of the software system running on the NCSA Convex C3880 supercomputer or Silicon Graphics Onyx workstations connected by HiPPI to the high performance, massively parallel Thinking Machines Corporation CM-5. The very computationally intensive algorithms for calibration and imaging of radio synthesis array observations will be optimized for the CM-5 and new algorithms which utilize the massively parallel architecture will be developed. Code running simultaneously on the distributed computers will communicate using the Data Transport Mechanism developed by NCSA. The project will also use the BLANCA Gbit/s testbed network between Urbana and Madison, Wisconsin to connect an Onyx workstation in the University of Wisconsin Astronomy Department to the NCSA CM-5, for development of long-distance distributed computing. Finally, the project is developing 2D and 3D visualization software as part of the international AIPS++ project. This research and development project is being carried out by a team of experts in radio astronomy, algorithm development for massively parallel architectures, high-speed networking, database management, and Thinking Machines Corporation personnel. The development of this complete software, distributed computing, and data archive and library solution to the radio astronomy computing problem will advance our expertise in high performance computing and communications technology and the application of these techniques to astronomical data processing.
Rational use of cognitive resources: levels of analysis between the computational and the algorithmic.

PubMed

Griffiths, Thomas L; Lieder, Falk; Goodman, Noah D

2015-04-01

Marr's levels of analysis-computational, algorithmic, and implementation-have served cognitive science well over the last 30 years. But the recent increase in the popularity of the computational level raises a new challenge: How do we begin to relate models at different levels of analysis? We propose that it is possible to define levels of analysis that lie between the computational and the algorithmic, providing a way to build a bridge between computational- and algorithmic-level models. The key idea is to push the notion of rationality, often used in defining computational-level models, deeper toward the algorithmic level. We offer a simple recipe for reverse-engineering the mind's cognitive strategies by deriving optimal algorithms for a series of increasingly more realistic abstract computational architectures, which we call "resource-rational analysis." Copyright © 2015 Cognitive Science Society, Inc.
Automatic segmentation for detecting uterine fibroid regions treated with MR-guided high intensity focused ultrasound (MR-HIFU).

PubMed

Antila, Kari; Nieminen, Heikki J; Sequeiros, Roberto Blanco; Ehnholm, Gösta

2014-07-01

Up to 25% of women suffer from uterine fibroids (UF) that cause infertility, pain, and discomfort. MR-guided high intensity focused ultrasound (MR-HIFU) is an emerging technique for noninvasive, computer-guided thermal ablation of UFs. The volume of induced necrosis is a predictor of the success of the treatment. However, accurate volume assessment by hand can be time consuming, and quick tools produce biased results. Therefore, fast and reliable tools are required in order to estimate the technical treatment outcome during the therapy event so as to predict symptom relief. A novel technique has been developed for the segmentation and volume assessment of the treated region. Conventional algorithms typically require user interaction ora priori knowledge of the target. The developed algorithm exploits the treatment plan, the coordinates of the intended ablation, for fully automatic segmentation with no user input. A good similarity to an expert-segmented manual reference was achieved (Dice similarity coefficient = 0.880 ± 0.074). The average automatic segmentation time was 1.6 ± 0.7 min per patient against an order of tens of minutes when done manually. The results suggest that the segmentation algorithm developed, requiring no user-input, provides a feasible and practical approach for the automatic evaluation of the boundary and volume of the HIFU-treated region.
An Evolutionary Method for Financial Forecasting in Microscopic High-Speed Trading Environment

PubMed Central

Li, Hsu-Chih

2017-01-01

The advancement of information technology in financial applications nowadays have led to fast market-driven events that prompt flash decision-making and actions issued by computer algorithms. As a result, today's markets experience intense activity in the highly dynamic environment where trading systems respond to others at a much faster pace than before. This new breed of technology involves the implementation of high-speed trading strategies which generate significant portion of activity in the financial markets and present researchers with a wealth of information not available in traditional low-speed trading environments. In this study, we aim at developing feasible computational intelligence methodologies, particularly genetic algorithms (GA), to shed light on high-speed trading research using price data of stocks on the microscopic level. Our empirical results show that the proposed GA-based system is able to improve the accuracy of the prediction significantly for price movement, and we expect this GA-based methodology to advance the current state of research for high-speed trading and other relevant financial applications. PMID:28316618
3D Kirchhoff depth migration algorithm: A new scalable approach for parallelization on multicore CPU based cluster

NASA Astrophysics Data System (ADS)

Rastogi, Richa; Londhe, Ashutosh; Srivastava, Abhishek; Sirasala, Kirannmayi M.; Khonde, Kiran

2017-03-01

In this article, a new scalable 3D Kirchhoff depth migration algorithm is presented on state of the art multicore CPU based cluster. Parallelization of 3D Kirchhoff depth migration is challenging due to its high demand of compute time, memory, storage and I/O along with the need of their effective management. The most resource intensive modules of the algorithm are traveltime calculations and migration summation which exhibit an inherent trade off between compute time and other resources. The parallelization strategy of the algorithm largely depends on the storage of calculated traveltimes and its feeding mechanism to the migration process. The presented work is an extension of our previous work, wherein a 3D Kirchhoff depth migration application for multicore CPU based parallel system had been developed. Recently, we have worked on improving parallel performance of this application by re-designing the parallelization approach. The new algorithm is capable to efficiently migrate both prestack and poststack 3D data. It exhibits flexibility for migrating large number of traces within the available node memory and with minimal requirement of storage, I/O and inter-node communication. The resultant application is tested using 3D Overthrust data on PARAM Yuva II, which is a Xeon E5-2670 based multicore CPU cluster with 16 cores/node and 64 GB shared memory. Parallel performance of the algorithm is studied using different numerical experiments and the scalability results show striking improvement over its previous version. An impressive 49.05X speedup with 76.64% efficiency is achieved for 3D prestack data and 32.00X speedup with 50.00% efficiency for 3D poststack data, using 64 nodes. The results also demonstrate the effectiveness and robustness of the improved algorithm with high scalability and efficiency on a multicore CPU cluster.
A class of parallel algorithms for computation of the manipulator inertia matrix

NASA Technical Reports Server (NTRS)

Fijany, Amir; Bejczy, Antal K.

1989-01-01

Parallel and parallel/pipeline algorithms for computation of the manipulator inertia matrix are presented. An algorithm based on composite rigid-body spatial inertia method, which provides better features for parallelization, is used for the computation of the inertia matrix. Two parallel algorithms are developed which achieve the time lower bound in computation. Also described is the mapping of these algorithms with topological variation on a two-dimensional processor array, with nearest-neighbor connection, and with cardinality variation on a linear processor array. An efficient parallel/pipeline algorithm for the linear array was also developed, but at significantly higher efficiency.
Tissue Probability Map Constrained 4-D Clustering Algorithm for Increased Accuracy and Robustness in Serial MR Brain Image Segmentation

PubMed Central

Xue, Zhong; Shen, Dinggang; Li, Hai; Wong, Stephen

2010-01-01

The traditional fuzzy clustering algorithm and its extensions have been successfully applied in medical image segmentation. However, because of the variability of tissues and anatomical structures, the clustering results might be biased by the tissue population and intensity differences. For example, clustering-based algorithms tend to over-segment white matter tissues of MR brain images. To solve this problem, we introduce a tissue probability map constrained clustering algorithm and apply it to serial MR brain image segmentation, i.e., a series of 3-D MR brain images of the same subject at different time points. Using the new serial image segmentation algorithm in the framework of the CLASSIC framework, which iteratively segments the images and estimates the longitudinal deformations, we improved both accuracy and robustness for serial image computing, and at the mean time produced longitudinally consistent segmentation and stable measures. In the algorithm, the tissue probability maps consist of both the population-based and subject-specific segmentation priors. Experimental study using both simulated longitudinal MR brain data and the Alzheimer’s Disease Neuroimaging Initiative (ADNI) data confirmed that using both priors more accurate and robust segmentation results can be obtained. The proposed algorithm can be applied in longitudinal follow up studies of MR brain imaging with subtle morphological changes for neurological disorders. PMID:26566399
Numerical evaluation of the intensity transport equation for well-known wavefronts and intensity distributions

NASA Astrophysics Data System (ADS)

Campos-García, Manuel; Granados-Agustín, Fermín.; Cornejo-Rodríguez, Alejandro; Estrada-Molina, Amilcar; Avendaño-Alejo, Maximino; Moreno-Oliva, Víctor Iván.

2013-11-01

In order to obtain a clearer interpretation of the Intensity Transport Equation (ITE), in this work, we propose an algorithm to solve it for some particular wavefronts and its corresponding intensity distributions. By simulating intensity distributions in some planes, the ITE is turns into a Poisson equation with Neumann boundary conditions. The Poisson equation is solved by means of the iterative algorithm SOR (Simultaneous Over-Relaxation).
An Updated Algorithm for Estimation of Pesticide Exposure Intensity in the Agricultural Health Study

EPA Science Inventory

An algorithm developed to estimate pesticide exposure intensity for use in epidemiologic analyses was revised based on data from two exposure monitoring studies. In the first study, we estimated relative exposure intensity based on the results of measurements taken during the app...
Earthquake forecasts for the CSEP Japan experiment based on the RI algorithm

NASA Astrophysics Data System (ADS)

Nanjo, K. Z.

2011-03-01

An earthquake forecast testing experiment for Japan, the first of its kind, is underway within the framework of the Collaboratory for the Study of Earthquake Predictability (CSEP) under a controlled environment. Here we give an overview of the earthquake forecast models, based on the RI algorithm, which we have submitted to the CSEP Japan experiment. Models have been submitted to a total of 9 categories, corresponding to 3 testing classes (3 years, 1 year, and 3 months) and 3 testing regions. The RI algorithm is originally a binary forecast system based on the working assumption that large earthquakes are more likely to occur in the future at locations of higher seismicity in the past. It is based on simple counts of the number of past earthquakes, which is called the Relative Intensity (RI) of seismicity. To improve its forecast performance, we first expand the RI algorithm by introducing spatial smoothing. We then convert the RI representation from a binary system to a CSEP-testable model that produces forecasts for the number of earthquakes of predefined magnitudes. We use information on past seismicity to tune the parameters. The final submittal consists of 36 executable computer codes: 4 variants corresponding to different smoothing parameters for each of the 9 categories. They will help to elucidate which categories and which smoothing parameters are the most meaningful for the RI hypothesis. The main purpose of our participation in the experiment is to better understand the significance of the relative intensity of seismicity for earthquake forecastability in Japan.
Network Community Detection based on the Physarum-inspired Computational Framework.

PubMed

Gao, Chao; Liang, Mingxin; Li, Xianghua; Zhang, Zili; Wang, Zhen; Zhou, Zhili

2016-12-13

Community detection is a crucial and essential problem in the structure analytics of complex networks, which can help us understand and predict the characteristics and functions of complex networks. Many methods, ranging from the optimization-based algorithms to the heuristic-based algorithms, have been proposed for solving such a problem. Due to the inherent complexity of identifying network structure, how to design an effective algorithm with a higher accuracy and a lower computational cost still remains an open problem. Inspired by the computational capability and positive feedback mechanism in the wake of foraging process of Physarum, which is a large amoeba-like cell consisting of a dendritic network of tube-like pseudopodia, a general Physarum-based computational framework for community detection is proposed in this paper. Based on the proposed framework, the inter-community edges can be identified from the intra-community edges in a network and the positive feedback of solving process in an algorithm can be further enhanced, which are used to improve the efficiency of original optimization-based and heuristic-based community detection algorithms, respectively. Some typical algorithms (e.g., genetic algorithm, ant colony optimization algorithm, and Markov clustering algorithm) and real-world datasets have been used to estimate the efficiency of our proposed computational framework. Experiments show that the algorithms optimized by Physarum-inspired computational framework perform better than the original ones, in terms of accuracy and computational cost. Moreover, a computational complexity analysis verifies the scalability of our framework.
A highly efficient multi-core algorithm for clustering extremely large datasets

PubMed Central

2010-01-01

Background In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most approaches for parallelizing algorithms largely rely on network communication protocols connecting and requiring multiple computers. One answer to this problem is to utilize the intrinsic capabilities in current multi-core hardware to distribute the tasks among the different cores of one computer. Results We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms based on the design principles of transactional memory for clustering gene expression microarray type data and categorial SNP data. Our new shared memory parallel algorithms show to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis employing repeated runs with slightly changed parameters. Computation speed of our Java based algorithm was increased by a factor of 10 for large data sets while preserving computational accuracy compared to single-core implementations and a recently published network based parallelization. Conclusions Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer. PMID:20370922
[Parallel virtual reality visualization of extreme large medical datasets].

PubMed

Tang, Min

2010-04-01

On the basis of a brief description of grid computing, the essence and critical techniques of parallel visualization of extreme large medical datasets are discussed in connection with Intranet and common-configuration computers of hospitals. In this paper are introduced several kernel techniques, including the hardware structure, software framework, load balance and virtual reality visualization. The Maximum Intensity Projection algorithm is realized in parallel using common PC cluster. In virtual reality world, three-dimensional models can be rotated, zoomed, translated and cut interactively and conveniently through the control panel built on virtual reality modeling language (VRML). Experimental results demonstrate that this method provides promising and real-time results for playing the role in of a good assistant in making clinical diagnosis.
Parallel-hierarchical processing and classification of laser beam profile images based on the GPU-oriented architecture

NASA Astrophysics Data System (ADS)

Yarovyi, Andrii A.; Timchenko, Leonid I.; Kozhemiako, Volodymyr P.; Kokriatskaia, Nataliya I.; Hamdi, Rami R.; Savchuk, Tamara O.; Kulyk, Oleksandr O.; Surtel, Wojciech; Amirgaliyev, Yedilkhan; Kashaganova, Gulzhan

2017-08-01

The paper deals with a problem of insufficient productivity of existing computer means for large image processing, which do not meet modern requirements posed by resource-intensive computing tasks of laser beam profiling. The research concentrated on one of the profiling problems, namely, real-time processing of spot images of the laser beam profile. Development of a theory of parallel-hierarchic transformation allowed to produce models for high-performance parallel-hierarchical processes, as well as algorithms and software for their implementation based on the GPU-oriented architecture using GPGPU technologies. The analyzed performance of suggested computerized tools for processing and classification of laser beam profile images allows to perform real-time processing of dynamic images of various sizes.
Computationally efficient algorithm for high sampling-frequency operation of active noise control

NASA Astrophysics Data System (ADS)

Rout, Nirmal Kumar; Das, Debi Prasad; Panda, Ganapati

2015-05-01

In high sampling-frequency operation of active noise control (ANC) system the length of the secondary path estimate and the ANC filter are very long. This increases the computational complexity of the conventional filtered-x least mean square (FXLMS) algorithm. To reduce the computational complexity of long order ANC system using FXLMS algorithm, frequency domain block ANC algorithms have been proposed in past. These full block frequency domain ANC algorithms are associated with some disadvantages such as large block delay, quantization error due to computation of large size transforms and implementation difficulties in existing low-end DSP hardware. To overcome these shortcomings, the partitioned block ANC algorithm is newly proposed where the long length filters in ANC are divided into a number of equal partitions and suitably assembled to perform the FXLMS algorithm in the frequency domain. The complexity of this proposed frequency domain partitioned block FXLMS (FPBFXLMS) algorithm is quite reduced compared to the conventional FXLMS algorithm. It is further reduced by merging one fast Fourier transform (FFT)-inverse fast Fourier transform (IFFT) combination to derive the reduced structure FPBFXLMS (RFPBFXLMS) algorithm. Computational complexity analysis for different orders of filter and partition size are presented. Systematic computer simulations are carried out for both the proposed partitioned block ANC algorithms to show its accuracy compared to the time domain FXLMS algorithm.
GPU-based ultra-fast dose calculation using a finite size pencil beam model.

PubMed

Gu, Xuejun; Choi, Dongju; Men, Chunhua; Pan, Hubert; Majumdar, Amitava; Jiang, Steve B

2009-10-21

Online adaptive radiation therapy (ART) is an attractive concept that promises the ability to deliver an optimal treatment in response to the inter-fraction variability in patient anatomy. However, it has yet to be realized due to technical limitations. Fast dose deposit coefficient calculation is a critical component of the online planning process that is required for plan optimization of intensity-modulated radiation therapy (IMRT). Computer graphics processing units (GPUs) are well suited to provide the requisite fast performance for the data-parallel nature of dose calculation. In this work, we develop a dose calculation engine based on a finite-size pencil beam (FSPB) algorithm and a GPU parallel computing framework. The developed framework can accommodate any FSPB model. We test our implementation in the case of a water phantom and the case of a prostate cancer patient with varying beamlet and voxel sizes. All testing scenarios achieved speedup ranging from 200 to 400 times when using a NVIDIA Tesla C1060 card in comparison with a 2.27 GHz Intel Xeon CPU. The computational time for calculating dose deposition coefficients for a nine-field prostate IMRT plan with this new framework is less than 1 s. This indicates that the GPU-based FSPB algorithm is well suited for online re-planning for adaptive radiotherapy.

New approach to evaluate rotation of cervical vertebrae

NASA Astrophysics Data System (ADS)

Hahn, Matthias

2001-07-01

Functional deficits after whiplash injury can be analyzed with a quite novel radiologic method by examination of joint-blocks in C0/1 and C1/2. Thereto the movability of C0, C1 and C2 is determined with three spiral CT-scans of the patient's cervical spine. One series in neutral and one in maximal active lateral right and left rotation each. Previous methods were slice based and time consuming when manually evaluated. We propose a new approach to a computation of these angles in 3D. After a threshold segmentation of bone tissue, a rough 2D classification takes place for C0, C1 and C2 in each rotation series. The center of an axial rotation for each vertebra is gained from the approximation of its center of gravity. The rotation itself is estimated by a cross-correlation of the radial distance functions. From the previous rotation the results are taken to initialize a 3D matching algorithm based on the sum of squared differences in intensity. The optimal match of the vertebrae is computed by means of the multidimensional Powell minimization algorithm. The three translational and three rotational components build a six-dimensional search-space. The vertebrae detection and rotation computation is done fully automatic.
Smartphones as image processing systems for prosthetic vision.

PubMed

Zapf, Marc P; Matteucci, Paul B; Lovell, Nigel H; Suaning, Gregg J

2013-01-01

The feasibility of implants for prosthetic vision has been demonstrated by research and commercial organizations. In most devices, an essential forerunner to the internal stimulation circuit is an external electronics solution for capturing, processing and relaying image information as well as extracting useful features from the scene surrounding the patient. The capabilities and multitude of image processing algorithms that can be performed by the device in real-time plays a major part in the final quality of the prosthetic vision. It is therefore optimal to use powerful hardware yet to avoid bulky, straining solutions. Recent publications have reported of portable single-board computers fast enough for computationally intensive image processing. Following the rapid evolution of commercial, ultra-portable ARM (Advanced RISC machine) mobile devices, the authors investigated the feasibility of modern smartphones running complex face detection as external processing devices for vision implants. The role of dedicated graphics processors in speeding up computation was evaluated while performing a demanding noise reduction algorithm (image denoising). The time required for face detection was found to decrease by 95% from 2.5 year old to recent devices. In denoising, graphics acceleration played a major role, speeding up denoising by a factor of 18. These results demonstrate that the technology has matured sufficiently to be considered as a valid external electronics platform for visual prosthetic research.
Failure probability analysis of optical grid

NASA Astrophysics Data System (ADS)

Zhong, Yaoquan; Guo, Wei; Sun, Weiqiang; Jin, Yaohui; Hu, Weisheng

2008-11-01

Optical grid, the integrated computing environment based on optical network, is expected to be an efficient infrastructure to support advanced data-intensive grid applications. In optical grid, the faults of both computational and network resources are inevitable due to the large scale and high complexity of the system. With the optical network based distributed computing systems extensive applied in the processing of data, the requirement of the application failure probability have been an important indicator of the quality of application and an important aspect the operators consider. This paper will present a task-based analysis method of the application failure probability in optical grid. Then the failure probability of the entire application can be quantified, and the performance of reducing application failure probability in different backup strategies can be compared, so that the different requirements of different clients can be satisfied according to the application failure probability respectively. In optical grid, when the application based DAG (directed acyclic graph) is executed in different backup strategies, the application failure probability and the application complete time is different. This paper will propose new multi-objective differentiated services algorithm (MDSA). New application scheduling algorithm can guarantee the requirement of the failure probability and improve the network resource utilization, realize a compromise between the network operator and the application submission. Then differentiated services can be achieved in optical grid.
Fast processing of microscopic images using object-based extended depth of field.

PubMed

Intarapanich, Apichart; Kaewkamnerd, Saowaluck; Pannarut, Montri; Shaw, Philip J; Tongsima, Sissades

2016-12-22

Microscopic analysis requires that foreground objects of interest, e.g. cells, are in focus. In a typical microscopic specimen, the foreground objects may lie on different depths of field necessitating capture of multiple images taken at different focal planes. The extended depth of field (EDoF) technique is a computational method for merging images from different depths of field into a composite image with all foreground objects in focus. Composite images generated by EDoF can be applied in automated image processing and pattern recognition systems. However, current algorithms for EDoF are computationally intensive and impractical, especially for applications such as medical diagnosis where rapid sample turnaround is important. Since foreground objects typically constitute a minor part of an image, the EDoF technique could be made to work much faster if only foreground regions are processed to make the composite image. We propose a novel algorithm called object-based extended depths of field (OEDoF) to address this issue. The OEDoF algorithm consists of four major modules: 1) color conversion, 2) object region identification, 3) good contrast pixel identification and 4) detail merging. First, the algorithm employs color conversion to enhance contrast followed by identification of foreground pixels. A composite image is constructed using only these foreground pixels, which dramatically reduces the computational time. We used 250 images obtained from 45 specimens of confirmed malaria infections to test our proposed algorithm. The resulting composite images with all in-focus objects were produced using the proposed OEDoF algorithm. We measured the performance of OEDoF in terms of image clarity (quality) and processing time. The features of interest selected by the OEDoF algorithm are comparable in quality with equivalent regions in images processed by the state-of-the-art complex wavelet EDoF algorithm; however, OEDoF required four times less processing time. This work presents a modification of the extended depth of field approach for efficiently enhancing microscopic images. This selective object processing scheme used in OEDoF can significantly reduce the overall processing time while maintaining the clarity of important image features. The empirical results from parasite-infected red cell images revealed that our proposed method efficiently and effectively produced in-focus composite images. With the speed improvement of OEDoF, this proposed algorithm is suitable for processing large numbers of microscope images, e.g., as required for medical diagnosis.
A Parallel Compact Multi-Dimensional Numerical Algorithm with Aeroacoustics Applications

NASA Technical Reports Server (NTRS)

Povitsky, Alex; Morris, Philip J.

1999-01-01

In this study we propose a novel method to parallelize high-order compact numerical algorithms for the solution of three-dimensional PDEs (Partial Differential Equations) in a space-time domain. For this numerical integration most of the computer time is spent in computation of spatial derivatives at each stage of the Runge-Kutta temporal update. The most efficient direct method to compute spatial derivatives on a serial computer is a version of Gaussian elimination for narrow linear banded systems known as the Thomas algorithm. In a straightforward pipelined implementation of the Thomas algorithm processors are idle due to the forward and backward recurrences of the Thomas algorithm. To utilize processors during this time, we propose to use them for either non-local data independent computations, solving lines in the next spatial direction, or local data-dependent computations by the Runge-Kutta method. To achieve this goal, control of processor communication and computations by a static schedule is adopted. Thus, our parallel code is driven by a communication and computation schedule instead of the usual "creative, programming" approach. The obtained parallelization speed-up of the novel algorithm is about twice as much as that for the standard pipelined algorithm and close to that for the explicit DRP algorithm.
A robust algorithm for automated target recognition using precomputed radar cross sections

NASA Astrophysics Data System (ADS)

Ehrman, Lisa M.; Lanterman, Aaron D.

2004-09-01

Passive radar is an emerging technology that offers a number of unique benefits, including covert operation. Many such systems are already capable of detecting and tracking aircraft. The goal of this work is to develop a robust algorithm for adding automated target recognition (ATR) capabilities to existing passive radar systems. In previous papers, we proposed conducting ATR by comparing the precomputed RCS of known targets to that of detected targets. To make the precomputed RCS as accurate as possible, a coordinated flight model is used to estimate aircraft orientation. Once the aircraft's position and orientation are known, it is possible to determine the incident and observed angles on the aircraft, relative to the transmitter and receiver. This makes it possible to extract the appropriate radar cross section (RCS) from our simulated database. This RCS is then scaled to account for propagation losses and the receiver's antenna gain. A Rician likelihood model compares these expected signals from different targets to the received target profile. We have previously employed Monte Carlo runs to gauge the probability of error in the ATR algorithm; however, generation of a statistically significant set of Monte Carlo runs is computationally intensive. As an alternative to Monte Carlo runs, we derive the relative entropy (also known as Kullback-Liebler distance) between two Rician distributions. Since the probability of Type II error in our hypothesis testing problem can be expressed as a function of the relative entropy via Stein's Lemma, this provides us with a computationally efficient method for determining an upper bound on our algorithm's performance. It also provides great insight into the types of classification errors we can expect from our algorithm. This paper compares the numerically approximated probability of Type II error with the results obtained from a set of Monte Carlo runs.
Normalized gradient fields cross-correlation for automated detection of prostate in magnetic resonance images

NASA Astrophysics Data System (ADS)

Fotin, Sergei V.; Yin, Yin; Periaswamy, Senthil; Kunz, Justin; Haldankar, Hrishikesh; Muradyan, Naira; Cornud, François; Turkbey, Baris; Choyke, Peter L.

2012-02-01

Fully automated prostate segmentation helps to address several problems in prostate cancer diagnosis and treatment: it can assist in objective evaluation of multiparametric MR imagery, provides a prostate contour for MR-ultrasound (or CT) image fusion for computer-assisted image-guided biopsy or therapy planning, may facilitate reporting and enables direct prostate volume calculation. Among the challenges in automated analysis of MR images of the prostate are the variations of overall image intensities across scanners, the presence of nonuniform multiplicative bias field within scans and differences in acquisition setup. Furthermore, images acquired with the presence of an endorectal coil suffer from localized high-intensity artifacts at the posterior part of the prostate. In this work, a three-dimensional method for fast automated prostate detection based on normalized gradient fields cross-correlation, insensitive to intensity variations and coil-induced artifacts, is presented and evaluated. The components of the method, offline template learning and the localization algorithm, are described in detail. The method was validated on a dataset of 522 T2-weighted MR images acquired at the National Cancer Institute, USA that was split in two halves for development and testing. In addition, second dataset of 29 MR exams from Centre d'Imagerie Médicale Tourville, France were used to test the algorithm. The 95% confidence intervals for the mean Euclidean distance between automatically and manually identified prostate centroids were 4.06 +/- 0.33 mm and 3.10 +/- 0.43 mm for the first and second test datasets respectively. Moreover, the algorithm provided the centroid within the true prostate volume in 100% of images from both datasets. Obtained results demonstrate high utility of the detection method for a fully automated prostate segmentation.
A parallel Jacobson-Oksman optimization algorithm. [parallel processing (computers)

NASA Technical Reports Server (NTRS)

Straeter, T. A.; Markos, A. T.

1975-01-01

A gradient-dependent optimization technique which exploits the vector-streaming or parallel-computing capabilities of some modern computers is presented. The algorithm, derived by assuming that the function to be minimized is homogeneous, is a modification of the Jacobson-Oksman serial minimization method. In addition to describing the algorithm, conditions insuring the convergence of the iterates of the algorithm and the results of numerical experiments on a group of sample test functions are presented. The results of these experiments indicate that this algorithm will solve optimization problems in less computing time than conventional serial methods on machines having vector-streaming or parallel-computing capabilities.
Computational Ghost Imaging for Remote Sensing

NASA Technical Reports Server (NTRS)

Erkmen, Baris I.

2012-01-01

This work relates to the generic problem of remote active imaging; that is, a source illuminates a target of interest and a receiver collects the scattered light off the target to obtain an image. Conventional imaging systems consist of an imaging lens and a high-resolution detector array [e.g., a CCD (charge coupled device) array] to register the image. However, conventional imaging systems for remote sensing require high-quality optics and need to support large detector arrays and associated electronics. This results in suboptimal size, weight, and power consumption. Computational ghost imaging (CGI) is a computational alternative to this traditional imaging concept that has a very simple receiver structure. In CGI, the transmitter illuminates the target with a modulated light source. A single-pixel (bucket) detector collects the scattered light. Then, via computation (i.e., postprocessing), the receiver can reconstruct the image using the knowledge of the modulation that was projected onto the target by the transmitter. This way, one can construct a very simple receiver that, in principle, requires no lens to image a target. Ghost imaging is a transverse imaging modality that has been receiving much attention owing to a rich interconnection of novel physical characteristics and novel signal processing algorithms suitable for active computational imaging. The original ghost imaging experiments consisted of two correlated optical beams traversing distinct paths and impinging on two spatially-separated photodetectors: one beam interacts with the target and then illuminates on a single-pixel (bucket) detector that provides no spatial resolution, whereas the other beam traverses an independent path and impinges on a high-resolution camera without any interaction with the target. The term ghost imaging was coined soon after the initial experiments were reported, to emphasize the fact that by cross-correlating two photocurrents, one generates an image of the target. In CGI, the measurement obtained from the reference arm (with the high-resolution detector) is replaced by a computational derivation of the measurement-plane intensity profile of the reference-arm beam. The algorithms applied to computational ghost imaging have diversified beyond simple correlation measurements, and now include modern reconstruction algorithms based on compressive sensing.
Development of a computer-based clinical decision support tool for selecting appropriate rehabilitation interventions for injured workers.

PubMed

Gross, Douglas P; Zhang, Jing; Steenstra, Ivan; Barnsley, Susan; Haws, Calvin; Amell, Tyler; McIntosh, Greg; Cooper, Juliette; Zaiane, Osmar

2013-12-01

To develop a classification algorithm and accompanying computer-based clinical decision support tool to help categorize injured workers toward optimal rehabilitation interventions based on unique worker characteristics. Population-based historical cohort design. Data were extracted from a Canadian provincial workers' compensation database on all claimants undergoing work assessment between December 2009 and January 2011. Data were available on: (1) numerous personal, clinical, occupational, and social variables; (2) type of rehabilitation undertaken; and (3) outcomes following rehabilitation (receiving time loss benefits or undergoing repeat programs). Machine learning, concerned with the design of algorithms to discriminate between classes based on empirical data, was the foundation of our approach to build a classification system with multiple independent and dependent variables. The population included 8,611 unique claimants. Subjects were predominantly employed (85 %) males (64 %) with diagnoses of sprain/strain (44 %). Baseline clinician classification accuracy was high (ROC = 0.86) for selecting programs that lead to successful return-to-work. Classification performance for machine learning techniques outperformed the clinician baseline classification (ROC = 0.94). The final classifiers were multifactorial and included the variables: injury duration, occupation, job attachment status, work status, modified work availability, pain intensity rating, self-rated occupational disability, and 9 items from the SF-36 Health Survey. The use of machine learning classification techniques appears to have resulted in classification performance better than clinician decision-making. The final algorithm has been integrated into a computer-based clinical decision support tool that requires additional validation in a clinical sample.
FPGA-accelerated adaptive optics wavefront control

NASA Astrophysics Data System (ADS)

Mauch, S.; Reger, J.; Reinlein, C.; Appelfelder, M.; Goy, M.; Beckert, E.; Tünnermann, A.

2014-03-01

The speed of real-time adaptive optical systems is primarily restricted by the data processing hardware and computational aspects. Furthermore, the application of mirror layouts with increasing numbers of actuators reduces the bandwidth (speed) of the system and, thus, the number of applicable control algorithms. This burden turns out a key-impediment for deformable mirrors with continuous mirror surface and highly coupled actuator influence functions. In this regard, specialized hardware is necessary for high performance real-time control applications. Our approach to overcome this challenge is an adaptive optics system based on a Shack-Hartmann wavefront sensor (SHWFS) with a CameraLink interface. The data processing is based on a high performance Intel Core i7 Quadcore hard real-time Linux system. Employing a Xilinx Kintex-7 FPGA, an own developed PCie card is outlined in order to accelerate the analysis of a Shack-Hartmann Wavefront Sensor. A recently developed real-time capable spot detection algorithm evaluates the wavefront. The main features of the presented system are the reduction of latency and the acceleration of computation For example, matrix multiplications which in general are of complexity O(n3 are accelerated by using the DSP48 slices of the field-programmable gate array (FPGA) as well as a novel hardware implementation of the SHWFS algorithm. Further benefits are the Streaming SIMD Extensions (SSE) which intensively use the parallelization capability of the processor for further reducing the latency and increasing the bandwidth of the closed-loop. Due to this approach, up to 64 actuators of a deformable mirror can be handled and controlled without noticeable restriction from computational burdens.
Algorithmic complexity of quantum capacity

NASA Astrophysics Data System (ADS)

Oskouei, Samad Khabbazi; Mancini, Stefano

2018-04-01

We analyze the notion of quantum capacity from the perspective of algorithmic (descriptive) complexity. To this end, we resort to the concept of semi-computability in order to describe quantum states and quantum channel maps. We introduce algorithmic entropies (like algorithmic quantum coherent information) and derive relevant properties for them. Then we show that quantum capacity based on semi-computable concept equals the entropy rate of algorithmic coherent information, which in turn equals the standard quantum capacity. Thanks to this, we finally prove that the quantum capacity, for a given semi-computable channel, is limit computable.
Efficient conjugate gradient algorithms for computation of the manipulator forward dynamics

NASA Technical Reports Server (NTRS)

Fijany, Amir; Scheid, Robert E.

1989-01-01

The applicability of conjugate gradient algorithms for computation of the manipulator forward dynamics is investigated. The redundancies in the previously proposed conjugate gradient algorithm are analyzed. A new version is developed which, by avoiding these redundancies, achieves a significantly greater efficiency. A preconditioned conjugate gradient algorithm is also presented. A diagonal matrix whose elements are the diagonal elements of the inertia matrix is proposed as the preconditioner. In order to increase the computational efficiency, an algorithm is developed which exploits the synergism between the computation of the diagonal elements of the inertia matrix and that required by the conjugate gradient algorithm.
A Bootstrap Metropolis-Hastings Algorithm for Bayesian Analysis of Big Data.

PubMed

Liang, Faming; Kim, Jinsu; Song, Qifan

2016-01-01

Markov chain Monte Carlo (MCMC) methods have proven to be a very powerful tool for analyzing data of complex structures. However, their computer-intensive nature, which typically require a large number of iterations and a complete scan of the full dataset for each iteration, precludes their use for big data analysis. In this paper, we propose the so-called bootstrap Metropolis-Hastings (BMH) algorithm, which provides a general framework for how to tame powerful MCMC methods to be used for big data analysis; that is to replace the full data log-likelihood by a Monte Carlo average of the log-likelihoods that are calculated in parallel from multiple bootstrap samples. The BMH algorithm possesses an embarrassingly parallel structure and avoids repeated scans of the full dataset in iterations, and is thus feasible for big data problems. Compared to the popular divide-and-combine method, BMH can be generally more efficient as it can asymptotically integrate the whole data information into a single simulation run. The BMH algorithm is very flexible. Like the Metropolis-Hastings algorithm, it can serve as a basic building block for developing advanced MCMC algorithms that are feasible for big data problems. This is illustrated in the paper by the tempering BMH algorithm, which can be viewed as a combination of parallel tempering and the BMH algorithm. BMH can also be used for model selection and optimization by combining with reversible jump MCMC and simulated annealing, respectively.
An ATR architecture for algorithm development and testing

NASA Astrophysics Data System (ADS)

Breivik, Gøril M.; Løkken, Kristin H.; Brattli, Alvin; Palm, Hans C.; Haavardsholm, Trym

2013-05-01

A research platform with four cameras in the infrared and visible spectral domains is under development at the Norwegian Defence Research Establishment (FFI). The platform will be mounted on a high-speed jet aircraft and will primarily be used for image acquisition and for development and test of automatic target recognition (ATR) algorithms. The sensors on board produce large amounts of data, the algorithms can be computationally intensive and the data processing is complex. This puts great demands on the system architecture; it has to run in real-time and at the same time be suitable for algorithm development. In this paper we present an architecture for ATR systems that is designed to be exible, generic and efficient. The architecture is module based so that certain parts, e.g. specific ATR algorithms, can be exchanged without affecting the rest of the system. The modules are generic and can be used in various ATR system configurations. A software framework in C++ that handles large data ows in non-linear pipelines is used for implementation. The framework exploits several levels of parallelism and lets the hardware processing capacity be fully utilised. The ATR system is under development and has reached a first level that can be used for segmentation algorithm development and testing. The implemented system consists of several modules, and although their content is still limited, the segmentation module includes two different segmentation algorithms that can be easily exchanged. We demonstrate the system by applying the two segmentation algorithms to infrared images from sea trial recordings.
A Bootstrap Metropolis–Hastings Algorithm for Bayesian Analysis of Big Data

PubMed Central

Kim, Jinsu; Song, Qifan

2016-01-01

Markov chain Monte Carlo (MCMC) methods have proven to be a very powerful tool for analyzing data of complex structures. However, their computer-intensive nature, which typically require a large number of iterations and a complete scan of the full dataset for each iteration, precludes their use for big data analysis. In this paper, we propose the so-called bootstrap Metropolis-Hastings (BMH) algorithm, which provides a general framework for how to tame powerful MCMC methods to be used for big data analysis; that is to replace the full data log-likelihood by a Monte Carlo average of the log-likelihoods that are calculated in parallel from multiple bootstrap samples. The BMH algorithm possesses an embarrassingly parallel structure and avoids repeated scans of the full dataset in iterations, and is thus feasible for big data problems. Compared to the popular divide-and-combine method, BMH can be generally more efficient as it can asymptotically integrate the whole data information into a single simulation run. The BMH algorithm is very flexible. Like the Metropolis-Hastings algorithm, it can serve as a basic building block for developing advanced MCMC algorithms that are feasible for big data problems. This is illustrated in the paper by the tempering BMH algorithm, which can be viewed as a combination of parallel tempering and the BMH algorithm. BMH can also be used for model selection and optimization by combining with reversible jump MCMC and simulated annealing, respectively. PMID:29033469
a Threshold-Free Filtering Algorithm for Airborne LIDAR Point Clouds Based on Expectation-Maximization

NASA Astrophysics Data System (ADS)

Hui, Z.; Cheng, P.; Ziggah, Y. Y.; Nie, Y.

2018-04-01

Filtering is a key step for most applications of airborne LiDAR point clouds. Although lots of filtering algorithms have been put forward in recent years, most of them suffer from parameters setting or thresholds adjusting, which will be time-consuming and reduce the degree of automation of the algorithm. To overcome this problem, this paper proposed a threshold-free filtering algorithm based on expectation-maximization. The proposed algorithm is developed based on an assumption that point clouds are seen as a mixture of Gaussian models. The separation of ground points and non-ground points from point clouds can be replaced as a separation of a mixed Gaussian model. Expectation-maximization (EM) is applied for realizing the separation. EM is used to calculate maximum likelihood estimates of the mixture parameters. Using the estimated parameters, the likelihoods of each point belonging to ground or object can be computed. After several iterations, point clouds can be labelled as the component with a larger likelihood. Furthermore, intensity information was also utilized to optimize the filtering results acquired using the EM method. The proposed algorithm was tested using two different datasets used in practice. Experimental results showed that the proposed method can filter non-ground points effectively. To quantitatively evaluate the proposed method, this paper adopted the dataset provided by the ISPRS for the test. The proposed algorithm can obtain a 4.48 % total error which is much lower than most of the eight classical filtering algorithms reported by the ISPRS.
Camera-enabled techniques for organic synthesis

PubMed Central

Ingham, Richard J; O’Brien, Matthew; Browne, Duncan L

2013-01-01

Summary A great deal of time is spent within synthetic chemistry laboratories on non-value-adding activities such as sample preparation and work-up operations, and labour intensive activities such as extended periods of continued data collection. Using digital cameras connected to computer vision algorithms, camera-enabled apparatus can perform some of these processes in an automated fashion, allowing skilled chemists to spend their time more productively. In this review we describe recent advances in this field of chemical synthesis and discuss how they will lead to advanced synthesis laboratories of the future. PMID:23766820
Reducing Speckle In One-Look SAR Images

NASA Technical Reports Server (NTRS)

Nathan, K. S.; Curlander, J. C.

1990-01-01

Local-adaptive-filter algorithm incorporated into digital processing of synthetic-aperture-radar (SAR) echo data to reduce speckle in resulting imagery. Involves use of image statistics in vicinity of each picture element, in conjunction with original intensity of element, to estimate brightness more nearly proportional to true radar reflectance of corresponding target. Increases ratio of signal to speckle noise without substantial degradation of resolution common to multilook SAR images. Adapts to local variations of statistics within scene, preserving subtle details. Computationally simple. Lends itself to parallel processing of different segments of image, making possible increased throughput.
A divide-and-conquer algorithm for large-scale de novo transcriptome assembly through combining small assemblies from existing algorithms.

PubMed

Sze, Sing-Hoi; Parrott, Jonathan J; Tarone, Aaron M

2017-12-06

While the continued development of high-throughput sequencing has facilitated studies of entire transcriptomes in non-model organisms, the incorporation of an increasing amount of RNA-Seq libraries has made de novo transcriptome assembly difficult. Although algorithms that can assemble a large amount of RNA-Seq data are available, they are generally very memory-intensive and can only be used to construct small assemblies. We develop a divide-and-conquer strategy that allows these algorithms to be utilized, by subdividing a large RNA-Seq data set into small libraries. Each individual library is assembled independently by an existing algorithm, and a merging algorithm is developed to combine these assemblies by picking a subset of high quality transcripts to form a large transcriptome. When compared to existing algorithms that return a single assembly directly, this strategy achieves comparable or increased accuracy as memory-efficient algorithms that can be used to process a large amount of RNA-Seq data, and comparable or decreased accuracy as memory-intensive algorithms that can only be used to construct small assemblies. Our divide-and-conquer strategy allows memory-intensive de novo transcriptome assembly algorithms to be utilized to construct large assemblies.

Reconfigurable vision system for real-time applications

NASA Astrophysics Data System (ADS)

Torres-Huitzil, Cesar; Arias-Estrada, Miguel

2002-03-01

Recently, a growing community of researchers has used reconfigurable systems to solve computationally intensive problems. Reconfigurability provides optimized processors for systems on chip designs, and makes easy to import technology to a new system through reusable modules. The main objective of this work is the investigation of a reconfigurable computer system targeted for computer vision and real-time applications. The system is intended to circumvent the inherent computational load of most window-based computer vision algorithms. It aims to build a system for such tasks by providing an FPGA-based hardware architecture for task specific vision applications with enough processing power, using the minimum amount of hardware resources as possible, and a mechanism for building systems using this architecture. Regarding the software part of the system, a library of pre-designed and general-purpose modules that implement common window-based computer vision operations is being investigated. A common generic interface is established for these modules in order to define hardware/software components. These components can be interconnected to develop more complex applications, providing an efficient mechanism for transferring image and result data among modules. Some preliminary results are presented and discussed.
Sparse-view proton computed tomography using modulated proton beams.

PubMed

Lee, Jiseoc; Kim, Changhwan; Min, Byungjun; Kwak, Jungwon; Park, Seyjoon; Lee, Se Byeong; Park, Sungyong; Cho, Seungryong

2015-02-01

Proton imaging that uses a modulated proton beam and an intensity detector allows a relatively fast image acquisition compared to the imaging approach based on a trajectory tracking detector. In addition, it requires a relatively simple implementation in a conventional proton therapy equipment. The model of geometric straight ray assumed in conventional computed tomography (CT) image reconstruction is however challenged by multiple-Coulomb scattering and energy straggling in the proton imaging. Radiation dose to the patient is another important issue that has to be taken care of for practical applications. In this work, the authors have investigated iterative image reconstructions after a deconvolution of the sparsely view-sampled data to address these issues in proton CT. Proton projection images were acquired using the modulated proton beams and the EBT2 film as an intensity detector. Four electron-density cylinders representing normal soft tissues and bone were used as imaged object and scanned at 40 views that are equally separated over 360°. Digitized film images were converted to water-equivalent thickness by use of an empirically derived conversion curve. For improving the image quality, a deconvolution-based image deblurring with an empirically acquired point spread function was employed. They have implemented iterative image reconstruction algorithms such as adaptive steepest descent-projection onto convex sets (ASD-POCS), superiorization method-projection onto convex sets (SM-POCS), superiorization method-expectation maximization (SM-EM), and expectation maximization-total variation minimization (EM-TV). Performance of the four image reconstruction algorithms was analyzed and compared quantitatively via contrast-to-noise ratio (CNR) and root-mean-square-error (RMSE). Objects of higher electron density have been reconstructed more accurately than those of lower density objects. The bone, for example, has been reconstructed within 1% error. EM-based algorithms produced an increased image noise and RMSE as the iteration reaches about 20, while the POCS-based algorithms showed a monotonic convergence with iterations. The ASD-POCS algorithm outperformed the others in terms of CNR, RMSE, and the accuracy of the reconstructed relative stopping power in the region of lung and soft tissues. The four iterative algorithms, i.e., ASD-POCS, SM-POCS, SM-EM, and EM-TV, have been developed and applied for proton CT image reconstruction. Although it still seems that the images need to be improved for practical applications to the treatment planning, proton CT imaging by use of the modulated beams in sparse-view sampling has demonstrated its feasibility.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Shen, Z; Greskovich, J; Xia, P

Purpose: To generate virtual phantoms with clinically relevant deformation and use them to objectively evaluate geometric and dosimetric uncertainties of deformable image registration (DIR) algorithms. Methods: Ten lung cancer patients undergoing adaptive 3DCRT planning were selected. For each patient, a pair of planning CT (pCT) and replanning CT (rCT) were used as the basis for virtual phantom generation. Manually adjusted meshes were created for selected ROIs (e.g. PTV, lungs, spinal cord, esophagus, and heart) on pCT and rCT. The mesh vertices were input into a thin-plate spline algorithm to generate a reference displacement vector field (DVF). The reference DVF wasmore » used to deform pCT to generate a simulated replanning CT (srCT) that was closely matched to rCT. Three DIR algorithms (Demons, B-Spline, and intensity-based) were applied to these ten virtual phantoms. The images, ROIs, and doses were mapped from pCT to srCT using the DVFs computed by these three DIRs and compared to those mapped using the reference DVF. Results: The average Dice coefficients for selected ROIs were from 0.85 to 0.96 for Demons, from 0.86 to 0.97 for intensity-based, and from 0.76 to 0.95 for B-Spline. The average Hausdorff distances for selected ROIs were from 2.2 to 5.4 mm for Demons, from 2.3 to 6.8 mm for intensity-based, and from 2.4 to 11.4 mm for B-Spline. The average absolute dose errors for selected ROIs were from 0.2 to 0.6 Gy for Demons, from 0.1 to 0.5 Gy for intensity-based, and from 0.5 to 1.5 Gy for B-Spline. Conclusion: Virtual phantoms were modeled after patients with lung cancer and were clinically relevant for adaptive radiotherapy treatment replanning. Virtual phantoms with known DVFs serve as references and can provide a fair comparison when evaluating different DIRs. Demons and intensity-based DIRs were shown to have smaller geometric and dosimetric uncertainties than B-Spline. Z Shen: None; K Bzdusek: an employee of Philips Healthcare; J Greskovich: None; P Xia: received research grants from Philips Healthcare and Siemens Healthcare.« less
Evaluation of Demons- and FEM-Based Registration Algorithms for Lung Cancer.

PubMed

Yang, Juan; Li, Dengwang; Yin, Yong; Zhao, Fen; Wang, Hongjun

2016-04-01

We evaluated and compared the accuracy of 2 deformable image registration algorithms in 4-dimensional computed tomography images for patients with lung cancer. Ten patients with non-small cell lung cancer or small cell lung cancer were enrolled in this institutional review board-approved study. The displacement vector fields relative to a specific reference image were calculated by using the diffeomorphic demons (DD) algorithm and the finite element method (FEM)-based algorithm. The registration accuracy was evaluated by using normalized mutual information (NMI), the sum of squared intensity difference (SSD), modified Hausdorff distance (dH_M), and ratio of gross tumor volume (rGTV) difference between reference image and deformed phase image. We also compared the registration speed of the 2 algorithms. Of all patients, the FEM-based algorithm showed stronger ability in aligning 2 images than the DD algorithm. The means (±standard deviation) of NMI were 0.86 (±0.05) and 0.90 (±0.05) using the DD algorithm and the FEM-based algorithm, respectively. The means of SSD were 0.006 (±0.003) and 0.003 (±0.002) using the DD algorithm and the FEM-based algorithm, respectively. The means of dH_M were 0.04 (±0.02) and 0.03 (±0.03) using the DD algorithm and the FEM-based algorithm, respectively. The means of rGTV were 3.9% (±1.01%) and 2.9% (±1.1%) using the DD algorithm and the FEM-based algorithm, respectively. However, the FEM-based algorithm costs a longer time than the DD algorithm, with the average running time of 31.4 minutes compared to 21.9 minutes for all patients. The preliminary results showed that the FEM-based algorithm was more accurate than the DD algorithm while compromised with the registration speed. © The Author(s) 2015.
Optimizing physician access to surgical intensive care unit laboratory information through mobile computing.

PubMed

Strain, J J; Felciano, R M; Seiver, A; Acuff, R; Fagan, L

1996-01-01

Approximately 30 minutes of computer access time are required by surgical residents at Stanford University Medical Center (SUMC) to examine the lab values of all patients on a surgical intensive care unit (ICU) service, a task that must be performed several times a day. To reduce the time accessing this information and simultaneously increase the readability and currency of the data, we have created a mobile, pen-based user interface and software system that delivers lab results to surgeons in the ICU. The ScroungeMaster system, loaded on a portable tablet computer, retrieves lab results for a subset of patients from the central laboratory computer and stores them in a local database cache. The cache can be updated on command; this update takes approximately 2.7 minutes for all ICU patients being followed by the surgeon, and can be performed as a background task while the user continues to access selected lab results. The user interface presents lab results according to physiologic system. Which labs are displayed first is governed by a layout selection algorithm based on previous accesses to the patient's lab information, physician preferences, and the nature of the patient's medical condition. Initial evaluation of the system has shown that physicians prefer the ScroungeMaster interface to that of existing systems at SUMC and are satisfied with the system's performance. We discuss the evolution of ScroungeMaster and make observations on changes to physician work flow with the presence of mobile, pen-based computing in the ICU.
Efficient GW calculations using eigenvalue-eigenvector decomposition of the dielectric matrix

NASA Astrophysics Data System (ADS)

Nguyen, Huy-Viet; Pham, T. Anh; Rocca, Dario; Galli, Giulia

2011-03-01

During the past 25 years, the GW method has been successfully used to compute electronic quasi-particle excitation spectra of a variety of materials. It is however a computationally intensive technique, as it involves summations over occupied and empty electronic states, to evaluate both the Green function (G) and the dielectric matrix (DM) entering the expression of the screened Coulomb interaction (W). Recent developments have shown that eigenpotentials of DMs can be efficiently calculated without any explicit evaluation of empty states. In this work, we will present a computationally efficient approach to the calculations of GW spectra by combining a representation of DMs in terms of its eigenpotentials and a recently developed iterative algorithm. As a demonstration of the efficiency of the method, we will present calculations of the vertical ionization potentials of several systems. Work was funnded by SciDAC-e DE-FC02-06ER25777.
Thermodynamic cost of computation, algorithmic complexity and the information metric

NASA Technical Reports Server (NTRS)

Zurek, W. H.

1989-01-01

Algorithmic complexity is discussed as a computational counterpart to the second law of thermodynamics. It is shown that algorithmic complexity, which is a measure of randomness, sets limits on the thermodynamic cost of computations and casts a new light on the limitations of Maxwell's demon. Algorithmic complexity can also be used to define distance between binary strings.
Parallel Directionally Split Solver Based on Reformulation of Pipelined Thomas Algorithm

NASA Technical Reports Server (NTRS)

Povitsky, A.

1998-01-01

In this research an efficient parallel algorithm for 3-D directionally split problems is developed. The proposed algorithm is based on a reformulated version of the pipelined Thomas algorithm that starts the backward step computations immediately after the completion of the forward step computations for the first portion of lines This algorithm has data available for other computational tasks while processors are idle from the Thomas algorithm. The proposed 3-D directionally split solver is based on the static scheduling of processors where local and non-local, data-dependent and data-independent computations are scheduled while processors are idle. A theoretical model of parallelization efficiency is used to define optimal parameters of the algorithm, to show an asymptotic parallelization penalty and to obtain an optimal cover of a global domain with subdomains. It is shown by computational experiments and by the theoretical model that the proposed algorithm reduces the parallelization penalty about two times over the basic algorithm for the range of the number of processors (subdomains) considered and the number of grid nodes per subdomain.
Parallel algorithms for computation of the manipulator inertia matrix

NASA Technical Reports Server (NTRS)

Amin-Javaheri, Masoud; Orin, David E.

1989-01-01

The development of an O(log2N) parallel algorithm for the manipulator inertia matrix is presented. It is based on the most efficient serial algorithm which uses the composite rigid body method. Recursive doubling is used to reformulate the linear recurrence equations which are required to compute the diagonal elements of the matrix. It results in O(log2N) levels of computation. Computation of the off-diagonal elements involves N linear recurrences of varying-size and a new method, which avoids redundant computation of position and orientation transforms for the manipulator, is developed. The O(log2N) algorithm is presented in both equation and graphic forms which clearly show the parallelism inherent in the algorithm.
Dragonfly: an implementation of the expand-maximize-compress algorithm for single-particle imaging.

PubMed

Ayyer, Kartik; Lan, Ti-Yen; Elser, Veit; Loh, N Duane

2016-08-01

Single-particle imaging (SPI) with X-ray free-electron lasers has the potential to change fundamentally how biomacromolecules are imaged. The structure would be derived from millions of diffraction patterns, each from a different copy of the macromolecule before it is torn apart by radiation damage. The challenges posed by the resultant data stream are staggering: millions of incomplete, noisy and un-oriented patterns have to be computationally assembled into a three-dimensional intensity map and then phase reconstructed. In this paper, the Dragonfly software package is described, based on a parallel implementation of the expand-maximize-compress reconstruction algorithm that is well suited for this task. Auxiliary modules to simulate SPI data streams are also included to assess the feasibility of proposed SPI experiments at the Linac Coherent Light Source, Stanford, California, USA.
Shor's factoring algorithm and modern cryptography. An illustration of the capabilities inherent in quantum computers

NASA Astrophysics Data System (ADS)

Gerjuoy, Edward

2005-06-01

The security of messages encoded via the widely used RSA public key encryption system rests on the enormous computational effort required to find the prime factors of a large number N using classical (conventional) computers. In 1994 Peter Shor showed that for sufficiently large N, a quantum computer could perform the factoring with much less computational effort. This paper endeavors to explain, in a fashion comprehensible to the nonexpert, the RSA encryption protocol; the various quantum computer manipulations constituting the Shor algorithm; how the Shor algorithm performs the factoring; and the precise sense in which a quantum computer employing Shor's algorithm can be said to accomplish the factoring of very large numbers with less computational effort than a classical computer. It is made apparent that factoring N generally requires many successive runs of the algorithm. Our analysis reveals that the probability of achieving a successful factorization on a single run is about twice as large as commonly quoted in the literature.
A continuous scale-space method for the automated placement of spot heights on maps

NASA Astrophysics Data System (ADS)

Rocca, Luigi; Jenny, Bernhard; Puppo, Enrico

2017-12-01

Spot heights and soundings explicitly indicate terrain elevation on cartographic maps. Cartographers have developed design principles for the manual selection, placement, labeling, and generalization of spot height locations, but these processes are work-intensive and expensive. Finding an algorithmic criterion that matches the cartographers' judgment in ranking the significance of features on a terrain is a difficult endeavor. This article proposes a method for the automated selection of spot heights locations representing natural features such as peaks, saddles and depressions. A lifespan of critical points in a continuous scale-space model is employed as the main measure of the importance of features, and an algorithm and a data structure for its computation are described. We also introduce a method for the comparison of algorithmically computed spot height locations with manually produced reference compilations. The new method is compared with two known techniques from the literature. Results show spot height locations that are closer to reference spot heights produced manually by swisstopo cartographers, compared to previous techniques. The introduced method can be applied to elevation models for the creation of topographic and bathymetric maps. It also ranks the importance of extracted spot height locations, which allows for a variation in the size of symbols and labels according to the significance of represented features. The importance ranking could also be useful for adjusting spot height density of zoomable maps in real time.
A robust fingerprint matching algorithm based on compatibility of star structures

NASA Astrophysics Data System (ADS)

Cao, Jia; Feng, Jufu

2009-10-01

In fingerprint verification or identification systems, most minutiae-based matching algorithms suffered from the problems of non-linear distortion and missing or faking minutiae. Local structures such as triangle or k-nearest structure are widely used to reduce the impact of non-linear distortion, but are suffered from missing and faking minutiae. In our proposed method, star structure is used to present local structure. A star structure contains various number of minutiae, thus, it is more robust with missing and faking minutiae. Our method consists of four steps: 1) Constructing star structures at minutia level; 2) Computing similarity score for each structure pair, and eliminating impostor matched pairs which have the low scores. As it is generally assumed that there is only linear distortion in local area, the similarity is defined by rotation and shifting. 3) Voting for remained matched pairs according to the compatibility between them, and eliminating impostor matched pairs which gain few votes. The concept of compatibility is first introduced by Yansong Feng [4], the original definition is only based on triangles. We define the compatibility for star structures to adjust to our proposed algorithm. 4) Computing the matching score, based on the number of matched structures and their voting scores. The score also reflects the fact that, it should get higher score if minutiae match in more intensive areas. Experiments evaluated on FVC 2004 show both effectiveness and efficiency of our methods.
Vehicle monitoring under Vehicular Ad-Hoc Networks (VANET) parameters employing illumination invariant correlation filters for the Pakistan motorway police

NASA Astrophysics Data System (ADS)

Gardezi, A.; Umer, T.; Butt, F.; Young, R. C. D.; Chatwin, C. R.

2016-04-01

A spatial domain optimal trade-off Maximum Average Correlation Height (SPOT-MACH) filter has been previously developed and shown to have advantages over frequency domain implementations in that it can be made locally adaptive to spatial variations in the input image background clutter and normalised for local intensity changes. The main concern for using the SPOT-MACH is its computationally intensive nature. However in the past enhancements techniques were proposed for the SPOT-MACH to make its execution time comparable to its frequency domain counterpart. In this paper a novel approach is discussed which uses VANET parameters coupled with the SPOT-MACH in order to minimise the extensive processing of the large video dataset acquired from the Pakistan motorways surveillance system. The use of VANET parameters gives us an estimation criterion of the flow of traffic on the Pakistan motorway network and acts as a precursor to the training algorithm. The use of VANET in this scenario would contribute heavily towards the computational complexity minimization of the proposed monitoring system.
A hyperspectral X-ray computed tomography system for enhanced material identification

NASA Astrophysics Data System (ADS)

Wu, Xiaomei; Wang, Qian; Ma, Jinlei; Zhang, Wei; Li, Po; Fang, Zheng

2017-08-01

X-ray computed tomography (CT) can distinguish different materials according to their absorption characteristics. The hyperspectral X-ray CT (HXCT) system proposed in the present work reconstructs each voxel according to its X-ray absorption spectral characteristics. In contrast to a dual-energy or multi-energy CT system, HXCT employs cadmium telluride (CdTe) as the x-ray detector, which provides higher spectral resolution and separate spectral lines according to the material's photon-counter working principle. In this paper, a specimen containing ten different polymer materials randomly arranged was adopted for material identification by HXCT. The filtered back-projection algorithm was applied for image and spectral reconstruction. The first step was to sort the individual material components of the specimen according to their cross-sectional image intensity. The second step was to classify materials with similar intensities according to their reconstructed spectral characteristics. The results demonstrated the feasibility of the proposed material identification process and indicated that the proposed HXCT system has good prospects for a wide range of biomedical and industrial nondestructive testing applications.
Iterative reconstruction of volumetric particle distribution

NASA Astrophysics Data System (ADS)

Wieneke, Bernhard

2013-02-01

For tracking the motion of illuminated particles in space and time several volumetric flow measurement techniques are available like 3D-particle tracking velocimetry (3D-PTV) recording images from typically three to four viewing directions. For higher seeding densities and the same experimental setup, tomographic PIV (Tomo-PIV) reconstructs voxel intensities using an iterative tomographic reconstruction algorithm (e.g. multiplicative algebraic reconstruction technique, MART) followed by cross-correlation of sub-volumes computing instantaneous 3D flow fields on a regular grid. A novel hybrid algorithm is proposed here that similar to MART iteratively reconstructs 3D-particle locations by comparing the recorded images with the projections calculated from the particle distribution in the volume. But like 3D-PTV, particles are represented by 3D-positions instead of voxel-based intensity blobs as in MART. Detailed knowledge of the optical transfer function and the particle image shape is mandatory, which may differ for different positions in the volume and for each camera. Using synthetic data it is shown that this method is capable of reconstructing densely seeded flows up to about 0.05 ppp with similar accuracy as Tomo-PIV. Finally the method is validated with experimental data.
Sub-pixel localisation of passive micro-coil fiducial markers in interventional MRI.

PubMed

Rea, Marc; McRobbie, Donald; Elhawary, Haytham; Tse, Zion T H; Lamperth, Michael; Young, Ian

2009-04-01

Electromechanical devices enable increased accuracy in surgical procedures, and the recent development of MRI-compatible mechatronics permits the use of MRI for real-time image guidance. Integrated imaging of resonant micro-coil fiducials provides an accurate method of tracking devices in a scanner with increased flexibility compared to gradient tracking. Here we report on the ability of ten different image-processing algorithms to track micro-coil fiducials with sub-pixel accuracy. Five algorithms: maximum pixel, barycentric weighting, linear interpolation, quadratic fitting and Gaussian fitting were applied both directly to the pixel intensity matrix and to the cross-correlation matrix obtained by 2D convolution with a reference image. Using images of a 3 mm fiducial marker and a pixel size of 1.1 mm, intensity linear interpolation, which calculates the position of the fiducial centre by interpolating the pixel data to find the fiducial edges, was found to give the best performance for minimal computing power; a maximum error of 0.22 mm was observed in fiducial localisation for displacements up to 40 mm. The inherent standard deviation of fiducial localisation was 0.04 mm. This work enables greater accuracy to be achieved in passive fiducial tracking.
System calibration method for Fourier ptychographic microscopy

NASA Astrophysics Data System (ADS)

Pan, An; Zhang, Yan; Zhao, Tianyu; Wang, Zhaojun; Dan, Dan; Lei, Ming; Yao, Baoli

2017-09-01

Fourier ptychographic microscopy (FPM) is a recently proposed computational imaging technique with both high-resolution and wide field of view. In current FPM imaging platforms, systematic error sources come from aberrations, light-emitting diode (LED) intensity fluctuation, parameter imperfections, and noise, all of which may severely corrupt the reconstruction results with similar artifacts. Therefore, it would be unlikely to distinguish the dominating error from these degraded reconstructions without any preknowledge. In addition, systematic error is generally a mixture of various error sources in the real situation, and it cannot be separated due to their mutual restriction and conversion. To this end, we report a system calibration procedure, termed SC-FPM, to calibrate the mixed systematic errors simultaneously from an overall perspective, based on the simulated annealing algorithm, the LED intensity correction method, the nonlinear regression process, and the adaptive step-size strategy, which involves the evaluation of an error metric at each iteration step, followed by the re-estimation of accurate parameters. The performance achieved both in simulations and experiments demonstrates that the proposed method outperforms other state-of-the-art algorithms. The reported system calibration scheme improves the robustness of FPM, relaxes the experiment conditions, and does not require any preknowledge, which makes the FPM more pragmatic.
CBESW: sequence alignment on the Playstation 3.

PubMed

Wirawan, Adrianto; Kwoh, Chee Keong; Hieu, Nim Tri; Schmidt, Bertil

2008-09-17

The exponential growth of available biological data has caused bioinformatics to be rapidly moving towards a data-intensive, computational science. As a result, the computational power needed by bioinformatics applications is growing exponentially as well. The recent emergence of accelerator technologies has made it possible to achieve an excellent improvement in execution time for many bioinformatics applications, compared to current general-purpose platforms. In this paper, we demonstrate how the PlayStation 3, powered by the Cell Broadband Engine, can be used as a computational platform to accelerate the Smith-Waterman algorithm. For large datasets, our implementation on the PlayStation 3 provides a significant improvement in running time compared to other implementations such as SSEARCH, Striped Smith-Waterman and CUDA. Our implementation achieves a peak performance of up to 3,646 MCUPS. The results from our experiments demonstrate that the PlayStation 3 console can be used as an efficient low cost computational platform for high performance sequence alignment applications.
CBESW: Sequence Alignment on the Playstation 3

PubMed Central

Wirawan, Adrianto; Kwoh, Chee Keong; Hieu, Nim Tri; Schmidt, Bertil

2008-01-01

Background The exponential growth of available biological data has caused bioinformatics to be rapidly moving towards a data-intensive, computational science. As a result, the computational power needed by bioinformatics applications is growing exponentially as well. The recent emergence of accelerator technologies has made it possible to achieve an excellent improvement in execution time for many bioinformatics applications, compared to current general-purpose platforms. In this paper, we demonstrate how the PlayStation® 3, powered by the Cell Broadband Engine, can be used as a computational platform to accelerate the Smith-Waterman algorithm. Results For large datasets, our implementation on the PlayStation® 3 provides a significant improvement in running time compared to other implementations such as SSEARCH, Striped Smith-Waterman and CUDA. Our implementation achieves a peak performance of up to 3,646 MCUPS. Conclusion The results from our experiments demonstrate that the PlayStation® 3 console can be used as an efficient low cost computational platform for high performance sequence alignment applications. PMID:18798993

High performance hybrid functional Petri net simulations of biological pathway models on CUDA.

PubMed

Chalkidis, Georgios; Nagasaki, Masao; Miyano, Satoru

2011-01-01

Hybrid functional Petri nets are a wide-spread tool for representing and simulating biological models. Due to their potential of providing virtual drug testing environments, biological simulations have a growing impact on pharmaceutical research. Continuous research advancements in biology and medicine lead to exponentially increasing simulation times, thus raising the demand for performance accelerations by efficient and inexpensive parallel computation solutions. Recent developments in the field of general-purpose computation on graphics processing units (GPGPU) enabled the scientific community to port a variety of compute intensive algorithms onto the graphics processing unit (GPU). This work presents the first scheme for mapping biological hybrid functional Petri net models, which can handle both discrete and continuous entities, onto compute unified device architecture (CUDA) enabled GPUs. GPU accelerated simulations are observed to run up to 18 times faster than sequential implementations. Simulating the cell boundary formation by Delta-Notch signaling on a CUDA enabled GPU results in a speedup of approximately 7x for a model containing 1,600 cells.
Algorithm-Based Fault Tolerance Integrated with Replication

NASA Technical Reports Server (NTRS)

Some, Raphael; Rennels, David

2008-01-01

In a proposed approach to programming and utilization of commercial off-the-shelf computing equipment, a combination of algorithm-based fault tolerance (ABFT) and replication would be utilized to obtain high degrees of fault tolerance without incurring excessive costs. The basic idea of the proposed approach is to integrate ABFT with replication such that the algorithmic portions of computations would be protected by ABFT, and the logical portions by replication. ABFT is an extremely efficient, inexpensive, high-coverage technique for detecting and mitigating faults in computer systems used for algorithmic computations, but does not protect against errors in logical operations surrounding algorithms.
Impedance computed tomography using an adaptive smoothing coefficient algorithm.

PubMed

Suzuki, A; Uchiyama, A

2001-01-01

In impedance computed tomography, a fixed coefficient regularization algorithm has been frequently used to improve the ill-conditioning problem of the Newton-Raphson algorithm. However, a lot of experimental data and a long period of computation time are needed to determine a good smoothing coefficient because a good smoothing coefficient has to be manually chosen from a number of coefficients and is a constant for each iteration calculation. Thus, sometimes the fixed coefficient regularization algorithm distorts the information or fails to obtain any effect. In this paper, a new adaptive smoothing coefficient algorithm is proposed. This algorithm automatically calculates the smoothing coefficient from the eigenvalue of the ill-conditioned matrix. Therefore, the effective images can be obtained within a short computation time. Also the smoothing coefficient is automatically adjusted by the information related to the real resistivity distribution and the data collection method. In our impedance system, we have reconstructed the resistivity distributions of two phantoms using this algorithm. As a result, this algorithm only needs one-fifth the computation time compared to the fixed coefficient regularization algorithm. When compared to the fixed coefficient regularization algorithm, it shows that the image is obtained more rapidly and applicable in real-time monitoring of the blood vessel.
A control method of the rotor re-levitation for different orbit responses during touchdowns in active magnetic bearings

NASA Astrophysics Data System (ADS)

Lyu, Mindong; Liu, Tao; Wang, Zixi; Yan, Shaoze; Jia, Xiaohong; Wang, Yuming

2018-05-01

Touchdown can make active magnetic bearings (AMB) unable to work, and bring severe damages to touchdown bearings (TDB). To resolve it, we presents a novel re-levitation method consisting of two operations, i.e., orbit response recognition and rotor re-levitation. In the operation of orbit response recognition, the three orbit responses (pendulum vibration, combined rub and bouncing, and full rub) can be identified by the expectation of radial displacement of rotor and expectation of instantaneous frequency (IF) of rotor motion in the sampling period. In the rotor re-levitation operation, a decentralized PID control algorithm is employed for pendulum vibration and combined rub and bouncing, and the decentralized PID control algorithm and another whirl damping algorithm, in which the weighting factor is determined by the whirl frequency, are jointly executed for the full rub. The method has been demonstrated by the simulation results of an AMB model. The results reveal that the method is effective in actively suppressing the whirl motion and promptly re-levitating the rotor. As the PID control algorithm and the simple operations of signal processing are employed, the algorithm has a low computation intensity, which makes it more easily realized in practical applications.
Design of a Wireless Sensor System with the Algorithms of Heart Rate and Agility Index for Athlete Evaluation

PubMed Central

Li, Meina; Kim, Youn Tae

2017-01-01

Athlete evaluation systems can effectively monitor daily training and boost performance to reduce injuries. Conventional heart-rate measurement systems can be easily affected by artifact movement, especially in the case of athletes. Significant noise can be generated owing to high-intensity activities. To improve the comfort for athletes and the accuracy of monitoring, we have proposed to combine robust heart rate and agility index monitoring algorithms into a small, light, and single node. A band-pass-filter-based R-wave detection algorithm was developed. The agility index was calculated by preprocessing with band-pass filtering and employing the zero-crossing detection method. The evaluation was conducted under both laboratory and field environments to verify the accuracy and reliability of the algorithm. The heart rate and agility index measurements can be wirelessly transmitted to a personal computer in real time by the ZigBee telecommunication system. The results show that the error rate of measurement of the heart rate is within 2%, which is comparable with that of the traditional wired measurement method. The sensitivity of the agility index, which could be distinguished as the activity speed, changed slightly. Thus, we confirmed that the developed algorithm could be used in an effective and safe exercise-evaluation system for athletes. PMID:29039763
A new automated quantification algorithm for the detection and evaluation of focal liver lesions with contrast-enhanced ultrasound.

PubMed

Gatos, Ilias; Tsantis, Stavros; Spiliopoulos, Stavros; Skouroliakou, Aikaterini; Theotokas, Ioannis; Zoumpoulis, Pavlos; Hazle, John D; Kagadis, George C

2015-07-01

Detect and classify focal liver lesions (FLLs) from contrast-enhanced ultrasound (CEUS) imaging by means of an automated quantification algorithm. The proposed algorithm employs a sophisticated segmentation method to detect and contour focal lesions from 52 CEUS video sequences (30 benign and 22 malignant). Lesion detection involves wavelet transform zero crossings utilization as an initialization step to the Markov random field model toward the lesion contour extraction. After FLL detection across frames, time intensity curve (TIC) is computed which provides the contrast agents' behavior at all vascular phases with respect to adjacent parenchyma for each patient. From each TIC, eight features were automatically calculated and employed into the support vector machines (SVMs) classification algorithm in the design of the image analysis model. With regard to FLLs detection accuracy, all lesions detected had an average overlap value of 0.89 ± 0.16 with manual segmentations for all CEUS frame-subsets included in the study. Highest classification accuracy from the SVM model was 90.3%, misdiagnosing three benign and two malignant FLLs with sensitivity and specificity values of 93.1% and 86.9%, respectively. The proposed quantification system that employs FLLs detection and classification algorithms may be of value to physicians as a second opinion tool for avoiding unnecessary invasive procedures.
a Fully Automated Pipeline for Classification Tasks with AN Application to Remote Sensing

NASA Astrophysics Data System (ADS)

Suzuki, K.; Claesen, M.; Takeda, H.; De Moor, B.

2016-06-01

Nowadays deep learning has been intensively in spotlight owing to its great victories at major competitions, which undeservedly pushed `shallow' machine learning methods, relatively naive/handy algorithms commonly used by industrial engineers, to the background in spite of their facilities such as small requisite amount of time/dataset for training. We, with a practical point of view, utilized shallow learning algorithms to construct a learning pipeline such that operators can utilize machine learning without any special knowledge, expensive computation environment, and a large amount of labelled data. The proposed pipeline automates a whole classification process, namely feature-selection, weighting features and the selection of the most suitable classifier with optimized hyperparameters. The configuration facilitates particle swarm optimization, one of well-known metaheuristic algorithms for the sake of generally fast and fine optimization, which enables us not only to optimize (hyper)parameters but also to determine appropriate features/classifier to the problem, which has conventionally been a priori based on domain knowledge and remained untouched or dealt with naïve algorithms such as grid search. Through experiments with the MNIST and CIFAR-10 datasets, common datasets in computer vision field for character recognition and object recognition problems respectively, our automated learning approach provides high performance considering its simple setting (i.e. non-specialized setting depending on dataset), small amount of training data, and practical learning time. Moreover, compared to deep learning the performance stays robust without almost any modification even with a remote sensing object recognition problem, which in turn indicates that there is a high possibility that our approach contributes to general classification problems.
Automated patient setup and gating using cone beam computed tomography projections

NASA Astrophysics Data System (ADS)

Wan, Hanlin; Bertholet, Jenny; Ge, Jiajia; Poulsen, Per; Parikh, Parag

2016-03-01

In radiation therapy, fiducial markers are often implanted near tumors and used for patient positioning and respiratory gating purposes. These markers are then used to manually align the patients by matching the markers in the cone beam computed tomography (CBCT) reconstruction to those in the planning CT. This step is time-intensive and user-dependent, and often results in a suboptimal patient setup. We propose a fully automated, robust method based on dynamic programming (DP) for segmenting radiopaque fiducial markers in CBCT projection images, which are then used to automatically optimize the treatment couch position and/or gating window bounds. The mean of the absolute 2D segmentation error of our DP algorithm is 1.3+/- 1.0 mm for 87 markers on 39 patients. Intrafraction images were acquired every 3 s during treatment at two different institutions. For gated patients from Institution A (8 patients, 40 fractions), the DP algorithm increased the delivery accuracy (96+/- 6% versus 91+/- 11% , p < 0.01) compared to the manual setup using kV fluoroscopy. For non-gated patients from Institution B (6 patients, 16 fractions), the DP algorithm performed similarly (1.5+/- 0.8 mm versus 1.6+/- 0.9 mm, p = 0.48) compared to the manual setup matching the fiducial markers in the CBCT to the mean position. Our proposed automated patient setup algorithm only takes 1-2 s to run, requires no user intervention, and performs as well as or better than the current clinical setup.
First experiences with model based iterative reconstructions influence on quantitative plaque volume and intensity measurements in coronary computed tomography angiography.

PubMed

Precht, H; Kitslaar, P H; Broersen, A; Gerke, O; Dijkstra, J; Thygesen, J; Egstrup, K; Lambrechtsen, J

2017-02-01

Investigate the influence of adaptive statistical iterative reconstruction (ASIR) and the model-based IR (Veo) reconstruction algorithm in coronary computed tomography angiography (CCTA) images on quantitative measurements in coronary arteries for plaque volumes and intensities. Three patients had three independent dose reduced CCTA performed and reconstructed with 30% ASIR (CTDI vol at 6.7 mGy), 60% ASIR (CTDI vol 4.3 mGy) and Veo (CTDI vol at 1.9 mGy). Coronary plaque analysis was performed for each measured CCTA volumes, plaque burden and intensities. Plaque volume and plaque burden show a decreasing tendency from ASIR to Veo as median volume for ASIR is 314 mm 3 and 337 mm 3 -252 mm 3 for Veo and plaque burden is 42% and 44% for ASIR to 39% for Veo. The lumen and vessel volume decrease slightly from 30% ASIR to 60% ASIR with 498 mm 3 -391 mm 3 for lumen volume and vessel volume from 939 mm 3 to 830 mm 3 . The intensities did not change overall between the different reconstructions for either lumen or plaque. We found a tendency of decreasing plaque volumes and plaque burden but no change in intensities with the use of low dose Veo CCTA (1.9 mGy) compared to dose reduced ASIR CCTA (6.7 mGy & 4.3 mGy), although more studies are warranted. Copyright © 2016 The College of Radiographers. Published by Elsevier Ltd. All rights reserved.
Analytic reconstruction algorithms for triple-source CT with horizontal data truncation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Ming; Yu, Hengyong, E-mail: hengyong-yu@ieee.org

2015-10-15

Purpose: This paper explores a triple-source imaging method with horizontal data truncation to enlarge the field of view (FOV) for big objects. Methods: The study is conducted by using theoretical analysis, mathematical deduction, and numerical simulations. The proposed algorithms are implemented in c + + and MATLAB. While the basic platform is constructed in MATLAB, the computationally intensive segments are coded in c + +, which are linked via a MEX interface. Results: A triple-source circular scanning configuration with horizontal data truncation is developed, where three pairs of x-ray sources and detectors are unevenly distributed on the same circle tomore » cover the whole imaging object. For this triple-source configuration, a fan-beam filtered backprojection-type algorithm is derived for truncated full-scan projections without data rebinning. The algorithm is also extended for horizontally truncated half-scan projections and cone-beam projections in a Feldkamp-type framework. Using their method, the FOV is enlarged twofold to threefold to scan bigger objects with high speed and quality. The numerical simulation results confirm the correctness and effectiveness of the developed algorithms. Conclusions: The triple-source scanning configuration with horizontal data truncation cannot only keep most of the advantages of a traditional multisource system but also cover a larger FOV for big imaging objects. In addition, because the filtering is shift-invariant, the proposed algorithms are very fast and easily parallelized on graphic processing units.« less
Analytic reconstruction algorithms for triple-source CT with horizontal data truncation.

PubMed

Chen, Ming; Yu, Hengyong

2015-10-01

This paper explores a triple-source imaging method with horizontal data truncation to enlarge the field of view (FOV) for big objects. The study is conducted by using theoretical analysis, mathematical deduction, and numerical simulations. The proposed algorithms are implemented in c + + and matlab. While the basic platform is constructed in matlab, the computationally intensive segments are coded in c + +, which are linked via a mex interface. A triple-source circular scanning configuration with horizontal data truncation is developed, where three pairs of x-ray sources and detectors are unevenly distributed on the same circle to cover the whole imaging object. For this triple-source configuration, a fan-beam filtered backprojection-type algorithm is derived for truncated full-scan projections without data rebinning. The algorithm is also extended for horizontally truncated half-scan projections and cone-beam projections in a Feldkamp-type framework. Using their method, the FOV is enlarged twofold to threefold to scan bigger objects with high speed and quality. The numerical simulation results confirm the correctness and effectiveness of the developed algorithms. The triple-source scanning configuration with horizontal data truncation cannot only keep most of the advantages of a traditional multisource system but also cover a larger FOV for big imaging objects. In addition, because the filtering is shift-invariant, the proposed algorithms are very fast and easily parallelized on graphic processing units.
QCCM Center for Quantum Algorithms

DTIC Science & Technology

2008-10-17

algorithms (e.g., quantum walks and adiabatic computing ), as well as theoretical advances relating algorithms to physical implementations (e.g...Park, NC 27709-2211 15. SUBJECT TERMS Quantum algorithms, quantum computing , fault-tolerant error correction Richard Cleve MITACS East Academic...0511200 Algebraic results on quantum automata A. Ambainis, M. Beaudry, M. Golovkins, A. Kikusts, M. Mercer, D. Thrien Theory of Computing Systems 39(2006
Autonomous rock detection on mars through region contrast

NASA Astrophysics Data System (ADS)

Xiao, Xueming; Cui, Hutao; Yao, Meibao; Tian, Yang

2017-08-01

In this paper, we present a new autonomous rock detection approach through region contrast. Unlike current state-of-art pixel-level rock segmenting methods, new method deals with this issue in region level, which will significantly reduce the computational cost. Image is firstly splitted into homogeneous regions based on intensity information and spatial layout. Considering the high-water memory constraints of onboard flight processor, only low-level features, average intensity and variation of superpixel, are measured. Region contrast is derived as the integration of intensity contrast and smoothness measurement. Rocks are then segmented from the resulting contrast map by an adaptive threshold. Since the merely intensity-based method may cause false detection in background areas with different illuminations from surroundings, a more reliable method is further proposed by introducing spatial factor and background similarity to the region contrast. Spatial factor demonstrates the locality of contrast, while background similarity calculates the probability of each subregion belonging to background. Our method is efficient in dealing with large images and only few parameters are needed. Preliminary experimental results show that our algorithm outperforms edge-based methods in various grayscale rover images.
Distributed Algorithm for Voronoi Partition of Wireless Sensor Networks with a Limited Sensing Range.

PubMed

He, Chenlong; Feng, Zuren; Ren, Zhigang

2018-02-03

For Wireless Sensor Networks (WSNs), the Voronoi partition of a region is a challenging problem owing to the limited sensing ability of each sensor and the distributed organization of the network. In this paper, an algorithm is proposed for each sensor having a limited sensing range to compute its limited Voronoi cell autonomously, so that the limited Voronoi partition of the entire WSN is generated in a distributed manner. Inspired by Graham's Scan (GS) algorithm used to compute the convex hull of a point set, the limited Voronoi cell of each sensor is obtained by sequentially scanning two consecutive bisectors between the sensor and its neighbors. The proposed algorithm called the Boundary Scan (BS) algorithm has a lower computational complexity than the existing Range-Constrained Voronoi Cell (RCVC) algorithm and reaches the lower bound of the computational complexity of the algorithms used to solve the problem of this kind. Moreover, it also improves the time efficiency of a key step in the Adjust-Sensing-Radius (ASR) algorithm used to compute the exact Voronoi cell. Extensive numerical simulations are performed to demonstrate the correctness and effectiveness of the BS algorithm. The distributed realization of the BS combined with a localization algorithm in WSNs is used to justify the WSN nature of the proposed algorithm.
Distributed Algorithm for Voronoi Partition of Wireless Sensor Networks with a Limited Sensing Range

PubMed Central

Feng, Zuren; Ren, Zhigang

2018-01-01

For Wireless Sensor Networks (WSNs), the Voronoi partition of a region is a challenging problem owing to the limited sensing ability of each sensor and the distributed organization of the network. In this paper, an algorithm is proposed for each sensor having a limited sensing range to compute its limited Voronoi cell autonomously, so that the limited Voronoi partition of the entire WSN is generated in a distributed manner. Inspired by Graham’s Scan (GS) algorithm used to compute the convex hull of a point set, the limited Voronoi cell of each sensor is obtained by sequentially scanning two consecutive bisectors between the sensor and its neighbors. The proposed algorithm called the Boundary Scan (BS) algorithm has a lower computational complexity than the existing Range-Constrained Voronoi Cell (RCVC) algorithm and reaches the lower bound of the computational complexity of the algorithms used to solve the problem of this kind. Moreover, it also improves the time efficiency of a key step in the Adjust-Sensing-Radius (ASR) algorithm used to compute the exact Voronoi cell. Extensive numerical simulations are performed to demonstrate the correctness and effectiveness of the BS algorithm. The distributed realization of the BS combined with a localization algorithm in WSNs is used to justify the WSN nature of the proposed algorithm. PMID:29401649
A structure-exploiting numbering algorithm for finite elements on extruded meshes, and its performance evaluation in Firedrake

NASA Astrophysics Data System (ADS)

Bercea, Gheorghe-Teodor; McRae, Andrew T. T.; Ham, David A.; Mitchell, Lawrence; Rathgeber, Florian; Nardi, Luigi; Luporini, Fabio; Kelly, Paul H. J.

2016-10-01

We present a generic algorithm for numbering and then efficiently iterating over the data values attached to an extruded mesh. An extruded mesh is formed by replicating an existing mesh, assumed to be unstructured, to form layers of prismatic cells. Applications of extruded meshes include, but are not limited to, the representation of three-dimensional high aspect ratio domains employed by geophysical finite element simulations. These meshes are structured in the extruded direction. The algorithm presented here exploits this structure to avoid the performance penalty traditionally associated with unstructured meshes. We evaluate the implementation of this algorithm in the Firedrake finite element system on a range of low compute intensity operations which constitute worst cases for data layout performance exploration. The experiments show that having structure along the extruded direction enables the cost of the indirect data accesses to be amortized after 10-20 layers as long as the underlying mesh is well ordered. We characterize the resulting spatial and temporal reuse in a representative set of both continuous-Galerkin and discontinuous-Galerkin discretizations. On meshes with realistic numbers of layers the performance achieved is between 70 and 90 % of a theoretical hardware-specific limit.
Graph run-length matrices for histopathological image segmentation.

PubMed

Tosun, Akif Burak; Gunduz-Demir, Cigdem

2011-03-01

The histopathological examination of tissue specimens is essential for cancer diagnosis and grading. However, this examination is subject to a considerable amount of observer variability as it mainly relies on visual interpretation of pathologists. To alleviate this problem, it is very important to develop computational quantitative tools, for which image segmentation constitutes the core step. In this paper, we introduce an effective and robust algorithm for the segmentation of histopathological tissue images. This algorithm incorporates the background knowledge of the tissue organization into segmentation. For this purpose, it quantifies spatial relations of cytological tissue components by constructing a graph and uses this graph to define new texture features for image segmentation. This new texture definition makes use of the idea of gray-level run-length matrices. However, it considers the runs of cytological components on a graph to form a matrix, instead of considering the runs of pixel intensities. Working with colon tissue images, our experiments demonstrate that the texture features extracted from "graph run-length matrices" lead to high segmentation accuracies, also providing a reasonable number of segmented regions. Compared with four other segmentation algorithms, the results show that the proposed algorithm is more effective in histopathological image segmentation.
Real Time Corner Detection for Miniaturized Electro-Optical Sensors Onboard Small Unmanned Aerial Systems

PubMed Central

Forlenza, Lidia; Carton, Patrick; Accardo, Domenico; Fasano, Giancarmine; Moccia, Antonio

2012-01-01

This paper describes the target detection algorithm for the image processor of a vision-based system that is installed onboard an unmanned helicopter. It has been developed in the framework of a project of the French national aerospace research center Office National d’Etudes et de Recherches Aérospatiales (ONERA) which aims at developing an air-to-ground target tracking mission in an unknown urban environment. In particular, the image processor must detect targets and estimate ground motion in proximity of the detected target position. Concerning the target detection function, the analysis has dealt with realizing a corner detection algorithm and selecting the best choices in terms of edge detection methods, filtering size and type and the more suitable criterion of detection of the points of interest in order to obtain a very fast algorithm which fulfills the computation load requirements. The compared criteria are the Harris-Stephen and the Shi-Tomasi, ones, which are the most widely used in literature among those based on intensity. Experimental results which illustrate the performance of the developed algorithm and demonstrate that the detection time is fully compliant with the requirements of the real-time system are discussed. PMID:22368499
Design of thrust vectoring exhaust nozzles for real-time applications using neural networks

NASA Technical Reports Server (NTRS)

Prasanth, Ravi K.; Markin, Robert E.; Whitaker, Kevin W.

1991-01-01

Thrust vectoring continues to be an important issue in military aircraft system designs. A recently developed concept of vectoring aircraft thrust makes use of flexible exhaust nozzles. Subtle modifications in the nozzle wall contours produce a non-uniform flow field containing a complex pattern of shock and expansion waves. The end result, due to the asymmetric velocity and pressure distributions, is vectored thrust. Specification of the nozzle contours required for a desired thrust vector angle (an inverse design problem) has been achieved with genetic algorithms. This approach is computationally intensive and prevents the nozzles from being designed in real-time, which is necessary for an operational aircraft system. An investigation was conducted into using genetic algorithms to train a neural network in an attempt to obtain, in real-time, two-dimensional nozzle contours. Results show that genetic algorithm trained neural networks provide a viable, real-time alternative for designing thrust vectoring nozzles contours. Thrust vector angles up to 20 deg were obtained within an average error of 0.0914 deg. The error surfaces encountered were highly degenerate and thus the robustness of genetic algorithms was well suited for minimizing global errors.
Development and application of unified algorithms for problems in computational science

NASA Technical Reports Server (NTRS)

Shankar, Vijaya; Chakravarthy, Sukumar

1987-01-01

A framework is presented for developing computationally unified numerical algorithms for solving nonlinear equations that arise in modeling various problems in mathematical physics. The concept of computational unification is an attempt to encompass efficient solution procedures for computing various nonlinear phenomena that may occur in a given problem. For example, in Computational Fluid Dynamics (CFD), a unified algorithm will be one that allows for solutions to subsonic (elliptic), transonic (mixed elliptic-hyperbolic), and supersonic (hyperbolic) flows for both steady and unsteady problems. The objectives are: development of superior unified algorithms emphasizing accuracy and efficiency aspects; development of codes based on selected algorithms leading to validation; application of mature codes to realistic problems; and extension/application of CFD-based algorithms to problems in other areas of mathematical physics. The ultimate objective is to achieve integration of multidisciplinary technologies to enhance synergism in the design process through computational simulation. Specific unified algorithms for a hierarchy of gas dynamics equations and their applications to two other areas: electromagnetic scattering, and laser-materials interaction accounting for melting.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.